This page will serve as my brain dump of all things PrivSep . My goal is to highlight things that aren't obvious from looking at the code. In addition, I'm neither dying nor moving farther than 0.2 miles from the HTCondor project's home in the CS building so please feel free to let me know if something is glaringly missing from this page.
PrivSep is currently implemented for HTCondor's execute side. The central manager daemons have always been able to run fine without root. The submit side daemons must still run as root unless there is only a single submitter or all submitters are trusted (i.e. a personal HTCondor).
For a single HTCondor instance to have both (multi-user) submit-side and execute-side functionality either PrivSep must not be used or the Master must still run as root and be configured to start the StartD without root (via the STARTD_USERID setting).
One other quirk is that PrivSep requires HTCondor's EXECUTE directory to be "trusted" according to Jim Kupsch's trusted path algorithm. This means that the EXECUTE directory and all its parent directories must be root-owned and only writable by root. It's a reasonable requirement but it differs from how HTCondor normally works so it's something to watch out for when moving from non-PrivSep to PrivSep or vice versa. Other files that must be trusted in this same sense include the Switchboard, it's config file (/etc/condor/privsep_config), and the ProcD.
PrivSep's architecture is pretty simple. Only a small portion of the actual HTCondor code requires root privileges. For now, that includes the condor_root_switchboard and condor_procd programs. (See #110 on how this could change to only include the Switchboard.) The rest of execute-side HTCondor runs as an unprivileged user and makes requests of the root-enabled components when necessary. It is the responsibility of the root-enabled pieces to only honor "valid" requests from other HTCondor daemons. The policy as to what is valid is determined partly by the Switchboard's configuration file, which lists users and directories over which HTCondor should be allowed to operate.
The ProcD provides operations relevant to tracking, monitoring, and controlling processes and is described in more detail on the ProcdWisdom page.
The Switchboard is a root-owned setuid program providing the following operations:
- Start a ProcD as root
- Execute a program (job) as a given user
- Make a directory (needed since the EXECUTE directory is only writable by root)
- Recursively chown a directory from/to a given user to/from the HTCondor user
- Recursively remove a directory
In our execute-side PrivSep implementation, most interactions with the root-enabled portion of HTCondor happen from the Starter. The Master interacts only with the ProcD to manage the process families of its children. The StartD does the same, and additionally uses the Switchboard's recursive remove operation to keep the EXECUTE directory clean.
Everything else happens from the Starter. The workflow looks like this:
- The Switchboard is used to make the dir_<pid> directory that will be used as the job's sandbox. Unlike non-PrivSep mode, this directory will start out owned by the HTCondor user.
- HTCondor does everything needed to prepare the job for execution (most notably doing file transfer into the sandbox directory).
- Before starting the job, the Switchboard is used to chown the sandbox to the user.
- The Switchboard is used to spawn the job as the right UID.
- While the job is running, the ProcD is used as usual for process monitoring and control.
- When the job is complete, the sandbox is chowned back to the HTCondor user.
- Files can now be transferred out.
- Before the Starter exits, the Switchboard is used to clean up the execute directory.
That's pretty much it. One slightly different case concerns VM universe. In that case it is the VMGahp that does the chowns before the job starts and after it completes. The reason is that the VMGahp does its own mucking around with the job sandbox before and after the job runs and the sandbox must be owned by the HTCondor UID for this to succeed. VM universe also allows for checkpoints which require still more chowning. When a checkpoint happens the VMGahp suspends the VM, the Starter chowns the sandbox to the HTCondor user, file transfer moves the checkpoint files to the submit machine, the Starter chowns the sandbox back to the user, and finally the VMGahp resumes the job.