Hints for using HTCondor’s credd and condor_store_cred

HTCondor can run jobs either as an unprivileged “nobody” user or as the submitting user. On Linux, enabling this is fairly easy: the administrator just sets the UID_DOMAIN configuration to the same value on the submit and execute machines and away you go. On Windows, you need to run the credential daemon (condor_credd), and the user must store their credentials using condor_store_cred.

The manual does a pretty good job of describing the basic setup of the credd, though there are some important pieces missing. With help from HTCondor technical lead Todd Tannenbaum, I’ve submitted some improvements to the docs, but in the meantime…

The main thing to consider when configuring your pool to use the credd is that it wants things to be secure. That makes sense, considering its entire job is to securely store and transfer user credentials. The credd will not hand out the password unless the client is authenticated and using a secure connection. The method of authentication is not important (if you really, really trust your network, you can use the CLAIMTOBE method), so long as authentication occurs somehow.

So where do the condor_store_cred hints come in? Often, the credd runs on the same machine as the schedd, and users log in there to submit jobs. In that case, everything’s probably fine. But if you’re submitting jobs from a machine outside the pool (for example, a user’s workstation), it can get a little hairier.

Before you run condor_store_cred, HTCondor needs to be told where to find the credd, and the client’s security settings need to meet the credd’s requirements described above. (I’m using CLAIMTOBE here for simplicity.) If the machine the user submits from is not in the pool, condor_store_cred will need to know where to find the collector, too.

CREDD_HOST = scheduler.example.com
COLLECTOR_HOST = centralmanager.example.com
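
The two lines above only tell the client where the daemons live. The matching client-side security settings could look something like the following sketch. It assumes CLAIMTOBE authentication, as elsewhere in this post, and it assumes the credd insists on authentication, encryption, and integrity checking. The exact levels your credd requires are an assumption here, so check them against your own pool's configuration and the manual.

```
# Sketch only: these values are assumptions, not taken from a working pool.
# The client must authenticate before the credd will talk to it.
SEC_CLIENT_AUTHENTICATION = REQUIRED
# CLAIMTOBE is used for simplicity; it performs no real identity check.
SEC_CLIENT_AUTHENTICATION_METHODS = CLAIMTOBE
# Assumed to be what the credd means by a "secure connection".
SEC_CLIENT_ENCRYPTION = REQUIRED
SEC_CLIENT_INTEGRITY = REQUIRED
```

Because CLAIMTOBE simply trusts whatever identity the client asserts, a stronger method is the better choice on any network you do not fully trust.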

As of this writing, condor_store_cred gives an unhelpful error message if something goes wrong. It always says “Make sure your ALLOW_WRITE setting includes this host.” If your ALLOW_WRITE setting already includes the host in question, that message leaves you stuck. Use the -debug option to get better output. For example, a query run with -debug might print:

02/16/16 12:23:51 STORE_CRED: In mode 'query'
02/16/16 12:23:51 Warning: Collector information was not found in the configuration file. ClassAds will not be sent to the collector and this daemon will not join a larger Condor pool.
02/16/16 12:23:51 STORE_CRED: Failed to start command.
02/16/16 12:23:51 STORE_CRED: Unable to contact the REMOTE schedd.

This tells you that you forgot to set the COLLECTOR_HOST in your configuration.

Another hint: if your scheduler name is different from the machine name (e.g. if you run multiple condor_schedd processes on a single machine and have Q1@hostname, Q2@hostname, etc.), you might need to include “-name Q1@hostname” in the arguments. Unlike most other HTCondor client commands, condor_store_cred does not let you specify a “sinful string” as a target using the “-addr” option.

Hopefully this helps you save a little bit of time getting run_as_owner working on your Windows pool, until such time as I sit down to write that “Administering HTCondor” book that I’ve been meaning to work on for the last 5 years.