HTCondorService
- class lsst.ctrl.bps.htcondor.HTCondorService(config)
Bases: BaseWmsService
HTCondor version of WMS service.
Methods Summary
- cancel(wms_id[, pass_thru]): Cancel submitted workflows/jobs.
- list_submitted_jobs([wms_id, user, ...]): Query WMS for list of submitted WMS workflows/jobs.
- ping(pass_thru): Check whether WMS services are up, reachable, and can authenticate if authentication is required.
- prepare(config, generic_workflow[, out_prefix]): Convert generic workflow to an HTCondor DAG ready for submission.
- report([wms_workflow_id, user, hist, ...]): Return run information based upon given constraints.
- restart(wms_workflow_id): Restart a failed DAGMan workflow.
- run_submission_checks(): Check to run at start if running WMS-specific submission steps.
- submit(workflow, **kwargs): Submit a single HTCondor workflow.
Methods Documentation
- cancel(wms_id, pass_thru=None)
Cancel submitted workflows/jobs.
- Parameters:
- Returns:
- list_submitted_jobs(wms_id=None, user=None, require_bps=True, pass_thru=None, is_global=False)
Query WMS for list of submitted WMS workflows/jobs.
This should be a quick lookup function to create a list of jobs for other functions.
- Parameters:
  - wms_id (int or str, optional): Id or path that can be used by WMS service to look up job.
  - user (str, optional): User whose submitted jobs should be listed.
  - require_bps (bool, optional): Whether to require jobs returned in list to be bps-submitted jobs.
  - pass_thru (str, optional): Information to pass through to WMS.
  - is_global (bool, optional): If set, all job queues (and their histories) will be queried for job information. Defaults to False, which means that only the local job queue will be queried.
- Returns:
  - job_ids (list[Any]): Only job ids to be used by cancel and other functions. Typically this means top-level jobs (i.e., not child jobs).
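As a hedged illustration of how the returned ids can feed other calls, the sketch below lists one user's jobs and passes each id to cancel. The names cancel_user_jobs and FakeService are hypothetical and introduced only for this example; the stub stands in for an HTCondorService so the snippet runs without HTCondor. Because the return value of cancel is not described in this section, the helper stores it unchanged.

```python
from typing import Any


def cancel_user_jobs(service: Any, user: str) -> dict:
    """Cancel every bps-submitted job that ``service`` lists for ``user``.

    ``service`` is any object exposing ``list_submitted_jobs`` and ``cancel``
    with the signatures documented above (for example, an HTCondorService).
    """
    results = {}
    # Quick lookup first; by default only the local job queue is queried.
    for job_id in service.list_submitted_jobs(user=user, require_bps=True):
        # The return value of cancel() is kept opaque on purpose.
        results[job_id] = service.cancel(job_id)
    return results


class FakeService:
    """Hypothetical in-memory stand-in, used only to exercise the helper."""

    def __init__(self, jobs):
        self._jobs = dict(jobs)  # job id -> owner

    def list_submitted_jobs(self, wms_id=None, user=None, require_bps=True,
                            pass_thru=None, is_global=False):
        return [jid for jid, owner in self._jobs.items() if user in (None, owner)]

    def cancel(self, wms_id, pass_thru=None):
        # The stub reports True when the id was known; the real return
        # value is not specified in this section.
        return self._jobs.pop(wms_id, None) is not None


outcome = cancel_user_jobs(FakeService({"101.0": "alice", "102.0": "bob"}), "alice")
```

With a real service, passing is_global=True to list_submitted_jobs would widen the lookup to all job queues, as described in the parameter list.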
- ping(pass_thru)
Check whether WMS services are up, reachable, and can authenticate if authentication is required.
The services to be checked are those needed for submit, report, cancel, restart, but ping cannot guarantee whether jobs would actually run successfully.
- prepare(config, generic_workflow, out_prefix=None)
Convert generic workflow to an HTCondor DAG ready for submission.
- Parameters:
  - config (lsst.ctrl.bps.BpsConfig): BPS configuration that includes necessary submit/runtime information.
  - generic_workflow (lsst.ctrl.bps.GenericWorkflow): The generic workflow (e.g., has executable name and arguments).
  - out_prefix (str): The root directory into which all WMS-specific files are written.
- Returns:
  - workflow (lsst.ctrl.bps.wms.htcondor.HTCondorWorkflow): HTCondor workflow ready to be run.
- report(wms_workflow_id=None, user=None, hist=0, pass_thru=None, is_global=False, return_exit_codes=False)
Return run information based upon given constraints.
- Parameters:
  - wms_workflow_id (str, optional): Limit to specific run based on id.
  - user (str, optional): Limit results to runs for this user.
  - hist (float, optional): Limit history search to this many days. Defaults to 0.
  - pass_thru (str, optional): Constraints to pass through to HTCondor.
  - is_global (bool, optional): If set, all job queues (and their histories) will be queried for job information. Defaults to False, which means that only the local job queue will be queried.
  - return_exit_codes (bool, optional): If set, return exit codes related to jobs with a non-success status. Defaults to False, which means that only the summary state is returned. Only applicable in the context of a WMS with associated handlers to return exit codes from jobs.
- Returns:
  - runs (list[lsst.ctrl.bps.WmsRunReport]): Information about runs from given job information.
  - message (str): Extra message for report command to print. This could be pointers to documentation or to WMS-specific commands.
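The two return values can be consumed as in the following minimal sketch. It assumes that runs and message come back together as a 2-tuple, which this section implies but does not state outright. The names summarize_runs and FakeService are hypothetical, and the stub's data are placeholders rather than real HTCondor output.

```python
from typing import Any, List


def summarize_runs(service: Any, wms_workflow_id=None, user=None,
                   days: float = 0) -> List[str]:
    """Build printable lines from a report() call.

    ``days`` is forwarded as ``hist`` to limit the history search.
    """
    # Assumption: report() returns the documented values as (runs, message).
    runs, message = service.report(
        wms_workflow_id=wms_workflow_id, user=user, hist=days
    )
    lines = [f"{len(runs)} run(s) found"]
    if message:
        # The extra message is only shown when the service supplies one.
        lines.append(message)
    return lines


class FakeService:
    """Hypothetical stand-in that returns placeholder data."""

    def report(self, wms_workflow_id=None, user=None, hist=0, pass_thru=None,
               is_global=False, return_exit_codes=False):
        return ["run-a", "run-b"], "placeholder message"


lines = summarize_runs(FakeService(), user="alice", days=1.5)
```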
- restart(wms_workflow_id)
Restart a failed DAGMan workflow.
- Parameters:
  - wms_workflow_id (str): The directory with HTCondor files.
- Returns:
  - run_id (str): HTCondor id of the restarted DAGMan job. If restart failed, it will be set to None.
  - run_name (str): Name of the restarted workflow. If restart failed, it will be set to None.
  - message (str): A message describing any issues encountered during the restart. If there were no issues, an empty string is returned.
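Since a failed restart is signalled by None values rather than by an exception, a caller may want to turn that into an explicit error. The sketch below shows one way to do so, assuming the three documented values are returned as a 3-tuple. The names restart_or_raise and FakeService, the ids, and the directory path are all hypothetical.

```python
from typing import Any, Tuple


def restart_or_raise(service: Any, submit_dir: str) -> Tuple[str, str]:
    """Restart the workflow in ``submit_dir`` and raise if the restart failed."""
    # Assumption: restart() returns (run_id, run_name, message).
    run_id, run_name, message = service.restart(submit_dir)
    if run_id is None:
        # Per the documentation above, None marks a failed restart and
        # message describes what went wrong.
        raise RuntimeError(f"Restart of {submit_dir} failed: {message}")
    return run_id, run_name


class FakeService:
    """Hypothetical stand-in whose restart succeeds or fails on demand."""

    def __init__(self, ok: bool):
        self._ok = ok

    def restart(self, wms_workflow_id):
        if self._ok:
            return "201.0", "example_run", ""
        return None, None, "placeholder failure message"


restarted = restart_or_raise(FakeService(ok=True), "/path/to/submit_dir")
```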
- run_submission_checks()
Check to run at start if running WMS-specific submission steps.
Any exception other than NotImplementedError will halt submission. The submit directory may not yet exist when this is called.
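The exception rule above can be sketched as follows. This only illustrates the stated contract and is not the actual bps driver code; run_checks and the two stub classes are hypothetical names.

```python
from typing import Any


def run_checks(service: Any) -> bool:
    """Run the service's submission checks.

    Returns True if the checks ran, or False if the service does not
    implement any. Every other exception propagates to the caller, which
    is what halts submission.
    """
    try:
        service.run_submission_checks()
    except NotImplementedError:
        # No WMS-specific checks are defined; submission may continue.
        return False
    return True


class NoChecks:
    """Hypothetical service without submission checks."""

    def run_submission_checks(self):
        raise NotImplementedError


class BadSetup:
    """Hypothetical service whose checks fail."""

    def run_submission_checks(self):
        raise RuntimeError("placeholder check failure")


skipped = run_checks(NoChecks())
```

Because the submit directory may not exist yet when the checks run, a check written in this style should not rely on files inside it.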