HTCondorService

class lsst.ctrl.bps.htcondor.HTCondorService(config)

Bases: BaseWmsService

HTCondor version of WMS service.

Methods Summary

cancel(wms_id[, pass_thru])

Cancel submitted workflows/jobs.

list_submitted_jobs([wms_id, user, ...])

Query WMS for list of submitted WMS workflows/jobs.

ping(pass_thru)

Check whether WMS services are up, reachable, and can authenticate if authentication is required.

prepare(config, generic_workflow[, out_prefix])

Convert generic workflow to an HTCondor DAG ready for submission.

report([wms_workflow_id, user, hist, ...])

Return run information based upon given constraints.

restart(wms_workflow_id)

Restart a failed DAGMan workflow.

run_submission_checks()

Checks to run at start when running WMS-specific submission steps.

submit(workflow)

Submit a single HTCondor workflow.

Methods Documentation

cancel(wms_id, pass_thru=None)

Cancel submitted workflows/jobs.

Parameters:
wms_id : str

Id or path of job that should be canceled.

pass_thru : str, optional

Information to pass through to WMS.

Returns:
deleted : bool

Whether the deletion succeeded. Currently, False is returned if there is any doubt or if any individual job was not deleted.

message : str

Any message from WMS (e.g., error details).
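A minimal sketch of handling the two return values of cancel. The `StubService` class is a hypothetical stand-in so the example runs without an HTCondor pool; the helper name `cancel_or_raise` and the job id are also illustrative assumptions. With a real HTCondorService instance the call would have the same shape.

```python
class StubService:
    """Hypothetical stand-in that mimics the cancel() return shape."""

    def __init__(self, deleted, message):
        self._result = (deleted, message)

    def cancel(self, wms_id, pass_thru=None):
        return self._result


def cancel_or_raise(service, wms_id):
    """Cancel a run and raise if the WMS could not confirm full deletion."""
    deleted, message = service.cancel(wms_id)
    if not deleted:
        # False covers both outright failure and partial deletion.
        raise RuntimeError(f"Could not cancel {wms_id}: {message}")
    return message


print(repr(cancel_or_raise(StubService(True, ""), "1234.0")))
```

Because False is also returned when there is any doubt, a caller may prefer to re-check the queue before retrying rather than treating False as a hard failure.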

list_submitted_jobs(wms_id=None, user=None, require_bps=True, pass_thru=None, is_global=False)

Query WMS for list of submitted WMS workflows/jobs.

This should be a quick lookup function that creates a list of jobs for other functions to use.

Parameters:
wms_id : int or str, optional

Id or path that can be used by WMS service to look up job.

user : str, optional

User whose submitted jobs should be listed.

require_bps : bool, optional

Whether to require jobs returned in list to be bps-submitted jobs.

pass_thru : str, optional

Information to pass through to WMS.

is_global : bool, optional

If set, all job queues (and their histories) will be queried for job information. Defaults to False, which means that only the local job queue will be queried.

Returns:
job_ids : list [Any]

Only job ids to be used by cancel and other functions. Typically this means top-level jobs (i.e., not children jobs).
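As a sketch of how the returned ids might feed into cancel, the example below lists one user's bps-submitted jobs and cancels each in turn. `StubService`, `cancel_all_for_user`, the user name, and the job ids are hypothetical; the stub only mimics the documented return shapes so the example runs without HTCondor.

```python
class StubService:
    """Hypothetical stand-in mimicking the documented return shapes."""

    def __init__(self, jobs_by_user):
        self._jobs_by_user = jobs_by_user

    def list_submitted_jobs(self, wms_id=None, user=None, require_bps=True,
                            pass_thru=None, is_global=False):
        return list(self._jobs_by_user.get(user, []))

    def cancel(self, wms_id, pass_thru=None):
        # Pretend ids ending in ".0" cancel cleanly and others do not.
        ok = str(wms_id).endswith(".0")
        return ok, "" if ok else f"{wms_id} not removed"


def cancel_all_for_user(service, user):
    """Cancel every bps-submitted job of `user`; return the ids that failed."""
    failed = {}
    for job_id in service.list_submitted_jobs(user=user, require_bps=True):
        deleted, message = service.cancel(job_id)
        if not deleted:
            failed[job_id] = message
    return failed


stub = StubService({"alice": ["101.0", "102.1"]})
print(cancel_all_for_user(stub, "alice"))
```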

ping(pass_thru)

Check whether WMS services are up, reachable, and can authenticate if authentication is required.

The services checked are those needed for submit, report, cancel, and restart; ping cannot guarantee that jobs would actually run successfully.

Parameters:
pass_thru : str, optional

Information to pass through to WMS.

Returns:
status : int

0 for success, non-zero for failure.

message : str

Any message from WMS (e.g., error details).
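One way to use the status code is as a pre-flight check before attempting a submission. The sketch below assumes a hypothetical helper `wms_is_reachable` and a `StubService` stand-in that mimics the return shape of ping; neither is part of the package.

```python
class StubService:
    """Hypothetical stand-in that mimics the ping() return shape."""

    def __init__(self, status, message):
        self._result = (status, message)

    def ping(self, pass_thru):
        return self._result


def wms_is_reachable(service, pass_thru=None):
    """Return True when ping reports status 0, printing the message otherwise."""
    status, message = service.ping(pass_thru)
    if status != 0:
        print(f"WMS ping failed (status {status}): {message}")
    return status == 0


print(wms_is_reachable(StubService(0, "")))
```

A True result only says the services responded; as noted above, it does not guarantee that jobs would run.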

prepare(config, generic_workflow, out_prefix=None)

Convert generic workflow to an HTCondor DAG ready for submission.

Parameters:
config : lsst.ctrl.bps.BpsConfig

BPS configuration that includes necessary submit/runtime information.

generic_workflow : lsst.ctrl.bps.GenericWorkflow

The generic workflow (e.g., has executable name and arguments).

out_prefix : str, optional

The root directory into which all WMS-specific files are written.

Returns:
workflow : lsst.ctrl.bps.htcondor.HTCondorWorkflow

HTCondor workflow ready to be run.

report(wms_workflow_id=None, user=None, hist=0, pass_thru=None, is_global=False, return_exit_codes=False)

Return run information based upon given constraints.

Parameters:
wms_workflow_id : str, optional

Limit to specific run based on id.

user : str, optional

Limit results to runs for this user.

hist : float, optional

Limit history search to this many days. Defaults to 0.

pass_thru : str, optional

Constraints to pass through to HTCondor.

is_global : bool, optional

If set, all job queues (and their histories) will be queried for job information. Defaults to False, which means that only the local job queue will be queried.

return_exit_codes : bool, optional

If set, return exit codes related to jobs with a non-success status. Defaults to False, which means that only the summary state is returned.

Only applicable in the context of a WMS with associated handlers to return exit codes from jobs.

Returns:
runs : list [lsst.ctrl.bps.WmsRunReport]

Information about runs from given job information.

message : str

Extra message for report command to print. This could be pointers to documentation or to WMS specific commands.
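The following sketch shows one possible way to call report with a fractional history window and unpack its two return values. `StubService` and `recent_runs` are hypothetical, and the stub returns plain strings where a real service would return lsst.ctrl.bps.WmsRunReport objects.

```python
class StubService:
    """Hypothetical stand-in mimicking the report() return shape."""

    def __init__(self, runs, message):
        self._runs = runs
        self._message = message
        self.last_kwargs = None

    def report(self, wms_workflow_id=None, user=None, hist=0, pass_thru=None,
               is_global=False, return_exit_codes=False):
        # Record the arguments so the example can show what was requested.
        self.last_kwargs = {"user": user, "hist": hist,
                            "return_exit_codes": return_exit_codes}
        return self._runs, self._message


def recent_runs(service, user, days=1.5):
    """Return runs for `user`, limiting the history search to `days` days."""
    runs, message = service.report(user=user, hist=days, return_exit_codes=True)
    if message:
        # The message is meant for display, e.g. pointers to WMS commands.
        print(message)
    return runs


stub = StubService(runs=["run-a", "run-b"], message="See WMS docs for details.")
print(len(recent_runs(stub, "alice")))
```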

restart(wms_workflow_id)

Restart a failed DAGMan workflow.

Parameters:
wms_workflow_id : str

The directory with HTCondor files.

Returns:
run_id : str

HTCondor id of the restarted DAGMan job. If restart failed, it will be set to None.

run_name : str

Name of the restarted workflow. If restart failed, it will be set to None.

message : str

A message describing any issues encountered during the restart. If there were no issues, an empty string is returned.
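Since a failed restart is signalled by None values rather than an exception, a caller has to check the result explicitly. A minimal sketch, assuming a hypothetical helper `restart_run`, a `StubService` stand-in, and a made-up submit directory:

```python
class StubService:
    """Hypothetical stand-in that mimics the restart() return shape."""

    def __init__(self, result):
        self._result = result

    def restart(self, wms_workflow_id):
        return self._result


def restart_run(service, submit_dir):
    """Restart a workflow and raise if no new run id was returned."""
    run_id, run_name, message = service.restart(submit_dir)
    if run_id is None:
        raise RuntimeError(f"Restart of {submit_dir} failed: {message}")
    if message:
        print(f"Restart reported issues: {message}")
    return run_id, run_name


ok = StubService(("5678.0", "example_run", ""))
print(restart_run(ok, "/path/to/submit/dir"))
```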

run_submission_checks()

Checks to run at start when running WMS-specific submission steps.

Any exception other than NotImplementedError will halt submission. Submit directory may not yet exist when this is called.
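The exception rule above can be sketched from the caller's side as follows. This is an illustration under stated assumptions, not the actual bps submission code: `run_checks` and the three stub classes are hypothetical.

```python
class NoChecks:
    """Hypothetical plugin that defines no submission checks."""

    def run_submission_checks(self):
        raise NotImplementedError


class PassingChecks:
    """Hypothetical plugin whose checks succeed."""

    def run_submission_checks(self):
        return None


class FailingChecks:
    """Hypothetical plugin whose checks find a problem."""

    def run_submission_checks(self):
        raise RuntimeError("scheduler not available")


def run_checks(service):
    """Return True if checks ran, False if the plugin defines none.

    Any other exception propagates to the caller and halts submission.
    """
    try:
        service.run_submission_checks()
    except NotImplementedError:
        return False
    return True


print(run_checks(NoChecks()), run_checks(PassingChecks()))
```

Because the submit directory may not exist yet, checks of this kind should avoid reading from or writing to it.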

submit(workflow)

Submit a single HTCondor workflow.

Parameters:
workflow : lsst.ctrl.bps.BaseWorkflow

A single HTCondor workflow to submit. run_id is updated after successful submission to WMS.