PreExecInit

class lsst.ctrl.mpexec.PreExecInit(butler, taskFactory, extendRun=False)

Bases: object

Initialization of registry for QuantumGraph execution.

This class encapsulates all operations that must be performed on the butler and registry to prepare them for QuantumGraph execution.

Parameters:
butler : Butler

Data butler instance.

taskFactory : TaskFactory

Task factory.

extendRun : bool, optional

If True then do not try to overwrite any datasets that might exist in butler.run; instead compare them when appropriate/possible. If False, then any existing conflicting dataset will cause a butler exception to be raised.

Methods Summary

initialize(graph[, saveInitOutputs, …]) Perform all initialization steps.
initializeDatasetTypes(graph[, …]) Save or check DatasetTypes output by the tasks in a graph.
saveConfigs(graph) Write configurations for pipeline tasks to butler or check that existing configurations are equal to the new ones.
saveInitOutputs(graph) Write any datasets produced by initializing tasks in a graph.
savePackageVersions(graph) Write versions of software packages to butler.

Methods Documentation

initialize(graph, saveInitOutputs=True, registerDatasetTypes=False, saveVersions=True)

Perform all initialization steps.

Convenience method that executes all initialization steps. Instead of calling this method with all of its options, the individual methods can also be called directly.

Parameters:
graph : QuantumGraph

Execution graph.

saveInitOutputs : bool, optional

If True (default) then save “init outputs”, configurations, and package versions to butler.

registerDatasetTypes : bool, optional

If True then register dataset types in registry, otherwise they must be already registered.

saveVersions : bool, optional

If False then do not save package versions even if saveInitOutputs is set to True.
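As a rough illustration of how the options above relate to the individual methods, the sketch below calls the documented methods one by one on any object that provides them. The helper name run_init_steps, the RecordingStandIn class, and the call order are assumptions made for illustration; only the method names and keyword arguments are taken from this page.

```python
def run_init_steps(pre_exec_init, graph, saveInitOutputs=True,
                   registerDatasetTypes=False, saveVersions=True):
    """Call the individual PreExecInit methods in place of initialize().

    Hypothetical helper: the ordering of the calls is an assumption.
    """
    # Dataset types are registered or checked regardless of the save flags.
    pre_exec_init.initializeDatasetTypes(
        graph, registerDatasetTypes=registerDatasetTypes
    )
    if saveInitOutputs:
        pre_exec_init.saveInitOutputs(graph)
        pre_exec_init.saveConfigs(graph)
        # saveVersions only has an effect when saveInitOutputs is True.
        if saveVersions:
            pre_exec_init.savePackageVersions(graph)


class RecordingStandIn:
    """Stand-in for PreExecInit that only records which methods are called."""

    def __init__(self):
        self.calls = []

    def initializeDatasetTypes(self, graph, registerDatasetTypes=False):
        self.calls.append("initializeDatasetTypes")

    def saveInitOutputs(self, graph):
        self.calls.append("saveInitOutputs")

    def saveConfigs(self, graph):
        self.calls.append("saveConfigs")

    def savePackageVersions(self, graph):
        self.calls.append("savePackageVersions")


# No real graph is needed because the stand-in ignores it.
stand_in = RecordingStandIn()
run_init_steps(stand_in, graph=None, saveVersions=False)
print(stand_in.calls)
```

With a real PreExecInit instance and QuantumGraph, the same helper could be called in place of the stand-in; the stand-in only makes the sketch runnable without a butler repository.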

initializeDatasetTypes(graph, registerDatasetTypes=False)

Save or check DatasetTypes output by the tasks in a graph.

Iterates over all DatasetTypes for all tasks in a graph and either tries to add them to registry or compares them to existing ones.

Parameters:
graph : QuantumGraph

Execution graph.

registerDatasetTypes : bool, optional

If True then register dataset types in registry, otherwise they must be already registered.

Raises:
ValueError

Raised if existing DatasetType is different from DatasetType in a graph.

KeyError

Raised if registerDatasetTypes is False and DatasetType does not exist in registry.
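The register-or-check behaviour and the two exceptions above can be mimicked with a plain dictionary standing in for the registry. This is only a minimal sketch: check_or_register, the dictionary, and the tuple-based definitions are hypothetical and do not reflect the real registry API.

```python
def check_or_register(registry, name, definition, registerDatasetTypes=False):
    """Register a dataset type definition or compare it with the stored one.

    ``registry`` is a plain dict used as a stand-in, not a butler registry.
    """
    if name in registry:
        # An existing definition must match the one coming from the graph.
        if registry[name] != definition:
            raise ValueError(f"Dataset type {name!r} differs from the registered one")
        return
    if not registerDatasetTypes:
        # Without registration, a missing dataset type is an error.
        raise KeyError(f"Dataset type {name!r} is not registered")
    registry[name] = definition


# Illustrative values only; names and dimensions are made up for the example.
registry = {"typeA": ("instrument", "visit")}
check_or_register(registry, "typeA", ("instrument", "visit"))  # matches
check_or_register(registry, "typeB", ("instrument",), registerDatasetTypes=True)
print(sorted(registry))
```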

saveConfigs(graph)

Write configurations for pipeline tasks to butler or check that existing configurations are equal to the new ones.

Parameters:
graph : QuantumGraph

Execution graph.

Raises:
TypeError

Raised if extendRun is True but existing object in butler is different from new data.

Exception

Raised if extendRun is False and datasets already exist. Content of a butler collection should not be changed if exception is raised.
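One way to picture the "write or compare, and leave the collection unchanged on failure" contract is to run every check before the first write. The sketch below does this with a dictionary in place of a butler run. The function save_all_or_nothing, the use of RuntimeError, and the validate-first strategy are assumptions chosen for illustration; the class itself may achieve the same guarantee by other means.

```python
def save_all_or_nothing(store, new_items, extendRun=False):
    """Write ``new_items`` into ``store``, a dict stand-in for a butler run.

    All checks happen before the first write, so a raised exception
    leaves ``store`` untouched.
    """
    to_write = {}
    for name, value in new_items.items():
        if name not in store:
            to_write[name] = value
        elif not extendRun:
            raise RuntimeError(f"Dataset {name!r} already exists")
        elif store[name] != value:
            raise TypeError(f"Existing {name!r} differs from the new data")
    # Reached only if every item passed its check.
    store.update(to_write)


# Hypothetical configuration names and contents.
store = {"taskA_config": {"threshold": 5}}
save_all_or_nothing(
    store,
    {"taskA_config": {"threshold": 5}, "taskB_config": {"order": 2}},
    extendRun=True,
)
print(sorted(store))
```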

saveInitOutputs(graph)

Write any datasets produced by initializing tasks in a graph.

Parameters:
graph : QuantumGraph

Execution graph.

Raises:
TypeError

Raised if extendRun is True but type of existing object in butler is different from new data.

Exception

Raised if extendRun is False and datasets already exist. Content of a butler collection may be changed if exception is raised.

Notes

If extendRun is True then existing datasets are not overwritten; instead, their stored objects should be checked to be exactly the same as what would be saved now. Comparing arbitrary types of objects is non-trivial, so the current implementation only checks that the datasets exist and that their types match the types of the objects produced by tasks. Ideally the object data would be compared too, but presently there is no generic way to compare objects. An extensible mechanism for that could be introduced in the future.
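The type-only check described in these notes might look roughly like the following. The function name check_existing_type is hypothetical, and comparing exact types (rather than using isinstance) is an assumption; the point is only that objects of the same type pass even when their contents differ.

```python
def check_existing_type(existing, new):
    """Raise TypeError if ``existing`` and ``new`` are not of the same type.

    Only the types are compared; object contents are deliberately ignored.
    """
    if type(existing) is not type(new):
        raise TypeError(
            f"Stored object has type {type(existing).__name__}, "
            f"but the task produced {type(new).__name__}"
        )


# Same type, different data: accepted by a type-only check.
check_existing_type({"a": 1}, {"a": 2})

# Different types: rejected.
try:
    check_existing_type({"a": 1}, [1])
except TypeError as exc:
    message = str(exc)
    print(message)
```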

savePackageVersions(graph)

Write versions of software packages to butler.

Parameters:
graph : QuantumGraph

Execution graph.

Raises:
TypeError

Raised if extendRun is True but existing object in butler is different from new data.