PipelineTaskConnections

class lsst.pipe.base.PipelineTaskConnections(*, config: PipelineTaskConfig = None)

Bases: object

PipelineTaskConnections is a class used to declare desired IO when a PipelineTask is run by an activator

Parameters
configPipelineTaskConfig

A PipelineTaskConfig class instance whose class has been configured to use this PipelineTaskConnectionsClass

Notes

PipelineTaskConnection classes are created by declaring class attributes of types defined in lsst.pipe.base.connectionTypes and are listed as follows:

  • InitInput - Defines connections in a quantum graph which are used as inputs to the __init__ function of the PipelineTask corresponding to this class

  • InitOuput - Defines connections in a quantum graph which are to be persisted using a butler at the end of the __init__ function of the PipelineTask corresponding to this class. The variable name used to define this connection should be the same as an attribute name on the PipelineTask instance. E.g. if an InitOutput is declared with the name outputSchema in a PipelineTaskConnections class, then a PipelineTask instance should have an attribute self.outputSchema defined. Its value is what will be saved by the activator framework.

  • PrerequisiteInput - An input connection type that defines a lsst.daf.butler.DatasetType that must be present at execution time, but that will not be used during the course of creating the quantum graph to be executed. These most often are things produced outside the processing pipeline, such as reference catalogs.

  • Input - Input lsst.daf.butler.DatasetType objects that will be used in the run method of a PipelineTask. The name used to declare class attribute must match a function argument name in the run method of a PipelineTask. E.g. If the PipelineTaskConnections defines an Input with the name calexp, then the corresponding signature should be PipelineTask.run(calexp, ...)

  • Output - A lsst.daf.butler.DatasetType that will be produced by an execution of a PipelineTask. The name used to declare the connection must correspond to an attribute of a Struct that is returned by a PipelineTask run method. E.g. if an output connection is defined with the name measCat, then the corresponding PipelineTask.run method must return Struct(measCat=X,..) where X matches the storageClass type defined on the output connection.

The process of declaring a PipelineTaskConnection class involves parameters passed in the declaration statement.

The first parameter is dimensions which is an iterable of strings which defines the unit of processing the run method of a corresponding PipelineTask will operate on. These dimensions must match dimensions that exist in the butler registry which will be used in executing the corresponding PipelineTask.

The second parameter is labeled defaultTemplates and is conditionally optional. The name attributes of connections can be specified as python format strings, with named format arguments. If any of the name parameters on connections defined in a PipelineTaskConnections class contain a template, then a default template value must be specified in the defaultTemplates argument. This is done by passing a dictionary with keys corresponding to a template identifier, and values corresponding to the value to use as a default when formatting the string. For example if ConnectionClass.calexp.name = '{input}Coadd_calexp' then defaultTemplates = {‘input’: ‘deep’}.

Once a PipelineTaskConnections class is created, it is used in the creation of a PipelineTaskConfig. This is further documented in the documentation of PipelineTaskConfig. For the purposes of this documentation, the relevant information is that the config class allows configuration of connection names by users when running a pipeline.

Instances of a PipelineTaskConnections class are used by the pipeline task execution framework to introspect what a corresponding PipelineTask will require, and what it will produce.

Examples

>>> from lsst.pipe.base import connectionTypes as cT
>>> from lsst.pipe.base import PipelineTaskConnections
>>> from lsst.pipe.base import PipelineTaskConfig
>>> class ExampleConnections(PipelineTaskConnections,
...                          dimensions=("A", "B"),
...                          defaultTemplates={"foo": "Example"}):
...     inputConnection = cT.Input(doc="Example input",
...                                dimensions=("A", "B"),
...                                storageClass=Exposure,
...                                name="{foo}Dataset")
...     outputConnection = cT.Output(doc="Example output",
...                                  dimensions=("A", "B"),
...                                  storageClass=Exposure,
...                                  name="{foo}output")
>>> class ExampleConfig(PipelineTaskConfig,
...                     pipelineConnections=ExampleConnections):
...    pass
>>> config = ExampleConfig()
>>> config.connections.foo = Modified
>>> config.connections.outputConnection = "TotallyDifferent"
>>> connections = ExampleConnections(config=config)
>>> assert(connections.inputConnection.name == "ModifiedDataset")
>>> assert(connections.outputConnection.name == "TotallyDifferent")

Attributes Summary

allConnections

initInputs

initOutputs

inputs

outputs

prerequisiteInputs

Methods Summary

adjustQuantum(datasetRefMap)

Override to make adjustments to lsst.daf.butler.DatasetRef objects in the lsst.daf.butler.core.Quantum during the graph generation stage of the activator.

buildDatasetRefs(quantum)

Builds QuantizedConnections corresponding to input Quantum

Attributes Documentation

allConnections = {}
initInputs = frozenset({})
initOutputs = frozenset({})
inputs = frozenset({})
outputs = frozenset({})
prerequisiteInputs = frozenset({})

Methods Documentation

adjustQuantum(datasetRefMap: lsst.daf.butler.NamedKeyDict[lsst.daf.butler.DatasetType, Set[lsst.daf.butler.DatasetRef]])lsst.daf.butler.NamedKeyDict[lsst.daf.butler.DatasetType, Set[lsst.daf.butler.DatasetRef]]

Override to make adjustments to lsst.daf.butler.DatasetRef objects in the lsst.daf.butler.core.Quantum during the graph generation stage of the activator.

The base class implementation simply checks that input connections with multiple set to False have no more than one dataset.

Parameters
datasetRefMapNamedKeyDict

Mapping from dataset type to a set of lsst.daf.butler.DatasetRef objects

Returns
datasetRefMapNamedKeyDict

Modified mapping of input with possibly adjusted lsst.daf.butler.DatasetRef objects.

Raises
ScalarError

Raised if any Input or PrerequisiteInput connection has multiple set to False, but multiple datasets.

Exception

Overrides of this function have the option of raising an Exception if a field in the input does not satisfy a need for a corresponding pipelineTask, i.e. no reference catalogs are found.

buildDatasetRefs(quantum: lsst.daf.butler.Quantum)Tuple[lsst.pipe.base.InputQuantizedConnection, lsst.pipe.base.OutputQuantizedConnection]

Builds QuantizedConnections corresponding to input Quantum

Parameters
quantumlsst.daf.butler.Quantum

Quantum object which defines the inputs and outputs for a given unit of processing

Returns
retValtuple of (InputQuantizedConnection,

OutputQuantizedConnection) Namespaces mapping attribute names (identifiers of connections) to butler references defined in the input lsst.daf.butler.Quantum