DiaPipelineConnections

class lsst.ap.association.DiaPipelineConnections(*, config: PipelineTaskConfig | None = None)

Bases: PipelineTaskConnections

Butler connections for DiaPipelineTask.

Attributes Summary

allConnections

Mapping holding all connection attributes.

apdbMarker

Connection for output dataset.

associatedDiaSources

Connection for output dataset.

defaultTemplates

deprecatedTemplates

diaForcedSources

Connection for output dataset.

diaObjects

Connection for output dataset.

diaSourceTable

Class used for declaring PipelineTask input connections

diffIm

Class used for declaring PipelineTask input connections

dimensions

Set of dimension names that define the unit of work for this task.

exposure

Class used for declaring PipelineTask input connections

initInputs

Set with the names of all InitInput connection attributes.

initOutputs

Set with the names of all InitOutput connection attributes.

inputs

Set with the names of all connectionTypes.Input connection attributes.

longTrailedSources

Connection for output dataset.

outputs

Set with the names of all Output connection attributes.

prerequisiteInputs

Set with the names of all PrerequisiteInput connection attributes.

solarSystemObjectTable

Class used for declaring PipelineTask input connections

template

Class used for declaring PipelineTask input connections

Methods Summary

adjustQuantum(inputs, outputs, label, dataId)

Override to make adjustments to lsst.daf.butler.DatasetRef objects in the lsst.daf.butler.core.Quantum during the graph generation stage of the activator.

buildDatasetRefs(quantum)

Build QuantizedConnection corresponding to input Quantum.

getSpatialBoundsConnections()

Return the names of regular input and output connections whose data IDs should be used to compute the spatial bounds of this task's quanta.

getTemporalBoundsConnections()

Return the names of regular input and output connections whose data IDs should be used to compute the temporal bounds of this task's quanta.

Attributes Documentation

allConnections: Mapping[str, BaseConnection] = {'apdbMarker': Output(name='apdb_marker', storageClass='Config', doc='Marker dataset storing the configuration of the Apdb for each visit/detector. Used to signal the completion of the pipeline.', multiple=False, deprecated=None, _deprecation_context='', dimensions=('instrument', 'visit', 'detector'), isCalibration=False), 'associatedDiaSources': Output(name='{fakesType}{coaddName}Diff_assocDiaSrc', storageClass='DataFrame', doc='Optional output storing the DiaSource catalog after matching, calibration, and standardization for insertion into the Apdb.', multiple=False, deprecated=None, _deprecation_context='', dimensions=('instrument', 'visit', 'detector'), isCalibration=False), 'diaForcedSources': Output(name='{fakesType}{coaddName}Diff_diaForcedSrc', storageClass='DataFrame', doc='Optional output storing the forced sources computed at the diaObject positions.', multiple=False, deprecated=None, _deprecation_context='', dimensions=('instrument', 'visit', 'detector'), isCalibration=False), 'diaObjects': Output(name='{fakesType}{coaddName}Diff_diaObject', storageClass='DataFrame', doc='Optional output storing the updated diaObjects associated to these sources.', multiple=False, deprecated=None, _deprecation_context='', dimensions=('instrument', 'visit', 'detector'), isCalibration=False), 'diaSourceTable': Input(name='{fakesType}{coaddName}Diff_diaSrcTable', storageClass='DataFrame', doc='Catalog of calibrated DiaSources.', multiple=False, deprecated=None, _deprecation_context='', dimensions=('instrument', 'visit', 'detector'), isCalibration=False, deferLoad=False, minimum=1, deferGraphConstraint=False), 'diffIm': Input(name='{fakesType}{coaddName}Diff_differenceExp', storageClass='ExposureF', doc='Difference image on which the DiaSources were detected.', multiple=False, deprecated=None, _deprecation_context='', dimensions=('instrument', 'visit', 'detector'), isCalibration=False, deferLoad=False, minimum=1, deferGraphConstraint=False), 'exposure': Input(name='{fakesType}calexp', storageClass='ExposureF', doc='Calibrated exposure differenced with a template image during image differencing.', multiple=False, deprecated=None, _deprecation_context='', dimensions=('instrument', 'visit', 'detector'), isCalibration=False, deferLoad=False, minimum=1, deferGraphConstraint=False), 'longTrailedSources': Output(name='{fakesType}{coaddName}Diff_longTrailedSrc', storageClass='DataFrame', doc='Optional output temporarily storing long trailed diaSources.', multiple=False, deprecated=None, _deprecation_context='', dimensions=('instrument', 'visit', 'detector'), isCalibration=False), 'solarSystemObjectTable': Input(name='visitSsObjects', storageClass='DataFrame', doc='Catalog of SolarSolarSystem objects expected to be observable in this detectorVisit.', multiple=False, deprecated=None, _deprecation_context='', dimensions=('instrument', 'visit'), isCalibration=False, deferLoad=False, minimum=1, deferGraphConstraint=False), 'template': Input(name='{fakesType}{coaddName}Diff_templateExp', storageClass='ExposureF', doc='Warped template used to create `subtractedExposure`. Not PSF matched.', multiple=False, deprecated=None, _deprecation_context='', dimensions=('instrument', 'visit', 'detector'), isCalibration=False, deferLoad=False, minimum=1, deferGraphConstraint=False)}

Mapping holding all connection attributes.

This is a read-only view that is automatically updated when connection attributes are added, removed, or replaced in __init__. It is also updated after __init__ completes to reflect changes in inputs, prerequisiteInputs, outputs, initInputs, and initOutputs.

apdbMarker

Connection for output dataset.

associatedDiaSources

Connection for output dataset.

defaultTemplates = {'coaddName': 'deep', 'fakesType': ''}
deprecatedTemplates = {}
diaForcedSources

Connection for output dataset.

diaObjects

Connection for output dataset.

diaSourceTable

Class used for declaring PipelineTask input connections

Parameters:
namestr

The default name used to identify the dataset type

storageClassstr

The storage class used when (un)/persisting the dataset type

multiplebool

Indicates if this connection should expect to contain multiple objects of the given dataset type. Tasks with more than one connection with multiple=True with the same dimensions may want to implement PipelineTaskConnections.adjustQuantum to ensure those datasets are consistent (i.e. zip-iterable) in PipelineTask.runQuantum and notify the execution system as early as possible of outputs that will not be produced because the corresponding input is missing.

dimensionsiterable of str

The lsst.daf.butler.Butler lsst.daf.butler.Registry dimensions used to identify the dataset type identified by the specified name

deferLoadbool

Indicates that this dataset type will be loaded as a lsst.daf.butler.DeferredDatasetHandle. PipelineTasks can use this object to load the object at a later time.

minimumbool

Minimum number of datasets required for this connection, per quantum. This is checked in the base implementation of PipelineTaskConnections.adjustQuantum, which raises NoWorkFound if the minimum is not met for Input connections (causing the quantum to be pruned, skipped, or never created, depending on the context), and FileNotFoundError for PrerequisiteInput connections (causing QuantumGraph generation to fail). PipelineTask implementations may provide custom adjustQuantum implementations for more fine-grained or configuration-driven constraints, as long as they are compatible with this minium.

deferGraphConstraint: `bool`, optional

If True, do not include this dataset type’s existence in the initial query that starts the QuantumGraph generation process. This can be used to make QuantumGraph generation faster by avoiding redundant datasets, and in certain cases it can (along with careful attention to which tasks are included in the same QuantumGraph) be used to work around the QuantumGraph generation algorithm’s inflexible handling of spatial overlaps. This option has no effect when the connection is not an overall input of the pipeline (or subset thereof) for which a graph is being created, and it never affects the ordering of quanta.

Raises:
TypeError

Raised if minimum is greater than one but multiple=False.

NotImplementedError

Raised if minimum is zero for a regular Input connection; this is not currently supported by our QuantumGraph generation algorithm.

diffIm

Class used for declaring PipelineTask input connections

Parameters:
namestr

The default name used to identify the dataset type

storageClassstr

The storage class used when (un)/persisting the dataset type

multiplebool

Indicates if this connection should expect to contain multiple objects of the given dataset type. Tasks with more than one connection with multiple=True with the same dimensions may want to implement PipelineTaskConnections.adjustQuantum to ensure those datasets are consistent (i.e. zip-iterable) in PipelineTask.runQuantum and notify the execution system as early as possible of outputs that will not be produced because the corresponding input is missing.

dimensionsiterable of str

The lsst.daf.butler.Butler lsst.daf.butler.Registry dimensions used to identify the dataset type identified by the specified name

deferLoadbool

Indicates that this dataset type will be loaded as a lsst.daf.butler.DeferredDatasetHandle. PipelineTasks can use this object to load the object at a later time.

minimumbool

Minimum number of datasets required for this connection, per quantum. This is checked in the base implementation of PipelineTaskConnections.adjustQuantum, which raises NoWorkFound if the minimum is not met for Input connections (causing the quantum to be pruned, skipped, or never created, depending on the context), and FileNotFoundError for PrerequisiteInput connections (causing QuantumGraph generation to fail). PipelineTask implementations may provide custom adjustQuantum implementations for more fine-grained or configuration-driven constraints, as long as they are compatible with this minium.

deferGraphConstraint: `bool`, optional

If True, do not include this dataset type’s existence in the initial query that starts the QuantumGraph generation process. This can be used to make QuantumGraph generation faster by avoiding redundant datasets, and in certain cases it can (along with careful attention to which tasks are included in the same QuantumGraph) be used to work around the QuantumGraph generation algorithm’s inflexible handling of spatial overlaps. This option has no effect when the connection is not an overall input of the pipeline (or subset thereof) for which a graph is being created, and it never affects the ordering of quanta.

Raises:
TypeError

Raised if minimum is greater than one but multiple=False.

NotImplementedError

Raised if minimum is zero for a regular Input connection; this is not currently supported by our QuantumGraph generation algorithm.

dimensions: set[str] = {'detector', 'instrument', 'visit'}

Set of dimension names that define the unit of work for this task.

Required and implied dependencies will automatically be expanded later and need not be provided.

This may be replaced or modified in __init__ to change the dimensions of the task. After __init__ it will be a frozenset and may not be replaced.

exposure

Class used for declaring PipelineTask input connections

Parameters:
namestr

The default name used to identify the dataset type

storageClassstr

The storage class used when (un)/persisting the dataset type

multiplebool

Indicates if this connection should expect to contain multiple objects of the given dataset type. Tasks with more than one connection with multiple=True with the same dimensions may want to implement PipelineTaskConnections.adjustQuantum to ensure those datasets are consistent (i.e. zip-iterable) in PipelineTask.runQuantum and notify the execution system as early as possible of outputs that will not be produced because the corresponding input is missing.

dimensionsiterable of str

The lsst.daf.butler.Butler lsst.daf.butler.Registry dimensions used to identify the dataset type identified by the specified name

deferLoadbool

Indicates that this dataset type will be loaded as a lsst.daf.butler.DeferredDatasetHandle. PipelineTasks can use this object to load the object at a later time.

minimumbool

Minimum number of datasets required for this connection, per quantum. This is checked in the base implementation of PipelineTaskConnections.adjustQuantum, which raises NoWorkFound if the minimum is not met for Input connections (causing the quantum to be pruned, skipped, or never created, depending on the context), and FileNotFoundError for PrerequisiteInput connections (causing QuantumGraph generation to fail). PipelineTask implementations may provide custom adjustQuantum implementations for more fine-grained or configuration-driven constraints, as long as they are compatible with this minium.

deferGraphConstraint: `bool`, optional

If True, do not include this dataset type’s existence in the initial query that starts the QuantumGraph generation process. This can be used to make QuantumGraph generation faster by avoiding redundant datasets, and in certain cases it can (along with careful attention to which tasks are included in the same QuantumGraph) be used to work around the QuantumGraph generation algorithm’s inflexible handling of spatial overlaps. This option has no effect when the connection is not an overall input of the pipeline (or subset thereof) for which a graph is being created, and it never affects the ordering of quanta.

Raises:
TypeError

Raised if minimum is greater than one but multiple=False.

NotImplementedError

Raised if minimum is zero for a regular Input connection; this is not currently supported by our QuantumGraph generation algorithm.

initInputs: set[str] = frozenset({})

Set with the names of all InitInput connection attributes.

See inputs for additional information.

initOutputs: set[str] = frozenset({})

Set with the names of all InitOutput connection attributes.

See inputs for additional information.

inputs: set[str] = frozenset({'diaSourceTable', 'diffIm', 'exposure', 'solarSystemObjectTable', 'template'})

Set with the names of all connectionTypes.Input connection attributes.

This is updated automatically as class attributes are added, removed, or replaced in __init__. Removing entries from this set will cause those connections to be removed after __init__ completes, but this is supported only for backwards compatibility; new code should instead just delete the collection attributed directly. After __init__ this will be a frozenset and may not be replaced.

longTrailedSources

Connection for output dataset.

outputs: set[str] = frozenset({'apdbMarker', 'associatedDiaSources', 'diaForcedSources', 'diaObjects', 'longTrailedSources'})

Set with the names of all Output connection attributes.

See inputs for additional information.

prerequisiteInputs: set[str] = frozenset({})

Set with the names of all PrerequisiteInput connection attributes.

See inputs for additional information.

solarSystemObjectTable

Class used for declaring PipelineTask input connections

Parameters:
namestr

The default name used to identify the dataset type

storageClassstr

The storage class used when (un)/persisting the dataset type

multiplebool

Indicates if this connection should expect to contain multiple objects of the given dataset type. Tasks with more than one connection with multiple=True with the same dimensions may want to implement PipelineTaskConnections.adjustQuantum to ensure those datasets are consistent (i.e. zip-iterable) in PipelineTask.runQuantum and notify the execution system as early as possible of outputs that will not be produced because the corresponding input is missing.

dimensionsiterable of str

The lsst.daf.butler.Butler lsst.daf.butler.Registry dimensions used to identify the dataset type identified by the specified name

deferLoadbool

Indicates that this dataset type will be loaded as a lsst.daf.butler.DeferredDatasetHandle. PipelineTasks can use this object to load the object at a later time.

minimumbool

Minimum number of datasets required for this connection, per quantum. This is checked in the base implementation of PipelineTaskConnections.adjustQuantum, which raises NoWorkFound if the minimum is not met for Input connections (causing the quantum to be pruned, skipped, or never created, depending on the context), and FileNotFoundError for PrerequisiteInput connections (causing QuantumGraph generation to fail). PipelineTask implementations may provide custom adjustQuantum implementations for more fine-grained or configuration-driven constraints, as long as they are compatible with this minium.

deferGraphConstraint: `bool`, optional

If True, do not include this dataset type’s existence in the initial query that starts the QuantumGraph generation process. This can be used to make QuantumGraph generation faster by avoiding redundant datasets, and in certain cases it can (along with careful attention to which tasks are included in the same QuantumGraph) be used to work around the QuantumGraph generation algorithm’s inflexible handling of spatial overlaps. This option has no effect when the connection is not an overall input of the pipeline (or subset thereof) for which a graph is being created, and it never affects the ordering of quanta.

Raises:
TypeError

Raised if minimum is greater than one but multiple=False.

NotImplementedError

Raised if minimum is zero for a regular Input connection; this is not currently supported by our QuantumGraph generation algorithm.

template

Class used for declaring PipelineTask input connections

Parameters:
namestr

The default name used to identify the dataset type

storageClassstr

The storage class used when (un)/persisting the dataset type

multiplebool

Indicates if this connection should expect to contain multiple objects of the given dataset type. Tasks with more than one connection with multiple=True with the same dimensions may want to implement PipelineTaskConnections.adjustQuantum to ensure those datasets are consistent (i.e. zip-iterable) in PipelineTask.runQuantum and notify the execution system as early as possible of outputs that will not be produced because the corresponding input is missing.

dimensionsiterable of str

The lsst.daf.butler.Butler lsst.daf.butler.Registry dimensions used to identify the dataset type identified by the specified name

deferLoadbool

Indicates that this dataset type will be loaded as a lsst.daf.butler.DeferredDatasetHandle. PipelineTasks can use this object to load the object at a later time.

minimumbool

Minimum number of datasets required for this connection, per quantum. This is checked in the base implementation of PipelineTaskConnections.adjustQuantum, which raises NoWorkFound if the minimum is not met for Input connections (causing the quantum to be pruned, skipped, or never created, depending on the context), and FileNotFoundError for PrerequisiteInput connections (causing QuantumGraph generation to fail). PipelineTask implementations may provide custom adjustQuantum implementations for more fine-grained or configuration-driven constraints, as long as they are compatible with this minium.

deferGraphConstraint: `bool`, optional

If True, do not include this dataset type’s existence in the initial query that starts the QuantumGraph generation process. This can be used to make QuantumGraph generation faster by avoiding redundant datasets, and in certain cases it can (along with careful attention to which tasks are included in the same QuantumGraph) be used to work around the QuantumGraph generation algorithm’s inflexible handling of spatial overlaps. This option has no effect when the connection is not an overall input of the pipeline (or subset thereof) for which a graph is being created, and it never affects the ordering of quanta.

Raises:
TypeError

Raised if minimum is greater than one but multiple=False.

NotImplementedError

Raised if minimum is zero for a regular Input connection; this is not currently supported by our QuantumGraph generation algorithm.

Methods Documentation

adjustQuantum(inputs, outputs, label, dataId)

Override to make adjustments to lsst.daf.butler.DatasetRef objects in the lsst.daf.butler.core.Quantum during the graph generation stage of the activator.

This implementation checks to make sure that the filters in the dataset are compatible with AP processing as set by the Apdb/DPDD schema.

Parameters:
inputsdict

Dictionary whose keys are an input (regular or prerequisite) connection name and whose values are a tuple of the connection instance and a collection of associated DatasetRef objects. The exact type of the nested collections is unspecified; it can be assumed to be multi-pass iterable and support len and in, but it should not be mutated in place. In contrast, the outer dictionaries are guaranteed to be temporary copies that are true dict instances, and hence may be modified and even returned; this is especially useful for delegating to super (see notes below).

outputsdict

Dict of output datasets, with the same structure as inputs.

labelstr

Label for this task in the pipeline (should be used in all diagnostic messages).

data_idlsst.daf.butler.DataCoordinate

Data ID for this quantum in the pipeline (should be used in all diagnostic messages).

Returns:
adjusted_inputsdict

Dict of the same form as inputs with updated containers of input DatasetRef objects. Connections that are not changed should not be returned at all. Datasets may only be removed, not added. Nested collections may be of any multi-pass iterable type, and the order of iteration will set the order of iteration within PipelineTask.runQuantum.

adjusted_outputsdict

Dict of updated output datasets, with the same structure and interpretation as adjusted_inputs.

Raises:
ScalarError

Raised if any Input or PrerequisiteInput connection has multiple set to False, but multiple datasets.

NoWorkFound

Raised to indicate that this quantum should not be run; not enough datasets were found for a regular Input connection, and the quantum should be pruned or skipped.

FileNotFoundError

Raised to cause QuantumGraph generation to fail (with the message included in this exception); not enough datasets were found for a PrerequisiteInput connection.

buildDatasetRefs(quantum: Quantum) tuple[lsst.pipe.base.connections.InputQuantizedConnection, lsst.pipe.base.connections.OutputQuantizedConnection]

Build QuantizedConnection corresponding to input Quantum.

Parameters:
quantumlsst.daf.butler.Quantum

Quantum object which defines the inputs and outputs for a given unit of processing.

Returns:
retValtuple of (InputQuantizedConnection,

OutputQuantizedConnection) Namespaces mapping attribute names (identifiers of connections) to butler references defined in the input lsst.daf.butler.Quantum.

getSpatialBoundsConnections() Iterable[str]

Return the names of regular input and output connections whose data IDs should be used to compute the spatial bounds of this task’s quanta.

The spatial bound for a quantum is defined as the union of the regions of all data IDs of all connections returned here, along with the region of the quantum data ID (if the task has spatial dimensions).

Returns:
connection_namescollections.abc.Iterable [ str ]

Names of collections with spatial dimensions. These are the task-internal connection names, not butler dataset type names.

Notes

The spatial bound is used to search for prerequisite inputs that have skypix dimensions. The default implementation returns an empty iterable, which is usually sufficient for tasks with spatial dimensions, but if a task’s inputs or outputs are associated with spatial regions that extend beyond the quantum data ID’s region, this method may need to be overridden to expand the set of prerequisite inputs found.

Tasks that do not have spatial dimensions that have skypix prerequisite inputs should always override this method, as the default spatial bounds otherwise cover the full sky.

getTemporalBoundsConnections() Iterable[str]

Return the names of regular input and output connections whose data IDs should be used to compute the temporal bounds of this task’s quanta.

The temporal bound for a quantum is defined as the union of the timespans of all data IDs of all connections returned here, along with the timespan of the quantum data ID (if the task has temporal dimensions).

Returns:
connection_namescollections.abc.Iterable [ str ]

Names of collections with temporal dimensions. These are the task-internal connection names, not butler dataset type names.

Notes

The temporal bound is used to search for prerequisite inputs that are calibration datasets. The default implementation returns an empty iterable, which is usually sufficient for tasks with temporal dimensions, but if a task’s inputs or outputs are associated with timespans that extend beyond the quantum data ID’s timespan, this method may need to be overridden to expand the set of prerequisite inputs found.

Tasks that do not have temporal dimensions that do not implement this method will use an infinite timespan for any calibration lookups.