ResourceUsageQuantumGraphBuilder

class lsst.analysis.tools.tasks.ResourceUsageQuantumGraphBuilder(butler: Butler, *, dataset_type_names: Iterable[str] | None = None, where: str = '', input_collections: Sequence[str] | None = None, output_run: str | None = None, skip_existing_in: Sequence[str] = (), clobber: bool = False)

Bases: QuantumGraphBuilder

Custom quantum graph generator and pipeline builder for resource usage summary tasks.

Parameters:

butler (lsst.daf.butler.Butler): Butler client to query for inputs and dataset types.
dataset_type_names (Iterable[str], optional): Iterable of dataset type names or shell-style glob patterns for the metadata datasets to be used as input. Default is all datasets ending with _metadata (other than the resource-usage summary tasks’ own metadata outputs, which are always ignored). A gather-resource task with a single quantum is created for each matching metadata dataset.
where (str, optional): Data ID expression that constrains the input metadata datasets.
input_collections (Sequence[str], optional): Sequence of collections to search for inputs. If not provided, butler.collections is used and must not be empty.
output_run (str, optional): Output RUN collection name. If not provided, butler.run is used and must not be None.
skip_existing_in (Sequence[str], optional): Sequence of collections to search for outputs, allowing quanta whose outputs exist to be skipped.
clobber (bool, optional): Whether execution of this quantum graph will permit clobbering. If False (default), existing outputs in output_run are an error unless skip_existing_in will cause those quanta to be skipped.
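
The selection rule described for dataset_type_names can be illustrated with a minimal sketch. It is not the builder's implementation: the helper select_metadata_dataset_types, its excluded argument, and every dataset type name below are hypothetical, and it assumes that the standard library's fnmatch semantics are a fair stand-in for "shell-style glob patterns".

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional


def select_metadata_dataset_types(
    registered: Iterable[str],
    patterns: Optional[Iterable[str]] = None,
    excluded: Iterable[str] = (),
) -> List[str]:
    """Return the registered dataset type names matched by the patterns.

    Hypothetical helper for illustration only. ``excluded`` stands in for
    the resource-usage tasks' own metadata outputs, which the documentation
    says are always ignored.
    """
    # Assumed default, mirroring "all datasets ending with _metadata".
    pattern_list = ["*_metadata"] if patterns is None else list(patterns)
    excluded_set = set(excluded)
    selected = []
    for name in registered:
        if name in excluded_set:
            continue  # never gather the gather tasks' own metadata
        if any(fnmatchcase(name, pattern) for pattern in pattern_list):
            selected.append(name)
    return sorted(selected)


# Example names are made up; one gather task would be created per result.
names = ["isr_metadata", "calibrate_metadata", "calexp", "exampleGather_metadata"]
print(select_metadata_dataset_types(names, excluded={"exampleGather_metadata"}))
print(select_metadata_dataset_types(names, patterns=["isr_*"]))
```

Passing an explicit pattern list narrows the selection, while leaving it as None falls back to the _metadata suffix rule.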

Notes

The resource usage summary tasks cannot easily be added to a regular pipeline, because it is much more natural to have the gather tasks run automatically on all other tasks. In addition, a quantum graph for these particular tasks can be generated much more efficiently than the general-purpose algorithm could manage.

Attributes Summary

`universe`	Definitions of all data dimensions.

Methods Summary

`build`([metadata, attach_datastore_records])	Build the quantum graph, returning an old `QuantumGraph` instance.
`finish`([output, metadata, ...])	Return quantum graph components that can be used to save or construct a `PredictedQuantumGraph` instance.
`main`()	Run the command-line interface for this quantum-graph builder.
`make_argument_parser`()	Make the argument parser for the command-line interface.
`process_subgraph`(subgraph)	Build the rough structure for an independent subset of the `QuantumGraph` and query for relevant existing datasets.

Attributes Documentation

universe: Definitions of all data dimensions.

Methods Documentation

build(metadata: Mapping[str, Any] | None = None, attach_datastore_records: bool = True) → QuantumGraph

Build the quantum graph, returning an old QuantumGraph instance.

Parameters:

metadata (Mapping, optional): Flexible metadata to add to the quantum graph.
attach_datastore_records (bool, optional): Whether to include datastore records in the graph. Required for lsst.daf.butler.QuantumBackedButler execution.

Returns:

quantum_graph (QuantumGraph): DAG describing processing to be performed.

Notes

External code is expected to construct a QuantumGraphBuilder and then call this method exactly once. See class documentation for details on what it does.
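
The construct-then-build-once pattern described above might look like the following sketch. The wrapper function build_resource_usage_graph and the metadata key are hypothetical. The constructor and build arguments are taken from the signatures documented on this page. The butler argument is assumed to be an already-configured lsst.daf.butler.Butler, and the import is deferred so that the module loads even where the LSST stack is not installed.

```python
from typing import Any, Optional


def build_resource_usage_graph(
    butler: Any,
    *,
    where: str = "",
    output_run: Optional[str] = None,
) -> Any:
    """Construct the builder and call ``build`` exactly once.

    Hypothetical wrapper for illustration; ``butler`` is assumed to be an
    ``lsst.daf.butler.Butler`` with usable default collections.
    """
    # Deferred import: requires the LSST stack only when actually called.
    from lsst.analysis.tools.tasks import ResourceUsageQuantumGraphBuilder

    builder = ResourceUsageQuantumGraphBuilder(
        butler,
        dataset_type_names=["*_metadata"],  # same effect as the documented default
        where=where,
        output_run=output_run,  # None falls back to butler.run
    )
    # The metadata key below is made up; any flexible mapping is accepted.
    return builder.build(
        metadata={"purpose": "resource usage summary"},
        attach_datastore_records=True,
    )
```

Because a builder is meant to be used once, a new instance should be constructed for each graph rather than calling build repeatedly on the same object.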

finish(output: str | None = None, metadata: Mapping[str, Any] | None = None, attach_datastore_records: bool = True) → PredictedQuantumGraphComponents

Return quantum graph components that can be used to save or construct a PredictedQuantumGraph instance.

Parameters:

output (str or None, optional): Output CHAINED collection that combines the input and output collections.
metadata (Mapping, optional): Mapping of JSON-friendly metadata. Collection information, the current user, and the current timestamp are automatically included.
attach_datastore_records (bool, optional): Whether to include datastore records for overall inputs for QuantumBackedButler.

Returns:

components (quantum_graph.PredictedQuantumGraphComponents): Components that can be used to construct a graph object and/or save it to disk.
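
Since the metadata argument is expected to be JSON-friendly, it can help to check a mapping before handing it to finish. The sketch below uses only the standard library. The helper check_json_friendly is hypothetical, and how the builder itself validates or serializes metadata is not stated on this page.

```python
import json
from typing import Any, Dict, Mapping


def check_json_friendly(metadata: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a JSON round-tripped copy of ``metadata``.

    Hypothetical helper: raises ValueError if the mapping contains values
    that the standard ``json`` module cannot serialize.
    """
    try:
        # Round-tripping also normalizes tuples to lists.
        return json.loads(json.dumps(dict(metadata)))
    except (TypeError, ValueError) as err:
        raise ValueError(f"metadata is not JSON-friendly: {err}") from err


# Keys and values here are made up for illustration.
clean = check_json_friendly({"campaign": "example", "attempt": 3})
print(clean)
```

The returned dictionary could then be passed as the metadata argument of finish. User, timestamp, and collection information need not be added by hand, since the documentation states they are included automatically.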

classmethod main() → None

Run the command-line interface for this quantum-graph builder.

This function provides the implementation for the build-gather-resource-usage-qg script.

classmethod make_argument_parser() → ArgumentParser

Make the argument parser for the command-line interface.

process_subgraph(subgraph: PipelineGraph) → QuantumGraphSkeleton

Build the rough structure for an independent subset of the QuantumGraph and query for relevant existing datasets.

Parameters:

subgraph (pipeline_graph.PipelineGraph): Subset of the pipeline graph that should be processed by this call. This is always resolved and topologically sorted. It should not be modified.

Returns:

skeleton (quantum_graph_skeleton.QuantumGraphSkeleton): Class representing an initial quantum graph. See quantum_graph_skeleton.QuantumGraphSkeleton docs for details. After this is returned, the object may be modified in-place in unspecified ways.

Notes

The quantum_graph_skeleton.QuantumGraphSkeleton should associate lsst.daf.butler.DatasetRef objects with nodes for existing datasets. In particular:

quantum_graph_skeleton.QuantumGraphSkeleton.set_dataset_ref must be used to associate existing datasets with all overall-input dataset nodes in the skeleton by querying input_collections. This includes all standard input nodes and any prerequisite nodes added by the method (prerequisite nodes may also be left out entirely, as the base class can add them later, albeit possibly less efficiently).
quantum_graph_skeleton.QuantumGraphSkeleton.set_output_for_skip must be used to associate existing datasets with output dataset nodes by querying skip_existing_in.
quantum_graph_skeleton.QuantumGraphSkeleton.add_output_in_the_way must be used to associate existing outputs with output dataset nodes by querying output_run if output_run_exists is True. Note that the presence of such datasets is not automatically an error, even if clobber is False, as these may be quanta that will be skipped.

lsst.daf.butler.DatasetRef objects for existing datasets with empty data IDs in all of the above categories may be found in the empty_dimensions_datasets attribute, as these are queried for prior to this call by the base class, but associating them with graph nodes is still this method’s responsibility.

Dataset types should never be components and should always use the “common” storage class definition in pipeline_graph.DatasetTypeNode (which is the data repository definition when the dataset type is registered).
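
The interaction between outputs found for skipping, outputs found in the way, and the clobber flag can be modeled with a small toy. It does not use the real skeleton API: ToyQuantumOutputs and classify_quantum are hypothetical. The rule that a quantum is skipped when all of its outputs were found in skip_existing_in is a simplifying assumption, and the base class may apply a different criterion.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ToyQuantumOutputs:
    """Toy stand-in for the output dataset nodes of one quantum."""

    outputs: Set[str]
    # Outputs found by querying skip_existing_in (cf. set_output_for_skip).
    for_skip: Set[str] = field(default_factory=set)
    # Outputs found by querying output_run (cf. add_output_in_the_way).
    in_the_way: Set[str] = field(default_factory=set)


def classify_quantum(quantum: ToyQuantumOutputs, clobber: bool) -> str:
    """Return "skip", "clobber", or "run" for one toy quantum.

    Assumption for illustration: a quantum is skipped when every output
    already exists in the skip_existing_in collections.
    """
    if quantum.outputs and quantum.outputs <= quantum.for_skip:
        # Skipped quanta make any outputs in the way harmless.
        return "skip"
    if quantum.in_the_way:
        if clobber:
            return "clobber"
        raise FileExistsError(
            f"outputs already exist in output_run: {sorted(quantum.in_the_way)}"
        )
    return "run"


# Outputs in the way are not an error when the quantum will be skipped.
skipped = ToyQuantumOutputs({"a", "b"}, for_skip={"a", "b"}, in_the_way={"a"})
print(classify_quantum(skipped, clobber=False))
```

The toy checks skipping before it considers outputs in the way, which is why outputs found in output_run do not raise an error there even with clobber set to False.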
