ResourceUsageQuantumGraphBuilder

- class lsst.analysis.tools.tasks.ResourceUsageQuantumGraphBuilder(butler: Butler, *, dataset_type_names: Iterable[str] | None = None, where: str = '', input_collections: Sequence[str] | None = None, output_run: str | None = None, skip_existing_in: Sequence[str] = (), clobber: bool = False)
- Bases: QuantumGraphBuilder

Custom quantum graph generator and pipeline builder for resource usage summary tasks (a construction sketch follows the parameter list below).

- Parameters:
- butler : lsst.daf.butler.Butler
- Butler client to query for inputs and dataset types. 
- dataset_type_names : Iterable[str], optional
- Iterable of dataset type names or shell-style glob patterns for the metadata datasets to be used as input. Default is all datasets ending with _metadata (other than the resource-usage summary tasks’ own metadata outputs, which are always ignored). A gather-resource task with a single quantum is created for each matching metadata dataset.
- where : str, optional
- Data ID expression that constrains the input metadata datasets. 
- input_collections : Sequence[str], optional
- Sequence of collections to search for inputs. If not provided, butler.collections is used and must not be empty.
- output_run : str, optional
- Output RUN collection name. If not provided, butler.run is used and must not be None.
- skip_existing_in : Sequence[str], optional
- Sequence of collections to search for outputs, allowing quanta whose outputs exist to be skipped. 
- clobber : bool, optional
- Whether execution of this quantum graph will permit clobbering. If False (default), existing outputs in output_run are an error unless skip_existing_in will cause those quanta to be skipped.
 
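A minimal construction sketch, assuming an existing data repository at "REPO" and the collection names shown (all placeholders, not taken from this API's documentation):

```python
from lsst.daf.butler import Butler

from lsst.analysis.tools.tasks import ResourceUsageQuantumGraphBuilder

# "REPO" and the collection names below are illustrative placeholders.
butler = Butler("REPO")
builder = ResourceUsageQuantumGraphBuilder(
    butler,
    dataset_type_names=["*_metadata"],        # glob for input metadata datasets
    where="instrument = 'HSC'",               # optional data ID constraint
    input_collections=["u/user/processing"],  # collections searched for inputs
    output_run="u/user/resource_usage/run1",  # RUN collection for outputs
)
```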
Notes

The resource usage summary tasks cannot easily be added to a regular pipeline, as it’s much more natural to have the gather tasks run automatically on all other tasks. And we can generate a quantum graph for these particular tasks much more efficiently than the general-purpose algorithm could.

Attributes Summary

- universe: Definitions of all data dimensions.

Methods Summary

- build([metadata]): Build the quantum graph.
- main(): Run the command-line interface for this quantum-graph builder.
- make_argument_parser(): Make the argument parser for the command-line interface.
- process_subgraph(subgraph): Build the rough structure for an independent subset of the QuantumGraph and query for relevant existing datasets.

Attributes Documentation

- universe
- Definitions of all data dimensions. 
Methods Documentation

- build(metadata: Mapping[str, Any] | None = None) → QuantumGraph
- Build the quantum graph.
- Parameters:
- metadata : Mapping, optional
- Flexible metadata to add to the quantum graph. 
 
- Returns:
- quantum_graph : QuantumGraph
- DAG describing processing to be performed. 
 
Notes

External code is expected to construct a QuantumGraphBuilder and then call this method exactly once. See class documentation for details on what it does.
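Continuing the construction sketch above, a hedged example of the single build call; the metadata keys and the output filename are illustrative, and saving via QuantumGraph.saveUri assumes the standard lsst.pipe.base graph API:

```python
# Build exactly once; the optional metadata mapping is free-form.
qgraph = builder.build(metadata={"comment": "resource usage summary"})
# Persist the graph for later execution (illustrative filename).
qgraph.saveUri("resource_usage.qgraph")
```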
- classmethod main() → None
- Run the command-line interface for this quantum-graph builder.
- This function provides the implementation for the build-gather-resource-usage-qg script.
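A sketch of how that entry point plausibly reduces to this classmethod (the actual script wrapper may differ):

```python
from lsst.analysis.tools.tasks import ResourceUsageQuantumGraphBuilder

if __name__ == "__main__":
    # Parses command-line arguments (see make_argument_parser) and
    # builds the quantum graph.
    ResourceUsageQuantumGraphBuilder.main()
```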
- classmethod make_argument_parser() → ArgumentParser
- Make the argument parser for the command-line interface. 
- process_subgraph(subgraph: PipelineGraph) → QuantumGraphSkeleton
- Build the rough structure for an independent subset of the QuantumGraph and query for relevant existing datasets.
- Parameters:
- subgraph : pipeline_graph.PipelineGraph
- Subset of the pipeline graph that should be processed by this call. This is always resolved and topologically sorted. It should not be modified. 
 
- Returns:
- skeleton : quantum_graph_skeleton.QuantumGraphSkeleton
- Class representing an initial quantum graph. See quantum_graph_skeleton.QuantumGraphSkeleton docs for details. After this is returned, the object may be modified in-place in unspecified ways.
 
Notes

In addition to returning a quantum_graph_skeleton.QuantumGraphSkeleton, this method should populate the existing_datasets structure by querying for all relevant datasets with non-empty data IDs (those with empty data IDs will already be present). In particular (a query sketch follows this list):

- inputs must always be populated with all overall-input datasets (but not prerequisites), by querying input_collections;
- outputs_for_skip must be populated with any intermediate or output datasets present in skip_existing_in (it can be ignored if skip_existing_in is empty);
- outputs_in_the_way must be populated with any intermediate or output datasets present in output_run, if output_run_exists (it can be ignored if output_run_exists is False). Note that the presence of such datasets is not automatically an error, even if clobber is False, as these may be quanta that will be skipped;
- inputs must be populated with all prerequisite-input datasets that were included in the skeleton, by querying input_collections (not all prerequisite inputs need to be included in the skeleton, but the base class can only use per-quantum queries to find them, and that can be slow when there are many quanta).

Dataset types should never be components and should always use the “common” storage class definition in pipeline_graph.DatasetTypeNode (which is the data repository definition when the dataset type is registered).
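As referenced above, a hedged sketch of the three existence queries this contract describes. The helper name and the set-collecting shape are invented for illustration; Registry.queryDatasets is a real Butler API and the attribute names come from the notes above, but this is not the actual base-class bookkeeping, which records results in existing_datasets keyed by dataset.

```python
def sketch_existence_queries(self, dataset_type_name: str):
    """Illustrate the queries described above (hypothetical helper)."""
    registry = self.butler.registry
    # Overall inputs: always search input_collections.
    inputs = set(
        registry.queryDatasets(
            dataset_type_name, collections=self.input_collections, findFirst=True
        )
    )
    # Outputs that may let quanta be skipped: only if skip_existing_in
    # is non-empty.
    outputs_for_skip = set()
    if self.skip_existing_in:
        outputs_for_skip = set(
            registry.queryDatasets(
                dataset_type_name, collections=self.skip_existing_in, findFirst=True
            )
        )
    # Outputs already in the output RUN: only if that RUN exists; their
    # presence is not automatically an error, since those quanta may be
    # skipped.
    outputs_in_the_way = set()
    if self.output_run_exists:
        outputs_in_the_way = set(
            registry.queryDatasets(dataset_type_name, collections=[self.output_run])
        )
    return inputs, outputs_for_skip, outputs_in_the_way
```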