QuantumGraphBuilder
- class lsst.pipe.base.quantum_graph_builder.QuantumGraphBuilder(pipeline_graph: PipelineGraph, butler: Butler, *, input_collections: Sequence[str] | None = None, output_run: str | None = None, skip_existing_in: Sequence[str] = (), clobber: bool = False)
- Bases: ABC
- An abstract base class for building QuantumGraph objects from a pipeline.
- Parameters:
- pipeline_graph : pipeline_graph.PipelineGraph
- Pipeline to build a QuantumGraph from, as a graph. Will be resolved in-place with the given butler (any existing resolution is ignored).
- butler : lsst.daf.butler.Butler
- Client for the data repository. Should be read-only.
- input_collections : Sequence[str], optional
- Collections to search for overall-input datasets. If not provided, butler.collections is used (and must not be empty).
- output_run : str, optional
- Output RUN collection. If not provided, butler.run is used (and must not be None).
- skip_existing_in : Sequence[str], optional
- Collections to search for outputs that already exist for the purpose of skipping quanta that have already been run.
- clobber : bool, optional
- If False (the default), graph generation raises when predicted outputs already exist in output_run (not including those quanta that would be skipped because they’ve already been run); if True, it does not. This never actually clobbers outputs; it just informs the graph generation algorithm whether execution will run with clobbering enabled. This is ignored if output_run does not exist.
 
- Notes
- Constructing a QuantumGraphBuilder will run queries for existing datasets with empty data IDs (including but not limited to init inputs and outputs), in addition to resolving the given pipeline graph and testing for existence of the output run collection.
- The build method splits the pipeline graph into independent subgraphs, then calls the abstract method process_subgraph on each, to allow concrete implementations to populate the rough graph structure (the QuantumGraphSkeleton class), including searching for existing datasets. The build method then:
- assembles lsst.daf.butler.Quantum instances from all data IDs in the skeleton;
- looks for existing outputs found in skip_existing_in to see if any quanta should be skipped;
- calls PipelineTaskConnections.adjustQuantum on all quanta, adjusting downstream quanta appropriately when preliminary predicted outputs are rejected (pruning nodes that will not have the inputs they need to run);
- attaches datastore records and registry dataset types to the graph.
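The control flow described in these notes is a template method: a final build drives an abstract process_subgraph, then skips and prunes quanta. The following is a minimal sketch of that flow only. BuildFlowSketch, PerVisitSketch, the already_run and rejected arguments, and the "task@visit" quantum names are all hypothetical stand-ins invented for illustration; none of them are part of lsst.pipe.base, and no butler or registry is involved.

```python
from abc import ABC, abstractmethod
from typing import final


class BuildFlowSketch(ABC):
    """Hypothetical stand-in that mimics only the control flow of ``build``."""

    def __init__(self, subgraphs, visits, *, already_run=(), rejected=()):
        self.subgraphs = subgraphs            # stand-in for the split pipeline graph
        self.visits = visits                  # stand-in for the data IDs
        self.already_run = set(already_run)   # quanta whose outputs already exist
        self.rejected = set(rejected)         # quanta an adjustQuantum-like hook rejects

    @final
    def build(self):
        # Let the concrete class populate a rough skeleton for each subgraph.
        skeleton = {}
        for subgraph in self.subgraphs:
            skeleton.update(self.process_subgraph(subgraph))
        pruned, kept = set(), []
        # Assumption: the skeleton's insertion order is already topological.
        for quantum, upstream in skeleton.items():
            if quantum in self.already_run:
                continue  # skipped: its outputs exist, so downstream can still run
            if quantum in self.rejected or any(u in pruned for u in upstream):
                pruned.add(quantum)  # pruned: downstream loses the inputs it needs
                continue
            kept.append(quantum)
        return kept

    @abstractmethod
    def process_subgraph(self, subgraph):
        """Return a mapping of quantum name -> list of upstream quantum names."""


class PerVisitSketch(BuildFlowSketch):
    def process_subgraph(self, subgraph):
        skeleton = {}
        # Each task consumes the previous task's output for the same visit.
        for previous, task in zip([None, *subgraph], subgraph):
            for visit in self.visits:
                upstream = [f"{previous}@{visit}"] if previous else []
                skeleton[f"{task}@{visit}"] = upstream
        return skeleton


builder = PerVisitSketch(
    [["isr", "calibrate"], ["makeSky"]],
    [1, 2],
    already_run={"isr@1"},
    rejected={"isr@2"},
)
graph = builder.build()
print(graph)  # ['calibrate@1', 'makeSky@1', 'makeSky@2']
```

In this toy run, isr@1 is skipped without affecting calibrate@1, while rejecting isr@2 prunes calibrate@2 as well, which mirrors the distinction the notes draw between skipping and pruning.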
- In addition to implementing process_subgraph, derived classes are generally expected to add new construction keyword-only arguments to control the data IDs of the quantum graph, while forwarding all of the arguments defined in the base class to super.
- Attributes Summary
- universe
- Definitions of all data dimensions.
- Methods Summary
- build([metadata, attach_datastore_records])
- Build the quantum graph.
- process_subgraph(subgraph)
- Build the rough structure for an independent subset of the QuantumGraph and query for relevant existing datasets.
- Attributes Documentation
- universe
- Definitions of all data dimensions. 
- Methods Documentation
- final build(metadata: Mapping[str, Any] | None = None, attach_datastore_records: bool = True) -> QuantumGraph
- Build the quantum graph.
- Parameters:
- metadata : Mapping, optional
- Flexible metadata to add to the quantum graph.
- attach_datastore_records : bool, optional
- Whether to include datastore records in the graph. Required for lsst.daf.butler.QuantumBackedButler execution.
 
- Returns:
- quantum_graph : QuantumGraph
- DAG describing processing to be performed.
 
- Notes
- External code is expected to construct a QuantumGraphBuilder and then call this method exactly once. See class documentation for details on what it does.
- abstract process_subgraph(subgraph: PipelineGraph) -> QuantumGraphSkeleton
- Build the rough structure for an independent subset of the QuantumGraph and query for relevant existing datasets.
- Parameters:
- subgraph : pipeline_graph.PipelineGraph
- Subset of the pipeline graph that should be processed by this call. This is always resolved and topologically sorted. It should not be modified.
 
- Returns:
- skeleton : quantum_graph_skeleton.QuantumGraphSkeleton
- Class representing an initial quantum graph. See quantum_graph_skeleton.QuantumGraphSkeleton docs for details. After this is returned, the object may be modified in-place in unspecified ways.
 
- Notes
- The quantum_graph_skeleton.QuantumGraphSkeleton should associate DatasetRef objects with nodes for existing datasets. In particular:
- quantum_graph_skeleton.QuantumGraphSkeleton.set_dataset_ref must be used to associate existing datasets with all overall-input dataset nodes in the skeleton by querying input_collections. This includes all standard input nodes and any prerequisite nodes added by the method (prerequisite nodes may also be left out entirely, as the base class can add them later, albeit possibly less efficiently).
- quantum_graph_skeleton.QuantumGraphSkeleton.set_output_for_skip must be used to associate existing datasets with output dataset nodes by querying skip_existing_in.
- quantum_graph_skeleton.QuantumGraphSkeleton.add_output_in_the_way must be used to associate existing outputs with output dataset nodes by querying output_run if output_run_exists is True. Note that the presence of such datasets is not automatically an error, even if clobber is False, as these may be quanta that will be skipped.
- DatasetRef objects for existing datasets with empty data IDs in all of the above categories may be found in the empty_dimensions_datasets attribute, as these are queried for prior to this call by the base class, but associating them with graph nodes is still this method’s responsibility.
- Dataset types should never be components and should always use the “common” storage class definition in pipeline_graph.DatasetTypeNode (which is the data repository definition when the dataset type is registered).
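The three association rules above, together with the expectation that derived classes add keyword-only arguments and forward the rest to super, can be sketched with stand-ins. Everything in this block is an assumption made for illustration: SkeletonStandIn and BuilderStandIn are not the lsst.pipe.base classes, a plain dict plays the role of the butler, the (node, ref) method signatures are simplified guesses rather than the real QuantumGraphSkeleton signatures, and the visits argument and find helper are invented. Only the three method names and the attribute names input_collections, skip_existing_in, output_run, and output_run_exists come from the documentation.

```python
from abc import ABC, abstractmethod
from collections.abc import Sequence
from typing import Optional


class SkeletonStandIn:
    """Hypothetical stand-in for QuantumGraphSkeleton (signatures are assumed)."""

    def __init__(self) -> None:
        self.refs: dict[str, str] = {}
        self.for_skip: dict[str, str] = {}
        self.in_the_way: dict[str, str] = {}

    def set_dataset_ref(self, node: str, ref: str) -> None:
        self.refs[node] = ref

    def set_output_for_skip(self, node: str, ref: str) -> None:
        self.for_skip[node] = ref

    def add_output_in_the_way(self, node: str, ref: str) -> None:
        self.in_the_way[node] = ref


class BuilderStandIn(ABC):
    """Hypothetical stand-in for the base class; ``repo`` replaces the butler."""

    def __init__(self, repo: dict[str, set[str]], *, input_collections: Sequence[str],
                 output_run: str, skip_existing_in: Sequence[str] = (),
                 clobber: bool = False) -> None:
        self.repo = repo  # collection name -> dataset nodes stored in it
        self.input_collections = input_collections
        self.output_run = output_run
        self.skip_existing_in = skip_existing_in
        self.clobber = clobber
        self.output_run_exists = output_run in repo

    def find(self, node: str, collections: Sequence[str]) -> Optional[str]:
        # Invented helper: return a fake "ref" for the first collection holding node.
        for collection in collections:
            if node in self.repo.get(collection, set()):
                return f"{node}@{collection}"
        return None

    @abstractmethod
    def process_subgraph(self, subgraph: dict[str, list[str]]) -> SkeletonStandIn: ...


class VisitBuilderSketch(BuilderStandIn):
    def __init__(self, repo, *, visits: Sequence[int], **kwargs) -> None:
        # New keyword-only argument; everything else is forwarded to the base class.
        super().__init__(repo, **kwargs)
        self.visits = visits

    def process_subgraph(self, subgraph):
        skeleton = SkeletonStandIn()
        for visit in self.visits:
            for name in subgraph["inputs"]:
                node = f"{name}/{visit}"
                ref = self.find(node, self.input_collections)
                if ref is not None:
                    skeleton.set_dataset_ref(node, ref)  # overall inputs
            for name in subgraph["outputs"]:
                node = f"{name}/{visit}"
                ref = self.find(node, self.skip_existing_in)
                if ref is not None:
                    skeleton.set_output_for_skip(node, ref)  # candidates for skipping
                if self.output_run_exists:
                    ref = self.find(node, [self.output_run])
                    if ref is not None:
                        skeleton.add_output_in_the_way(node, ref)  # not an error by itself
        return skeleton


repo = {"raw": {"raw/1", "raw/2"}, "old_run": {"calexp/1"}, "new_run": {"calexp/2"}}
builder = VisitBuilderSketch(repo, visits=[1, 2], input_collections=["raw"],
                             output_run="new_run", skip_existing_in=["old_run"])
skeleton = builder.process_subgraph({"inputs": ["raw"], "outputs": ["calexp"]})
```

In this sketch the method only records what it finds; deciding whether an output in the way is an error is left to the caller, which matches the note that such datasets are not automatically an error even when clobber is False.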