AllDimensionsQuantumGraphBuilder
- final class lsst.pipe.base.all_dimensions_quantum_graph_builder.AllDimensionsQuantumGraphBuilder(pipeline_graph: PipelineGraph, butler: Butler, *, where: str = '', dataset_query_constraint: DatasetQueryConstraintVariant = <class 'lsst.pipe.base._datasetQueryConstraints._ALL'>, bind: Mapping[str, Any] | None = None, **kwargs: Any)
- Bases: QuantumGraphBuilder
- An implementation of QuantumGraphBuilder that uses a single large query for data IDs covering all dimensions in the pipeline.
- Parameters:
- pipeline_graph : pipeline_graph.PipelineGraph
- Pipeline to build a QuantumGraph from, as a graph. Will be resolved in-place with the given butler (any existing resolution is ignored).
- butler : lsst.daf.butler.Butler
- Client for the data repository. Should be read-only.
- where : str, optional
- Butler expression language constraint to apply to all data IDs.
- dataset_query_constraint : DatasetQueryConstraintVariant, optional
- Specification of which overall-input datasets should be used to constrain the initial data ID queries. Not including an important constraint can result in catastrophically large query results that take too long to process, while including too many makes the query much more complex, increasing the chances that the database will choose a bad (sometimes catastrophically bad) query plan.
- bind : Mapping, optional
- Variable substitutions for the where expression.
- **kwargs
- Additional keyword arguments forwarded to QuantumGraphBuilder.
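As a minimal sketch of how the parameters above fit together, the following pairs a where string with a bind mapping and passes both to the constructor. The helper names (make_query_arguments, build_quantum_graph), the bind keys, and the metadata content are hypothetical and chosen only for illustration. The sketch also assumes the supplied butler already has suitable default collections and an output run, since no extra keyword arguments are forwarded to QuantumGraphBuilder here.

```python
from typing import Any


def make_query_arguments(tract: int, skymap: str) -> dict[str, Any]:
    """Pair a ``where`` expression with the ``bind`` values it refers to.

    The identifiers ``my_skymap`` and ``my_tract`` are hypothetical bind
    keys; the values are substituted for them when the query is parsed.
    """
    return {
        "where": "skymap = my_skymap AND tract = my_tract",
        "bind": {"my_skymap": skymap, "my_tract": tract},
    }


def build_quantum_graph(pipeline_graph, butler, **query_arguments):
    """Construct the builder and call ``build`` exactly once."""
    # Imported lazily so the sketch can be read without the LSST stack.
    from lsst.pipe.base.all_dimensions_quantum_graph_builder import (
        AllDimensionsQuantumGraphBuilder,
    )

    builder = AllDimensionsQuantumGraphBuilder(
        pipeline_graph, butler, **query_arguments
    )
    # The metadata content here is an arbitrary illustration.
    return builder.build(
        metadata={"comment": "example"}, attach_datastore_records=True
    )
```

Calling build_quantum_graph(pipeline_graph, butler, **make_query_arguments(9813, "example_skymap")) would then restrict every data ID in the graph to that tract, which is exactly the situation in which the limitation described in the Notes applies.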
 
- Notes
- This is a general-purpose algorithm that delegates the problem of determining which “end” of the pipeline is more constrained (beginning by input collection contents vs. end by the where string) to the database query planner, which usually does a good job.
- This algorithm suffers from a serious limitation, which we refer to as the “tract slicing” problem from its most common variant: the where string and general data ID intersection rules apply to all data IDs in the graph. For example, if a tract constraint is present in the where string or an overall-input dataset, then it is impossible for any data ID that does not overlap that tract to be present anywhere in the pipeline, such as a {visit, detector} combination where the visit overlaps the tract even if the detector does not.
- Attributes Summary
- universe: Definitions of all data dimensions.
- Methods Summary
- build([metadata, attach_datastore_records]): Build the quantum graph.
- process_subgraph(subgraph): Build the rough structure for an independent subset of the QuantumGraph and query for relevant existing datasets.
- Attributes Documentation
- universe
- Definitions of all data dimensions. 
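The “tract slicing” limitation described in the class Notes can be illustrated with a toy model. This is not the builder's query logic: regions are modeled as sets of integer sky cells, two regions overlap when they share a cell, and every name and value below is hypothetical. The point is only that a single joint query over {tract, visit, detector} keeps a detector only if that detector itself overlaps the tract.

```python
# Toy sky: a region is a set of integer "cells"; regions overlap if they share one.
TRACTS = {1: {0, 1, 2, 3}, 2: {4, 5, 6, 7}}
DETECTORS = {  # (visit, detector) -> region covered by that detector
    (100, 0): {2, 3},
    (100, 1): {4, 5},
    (200, 0): {6, 7},
}


def visit_region(visit):
    """Region of a visit, taken as the union of its detector regions."""
    return set().union(*(r for (v, _), r in DETECTORS.items() if v == visit))


def all_dimensions_rows(tract):
    """Emulate one joint query over {tract, visit, detector} for one tract."""
    return sorted(
        (tract, v, d)
        for (v, d), region in DETECTORS.items()
        if region & TRACTS[tract]
    )


def detectors_wanted_per_visit(tract):
    """Every detector of each visit whose visit-level region overlaps the tract."""
    visits = {v for (v, _) in DETECTORS if visit_region(v) & TRACTS[tract]}
    return sorted(key for key in DETECTORS if key[0] in visits)


def sliced_away(tract):
    """Detectors a visit-level step might want but the joint query drops."""
    kept = {(v, d) for (_, v, d) in all_dimensions_rows(tract)}
    return sorted(set(detectors_wanted_per_visit(tract)) - kept)
```

In this toy data, visit 100 overlaps tract 1 only through detector 0, so sliced_away(1) reports (100, 1): that detector belongs to an overlapping visit but cannot appear anywhere in a graph constrained to tract 1.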
- Methods Documentation
- build(metadata: Mapping[str, Any] | None = None, attach_datastore_records: bool = True) → QuantumGraph
- Build the quantum graph.
- Parameters:
- metadata : Mapping, optional
- Flexible metadata to add to the quantum graph.
- attach_datastore_records : bool, optional
- Whether to include datastore records in the graph. Required for lsst.daf.butler.QuantumBackedButler execution.
 
- Returns:
- quantum_graph : QuantumGraph
- DAG describing processing to be performed. 
 
- Notes
- External code is expected to construct a QuantumGraphBuilder and then call this method exactly once. See class documentation for details on what it does.
- process_subgraph(subgraph: PipelineGraph) → QuantumGraphSkeleton
- Build the rough structure for an independent subset of the QuantumGraph and query for relevant existing datasets.
- Parameters:
- subgraph : pipeline_graph.PipelineGraph
- Subset of the pipeline graph that should be processed by this call. This is always resolved and topologically sorted. It should not be modified. 
 
- Returns:
- skeleton : quantum_graph_skeleton.QuantumGraphSkeleton
- Class representing an initial quantum graph. See quantum_graph_skeleton.QuantumGraphSkeleton docs for details. After this is returned, the object may be modified in-place in unspecified ways.
 
- Notes
- In addition to returning a quantum_graph_skeleton.QuantumGraphSkeleton, this method should populate the existing_datasets structure by querying for all relevant datasets with non-empty data IDs (those with empty data IDs will already be present). In particular:
- inputs must always be populated with all overall-input datasets (but not prerequisites), by querying input_collections;
- outputs_for_skip must be populated with any intermediate or output datasets present in skip_existing_in (it can be ignored if skip_existing_in is empty);
- outputs_in_the_way must be populated with any intermediate or output datasets present in output_run, if output_run_exists (it can be ignored if output_run_exists is False). Note that the presence of such datasets is not automatically an error, even if clobber is False, as these may be quanta that will be skipped;
- inputs must also be populated with all prerequisite-input datasets that were included in the skeleton, by querying input_collections (not all prerequisite inputs need to be included in the skeleton, but the base class can only use per-quantum queries to find them, and that can be slow when there are many quanta).
- Dataset types should never be components and should always use the “common” storage class definition in pipeline_graph.DatasetTypeNode (which is the data repository definition when the dataset type is registered).
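The bookkeeping rules in the process_subgraph Notes can be sketched with a small stand-in. ToyExistingDatasets and populate_existing are hypothetical and are not the library's existing_datasets structure: datasets are modeled as (dataset type name, data ID) tuples and each collection as a plain set of such tuples, purely to show which lookups feed which field.

```python
from dataclasses import dataclass, field

DatasetKey = tuple[str, int]  # (dataset type name, toy data ID)


@dataclass
class ToyExistingDatasets:
    """Hypothetical stand-in with the three fields named in the Notes."""

    inputs: set[DatasetKey] = field(default_factory=set)
    outputs_for_skip: set[DatasetKey] = field(default_factory=set)
    outputs_in_the_way: set[DatasetKey] = field(default_factory=set)


def populate_existing(
    wanted_inputs,
    wanted_outputs,
    input_collections,
    skip_existing_in,
    output_run,
    output_run_exists,
):
    """Fill each field from the collections it is documented to come from."""
    existing = ToyExistingDatasets()
    # Overall inputs (and any prerequisites in the skeleton) come from
    # the input collections.
    existing.inputs = wanted_inputs & set().union(*input_collections)
    # Only consulted when skip_existing_in is non-empty.
    if skip_existing_in:
        existing.outputs_for_skip = wanted_outputs & set().union(*skip_existing_in)
    # Only consulted when the output run already exists.
    if output_run_exists:
        existing.outputs_in_the_way = wanted_outputs & output_run
    return existing
```

An output found in outputs_in_the_way is recorded rather than treated as an error at this stage, matching the note that such datasets may belong to quanta that will be skipped.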