QuantumBackedButler
- 
class lsst.daf.butler.QuantumBackedButler(quantum: lsst.daf.butler.core.quantum.Quantum, dimensions: lsst.daf.butler.core.dimensions._universe.DimensionUniverse, datastore: lsst.daf.butler.core.datastore.Datastore, storageClasses: lsst.daf.butler.core.storageClass.StorageClassFactory)
- Bases: lsst.daf.butler._limited_butler.LimitedButler

An implementation of LimitedButler intended to back execution of a single Quantum.

- Parameters:
- quantum : Quantum
- Object describing the predicted input and output datasets relevant to this butler. This must have resolved DatasetRef instances for all inputs and outputs.
- dimensions : DimensionUniverse
- Object managing all dimension definitions. 
- datastore : Datastore
- Datastore to use for all dataset I/O and existence checks. 
- storageClasses : StorageClassFactory
- Object managing all storage class definitions. 
- Notes

Most callers should use the initialize classmethod to construct new instances instead of calling the constructor directly.

QuantumBackedButler uses a SQLite database internally, in order to reuse existing DatastoreRegistryBridge and OpaqueTableStorage implementations that rely on SQLAlchemy. If implementations that don't rely on SQLAlchemy are added in the future, it should be possible to swap them in by overriding the type arguments to initialize (though at present, QuantumBackedButler would still create at least an in-memory SQLite database that would then go unused).

We imagine QuantumBackedButler being used during (at least) batch execution to capture Datastore records and save them to per-quantum files, which are also a convenient place to store provenance for eventual upload to a SQL-backed Registry (once Registry has tables to store provenance, that is). These per-quantum files can be written in two ways:

- The SQLite file used internally by QuantumBackedButler can be used directly by customizing the filename argument to initialize, and then transferring that file to the object store after execution completes (or fails; a try/finally pattern probably makes sense here).
- A JSON or YAML file can be written by calling extract_provenance_data, and using pydantic methods to write the returned QuantumProvenanceData to a file.
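The first of the two approaches above (a per-quantum SQLite file transferred in a try/finally) can be sketched with stdlib stand-ins. Everything here is hypothetical illustration, not the lsst.daf.butler API: `run_quantum_with_sqlite_capture`, the table name, and the destination directory are invented; `sqlite3` and `shutil.copy2` merely play the roles of the internal database and the object-store upload.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path

def run_quantum_with_sqlite_capture(workdir: Path) -> Path:
    """Sketch of the try/finally pattern: capture records in a per-quantum
    SQLite file, then transfer it whether execution succeeds or fails."""
    db_path = workdir / "quantum_records.sqlite3"  # would be the ``filename`` given to initialize()
    dest = workdir / "objectstore" / db_path.name  # stand-in for an object-store destination
    dest.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(db_path)
    try:
        # Stand-in for task execution writing datastore records.
        conn.execute("CREATE TABLE IF NOT EXISTS datastore_records (dataset_id TEXT)")
        conn.execute("INSERT INTO datastore_records VALUES ('00000000-0000-0000-0000-000000000001')")
        conn.commit()
    finally:
        # Close first (as the surrounding notes suggest) so the on-disk file
        # is complete, then transfer even on failure.
        conn.close()
        shutil.copy2(db_path, dest)
    return dest
```

The finally block runs on both success and failure, which is what makes the transfer reliable for post-mortem inspection of a failed quantum.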
Note that at present, the SQLite file only contains datastore records, not provenance, but that should be easy to address (if desired) after we actually design a Registry schema for provenance. I also suspect that we'll want to explicitly close the SQLite file somehow before trying to transfer it. But I'm guessing we'd prefer to write the per-quantum files as JSON anyway.

- Attributes Summary

- GENERATION
- dimensions: Structure managing all dimensions recognized by this data repository (DimensionUniverse).

- Methods Summary

- datasetExistsDirect(ref): Return True if a dataset is actually present in the Datastore.
- extract_provenance_data(): Extract provenance information and datastore records from this butler.
- getDirect(ref, *, parameters=None): Retrieve a stored dataset.
- getDirectDeferred(ref, *, parameters=None): Create a DeferredDatasetHandle which can later retrieve a dataset, from a resolved DatasetRef.
- initialize(config, quantum, dimensions, ...): Construct a new QuantumBackedButler from repository configuration and helper types.
- isWriteable(): Return True if this Butler supports write operations.
- markInputUnused(ref): Indicate that a predicted input was not actually used when processing a Quantum.
- putDirect(obj, ref): Store a dataset that already has a UUID and RUN collection.

- Attributes Documentation
GENERATION = 3
 - 
dimensions
- Structure managing all dimensions recognized by this data repository (DimensionUniverse).
- Methods Documentation
datasetExistsDirect(ref: lsst.daf.butler.core.datasets.ref.DatasetRef) → bool
- Return True if a dataset is actually present in the Datastore.

- Parameters:
- ref : DatasetRef
- Resolved reference to a dataset. 
- Returns:
- exists : bool
- Whether the dataset exists in the Datastore. 
 
 - 
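The expected usage of this existence check (an executor probing every predicted input before running a quantum, to build the "available inputs" set) can be sketched as follows. `FakeRef`, `FakeButler`, and `scan_predicted_inputs` are hypothetical stand-ins for illustration, not lsst.daf.butler classes; only the `datasetExistsDirect` contract mirrors the documentation.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple
from uuid import UUID, uuid4

@dataclass(frozen=True)
class FakeRef:
    """Stand-in for a resolved DatasetRef: just a UUID."""
    id: UUID

class FakeButler:
    """Stand-in exposing only the datasetExistsDirect contract."""
    def __init__(self, datastore: Dict[UUID, Any]):
        self._datastore = datastore

    def datasetExistsDirect(self, ref: FakeRef) -> bool:
        return ref.id in self._datastore

def scan_predicted_inputs(
    butler: FakeButler, refs: List[FakeRef]
) -> Tuple[List[FakeRef], List[FakeRef]]:
    """Partition predicted inputs into (available, missing),
    as an executor would do before execution."""
    available = [r for r in refs if butler.datasetExistsDirect(r)]
    missing = [r for r in refs if r not in available]
    return available, missing
```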
extract_provenance_data() → lsst.daf.butler._quantum_backed.QuantumProvenanceData
- Extract provenance information and datastore records from this butler.

- Returns:
- provenance : QuantumProvenanceData
- A serializable struct containing input/output dataset IDs and datastore records. This assumes all dataset IDs are UUIDs (just to make it easier for pydantic to reason about the struct's types); the rest of this class makes no such assumption, but the approach to processing in which it's useful effectively requires UUIDs anyway.
- Notes

QuantumBackedButler records this provenance information when its methods are used, which mostly saves PipelineTask authors from having to worry about it while still recording very detailed information. But it has two small weaknesses:

- Calling getDirectDeferred or getDirect is enough to mark a dataset as an "actual input", which may mark some datasets that aren't actually used. We rely on task authors to use markInputUnused to address this.
- We assume that the execution system will call datasetExistsDirect on all predicted inputs prior to execution, in order to populate the "available inputs" set. This is what I envision SingleQuantumExecutor doing after we update it to use this class, but it feels fragile for this class to make such a strong assumption about how it will be used, even if I can't think of any other executor behavior that would make sense.
 
 - 
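Writing the extracted provenance to a JSON file, as the class notes suggest, can be sketched with a stdlib stand-in. `SketchProvenance` and its field names are illustrative guesses, not the real QuantumProvenanceData schema; the real class is a pydantic model, so the serialization shown by `to_json` would come from pydantic's own methods instead.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List
from uuid import uuid4

@dataclass
class SketchProvenance:
    """Illustrative stand-in for QuantumProvenanceData."""
    predicted_inputs: List[str] = field(default_factory=list)
    actual_inputs: List[str] = field(default_factory=list)
    predicted_outputs: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Dataset IDs are carried as stringified UUIDs, matching the
        # UUID assumption described in the Returns section above.
        return json.dumps(asdict(self))
```

A round trip through a file is then just `Path(...).write_text(prov.to_json())` followed by reconstructing the struct from `json.loads`.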
getDirect(ref: lsst.daf.butler.core.datasets.ref.DatasetRef, *, parameters: Optional[Dict[str, Any]] = None) → Any
- Retrieve a stored dataset.

Unlike Butler.get, this method allows datasets outside the Butler's collection to be read as long as the DatasetRef that identifies them can be obtained separately.

- Parameters:
- ref : DatasetRef
- Resolved reference to an already stored dataset. 
- parameters : dict
- Additional StorageClass-defined options to control reading, typically used to efficiently read only a subset of the dataset. 
- Returns:
- obj : object
- The dataset. 
- Raises:
- AmbiguousDatasetError
- Raised if ref.id is None, i.e. the reference is unresolved.
 
 - 
getDirectDeferred(ref: lsst.daf.butler.core.datasets.ref.DatasetRef, *, parameters: Optional[dict] = None) → lsst.daf.butler._deferredDatasetHandle.DeferredDatasetHandle
- Create a DeferredDatasetHandle which can later retrieve a dataset, from a resolved DatasetRef.

- Parameters:
- ref : DatasetRef
- Resolved reference to an already stored dataset. 
- parameters : dict
- Additional StorageClass-defined options to control reading, typically used to efficiently read only a subset of the dataset. 
- Returns:
- obj : DeferredDatasetHandle
- A handle which can be used to retrieve a dataset at a later time. 
- Raises:
- AmbiguousDatasetError
- Raised if ref.id is None, i.e. the reference is unresolved.
 
 - 
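The deferred-read behavior can be sketched with hypothetical stand-ins. `DeferredHandle` and `LazyButler` are invented here to mimic only the laziness of getDirectDeferred: obtaining the handle is cheap, and the actual read happens when `get()` is called.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class DeferredHandle:
    """Stand-in for DeferredDatasetHandle: remembers how to fetch,
    performs the read only when get() is called."""
    _fetch: Callable[[], Any]

    def get(self) -> Any:
        return self._fetch()

class LazyButler:
    """Stand-in butler; records each actual read for illustration."""
    def __init__(self, datastore: Dict[str, Any]):
        self._datastore = datastore
        self.reads: List[str] = []

    def getDirect(self, ref: str) -> Any:
        self.reads.append(ref)
        return self._datastore[ref]

    def getDirectDeferred(self, ref: str) -> DeferredHandle:
        # Cheap: no I/O happens until the handle's get() is invoked.
        return DeferredHandle(lambda: self.getDirect(ref))
```

Note the provenance caveat in the notes above: in the real class, merely obtaining the handle already marks the dataset as an actual input, even if `get()` is never called.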
classmethod initialize(config: Union[lsst.daf.butler.core.config.Config, str], quantum: lsst.daf.butler.core.quantum.Quantum, dimensions: lsst.daf.butler.core.dimensions._universe.DimensionUniverse, filename: str = ':memory:', OpaqueManagerClass: Type[lsst.daf.butler.registry.interfaces._opaque.OpaqueTableStorageManager] = <class 'lsst.daf.butler.registry.opaque.ByNameOpaqueTableStorageManager'>, BridgeManagerClass: Type[lsst.daf.butler.registry.interfaces._bridge.DatastoreRegistryBridgeManager] = <class 'lsst.daf.butler.registry.bridge.monolithic.MonolithicDatastoreRegistryBridgeManager'>, search_paths: Optional[List[str]] = None) → lsst.daf.butler._quantum_backed.QuantumBackedButler
- Construct a new QuantumBackedButler from repository configuration and helper types.

- Parameters:
- config : Config or str
- A butler repository root, configuration filename, or configuration instance. 
- quantum : Quantum
- Object describing the predicted input and output datasets relevant to this butler. This must have resolved DatasetRef instances for all inputs and outputs.
- dimensions : DimensionUniverse
- Object managing all dimension definitions. 
- filename : str, optional
- Name for the SQLite database that will back this butler; defaults to an in-memory database. 
- OpaqueManagerClass : type, optional
- A subclass of OpaqueTableStorageManager to use for datastore opaque records. Default is a SQL-backed implementation.
- BridgeManagerClass : type, optional
- A subclass of DatastoreRegistryBridgeManager to use for datastore location records. Default is a SQL-backed implementation.
- search_paths : list of str, optional
- Additional search paths for butler configuration. 
 
 - 
markInputUnused(ref: lsst.daf.butler.core.datasets.ref.DatasetRef) → None
- Indicate that a predicted input was not actually used when processing a Quantum.

- Parameters:
- ref : DatasetRef
- Reference to the unused dataset. 
- Notes

By default, a dataset is considered "actually used" if it is accessed via getDirect or a handle to it is obtained via getDirectDeferred (even if the handle is not used). This method must be called after one of those in order to remove the dataset from the actual input list.

This method does nothing for butlers that do not store provenance information (which is the default implementation provided by the base class).
 - 
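The actual-input bookkeeping this method participates in can be sketched as follows. `TrackingButler` is a hypothetical stand-in, not an lsst.daf.butler class; only the marking-and-retracting semantics mirror the documented behavior.

```python
from typing import Any, Dict, Set

class TrackingButler:
    """Stand-in showing the documented bookkeeping: a read marks the
    dataset as actually used; markInputUnused retracts the mark."""
    def __init__(self, datastore: Dict[str, Any]):
        self._datastore = datastore
        self.actual_inputs: Set[str] = set()

    def getDirect(self, ref: str) -> Any:
        self.actual_inputs.add(ref)  # reading marks the dataset as actually used
        return self._datastore[ref]

    def markInputUnused(self, ref: str) -> None:
        # Must be called after getDirect/getDirectDeferred to undo the mark.
        self.actual_inputs.discard(ref)
```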
putDirect(obj: Any, ref: lsst.daf.butler.core.datasets.ref.DatasetRef) → lsst.daf.butler.core.datasets.ref.DatasetRef
- Store a dataset that already has a UUID and RUN collection.

- Parameters:
- obj : object
- The dataset. 
- ref : DatasetRef
- Resolved reference for a not-yet-stored dataset. 
- Returns:
- ref : DatasetRef
- The same as the given ref, for convenience and symmetry with Butler.put.
- Raises:
- TypeError
- Raised if the butler is read-only. 
- AmbiguousDatasetError
- Raised if ref.id is None, i.e. the reference is unresolved.
- Notes

Whether this method inserts the given dataset into a Registry is implementation defined (some LimitedButler subclasses do not have a Registry), but it always adds the dataset to a Datastore, and the given ref.id and ref.run are always preserved.
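The putDirect contract (store keyed by the pre-existing UUID, reject unresolved refs, return the ref unchanged) can be sketched with hypothetical stand-ins; `MiniRef` and `MiniButler` are invented for illustration, and the real method raises AmbiguousDatasetError where this sketch raises ValueError.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional
from uuid import UUID, uuid4

@dataclass(frozen=True)
class MiniRef:
    """Stand-in for DatasetRef: resolved iff id is not None."""
    id: Optional[UUID]
    run: str

class MiniButler:
    """Stand-in reproducing the documented putDirect semantics."""
    def __init__(self) -> None:
        self._datastore: Dict[UUID, Any] = {}

    def putDirect(self, obj: Any, ref: MiniRef) -> MiniRef:
        if ref.id is None:
            # The real method raises AmbiguousDatasetError here.
            raise ValueError("ref is unresolved")
        self._datastore[ref.id] = obj  # ref.id and ref.run are preserved as-is
        return ref  # same ref back, for symmetry with Butler.put
```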
 