QuantumProvenanceData

class lsst.daf.butler.QuantumProvenanceData(predicted_inputs: Set[Union[int, uuid.UUID]], available_inputs: Set[Union[int, uuid.UUID]], actual_inputs: Set[Union[int, uuid.UUID]], predicted_outputs: Set[Union[int, uuid.UUID]], actual_outputs: Set[Union[int, uuid.UUID]], locations: Dict[str, Set[Union[int, uuid.UUID]]], records: Dict[str, List[lsst.daf.butler.core.storedFileInfo.StoredDatastoreItemInfo]])

Bases: object

A serializable struct for per-quantum provenance information and datastore records.

Notes

This class slightly duplicates information from the Quantum class itself (the predicted_inputs and predicted_outputs sets should have the same IDs present in Quantum.inputs and Quantum.outputs), but overall it assumes the original Quantum is also available to reconstruct the complete provenance (e.g. by associating dataset IDs with data IDs, dataset types, and RUN names).
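The relationship between the predicted ID sets and the quantum can be illustrated with a minimal sketch. It does not use the real classes: ProvenanceIdsSketch and matches_quantum are hypothetical names, and the quantum is reduced to two plain sets of dataset IDs.

```python
import uuid
from dataclasses import dataclass
from typing import Set


@dataclass
class ProvenanceIdsSketch:
    """Hypothetical stand-in carrying only the two predicted ID sets."""

    predicted_inputs: Set[uuid.UUID]
    predicted_outputs: Set[uuid.UUID]


def matches_quantum(
    provenance: ProvenanceIdsSketch,
    quantum_input_ids: Set[uuid.UUID],
    quantum_output_ids: Set[uuid.UUID],
) -> bool:
    """Return True if the predicted sets hold exactly the quantum's IDs."""
    return (
        provenance.predicted_inputs == quantum_input_ids
        and provenance.predicted_outputs == quantum_output_ids
    )
```

A check of this kind only confirms that the IDs agree; data IDs, dataset types, and RUN names still have to come from the Quantum itself.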

Methods Summary

collect_and_transfer(butler, quanta, provenance) Transfer output datasets from multiple quanta to a more permanent Butler repository.
from_simple(simple, universe, registry) Make an instance of this class from serialized data.
to_simple(minimal) Make representation of the provenance suitable for serialization.

Methods Documentation

static collect_and_transfer(butler: Butler, quanta: Iterable[Quantum], provenance: Iterable[QuantumProvenanceData]) → None

Transfer output datasets from multiple quanta to a more permanent Butler repository.

Parameters:
butler : Butler

Full butler representing the data repository to transfer datasets to.

quanta : Iterable [ Quantum ]

Iterable of Quantum objects that carry information about predicted outputs. May be a single-pass iterator.

provenance : Iterable [ QuantumProvenanceData ]

Provenance and datastore data for each of the given quanta, in the same order. May be a single-pass iterator.

Notes

Input-output provenance data is not actually transferred yet, because Registry has no place to store it.

This method probably works most efficiently if run on all quanta for a single task label at once, because this will gather all datasets of a particular type together into a single vectorized Registry import. It should still behave correctly if run on smaller groups of quanta or even quanta from multiple tasks.
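The grouping idea described above can be sketched without the real classes. In this hypothetical stand-in, a quantum is a plain mapping from dataset type name to predicted output IDs, and each provenance entry is reduced to its set of actual output IDs. The assumption that only actually written outputs are grouped is ours, not something this page states. Pairing with zip consumes both iterables once and in step, which is why order matters and single-pass iterators are acceptable.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set

# Hypothetical stand-in: dataset type name -> predicted output IDs.
QuantumSketch = Dict[str, List[str]]


def group_actual_outputs(
    quanta: Iterable[QuantumSketch],
    actual_outputs: Iterable[Set[str]],
) -> Dict[str, List[str]]:
    """Gather actually written output IDs by dataset type name."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    # zip pairs the i-th quantum with the i-th provenance entry and
    # iterates each argument exactly once.
    for quantum, actual in zip(quanta, actual_outputs):
        for dataset_type, ids in quantum.items():
            grouped[dataset_type].extend(i for i in ids if i in actual)
    return dict(grouped)
```

With all quanta for one task label passed in together, each dataset type ends up as one list, which is the shape a single bulk import per type would want.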

Currently this method transfers datastore record data unchanged, with no possibility of actually moving (e.g.) files. Datastores that are present only in execution or only in the more permanent butler are ignored.
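The rule that datastores present on only one side are ignored amounts to keeping records for names that both sides share. A minimal sketch, assuming records are keyed by datastore name as in the records attribute; the function name, the use of plain dicts for records, and the set of permanent datastore names are all hypothetical.

```python
from typing import Dict, List, Set


def records_to_import(
    execution_records: Dict[str, List[dict]],
    permanent_datastore_names: Set[str],
) -> Dict[str, List[dict]]:
    """Keep, unchanged, only records whose datastore exists on both sides."""
    return {
        name: records
        for name, records in execution_records.items()
        if name in permanent_datastore_names
    }
```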

classmethod from_simple(simple: Dict[str, Any], universe: Optional[DimensionUniverse] = None, registry: Optional[Registry] = None) → QuantumProvenanceData

Make an instance of this class from serialized data.

Implements SupportsSimple protocol.

Parameters:
simple : dict

Serialized representation returned from to_simple method.

universe : DimensionUniverse, optional

Dimension universe, not used by this method.

registry : Registry, optional

Registry instance, not used by this method.

Returns:
provenance : QuantumProvenanceData

De-serialized instance of QuantumProvenanceData.

to_simple(minimal: bool = False) → Dict[str, Any]

Make representation of the provenance suitable for serialization.

Implements SupportsSimple protocol.

Parameters:
minimal : bool, optional

If True, produce a minimal representation. Not used by this method.

Returns:
simple : dict

Representation of this instance as a simple dictionary.
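The to_simple and from_simple pair is meant to round-trip through a serialization format such as JSON. The sketch below shows that pattern on a hypothetical stand-in class with only two of the ID sets. The dictionary layout is an assumption for illustration and is not the layout the real class produces.

```python
import json
import uuid
from dataclasses import dataclass
from typing import Any, Dict, Set


@dataclass
class ProvenanceSketch:
    """Hypothetical stand-in with two of the ID sets."""

    actual_inputs: Set[uuid.UUID]
    actual_outputs: Set[uuid.UUID]

    def to_simple(self, minimal: bool = False) -> Dict[str, Any]:
        # Sets and UUIDs are not JSON types, so emit sorted lists of strings.
        return {
            "actual_inputs": sorted(str(i) for i in self.actual_inputs),
            "actual_outputs": sorted(str(i) for i in self.actual_outputs),
        }

    @classmethod
    def from_simple(
        cls, simple: Dict[str, Any], universe: Any = None, registry: Any = None
    ) -> "ProvenanceSketch":
        # universe and registry are accepted for signature parity and ignored.
        return cls(
            actual_inputs={uuid.UUID(s) for s in simple["actual_inputs"]},
            actual_outputs={uuid.UUID(s) for s in simple["actual_outputs"]},
        )


def round_trip(provenance: ProvenanceSketch) -> ProvenanceSketch:
    """Serialize to JSON text and rebuild an equal instance."""
    text = json.dumps(provenance.to_simple())
    return ProvenanceSketch.from_simple(json.loads(text))
```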