QuantumProvenanceData

class lsst.daf.butler.QuantumProvenanceData(*, predicted_inputs: set[uuid.UUID], available_inputs: set[uuid.UUID], actual_inputs: set[uuid.UUID], predicted_outputs: set[uuid.UUID], actual_outputs: set[uuid.UUID], datastore_records: dict[str, lsst.daf.butler.datastore.record_data.SerializedDatastoreRecordData])

Bases: BaseModel

A serializable struct for per-quantum provenance information and datastore records.

Notes

This class partially duplicates information from the Quantum class itself (the predicted_inputs and predicted_outputs sets should contain the same IDs as Quantum.inputs and Quantum.outputs), but overall it assumes the original Quantum is also available to reconstruct the complete provenance (e.g. by associating dataset IDs with data IDs, dataset types, and RUN names).
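The consistency expectation above comes down to a set comparison. The following is a minimal sketch that uses plain sets of UUIDs as stand-ins for the IDs taken from a Quantum and from a provenance record; the helper name `missing_and_extra` is hypothetical and is not part of lsst.daf.butler.

```python
import uuid


def missing_and_extra(quantum_ids, provenance_ids):
    """Compare dataset IDs from a quantum with those in a provenance set.

    Hypothetical helper: returns (missing, extra), where ``missing`` holds
    IDs present on the quantum but absent from the provenance set, and
    ``extra`` holds the reverse.
    """
    quantum_ids = set(quantum_ids)
    provenance_ids = set(provenance_ids)
    return quantum_ids - provenance_ids, provenance_ids - quantum_ids


# Stand-in IDs; real ones would come from the datasets referenced by a
# Quantum and from QuantumProvenanceData.predicted_inputs.
a, b, c = uuid.uuid4(), uuid.uuid4(), uuid.uuid4()
missing, extra = missing_and_extra({a, b}, {b, c})
```

Two empty sets indicate that the provenance record and the quantum agree on that collection of IDs.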

Note that the pydantic method parse_raw() does not work correctly for this class; use the direct() method instead.

Attributes Summary

`model_computed_fields`	A dictionary of computed field names and their corresponding `ComputedFieldInfo` objects.
`model_config`	Configuration for the model; should be a dictionary conforming to `ConfigDict` (pydantic.config.ConfigDict).
`model_fields`	Metadata about the fields defined on the model, as a mapping of field names to `FieldInfo` (pydantic.fields.FieldInfo).

Methods Summary

`collect_and_transfer`(butler, quanta, provenance)	Transfer output datasets from multiple quanta to a more permanent `Butler` repository.
`direct`(*, predicted_inputs, ...)	Construct an instance directly without validators.
`parse_raw`(*args, **kwargs)

Attributes Documentation

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).

model_fields: ClassVar[dict[str, FieldInfo]] = {'actual_inputs': FieldInfo(annotation=set[UUID], required=True), 'actual_outputs': FieldInfo(annotation=set[UUID], required=True), 'available_inputs': FieldInfo(annotation=set[UUID], required=True), 'datastore_records': FieldInfo(annotation=dict[str, SerializedDatastoreRecordData], required=True), 'predicted_inputs': FieldInfo(annotation=set[UUID], required=True), 'predicted_outputs': FieldInfo(annotation=set[UUID], required=True)}

Metadata about the fields defined on the model, as a mapping of field names to FieldInfo (pydantic.fields.FieldInfo).

This replaces Model.__fields__ from Pydantic V1.

Methods Documentation

static collect_and_transfer(butler: Butler, quanta: Iterable[Quantum], provenance: Iterable[QuantumProvenanceData]) → None

Transfer output datasets from multiple quanta to a more permanent Butler repository.

Parameters:

butler (Butler): Full butler representing the data repository to transfer datasets to.
quanta (Iterable [ Quantum ]): Iterable of Quantum objects that carry information about predicted outputs. May be a single-pass iterator.
provenance (Iterable [ QuantumProvenanceData ]): Provenance and datastore data for each of the given quanta, in the same order. May be a single-pass iterator.

Notes

Input-output provenance data is not actually transferred yet, because Registry has no place to store it.

This method probably works most efficiently if run on all quanta for a single task label at once, because this will gather all datasets of a particular type together into a single vectorized Registry import. It should still behave correctly if run on smaller groups of quanta or even quanta from multiple tasks.

Currently this method transfers datastore record data unchanged, with no possibility of actually moving (e.g.) files. Datastores that are present only in execution or only in the more permanent butler are ignored.
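The advice about processing one task label at a time can be followed by grouping quanta and their provenance records before calling the method. The following is a minimal sketch under stated assumptions: the `group_by_label` helper and the separate `labels` iterable are hypothetical (how a task label is obtained for each quantum is not specified here), and strings stand in for real Quantum and QuantumProvenanceData objects.

```python
from collections import defaultdict


def group_by_label(labels, quanta, provenance):
    """Group aligned (label, quantum, provenance) triples by task label.

    Hypothetical helper. All three arguments may be single-pass iterators;
    each is consumed exactly once. Order within a group is preserved, so
    each quantum stays aligned with its provenance record. Note that zip
    stops at the shortest input.
    """
    groups = defaultdict(lambda: ([], []))
    for label, quantum, prov in zip(labels, quanta, provenance):
        group_quanta, group_prov = groups[label]
        group_quanta.append(quantum)
        group_prov.append(prov)
    return dict(groups)


# Stand-in values; real code would pass Quantum and QuantumProvenanceData.
groups = group_by_label(
    iter(["isr", "calibrate", "isr"]),
    iter(["q1", "q2", "q3"]),
    iter(["p1", "p2", "p3"]),
)

# With a real butler, each group could then be transferred on its own:
# for group_quanta, group_prov in groups.values():
#     QuantumProvenanceData.collect_and_transfer(butler, group_quanta, group_prov)
```

Grouping in a single pass keeps the helper compatible with the single-pass iterators that the parameters above allow.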

classmethod direct(*, predicted_inputs: Iterable[str | UUID], available_inputs: Iterable[str | UUID], actual_inputs: Iterable[str | UUID], predicted_outputs: Iterable[str | UUID], actual_outputs: Iterable[str | UUID], datastore_records: Mapping[str, Mapping]) → QuantumProvenanceData

Construct an instance directly without validators.

Parameters:

predicted_inputs (Iterable of str or uuid.UUID): The predicted inputs.
available_inputs (Iterable of str or uuid.UUID): The available inputs.
actual_inputs (Iterable of str or uuid.UUID): The actual inputs.
predicted_outputs (Iterable of str or uuid.UUID): The predicted outputs.
actual_outputs (Iterable of str or uuid.UUID): The actual outputs.
datastore_records (Mapping [ str, Mapping ]): The datastore records.

Returns:

provenance (QuantumProvenanceData): Serializable model of the quantum provenance.

Notes

This differs from the Pydantic “construct” method in that the arguments are explicitly what the model requires, and it will recurse through members, constructing them from their corresponding direct methods.

This method should only be called when the inputs are trusted.
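To illustrate the pattern, here is a minimal standard-library sketch of a "direct" constructor that skips validation and only normalizes its arguments. It is a stand-in, not the real class: `ProvenanceSketch` and `_to_uuid_set` are hypothetical names, only two of the six fields are modeled, and the string-to-UUID coercion is an assumption inferred from the `Iterable[str | UUID]` signature above.

```python
import uuid
from dataclasses import dataclass


def _to_uuid_set(values):
    """Coerce an iterable of str or uuid.UUID into a set of uuid.UUID."""
    return {v if isinstance(v, uuid.UUID) else uuid.UUID(v) for v in values}


@dataclass
class ProvenanceSketch:
    """Hypothetical stand-in for QuantumProvenanceData (two fields only)."""

    predicted_inputs: set
    actual_inputs: set

    @classmethod
    def direct(cls, *, predicted_inputs, actual_inputs):
        # No validators run here: the caller is trusted to pass well-formed
        # IDs, and a malformed string raises ValueError from uuid.UUID.
        return cls(
            predicted_inputs=_to_uuid_set(predicted_inputs),
            actual_inputs=_to_uuid_set(actual_inputs),
        )


known = uuid.UUID("12345678-1234-5678-1234-567812345678")
sketch = ProvenanceSketch.direct(
    predicted_inputs=[str(known), uuid.uuid4()],  # str and UUID may be mixed
    actual_inputs=[known],
)
```

Because nothing is validated beyond the coercion itself, this style of constructor is appropriate only for data that was produced by trusted code, such as records previously serialized by the same software.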

classmethod parse_raw(*args: Any, **kwargs: Any) → QuantumProvenanceData
