DatasetProvenance

class lsst.daf.butler.DatasetProvenance(*, inputs: list[lsst.daf.butler._dataset_ref.SerializedDatasetRef] = <factory>, quantum_id: ~uuid.UUID | None = None, extras: dict[uuid.UUID, dict[str, str | int | float | bool | None | uuid.UUID]] = <factory>)

Bases: BaseModel

Provenance of a single DatasetRef.

Attributes Summary

model_config

Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.

Methods Summary

add_extra_provenance(dataset_id, extra)

Attach extra provenance to a specific dataset.

add_input(ref)

Add an input dataset to the provenance.

from_flat_dict(prov_dict, butler)

Create a provenance object from a provenance dictionary.

model_post_init(context, /)

This function is meant to behave like a BaseModel method to initialise private attributes.

populate_cache()

strip_provenance_from_flat_dict(prov_dict)

Remove provenance keys from a mapping that had been populated by to_flat_dict.

to_flat_dict(ref, /, *[, prefix, sep, ...])

Return provenance as a flattened dictionary.

Attributes Documentation

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.

Methods Documentation

add_extra_provenance(dataset_id: uuid.UUID, extra: Mapping[str, _PROV_TYPES]) → None

Attach extra provenance to a specific dataset.

Parameters:
dataset_id : uuid.UUID

The ID of the dataset to receive this provenance.

extra : collections.abc.Mapping [str, typing.Any]

The extra provenance information as a dictionary. The values must be simple Python scalars, or scalars that can be serialized by Pydantic and converted to a simple string value.

Notes

The keys in the extra provenance cannot include the reserved provenance keys run, id, or datasettype (in either upper or lower case).
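The reserved-key restriction above can be illustrated with a small standalone sketch. The `_RESERVED` set and `check_extra_keys` helper below are hypothetical names for demonstration only, not the library's actual implementation:

```python
# Reserved provenance key names, compared case-insensitively.
# Illustrative only; not the library's actual internals.
_RESERVED = {"run", "id", "datasettype"}


def check_extra_keys(extra: dict) -> None:
    """Raise if any extra-provenance key clashes with a reserved name."""
    for key in extra:
        if key.lower() in _RESERVED:
            raise ValueError(f"{key!r} is a reserved provenance key")


check_extra_keys({"exposure_time": 30.0, "seeing": 0.7})  # accepted
try:
    check_extra_keys({"RUN": "my_run"})  # rejected regardless of case
except ValueError as exc:
    print(exc)
```

Because the comparison lowercases each key first, mixed-case variants such as "DatasetType" are rejected along with the plain lowercase forms.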

add_input(ref: DatasetRef) → None

Add an input dataset to the provenance.

Parameters:
ref : DatasetRef

A dataset to register as an input.

classmethod from_flat_dict(prov_dict: Mapping[str, Any], butler: Butler) → tuple[Self, DatasetRef | None]

Create a provenance object from a provenance dictionary.

Parameters:
prov_dict : collections.abc.Mapping

Dictionary populated by to_flat_dict.

butler : lsst.daf.butler.Butler

Butler to query to find the referenced datasets.

Returns:
prov : DatasetProvenance

Provenance extracted from the dictionary.

ref : DatasetRef or None

Dataset associated with this provenance. Can be None if no dataset provenance is found.

Raises:
ValueError

Raised if no provenance values are found in the dictionary.

RuntimeError

Raised if a referenced dataset is not known to the given butler.
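Part of what from_flat_dict must do is regroup the flat input.N.field keys produced by to_flat_dict into one record per input before resolving them against the butler. The `group_input_records` helper below is a hypothetical illustration of just that grouping step, not the library's code:

```python
import re
from collections import defaultdict


def group_input_records(prov_dict: dict) -> list[dict]:
    """Group flat ``input.N.<field>`` keys into one record per input.

    Illustrative sketch only; the real classmethod also resolves each
    record against a Butler to produce DatasetRef objects.
    """
    pattern = re.compile(r"^input\.(\d+)\.(\w+)$")
    records: dict[int, dict] = defaultdict(dict)
    for key, value in prov_dict.items():
        if match := pattern.match(key):
            # match.group(1) is the input counter N; group(2) the field name.
            records[int(match.group(1))][match.group(2)] = value
    return [records[n] for n in sorted(records)]


flat = {
    "id": "ae0fa83d-cc89-41dd-9680-f97ede49f01e",
    "input.0.id": "3dfd7ba5-5e35-4565-9d87-4b33880ed06c",
    "input.0.run": "other_run",
    "input.1.id": "7a99f6e9-4035-3d68-842e-58ecce1dc935",
    "input.1.run": "other_run",
}
for record in group_input_records(flat):
    print(record)
```

Sorting on the numeric counter preserves the original input ordering even though dictionary key order for the flat mapping is not guaranteed.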

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialise private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:
self : BaseModel

The BaseModel instance.

context : Any

The context.

populate_cache() → Self

classmethod strip_provenance_from_flat_dict(prov_dict: MutableMapping[str, Any]) → None

Remove provenance keys from a mapping that had been populated by to_flat_dict.

Parameters:
prov_dict : collections.abc.MutableMapping

Dictionary to modify.
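The effect of stripping provenance keys in place can be sketched with a small standalone mimic. The real classmethod detects provenance keys itself; the `strip_provenance` function below assumes a known prefix purely for demonstration and is not the library's implementation:

```python
def strip_provenance(prov_dict: dict, prefix: str = "lsst.butler",
                     sep: str = ".") -> None:
    """Delete keys under the given provenance prefix, in place (illustrative)."""
    marker = prefix + sep
    # Materialize the key list first so we can delete while iterating.
    for key in [k for k in prov_dict if k.startswith(marker)]:
        del prov_dict[key]


headers = {
    "lsst.butler.id": "ae0fa83d-cc89-41dd-9680-f97ede49f01e",
    "lsst.butler.run": "test_run",
    "exptime": 30.0,  # non-provenance metadata survives
}
strip_provenance(headers)
print(headers)  # {'exptime': 30.0}
```

Like the documented classmethod, the sketch returns nothing and mutates the given mapping, which suits the typical use case of cleaning provenance out of a FITS-style header dictionary before reuse.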

to_flat_dict(ref: DatasetRef | None, /, *, prefix: str = '', sep: str = '.', simple_types: bool = False, use_upper: bool | None = None) → dict[str, str | int | float | bool | None | uuid.UUID]

Return provenance as a flattened dictionary.

Parameters:
ref : DatasetRef or None

If given, a dataset for which this provenance is relevant and should be included.

prefix : str, optional

A prefix to use for each key in the provenance dictionary.

sep : str, optional

Separator to use to represent hierarchy. Must be a single character, and cannot be a number, letter, or underscore (to avoid confusion with provenance keys themselves).

simple_types : bool, optional

If True, only simple Python types will be used in the returned dictionary; specifically, UUIDs will be returned as str. If False, UUIDs will be returned as uuid.UUID. Complex types found in DatasetProvenance.extras will be cast to str if True.

use_upper : bool or None, optional

If None, the case of the keys matches the case of the first character of the prefix: upper case if str.isupper() returns True for that character, else lower case. If False, the keys will be lower case; if True, upper case.

Returns:
prov : dict

Dictionary representing the provenance. The keys are defined in the notes below.

Raises:
ValueError

Raised if the separator is not a single character.

Notes

Keys from the given dataset (all optional if no dataset is given):

id :

UUID of the given dataset.

run :

Run of the given dataset.

datasettype :

Dataset type of the given dataset.

dataid.x :

An entry for each required dimension, "x", in the data ID.

Each input dataset will have the id, run, and datasettype keys as defined above (but no dataid key), with an input.N prefix where N starts counting at 0.

The quantum ID, if present, will use key quantum.

Examples

>>> provenance.to_flat_dict(
...     ref, prefix="lsst.butler", sep=".", simple_types=True
... )
{
    "lsst.butler.id": "ae0fa83d-cc89-41dd-9680-f97ede49f01e",
    "lsst.butler.run": "test_run",
    "lsst.butler.datasettype": "data",
    "lsst.butler.dataid.detector": 10,
    "lsst.butler.dataid.instrument": "LSSTCam",
    "lsst.butler.quantum": "d93a735b-08f0-477d-bc95-2cc32d6d898b",
    "lsst.butler.input.0.id": "3dfd7ba5-5e35-4565-9d87-4b33880ed06c",
    "lsst.butler.input.0.run": "other_run",
    "lsst.butler.input.0.datasettype": "astropy_parquet",
    "lsst.butler.input.1.id": "7a99f6e9-4035-3d68-842e-58ecce1dc935",
    "lsst.butler.input.1.run": "other_run",
    "lsst.butler.input.1.datasettype": "astropy_parquet",
}