DatasetRef
- class lsst.daf.butler.DatasetRef(datasetType: DatasetType, dataId: DataCoordinate, run: str, *, id: UUID | None = None, conform: bool = True, id_generation_mode: DatasetIdGenEnum = DatasetIdGenEnum.UNIQUE, datastore_records: Mapping[str, list[lsst.daf.butler.datastore.stored_file_info.StoredDatastoreItemInfo]] | None = None)
Bases: object
Reference to a Dataset in a Registry.
A DatasetRef may point to a Dataset that currently does not yet exist (e.g., because it is a predicted input for provenance).
Parameters:
- datasetType (DatasetType): The DatasetType for this Dataset.
- dataId (DataCoordinate): A mapping of dimensions that labels the Dataset within a Collection.
- run (str): The name of the run this dataset was associated with when it was created.
- id (DatasetId, optional): The unique identifier assigned when the dataset is created. If id is not specified, a new unique ID will be created.
- conform (bool, optional): If True (default), call DataCoordinate.standardize to ensure that the data ID's dimensions are consistent with the dataset type's. DatasetRef instances for which those dimensions are not equal should not be created in new code, but are still supported for backwards compatibility. New code should only pass False if it can guarantee that the dimensions are already consistent.
- id_generation_mode (DatasetIdGenEnum): ID generation option. UNIQUE makes a random UUID4-type ID. DATAID_TYPE makes a deterministic UUID5-type ID based on a dataset type name and dataId. DATAID_TYPE_RUN makes a deterministic UUID5-type ID based on a dataset type name, run collection name, and dataId.
- datastore_records (DatasetDatastoreRecords or None): Datastore records to attach.
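The three ID generation modes differ only in whether the ID is random or derived from the identifying fields of the ref. The following is a minimal sketch built on Python's standard uuid module. The function name, the namespace value, and the way the fields are joined into a string are all assumptions made for illustration; they are not the encoding that lsst.daf.butler uses.

```python
import uuid

# Hypothetical namespace, for illustration only; the real library defines its own.
EXAMPLE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid")


def make_dataset_id(mode, dataset_type_name, data_id, run=None):
    """Illustrate random (UUID4) versus deterministic (UUID5) dataset IDs."""
    if mode == "UNIQUE":
        # Random ID: two calls never agree.
        return uuid.uuid4()
    parts = [("dataset_type", dataset_type_name)]
    if mode == "DATAID_TYPE_RUN":
        # The run name also contributes to the ID in this mode.
        parts.append(("run", run))
    elif mode != "DATAID_TYPE":
        raise ValueError(f"Unknown mode: {mode}")
    # Sort the data ID items so that key order does not affect the result.
    parts.extend(sorted(data_id.items()))
    payload = ",".join(f"{key}={value}" for key, value in parts)
    return uuid.uuid5(EXAMPLE_NAMESPACE, payload)
```

With a deterministic mode, two processes that agree on the dataset type name, the data ID, and (for DATAID_TYPE_RUN) the run name can compute the same ID independently.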
Notes
See also Organizing and identifying datasets
Attributes Summary
- dataId: A mapping of Dimension primary key values that labels the dataset within a Collection (DataCoordinate).
- datasetType: The definition of this dataset (DatasetType).
- dimensions: Dimensions associated with the underlying DatasetType.
- id: Primary key of the dataset (DatasetId).
- run: The name of the run that produced the dataset.
Methods Summary
- expanded(dataId): Return a new DatasetRef with the given expanded data ID.
- from_json(json_str[, universe, registry]): Convert from JSON to a pydantic model.
- from_simple(simple[, universe, registry, ...]): Construct a new object from simplified form.
- groupByType(refs): Group an iterable of DatasetRef by DatasetType.
- isComponent(): Indicate whether this DatasetRef refers to a component.
- isComposite(): Boolean indicating whether this DatasetRef is a composite type.
- is_compatible_with(other): Determine if the given DatasetRef is compatible with this one.
- iter_by_type(refs): Group an iterable of DatasetRef by DatasetType with special hooks for custom iterables that can do this efficiently.
- makeComponentRef(name): Create a DatasetRef that corresponds to a component.
- makeCompositeRef(): Create a DatasetRef of the composite from a component ref.
- overrideStorageClass(storageClass): Create a new DatasetRef from this one, but with a modified DatasetType that has a different StorageClass.
- replace(*[, id, run, storage_class, ...]): Create a new DatasetRef from this one, but with some modified attributes.
- to_json([minimal]): Convert this class to JSON, assuming that to_simple() returns a pydantic model.
- to_simple([minimal]): Convert this class to a simple Python type.
Attributes Documentation
- dataId: DataCoordinate
  A mapping of Dimension primary key values that labels the dataset within a Collection (DataCoordinate).
  Cannot be changed after a DatasetRef is constructed.
- datasetType: DatasetType
  The definition of this dataset (DatasetType).
  Cannot be changed after a DatasetRef is constructed.
- dimensions
  Dimensions associated with the underlying DatasetType.
- id
  Primary key of the dataset (DatasetId).
  Cannot be changed after a DatasetRef is constructed.
- run: str
  The name of the run that produced the dataset.
  Cannot be changed after a DatasetRef is constructed.
Methods Documentation
- expanded(dataId: DataCoordinate) -> DatasetRef
  Return a new DatasetRef with the given expanded data ID.
  Parameters:
  - dataId (DataCoordinate): Data ID for the new DatasetRef. Must compare equal to the original data ID.
  Returns:
  - ref (DatasetRef): A new DatasetRef with the given data ID.
- classmethod from_json(json_str: str, universe: DimensionUniverse | None = None, registry: Registry | None = None) -> SupportsSimple
  Convert from JSON to a pydantic model.
  Parameters:
  - cls_ (type of SupportsSimple): The Python type being created.
  - json_str (str): The JSON string representing this object.
  - universe (DimensionUniverse or None, optional): The universe required to instantiate some models. Required if registry is None.
  - registry (Registry or None, optional): Registry from which to obtain the dimension universe if an explicit universe has not been given.
  Returns:
  - model (SupportsSimple): Pydantic model constructed from JSON and validated.
- classmethod from_simple(simple: SerializedDatasetRef, universe: DimensionUniverse | None = None, registry: Registry | None = None, datasetType: DatasetType | None = None) -> DatasetRef
  Construct a new object from simplified form.
  Generally this is data returned from the to_simple method.
  Parameters:
  - simple (dict of [str, Any]): The value returned by to_simple().
  - universe (DimensionUniverse): The special graph of all known dimensions. Can be None if a registry is provided.
  - registry (lsst.daf.butler.Registry, optional): Registry to use to convert simple form of a DatasetRef to a full DatasetRef. Can be None if a full description of the type is provided along with a universe.
  - datasetType (DatasetType, optional): If datasetType is supplied, this will be used as the datasetType object in the resulting DatasetRef instead of being read from the SerializedDatasetRef. This is useful when many refs share the same type as memory can be saved. Defaults to None.
  Returns:
  - ref (DatasetRef): Newly-constructed object.
- static groupByType(refs: Iterable[DatasetRef]) -> NamedKeyDict[DatasetType, list[lsst.daf.butler._dataset_ref.DatasetRef]]
  Group an iterable of DatasetRef by DatasetType.
  Parameters:
  - refs (Iterable[DatasetRef]): DatasetRef instances to group.
  Returns:
  - grouped (NamedKeyDict[DatasetType, list[DatasetRef]]): Grouped DatasetRef instances.
  Notes
  When lazy item-iterables are acceptable instead of a full mapping, iter_by_type can in some cases be far more efficient.
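As a rough illustration of what an eager "full mapping" grouping looks like, here is a minimal sketch using only the standard library. FakeRef is a hypothetical stand-in for DatasetRef, and a plain dict keyed by a type name stands in for NamedKeyDict; neither is part of lsst.daf.butler.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeRef:
    """Hypothetical stand-in for DatasetRef, for illustration only."""

    dataset_type: str
    data_id: int


def group_by_type(refs):
    """Eagerly build a complete mapping from dataset type to a list of refs."""
    grouped = defaultdict(list)
    for ref in refs:
        grouped[ref.dataset_type].append(ref)
    # Every ref has been visited and stored by the time this returns.
    return dict(grouped)
```

Because the whole input is consumed and held in memory before anything is returned, a lazy alternative can be cheaper when the caller only needs to stream over the groups once.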
- isComponent() -> bool
  Indicate whether this DatasetRef refers to a component.
  Returns:
  - isComponent (bool): True if this DatasetRef is a component, False otherwise.
- isComposite() -> bool
  Boolean indicating whether this DatasetRef is a composite type.
  Returns:
  - isComposite (bool): True if this DatasetRef is a composite type, False otherwise.
- is_compatible_with(other: DatasetRef) -> bool
  Determine if the given DatasetRef is compatible with this one.
  Parameters:
  - other (DatasetRef): Dataset ref to check.
  Returns:
  - (bool): True if the given DatasetRef is compatible with this one, False otherwise.
  Notes
  Compatibility requires that the dataId and dataset ID match and that the DatasetType is compatible. A dataset type is compatible when the storage class associated with the dataset type of the other ref can be converted to this ref's storage class.
  Specifically this means that if you have done:
      new_ref = ref.overrideStorageClass(sc)
  and this is successful, then the guarantee is that:
      assert ref.is_compatible_with(new_ref) is True
  since we know that the Python type associated with the new ref can be converted to the original Python type. The reverse is not guaranteed and depends on whether bidirectional converters have been registered.
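The one-directional nature of this check can be sketched with a toy converter registry. Everything here is hypothetical: FakeRef, CONVERTERS, can_convert, and the storage class names are illustrative stand-ins, and the real conversion machinery in lsst.daf.butler is not shown.

```python
from dataclasses import dataclass

# Hypothetical converter registry: each (source, target) pair means that a
# value of the source storage class can be converted to the target one.
# Only one direction is registered, so the relation is not symmetric.
CONVERTERS = {("DataFrame", "ArrowTable")}


def can_convert(source: str, target: str) -> bool:
    """Return True if the classes match or a converter is registered."""
    return source == target or (source, target) in CONVERTERS


@dataclass(frozen=True)
class FakeRef:
    """Hypothetical stand-in for DatasetRef, for illustration only."""

    dataset_id: int
    data_id: tuple
    storage_class: str

    def is_compatible_with(self, other: "FakeRef") -> bool:
        # The dataset ID and the data ID must match exactly.
        if (self.dataset_id, self.data_id) != (other.dataset_id, other.data_id):
            return False
        # The other ref's storage class must be convertible to this one.
        return can_convert(other.storage_class, self.storage_class)


arrow_ref = FakeRef(1, (("visit", 42),), "ArrowTable")
frame_ref = FakeRef(1, (("visit", 42),), "DataFrame")
```

In this sketch arrow_ref accepts frame_ref because a DataFrame-to-ArrowTable converter is registered, while the reverse check fails until the opposite converter is added.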
- static iter_by_type(refs: Iterable[DatasetRef]) -> Iterable[tuple[lsst.daf.butler._dataset_type.DatasetType, collections.abc.Iterable[lsst.daf.butler._dataset_ref.DatasetRef]]]
  Group an iterable of DatasetRef by DatasetType with special hooks for custom iterables that can do this efficiently.
  Parameters:
  - refs (Iterable[DatasetRef]): DatasetRef instances to group. If this satisfies the _DatasetRefGroupedIterable protocol, its _iter_by_dataset_type method will be called.
  Returns:
  - grouped (Iterable[tuple[DatasetType, Iterable[DatasetRef]]]): Grouped DatasetRef instances.
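The "special hook" pattern described for iter_by_type can be sketched as follows: check whether the input already knows how to group itself, and fall back to generic grouping otherwise. GroupedIterable and RefsByType are hypothetical names, and refs are modelled as plain (dataset_type_name, data_id) tuples; only the _iter_by_dataset_type method name comes from the text above. The sort-then-group fallback is one possible choice and is not claimed to match the library's.

```python
from itertools import groupby
from typing import Protocol, runtime_checkable


@runtime_checkable
class GroupedIterable(Protocol):
    """Hypothetical analogue of the _DatasetRefGroupedIterable protocol."""

    def _iter_by_dataset_type(self): ...


class RefsByType:
    """Hypothetical container that already stores its refs grouped by type."""

    def __init__(self, mapping):
        self._mapping = mapping

    def __iter__(self):
        for refs in self._mapping.values():
            yield from refs

    def _iter_by_dataset_type(self):
        # No regrouping needed: the data is already organized by type.
        return iter(self._mapping.items())


def iter_by_type(refs):
    """Yield (dataset type, refs) pairs, using the hook when available."""
    if isinstance(refs, GroupedIterable):
        return refs._iter_by_dataset_type()

    def type_of(ref):
        return ref[0]

    # Generic fallback: sort by type, then group adjacent items.
    ordered = sorted(refs, key=type_of)
    return ((name, list(group)) for name, group in groupby(ordered, key=type_of))
```

The caller sees the same shape of result in both cases, so a container that is already grouped avoids the cost of sorting or rebuilding a mapping.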
- makeComponentRef(name: str) -> DatasetRef
  Create a DatasetRef that corresponds to a component.
  Parameters:
  - name (str): Name of the component.
  Returns:
  - ref (DatasetRef): A DatasetRef with a dataset type that corresponds to the given component, and the same ID and run (which may be None, if they are None in self).
- makeCompositeRef() -> DatasetRef
  Create a DatasetRef of the composite from a component ref.
  Requires that this DatasetRef is a component.
  Returns:
  - ref (DatasetRef): A DatasetRef with a dataset type that corresponds to the composite parent of this component, and the same ID and run (which may be None, if they are None in self).
- overrideStorageClass(storageClass: str | StorageClass) -> DatasetRef
  Create a new DatasetRef from this one, but with a modified DatasetType that has a different StorageClass.
  Parameters:
  - storageClass (str or StorageClass): The new storage class.
  Returns:
  - modified (DatasetRef): A new dataset reference that is the same as the current one but with a different storage class in the DatasetType.
- replace(*, id: DatasetId | None = None, run: str | None = None, storage_class: str | StorageClass | None = None, datastore_records: DatasetDatastoreRecords | None | Literal[False] = False) -> DatasetRef
  Create a new DatasetRef from this one, but with some modified attributes.
  Parameters:
  - id (DatasetId or None): If not None then update dataset ID.
  - run (str or None): If not None then update run collection name. If id is None then this will also cause a new dataset ID to be generated.
  - storage_class (str or StorageClass or None): The new storage class. If not None, replaces existing storage class.
  - datastore_records (DatasetDatastoreRecords or None): New datastore records. If None, remove all records. By default (False) datastore records are preserved.
  Returns:
  - modified (DatasetRef): A new dataset reference with updated attributes.
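The datastore_records default of False lets None carry its own meaning ("remove all records") while an omitted argument still means "keep what is there". Below is a minimal sketch of that sentinel pattern on a hypothetical FakeRef dataclass. The field names and the use of a random UUID for the regenerated ID are assumptions for illustration, not the library's implementation.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FakeRef:
    """Hypothetical stand-in for DatasetRef, for illustration only."""

    id: uuid.UUID
    run: str
    records: Optional[dict] = None

    def replace(self, *, id=None, run=None, datastore_records=False):
        new_run = self.run if run is None else run
        if id is not None:
            new_id = id
        elif run is not None:
            # Assumption drawn from the parameter notes: changing the run
            # without supplying an ID produces a fresh ID.
            new_id = uuid.uuid4()
        else:
            new_id = self.id
        # False is the "not given" sentinel, so None can mean "drop records".
        if datastore_records is False:
            new_records = self.records
        else:
            new_records = datastore_records
        return FakeRef(new_id, new_run, new_records)
```

Calling replace() with no arguments returns an equal ref, replace(run=...) changes both the run and the ID, and replace(datastore_records=None) clears the records while leaving the ID alone.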
- to_json(minimal: bool = False) -> str
  Convert this class to JSON, assuming that to_simple() returns a pydantic model.
  Parameters:
  - minimal (bool): Return minimal possible representation.
- to_simple(minimal: bool = False) -> SerializedDatasetRef
  Convert this class to a simple Python type.
  This makes it suitable for serialization.
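The relationship between to_simple, to_json, from_simple, and from_json can be sketched as a round trip through a plain dictionary. FakeRef and its field names are hypothetical, a dict stands in for the pydantic SerializedDatasetRef model, and the minimal, universe, and registry arguments are omitted; this shows the general pattern only, under those assumptions.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeRef:
    """Hypothetical stand-in for DatasetRef, for illustration only."""

    dataset_type: str
    data_id: dict
    run: str

    def to_simple(self):
        # Reduce the object to JSON-friendly built-in types.
        return {"datasetType": self.dataset_type, "dataId": self.data_id, "run": self.run}

    def to_json(self):
        return json.dumps(self.to_simple())

    @classmethod
    def from_simple(cls, simple, dataset_type=None):
        # A caller-supplied type takes precedence over the serialized one,
        # so many refs can share a single type object.
        if dataset_type is None:
            dataset_type = simple["datasetType"]
        return cls(dataset_type, simple["dataId"], simple["run"])

    @classmethod
    def from_json(cls, json_str):
        return cls.from_simple(json.loads(json_str))
```

A ref serialized with to_json and read back with from_json compares equal to the original in this sketch, which is the property the simplified form is meant to preserve.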