DatasetRef

class lsst.daf.butler.DatasetRef

Bases: object

Reference to a Dataset in a Registry.

A DatasetRef may point to a Dataset that does not yet exist (e.g., because it is a predicted input for provenance).

Parameters:

datasetType : DatasetType: The DatasetType for this Dataset.
dataId : DataCoordinate: A mapping of dimensions that labels the Dataset within a Collection.
id : int, optional: The unique integer identifier assigned when the dataset is created.
run : str, optional: The name of the run this dataset was associated with when it was created. Must be provided if id is.
hash : bytes, optional: A hash of the dataset type and data ID. Should only be provided if copying from another DatasetRef with the same dataset type and data ID.
components : dict, optional: A dictionary mapping component name to a DatasetRef for that component. Should not be passed unless id is also provided (i.e. if this is a “resolved” reference).
conform : bool, optional: If True (default), call DataCoordinate.standardize to ensure that the data ID’s dimensions are consistent with the dataset type’s. DatasetRef instances for which those dimensions are not equal should not be created in new code, but are still supported for backwards compatibility. New code should only pass False if it can guarantee that the dimensions are already consistent.
hasParentId : bool, optional: If True, this DatasetRef is a component that has the id of the composite parent. This is set if the registry does not know about individual components but does know about the composite.

Raises:

ValueError: Raised if run or components is provided but id is not, or if a component dataset is inconsistent with the storage class, or if id is provided but run is not.
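The argument rules listed under Raises can be sketched as a standalone check. This is only an illustration: `check_ref_arguments` is a hypothetical helper, not part of lsst.daf.butler, and it leaves out the storage-class consistency check because that depends on the real DatasetType.

```python
from typing import Mapping, Optional


def check_ref_arguments(
    id: Optional[int] = None,
    run: Optional[str] = None,
    components: Optional[Mapping[str, object]] = None,
) -> None:
    """Hypothetical helper mirroring the documented ValueError conditions."""
    if id is None:
        # An unresolved reference carries neither a run nor components.
        if run is not None:
            raise ValueError("run was provided but id was not")
        if components is not None:
            raise ValueError("components were provided but id was not")
    elif run is None:
        # A resolved reference must always name its run.
        raise ValueError("id was provided but run was not")


check_ref_arguments()                     # unresolved: accepted
check_ref_arguments(id=42, run="ingest")  # resolved: accepted
try:
    check_ref_arguments(id=42)            # id without run: rejected
except ValueError as err:
    print(err)
```

The run name "ingest" and the id value are made up for the example.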

Attributes Summary

`components`	Named `DatasetRef` components (`Mapping` or `None`).
`dataId`	A mapping of `Dimension` primary key values that labels the dataset within a Collection (`DataCoordinate`).
`datasetType`	The definition of this dataset (`DatasetType`).
`dimensions`	The dimensions associated with the underlying `DatasetType`
`hasParentId`	Whether this `DatasetRef` is a component that has the ID of its composite parent (`bool`).
`hash`	Secure hash of the `DatasetType` name and data ID (`bytes`).
`id`	Primary key of the dataset (`int` or `None`).
`run`	The name of the run that produced the dataset.

Methods Summary

`expanded`(dataId)	Return a new `DatasetRef` with the given expanded data ID.
`flatten`(refs, *, parents)	Recursively transform an iterable over `DatasetRef` to include nested component `DatasetRef` instances.
`getCheckedId`()	Return `self.id`, or raise if it is `None`.
`groupByType`(refs, *, recursive)	Group an iterable of `DatasetRef` by `DatasetType`.
`isComponent`()	Boolean indicating whether this `DatasetRef` refers to a component of a composite.
`isComposite`()	Boolean indicating whether this `DatasetRef` is a composite type.
`resolved`(id, run, components, …)	Return a new `DatasetRef` with the same data ID and dataset type and the given ID and run.
`unresolved`()	Return a new `DatasetRef` with the same data ID and dataset type, but no ID, run, or components.

Attributes Documentation

components

Named DatasetRef components (Mapping or None).

For resolved DatasetRef instances, this is a read-only mapping. For unresolved instances, this is always None.

dataId

A mapping of Dimension primary key values that labels the dataset within a Collection (DataCoordinate).

Cannot be changed after a DatasetRef is constructed.

datasetType

The definition of this dataset (DatasetType).

Cannot be changed after a DatasetRef is constructed.

dimensions: The dimensions associated with the underlying DatasetType.

hasParentId: True if this DatasetRef is a component that has the ID of its composite parent (bool).

hash: Secure hash of the DatasetType name and data ID (bytes).

id

Primary key of the dataset (int or None).

Cannot be changed after a DatasetRef is constructed; use resolved or unresolved to add or remove this information when creating a new DatasetRef.

run

The name of the run that produced the dataset.

Cannot be changed after a DatasetRef is constructed; use resolved or unresolved to add or remove this information when creating a new DatasetRef.

Methods Documentation

expanded(dataId: lsst.daf.butler.core.dimensions.coordinate.ExpandedDataCoordinate) → lsst.daf.butler.core.datasets.ref.DatasetRef

Return a new DatasetRef with the given expanded data ID.

Parameters:	dataId : `ExpandedDataCoordinate` Data ID for the new `DatasetRef`. Must compare equal to the original data ID.
Returns:	ref : `DatasetRef` A new `DatasetRef` with the given data ID.

static flatten(refs: Iterable[lsst.daf.butler.core.datasets.ref.DatasetRef], *, parents: bool = True) → Iterator[lsst.daf.butler.core.datasets.ref.DatasetRef]

Recursively transform an iterable over DatasetRef to include nested component DatasetRef instances.

Parameters:

refs : Iterable [ DatasetRef ]: Input iterable to process. Must contain only resolved DatasetRef instances (i.e. with DatasetRef.components not None).
parents : bool, optional: If True (default) include the given datasets in the output iterable. If False, include only their components. This does not propagate recursively - only the outermost level of parents is ignored if parents is False.

Yields:

ref : DatasetRef: Either one of the given DatasetRef instances (only if parents is True) or one of its (recursive) children.

Notes

If parents is True, components are guaranteed to be yielded before their parents.
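The traversal order described above can be sketched with a small stand-in class. `ToyRef` and the module-level `flatten` function here are illustrative assumptions, not the real lsst.daf.butler objects, and the dataset type names are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator


@dataclass
class ToyRef:
    """Stand-in for a resolved DatasetRef; not the real class."""
    name: str
    components: Dict[str, "ToyRef"] = field(default_factory=dict)


def flatten(refs: Iterable[ToyRef], *, parents: bool = True) -> Iterator[ToyRef]:
    for ref in refs:
        # Nested levels always include their own parents; only the
        # outermost level honours parents=False.
        yield from flatten(ref.components.values(), parents=True)
        if parents:
            # Yield the parent after all of its components.
            yield ref


wcs = ToyRef("calexp.wcs")
image = ToyRef("calexp.image")
calexp = ToyRef("calexp", {"wcs": wcs, "image": image})

print([r.name for r in flatten([calexp])])
# prints ['calexp.wcs', 'calexp.image', 'calexp']
print([r.name for r in flatten([calexp], parents=False)])
# prints ['calexp.wcs', 'calexp.image']
```

Yielding components first means a consumer that processes the output in order always sees the pieces of a composite before the composite itself.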

getCheckedId() → int

Return self.id, or raise if it is None.

This trivial method lets operations that are naturally written as list comprehensions also check that the ID is not None.

Returns:	id : `int` `self.id` if it is not `None`.
Raises:	AmbiguousDatasetError Raised if `self.id` is `None`.
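A minimal sketch of the list-comprehension use case follows. `ToyRef` and the local `AmbiguousDatasetError` are stand-ins defined only for this example, not the real lsst.daf.butler classes.

```python
from dataclasses import dataclass
from typing import List, Optional


class AmbiguousDatasetError(Exception):
    """Local stand-in for the exception named in the documentation."""


@dataclass(frozen=True)
class ToyRef:
    """Stand-in for DatasetRef that models only the id attribute."""
    id: Optional[int] = None

    def getCheckedId(self) -> int:
        if self.id is None:
            raise AmbiguousDatasetError("unresolved reference has no id")
        return self.id


refs = [ToyRef(id=1), ToyRef(id=2)]
# The comprehension yields plain ints and fails fast on any unresolved ref.
ids: List[int] = [ref.getCheckedId() for ref in refs]
print(ids)
# prints [1, 2]
```

Without such a method, the comprehension would need a separate loop or an inline conditional to rule out None.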

static groupByType(refs: Iterable[lsst.daf.butler.core.datasets.ref.DatasetRef], *, recursive: bool = True) → lsst.daf.butler.core.utils.NamedKeyDict[lsst.daf.butler.core.datasets.type.DatasetType, List[lsst.daf.butler.core.datasets.ref.DatasetRef]]

Group an iterable of DatasetRef by DatasetType.

Parameters:

refs : Iterable [ DatasetRef ]: DatasetRef instances to group.
recursive : bool, optional: If True (default), also group any DatasetRef instances found in the DatasetRef.components dictionaries of refs, recursively. When True, this also checks that every reference is “resolved” (unresolved references never have components).

Returns:

grouped : NamedKeyDict [ DatasetType, list [ DatasetRef ] ]: Grouped DatasetRef instances.

Raises:

AmbiguousDatasetError: Raised if recursive is True and one or more refs has DatasetRef.components set to None (as is always the case for unresolved DatasetRef objects).
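The grouping behaviour can be sketched as below, under loud assumptions: `ToyRef`, `group_by_type`, and the local `AmbiguousDatasetError` are stand-ins written for this example, dataset types are reduced to plain strings, and a plain dict replaces NamedKeyDict. The order of entries within each group is not meant to match the real method.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Mapping, Optional


class AmbiguousDatasetError(Exception):
    """Local stand-in for the exception named in the documentation."""


@dataclass
class ToyRef:
    """Stand-in for DatasetRef; components is None when unresolved."""
    datasetType: str
    id: Optional[int] = None
    components: Optional[Mapping[str, "ToyRef"]] = None


def group_by_type(
    refs: Iterable[ToyRef], *, recursive: bool = True
) -> Dict[str, List[ToyRef]]:
    grouped: Dict[str, List[ToyRef]] = {}
    for ref in refs:
        grouped.setdefault(ref.datasetType, []).append(ref)
        if recursive:
            # Recursion needs the components mapping, so unresolved refs fail.
            if ref.components is None:
                raise AmbiguousDatasetError(f"{ref.datasetType} is unresolved")
            nested = group_by_type(ref.components.values(), recursive=True)
            for dataset_type, children in nested.items():
                grouped.setdefault(dataset_type, []).extend(children)
    return grouped


wcs = ToyRef("calexp.wcs", id=11, components={})
first = ToyRef("calexp", id=1, components={"wcs": wcs})
second = ToyRef("calexp", id=2, components={})
grouped = group_by_type([first, second])
print({name: [ref.id for ref in group] for name, group in grouped.items()})
# prints {'calexp': [1, 2], 'calexp.wcs': [11]}
```

Passing `recursive=False` in this sketch skips both the descent into components and the resolved check, so unresolved references are accepted.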

isComponent() → bool

Boolean indicating whether this DatasetRef refers to a component of a composite.

Returns:	isComponent : `bool` `True` if this `DatasetRef` is a component, `False` otherwise.

isComposite() → bool

Boolean indicating whether this DatasetRef is a composite type.

Returns:	isComposite : `bool` `True` if this `DatasetRef` is a composite type, `False` otherwise.

resolved(id: int, run: str, components: Optional[Mapping[str, lsst.daf.butler.core.datasets.ref.DatasetRef]] = None) → lsst.daf.butler.core.datasets.ref.DatasetRef

Return a new DatasetRef with the same data ID and dataset type and the given ID and run.

Parameters:

id : int: The unique integer identifier assigned when the dataset is created.
run : str: The run this dataset was associated with when it was created.
components : dict, optional: A dictionary mapping component name to a DatasetRef for that component. If self is already a resolved DatasetRef, its components will be merged with this dictionary, with this dictionary taking precedence.

Returns:

ref : DatasetRef: A new DatasetRef.
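The component-merging rule can be illustrated with a frozen stand-in class. `ToyRef` is an assumption made for this sketch, not the real DatasetRef, and the data ID, run name, and dataset type names are invented.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class ToyRef:
    """Immutable stand-in for DatasetRef; not the real class."""
    datasetType: str
    dataId: Tuple[Tuple[str, int], ...]
    id: Optional[int] = None
    run: Optional[str] = None
    components: Optional[Mapping[str, "ToyRef"]] = None

    def resolved(
        self,
        id: int,
        run: str,
        components: Optional[Mapping[str, "ToyRef"]] = None,
    ) -> "ToyRef":
        # Start from any existing components, then let the new ones win.
        merged = dict(self.components or {})
        merged.update(components or {})
        return ToyRef(self.datasetType, self.dataId, id=id, run=run, components=merged)


ref = ToyRef("calexp", (("visit", 903334), ("detector", 16)))
old_wcs = ToyRef("calexp.wcs", ref.dataId, id=10, run="run1")
new_wcs = ToyRef("calexp.wcs", ref.dataId, id=20, run="run1")
psf = ToyRef("calexp.psf", ref.dataId, id=11, run="run1")

first = ref.resolved(1, "run1", {"wcs": old_wcs, "psf": psf})
second = first.resolved(1, "run1", {"wcs": new_wcs})
print({name: comp.id for name, comp in second.components.items()})
# prints {'wcs': 20, 'psf': 11}
```

Because the stand-in is frozen, `ref` itself stays unresolved; each call returns a new object, which matches the documented rule that id and run cannot be changed after construction.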

unresolved() → lsst.daf.butler.core.datasets.ref.DatasetRef

Return a new DatasetRef with the same data ID and dataset type, but no ID, run, or components.

Returns:	ref : `DatasetRef` A new `DatasetRef`.

Notes

This can be used to compare only the data ID and dataset type of a pair of DatasetRef instances, regardless of whether either is resolved:

if ref1.unresolved() == ref2.unresolved():
    ...
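A self-contained version of this comparison is sketched below. It assumes that equality compares every field, including id and run; `ToyRef` is a stand-in written for this example, and the real DatasetRef may define equality differently.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyRef:
    """Stand-in for DatasetRef whose equality covers all fields."""
    datasetType: str
    dataId: Tuple[Tuple[str, int], ...]
    id: Optional[int] = None
    run: Optional[str] = None

    def unresolved(self) -> "ToyRef":
        # Drop the resolution details, keep dataset type and data ID.
        return replace(self, id=None, run=None)


predicted = ToyRef("calexp", (("visit", 1),))
stored = ToyRef("calexp", (("visit", 1),), id=7, run="run1")

print(predicted == stored)
# prints False
print(predicted.unresolved() == stored.unresolved())
# prints True
```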
