DatasetRecordStorage

class lsst.daf.butler.registry.interfaces.DatasetRecordStorage(datasetType: lsst.daf.butler.core.datasets.type.DatasetType)

Bases: abc.ABC

An interface that manages the records associated with a particular DatasetType.

Parameters:
datasetType : DatasetType

Dataset type whose records this object manages.

Methods Summary

associate(collection, datasets) Associate one or more datasets with a collection.
disassociate(collection, datasets) Remove one or more datasets from a collection.
find(collection, dataId) Search a collection for a dataset with the given data ID.
insert(run, dataIds, *, quantum) Insert one or more dataset entries into the database.
select(collection, dataId, id, run) Return a SQLAlchemy object that represents a SELECT query for this DatasetType.

Methods Documentation

associate(collection: CollectionRecord, datasets: Iterable[DatasetRef]) → None

Associate one or more datasets with a collection.

Parameters:
collection : CollectionRecord

The record object describing the collection. collection.type must be TAGGED.

datasets : Iterable [ DatasetRef ]

Datasets to be associated. All datasets must be resolved and have the same DatasetType as self.

Raises:
AmbiguousDatasetError

Raised if any of the given DatasetRef instances is unresolved.

Notes

Associating a dataset with a collection that already contains a different dataset with the same DatasetType and data ID will remove the existing dataset from that collection.

Associating the same dataset into a collection multiple times is a no-op, but is still not permitted on read-only databases.
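The replacement and no-op behavior described in these notes can be pictured with a small in-memory model. This is only a sketch under stated assumptions: ToyTaggedCollection, ToyRef, and the tuple-based data ID are hypothetical stand-ins, not daf_butler types, and ValueError stands in for AmbiguousDatasetError. A real implementation works against database tables.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical stand-ins (not daf_butler types): a data ID is a hashable
# tuple of (dimension, value) pairs, and a "ref" is (dataset ID, data ID).
DataId = Tuple[Tuple[str, object], ...]
ToyRef = Tuple[Optional[int], DataId]


class ToyTaggedCollection:
    """Toy model of a TAGGED collection for a single dataset type."""

    def __init__(self) -> None:
        # At most one dataset per data ID, keyed by data ID.
        self._members: Dict[DataId, int] = {}

    def associate(self, datasets: Iterable[ToyRef]) -> None:
        for dataset_id, data_id in datasets:
            if dataset_id is None:
                # Stand-in for AmbiguousDatasetError on unresolved refs.
                raise ValueError("unresolved dataset: no ID assigned")
            # Overwrites any different dataset with the same data ID;
            # re-associating the same dataset leaves the state unchanged.
            self._members[data_id] = dataset_id

    def members(self) -> Dict[DataId, int]:
        return dict(self._members)


tagged = ToyTaggedCollection()
data_id = (("detector", 12), ("visit", 903334))
tagged.associate([(1, data_id)])
tagged.associate([(1, data_id)])  # same dataset again: state unchanged
tagged.associate([(2, data_id)])  # same data ID, different dataset: replaces 1
```

Keying the toy collection by data ID is what makes both behaviors fall out of a single assignment; a database-backed implementation would more plausibly rely on a uniqueness constraint.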

disassociate(collection: CollectionRecord, datasets: Iterable[DatasetRef]) → None

Remove one or more datasets from a collection.

Parameters:
collection : CollectionRecord

The record object describing the collection. collection.type must be TAGGED.

datasets : Iterable [ DatasetRef ]

Datasets to be disassociated. All datasets must be resolved and have the same DatasetType as self.

Raises:
AmbiguousDatasetError

Raised if any of the given DatasetRef instances is unresolved.

find(collection: CollectionRecord, dataId: DataCoordinate) → Optional[DatasetRef]

Search a collection for a dataset with the given data ID.

Parameters:
collection : CollectionRecord

The record object describing the collection to search for the dataset. May have any CollectionType.

dataId : DataCoordinate

Complete (but not necessarily expanded) data ID to search with, with dataId.graph == self.datasetType.dimensions.

Returns:
ref : DatasetRef or None

A resolved DatasetRef (without components populated), or None if no matching dataset was found.
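As a rough illustration of the contract above (the data ID must cover exactly the dataset type's dimensions, and a miss yields None rather than an exception), here is a minimal in-memory sketch. ToyStorage, its add helper, the string collection names, and the use of ValueError for a mismatched data ID are all assumptions made for the example, not part of daf_butler.

```python
from typing import Dict, FrozenSet, Mapping, Optional, Tuple

Key = Tuple[str, Tuple[Tuple[str, object], ...]]


class ToyStorage:
    """Toy lookup table for one dataset type (hypothetical, in-memory)."""

    def __init__(self, dimensions: FrozenSet[str]) -> None:
        self.dimensions = dimensions
        self._rows: Dict[Key, int] = {}

    def _key(self, collection: str, data_id: Mapping[str, object]) -> Key:
        # Mirrors the requirement that the data ID's dimensions equal
        # the dataset type's dimensions.
        if frozenset(data_id) != self.dimensions:
            raise ValueError(f"data ID must have dimensions {sorted(self.dimensions)}")
        return collection, tuple(sorted(data_id.items()))

    def add(self, collection: str, data_id: Mapping[str, object], dataset_id: int) -> None:
        self._rows[self._key(collection, data_id)] = dataset_id

    def find(self, collection: str, data_id: Mapping[str, object]) -> Optional[int]:
        # Returns the dataset ID, or None when nothing matches.
        return self._rows.get(self._key(collection, data_id))


storage = ToyStorage(frozenset({"visit", "detector"}))
storage.add("runs/a", {"visit": 903334, "detector": 12}, dataset_id=7)

found = storage.find("runs/a", {"detector": 12, "visit": 903334})
missing = storage.find("runs/b", {"detector": 12, "visit": 903334})
```

Callers should test the result against None explicitly, since a missing dataset is an expected outcome rather than an error.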

insert(run: RunRecord, dataIds: Iterable[ExpandedDataCoordinate], *, quantum: Optional[Quantum] = None) → Iterator[DatasetRef]

Insert one or more dataset entries into the database.

Parameters:
run : RunRecord

The record object describing the RUN collection the inserted datasets will be associated with.

dataIds : Iterable [ ExpandedDataCoordinate ]

Expanded data IDs (ExpandedDataCoordinate instances) for the datasets to be added. The dimensions of all data IDs must be the same as self.datasetType.dimensions.

quantum : Quantum, optional

The Quantum instance that should be recorded as responsible for producing the inserted datasets.

Returns:
datasets : Iterator [ DatasetRef ]

References to the inserted datasets.

Notes

This method does not insert component datasets recursively, as those have a different DatasetType than their parent and hence are managed by a different DatasetRecordStorage instance.

select(collection: CollectionRecord, dataId: Select.Or[DataCoordinate] = Select, id: Select.Or[Optional[int]] = Select, run: Select.Or[None] = Select) → Optional[SimpleQuery]

Return a SQLAlchemy object that represents a SELECT query for this DatasetType.

All arguments can either be a value that constrains the query or the Select tag object to indicate that the value should be returned in the columns in the SELECT clause. The default is Select.

Parameters:
collection : CollectionRecord

The record object describing the collection to query. May not be of type CollectionType.CHAINED.

dataId : DataCoordinate or Select

The data ID to restrict results with, or an instruction to return the data ID via columns with names self.datasetType.dimensions.names.

id : int, Select or None

The integer primary key value for the dataset, an instruction to return it via an id column, or None to ignore it entirely.

run : None or Select

If Select (default), include the dataset’s run key value (as a column labeled with the return value of CollectionManager.getRunForeignKeyName). If None, do not include this column. To constrain the run, pass a RunRecord as the collection argument instead.

Returns:
query : SimpleQuery or None

A struct containing the SQLAlchemy object that represents a simple SELECT query, or None if it is known that there are no datasets of this DatasetType that match the given constraints.
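The "value or Select tag" convention used by the arguments of select can be sketched generically. In the example below, passing the tag class itself requests a column, passing None omits the field, and any other value becomes a constraint. The Select class here is a local stand-in, and toy_select and its string-based SQL are hypothetical; the real method returns a SimpleQuery wrapping SQLAlchemy objects, and its run argument accepts only None or Select.

```python
from typing import Dict, List, Tuple


class Select:
    """Local stand-in tag: pass the class itself to request a column."""


def toy_select(table: str, *, id: object = Select, run: object = Select) -> Tuple[str, Dict[str, object]]:
    """Build a parameterized SELECT string from value-or-tag arguments."""
    columns: List[str] = []
    where: List[str] = []
    params: Dict[str, object] = {}
    for name, value in (("id", id), ("run", run)):
        if value is Select:
            # Tag given: return this field in the SELECT clause.
            columns.append(name)
        elif value is not None:
            # Concrete value given: constrain on it with a bound parameter.
            where.append(f"{name} = :{name}")
            params[name] = value
        # None: neither select nor constrain the field.
    sql = f"SELECT {', '.join(columns) or '1'} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


all_columns = toy_select("dataset")
by_id = toy_select("dataset", id=42, run=None)
```

Using the class object as the sentinel keeps None available as a meaningful third state, which is why the tag is compared with `is` rather than by equality.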