DatasetRecordStorage

class lsst.daf.butler.registry.interfaces.DatasetRecordStorage(datasetType: lsst.daf.butler.core.datasets.type.DatasetType)

Bases: abc.ABC

An interface that manages the records associated with a particular DatasetType.

Parameters:

datasetType : `DatasetType`: Dataset type whose records this object manages.

Methods Summary

`associate`(collection, datasets)	Associate one or more datasets with a collection.
`certify`(collection, datasets, timespan)	Associate one or more datasets with a calibration collection and a validity range within it.
`decertify`(collection, timespan, *, dataIds)	Remove or adjust datasets to clear a validity range within a calibration collection.
`delete`(datasets)	Fully delete the given datasets from the registry.
`disassociate`(collection, datasets)	Remove one or more datasets from a collection.
`find`(collection, dataId, timespan)	Search a collection for a dataset with the given data ID.
`insert`(run, dataIds)	Insert one or more dataset entries into the database.
`select`(collection, dataId, id, run, …)	Return a SQLAlchemy object that represents a `SELECT` query for this `DatasetType`.

Methods Documentation

associate(collection: CollectionRecord, datasets: Iterable[DatasetRef]) → None

Associate one or more datasets with a collection.

Parameters:

collection : `CollectionRecord`: The record object describing the collection. `collection.type` must be `TAGGED`.
datasets : `Iterable` [ `DatasetRef` ]: Datasets to be associated. All datasets must be resolved and have the same `DatasetType` as `self`.

Raises:

AmbiguousDatasetError: Raised if any of the given `DatasetRef` instances is unresolved.

Notes

Associating a dataset into a collection that already contains a different dataset with the same DatasetType and data ID will remove the existing dataset from that collection.

Associating the same dataset into a collection multiple times is a no-op, but is still not permitted on read-only databases.
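The replacement and no-op behavior described in these notes can be pictured with a small in-memory model. This is only an illustrative sketch, not the registry implementation: the `TaggedCollection` alias, the local `associate` function, and the use of plain dictionaries for data IDs are all assumptions made for the example.

```python
from typing import Dict, Hashable, List, Tuple

# Hypothetical stand-in for one TAGGED collection: maps a data ID
# (stored as a sorted tuple of key/value pairs) to a dataset ID.
DataIdKey = Tuple[Tuple[str, Hashable], ...]
TaggedCollection = Dict[DataIdKey, int]


def associate(collection: TaggedCollection,
              datasets: List[Tuple[int, Dict[str, Hashable]]]) -> None:
    """Tag each (dataset_id, data_id) pair into the collection."""
    for dataset_id, data_id in datasets:
        key = tuple(sorted(data_id.items()))
        # Assignment replaces any different dataset with the same data ID;
        # assigning the same dataset again leaves the collection unchanged.
        collection[key] = dataset_id


tagged: TaggedCollection = {}
associate(tagged, [(1, {"visit": 10, "detector": 4})])
associate(tagged, [(1, {"visit": 10, "detector": 4})])  # same dataset: no change
associate(tagged, [(2, {"visit": 10, "detector": 4})])  # different dataset: replaces 1
```

After the three calls the model holds a single entry for that data ID, pointing at dataset 2.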

certify(collection: CollectionRecord, datasets: Iterable[DatasetRef], timespan: Timespan) → None

Associate one or more datasets with a calibration collection and a validity range within it.

Parameters:

collection : `CollectionRecord`: The record object describing the collection. `collection.type` must be `CALIBRATION`.
datasets : `Iterable` [ `DatasetRef` ]: Datasets to be associated. All datasets must be resolved and have the same `DatasetType` as `self`.
timespan : `Timespan`: The validity range for these datasets within the collection.

Raises:

AmbiguousDatasetError: Raised if any of the given `DatasetRef` instances is unresolved.
ConflictingDefinitionError: Raised if the collection already contains a different dataset with the same `DatasetType` and data ID and an overlapping validity range.
TypeError: Raised if `collection.type is not CollectionType.CALIBRATION` or if `self.datasetType.isCalibration() is False`.
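The conflict rule for certification can be sketched as follows. This is a simplified model under stated assumptions, not the actual storage code: validity ranges are represented as half-open `(begin, end)` integer pairs, data IDs as strings, and `ConflictingDefinitionError` is a local stand-in class defined only for this example.

```python
from typing import Dict, List, Tuple

Timespan = Tuple[int, int]  # hypothetical half-open [begin, end) stand-in


class ConflictingDefinitionError(Exception):
    """Local stand-in for the registry exception of the same name."""


def overlaps(a: Timespan, b: Timespan) -> bool:
    """Return True if two half-open intervals share any point."""
    return a[0] < b[1] and b[0] < a[1]


def certify(collection: Dict[str, List[Tuple[Timespan, int]]],
            data_id: str, dataset_id: int, timespan: Timespan) -> None:
    """Record a validity range, rejecting overlaps with a different dataset."""
    entries = collection.setdefault(data_id, [])
    for existing_span, existing_id in entries:
        if existing_id != dataset_id and overlaps(existing_span, timespan):
            raise ConflictingDefinitionError(
                f"{data_id}: {timespan} overlaps {existing_span}"
            )
    entries.append((timespan, dataset_id))


calib: Dict[str, List[Tuple[Timespan, int]]] = {}
certify(calib, "detector=4", 1, (0, 10))
certify(calib, "detector=4", 2, (10, 20))  # adjacent ranges do not overlap
conflict = False
try:
    certify(calib, "detector=4", 3, (5, 15))  # overlaps both existing ranges
except ConflictingDefinitionError:
    conflict = True
```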

decertify(collection: CollectionRecord, timespan: Timespan, *, dataIds: Optional[Iterable[DataCoordinate]] = None) → None

Remove or adjust datasets to clear a validity range within a calibration collection.

Parameters:

collection : CollectionRecord: The record object describing the collection. collection.type must be CALIBRATION.
timespan : Timespan: The validity range to remove datasets from within the collection. Datasets that overlap this range but are not contained by it will have their validity ranges adjusted to not overlap it, which may split a single dataset validity range into two.
dataIds : Iterable [ DataCoordinate ], optional: Data IDs that should be decertified within the given validity range. If None, all data IDs for self.datasetType will be decertified.

Raises:

TypeError: Raised if collection.type is not CollectionType.CALIBRATION.
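The adjustment of validity ranges described for `timespan` can be illustrated with plain intervals. The sketch below assumes half-open `(begin, end)` integer pairs as a stand-in for `Timespan`; the `clear_range` helper is hypothetical and is not part of the interface.

```python
from typing import List, Tuple

Timespan = Tuple[int, int]  # hypothetical half-open [begin, end) stand-in


def clear_range(entries: List[Tuple[Timespan, int]],
                cleared: Timespan) -> List[Tuple[Timespan, int]]:
    """Return the entries that remain after clearing one validity range."""
    result = []
    for (begin, end), dataset_id in entries:
        if end <= cleared[0] or begin >= cleared[1]:
            # No overlap with the cleared range: keep unchanged.
            result.append(((begin, end), dataset_id))
            continue
        if begin < cleared[0]:
            # Keep the part that precedes the cleared range.
            result.append(((begin, cleared[0]), dataset_id))
        if end > cleared[1]:
            # Keep the part that follows the cleared range.
            result.append(((cleared[1], end), dataset_id))
        # An entry fully contained in the cleared range is dropped.
    return result


# Entries here may belong to different data IDs; only the ranges matter.
remaining = clear_range(
    [((0, 100), 7), ((25, 35), 8), ((50, 60), 9)],
    cleared=(20, 40),
)
```

In this example dataset 7 is split into two ranges on either side of the cleared span, dataset 8 is removed because it lies entirely inside it, and dataset 9 is untouched.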

delete(datasets: Iterable[DatasetRef]) → None

Fully delete the given datasets from the registry.

Parameters:

datasets : `Iterable` [ `DatasetRef` ]: Datasets to be deleted. All datasets must be resolved and have the same `DatasetType` as `self`.

Raises:

AmbiguousDatasetError: Raised if any of the given `DatasetRef` instances is unresolved.

disassociate(collection: CollectionRecord, datasets: Iterable[DatasetRef]) → None

Remove one or more datasets from a collection.

Parameters:

collection : `CollectionRecord`: The record object describing the collection. `collection.type` must be `TAGGED`.
datasets : `Iterable` [ `DatasetRef` ]: Datasets to be disassociated. All datasets must be resolved and have the same `DatasetType` as `self`.

Raises:

AmbiguousDatasetError: Raised if any of the given `DatasetRef` instances is unresolved.

find(collection: CollectionRecord, dataId: DataCoordinate, timespan: Optional[Timespan] = None) → Optional[DatasetRef]

Search a collection for a dataset with the given data ID.

Parameters:

collection : CollectionRecord: The record object describing the collection to search for the dataset. May have any CollectionType.
dataId : DataCoordinate: Complete (but not necessarily expanded) data ID to search with, with dataId.graph == self.datasetType.dimensions.
timespan : Timespan, optional: A timespan that the validity range of the dataset must overlap. Required if collection.type is CollectionType.CALIBRATION, and ignored otherwise.

Returns:

ref : DatasetRef: A resolved DatasetRef (without components populated), or None if no matching dataset was found.
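How the `timespan` argument interacts with the collection type can be sketched with a toy lookup. Everything here is an assumption for illustration: collection types are plain strings, data IDs are strings, validity ranges are half-open `(begin, end)` integer pairs, and raising `TypeError` for a missing timespan is a choice made for this example rather than documented behavior.

```python
from typing import Any, Dict, Optional, Tuple

Timespan = Tuple[int, int]  # hypothetical half-open [begin, end) stand-in


def find(collection_type: str, records: Dict[str, Any], data_id: str,
         timespan: Optional[Timespan] = None) -> Optional[int]:
    """Return the matching dataset ID, or None if nothing matches."""
    if collection_type == "CALIBRATION":
        if timespan is None:
            # Assumption: the error type is chosen for this sketch only.
            raise TypeError("timespan is required for CALIBRATION collections")
        for (begin, end), dataset_id in records.get(data_id, []):
            if begin < timespan[1] and timespan[0] < end:
                return dataset_id
        return None
    # For other collection types the timespan is ignored.
    return records.get(data_id)


tagged = {"detector=4": 11}
calib = {"detector=4": [((0, 10), 21), ((10, 20), 22)]}

hit_tagged = find("TAGGED", tagged, "detector=4", timespan=(0, 1))
hit_calib = find("CALIBRATION", calib, "detector=4", timespan=(12, 13))
miss = find("CALIBRATION", calib, "detector=9", timespan=(12, 13))
```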

insert(run: RunRecord, dataIds: Iterable[DataCoordinate]) → Iterator[DatasetRef]

Insert one or more dataset entries into the database.

Parameters:

run : `RunRecord`: The record object describing the `RUN` collection this dataset will be associated with.
dataIds : `Iterable` [ `DataCoordinate` ]: Expanded data IDs (`DataCoordinate` instances) for the datasets to be added. The dimensions of all data IDs must be the same as `self.datasetType.dimensions`.

Returns:

datasets : `Iterable` [ `DatasetRef` ]: References to the inserted datasets.

select(collection: CollectionRecord, dataId: SimpleQuery.Select.Or[DataCoordinate] = SimpleQuery.Select, id: SimpleQuery.Select.Or[Optional[int]] = SimpleQuery.Select, run: SimpleQuery.Select.Or[None] = SimpleQuery.Select, timespan: SimpleQuery.Select.Or[Optional[Timespan]] = SimpleQuery.Select, ingestDate: SimpleQuery.Select.Or[Optional[Timespan]] = None) → Optional[SimpleQuery]

Return a SQLAlchemy object that represents a SELECT query for this DatasetType.

Most arguments can be either a value that constrains the query or the SimpleQuery.Select tag object, which indicates that the value should be returned in the columns of the SELECT clause. The default is SimpleQuery.Select for every such argument except ingestDate, which defaults to None.

Parameters:

collection : CollectionRecord: The record object describing the collection to query. May not be of type CollectionType.CHAINED.
dataId : DataCoordinate or Select: The data ID to restrict results with, or an instruction to return the data ID via columns with names self.datasetType.dimensions.names.
id : int, Select, or None: The integer primary key value for the dataset, an instruction to return it via an id column, or None to ignore it entirely.
run : None or Select: If Select (default), include the dataset’s run key value (as a column labeled with the return value of CollectionManager.getRunForeignKeyName). If None, do not include this column (to constrain the run, pass a RunRecord as the collection argument instead).
timespan : None, Select, or Timespan: If Select (default), include the validity range timespan in the result columns. If a Timespan instance, constrain the results to those whose validity ranges overlap that given timespan. Ignored unless collection.type is CollectionType.CALIBRATION.
ingestDate : None, Select, or Timespan: If Select, include the ingest timestamp in the result columns. If a Timespan instance, constrain the results to those whose ingest times fall inside the given timespan, and also include the timestamp in the result columns. If None (default), there is no constraint and the timestamp is not returned.

Returns:

query : SimpleQuery or None: A struct containing the SQLAlchemy object that represents a simple SELECT query, or None if it is known that there are no datasets of this DatasetType that match the given constraints.
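The three-way convention used by the arguments (a constraining value, the `Select` tag, or `None`) can be illustrated without SQLAlchemy. The sketch below is not `SimpleQuery` and does not reproduce its API: the local `Select` class, the `build_query` helper, and the `dataset` table name are hypothetical, and the SQL is assembled as a plain parameterized string purely to show how each kind of argument is routed.

```python
from typing import Any, List, Tuple


class Select:
    """Hypothetical stand-in for the Select tag; passed as a class, never instantiated."""


def build_query(table: str, **columns: Any) -> Tuple[str, List[Any]]:
    """Route each argument to the SELECT list, the WHERE clause, or nowhere."""
    selected: List[str] = []
    where: List[str] = []
    params: List[Any] = []
    for name, value in columns.items():
        if value is Select:
            # Tag object: return this column in the results.
            selected.append(name)
        elif value is not None:
            # Concrete value: constrain the query with it.
            where.append(f"{name} = ?")
            params.append(value)
        # None: neither returned nor constrained.
    column_list = ", ".join(selected) if selected else "*"
    sql = f"SELECT {column_list} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


sql, params = build_query("dataset", id=Select, run=None, detector=4)
```

Here `id` ends up in the column list, `run` is dropped entirely, and `detector` becomes a bound parameter in the `WHERE` clause.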
