DatasetRecordStorage¶
- class lsst.daf.butler.registry.interfaces.DatasetRecordStorage(datasetType: DatasetType)¶
- Bases: ABC

- An interface that manages the records associated with a particular DatasetType.

- Parameters:

- datasetType : DatasetType
- Dataset type whose records this object manages. 
 
- Methods Summary

  - associate(collection, datasets): Associate one or more datasets with a collection.
  - certify(collection, datasets, timespan): Associate one or more datasets with a calibration collection and a validity range within it.
  - decertify(collection, timespan, *[, dataIds]): Remove or adjust datasets to clear a validity range within a calibration collection.
  - delete(datasets): Fully delete the given datasets from the registry.
  - disassociate(collection, datasets): Remove one or more datasets from a collection.
  - find(collection, dataId[, timespan]): Search a collection for a dataset with the given data ID.
  - import_(run, datasets[, idGenerationMode, ...]): Insert one or more dataset entries into the database.
  - insert(run, dataIds[, idGenerationMode]): Insert one or more dataset entries into the database.
  - select(*collections[, dataId, id, run, ...]): Return a SQLAlchemy object that represents a SELECT query for this DatasetType.

- Methods Documentation

- abstract associate(collection: CollectionRecord, datasets: Iterable[DatasetRef]) → None¶
- Associate one or more datasets with a collection.

- Parameters:

- collection : CollectionRecord
- The record object describing the collection. collection.type must be TAGGED.
- datasets : Iterable[DatasetRef]
- Datasets to be associated. All datasets must be resolved and have the same DatasetType as self.
 
- Raises:
- AmbiguousDatasetError
- Raised if any of the given DatasetRef instances is unresolved.
 
- Notes

- Associating a dataset into a collection that already contains a different dataset with the same DatasetType and data ID will remove the existing dataset from that collection.

- Associating the same dataset into a collection multiple times is a no-op, but is still not permitted on read-only databases.
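This interface is normally driven by higher-level Registry code, but as a rough sketch of the calling convention (all names below are placeholders: storage is a concrete DatasetRecordStorage, tagged_record a CollectionRecord whose type is TAGGED, and refs resolved DatasetRef instances obtained elsewhere, e.g. from insert or find):

```python
from __future__ import annotations

from collections.abc import Iterable

from lsst.daf.butler import DatasetRef
from lsst.daf.butler.registry.interfaces import CollectionRecord, DatasetRecordStorage


def tag_datasets(
    storage: DatasetRecordStorage,
    tagged_record: CollectionRecord,
    refs: Iterable[DatasetRef],
) -> None:
    """Add already-registered datasets to a TAGGED collection (sketch)."""
    # Every ref must be resolved (carry a dataset ID) and have the same
    # DatasetType as storage.datasetType; tagged_record.type must be TAGGED.
    storage.associate(tagged_record, refs)
```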
- abstract certify(collection: CollectionRecord, datasets: Iterable[DatasetRef], timespan: Timespan) → None¶
- Associate one or more datasets with a calibration collection and a validity range within it.

- Parameters:

- collection : CollectionRecord
- The record object describing the collection. collection.type must be CALIBRATION.
- datasets : Iterable[DatasetRef]
- Datasets to be associated. All datasets must be resolved and have the same DatasetType as self.
- timespan : Timespan
- The validity range for these datasets within the collection. 
 
- Raises:
- AmbiguousDatasetError
- Raised if any of the given DatasetRef instances is unresolved.
- ConflictingDefinitionError
- Raised if the collection already contains a different dataset with the same DatasetType and data ID and an overlapping validity range.
- CollectionTypeError
- Raised if collection.type is not CollectionType.CALIBRATION or if self.datasetType.isCalibration() is False.
 
 
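As a sketch only, a validity range is expressed as a Timespan of astropy times; calibration_record and refs are placeholders for a CollectionRecord of type CALIBRATION and resolved DatasetRef instances of a calibration dataset type:

```python
import astropy.time

from lsst.daf.butler import Timespan

# Hypothetical validity range: these calibrations are declared valid for the
# first half of 2021 within the calibration collection.
validity = Timespan(
    begin=astropy.time.Time("2021-01-01T00:00:00", scale="tai"),
    end=astropy.time.Time("2021-07-01T00:00:00", scale="tai"),
)
storage.certify(calibration_record, refs, validity)
```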
- abstract decertify(collection: CollectionRecord, timespan: Timespan, *, dataIds: Iterable[DataCoordinate] | None = None) → None¶
- Remove or adjust datasets to clear a validity range within a calibration collection.

- Parameters:

- collection : CollectionRecord
- The record object describing the collection. collection.type must be CALIBRATION.
- timespan : Timespan
- The validity range to remove datasets from within the collection. Datasets that overlap this range but are not contained by it will have their validity ranges adjusted to not overlap it, which may split a single dataset validity range into two. 
- dataIds : Iterable[DataCoordinate], optional
- Data IDs that should be decertified within the given validity range. If None, all data IDs for self.datasetType will be decertified.
 
- Raises:
- CollectionTypeError
- Raised if collection.type is not CollectionType.CALIBRATION.
 
 
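A sketch of clearing part of a validity range for one data ID only; calibration_record is a placeholder as before, and the data ID keys assume a dataset type whose dimensions are instrument and detector:

```python
import astropy.time

from lsst.daf.butler import DataCoordinate, Timespan

# Clear the second quarter of 2021 for a single detector; datasets whose
# validity ranges merely overlap this window are trimmed (and may be split).
window = Timespan(
    begin=astropy.time.Time("2021-04-01T00:00:00", scale="tai"),
    end=astropy.time.Time("2021-07-01T00:00:00", scale="tai"),
)
data_id = DataCoordinate.standardize(
    {"instrument": "HSC", "detector": 42},
    graph=storage.datasetType.dimensions,
)
storage.decertify(calibration_record, window, dataIds=[data_id])
```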
- abstract delete(datasets: Iterable[DatasetRef]) → None¶
- Fully delete the given datasets from the registry.

- Parameters:

- datasets : Iterable[DatasetRef]
- Datasets to be deleted. All datasets must be resolved and have the same DatasetType as self.
 
- Raises:
- AmbiguousDatasetError
- Raised if any of the given DatasetRef instances is unresolved.
 
 
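A sketch; refs is a placeholder for resolved DatasetRef instances of this dataset type that should be removed from the registry entirely, not merely from a collection:

```python
# Unlike disassociate, this removes the dataset entries themselves; refs must
# be resolved and have the same DatasetType as storage.datasetType.
storage.delete(refs)
```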
- abstract disassociate(collection: CollectionRecord, datasets: Iterable[DatasetRef]) → None¶
- Remove one or more datasets from a collection.

- Parameters:

- collection : CollectionRecord
- The record object describing the collection. collection.type must be TAGGED.
- datasets : Iterable[DatasetRef]
- Datasets to be disassociated. All datasets must be resolved and have the same DatasetType as self.
 
- Raises:
- AmbiguousDatasetError
- Raised if any of the given DatasetRef instances is unresolved.
 
 
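A sketch of the inverse of associate; removing datasets from a TAGGED collection (tagged_record and refs are placeholders as before) does not delete them from the registry or from their original RUN collection:

```python
# tagged_record.type must be TAGGED; refs must be resolved and match
# storage.datasetType. The datasets themselves remain in the registry.
storage.disassociate(tagged_record, refs)
```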
- abstract find(collection: CollectionRecord, dataId: DataCoordinate, timespan: Timespan | None = None) → DatasetRef | None¶
- Search a collection for a dataset with the given data ID.

- Parameters:

- collection : CollectionRecord
- The record object describing the collection to search for the dataset. May have any CollectionType.
- dataId : DataCoordinate
- Complete (but not necessarily expanded) data ID to search with, with dataId.graph == self.datasetType.dimensions.
- timespan : Timespan, optional
- A timespan that the validity range of the dataset must overlap. Required if collection.type is CollectionType.CALIBRATION, and ignored otherwise.
 
- Returns:
- ref : DatasetRef
- A resolved DatasetRef (without components populated), or None if no matching dataset was found.
 
 
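A sketch of a lookup by data ID; run_record is a placeholder CollectionRecord (here assumed to be a RUN collection, so no timespan is needed), and the data ID keys assume a dataset type with instrument, detector, and visit dimensions:

```python
from lsst.daf.butler import DataCoordinate

data_id = DataCoordinate.standardize(
    {"instrument": "HSC", "detector": 42, "visit": 12345},
    graph=storage.datasetType.dimensions,
)
ref = storage.find(run_record, data_id)
if ref is None:
    print("no matching dataset in this collection")
else:
    print(f"found dataset {ref.id}")
```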
- abstract import_(run: RunRecord, datasets: Iterable[DatasetRef], idGenerationMode: DatasetIdGenEnum = DatasetIdGenEnum.UNIQUE, reuseIds: bool = False) → Iterator[DatasetRef]¶
- Insert one or more dataset entries into the database.

- Parameters:

- run : RunRecord
- The record object describing the RUN collection these datasets will be associated with.
- datasets : Iterable of DatasetRef
- Datasets to be inserted. Each dataset may specify an id attribute, which will be used for the inserted dataset. All dataset IDs must have the same type (int or uuid.UUID); if the ID type does not match the type supported by this class, the provided IDs are ignored and new IDs are generated by the backend.
- idGenerationMode : DatasetIdGenEnum
- With UNIQUE, each new dataset is inserted with a new unique ID. In non-UNIQUE modes, the ID is computed from some combination of dataset type, data ID, and run collection name; if that ID is already in the database, no new record is inserted.
- reuseIds : bool, optional
- If True, force re-use of imported dataset IDs for integer IDs, which are normally generated as auto-increments; an exception is raised if imported IDs clash with existing ones. This option has no effect on globally-unique IDs, which are always re-used (or generated if integer IDs are being imported).
 
- Returns:
- datasets : Iterable[DatasetRef]
- References to the inserted or existing datasets. 
 
- Notes

- The datasetType and run attributes of the given datasets are supposed to be identical across all datasets, but this is not checked and should be enforced by higher-level registry code. This method does not need to use those attributes; only dataId and id are relevant.
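A sketch of importing datasets that already carry IDs (for example, refs exported from another repository); run_record is a placeholder RunRecord, and the DatasetIdGenEnum import path is assumed to match this interface's module:

```python
from lsst.daf.butler.registry.interfaces import DatasetIdGenEnum

# refs: resolved DatasetRefs (dataId and id set) produced elsewhere.
# With UNIQUE (the default), IDs of an unsupported type are replaced by
# newly generated ones.
imported = list(
    storage.import_(run_record, refs, idGenerationMode=DatasetIdGenEnum.UNIQUE)
)
```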
- abstract insert(run: RunRecord, dataIds: Iterable[DataCoordinate], idGenerationMode: DatasetIdGenEnum = DatasetIdGenEnum.UNIQUE) → Iterator[DatasetRef]¶
- Insert one or more dataset entries into the database.

- Parameters:

- run : RunRecord
- The record object describing the RUN collection these datasets will be associated with.
- dataIds : Iterable[DataCoordinate]
- Expanded data IDs (DataCoordinate instances) for the datasets to be added. The dimensions of all data IDs must be the same as self.datasetType.dimensions.
- idGenerationMode : DatasetIdGenEnum
- With UNIQUE, each new dataset is inserted with a new unique ID. In non-UNIQUE modes, the ID is computed from some combination of dataset type, data ID, and run collection name; if that ID is already in the database, no new record is inserted.
 
- Returns:
- datasets : Iterable[DatasetRef]
- References to the inserted datasets. 
 
 
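A sketch of inserting brand-new dataset entries; run_record is a placeholder RunRecord, and data_ids stands for expanded DataCoordinate instances (prepared by higher-level registry code) whose dimensions match self.datasetType.dimensions:

```python
# insert returns an iterator of resolved DatasetRefs for the new entries.
new_refs = list(storage.insert(run_record, data_ids))
for ref in new_refs:
    print(ref.id, ref.dataId)
```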
- abstract select(*collections: CollectionRecord, dataId: SimpleQuery.Select.Or[DataCoordinate] = <class 'lsst.daf.butler.core.simpleQuery.SimpleQuery.Select'>, id: SimpleQuery.Select.Or[DatasetId | None] = <class 'lsst.daf.butler.core.simpleQuery.SimpleQuery.Select'>, run: SimpleQuery.Select.Or[None] = <class 'lsst.daf.butler.core.simpleQuery.SimpleQuery.Select'>, timespan: SimpleQuery.Select.Or[Timespan | None] = <class 'lsst.daf.butler.core.simpleQuery.SimpleQuery.Select'>, ingestDate: SimpleQuery.Select.Or[Timespan | None] = None, rank: SimpleQuery.Select.Or[None] = None) → sqlalchemy.sql.Selectable¶
- Return a SQLAlchemy object that represents a SELECT query for this DatasetType.

- All arguments can either be a value that constrains the query or the SimpleQuery.Select tag object to indicate that the value should be returned in the columns of the SELECT clause. The default is SimpleQuery.Select.

- Parameters:
- *collections : CollectionRecord
- The record object(s) describing the collection(s) to query. May not be of type CollectionType.CHAINED. If multiple collections are passed, the query will search all of them in an unspecified order, and all collections must have the same type.
- dataId : DataCoordinate or Select
- The data ID to restrict results with, or an instruction to return the data ID via columns with names self.datasetType.dimensions.names.
- id : DatasetId, Select, or None
- The primary key value for the dataset, an instruction to return it via an id column, or None to ignore it entirely.
- run : None or Select
- If Select (default), include the dataset's run key value (as a column labeled with the return value of CollectionManager.getRunForeignKeyName). If None, do not include this column (to constrain the run, pass a RunRecord as the collection argument instead).
- timespan : None, Select, or Timespan
- If Select (default), include the validity range timespan in the result columns. If a Timespan instance, constrain the results to those whose validity ranges overlap the given timespan. For collections whose type is not CALIBRATION, if Select is passed, a column with a literal NULL value will be added; sqlalchemy.sql.expressions.Null may be passed to force a constraint that the value be null (since None is interpreted as meaning "do not select or constrain this column").
- ingestDate : None, Select, or Timespan
- If Select, include the ingest timestamp in the result columns. If a Timespan instance, constrain the results to those whose ingest times fall inside the given timespan, and also include the timestamp in the result columns. If None (default), there is no constraint and the timestamp is not returned.
- rank : Select or None
- If Select, include a calculated column that is the integer rank of the row's collection in the given list of collections, starting from zero.
 
- Returns:
- query : sqlalchemy.sql.Selectable
- A SQLAlchemy object representing a simple SELECT query.
 
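A sketch of building (not executing) the query; SimpleQuery.Select is the tag object from the module shown in the signature above, run_record_a and run_record_b are placeholder RUN CollectionRecords, some_dataset_id is a placeholder dataset ID, and executing the returned Selectable requires the registry's database, which is not shown:

```python
from lsst.daf.butler.core.simpleQuery import SimpleQuery

# Look for one dataset ID in two RUN collections, returning its data ID
# columns and the rank of the collection each row came from.
sql = storage.select(
    run_record_a,
    run_record_b,
    dataId=SimpleQuery.Select,  # return the data ID columns
    id=some_dataset_id,         # constrain to this dataset ID (placeholder)
    run=None,                   # omit the run key column
    timespan=None,              # no validity-range column or constraint
    rank=SimpleQuery.Select,    # return the collection rank column
)
```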