DatasetRecordStorage
- class lsst.daf.butler.registry.interfaces.DatasetRecordStorage(datasetType: DatasetType)
Bases: ABC
An interface that manages the records associated with a particular DatasetType.
- Parameters:
- datasetType
DatasetType Dataset type whose records this object manages.
Methods Summary
- associate(collection, datasets): Associate one or more datasets with a collection.
- certify(collection, datasets, timespan, context): Associate one or more datasets with a calibration collection and a validity range within it.
- decertify(collection, timespan, *[, dataIds]): Remove or adjust datasets to clear a validity range within a calibration collection.
- delete(datasets): Fully delete the given datasets from the registry.
- disassociate(collection, datasets): Remove one or more datasets from a collection.
- import_(run, datasets): Insert one or more dataset entries into the database.
- insert(run, dataIds[, idGenerationMode]): Insert one or more dataset entries into the database.
- make_query_joiner(collections, fields): Make a direct_query_driver.QueryJoiner that represents a search for datasets of this type.
- make_relation(*collections, columns, context): Return a sql.Relation that represents a query for this DatasetType in one or more collections.
- Make sure that collection summaries for this dataset type are consistent with the contents of the dataset tables.
Methods Documentation
- abstract associate(collection: CollectionRecord, datasets: Iterable[DatasetRef]) -> None
Associate one or more datasets with a collection.
- Parameters:
- collection
CollectionRecord The record object describing the collection. collection.type must be TAGGED.
- datasets
Iterable[DatasetRef] Datasets to be associated. All datasets must be resolved and have the same DatasetType as self.
- Raises:
- AmbiguousDatasetError
Raised if any of the given DatasetRef instances is unresolved.
Notes
Associating a dataset with a collection that already contains a different dataset with the same DatasetType and data ID will remove the existing dataset from that collection.
Associating the same dataset into a collection multiple times is a no-op, but is still not permitted on read-only databases.
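The replacement and no-op behavior described in these notes can be illustrated with a small in-memory model. This is only a sketch, not the registry implementation: `Ref`, `TaggedCollection`, and the dictionary keyed on (dataset type name, data ID) are hypothetical stand-ins for `DatasetRef`, a TAGGED collection, and whatever uniqueness constraint the real database enforces.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a resolved DatasetRef."""

    dataset_id: str
    dataset_type: str
    data_id: tuple  # e.g. (("detector", 1), ("visit", 42))


@dataclass
class TaggedCollection:
    """Hypothetical in-memory model of a TAGGED collection."""

    name: str
    # At most one dataset per (dataset type, data ID) pair.
    members: dict = field(default_factory=dict)

    def associate(self, refs):
        for ref in refs:
            # Overwriting the key drops any previous dataset that had the
            # same dataset type and data ID.
            self.members[(ref.dataset_type, ref.data_id)] = ref


tagged = TaggedCollection("u/someone/tagged")
data_id = (("detector", 1), ("visit", 42))
first = Ref("id-a", "calexp", data_id)
second = Ref("id-b", "calexp", data_id)

tagged.associate([first])
tagged.associate([first])   # associating the same dataset again changes nothing
tagged.associate([second])  # same type and data ID: replaces `first`
```

After these calls the model holds a single entry for that data ID, and it refers to `second`.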
- abstract certify(collection: CollectionRecord, datasets: Iterable[DatasetRef], timespan: Timespan, context: SqlQueryContext) -> None
Associate one or more datasets with a calibration collection and a validity range within it.
- Parameters:
- collection
CollectionRecord The record object describing the collection. collection.type must be CALIBRATION.
- datasets
Iterable[DatasetRef] Datasets to be associated. All datasets must be resolved and have the same DatasetType as self.
- timespan
Timespan The validity range for these datasets within the collection.
- context
SqlQueryContext The object that manages database connections, temporary tables and relation engines for this query.
- Raises:
- AmbiguousDatasetError
Raised if any of the given DatasetRef instances is unresolved.
- ConflictingDefinitionError
Raised if the collection already contains a different dataset with the same DatasetType and data ID and an overlapping validity range.
- CollectionTypeError
Raised if collection.type is not CollectionType.CALIBRATION or if self.datasetType.isCalibration() is False.
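The conflict rule for certification can be sketched with plain tuples. The example below is an illustration under stated assumptions, not the actual implementation: validity ranges are modeled as half-open integer intervals `(begin, end)`, the list `calib` stands in for one dataset type's rows in a calibration collection, and `ConflictError` is a hypothetical substitute for `ConflictingDefinitionError`.

```python
class ConflictError(Exception):
    """Hypothetical stand-in for ConflictingDefinitionError."""


def overlaps(a, b):
    """Return True if half-open intervals a=[a0, a1) and b=[b0, b1) overlap."""
    return a[0] < b[1] and b[0] < a[1]


def certify(table, dataset_id, data_id, timespan):
    """Add a row to `table`, a list of (dataset_id, data_id, timespan) tuples."""
    for other_id, other_data_id, other_span in table:
        if (
            other_id != dataset_id
            and other_data_id == data_id
            and overlaps(other_span, timespan)
        ):
            raise ConflictError(
                f"{other_id} is already valid for {data_id} during {other_span}"
            )
    table.append((dataset_id, data_id, timespan))


calib = []
certify(calib, "bias-a", ("detector", 1), (0, 10))
certify(calib, "bias-b", ("detector", 1), (10, 20))  # adjacent, no overlap
certify(calib, "bias-c", ("detector", 2), (5, 15))   # different data ID

try:
    # Different dataset, same data ID, overlapping range: rejected.
    certify(calib, "bias-d", ("detector", 1), (5, 12))
    conflict_raised = False
except ConflictError:
    conflict_raised = True
```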
- abstract decertify(collection: CollectionRecord, timespan: Timespan, *, dataIds: Iterable[DataCoordinate] | None = None, context: SqlQueryContext) -> None
Remove or adjust datasets to clear a validity range within a calibration collection.
- Parameters:
- collection
CollectionRecord The record object describing the collection. collection.type must be CALIBRATION.
- timespan
Timespan The validity range to remove datasets from within the collection. Datasets that overlap this range but are not contained by it will have their validity ranges adjusted to not overlap it, which may split a single dataset validity range into two.
- dataIds
Iterable[DataCoordinate], optional Data IDs that should be decertified within the given validity range. If None, all data IDs for self.datasetType will be decertified.
- context
SqlQueryContext The object that manages database connections, temporary tables and relation engines for this query.
- Raises:
- CollectionTypeError
Raised if collection.type is not CollectionType.CALIBRATION.
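The way decertification removes, trims, or splits validity ranges can be worked through with simple interval arithmetic. This is a minimal sketch assuming half-open integer intervals `(begin, end)`; the function names and the list-of-tuples table are hypothetical and only mirror the behavior described for the `timespan` and `dataIds` parameters.

```python
def subtract(span, cut):
    """Return the parts of half-open interval `span` that lie outside `cut`."""
    begin, end = span
    cut_begin, cut_end = cut
    if cut_end <= begin or end <= cut_begin:
        return [span]  # no overlap: unchanged
    pieces = []
    if begin < cut_begin:
        pieces.append((begin, cut_begin))  # part before the cleared range
    if cut_end < end:
        pieces.append((cut_end, end))  # part after the cleared range
    return pieces


def decertify(table, cut, data_ids=None):
    """Clear `cut` from `table`, a list of (dataset_id, data_id, timespan)."""
    result = []
    for dataset_id, data_id, span in table:
        if data_ids is not None and data_id not in data_ids:
            # Data ID not selected for decertification: keep the row as is.
            result.append((dataset_id, data_id, span))
            continue
        for piece in subtract(span, cut):
            result.append((dataset_id, data_id, piece))
    return result


calib = [
    ("bias-a", ("detector", 1), (0, 100)),  # straddles the cut: split in two
    ("bias-b", ("detector", 2), (40, 60)),  # contained by the cut: removed
    ("bias-c", ("detector", 3), (50, 80)),  # overlaps one edge: trimmed
]
cleared = decertify(calib, (30, 70))
```

Here `bias-a` ends up with two rows, `(0, 30)` and `(70, 100)`, `bias-b` disappears, and `bias-c` is trimmed to `(70, 80)`.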
- abstract delete(datasets: Iterable[DatasetRef]) -> None
Fully delete the given datasets from the registry.
- Parameters:
- datasets
Iterable[DatasetRef] Datasets to be deleted. All datasets must be resolved and have the same DatasetType as self.
- Raises:
- AmbiguousDatasetError
Raised if any of the given DatasetRef instances is unresolved.
- abstract disassociate(collection: CollectionRecord, datasets: Iterable[DatasetRef]) -> None
Remove one or more datasets from a collection.
- Parameters:
- collection
CollectionRecord The record object describing the collection. collection.type must be TAGGED.
- datasets
Iterable[DatasetRef] Datasets to be disassociated. All datasets must be resolved and have the same DatasetType as self.
- Raises:
- AmbiguousDatasetError
Raised if any of the given DatasetRef instances is unresolved.
- abstract import_(run: RunRecord, datasets: Iterable[DatasetRef]) -> Iterator[DatasetRef]
Insert one or more dataset entries into the database.
- Parameters:
- run
RunRecord The record object describing the RUN collection this dataset will be associated with.
- datasets
Iterable of DatasetRef Datasets to be inserted. Datasets may specify an id attribute, which will be used for the inserted datasets. All dataset IDs must have the same type (int or uuid.UUID); if the type of the dataset IDs does not match the type supported by this class, the IDs will be ignored and new IDs will be generated by the backend.
- Returns:
- datasets
Iterable[DatasetRef] References to the inserted or existing datasets.
Notes
The datasetType and run attributes of datasets are supposed to be identical across all datasets, but this is not checked and should be enforced by higher-level registry code. This method does not need to use those attributes from datasets; only dataId and id are relevant.
- abstract insert(run: RunRecord, dataIds: Iterable[DataCoordinate], idGenerationMode: DatasetIdGenEnum = DatasetIdGenEnum.UNIQUE) -> Iterator[DatasetRef]
Insert one or more dataset entries into the database.
- Parameters:
- run
RunRecord The record object describing the RUN collection this dataset will be associated with.
- dataIds
Iterable[DataCoordinate] Expanded data IDs (DataCoordinate instances) for the datasets to be added. The dimensions of all data IDs must be the same as self.datasetType.dimensions.
- idGenerationMode
DatasetIdGenEnum With UNIQUE, each new dataset is inserted with its own new unique ID. With a non-UNIQUE mode, the ID is computed from some combination of dataset type, dataId, and run collection name; if the same ID is already in the database, a new record is not inserted.
- Returns:
- datasets
Iterable[DatasetRef] References to the inserted datasets.
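The difference between UNIQUE and non-UNIQUE ID generation can be sketched with the standard `uuid` module. Everything named here is an assumption made for illustration: `IdMode` is a hypothetical stand-in for `DatasetIdGenEnum`, the namespace and key format are arbitrary, and the choice of `uuid4` and `uuid5` is one plausible way to get random versus reproducible IDs, not necessarily what the registry does.

```python
import uuid
from enum import Enum, auto


class IdMode(Enum):
    """Hypothetical stand-in for DatasetIdGenEnum."""

    UNIQUE = auto()
    DETERMINISTIC = auto()


# Arbitrary namespace chosen for this sketch only.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid")


def make_id(mode, dataset_type, data_id, run):
    if mode is IdMode.UNIQUE:
        return uuid.uuid4()  # random, different on every call
    # Same dataset type, data ID, and run always hash to the same ID.
    key = f"{dataset_type}|{sorted(data_id.items())}|{run}"
    return uuid.uuid5(NAMESPACE, key)


def insert(table, run, dataset_type, data_ids, mode):
    """Insert rows into `table`, a dict mapping ID to (data_id, run)."""
    for data_id in data_ids:
        dataset_id = make_id(mode, dataset_type, data_id, run)
        if dataset_id not in table:  # existing ID: no new record
            table[dataset_id] = (data_id, run)
        yield dataset_id


data_id = {"detector": 1, "visit": 42}

# UNIQUE mode yields a fresh ID each time, even for identical inputs.
unique_a = make_id(IdMode.UNIQUE, "raw", data_id, "run/a")
unique_b = make_id(IdMode.UNIQUE, "raw", data_id, "run/a")

# Deterministic mode yields the same ID, so the second insert adds nothing.
table = {}
first = list(insert(table, "run/a", "raw", [data_id], IdMode.DETERMINISTIC))
second = list(insert(table, "run/a", "raw", [data_id], IdMode.DETERMINISTIC))
```

A reproducible ID makes repeated inserts of the same logical dataset idempotent in this model, which matches the "new record is not inserted" behavior described for non-UNIQUE modes.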
- abstract make_query_joiner(collections: Sequence[CollectionRecord], fields: Set[str]) -> QueryJoiner
Make a direct_query_driver.QueryJoiner that represents a search for datasets of this type.
- Parameters:
- collections
Sequence[CollectionRecord] Collections to search, in order, after filtering out collections with no datasets of this type via collection summaries.
- fields
Set[str] Names of fields to make available in the joiner. Options include:
- dataset_id (UUID)
- run (collection name, str)
- collection (collection name, str)
- collection_key (collection primary key, manager-dependent)
- timespan (validity range, or unbounded for non-calibrations)
- ingest_date (time dataset was ingested into repository)
Dimension keys for the dataset type’s required dimensions are always included.
- Returns:
- joiner
direct_query_driver.QueryJoiner A query-construction object representing a table or subquery. If fields is empty or len(collections) <= 1, this is guaranteed to have rows that are unique over dimension keys.
- abstract make_relation(*collections: CollectionRecord, columns: Set[str], context: SqlQueryContext) -> Relation
Return a sql.Relation that represents a query for this DatasetType in one or more collections.
- Parameters:
- *collections
CollectionRecord The record object(s) describing the collection(s) to query. May not be of type CollectionType.CHAINED. If multiple collections are passed, the query will search all of them in an unspecified order, and all collections must have the same type. Must include at least one collection.
- columns
Set[str] Columns to include in the relation. See Query.find_datasets for most options, but this method supports one more:
- rank: a calculated integer column holding the index of the collection the dataset was found in, within the collections sequence given.
- context
SqlQueryContext The object that manages database connections, temporary tables and relation engines for this query.
- Returns:
- relation
Relation Representation of the query.
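The rank column can be pictured with an in-memory example. The sketch below assumes rows are plain dictionaries and that a caller wants to use rank to keep, for each data ID, the dataset from the earliest collection in the given sequence; that use is an assumption for illustration, and `with_rank` and `find_first` are hypothetical helpers, not part of the interface.

```python
def with_rank(rows, collections):
    """Add a `rank` field: index of each row's collection in `collections`."""
    order = {name: index for index, name in enumerate(collections)}
    return [
        {**row, "rank": order[row["collection"]]}
        for row in rows
        if row["collection"] in order
    ]


def find_first(ranked_rows):
    """Keep, for each data ID, the row with the lowest rank."""
    best = {}
    for row in ranked_rows:
        current = best.get(row["data_id"])
        if current is None or row["rank"] < current["rank"]:
            best[row["data_id"]] = row
    return list(best.values())


rows = [
    {"dataset_id": "a", "data_id": ("detector", 1), "collection": "run/old"},
    {"dataset_id": "b", "data_id": ("detector", 1), "collection": "run/new"},
    {"dataset_id": "c", "data_id": ("detector", 2), "collection": "run/old"},
]
ranked = with_rank(rows, ["run/new", "run/old"])
winners = find_first(ranked)
```

Because the rows themselves may come back in any order, the rank value is what lets a later step recover the caller's collection ordering.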