DatasetRecordStorageManager¶
- class lsst.daf.butler.registry.interfaces.DatasetRecordStorageManager(*, registry_schema_version: VersionTuple | None = None)¶
Bases: VersionedExtension
An interface that manages the tables that describe datasets.
DatasetRecordStorageManager primarily serves as a container and factory for DatasetRecordStorage instances, each of which provides access to the records for a different DatasetType.
- Parameters:
- registry_schema_version
VersionTuple or None, optional Version of registry schema.
Methods Summary
addDatasetForeignKey(tableSpec, *[, name, ...]) Add a foreign key (field and constraint) referencing the dataset table.
associate(dataset_type, collection, datasets) Associate one or more datasets with a collection.
certify(dataset_type, collection, datasets, ...) Associate one or more datasets with a calibration collection and a validity range within it.
clone(*, db, collections, dimensions, ...) Make an independent copy of this manager instance bound to new instances of Database and other managers.
conform_exact_dataset_type(dataset_type) Conform a value that may be a dataset type or dataset type name to the registered DatasetType, while checking that the dataset type is not a component and (if a DatasetType instance is given) has the exact same definition in the registry.
decertify(dataset_type, collection, timespan, *) Remove or adjust datasets to clear a validity range within a calibration collection.
delete(datasets) Fully delete the given datasets from the registry.
disassociate(dataset_type, collection, datasets) Remove one or more datasets from a collection.
fetch_summaries(collections[, dataset_types]) Fetch collection summaries given their names and dataset types.
getCollectionSummary(collection) Return a summary for the given collection.
getDatasetRef(id) Return a DatasetRef for the given dataset primary key value.
get_dataset_type(name) Look up a dataset type by name.
import_(run, refs) Insert one or more dataset entries into the database.
Return type of the ingest_date column.
initialize(db, context, *, collections, ...) Construct an instance of the manager.
insert(dataset_type_name, run, data_ids[, ...]) Insert one or more dataset entries into the database.
make_joins_builder(dataset_type, ...[, is_union]) Make a direct_query_driver.SqlJoinsBuilder that represents a search for datasets of this type.
make_relation(dataset_type, *collections, ...) Return a sql.Relation that represents a query for this DatasetType in one or more collections.
preload_cache() Fetch data from the database and use it to pre-populate caches to speed up later operations.
refresh() Ensure all other operations on this manager are aware of any dataset types that may have been registered by other clients since it was initialized or last refreshed.
refresh_collection_summaries(dataset_type) Make sure that collection summaries for this dataset type are consistent with the contents of the dataset tables.
register_dataset_type(dataset_type) Ensure that this Registry can hold records for the given DatasetType, creating new tables as necessary.
remove_dataset_type(name) Remove the dataset type.
resolve_wildcard(expression[, missing, ...]) Resolve a dataset type wildcard expression.
Methods Documentation
- abstract classmethod addDatasetForeignKey(tableSpec: TableSpec, *, name: str = 'dataset', constraint: bool = True, onDelete: str | None = None, **kwargs: Any) FieldSpec¶
Add a foreign key (field and constraint) referencing the dataset table.
- Parameters:
- tableSpec
ddl.TableSpec Specification for the table that should reference the dataset table. Will be modified in place.
- name
str, optional A name to use for the prefix of the new field; the full name is {name}_id.
- constraint
bool, optional If False (True is default), add a field that can be joined to the dataset primary key, but do not add a foreign key constraint.
- onDelete
str, optional One of “CASCADE” or “SET NULL”, indicating what should happen to the referencing row if the dataset row is deleted. None indicates that this should be an integrity error.
- **kwargs
Additional keyword arguments are forwarded to the ddl.FieldSpec constructor (only the name and dtype arguments are otherwise provided).
- Returns:
- idSpec
ddl.FieldSpec Specification for the ID field.
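The naming and in-place modification described above can be illustrated with a small stand-alone sketch. It does not use the real ddl.TableSpec or ddl.FieldSpec classes: ToyFieldSpec, ToyTableSpec, the "uuid" dtype string, and the "dataset.id" target name are all hypothetical placeholders chosen for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFieldSpec:
    """Hypothetical stand-in for ddl.FieldSpec."""
    name: str
    dtype: str
    nullable: bool = True


@dataclass
class ToyTableSpec:
    """Hypothetical stand-in for ddl.TableSpec."""
    fields: list = field(default_factory=list)
    foreign_keys: list = field(default_factory=list)


def add_dataset_foreign_key(table_spec, *, name="dataset", constraint=True,
                            on_delete=None, **kwargs):
    # The full field name is the prefix plus "_id"; extra keyword
    # arguments go straight to the field constructor.
    id_spec = ToyFieldSpec(name=f"{name}_id", dtype="uuid", **kwargs)
    table_spec.fields.append(id_spec)  # the table spec is modified in place
    if constraint:
        # "dataset.id" is an assumed name for the referenced column.
        table_spec.foreign_keys.append(
            {"source": id_spec.name, "target": "dataset.id", "on_delete": on_delete}
        )
    return id_spec


spec = ToyTableSpec()
parent = add_dataset_foreign_key(spec, name="parent", on_delete="CASCADE", nullable=False)
loose = add_dataset_foreign_key(spec, name="sibling", constraint=False)
```

With constraint=False the field is still added so it can be joined against the dataset primary key, but no foreign key entry is recorded.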
- abstract associate(dataset_type: DatasetType, collection: CollectionRecord, datasets: Iterable[DatasetRef]) None¶
Associate one or more datasets with a collection.
- Parameters:
- dataset_type
DatasetType Type of all datasets.
- collection
CollectionRecord The record object describing the collection. collection.type must be TAGGED.
- datasets
Iterable[DatasetRef] Datasets to be associated. All datasets must have the same DatasetType as dataset_type, but this is not checked.
Notes
Associating a dataset into a collection that already contains a different dataset with the same DatasetType and data ID will remove the existing dataset from that collection.
Associating the same dataset into a collection multiple times is a no-op, but is still not permitted on read-only databases.
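The replacement and no-op behavior in the notes above can be modeled with a dictionary keyed on dataset type and data ID. This is only an in-memory sketch under that assumption, not the database-backed implementation; ToyRef and ToyTaggedCollection are hypothetical names.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class ToyRef:
    """Hypothetical stand-in for DatasetRef with a hashable data ID."""
    id: UUID
    dataset_type: str
    data_id: tuple


class ToyTaggedCollection:
    """Holds at most one dataset per (dataset type, data ID) pair."""

    def __init__(self) -> None:
        self._by_key: dict[tuple, ToyRef] = {}

    def associate(self, refs) -> None:
        for ref in refs:
            # A dataset with the same key displaces the one already present;
            # re-associating the same dataset leaves the mapping unchanged.
            self._by_key[(ref.dataset_type, ref.data_id)] = ref

    def contents(self) -> set:
        return set(self._by_key.values())


data_id = (("detector", 12), ("instrument", "ToyCam"))
first = ToyRef(uuid4(), "flat", data_id)
second = ToyRef(uuid4(), "flat", data_id)

tagged = ToyTaggedCollection()
tagged.associate([first])
tagged.associate([first])   # repeated association: nothing changes
after_repeat = tagged.contents()
tagged.associate([second])  # same type and data ID: replaces `first`
```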
- abstract certify(dataset_type: DatasetType, collection: CollectionRecord, datasets: Iterable[DatasetRef], timespan: Timespan, context: SqlQueryContext) None¶
Associate one or more datasets with a calibration collection and a validity range within it.
- Parameters:
- dataset_type
DatasetType Type of all datasets.
- collection
CollectionRecord The record object describing the collection. collection.type must be CALIBRATION.
- datasets
Iterable[DatasetRef] Datasets to be associated. All datasets must have the same DatasetType as dataset_type, but this is not checked.
- timespan
Timespan The validity range for these datasets within the collection.
- context
SqlQueryContext The object that manages database connections, temporary tables and relation engines for this query.
- Raises:
- ConflictingDefinitionError
Raised if the collection already contains a different dataset with the same DatasetType and data ID and an overlapping validity range.
- DatasetTypeError
Raised if dataset_type.isCalibration() is False.
- CollectionTypeError
Raised if collection.type is not CollectionType.CALIBRATION.
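The conflict condition for certify can be sketched in plain Python. The example assumes validity ranges are half-open (begin, end) pairs of comparable values, which is an assumption of this sketch; ToyConflictError and ToyCalibrationCollection are hypothetical stand-ins rather than real classes.

```python
class ToyConflictError(Exception):
    """Hypothetical stand-in for ConflictingDefinitionError."""


def overlaps(a, b):
    """Return True if half-open ranges [a0, a1) and [b0, b1) intersect."""
    return a[0] < b[1] and b[0] < a[1]


class ToyCalibrationCollection:
    def __init__(self):
        # Each row: (dataset_type, data_id, dataset_id, (begin, end)).
        self.rows = []

    def certify(self, dataset_type, data_id, dataset_id, timespan):
        for row_type, row_data_id, row_id, row_span in self.rows:
            same_key = (row_type, row_data_id) == (dataset_type, data_id)
            # Only a *different* dataset with the same type and data ID
            # and an overlapping range is a conflict.
            if same_key and row_id != dataset_id and overlaps(row_span, timespan):
                raise ToyConflictError(f"{dataset_id} overlaps {row_id} in {row_span}")
        self.rows.append((dataset_type, data_id, dataset_id, timespan))


calib = ToyCalibrationCollection()
calib.certify("bias", ("detector", 1), "bias_a", (0, 10))
calib.certify("bias", ("detector", 1), "bias_b", (10, 20))  # adjacent, no overlap
try:
    calib.certify("bias", ("detector", 1), "bias_c", (5, 15))
    conflict = None
except ToyConflictError as err:
    conflict = str(err)
```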
- abstract clone(*, db: Database, collections: CollectionManager, dimensions: DimensionRecordStorageManager, caching_context: CachingContext) DatasetRecordStorageManager¶
Make an independent copy of this manager instance bound to new instances of Database and other managers.
- Parameters:
- db
Database New Database object to use when instantiating the manager.
- collections
CollectionManager New CollectionManager object to use when instantiating the manager.
- dimensions
DimensionRecordStorageManager New DimensionRecordStorageManager object to use when instantiating the manager.
- caching_context
CachingContext New CachingContext object to use when instantiating the manager.
- Returns:
- instance
DatasetRecordStorageManager New manager instance with the same configuration as this instance, but bound to a new Database object.
- conform_exact_dataset_type(dataset_type: DatasetType | str) DatasetType¶
Conform a value that may be a dataset type or dataset type name to the registered DatasetType, while checking that the dataset type is not a component and (if a DatasetType instance is given) has the exact same definition in the registry.
- Parameters:
- dataset_type
str or DatasetType Dataset type object or name.
- Returns:
- dataset_type
DatasetType The corresponding registered dataset type.
- Raises:
- DatasetTypeError
Raised if dataset_type is a component, or if its definition does not exactly match the registered dataset type.
- MissingDatasetTypeError
Raised if this dataset type is not registered at all.
- abstract decertify(dataset_type: DatasetType, collection: CollectionRecord, timespan: Timespan, *, data_ids: Iterable[DataCoordinate] | None = None, context: SqlQueryContext) None¶
Remove or adjust datasets to clear a validity range within a calibration collection.
- Parameters:
- dataset_type
DatasetType Type of all datasets.
- collection
CollectionRecord The record object describing the collection. collection.type must be CALIBRATION.
- timespan
Timespan The validity range to remove datasets from within the collection. Datasets that overlap this range but are not contained by it will have their validity ranges adjusted to not overlap it, which may split a single dataset validity range into two.
- data_ids
Iterable[DataCoordinate], optional Data IDs that should be decertified within the given validity range. If None, all data IDs for dataset_type in collection will be decertified.
- context
SqlQueryContext The object that manages database connections, temporary tables and relation engines for this query.
- Raises:
- DatasetTypeError
Raised if dataset_type.isCalibration() is False.
- CollectionTypeError
Raised if collection.type is not CollectionType.CALIBRATION.
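The range adjustment described for the timespan parameter amounts to interval subtraction. The sketch below shows the four possible outcomes for one existing validity range, assuming half-open (begin, end) pairs; clear_range is a hypothetical helper, not part of the interface.

```python
def clear_range(validity, cleared):
    """Return what remains of `validity` after removing `cleared`.

    Both arguments are half-open (begin, end) pairs. The result has zero,
    one, or two ranges.
    """
    begin, end = validity
    c_begin, c_end = cleared
    if c_end <= begin or end <= c_begin:
        return [validity]  # no overlap: nothing to adjust
    remaining = []
    if begin < c_begin:
        remaining.append((begin, c_begin))  # piece before the cleared range
    if c_end < end:
        remaining.append((c_end, end))      # piece after the cleared range
    return remaining


split = clear_range((0, 10), (3, 6))       # cleared range strictly inside
trimmed = clear_range((0, 10), (8, 15))    # overlap at the end only
removed = clear_range((0, 10), (-5, 20))   # fully contained: dataset dropped
untouched = clear_range((0, 10), (10, 20)) # adjacent ranges do not overlap
```

The first case is the one in which a single validity range is split into two.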
- abstract delete(datasets: Iterable[UUID | DatasetRef]) None¶
Fully delete the given datasets from the registry.
- Parameters:
- datasets
Iterable[DatasetId or DatasetRef] Datasets to be deleted. If DatasetRef instances are passed, only the DatasetRef.id attribute is used.
- abstract disassociate(dataset_type: DatasetType, collection: CollectionRecord, datasets: Iterable[DatasetRef]) None¶
Remove one or more datasets from a collection.
- Parameters:
- dataset_type
DatasetType Type of all datasets.
- collection
CollectionRecord The record object describing the collection. collection.type must be TAGGED.
- datasets
Iterable[DatasetRef] Datasets to be disassociated. All datasets must have the same DatasetType as dataset_type, but this is not checked.
- abstract fetch_summaries(collections: Iterable[CollectionRecord], dataset_types: Iterable[DatasetType] | Iterable[str] | None = None) Mapping[Any, CollectionSummary]¶
Fetch collection summaries given their names and dataset types.
- Parameters:
- collections
Iterable[CollectionRecord] Collection records to query.
- dataset_types
Iterable[DatasetType] or Iterable[str] or None Dataset types to include in the returned summaries. If None, all dataset types are included.
- Returns:
- summaries
Mapping[Any, CollectionSummary] Collection summaries indexed by collection record key. This mapping will also contain all nested non-chained collections of the chained collections.
- abstract getCollectionSummary(collection: CollectionRecord) CollectionSummary¶
Return a summary for the given collection.
- Parameters:
- collection
CollectionRecord Record describing the collection for which a summary is to be retrieved.
- Returns:
- summary
CollectionSummary Summary of the dataset types and governor dimension values in this collection.
- abstract getDatasetRef(id: UUID) DatasetRef | None¶
Return a DatasetRef for the given dataset primary key value.
- abstract get_dataset_type(name: str) DatasetType¶
Look up a dataset type by name.
- Parameters:
- name
str Name of a parent dataset type.
- Returns:
- dataset_type
DatasetType The object representing the records for the given dataset type.
- Raises:
- MissingDatasetTypeError
Raised if there is no dataset type with the given name.
- abstract import_(run: RunRecord, refs: list[DatasetRef]) None¶
Insert one or more dataset entries into the database.
- abstract classmethod initialize(db: Database, context: StaticTablesContext, *, collections: CollectionManager, dimensions: DimensionRecordStorageManager, caching_context: CachingContext, registry_schema_version: VersionTuple | None = None) DatasetRecordStorageManager¶
Construct an instance of the manager.
- Parameters:
- db
Database Interface to the underlying database engine and namespace.
- context
StaticTablesContext Context object obtained from Database.declareStaticTables; used to declare any tables that should always be present.
- collections
CollectionManager Manager object for the collections in this Registry.
- dimensions
DimensionRecordStorageManager Manager object for the dimensions in this Registry.
- caching_context
CachingContext Object controlling caching of information returned by managers.
- registry_schema_version
VersionTuple or None Schema version of this extension as defined in registry.
- Returns:
- manager
DatasetRecordStorageManager An instance of a concrete DatasetRecordStorageManager subclass.
- abstract insert(dataset_type_name: str, run: RunRecord, data_ids: Iterable[DataCoordinate], id_generation_mode: DatasetIdGenEnum = DatasetIdGenEnum.UNIQUE) list[DatasetRef]¶
Insert one or more dataset entries into the database.
- Parameters:
- dataset_type_name
str Name of the dataset type.
- run
RunRecord The record object describing the RUN collection these datasets will be associated with.
- data_ids
Iterable[DataCoordinate] Expanded data IDs (DataCoordinate instances) for the datasets to be added. The dimensions of all data IDs must be the same as dataset_type.dimensions.
- id_generation_mode
DatasetIdGenEnum With UNIQUE, each new dataset is inserted with a new unique ID. With a non-UNIQUE mode, the ID is computed from some combination of dataset type, data ID, and run collection name; if the same ID is already in the database, no new record is inserted.
- Returns:
- datasets
list[DatasetRef] References to the inserted datasets.
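The difference between UNIQUE and non-UNIQUE ID generation can be sketched with the standard uuid module. Everything specific here is an assumption made for illustration: the ToyIdGenMode enum and its DETERMINISTIC member, the namespace UUID, and the way the name string is assembled are not taken from the real implementation.

```python
import uuid
from enum import Enum, auto


class ToyIdGenMode(Enum):
    UNIQUE = auto()
    DETERMINISTIC = auto()  # hypothetical name for a non-UNIQUE mode


# Hypothetical namespace; the real one is not given in this document.
TOY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "toy-registry.invalid")


def make_id(mode, dataset_type_name, run_name, data_id):
    if mode is ToyIdGenMode.UNIQUE:
        return uuid.uuid4()  # random, different on every call
    # Same inputs always produce the same ID (assumed string layout).
    parts = [dataset_type_name, run_name]
    parts += [f"{key}={data_id[key]}" for key in sorted(data_id)]
    return uuid.uuid5(TOY_NAMESPACE, ",".join(parts))


def toy_insert(table, dataset_type_name, run_name, data_ids,
               mode=ToyIdGenMode.UNIQUE):
    ids = []
    for data_id in data_ids:
        dataset_id = make_id(mode, dataset_type_name, run_name, data_id)
        if dataset_id not in table:  # an existing ID means no new record
            table[dataset_id] = (dataset_type_name, run_name, dict(data_id))
        ids.append(dataset_id)
    return ids


table = {}
data_ids = [{"instrument": "ToyCam", "detector": 3}]
first = toy_insert(table, "raw", "run1", data_ids, ToyIdGenMode.DETERMINISTIC)
again = toy_insert(table, "raw", "run1", data_ids, ToyIdGenMode.DETERMINISTIC)
unique = toy_insert(table, "raw", "run1", data_ids)
```

Repeating the deterministic insert leaves the table unchanged, while the UNIQUE insert always adds a row.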
- abstract make_joins_builder(dataset_type: DatasetType, collections: Sequence[CollectionRecord], fields: Set[str], is_union: bool = False) SqlJoinsBuilder¶
Make a direct_query_driver.SqlJoinsBuilder that represents a search for datasets of this type.
- Parameters:
- dataset_type
DatasetType Type of dataset to query for.
- collections
Sequence[CollectionRecord] Collections to search, in order, after filtering out collections with no datasets of this type via collection summaries.
- fields
Set[str] Names of fields to make available in the builder. Options include:
  - dataset_id (UUID)
  - run (collection name, str)
  - collection (collection name, str)
  - collection_key (collection primary key, manager-dependent)
  - timespan (validity range, or unbounded for non-calibrations)
  - ingest_date (time dataset was ingested into repository)
Dimension keys for the dataset type’s required dimensions are always included.
- is_union
bool, optional If True, this search is being joined in as part of one term in a union over all dataset types. This causes fields to be added to the builder via the special ... instead of the dataset type name.
- Returns:
- builder
direct_query_driver.SqlJoinsBuilder A query-construction object representing a table or subquery.
- abstract make_relation(dataset_type: DatasetType, *collections: CollectionRecord, columns: Set[str], context: SqlQueryContext) Relation¶
Return a sql.Relation that represents a query for this DatasetType in one or more collections.
- Parameters:
- dataset_type
DatasetType Type of dataset to query for.
- *collections
CollectionRecord The record object(s) describing the collection(s) to query. May not be of type CollectionType.CHAINED. If multiple collections are passed, the query will search all of them in an unspecified order, and all collections must have the same type. Must include at least one collection.
- columns
Set[str] Columns to include in the relation. See Query.find_datasets for most options, but this method supports one more:
  - rank: a calculated integer column holding the index of the collection the dataset was found in, within the collections sequence given.
- context
SqlQueryContext The object that manages database connections, temporary tables and relation engines for this query.
- Returns:
- relation
Relation Representation of the query.
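Because the collections are searched in an unspecified order, the rank column is what lets a caller recover "first match wins" ordering afterwards. The sketch below models that idea with plain dictionaries; toy_relation and the sample collection names are hypothetical, and no real sql.Relation is constructed.

```python
def toy_relation(collections, contents, data_id):
    """Return one row per collection that holds a dataset for `data_id`.

    `rank` is the index of the collection within the `collections`
    sequence that was passed in.
    """
    rows = []
    for rank, name in enumerate(collections):
        dataset_id = contents.get(name, {}).get(data_id)
        if dataset_id is not None:
            rows.append({"dataset_id": dataset_id, "collection": name, "rank": rank})
    return rows


contents = {
    "u/me/run": {("detector", 4): "id-new"},
    "shared/run": {("detector", 4): "id-old", ("detector", 5): "id-other"},
}
rows = toy_relation(["u/me/run", "shared/run"], contents, ("detector", 4))

# Picking the lowest rank selects the dataset from the earliest collection.
best = min(rows, key=lambda row: row["rank"])
```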
- abstract preload_cache() None¶
Fetch data from the database and use it to pre-populate caches to speed up later operations.
- abstract refresh() None¶
Ensure all other operations on this manager are aware of any dataset types that may have been registered by other clients since it was initialized or last refreshed.
- abstract refresh_collection_summaries(dataset_type: DatasetType) None¶
Make sure that collection summaries for this dataset type are consistent with the contents of the dataset tables.
- Parameters:
- dataset_type
DatasetType Dataset type whose summary entries should be refreshed.
- abstract register_dataset_type(dataset_type: DatasetType) bool¶
Ensure that this Registry can hold records for the given DatasetType, creating new tables as necessary.
- Parameters:
- dataset_type
DatasetType Dataset type for which a table should be created (as necessary) and an associated DatasetRecordStorage returned.
- Returns:
Notes
This operation may not be invoked within a Database.transaction context.
- abstract remove_dataset_type(name: str) None¶
Remove the dataset type.
- Parameters:
- name
str Name of the dataset type.
- abstract resolve_wildcard(expression: Any, missing: list[str] | None = None, explicit_only: bool = False) list[lsst.daf.butler._dataset_type.DatasetType]¶
Resolve a dataset type wildcard expression.
- Parameters:
- expression
Any Expression to resolve. Will be passed to DatasetTypeWildcard.from_expression.
- missing
list of str, optional String dataset type names that were explicitly given (i.e. not regular expression patterns) but not found will be appended to this list, if it is provided.
- explicit_only
bool, optional If True, require explicit DatasetType instances or str names, with re.Pattern instances deprecated and ... prohibited.
- Returns:
- dataset_types
list[DatasetType] A list of resolved dataset types.
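The role of the missing argument can be shown with a simplified resolver over dataset type names. This is a sketch under stated assumptions: toy_resolve is hypothetical, it works on strings rather than DatasetType objects, it does not go through DatasetTypeWildcard.from_expression, and the use of full-string pattern matching is a choice made for this example.

```python
import re


def toy_resolve(expression, registered, missing=None):
    """Resolve names and compiled patterns against `registered` names."""
    if isinstance(expression, (str, re.Pattern)):
        items = [expression]
    else:
        items = list(expression)
    resolved = []
    for item in items:
        if isinstance(item, re.Pattern):
            # Patterns that match nothing are not reported as missing.
            matches = [name for name in registered if item.fullmatch(name)]
        elif item in registered:
            matches = [item]
        else:
            matches = []
            if missing is not None:
                missing.append(item)  # explicit name that was not found
        for name in matches:
            if name not in resolved:
                resolved.append(name)
    return resolved


registered = ["raw", "calexp", "calexpBackground", "flat"]
missing = []
names = toy_resolve(
    ["raw", re.compile("calexp.*"), "no_such_type", re.compile("zzz.*")],
    registered,
    missing,
)
```

Passing a list for missing lets the caller decide afterwards whether unknown explicit names should be treated as an error.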