DatasetRecordStorageManager
- class lsst.daf.butler.registry.interfaces.DatasetRecordStorageManager(*, registry_schema_version: VersionTuple | None = None)
- Bases: VersionedExtension
- An interface that manages the tables that describe datasets.
- DatasetRecordStorageManager primarily serves as a container and factory for DatasetRecordStorage instances, which each provide access to the records for a different DatasetType.
- Parameters:
- registry_schema_version : VersionTuple or None, optional
- Version of registry schema.
- Methods Summary
- addDatasetForeignKey(tableSpec, *[, name, ...]): Add a foreign key (field and constraint) referencing the dataset table.
- associate(dataset_type, collection, datasets): Associate one or more datasets with a collection.
- certify(dataset_type, collection, datasets, ...): Associate one or more datasets with a calibration collection and a validity range within it.
- clone(*, db, collections, dimensions, ...): Make an independent copy of this manager instance bound to new instances of Database and other managers.
- conform_exact_dataset_type(dataset_type): Conform a value that may be a dataset type or dataset type name to the registered dataset type, while checking that the dataset type is not a component and (if a DatasetType instance is given) has the exact same definition in the registry.
- decertify(dataset_type, collection, timespan, *): Remove or adjust datasets to clear a validity range within a calibration collection.
- delete(datasets): Fully delete the given datasets from the registry.
- disassociate(dataset_type, collection, datasets): Remove one or more datasets from a collection.
- fetch_summaries(collections[, dataset_types]): Fetch collection summaries given their names and dataset types.
- getCollectionSummary(collection): Return a summary for the given collection.
- getDatasetRef(id): Return a DatasetRef for the given dataset primary key value.
- get_dataset_type(name): Look up a dataset type by name.
- import_(dataset_type, run, data_ids): Insert one or more dataset entries into the database.
- Return type of the ingest_date column.
- initialize(db, context, *, collections, ...): Construct an instance of the manager.
- insert(dataset_type_name, run, data_ids[, ...]): Insert one or more dataset entries into the database.
- make_joins_builder(dataset_type, ...[, is_union]): Make a direct_query_driver.SqlJoinsBuilder that represents a search for datasets of this type.
- make_relation(dataset_type, *collections, ...): Return a sql.Relation that represents a query for this DatasetType in one or more collections.
- preload_cache(): Fetch data from the database and use it to pre-populate caches to speed up later operations.
- refresh(): Ensure all other operations on this manager are aware of any dataset types that may have been registered by other clients since it was initialized or last refreshed.
- refresh_collection_summaries(dataset_type): Make sure that collection summaries for this dataset type are consistent with the contents of the dataset tables.
- register_dataset_type(dataset_type): Ensure that this Registry can hold records for the given DatasetType, creating new tables as necessary.
- remove_dataset_type(name): Remove the dataset type.
- resolve_wildcard(expression[, missing, ...]): Resolve a dataset type wildcard expression.

- Methods Documentation

- abstract classmethod addDatasetForeignKey(tableSpec: TableSpec, *, name: str = 'dataset', constraint: bool = True, onDelete: str | None = None, **kwargs: Any) -> FieldSpec
- Add a foreign key (field and constraint) referencing the dataset table.
- Parameters:
- tableSpec : ddl.TableSpec
- Specification for the table that should reference the dataset table. Will be modified in place.
- name : str, optional
- A name to use for the prefix of the new field; the full name is {name}_id.
- constraint : bool, optional
- If False (True is default), add a field that can be joined to the dataset primary key, but do not add a foreign key constraint.
- onDelete : str, optional
- One of “CASCADE” or “SET NULL”, indicating what should happen to the referencing row if the dataset row is deleted. None indicates that this should be an integrity error.
- **kwargs
- Additional keyword arguments are forwarded to the ddl.FieldSpec constructor (only the name and dtype arguments are otherwise provided).
- Returns:
- idSpec : ddl.FieldSpec
- Specification for the ID field.
 
- abstract associate(dataset_type: DatasetType, collection: CollectionRecord, datasets: Iterable[DatasetRef]) -> None
- Associate one or more datasets with a collection.
- Parameters:
- dataset_type : DatasetType
- Type of all datasets.
- collection : CollectionRecord
- The record object describing the collection. collection.type must be TAGGED.
- datasets : Iterable[DatasetRef]
- Datasets to be associated. All datasets must have the same DatasetType as dataset_type, but this is not checked.
- Notes
- Associating a dataset into a collection that already contains a different dataset with the same DatasetType and data ID will remove the existing dataset from that collection.
- Associating the same dataset into a collection multiple times is a no-op, but is still not permitted on read-only databases.
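
The replacement and no-op behavior described in these notes can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `TaggedCollection`, its `members` field, and the instrument name are hypothetical stand-ins rather than part of lsst.daf.butler, and a concrete manager would perform the equivalent work in SQL.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class TaggedCollection:
    """Hypothetical in-memory stand-in for a TAGGED collection."""

    # Maps (dataset type name, data ID) to the single dataset ID tagged there.
    members: dict[tuple[str, tuple], UUID] = field(default_factory=dict)

    def associate(self, dataset_type: str, data_id: tuple, dataset_id: UUID) -> None:
        # Writing to the same key replaces a different dataset that was already
        # tagged, and changes nothing when the same dataset is associated again.
        self.members[(dataset_type, data_id)] = dataset_id


collection = TaggedCollection()
data_id = (("detector", 42), ("instrument", "HypotheticalCam"))
first, second = uuid4(), uuid4()

collection.associate("flat", data_id, first)
collection.associate("flat", data_id, first)   # same dataset again: no change
collection.associate("flat", data_id, second)  # different dataset: replaces `first`
```

Keying the mapping on the dataset type and data ID is what makes "at most one dataset per type and data ID" hold by construction in this model.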
- abstract certify(dataset_type: DatasetType, collection: CollectionRecord, datasets: Iterable[DatasetRef], timespan: Timespan, context: SqlQueryContext) -> None
- Associate one or more datasets with a calibration collection and a validity range within it.
- Parameters:
- dataset_type : DatasetType
- Type of all datasets.
- collection : CollectionRecord
- The record object describing the collection. collection.type must be CALIBRATION.
- datasets : Iterable[DatasetRef]
- Datasets to be associated. All datasets must have the same DatasetType as dataset_type, but this is not checked.
- timespan : Timespan
- The validity range for these datasets within the collection.
- context : SqlQueryContext
- The object that manages database connections, temporary tables and relation engines for this query.
- Raises:
- ConflictingDefinitionError
- Raised if the collection already contains a different dataset with the same DatasetType and data ID and an overlapping validity range.
- DatasetTypeError
- Raised if dataset_type.isCalibration() is False.
- CollectionTypeError
- Raised if collection.type is not CollectionType.CALIBRATION.
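
The ConflictingDefinitionError condition can be sketched with plain tuples. The code below is an illustration, not the manager's implementation: validity ranges are modeled as half-open integer pairs (begin, end), which is an assumption of this sketch, the exception class is a local stand-in with the same name, and the row layout and function names are hypothetical.

```python
class ConflictingDefinitionError(Exception):
    """Local stand-in for the butler exception of the same name."""


def overlaps(a: tuple[int, int], b: tuple[int, int]) -> bool:
    # Two half-open ranges [begin, end) overlap when each begins before the other ends.
    return a[0] < b[1] and b[0] < a[1]


def certify(existing: list, data_id: str, dataset_id: str, timespan: tuple[int, int]) -> None:
    """Append a certification row, rejecting overlaps for the same data ID.

    ``existing`` holds (data_id, dataset_id, (begin, end)) tuples.
    """
    for other_data_id, other_id, other_span in existing:
        if other_data_id == data_id and other_id != dataset_id and overlaps(other_span, timespan):
            raise ConflictingDefinitionError(
                f"{data_id} already has dataset {other_id} valid during {other_span}"
            )
    existing.append((data_id, dataset_id, timespan))


rows: list = []
certify(rows, "detector=1", "bias-a", (0, 10))
certify(rows, "detector=1", "bias-b", (10, 20))  # adjacent ranges do not overlap
try:
    certify(rows, "detector=1", "bias-c", (5, 15))  # overlaps both existing ranges
    conflict = False
except ConflictingDefinitionError:
    conflict = True
```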
 
 
- abstract clone(*, db: Database, collections: CollectionManager, dimensions: DimensionRecordStorageManager, caching_context: CachingContext) -> DatasetRecordStorageManager
- Make an independent copy of this manager instance bound to new instances of Database and other managers.
- Parameters:
- db : Database
- New Database object to use when instantiating the manager.
- collections : CollectionManager
- New CollectionManager object to use when instantiating the manager.
- dimensions : DimensionRecordStorageManager
- New DimensionRecordStorageManager object to use when instantiating the manager.
- caching_context : CachingContext
- New CachingContext object to use when instantiating the manager.
- Returns:
- instance : DatasetRecordStorageManager
- New manager instance with the same configuration as this instance, but bound to a new Database object.
 
- conform_exact_dataset_type(dataset_type: DatasetType | str) -> DatasetType
- Conform a value that may be a dataset type or dataset type name to the registered dataset type, while checking that the dataset type is not a component and (if a DatasetType instance is given) has the exact same definition in the registry.
- Parameters:
- dataset_type : str or DatasetType
- Dataset type object or name.
- Returns:
- dataset_type : DatasetType
- The corresponding registered dataset type.
- Raises:
- DatasetTypeError
- Raised if dataset_type is a component, or if its definition does not exactly match the registered dataset type.
- MissingDatasetTypeError
- Raised if this dataset type is not registered at all.
 
 
- abstract decertify(dataset_type: DatasetType, collection: CollectionRecord, timespan: Timespan, *, data_ids: Iterable[DataCoordinate] | None = None, context: SqlQueryContext) -> None
- Remove or adjust datasets to clear a validity range within a calibration collection.
- Parameters:
- dataset_type : DatasetType
- Type of all datasets.
- collection : CollectionRecord
- The record object describing the collection. collection.type must be CALIBRATION.
- timespan : Timespan
- The validity range to remove datasets from within the collection. Datasets that overlap this range but are not contained by it will have their validity ranges adjusted to not overlap it, which may split a single dataset validity range into two.
- data_ids : Iterable[DataCoordinate], optional
- Data IDs that should be decertified within the given validity range. If None, all data IDs for dataset_type in collection will be decertified.
- context : SqlQueryContext
- The object that manages database connections, temporary tables and relation engines for this query.
- Raises:
- DatasetTypeError
- Raised if dataset_type.isCalibration() is False.
- CollectionTypeError
- Raised if collection.type is not CollectionType.CALIBRATION.
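
The trimming and splitting behavior described for the timespan parameter can be sketched as follows. This is a simplified model, not the manager's implementation: ranges are treated as half-open integer pairs (begin, end), which is an assumption of this sketch, data ID filtering is omitted, and the function operates on a plain list of hypothetical (dataset_id, range) rows.

```python
def decertify(rows: list, timespan: tuple[int, int]) -> list:
    """Return ``rows`` with ``timespan`` cleared.

    ``rows`` holds (dataset_id, (begin, end)) tuples with half-open ranges.
    """
    clear_begin, clear_end = timespan
    result = []
    for dataset_id, (begin, end) in rows:
        if end <= clear_begin or clear_end <= begin:
            # No overlap with the cleared range: keep the row unchanged.
            result.append((dataset_id, (begin, end)))
            continue
        if begin < clear_begin:
            # Keep the part that lies before the cleared range.
            result.append((dataset_id, (begin, clear_begin)))
        if clear_end < end:
            # Keep the part that lies after the cleared range.
            result.append((dataset_id, (clear_end, end)))
        # A row fully contained in the cleared range contributes nothing.
    return result


rows = [("bias-a", (0, 10)), ("bias-b", (10, 20)), ("bias-c", (20, 30))]
cleared = decertify(rows, (5, 25))
split = decertify([("bias-x", (0, 10))], (3, 6))
```

In this model `cleared` keeps a trimmed `bias-a` and `bias-c` and drops `bias-b` entirely, while `split` shows a single validity range becoming two.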
 
 
- abstract delete(datasets: Iterable[UUID | DatasetRef]) -> None
- Fully delete the given datasets from the registry.
- Parameters:
- datasets : Iterable[DatasetId or DatasetRef]
- Datasets to be deleted. If DatasetRef instances are passed, only the DatasetRef.id attribute is used.
 
- abstract disassociate(dataset_type: DatasetType, collection: CollectionRecord, datasets: Iterable[DatasetRef]) -> None
- Remove one or more datasets from a collection.
- Parameters:
- dataset_type : DatasetType
- Type of all datasets.
- collection : CollectionRecord
- The record object describing the collection. collection.type must be TAGGED.
- datasets : Iterable[DatasetRef]
- Datasets to be disassociated. All datasets must have the same DatasetType as dataset_type, but this is not checked.
 
- abstract fetch_summaries(collections: Iterable[CollectionRecord], dataset_types: Iterable[DatasetType] | Iterable[str] | None = None) -> Mapping[Any, CollectionSummary]
- Fetch collection summaries given their names and dataset types.
- Parameters:
- collections : Iterable[CollectionRecord]
- Collection records to query.
- dataset_types : Iterable[DatasetType] or Iterable[str] or None, optional
- Dataset types to include in the returned summaries. If None, all dataset types will be included.
- Returns:
- summaries : Mapping[Any, CollectionSummary]
- Collection summaries indexed by collection record key. This mapping will also contain all nested non-chained collections of the chained collections.
 
- abstract getCollectionSummary(collection: CollectionRecord) -> CollectionSummary
- Return a summary for the given collection.
- Parameters:
- collection : CollectionRecord
- Record describing the collection for which a summary is to be retrieved.
- Returns:
- summary : CollectionSummary
- Summary of the dataset types and governor dimension values in this collection.
 
- abstract getDatasetRef(id: UUID) -> DatasetRef | None
- Return a DatasetRef for the given dataset primary key value.

- abstract get_dataset_type(name: str) -> DatasetType
- Look up a dataset type by name.
- Parameters:
- name : str
- Name of a parent dataset type.
- Returns:
- dataset_type : DatasetType
- The object representing the records for the given dataset type.
- Raises:
- MissingDatasetTypeError
- Raised if there is no dataset type with the given name.
 
 
- abstract import_(dataset_type: DatasetType, run: RunRecord, data_ids: Mapping[DatasetId, DataCoordinate]) -> list[DatasetRef]
- Insert one or more dataset entries into the database.
- Parameters:
- Returns:
- datasets : list[DatasetRef]
- References to the inserted or existing datasets.
 
- abstract classmethod initialize(db: Database, context: StaticTablesContext, *, collections: CollectionManager, dimensions: DimensionRecordStorageManager, caching_context: CachingContext, registry_schema_version: VersionTuple | None = None) -> DatasetRecordStorageManager
- Construct an instance of the manager.
- Parameters:
- db : Database
- Interface to the underlying database engine and namespace.
- context : StaticTablesContext
- Context object obtained from Database.declareStaticTables; used to declare any tables that should always be present.
- collections : CollectionManager
- Manager object for the collections in this Registry.
- dimensions : DimensionRecordStorageManager
- Manager object for the dimensions in this Registry.
- caching_context : CachingContext
- Object controlling caching of information returned by managers.
- registry_schema_version : VersionTuple or None
- Schema version of this extension as defined in registry.
- Returns:
- manager : DatasetRecordStorageManager
- An instance of a concrete DatasetRecordStorageManager subclass.
 
- abstract insert(dataset_type_name: str, run: RunRecord, data_ids: Iterable[DataCoordinate], id_generation_mode: DatasetIdGenEnum = DatasetIdGenEnum.UNIQUE) -> list[DatasetRef]
- Insert one or more dataset entries into the database.
- Parameters:
- dataset_type_name : str
- Name of the dataset type.
- run : RunRecord
- The record object describing the RUN collection these datasets will be associated with.
- data_ids : Iterable[DataCoordinate]
- Expanded data IDs (DataCoordinate instances) for the datasets to be added. The dimensions of all data IDs must be the same as the dimensions of the dataset type.
- id_generation_mode : DatasetIdGenEnum
- With UNIQUE, each new dataset is inserted with a new unique ID. With any non-UNIQUE mode, the ID is computed from some combination of dataset type, data ID, and run collection name; if the same ID is already in the database, no new record is inserted.
- Returns:
- datasets : list[DatasetRef]
- References to the inserted datasets.
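
The difference between UNIQUE and non-UNIQUE ID generation can be sketched with the standard uuid module. Everything here is an assumption made for illustration: `IdMode` is a hypothetical stand-in for DatasetIdGenEnum, the namespace UUID and the string layout fed to `uuid5` are arbitrary, and the dict-based `insert` helper only models the "skip if the ID already exists" rule.

```python
import uuid
from enum import Enum, auto


class IdMode(Enum):
    """Hypothetical stand-in for DatasetIdGenEnum."""

    UNIQUE = auto()
    DETERMINISTIC = auto()


# Arbitrary namespace chosen for this sketch only.
SKETCH_NAMESPACE = uuid.UUID("00000000-0000-0000-0000-000000000001")


def make_dataset_id(dataset_type: str, data_id: dict, run: str, mode: IdMode) -> uuid.UUID:
    if mode is IdMode.UNIQUE:
        # Random ID: every call yields a new dataset.
        return uuid.uuid4()
    # Sort keys so that equal data IDs always produce the same string.
    items = ",".join(f"{key}={data_id[key]}" for key in sorted(data_id))
    return uuid.uuid5(SKETCH_NAMESPACE, f"{dataset_type}|{items}|{run}")


def insert(table: dict, dataset_id: uuid.UUID, record: tuple) -> bool:
    """Add a row unless the ID is already present; return True if a row was added."""
    if dataset_id in table:
        return False
    table[dataset_id] = record
    return True


table: dict = {}
data_id = {"instrument": "HypotheticalCam", "exposure": 7}
reordered = {"exposure": 7, "instrument": "HypotheticalCam"}

id_a = make_dataset_id("raw", data_id, "run/1", IdMode.DETERMINISTIC)
id_b = make_dataset_id("raw", reordered, "run/1", IdMode.DETERMINISTIC)
added_first = insert(table, id_a, ("raw", "run/1"))
added_again = insert(table, id_b, ("raw", "run/1"))  # same ID: nothing inserted
```

A deterministic ID makes repeated ingestion of the same dataset idempotent, which is the practical reason to choose a non-UNIQUE mode in a model like this one.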
 
- abstract make_joins_builder(dataset_type: DatasetType, collections: Sequence[CollectionRecord], fields: Set[str], is_union: bool = False) -> SqlJoinsBuilder
- Make a direct_query_driver.SqlJoinsBuilder that represents a search for datasets of this type.
- Parameters:
- dataset_type : DatasetType
- Type of dataset to query for.
- collections : Sequence[CollectionRecord]
- Collections to search, in order, after filtering out collections with no datasets of this type via collection summaries.
- fields : Set[str]
- Names of fields to make available in the builder. Options include:
  - dataset_id (UUID)
  - run (collection name, str)
  - collection (collection name, str)
  - collection_key (collection primary key, manager-dependent)
  - timespan (validity range, or unbounded for non-calibrations)
  - ingest_date (time dataset was ingested into repository)
- Dimension keys for the dataset type’s required dimensions are always included.
- is_union : bool, optional
- If True, this search is being joined in as part of one term in a union over all dataset types. This causes fields to be added to the builder via the special ... instead of the dataset type name.
- Returns:
- builder : direct_query_driver.SqlJoinsBuilder
- A query-construction object representing a table or subquery.
 
- abstract make_relation(dataset_type: DatasetType, *collections: CollectionRecord, columns: Set[str], context: SqlQueryContext) -> Relation
- Return a sql.Relation that represents a query for this DatasetType in one or more collections.
- Parameters:
- dataset_type : DatasetType
- Type of dataset to query for.
- *collections : CollectionRecord
- The record object(s) describing the collection(s) to query. May not be of type CollectionType.CHAINED. If multiple collections are passed, the query will search all of them in an unspecified order, and all collections must have the same type. Must include at least one collection.
- columns : Set[str]
- Columns to include in the relation. See Query.find_datasets for most options, but this method supports one more:
  - rank: a calculated integer column holding the index of the collection the dataset was found in, within the collections sequence given.
- context : SqlQueryContext
- The object that manages database connections, temporary tables and relation engines for this query.
- Returns:
- relation : Relation
- Representation of the query.
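
The meaning of the rank column can be illustrated without a database. The sketch below is a toy model under stated assumptions: collections are plain names, their contents are dicts from a data ID string to a dataset ID, and `find_with_rank` is a hypothetical helper, not part of the relation API.

```python
def find_with_rank(collections: list, contents: dict, data_id: str) -> list:
    """Return (collection, dataset_id, rank) rows matching ``data_id``.

    ``rank`` is the index of the collection within ``collections``.
    """
    rows = []
    for rank, name in enumerate(collections):
        dataset_id = contents.get(name, {}).get(data_id)
        if dataset_id is not None:
            rows.append((name, dataset_id, rank))
    return rows


contents = {
    "u/alice/run2": {"visit=1": "ds-new"},
    "shared/run1": {"visit=1": "ds-old", "visit=2": "ds-other"},
}
collections = ["u/alice/run2", "shared/run1"]

rows = find_with_rank(collections, contents, "visit=1")
# One plausible use of the rank: keep the match from the earliest collection.
best = min(rows, key=lambda row: row[2])
```

Because the query itself searches the collections in an unspecified order, a column like this is one way a caller could recover the order in which the collections were given.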
 
- abstract preload_cache() -> None
- Fetch data from the database and use it to pre-populate caches to speed up later operations.

- abstract refresh() -> None
- Ensure all other operations on this manager are aware of any dataset types that may have been registered by other clients since it was initialized or last refreshed.

- abstract refresh_collection_summaries(dataset_type: DatasetType) -> None
- Make sure that collection summaries for this dataset type are consistent with the contents of the dataset tables.
- Parameters:
- dataset_type : DatasetType
- Dataset type whose summary entries should be refreshed.
 
- abstract register_dataset_type(dataset_type: DatasetType) -> bool
- Ensure that this Registry can hold records for the given DatasetType, creating new tables as necessary.
- Parameters:
- dataset_type : DatasetType
- Dataset type for which a table should be created (as necessary).
- Returns:
- Notes
- This operation may not be invoked within a Database.transaction context.

- abstract remove_dataset_type(name: str) -> None
- Remove the dataset type.
- Parameters:
- name : str
- Name of the dataset type.
 
- abstract resolve_wildcard(expression: Any, missing: list[str] | None = None, explicit_only: bool = False) -> list[lsst.daf.butler._dataset_type.DatasetType]
- Resolve a dataset type wildcard expression.
- Parameters:
- expression : Any
- Expression to resolve. Will be passed to DatasetTypeWildcard.from_expression.
- missing : list of str, optional
- String dataset type names that were explicitly given (i.e. not regular expression patterns) but not found will be appended to this list, if it is provided.
- explicit_only : bool, optional
- If True, require explicit DatasetType instances or str names, with re.Pattern instances deprecated and ... prohibited.
- Returns:
- dataset_types : list[DatasetType]
- A list of resolved dataset types.
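
How the missing list and explicit_only flag interact can be sketched with plain strings. This is a rough model, not the real resolution logic: it returns dataset type names instead of DatasetType objects, matches patterns with `fullmatch` (an assumption of this sketch), raises TypeError for a prohibited `...` (the real exception type is not stated here), and does not model the deprecation of patterns under explicit_only.

```python
import re


def resolve_wildcard(expression, registered: set, missing=None, explicit_only=False) -> list:
    """Resolve names, compiled patterns, or ``...`` against ``registered`` names."""
    if expression is ...:
        if explicit_only:
            raise TypeError("'...' is not allowed when explicit_only=True")
        return sorted(registered)
    if isinstance(expression, (str, re.Pattern)):
        terms = [expression]
    else:
        terms = list(expression)
    result = []
    for term in terms:
        if isinstance(term, re.Pattern):
            matches = [name for name in sorted(registered) if term.fullmatch(name)]
        elif term in registered:
            matches = [term]
        else:
            matches = []
            # Only explicit string names are reported as missing.
            if missing is not None:
                missing.append(term)
        for name in matches:
            if name not in result:
                result.append(name)
    return result


registered = {"bias", "dark", "flat", "raw"}
missing: list = []
names = resolve_wildcard(["raw", re.compile(r"bias|dark"), "calexp"], registered, missing)
everything = resolve_wildcard(..., registered)
```

Passing a list for `missing` lets the caller decide whether an unknown explicit name is an error or just a warning, while a pattern that matches nothing is silently empty.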