DatasetRecordStorageManager

class lsst.daf.butler.registry.interfaces.DatasetRecordStorageManager(*, registry_schema_version: VersionTuple | None = None)

Bases: VersionedExtension

An interface that manages the tables that describe datasets.

DatasetRecordStorageManager primarily serves as a container and factory for DatasetRecordStorage instances, each of which provides access to the records for a different DatasetType.

Parameters:
registry_schema_version : VersionTuple or None, optional

Version of registry schema.

Methods Summary

addDatasetForeignKey(tableSpec, *[, name, ...])

Add a foreign key (field and constraint) referencing the dataset table.

checkCompatibility(registry_schema_version, ...)

Check that schema version defined in registry is compatible with current implementation.

checkNewSchemaVersion(schema_version)

Verify that requested schema version can be created by an extension.

clsNewSchemaVersion(schema_version)

Class method which returns schema version to use for newly created registry database.

currentVersions()

Return schema version(s) supported by this extension class.

extensionName()

Return full name of the extension.

fetch_summaries(collections[, dataset_types])

Fetch collection summaries given their names and dataset types.

find(name)

Return an object that provides access to the records associated with the given DatasetType name, if one exists.

getCollectionSummary(collection)

Return a summary for the given collection.

getDatasetRef(id)

Return a DatasetRef for the given dataset primary key value.

getIdColumnType()

Return type used for columns storing dataset IDs.

ingest_date_dtype()

Return type of the ingest_date column.

initialize(db, context, *, collections, ...)

Construct an instance of the manager.

newSchemaVersion()

Return schema version for newly created registry.

refresh()

Ensure all other operations on this manager are aware of any dataset types that may have been registered by other clients since it was initialized or last refreshed.

register(datasetType)

Ensure that this Registry can hold records for the given DatasetType, creating new tables as necessary.

remove(name)

Remove the dataset type.

resolve_wildcard(expression[, components, ...])

Resolve a dataset type wildcard expression.

supportsIdGenerationMode(mode)

Test whether the given dataset ID generation mode is supported by insert.

Methods Documentation

abstract classmethod addDatasetForeignKey(tableSpec: TableSpec, *, name: str = 'dataset', constraint: bool = True, onDelete: str | None = None, **kwargs: Any) -> FieldSpec

Add a foreign key (field and constraint) referencing the dataset table.

Parameters:
tableSpec : ddl.TableSpec

Specification for the table that should reference the dataset table. Will be modified in place.

name : str, optional

A name to use for the prefix of the new field; the full name is {name}_id.

constraint : bool, optional

If False (True is default), add a field that can be joined to the dataset primary key, but do not add a foreign key constraint.

onDelete : str, optional

One of “CASCADE” or “SET NULL”, indicating what should happen to the referencing row if the dataset row is deleted. None indicates that this should be an integrity error.

**kwargs

Additional keyword arguments are forwarded to the ddl.FieldSpec constructor (only the name and dtype arguments are otherwise provided).

Returns:
idSpec : ddl.FieldSpec

Specification for the ID field.
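
The behaviour described above (field named {name}_id, optional constraint, table specification modified in place) can be sketched roughly as follows. SketchField and SketchTable are hypothetical stand-ins for ddl.FieldSpec and ddl.TableSpec, and the "uuid" type and "dataset.id" target are placeholders chosen for illustration; none of this is the actual implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchField:
    """Hypothetical stand-in for ddl.FieldSpec."""

    name: str
    dtype: str
    nullable: bool = True


@dataclass
class SketchTable:
    """Hypothetical stand-in for ddl.TableSpec."""

    fields: list[SketchField] = field(default_factory=list)
    # (column, referenced column, on-delete action) triples.
    foreign_keys: list[tuple[str, str, str | None]] = field(default_factory=list)


def add_dataset_foreign_key(
    table: SketchTable,
    *,
    name: str = "dataset",
    constraint: bool = True,
    on_delete: str | None = None,
    **kwargs: Any,
) -> SketchField:
    """Add a {name}_id field to the table and, optionally, a constraint."""
    if on_delete not in (None, "CASCADE", "SET NULL"):
        raise ValueError(f"unsupported on_delete value: {on_delete!r}")
    # "uuid" and "dataset.id" are placeholders, not the real type or column.
    id_field = SketchField(name=f"{name}_id", dtype="uuid", **kwargs)
    table.fields.append(id_field)  # the table specification is modified in place
    if constraint:
        table.foreign_keys.append((id_field.name, "dataset.id", on_delete))
    return id_field


table = SketchTable()
parent = add_dataset_foreign_key(table, name="parent", on_delete="CASCADE", nullable=False)
loose = add_dataset_foreign_key(table, name="related", constraint=False)
```

With constraint=False the second call still adds a joinable related_id field but records no foreign key, which matches the description of the constraint parameter.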

classmethod checkCompatibility(registry_schema_version: VersionTuple, update: bool) -> None

Check that schema version defined in registry is compatible with current implementation.

Parameters:
registry_schema_version : VersionTuple

Schema version that exists in registry or defined in a configuration for a registry to be created.

update : bool

If True then read-write access is expected.

Raises:
IncompatibleVersionError

Raised if schema version is not supported by implementation.

Notes

The default implementation uses VersionTuple.checkCompatibility on the versions returned from the currentVersions method. Subclasses that support a different compatibility model should override this method.

classmethod checkNewSchemaVersion(schema_version: VersionTuple) -> None

Verify that requested schema version can be created by an extension.

Parameters:
schema_version : VersionTuple

Schema version that this extension is asked to create.

Notes

This method is used only occasionally, when a specific schema version is given in a registry config file. It can be used with an extension that supports multiple schema versions to make it create a new schema with a non-default version number. The default implementation checks the requested version against the versions returned from currentVersions.

classmethod clsNewSchemaVersion(schema_version: VersionTuple | None) -> VersionTuple | None

Class method which returns schema version to use for newly created registry database.

Parameters:
schema_version : VersionTuple or None

Configured schema version or None if default schema version should be created. If not None then it is guaranteed to be compatible with currentVersions.

Returns:
version : VersionTuple or None

Schema version created by this extension. None is returned if an extension does not require its version to be saved or checked.

Notes

The default implementation of this method works in simple cases. If the extension supports only a single schema version, then that version is returned. If the extension supports multiple schema versions and schema_version is not None, then schema_version is returned. If the extension supports multiple schema versions but schema_version is None, it calls the _newDefaultSchemaVersion method, which needs to be reimplemented in a subclass.
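
The cases in these notes can be written out as a small decision function. This is a sketch only: SketchVersion is a hypothetical stand-in for VersionTuple, new_default stands in for _newDefaultSchemaVersion, and the empty-list branch is inferred from the documented return values of this method and of currentVersions rather than taken from the implementation.

```python
from __future__ import annotations

from typing import Callable, NamedTuple


class SketchVersion(NamedTuple):
    """Hypothetical stand-in for VersionTuple."""

    major: int
    minor: int
    patch: int


def pick_new_schema_version(
    supported: list[SketchVersion],
    requested: SketchVersion | None,
    new_default: Callable[[], SketchVersion],
) -> SketchVersion | None:
    """Choose the schema version for a newly created registry.

    supported   -- what currentVersions() would return
    requested   -- the configured schema_version, or None
    new_default -- plays the role of _newDefaultSchemaVersion
    """
    if not supported:
        # Inferred: the extension does not need its version saved or checked.
        return None
    if len(supported) == 1:
        # Single supported version: nothing to choose.
        return supported[0]
    if requested is not None:
        # Multiple versions and an explicit request: honour the request.
        return requested
    # Multiple versions and no request: defer to the subclass hook.
    return new_default()


v1 = SketchVersion(1, 0, 0)
v2 = SketchVersion(2, 0, 0)
chosen = pick_new_schema_version([v1, v2], None, lambda: v2)
```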

abstract classmethod currentVersions() -> list[lsst.daf.butler.registry.interfaces._versioning.VersionTuple]

Return schema version(s) supported by this extension class.

Returns:
version : list [VersionTuple]

Schema versions for this extension. Empty list is returned if an extension does not require its version to be saved or checked.

classmethod extensionName() -> str

Return full name of the extension.

This name should match the name defined in registry configuration. It is also stored in registry attributes. Default implementation returns full class name.

Returns:
name : str

Full extension name.

abstract fetch_summaries(collections: Iterable[CollectionRecord], dataset_types: Iterable[DatasetType] | None = None) -> Mapping[Any, CollectionSummary]

Fetch collection summaries given their names and dataset types.

Parameters:
collections : Iterable [CollectionRecord]

Collection records to query.

dataset_types : Iterable [DatasetType] or None

Dataset types to include in the returned summaries. If None, all dataset types are included.

Returns:
summaries : Mapping [Any, CollectionSummary]

Collection summaries indexed by collection record key. This mapping will also contain all nested non-chained collections of the chained collections.
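
To illustrate which keys a caller might expect in the returned mapping, the sketch below expands chained collections into their nested non-chained members. It uses plain collection names where the real mapping uses collection record keys, and the chains dictionary and summary_keys helper are hypothetical; this is not how a concrete manager computes its result.

```python
from __future__ import annotations


def summary_keys(requested: list[str], chains: dict[str, list[str]]) -> list[str]:
    """Return the requested collections plus all nested non-chained ones.

    chains maps each chained collection to its ordered children; any
    collection that is not a key of chains is treated as non-chained.
    """
    keys: list[str] = []

    def add(name: str) -> None:
        if name not in keys:
            keys.append(name)

    def descend(name: str, path: frozenset[str]) -> None:
        for child in chains.get(name, ()):
            if child in path:
                continue  # defensive guard against a cyclic chain
            if child in chains:
                descend(child, path | {child})  # nested chain: keep expanding
            else:
                add(child)  # non-chained leaf gets its own entry

    for name in requested:
        add(name)
        descend(name, frozenset({name}))
    return keys


chains = {
    "defaults": ["runs/a", "calib-chain"],
    "calib-chain": ["calib/1", "calib/2"],
}
keys = summary_keys(["defaults"], chains)
```

Here keys holds the requested chain followed by the three non-chained collections reachable from it.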

abstract find(name: str) -> DatasetRecordStorage | None

Return an object that provides access to the records associated with the given DatasetType name, if one exists.

Parameters:
name : str

Name of the dataset type.

Returns:
records : DatasetRecordStorage or None

The object representing the records for the given dataset type, or None if there are no records for that dataset type.

Notes

Dataset types registered by another client of the same repository since the last call to initialize or refresh may not be found.
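
The staleness described in this note can be shown with a toy model. ToyManager is a hypothetical class that caches dataset type names from a shared set standing in for the repository; it only illustrates why refresh may be needed before find sees a type registered elsewhere.

```python
from __future__ import annotations


class ToyManager:
    """Toy stand-in for a manager that caches dataset type names."""

    def __init__(self, shared_db: set[str]) -> None:
        self._db = shared_db  # plays the role of the shared repository
        self._known: set[str] = set()
        self.refresh()

    def refresh(self) -> None:
        # Re-read everything registered so far, including by other clients.
        self._known = set(self._db)

    def find(self, name: str) -> str | None:
        # Consults only the local cache, so the answer can be stale.
        return name if name in self._known else None


shared: set[str] = set()
client = ToyManager(shared)

shared.add("raw")            # another client registers a dataset type
stale = client.find("raw")   # not visible yet: the cache predates it
client.refresh()
fresh = client.find("raw")   # visible after refresh
```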

abstract getCollectionSummary(collection: CollectionRecord) -> CollectionSummary

Return a summary for the given collection.

Parameters:
collection : CollectionRecord

Record describing the collection for which a summary is to be retrieved.

Returns:
summary : CollectionSummary

Summary of the dataset types and governor dimension values in this collection.

abstract getDatasetRef(id: UUID) -> DatasetRef | None

Return a DatasetRef for the given dataset primary key value.

Parameters:
id : DatasetId

Primary key value for the dataset.

Returns:
ref : DatasetRef or None

Object representing the dataset, or None if no dataset with the given primary key value exists in this layer.

abstract classmethod getIdColumnType() -> type

Return type used for columns storing dataset IDs.

This type is used for columns storing DatasetRef.id values, usually a type subclass provided by SQLAlchemy.

Returns:
dtype : type

Type used for dataset identification in database.

abstract ingest_date_dtype() -> type

Return type of the ingest_date column.

abstract classmethod initialize(db: Database, context: StaticTablesContext, *, collections: CollectionManager, dimensions: DimensionRecordStorageManager, caching_context: CachingContext, registry_schema_version: VersionTuple | None = None) -> DatasetRecordStorageManager

Construct an instance of the manager.

Parameters:
db : Database

Interface to the underlying database engine and namespace.

context : StaticTablesContext

Context object obtained from Database.declareStaticTables; used to declare any tables that should always be present.

collections : CollectionManager

Manager object for the collections in this Registry.

dimensions : DimensionRecordStorageManager

Manager object for the dimensions in this Registry.

caching_context : CachingContext

Object controlling caching of information returned by managers.

registry_schema_version : VersionTuple or None

Schema version of this extension as defined in registry.

Returns:
manager : DatasetRecordStorageManager

An instance of a concrete DatasetRecordStorageManager subclass.

newSchemaVersion() -> VersionTuple | None

Return schema version for newly created registry.

Returns:
version : VersionTuple or None

Schema version created by this extension. None is returned if an extension does not require its version to be saved or checked.

Notes

Extension classes that support multiple schema versions need to override the _newDefaultSchemaVersion method.

abstract refresh() -> None

Ensure all other operations on this manager are aware of any dataset types that may have been registered by other clients since it was initialized or last refreshed.

abstract register(datasetType: DatasetType) -> bool

Ensure that this Registry can hold records for the given DatasetType, creating new tables as necessary.

Parameters:
datasetType : DatasetType

Dataset type for which a table should be created (as necessary) and an associated DatasetRecordStorage returned.

Returns:
inserted : bool

True if the dataset type did not exist in the registry before.

Notes

This operation may not be invoked within a Database.transaction context.
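
The meaning of the inserted return value can be sketched with a toy registration function. ensure_registered and the dictionary of known types are hypothetical; a concrete manager would create tables where this sketch only stores a tuple of dimension names.

```python
from __future__ import annotations


def ensure_registered(
    known: dict[str, tuple[str, ...]],
    name: str,
    dimensions: tuple[str, ...],
) -> bool:
    """Make sure a dataset type is known; report whether it was new."""
    if name in known:
        return False  # already present: nothing is created
    known[name] = dimensions  # where a real manager would create tables
    return True


known: dict[str, tuple[str, ...]] = {}
first = ensure_registered(known, "raw", ("instrument", "exposure", "detector"))
second = ensure_registered(known, "raw", ("instrument", "exposure", "detector"))
```

Calling it twice with the same definition leaves one entry, with only the first call reporting an insertion.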

abstract remove(name: str) -> None

Remove the dataset type.

Parameters:
name : str

Name of the dataset type.

abstract resolve_wildcard(expression: Any, components: bool | None = False, missing: list[str] | None = None, explicit_only: bool = False, components_deprecated: bool = True) -> dict[lsst.daf.butler._dataset_type.DatasetType, list[str | None]]

Resolve a dataset type wildcard expression.

Parameters:
expression : Any

Expression to resolve. Will be passed to DatasetTypeWildcard.from_expression.

components : bool, optional

If True, apply all expression patterns to component dataset type names as well. If False, never apply patterns to components. If None, apply patterns to components only if their parent datasets were not matched by the expression. Fully-specified component datasets (str or DatasetType instances) are always included.

missing : list of str, optional

String dataset type names that were explicitly given (i.e. not regular expression patterns) but not found will be appended to this list, if it is provided.

explicit_only : bool, optional

If True, require explicit DatasetType instances or str names, with re.Pattern instances deprecated and ... prohibited.

components_deprecated : bool, optional

If True, this is a context in which component dataset support is deprecated. This will result in a deprecation warning when components=True or components=None and a component dataset is matched. In the future this will become an error.

Returns:
dataset_types : dict [ DatasetType, list [ None, str ] ]

A mapping with resolved dataset types as keys and lists of matched component names as values, where None indicates the parent composite dataset type was matched.
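
The shape of the returned mapping and the role of the missing list can be sketched as follows. Both helpers are hypothetical, plain strings stand in for DatasetType keys, and resolve_explicit handles only explicitly named parent dataset types, so it is far simpler than a real resolver.

```python
from __future__ import annotations


def resolve_explicit(
    names: list[str],
    registered: set[str],
    missing: list[str] | None = None,
) -> dict[str, list[str | None]]:
    """Resolve explicit names, collecting unknown ones in missing."""
    resolved: dict[str, list[str | None]] = {}
    for name in names:
        if name in registered:
            resolved[name] = [None]  # None marks the parent dataset type itself
        elif missing is not None:
            missing.append(name)  # appended only when a list was provided
    return resolved


def expand_matches(resolved: dict[str, list[str | None]]) -> list[str]:
    """Flatten a resolved mapping into full dataset type names."""
    names: list[str] = []
    for parent, components in resolved.items():
        for component in components:
            # Component dataset types are named "<parent>.<component>".
            names.append(parent if component is None else f"{parent}.{component}")
    return names


not_found: list[str] = []
resolved = resolve_explicit(["raw", "typo"], {"raw", "calexp"}, missing=not_found)
expanded = expand_matches({"calexp": [None, "wcs"], "raw": [None]})
```

After these calls not_found contains the unknown name, and expanded lists the parent types together with the one matched component.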

abstract classmethod supportsIdGenerationMode(mode: DatasetIdGenEnum) -> bool

Test whether the given dataset ID generation mode is supported by insert.

Parameters:
mode : DatasetIdGenEnum

Enum value for the mode to test.

Returns:
supported : bool

Whether the given mode is supported.