DimensionRecordStorageManager

class lsst.daf.butler.registry.interfaces.DimensionRecordStorageManager(*, universe: DimensionUniverse, registry_schema_version: VersionTuple | None = None)

Bases: VersionedExtension

An interface for managing the dimension records in a Registry.

DimensionRecordStorageManager primarily serves as a container and factory for DimensionRecordStorage instances, each of which provides access to the records for a different DimensionElement.

Parameters:
universe : DimensionUniverse

Universe of all dimensions and dimension elements known to the Registry.

registry_schema_version : VersionTuple or None, optional

Version of registry schema.

Notes

In a multi-layer Registry, many dimension elements will only have records in one layer (often the base layer). The union of the records across all layers forms the logical table for the full Registry.

Methods Summary

clone(db)

Make an independent copy of this manager instance bound to a new Database instance.

fetch_cache_dict()

Return a dict that can back a DimensionRecordSet.

fetch_one(element_name, data_id, cache)

Retrieve a single record from storage.

initialize(db, context, *, universe[, ...])

Construct an instance of the manager.

insert(element, *records[, replace, ...])

Insert one or more records into storage.

join(element_name, target, join, context)

Join this dimension element's records to a relation.

load_dimension_group(key)

Retrieve a DimensionGroup that was previously saved in the database.

make_joins_builder(element, fields)

Make a direct_query_driver.SqlJoinsBuilder that represents a dimension element table.

make_spatial_join_relation(element1, ...[, ...])

Create a relation that represents the spatial join between two dimension elements.

process_query_overlaps(dimensions, ...)

Process a query's WHERE predicate and dimensions to handle spatial and temporal overlaps.

save_dimension_group(group)

Save a DimensionGroup definition to the database, allowing it to be retrieved later via the returned key.

sync(record[, update])

Synchronize a record with the database, inserting it only if it does not exist and comparing values if it does.

Methods Documentation

abstract clone(db: Database) → DimensionRecordStorageManager

Make an independent copy of this manager instance bound to a new Database instance.

Parameters:
db : Database

New Database object to use when instantiating the manager.

Returns:
instance : DimensionRecordStorageManager

New manager instance with the same configuration as this instance, but bound to a new Database object.

fetch_cache_dict() → dict[str, lsst.daf.butler.dimensions._record_set.DimensionRecordSet]

Return a dict that can back a DimensionRecordSet.

This method is intended as the fetch callback argument to DimensionRecordCache, in contexts where direct SQL queries are possible.

abstract fetch_one(element_name: str, data_id: DataCoordinate, cache: DimensionRecordCache) → DimensionRecord | None

Retrieve a single record from storage.

Parameters:
element_name : str

Name of the dimension element for the record to fetch.

data_id : DataCoordinate

Data ID of the record to fetch. Implied dimensions do not need to be present.

cache : DimensionRecordCache

Cache to look in first.

Returns:
record : DimensionRecord or None

Fetched record, or None if there was no match for the given data ID.
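The cache-first lookup described above can be illustrated with a small sketch. Everything in it is a hypothetical stand-in: ToyManager, make_key, plain dicts for the data ID and cache, and strings for records are assumptions made for illustration, not the lsst.daf.butler types or the real implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


def make_key(element_name: str, data_id: dict) -> tuple:
    """Build a hashable lookup key; dimension names are sorted for stability."""
    return (element_name, tuple(sorted(data_id.items())))


@dataclass
class ToyManager:
    """Hypothetical stand-in for a concrete manager; not the real API."""

    storage: dict = field(default_factory=dict)  # plays the role of the database
    storage_lookups: int = 0  # counts lookups that went past the cache

    def fetch_one(self, element_name: str, data_id: dict, cache: dict) -> Optional[str]:
        key = make_key(element_name, data_id)
        # Look in the cache first.
        if key in cache:
            return cache[key]
        # Fall back to storage; a miss yields None rather than an exception.
        self.storage_lookups += 1
        record = self.storage.get(key)
        if record is not None:
            cache[key] = record
        return record


manager = ToyManager()
data_id = {"instrument": "LSSTCam", "detector": 42}  # illustrative values
manager.storage[make_key("detector", data_id)] = "record for detector 42"

cache: dict = {}
first = manager.fetch_one("detector", data_id, cache)   # goes to storage
second = manager.fetch_one("detector", data_id, cache)  # served from the cache
missing = manager.fetch_one("detector", {"instrument": "LSSTCam", "detector": 999}, cache)
```

Whether a real implementation populates the cache on a miss is not stated in this reference; the sketch does so only to make the second call observable as a cache hit.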

abstract classmethod initialize(db: Database, context: StaticTablesContext, *, universe: DimensionUniverse, registry_schema_version: VersionTuple | None = None) → DimensionRecordStorageManager

Construct an instance of the manager.

Parameters:
db : Database

Interface to the underlying database engine and namespace.

context : StaticTablesContext

Context object obtained from Database.declareStaticTables; used to declare any tables that should always be present in a layer implemented with this manager.

universe : DimensionUniverse

Universe graph containing dimensions known to this Registry.

registry_schema_version : VersionTuple or None

Schema version of this extension as defined in registry.

Returns:
manager : DimensionRecordStorageManager

An instance of a concrete DimensionRecordStorageManager subclass.

abstract insert(element: DimensionElement, *records: DimensionRecord, replace: bool = False, skip_existing: bool = False) → None

Insert one or more records into storage.

Parameters:
element : DimensionElement

Dimension element that provides the definition for records.

*records : DimensionRecord

One or more instances of the DimensionRecord subclass for the given element.

replace : bool, optional

If True (False is default), replace existing records in the database if there is a conflict.

skip_existing : bool, optional

If True (False is default), skip insertion if a record with the same primary key values already exists.

Raises:
TypeError

Raised if the element does not support record insertion.

sqlalchemy.exc.IntegrityError

Raised if one or more records violate database integrity constraints.
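The difference between replace, skip_existing, and the default behavior can be sketched with the standard-library sqlite3 module. This is an analogy under stated assumptions, not the manager's implementation: the detector table, the free function insert, and the rule that the two flags are mutually exclusive are all invented here, and sqlite3.IntegrityError stands in for sqlalchemy.exc.IntegrityError.

```python
import sqlite3


def insert(conn, *records, replace=False, skip_existing=False):
    """Toy analogue of the documented insert semantics (hypothetical)."""
    if replace and skip_existing:
        # Assumption of this sketch only; the reference does not say this.
        raise ValueError("replace and skip_existing are mutually exclusive here")
    if replace:
        verb = "INSERT OR REPLACE"  # overwrite rows with a conflicting key
    elif skip_existing:
        verb = "INSERT OR IGNORE"   # silently keep the existing row
    else:
        verb = "INSERT"             # conflicts raise IntegrityError
    with conn:  # commit on success, roll back on error
        conn.executemany(f"{verb} INTO detector (id, name) VALUES (?, ?)", records)


def name_of(conn, detector_id):
    row = conn.execute("SELECT name FROM detector WHERE id = ?", (detector_id,)).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detector (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

insert(conn, (1, "S00"), (2, "S01"))
insert(conn, (1, "ignored"), skip_existing=True)
name_after_skip = name_of(conn, 1)

insert(conn, (1, "S00_new"), replace=True)
name_after_replace = name_of(conn, 1)

try:
    insert(conn, (2, "duplicate"))
    conflict_raised = False
except sqlite3.IntegrityError:
    conflict_raised = True
```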

abstract join(element_name: str, target: Relation, join: Join, context: queries.SqlQueryContext) → Relation

Join this dimension element’s records to a relation.

Parameters:
element_name : str

Name of the dimension element whose relation should be joined in.

target : Relation

Existing relation to join to. Implementations may require that this relation already include dimension key columns for this dimension element and assume that dataset or spatial join relations that might provide these will be included in the relation tree first.

join : Join

Join operation to use when the implementation is an actual join. When a true join is being simulated by other relation operations, this object’s min_columns and max_columns should still be respected.

context : queries.SqlQueryContext

Object that manages relation engines and database-side state (e.g. temporary tables) for the query.

Returns:
joined : Relation

New relation that includes this element’s dimension key and record columns, as well as all columns in target, with rows constrained to those for which this element’s dimension key values exist in the registry and rows already exist in target.

abstract load_dimension_group(key: int) → DimensionGroup

Retrieve a DimensionGroup that was previously saved in the database.

Parameters:
key : int

Integer used as the unique key for this DimensionGroup in the database.

Returns:
dimensions : DimensionGroup

Retrieved dimensions.

Raises:
KeyError

Raised if the given key cannot be found in the database.

abstract make_joins_builder(element: DimensionElement, fields: Set[str]) → SqlJoinsBuilder

Make a direct_query_driver.SqlJoinsBuilder that represents a dimension element table.

Parameters:
element : DimensionElement

Dimension element the table corresponds to.

fields : Set[str]

Names of fields to make available in the builder. These can be any metadata or alternate key field in the element’s schema, including the special region and timespan fields. Dimension keys in the element’s schema are always included.

Returns:
builder : direct_query_driver.SqlJoinsBuilder

A query-construction object representing a table or subquery. This is guaranteed to have rows that are unique over dimension keys and to include all possible key values for this dimension, so joining in a dimension element table:

  • never introduces duplicates into the query’s result rows;

  • never restricts the query’s rows except to ensure required-implied relationships are followed.
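The two guarantees above follow from joining against a table that is unique on its key and covers every key value in use. A minimal sqlite3 sketch of that property, with hypothetical table names (detector, result_rows) and data invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension element table: one row per key value (unique over the key).
    CREATE TABLE detector (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO detector VALUES (1, 'S00'), (2, 'S01');

    -- Stand-in for the rows a query already has before the join.
    CREATE TABLE result_rows (exposure INTEGER, detector INTEGER);
    INSERT INTO result_rows VALUES (100, 1), (100, 2), (101, 1);
    """
)

rows_before = conn.execute("SELECT COUNT(*) FROM result_rows").fetchone()[0]
rows_after = conn.execute(
    "SELECT COUNT(*) FROM result_rows "
    "JOIN detector ON result_rows.detector = detector.id"
).fetchone()[0]
```

Because detector.id is unique and every detector value in result_rows has a matching row, the join neither duplicates nor drops rows; it only adds the name column.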

abstract make_spatial_join_relation(element1: str, element2: str, context: queries.SqlQueryContext, existing_relationships: Set[frozenset[str]] = frozenset({})) → tuple[Relation, bool]

Create a relation that represents the spatial join between two dimension elements.

Parameters:
element1 : str

Name of one of the elements participating in the join.

element2 : str

Name of the other element participating in the join.

context : queries.SqlQueryContext

Object that manages relation engines and database-side state (e.g. temporary tables) for the query.

existing_relationships : Set[frozenset[str]], optional

Relationships between dimensions that are already present in the relation the result will be joined to. Spatial join relations that duplicate these relationships will not be included in the result, which may cause an identity relation to be returned if a spatial relationship has already been established.

Returns:
relation : lsst.daf.relation.Relation

New relation that represents a spatial join between the two given elements. Guaranteed to have key columns for all required dimensions of both elements.

needs_refinement : bool

Whether the returned relation represents a conservative join that needs refinement via a native-iteration predicate.
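To illustrate what a conservative join followed by refinement means, here is a toy sketch. It assumes one-dimensional integer intervals as regions and a coarse grid of fixed-width cells as the join key; the names conservative_join, cells, and overlaps are hypothetical and do not reflect how the registry stores or compares regions.

```python
from itertools import product

CELL = 10  # width of a coarse cell; arbitrary for this sketch


def cells(interval):
    """Indices of coarse cells touched by a half-open interval [lo, hi)."""
    lo, hi = interval
    return set(range(lo // CELL, (hi - 1) // CELL + 1))


def conservative_join(left, right):
    """Pair up regions sharing a coarse cell; may contain false positives."""
    pairs = [
        (a, b)
        for (a, ra), (b, rb) in product(left.items(), right.items())
        if cells(ra) & cells(rb)
    ]
    needs_refinement = True  # coarse cells only approximate the regions
    return pairs, needs_refinement


def overlaps(r1, r2):
    """Exact overlap test, applied row by row after the coarse join."""
    return r1[0] < r2[1] and r2[0] < r1[1]


visits = {"v1": (0, 4), "v2": (12, 18)}
patches = {"p1": (6, 9), "p2": (17, 25)}

coarse, needs_refinement = conservative_join(visits, patches)
refined = [(v, p) for v, p in coarse if overlaps(visits[v], patches[p])]
```

Here the coarse join pairs v1 with p1 because both fall in cell 0, even though the intervals do not overlap; the exact predicate removes that pair.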

abstract process_query_overlaps(dimensions: DimensionGroup, predicate: Predicate, join_operands: Iterable[DimensionGroup], calibration_dataset_types: Set[str | AnyDatasetType]) → tuple[Predicate, SqlSelectBuilder, Postprocessing]

Process a query’s WHERE predicate and dimensions to handle spatial and temporal overlaps.

Parameters:
dimensions : dimensions.DimensionGroup

Full dimensions of all tables to be joined into the query (even if they are not included in the query results).

predicate : queries.tree.Predicate

Boolean column expression that may contain user-provided spatial and/or temporal overlaps intermixed with other constraints.

join_operands : Iterable[dimensions.DimensionGroup]

Dimensions of tables or subqueries that are already going to be joined into the query that may establish their own spatial or temporal relationships (e.g. a dataset search with both visit and patch dimensions).

calibration_dataset_types : Set[str or queries.tree.AnyDatasetType]

The names of dataset types that have been joined into the query via a search that includes at least one calibration collection.

Returns:
predicate : queries.tree.Predicate

A version of the given predicate that preserves the overall behavior of the filter while possibly rewriting overlap expressions that have been partially moved into builder as some combination of new nested predicates, joins, and postprocessing.

builder : direct_query_driver.SqlSelectBuilder

A query-construction helper object that includes any initial joins and postprocessing needed to handle overlap expressions extracted from the original predicate.

postprocessing : Postprocessing

Struct representing post-query processing to be done in Python.

Notes

Implementations must delegate to queries.overlaps.OverlapsVisitor (possibly by subclassing it) to ensure “automatic” spatial and temporal joins are added consistently by all query-construction implementations.

abstract save_dimension_group(group: DimensionGroup) → int

Save a DimensionGroup definition to the database, allowing it to be retrieved later via the returned key.

If this dimension group has already been saved, this method just returns the key already associated with it.

Parameters:
group : DimensionGroup

Set of dimensions to save.

Returns:
key : int

Integer used as the unique key for this DimensionGroup in the database.

Raises:
TransactionInterruption

Raised if this operation is invoked within a Database.transaction context.
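The contract shared by save_dimension_group and load_dimension_group (saving is idempotent, keys round-trip, unknown keys raise KeyError) can be sketched with an in-memory stand-in. ToyGroupStore and the use of frozenset for a dimension group are assumptions made for illustration; the transaction restriction is not modeled.

```python
class ToyGroupStore:
    """Hypothetical in-memory analogue of the save/load pair."""

    def __init__(self):
        self._by_key = {}    # key -> group
        self._by_group = {}  # group -> key

    def save_dimension_group(self, group: frozenset) -> int:
        # If the group was saved before, return the key it already has.
        if group in self._by_group:
            return self._by_group[group]
        key = len(self._by_key) + 1
        self._by_key[key] = group
        self._by_group[group] = key
        return key

    def load_dimension_group(self, key: int) -> frozenset:
        # Dict lookup raises KeyError for unknown keys, matching the contract.
        return self._by_key[key]


store = ToyGroupStore()
group = frozenset({"instrument", "visit", "detector"})

key = store.save_dimension_group(group)
key_again = store.save_dimension_group(group)  # same key, nothing new stored
loaded = store.load_dimension_group(key)

try:
    store.load_dimension_group(key + 1)
    unknown_key_raised = False
except KeyError:
    unknown_key_raised = True
```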

abstract sync(record: DimensionRecord, update: bool = False) → bool | dict[str, Any]

Synchronize a record with the database, inserting it only if it does not exist and comparing values if it does.

Parameters:
record : DimensionRecord

An instance of the DimensionRecord subclass for the element being synchronized.

update : bool, optional

If True (False is default), update the existing record in the database if there is a conflict.

Returns:
inserted_or_updated : bool or dict

True if a new row was inserted, False if no changes were needed, or a dict mapping updated column names to their old values if an update was performed (only possible if update=True).

Raises:
DatabaseConflictError

Raised if the record exists in the database (according to primary key lookup) but is inconsistent with the given one.

TypeError

Raised if the element does not support record synchronization.

sqlalchemy.exc.IntegrityError

Raised if one or more records violate database integrity constraints.
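The three possible return values of sync, plus the conflict case, can be sketched against a plain dict. The function below is a hypothetical analogue, not the real method: the table is a dict keyed by primary key, records are dicts of column values with identical columns, and ConflictError stands in for DatabaseConflictError.

```python
class ConflictError(Exception):
    """Stand-in for DatabaseConflictError in this sketch."""


def sync(table: dict, key, values: dict, update: bool = False):
    """Toy analogue of the documented sync semantics (hypothetical)."""
    existing = table.get(key)
    if existing is None:
        table[key] = dict(values)
        return True  # a new row was inserted
    # Columns whose stored value differs, mapped to the old value.
    changed = {name: existing[name] for name in values if existing[name] != values[name]}
    if not changed:
        return False  # record already matches; nothing to do
    if not update:
        raise ConflictError(f"record {key!r} differs in columns {sorted(changed)}")
    existing.update(values)
    return changed  # old values of the columns that were updated


table = {}
inserted = sync(table, 1, {"name": "S00", "purpose": "SCIENCE"})
unchanged = sync(table, 1, {"name": "S00", "purpose": "SCIENCE"})
updated = sync(table, 1, {"name": "S00", "purpose": "GUIDER"}, update=True)

try:
    sync(table, 1, {"name": "S99", "purpose": "GUIDER"})
    conflict_raised = False
except ConflictError:
    conflict_raised = True
```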