DatasetRegistryStorage

class lsst.daf.butler.registry.queries.DatasetRegistryStorage(connection: sqlalchemy.engine.base.Connection, universe: lsst.daf.butler.core.dimensions.universe.DimensionUniverse, tables: Mapping[str, sqlalchemy.sql.selectable.FromClause])

Bases: object

An object managing dataset and related tables in a Registry.

Parameters:
connection : sqlalchemy.engine.Connection

A SQLAlchemy connection object, typically shared with the Registry that will own the storage instances.

universe : DimensionUniverse

The set of all dimensions for which storage instances should be constructed.

tables : dict

A dictionary mapping table name to a sqlalchemy.sql.FromClause representing that table.

Notes

Future changes will convert this concrete class into a polymorphic hierarchy modeled after DimensionRecordStorage, with many more SqlRegistry method implementations delegating to it. Its interface may change significantly at the same time. At present, this functionality has been factored out of SqlRegistry (with a bit of duplication) to allow the initial QueryBuilder design and implementation to be more forward-looking.

Methods Summary

fetchDatasetTypes(datasetType, *, …) Retrieve DatasetType instances from the database matching an expression.
getDatasetSubquery(datasetType, *, …) Return a SQL expression that searches for a dataset of a particular type in one or more collections.

Methods Documentation

fetchDatasetTypes(datasetType: Union[lsst.daf.butler.core.datasets.type.DatasetType, str, lsst.daf.butler.registry.queries._datasets.Like, ellipsis] = Ellipsis, *, collections: Union[Sequence[Union[str, lsst.daf.butler.registry.queries._datasets.Like]], ellipsis] = Ellipsis, dataId: Optional[lsst.daf.butler.core.dimensions.coordinate.ExpandedDataCoordinate] = None) → List[lsst.daf.butler.core.datasets.type.DatasetType]

Retrieve DatasetType instances from the database matching an expression.

Parameters:
datasetType : str, Like, DatasetType, or ...

An expression indicating the dataset type(s) to fetch. If this is a true DatasetType instance, it will be returned directly without querying the database. If this is a str, the DatasetType matching that name will be returned if it exists. If it is a Like expression, dataset types whose names match the expression will be returned. The special value ... fetches all dataset types. If no dataset types match, an empty list is returned.

collections : sequence of str or Like, or ...

An expression indicating collections that may be used to limit the dataset types returned to only those that might have datasets in these collections. This is intended as an optimization for higher-level functionality; it may simply be ignored, and cannot be relied upon to filter the returned dataset types.

dataId : ExpandedDataCoordinate, optional

A data ID that may be used to limit the dataset types returned to only those with datasets matching the given data ID. This is intended as an optimization for higher-level functionality; it may simply be ignored, and cannot be relied upon to filter the returned dataset types.

Returns:
datasetTypes : list of DatasetType

All dataset types in the registry that match the given arguments.
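
The dispatch on the datasetType expression described above can be illustrated with a small stand-alone sketch. It does not use the daf_butler classes: DatasetTypeStub, LikeStub, like_to_regex, resolve_dataset_types and the REGISTERED mapping are all hypothetical stand-ins, and the sketch assumes that a Like pattern follows SQL LIKE conventions (% matches any run of characters, _ matches a single character). The real method queries the database rather than a dict.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DatasetTypeStub:
    """Hypothetical stand-in for DatasetType; only the name matters here."""
    name: str


@dataclass(frozen=True)
class LikeStub:
    """Hypothetical stand-in for Like, wrapping a SQL LIKE-style pattern."""
    pattern: str


def like_to_regex(pattern: str) -> "re.Pattern":
    """Translate a LIKE pattern (assumed: % = any run, _ = one character)."""
    parts = []
    for char in pattern:
        if char == "%":
            parts.append(".*")
        elif char == "_":
            parts.append(".")
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts))


def resolve_dataset_types(expression, registered: Dict[str, DatasetTypeStub]) -> List[DatasetTypeStub]:
    """Mimic the documented behavior for each kind of expression."""
    if isinstance(expression, DatasetTypeStub):
        # A true instance is returned directly, without any lookup.
        return [expression]
    if expression is ...:
        # The special value ... selects every registered dataset type.
        return list(registered.values())
    if isinstance(expression, str):
        # An exact name yields zero or one result.
        found = registered.get(expression)
        return [found] if found is not None else []
    if isinstance(expression, LikeStub):
        regex = like_to_regex(expression.pattern)
        return [dt for name, dt in registered.items() if regex.fullmatch(name)]
    raise TypeError(f"Unsupported expression: {expression!r}")


# Hypothetical registry contents, standing in for the database.
REGISTERED = {
    name: DatasetTypeStub(name)
    for name in ("raw", "calexp", "deepCoadd", "deepCoadd_calexp")
}
```

Because collections and dataId may be ignored, a caller that needs only dataset types with matching datasets should treat the returned list as a superset and apply its own check afterwards.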

getDatasetSubquery(datasetType: lsst.daf.butler.core.datasets.type.DatasetType, *, collections: Union[Sequence[Union[str, lsst.daf.butler.registry.queries._datasets.Like]], ellipsis], dataId: Optional[lsst.daf.butler.core.dimensions.coordinate.ExpandedDataCoordinate] = None, isResult: bool = True, addRank: bool = False) → sqlalchemy.sql.selectable.FromClause

Return a SQL expression that searches for a dataset of a particular type in one or more collections.

Parameters:
datasetType : DatasetType

Type of dataset to search for. Must be a true DatasetType; call fetchDatasetTypes first to expand an expression if desired.

collections : sequence of str or Like, or ...

An expression describing the collections in which to search for the datasets. ... indicates that all collections should be searched. Returned datasets are guaranteed to be from one of the given collections (unlike the behavior of the same argument in fetchDatasetTypes).

dataId : ExpandedDataCoordinate, optional

A data ID that may be used to limit the datasets returned to only those matching the given data ID. This is intended as an optimization for higher-level functionality; it may simply be ignored, and cannot be relied upon to filter the returned datasets.

isResult : bool, optional

If True (default), include the dataset_id column in the result columns of the query.

addRank : bool, optional

If True (default is False), also include a calculated column that ranks the collection in which the dataset was found (lower is better). Requires that all entries in collections be regular strings, so there is a clear search order. Ignored if isResult is False.

Returns:
subquery : sqlalchemy.sql.FromClause

Named subquery or table that can be used in the FROM clause of a SELECT query. Has at least columns for all dimensions in datasetType.dimensions; may have additional columns depending on the values of isResult and addRank.
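
To make the role of the rank column concrete, the following sketch builds a comparable subquery by hand with the standard-library sqlite3 module. It is an illustration under stated assumptions, not the Registry implementation: the dataset and dataset_collection tables, their columns, and the helpers ranked_subquery and find_first are hypothetical, and the real method returns a SQLAlchemy FromClause rather than a SQL string. The rank is the position of the collection in the ordered search list, which is why every entry must be a regular string.

```python
import sqlite3

# Hypothetical, simplified schema; the actual Registry tables differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset (dataset_id INTEGER PRIMARY KEY, dataset_type TEXT, visit INTEGER);
    CREATE TABLE dataset_collection (dataset_id INTEGER, collection TEXT);
    INSERT INTO dataset VALUES (1, 'calexp', 100), (2, 'calexp', 100), (3, 'calexp', 101);
    INSERT INTO dataset_collection VALUES (1, 'run/a'), (2, 'run/b'), (3, 'run/b');
""")


def ranked_subquery(dataset_type, collections):
    """Return (sql, params) for a subquery with a collection rank column."""
    if not collections:
        raise ValueError("At least one collection is required to define a rank.")
    # One WHEN branch per collection; earlier collections get lower ranks.
    whens = " ".join(f"WHEN ? THEN {rank}" for rank in range(len(collections)))
    placeholders = ", ".join("?" for _ in collections)
    sql = f"""
        SELECT d.visit AS visit, d.dataset_id AS dataset_id,
               CASE dc.collection {whens} END AS rank
        FROM dataset AS d
        JOIN dataset_collection AS dc ON dc.dataset_id = d.dataset_id
        WHERE d.dataset_type = ? AND dc.collection IN ({placeholders})
    """
    # Parameters follow placeholder order: CASE branches, type, IN list.
    return sql, [*collections, dataset_type, *collections]


def find_first(conn, dataset_type, collections):
    """Pick, for each visit, the dataset from the best-ranked collection."""
    sql, params = ranked_subquery(dataset_type, collections)
    best = {}
    query = f"SELECT visit, dataset_id, rank FROM ({sql}) AS ds ORDER BY rank"
    for visit, dataset_id, _rank in conn.execute(query, params):
        # Rows arrive in rank order, so the first one seen per visit wins.
        best.setdefault(visit, dataset_id)
    return best
```

With this toy data, searching run/a before run/b selects dataset 1 for visit 100, while reversing the order selects dataset 2; visit 101 resolves to dataset 3 either way because it exists only in run/b.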