QueryBuilder

class lsst.daf.butler.registry.queries.QueryBuilder(summary: QuerySummary, *, collections: CollectionManager, dimensions: DimensionRecordStorageManager, datasets: DatasetRecordStorageManager)

Bases: object

A builder for potentially complex queries that join tables based on dimension relationships.

Parameters:
summary : QuerySummary

Struct organizing the dimensions involved in the query.

collections : CollectionManager

Manager object for collection tables.

dimensions : DimensionRecordStorageManager

Manager for storage backend objects that abstract access to dimension tables.

datasets : DatasetRecordStorageManager

Manager for storage backend objects that abstract access to dataset tables.

Methods Summary

finish() Finish constructing the query, returning a new Query instance.
finishJoin(table, joinOn) Complete a join on dimensions.
hasDimensionKey(dimension) Return True if the given dimension’s primary key column has been included in the query (possibly via a foreign key column on some other table).
joinDataset(datasetType, collections, *, …) Add a dataset search or constraint to the query.
joinDimensionElement(element) Add the table for a DimensionElement to the query.
joinTable(table, dimensions) Join an arbitrary table to the query via dimension relationships.
startJoin(table, dimensions, columnNames) Begin a join on dimensions.

Methods Documentation

finish() → lsst.daf.butler.registry.queries._query.Query

Finish constructing the query, returning a new Query instance.

This automatically joins any missing dimension element tables (according to the categorization of the QuerySummary the builder was constructed with).

This consumes the QueryBuilder; no other methods should be called after this one.

Returns:
query : Query

A Query object that can be executed (possibly multiple times with different bind parameter values) and used to interpret result rows.
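The sketch below shows where finish sits at the end of a build sequence. It is a minimal illustration, not code from the library: build_dataset_query is a hypothetical helper, and StubBuilder is a stand-in that only records calls, because a real QueryBuilder needs manager objects from a live registry. Only the method names and keyword arguments come from this page.

```python
def build_dataset_query(builder, dataset_type, collections):
    """Return a finished query, or None if it provably has no results.

    ``builder`` is assumed to be an already-constructed QueryBuilder
    (or anything with the same joinDataset/finish methods).
    """
    # joinDataset returns False when the dataset type and collections
    # cannot match anything, so the full query would be empty.
    if not builder.joinDataset(dataset_type, collections, isResult=True):
        return None
    # finish() joins any missing dimension element tables and consumes
    # the builder; nothing else may be called on it afterwards.
    return builder.finish()


class StubBuilder:
    """Stand-in that records calls; NOT the real QueryBuilder."""

    def __init__(self, any_records):
        self.any_records = any_records
        self.calls = []

    def joinDataset(self, datasetType, collections, *, isResult=True, addRank=False):
        self.calls.append("joinDataset")
        return self.any_records

    def finish(self):
        self.calls.append("finish")
        return "query"  # placeholder for a Query instance


empty = StubBuilder(any_records=False)
found = StubBuilder(any_records=True)
empty_result = build_dataset_query(empty, "raw", ["some/collection"])
found_result = build_dataset_query(found, "raw", ["some/collection"])
```

Returning early when joinDataset reports no records avoids building and executing a query that cannot yield rows.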

finishJoin(table: sqlalchemy.sql.selectable.FromClause, joinOn: List[sqlalchemy.sql.elements.ColumnElement]) → None

Complete a join on dimensions.

Must be preceded by a call to startJoin.

Parameters:
table : sqlalchemy.sql.FromClause

SQLAlchemy object representing the logical table (which may be a join or subquery expression) to be joined. Must be the same object passed to startJoin.

joinOn : list of sqlalchemy.sql.ColumnElement

Sequence of boolean expressions that should be combined with AND to form (part of) the ON expression for this JOIN. Should include at least the elements of the list returned by startJoin.

hasDimensionKey(dimension: lsst.daf.butler.core.dimensions.elements.Dimension) → bool

Return True if the given dimension’s primary key column has been included in the query (possibly via a foreign key column on some other table).

joinDataset(datasetType: lsst.daf.butler.core.datasets.type.DatasetType, collections: Any, *, isResult: bool = True, addRank: bool = False) → bool

Add a dataset search or constraint to the query.

Unlike other QueryBuilder join methods, this must be called directly to search for datasets of a particular type or constrain the query results based on the existence of datasets. However, all dimensions used to identify the dataset type must have already been included in QuerySummary.requested when initializing the QueryBuilder.

Parameters:
datasetType : DatasetType

The type of datasets to search for.

collections : Any

An expression that fully or partially identifies the collections to search for datasets, such as a str, re.Pattern, or iterable thereof. The Ellipsis literal (...) can be used to return all collections. See Collection expressions for more information.

isResult : bool, optional

If True (default), include the dataset ID column in the result columns of the query, allowing complete DatasetRef instances to be produced from the query results for this dataset type. If False, the existence of datasets of this type is used only to constrain the data IDs returned by the query.

addRank : bool, optional

If True (False is default), also include a calculated column that ranks the collection in which the dataset was found (lower is better). Requires that all entries in collections be regular strings, so there is a clear search order. Ignored if isResult is False.

Returns:
anyRecords : bool

If True, joining the dataset table was successful and the query should proceed. If False, we were able to determine (from the combination of datasetType and collections) that there would be no results joined in from this dataset, and hence (due to the inner join that would normally be present), the full query will return no results.
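To make the addRank description concrete, the sketch below models the ranking in plain Python rather than SQL. It assumes that a collection's rank is its position in the ordered collections sequence, which is one natural reading of "lower is better" and of the requirement for a clear search order; the helper name pick_best and all sample values are hypothetical.

```python
def pick_best(found, collections):
    """Keep, for each data ID, the dataset from the best-ranked collection.

    found: iterable of (data_id, collection_name, dataset_id) rows.
    collections: ordered list of collection names (search order).
    """
    # Assumption: rank is the index in the search order, so 0 is best.
    rank_of = {name: rank for rank, name in enumerate(collections)}
    best = {}
    for data_id, collection, dataset_id in found:
        rank = rank_of[collection]
        if data_id not in best or rank < best[data_id][0]:
            best[data_id] = (rank, dataset_id)
    return {data_id: dataset_id for data_id, (_, dataset_id) in best.items()}


search_order = ["u/alice/run2", "shared/run1"]  # hypothetical names
rows = [
    (("HSC", 42), "shared/run1", 101),
    (("HSC", 42), "u/alice/run2", 202),
    (("HSC", 43), "shared/run1", 103),
]
best_datasets = pick_best(rows, search_order)
# best_datasets == {("HSC", 42): 202, ("HSC", 43): 103}
```

A pattern such as a regular expression has no inherent position in the search order, which is consistent with the requirement that all entries be regular strings when addRank is True.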

joinDimensionElement(element: lsst.daf.butler.core.dimensions.elements.DimensionElement) → None

Add the table for a DimensionElement to the query.

This automatically joins the element table to all other tables in the query with which it is related, via both dimension keys and spatial and temporal relationships.

External calls to this method should rarely be necessary; finish will automatically call it if the DimensionElement has been identified as one that must be included.

Parameters:
element : DimensionElement

Element for which a table should be added. The element must be associated with a database table (see DimensionElement.hasTable).

joinTable(table: sqlalchemy.sql.selectable.FromClause, dimensions: lsst.daf.butler.core.named.NamedValueSet[lsst.daf.butler.core.dimensions.elements.Dimension]) → None

Join an arbitrary table to the query via dimension relationships.

External calls to this method should only be necessary for tables whose records represent neither dataset nor dimension elements (i.e. extensions to the standard Registry schema).

Parameters:
table : sqlalchemy.sql.FromClause

SQLAlchemy object representing the logical table (which may be a join or subquery expression) to be joined.

dimensions : iterable of Dimension

The dimensions that relate this table to others that may be in the query. The table must have columns with the names of the dimensions.

startJoin(table: sqlalchemy.sql.selectable.FromClause, dimensions: Iterable[lsst.daf.butler.core.dimensions.elements.Dimension], columnNames: Iterable[str]) → List[sqlalchemy.sql.elements.ColumnElement]

Begin a join on dimensions.

Must be followed by a call to finishJoin.

Parameters:
table : sqlalchemy.sql.FromClause

SQLAlchemy object representing the logical table (which may be a join or subquery expression) to be joined.

dimensions : iterable of Dimension

The dimensions that relate this table to others that may be in the query. The table must have columns with the names of the dimensions.

columnNames : iterable of str

Names of the columns that correspond to dimension key values; must have the same length and order as dimensions, so that the two can be iterated together with zip.

Returns:
joinOn : list of sqlalchemy.sql.ColumnElement

Sequence of boolean expressions that should be combined with AND to form (part of) the ON expression for this JOIN.
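The two-step protocol (startJoin returns the dimension-based conditions, the caller may add more, and finishJoin applies them) can be illustrated with a toy model. The class below is a sketch under stated assumptions, not the library's implementation: tables and conditions are plain strings instead of SQLAlchemy objects, and the table names, column names, and bookkeeping are hypothetical.

```python
class ToyJoinBuilder:
    """Toy model of a two-step join; NOT the real QueryBuilder."""

    def __init__(self):
        self.from_clause = None  # FROM clause text built so far
        self._key_columns = {}   # dimension name -> first column providing it

    def startJoin(self, table, dimensions, columnNames):
        joinOn = []
        # dimensions and columnNames are walked in parallel.
        for dimension, column in zip(dimensions, columnNames):
            qualified = f"{table}.{column}"
            if dimension in self._key_columns:
                # Dimension already in the query: equate the two columns.
                joinOn.append(f"{self._key_columns[dimension]} = {qualified}")
            else:
                self._key_columns[dimension] = qualified
        return joinOn

    def finishJoin(self, table, joinOn):
        if self.from_clause is None:
            self.from_clause = table
        elif joinOn:
            # All conditions are combined with AND into the ON expression.
            self.from_clause += f" JOIN {table} ON " + " AND ".join(joinOn)
        else:
            self.from_clause += f" CROSS JOIN {table}"


builder = ToyJoinBuilder()
first = builder.startJoin("visit", ["instrument", "visit"], ["instrument", "id"])
builder.finishJoin("visit", first)

join_on = builder.startJoin("exposure", ["instrument"], ["instrument"])
join_on.append("exposure.id > 0")  # caller-supplied extra condition
builder.finishJoin("exposure", join_on)
print(builder.from_clause)
```

Splitting the join into two calls lets the caller extend the list returned by startJoin before the join is applied, which is why finishJoin expects at least those elements in joinOn.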