Query

class lsst.daf.butler.registry.queries.Query(*, connection: sqlalchemy.engine.base.Connection, sql: sqlalchemy.sql.selectable.FromClause, summary: lsst.daf.butler.registry.queries._structs.QuerySummary, columns: lsst.daf.butler.registry.queries._structs.QueryColumns, parameters: lsst.daf.butler.registry.queries._structs.QueryParameters)

Bases: object

A wrapper for a SQLAlchemy query that knows how to re-bind parameters and transform result rows into data IDs and dataset references.

A Query should almost always be constructed by a call to QueryBuilder.finish; calling the constructor directly makes it difficult to maintain the invariants between arguments (see the documentation for QueryColumns and QueryParameters for more information).

Parameters:

connection : sqlalchemy.engine.Connection: Connection used to execute the query.
sql : sqlalchemy.sql.FromClause: A complete SELECT query, including at least SELECT, FROM, and WHERE clauses.
summary : QuerySummary: Struct that organizes the dimensions involved in the query.
columns : QueryColumns: Columns that are referenced in the query in any clause.
parameters : QueryParameters: Bind parameters for the query.

Notes

SQLAlchemy appears in the public interface of Query, not just in its implementation, because hiding it would require writing wrappers for the sqlalchemy.engine.RowProxy and sqlalchemy.engine.ResultProxy classes, which are themselves generic wrappers for lower-level Python DBAPI classes. Such an extra layer would add its own computational overhead, whereas the only reason we would seriously consider dropping SQLAlchemy here in the future would be to reduce computational overhead.

Methods Summary

`bind`(dataId)	Return a dictionary that can be passed to a SQLAlchemy execute method to provide WHERE clause information at execution time rather than construction time.
`execute`(dataId)	Execute the query.
`extractDataId`(row, *, graph)	Extract a data ID from a result row.
`extractDatasetRef`(row, datasetType, dataId)	Extract a `DatasetRef` from a result row.
`predicate`(region)	Return a callable that can perform extra Python-side filtering of query results.

Methods Documentation

bind(dataId: lsst.daf.butler.core.dimensions.coordinate.ExpandedDataCoordinate) → Dict[str, Any]

Return a dictionary that can be passed to a SQLAlchemy execute method to provide WHERE clause information at execution time rather than construction time.

Most callers should call Query.execute directly instead; when called with a data ID, that calls bind internally.

Parameters:

dataId : ExpandedDataCoordinate: Data ID to transform into bind parameters. This must identify all dimensions in `QuerySummary.given`, and must have the same primary key values for all dimensions also identified by `QuerySummary.dataId`.

Returns:

parameters : dict: Dictionary that can be passed as the second argument (with `self.sql` as the first argument) to SQLAlchemy execute methods.

Notes

Calling bind does not automatically update the callable returned by predicate with the given data ID’s region (if it has one). That must be done manually by passing the region when calling predicate.

execute(dataId: Optional[lsst.daf.butler.core.dimensions.coordinate.ExpandedDataCoordinate] = None) → sqlalchemy.engine.result.ResultProxy

Execute the query.

This may be called multiple times with different arguments to apply different bind parameter values without repeating the work of constructing the query.

Parameters:

dataId : ExpandedDataCoordinate, optional: Data ID to transform into bind parameters. This must identify all dimensions in `QuerySummary.given`, and must have the same primary key values for all dimensions also identified by `QuerySummary.dataId`. If not provided, `QuerySummary.dataId` must identify all dimensions in `QuerySummary.given`.

Returns:

results : sqlalchemy.engine.ResultProxy: Object representing the query results; see SQLAlchemy documentation for more information.
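The re-binding pattern described for bind and execute can be illustrated with a minimal sketch. It does not use the Query class or SQLAlchemy; it assumes the standard-library sqlite3 module as a stand-in, and the table, the column names, and the bind and execute helper functions are all hypothetical. The point is only that the SQL text is built once and run repeatedly with different parameter dictionaries.

```python
import sqlite3

# Stand-in for the SQL held by a Query: built once, with named bind parameters.
SQL = (
    "SELECT visit, detector FROM obs "
    "WHERE instrument = :instrument AND visit = :visit "
    "ORDER BY detector"
)


def bind(data_id):
    """Hypothetical analogue of Query.bind: map data ID values to bind parameters."""
    return {"instrument": data_id["instrument"], "visit": data_id["visit"]}


def execute(connection, data_id):
    """Hypothetical analogue of Query.execute: bind, then run the prebuilt SQL."""
    return connection.execute(SQL, bind(data_id)).fetchall()


# Hypothetical in-memory table used only for this illustration.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE obs (instrument TEXT, visit INTEGER, detector INTEGER)")
connection.executemany(
    "INSERT INTO obs VALUES (?, ?, ?)",
    [("CamA", 1, 10), ("CamA", 1, 11), ("CamA", 2, 10), ("CamB", 1, 10)],
)

# The same SQL string is reused; only the bind parameter values change.
rows_visit_1 = execute(connection, {"instrument": "CamA", "visit": 1})
rows_visit_2 = execute(connection, {"instrument": "CamA", "visit": 2})
```

In the real class, the data ID would be an ExpandedDataCoordinate and the result a ResultProxy; plain dictionaries and lists of tuples are used here only to keep the sketch self-contained.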

extractDataId(row: sqlalchemy.engine.result.RowProxy, *, graph: Optional[lsst.daf.butler.core.dimensions.graph.DimensionGraph] = None) → lsst.daf.butler.core.dimensions.coordinate.DataCoordinate

Extract a data ID from a result row.

Parameters:

row : sqlalchemy.engine.RowProxy: A result row from a SQLAlchemy SELECT query.
graph : DimensionGraph, optional: The dimensions the returned data ID should identify. If not provided, this will be all dimensions in `QuerySummary.requested`.

Returns:

dataId : DataCoordinate: A minimal data ID that identifies the requested dimensions but includes no metadata or implied dimensions.

extractDatasetRef(row: sqlalchemy.engine.result.RowProxy, datasetType: lsst.daf.butler.core.datasets.type.DatasetType, dataId: Optional[lsst.daf.butler.core.dimensions.coordinate.DataCoordinate] = None) → Tuple[lsst.daf.butler.core.datasets.ref.DatasetRef, Optional[int]]

Extract a DatasetRef from a result row.

Parameters:

row : sqlalchemy.engine.RowProxy: A result row from a SQLAlchemy SELECT query.
datasetType : DatasetType: Type of the dataset to extract. Must have been included in the Query via a call to QueryBuilder.joinDataset with isResult=True, or otherwise included in QueryColumns.datasets.
dataId : DataCoordinate: Data ID to attach to the DatasetRef. A minimal (i.e. base class) DataCoordinate is constructed from row if None.

Returns:

ref : DatasetRef: Reference to the dataset; guaranteed to have DatasetRef.id not None.
rank : int or None: Integer index of the collection in which this dataset was found, within the sequence of collections passed when constructing the query. None if QueryBuilder.joinDataset was called with addRank=False.
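One plausible use of the returned rank, sketched below, is to keep only the dataset found in the earliest collection for each data ID. This usage is an assumption for illustration, not something the class is documented to do. The sketch replaces DatasetRef with a hypothetical (dataset id, data ID string) tuple, and best_per_data_id is a hypothetical helper.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical stand-ins: a "ref" is (dataset id, data ID key), and each
# entry pairs it with the rank that extractDatasetRef would return.
Ref = Tuple[int, str]
RefAndRank = Tuple[Ref, Optional[int]]


def best_per_data_id(pairs: Iterable[RefAndRank]) -> Dict[str, Ref]:
    """Keep, for each data ID, the ref with the lowest rank."""
    best: Dict[str, Tuple[Ref, int]] = {}
    for ref, rank in pairs:
        if rank is None:
            # Ranks are only available when they were requested for the query.
            raise ValueError("rank is None; cannot order refs by collection")
        _, data_id = ref
        if data_id not in best or rank < best[data_id][1]:
            best[data_id] = (ref, rank)
    return {data_id: ref for data_id, (ref, _) in best.items()}


# Two candidates for "visit=1"; the one with rank 0 comes from the
# collection listed first and is therefore kept.
pairs = [((101, "visit=1"), 1), ((102, "visit=1"), 0), ((103, "visit=2"), 2)]
chosen = best_per_data_id(pairs)
```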

predicate(region: Optional[lsst.sphgeom.region.Region] = None) → Callable[[sqlalchemy.engine.result.RowProxy], bool]

Return a callable that can perform extra Python-side filtering of query results.

To get the expected results from a query, the returned predicate must be used to ignore rows for which it returns False; this permits the QueryBuilder implementation to move logic from the database to Python without changing the public interface.

Parameters:

region : sphgeom.Region, optional: A region that any result-row regions must overlap in order for the predicate to return `True`. If not provided, this will be the region in `QuerySummary.dataId`, if there is one.

Returns:

func : Callable: A callable that takes a single `sqlalchemy.engine.RowProxy` argument and returns `bool`.
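The filtering contract for predicate can be sketched as follows: the caller iterates over result rows and discards any row for which the callable returns False. This is a simplified illustration, not the library's implementation. Regions are modeled as hypothetical 1-D intervals rather than sphgeom regions, rows are plain dictionaries, and make_predicate is a hypothetical factory that, unlike the real method, applies no filtering when no region is given.

```python
from typing import Callable, Mapping, Optional, Tuple

Interval = Tuple[float, float]  # hypothetical 1-D stand-in for a sphgeom region


def make_predicate(region: Optional[Interval] = None) -> Callable[[Mapping], bool]:
    """Return a row filter; in this sketch, with no region every row passes."""
    if region is None:
        return lambda row: True
    lo, hi = region

    def overlaps(row: Mapping) -> bool:
        # Two closed intervals overlap when each starts before the other ends.
        row_lo, row_hi = row["region"]
        return row_lo <= hi and lo <= row_hi

    return overlaps


# Hypothetical result rows, each carrying its own region.
rows = [
    {"visit": 1, "region": (0.0, 1.0)},
    {"visit": 2, "region": (2.0, 3.0)},
    {"visit": 3, "region": (0.5, 2.5)},
]

# Rows for which the predicate returns False are ignored by the caller.
predicate = make_predicate((0.8, 1.5))
kept = [row["visit"] for row in rows if predicate(row)]
```

Keeping the filter as a separate callable means the database query can stay coarse (for example, matching on an index) while the exact overlap test runs in Python.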
