SqlQueryContext

class lsst.daf.butler.registry.queries.SqlQueryContext(db: Database, column_types: ColumnTypeInfo, row_chunk_size: int = 1000)

Bases: QueryContext

An implementation of sql.QueryContext for SqlRegistry.

Parameters:
db : Database

Object that abstracts the database engine.

column_types : ColumnTypeInfo

Information about column types that can vary with registry configuration.

row_chunk_size : int, optional

Number of rows to insert into temporary tables at once. If this is lower than db.get_constant_rows_max() it will be set to that value.
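The chunked-insert behavior described for row_chunk_size can be sketched with plain sqlite3; the table, column names, and helper function here are illustrative stand-ins, not part of the Butler API:

```python
import sqlite3

# Hypothetical stand-in for inserting rows into a temporary table in
# chunks, as SqlQueryContext does with row_chunk_size.  The tmp_ids
# table and insert_in_chunks helper are illustrative only.
def insert_in_chunks(conn, rows, chunk_size=1000):
    """Insert (id, name) tuples into a temp table chunk_size rows at a time."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS tmp_ids (id INTEGER, name TEXT)")
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        conn.executemany("INSERT INTO tmp_ids VALUES (?, ?)", chunk)
    return conn.execute("SELECT COUNT(*) FROM tmp_ids").fetchone()[0]

conn = sqlite3.connect(":memory:")
rows = [(i, f"row{i}") for i in range(2500)]
print(insert_in_chunks(conn, rows, chunk_size=1000))  # 2500 (chunks of 1000, 1000, 500)
```

Bounding each INSERT statement keeps the number of bound parameters per statement within database limits, which is why a floor such as db.get_constant_rows_max() applies.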

Attributes Summary

column_types

Information about column types that depend on registry configuration (ColumnTypeInfo).

is_open

Whether the context manager has been entered (bool).

preferred_engine

Return the relation engine that this context prefers to execute operations in (lsst.daf.relation.Engine).

Methods Summary

any(relation, *[, execute, exact])

Check whether this relation has any result rows at all.

count(relation, *[, exact, discard])

Count the number of rows in the given relation.

drop_invalidated_postprocessing(relation, ...)

Return a modified relation tree without iteration-engine operations that require columns that are not in the given set.

fetch_iterable(relation)

Execute the given relation and return its rows as an iterable of mappings.

make_data_coordinate_predicate(data_coordinate)

Return a Predicate that represents a data ID constraint.

make_data_id_relation(data_ids, dimension_names)

Transform a set of data IDs into a relation.

make_initial_relation([relation])

Construct an initial relation suitable for this context.

make_spatial_region_overlap_predicate(lhs, rhs)

Return a Predicate that tests whether two regions overlap.

make_spatial_region_skypix_predicate(...)

Return a Predicate that tests whether a skypix key column's region overlaps a region literal.

make_timespan_overlap_predicate(tag, timespan)

Return a Predicate that tests whether a timespan column overlaps a timespan literal.

materialize(target, name)

Hook for implementing materialization operations.

process(relation)

Main entry point for processing a relation tree.

restore_columns(relation, columns_required)

Return a modified relation tree that attempts to restore columns that were dropped by a projection operation.

strip_postprocessing(relation)

Return a modified relation tree without any iteration-engine operations and any transfer to the iteration engine at the end.

transfer(source, destination, materialize_as)

Hook for implementing transfers between engines.

Attributes Documentation

column_types

Information about column types that depend on registry configuration (ColumnTypeInfo).

is_open

Whether the context manager has been entered (bool).

preferred_engine

Return the relation engine that this context prefers to execute operations in (lsst.daf.relation.Engine).

Methods Documentation

any(relation: Relation, *, execute: bool = True, exact: bool = True) → bool

Check whether this relation has any result rows at all.

Parameters:
relation : Relation

Relation to be checked.

execute : bool, optional

If True, execute at least a LIMIT 1 query if it cannot be determined prior to execution that the query would return no rows.

exact : bool, optional

If True, run the full query and perform post-query filtering if needed, until at least one result row is found. If False, the returned result does not account for post-query filtering, and hence may be True even when all result rows would be filtered out.

Returns:
any_rows : bool

Whether the relation has any rows, or (if exact=False) whether it may have any rows.

Raises:
RuntimeError

Raised if an exact check was requested and could not be obtained without executing the query.
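The execute=True behavior amounts to a LIMIT 1 probe: stop as soon as one matching row is found. A minimal sketch with sqlite3, where the table and any_rows helper are stand-ins rather than Butler API:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

def any_rows(conn, where_clause):
    # A LIMIT 1 probe: the database can stop scanning as soon as one
    # matching row is found, mirroring the execute=True behavior above.
    cur = conn.execute(f"SELECT 1 FROM t WHERE {where_clause} LIMIT 1")
    return cur.fetchone() is not None

print(any_rows(conn, "x > 3"))   # True
print(any_rows(conn, "x > 10"))  # False
```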

count(relation: Relation, *, exact: bool = True, discard: bool = False) → int

Count the number of rows in the given relation.

Parameters:
relation : Relation

Relation whose rows are to be counted.

exact : bool, optional

If True (default), return the exact number of rows. If False, returning an upper bound is permitted if it can be done much more efficiently, e.g. by running a SQL SELECT COUNT(*) query but ignoring client-side filtering that would otherwise take place.

discard : bool, optional

If True, compute the exact count even if it would require running the full query and then throwing away the result rows after counting them. If False, this is an error, as the user would usually be better off executing the query first to fetch its rows into a new query (or passing exact=False). Ignored if exact=False.

Returns:
n_rows : int

Number of rows in the relation, or an upper bound. This includes duplicates, if there are any.

Raises:
RuntimeError

Raised if an exact count was requested and could not be obtained without fetching and discarding rows.
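The exact vs. upper-bound distinction can be sketched in plain Python; the postprocess filter here is a hypothetical stand-in for the client-side filtering (e.g. region-overlap refinement) that a SQL COUNT(*) cannot see:

```python
# exact=True runs the full pipeline, including client-side filtering;
# exact=False may return a cheap upper bound (the SQL-side count only).
rows = list(range(100))
postprocess = lambda r: r % 7 == 0   # stand-in for client-side filtering

def count(rows, *, exact=True):
    if exact:
        # Fetch every row and apply the filter: exact, but may be costly.
        return sum(1 for r in rows if postprocess(r))
    # Ignore client-side filtering: fast, but only an upper bound.
    return len(rows)

exact_n = count(rows, exact=True)
bound_n = count(rows, exact=False)
print(exact_n, bound_n)  # 15 100
```

This is why exact=True together with discard=False raises when client-side filtering is present: producing the exact count would mean fetching and throwing away every row.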

drop_invalidated_postprocessing(relation: Relation, new_columns: Set[ColumnTag]) → Relation

Return a modified relation tree without iteration-engine operations that require columns that are not in the given set.

Parameters:
relation : Relation

Original relation tree.

new_columns : Set [ ColumnTag ]

The only columns that postprocessing operations may require if they are to be retained in the returned relation tree.

Returns:
modified : Relation

Modified relation tree with postprocessing operations incompatible with new_columns removed.

fetch_iterable(relation: Relation) → RowIterable

Execute the given relation and return its rows as an iterable of mappings.

Parameters:
relation : Relation

Relation representing the query to execute.

Returns:
rows : RowIterable

An iterable over rows, with each row a mapping from ColumnTag to column value.

Notes

A transfer to iteration_engine will be added to the root (end) of the relation tree if the root is not already in the iteration engine.

Any transfers from other engines or persistent materializations will be handled by delegating to process before execution in the iteration engine.

To ensure the result is a multi-pass Python collection in memory, ensure the given tree ends with a materialization operation in the iteration engine.
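The single-pass nature of the returned iterable, and why a materialization is needed for multi-pass access, can be sketched with a plain generator; string keys stand in for ColumnTag instances and the row values are invented for illustration:

```python
def fetch_rows():
    # Simulates a single-pass RowIterable: each row is a mapping from
    # column tag (plain strings here) to column value.
    yield {"instrument": "HSC", "visit": 903334}
    yield {"instrument": "HSC", "visit": 903336}

rows = fetch_rows()
first_pass = [row["visit"] for row in rows]
second_pass = [row["visit"] for row in rows]  # generator exhausted: empty
print(first_pass)   # [903334, 903336]
print(second_pass)  # []

# Materializing (here, just list()) yields a multi-pass collection.
rows = list(fetch_rows())
```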

make_data_coordinate_predicate(data_coordinate: DataCoordinate, full: bool | None = None) → Predicate

Return a Predicate that represents a data ID constraint.

Parameters:
data_coordinate : DataCoordinate

Data ID whose keys and values should be transformed to predicate equality constraints.

full : bool, optional

Whether to include constraints on implied dimensions (default is to include implied dimensions if data_coordinate has them).

Returns:
predicate : lsst.daf.relation.column_expressions.Predicate

New predicate.
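The transformation of a data ID into an AND of per-key equality constraints can be sketched in plain Python; a dict stands in for DataCoordinate and a closure for the Predicate:

```python
# A data ID maps dimension names to values; the resulting predicate is
# the AND of one equality constraint per key.  Dicts and closures are
# illustrative stand-ins for DataCoordinate and Predicate.
def data_coordinate_predicate(data_id):
    def predicate(row):
        return all(row.get(key) == value for key, value in data_id.items())
    return predicate

pred = data_coordinate_predicate({"instrument": "HSC", "visit": 903334})
print(pred({"instrument": "HSC", "visit": 903334, "detector": 42}))  # True
print(pred({"instrument": "HSC", "visit": 903336, "detector": 42}))  # False
```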

make_data_id_relation(data_ids: Set[DataCoordinate], dimension_names: Iterable[str]) → Relation

Transform a set of data IDs into a relation.

Parameters:
data_ids : Set [ DataCoordinate ]

Data IDs to upload. All must have at least the dimensions given, but may have more.

dimension_names : Iterable [ str ]

Names of dimensions that will be the columns of the relation.

Returns:
relation : Relation

Relation in the iteration engine.
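The projection of data IDs down to the requested dimension columns (with relational deduplication) can be sketched in plain Python; dicts stand in for DataCoordinate and a sorted list of tuples for the relation's rows:

```python
# Project each data ID onto the requested dimension columns and
# deduplicate, mirroring relational semantics.  The input data IDs may
# carry extra dimensions (here, detector) that are simply dropped.
def data_id_relation(data_ids, dimension_names):
    dims = tuple(dimension_names)
    return sorted({tuple(d[name] for name in dims) for d in data_ids})

data_ids = [
    {"instrument": "HSC", "visit": 903334, "detector": 1},
    {"instrument": "HSC", "visit": 903334, "detector": 2},
]
print(data_id_relation(data_ids, ["instrument", "visit"]))
# [('HSC', 903334)]
```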

make_initial_relation(relation: Relation | None = None) → Relation

Construct an initial relation suitable for this context.

Parameters:
relation : Relation, optional

A user-provided initial relation. Must be included by implementations when provided, but may be modified (e.g. by adding a transfer to a new engine) when needed to satisfy the context’s invariants.

make_spatial_region_overlap_predicate(lhs: ColumnExpression, rhs: ColumnExpression) → Predicate

Return a Predicate that tests whether two regions overlap.

This operation only works with iteration engines; it is usually used to refine the result of a join or constraint on SkyPixDimension columns in SQL.

Parameters:
lhs : lsst.daf.relation.column_expressions.ColumnExpression

Expression for one spatial region.

rhs : lsst.daf.relation.column_expressions.ColumnExpression

Expression for the other spatial region.

Returns:
predicate : lsst.daf.relation.column_expressions.Predicate

New predicate with lhs and rhs as its required columns.

make_spatial_region_skypix_predicate(dimension: SkyPixDimension, region: Region) → Predicate

Return a Predicate that tests whether a skypix key column's region overlaps a region literal.

This operation only works with iteration engines; it is usually used to refine the result of a join on SkyPixDimension columns in SQL.

Parameters:
dimension : SkyPixDimension

Dimension whose key column is being constrained.

region : lsst.sphgeom.Region

Spatial region constraint to test against.

Returns:
predicate : lsst.daf.relation.column_expressions.Predicate

New predicate with the DimensionKeyColumn associated with dimension as its only required column.

make_timespan_overlap_predicate(tag: ColumnTag, timespan: Timespan) → Predicate

Return a Predicate that tests whether a timespan column overlaps a timespan literal.

Parameters:
tag : ColumnTag

Identifier for a timespan column.

timespan : Timespan

Timespan literal selected rows must overlap.

Returns:
predicate : lsst.daf.relation.column_expressions.Predicate

New predicate.
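The overlap test itself can be sketched in plain Python, assuming the half-open [begin, end) convention that Butler's Timespan uses: two intervals overlap iff each begins before the other ends. Tuples of plain ints stand in for Timespan here:

```python
# Half-open [begin, end) intervals overlap iff each begins before the
# other ends.  A (begin, end) tuple stands in for Timespan, and a plain
# string for the ColumnTag.
def timespan_overlap_predicate(column, literal):
    lit_begin, lit_end = literal
    def predicate(row):
        begin, end = row[column]
        return begin < lit_end and lit_begin < end
    return predicate

pred = timespan_overlap_predicate("timespan", (10, 20))
print(pred({"timespan": (15, 30)}))  # True  (overlaps [10, 20))
print(pred({"timespan": (20, 30)}))  # False (touches only at the bound)
```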

materialize(target: Relation, name: str) → Any

Hook for implementing materialization operations.

This method should be called only by the Processor base class.

Parameters:
target : Relation

Relation to be materialized. Any upstream Transfer operations in this tree are guaranteed to already have a payload attached (or some intervening relation does), so the relation’s own engine should be capable of processing it on its own.

name : str

The name of the Materialization operation, to be used as needed in the engine-specific payload.

Returns:
payload

Payload for this relation that should be cached.

process(relation: Relation) → Relation

Main entry point for processing a relation tree.

Parameters:
relation : Relation

Root of the relation tree to process. On return, relations that hold a Materialization relation will have a new payload attached, if they did not have one already.

Returns:
processed : Relation

A version of the relation tree in which any relation with a Transfer operation has a copy of the original Transfer that has a payload attached.

restore_columns(relation: Relation, columns_required: Set[ColumnTag]) → tuple[lsst.daf.relation._relation.Relation, set[lsst.daf.relation._columns._tag.ColumnTag]]

Return a modified relation tree that attempts to restore columns that were dropped by a projection operation.

Parameters:
relation : Relation

Original relation tree.

columns_required : Set [ ColumnTag ]

Columns to attempt to restore. May include columns already present in the relation.

Returns:
modified : Relation

Possibly-modified tree with any projections that had dropped requested columns replaced by projections that do not drop these columns. Care is taken to ensure that join common columns and deduplication behavior is preserved, even if that means some columns are not restored.

columns_found : set [ ColumnTag ]

Columns from those requested that are present in modified.
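The two-part return value can be sketched abstractly: widen any projection that dropped a requested column, and report which requested columns the modified tree can actually provide. Plain sets stand in for the relation tree and its projections; the function and names are illustrative only:

```python
# Sketch: widen projections to retain requested columns, and report
# which requested columns are actually available upstream.  Sets of
# strings stand in for relation columns and ColumnTag instances.
def restore_columns_sketch(base_columns, projections, columns_required):
    available = columns_required & base_columns   # restorable columns
    widened = [proj | available for proj in projections]
    return widened, available

base = {"instrument", "visit", "detector", "band"}
projections = [{"instrument", "visit"}]           # had dropped 'band'
modified, found = restore_columns_sketch(base, projections, {"band", "tract"})
print(sorted(modified[0]))  # ['band', 'instrument', 'visit']
print(sorted(found))        # ['band']  ('tract' is not available upstream)
```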

strip_postprocessing(relation: Relation) → tuple[lsst.daf.relation._relation.Relation, list[lsst.daf.relation._unary_operation.UnaryOperation]]

Return a modified relation tree without any iteration-engine operations and any transfer to the iteration engine at the end.

Parameters:
relation : Relation

Original relation tree.

Returns:
modified : Relation

Stripped relation tree, with engine != iteration_engine.

stripped : list [ UnaryOperation ]

Operations that were stripped, in the same order they should be reapplied (with transfer=True, preferred_engine=iteration_engine) to recover the original tree.

transfer(source: Relation, destination: Engine, materialize_as: str | None) → Any

Hook for implementing transfers between engines.

This method should be called only by the Processor base class.

Parameters:
source : Relation

Relation to be transferred. Any upstream Transfer operations in this tree are guaranteed to already have a payload attached (or some intervening relation does), so the relation’s own engine should be capable of processing it on its own.

destination : Engine

Engine the relation is being transferred to.

materialize_as : str or None

If not None, the name of a Materialization operation that immediately follows the transfer being implemented, in which case the returned payload should be appropriate for caching with the Materialization.

Returns:
payload

Payload for this relation in the destination engine.