SqlQueryContext¶
- class lsst.daf.butler.registry.queries.SqlQueryContext(db: Database, column_types: ColumnTypeInfo, row_chunk_size: int = 1000)¶
Bases:
QueryContext
An implementation of sql.QueryContext for SqlRegistry.
- Parameters:
- db
Database
Object that abstracts the database engine.
- column_types
ColumnTypeInfo
Information about column types that can vary with registry configuration.
- row_chunk_size
int, optional
Number of rows to insert into temporary tables at once. If this is lower than db.get_constant_rows_max(), it will be set to that value.
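The row_chunk_size rule above can be sketched in isolation. This is only an illustration under stated assumptions, not the library's implementation: effective_chunk_size and chunked are hypothetical helpers, and constant_rows_max stands in for whatever db.get_constant_rows_max() returns.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator
from itertools import islice


def effective_chunk_size(row_chunk_size: int, constant_rows_max: int) -> int:
    """Raise row_chunk_size to constant_rows_max when it is lower (hypothetical helper)."""
    return max(row_chunk_size, constant_rows_max)


def chunked(rows: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield lists of at most ``size`` rows, as one might when filling a temporary table."""
    iterator = iter(rows)
    while chunk := list(islice(iterator, size)):
        yield chunk


# Hypothetical values: a requested chunk size of 2 and a backend minimum of 3.
rows = [{"visit": n} for n in range(5)]
size = effective_chunk_size(row_chunk_size=2, constant_rows_max=3)
chunks = list(chunked(rows, size))
```

With these made-up numbers the requested size of 2 is raised to 3, so the five rows are split into two chunks.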
Attributes Summary
column_types
Information about column types that depend on registry configuration (ColumnTypeInfo).
is_open
Whether the context manager has been entered (bool).
preferred_engine
Return the relation engine that this context prefers to execute operations in (lsst.daf.relation.Engine).
Methods Summary
any(relation, *[, execute, exact])
Check whether this relation has any result rows at all.
count(relation, *[, exact, discard])
Count the number of rows in the given relation.
drop_invalidated_postprocessing(relation, ...)
Return a modified relation tree without iteration-engine operations that require columns that are not in the given set.
fetch_iterable(relation)
Execute the given relation and return its rows as an iterable of mappings.
make_data_coordinate_predicate(data_coordinate)
Return a Predicate that represents a data ID constraint.
make_data_id_relation(data_ids, dimension_names)
Transform a set of data IDs into a relation.
make_initial_relation([relation])
Construct an initial relation suitable for this context.
make_spatial_region_overlap_predicate(lhs, rhs)
Return a Predicate that tests whether two regions overlap.
make_spatial_region_skypix_predicate(dimension, region)
Return a Predicate that tests whether two region columns overlap.
make_timespan_overlap_predicate(tag, timespan)
Return a Predicate that tests whether a timespan column overlaps a timespan literal.
materialize(target, name)
Hook for implementing materialization operations.
process(relation)
Main entry point for processing a relation tree.
restore_columns(relation, columns_required)
Return a modified relation tree that attempts to restore columns that were dropped by a projection operation.
strip_postprocessing(relation)
Return a modified relation tree without any iteration-engine operations and any transfer to the iteration engine at the end.
transfer(source, destination, materialize_as)
Hook for implementing transfers between engines.
Attributes Documentation
- column_types¶
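Information about column types that depend on registry configuration (ColumnTypeInfo).
- is_open¶
Whether the context manager has been entered (bool).
- preferred_engine¶
Return the relation engine that this context prefers to execute operations in (lsst.daf.relation.Engine).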
Methods Documentation
- any(relation: Relation, *, execute: bool = True, exact: bool = True) bool ¶
Check whether this relation has any result rows at all.
- Parameters:
- relation
Relation
Relation to be checked.
- execute
bool, optional
If True, execute at least a LIMIT 1 query if it cannot be determined prior to execution that the query would return no rows.
- exact
bool, optional
If True, run the full query and perform post-query filtering if needed, until at least one result row is found. If False, the returned result does not account for post-query filtering, and hence may be True even when all result rows would be filtered out.
- Returns:
- any_rows
bool
Whether the relation has any rows, or whether it may have any rows if exact=False.
- Raises:
- RuntimeError
Raised if an exact check was requested and could not be obtained without executing the query.
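The interplay of execute and exact can be illustrated with a toy model. The sketch below does not use the real Relation or SqlQueryContext classes: any_rows is a hypothetical function, sql_rows stands in for the rows the database would return, and postprocess stands in for post-query (client-side) filtering.

```python
from __future__ import annotations

from collections.abc import Callable, Sequence

Row = dict[str, int]


def any_rows(
    sql_rows: Sequence[Row],
    postprocess: Callable[[Row], bool] | None = None,
    *,
    execute: bool = True,
    exact: bool = True,
) -> bool:
    """Toy model of the documented semantics (hypothetical, not the library code)."""
    if execute and exact:
        # Fetch rows and apply the client-side filter until one row survives.
        return any(postprocess is None or postprocess(row) for row in sql_rows)
    if execute:
        # Analogue of a LIMIT 1 query: the client-side filter is ignored.
        return len(sql_rows) > 0
    if exact:
        raise RuntimeError("An exact answer requires executing the query.")
    # Neither executed nor exact: the relation may have rows.
    return True


rows = [{"x": 1}, {"x": 2}]


def reject_all(row: Row) -> bool:
    return False


exact_answer = any_rows(rows, reject_all)                # filter removes every row
inexact_answer = any_rows(rows, reject_all, exact=False)  # filter is not applied
```

In this toy setup the exact check reports no rows, while the inexact check still reports True because the post-query filter is never applied.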
- count(relation: Relation, *, exact: bool = True, discard: bool = False) int ¶
Count the number of rows in the given relation.
- Parameters:
- relation
Relation
Relation whose rows are to be counted.
- exact
bool, optional
If True (default), return the exact number of rows. If False, returning an upper bound is permitted if it can be done much more efficiently, e.g. by running a SQL SELECT COUNT(*) query but ignoring client-side filtering that would otherwise take place.
- discard
bool, optional
If True, compute the exact count even if it would require running the full query and then throwing away the result rows after counting them. If False, this is an error, as the user would usually be better off executing the query first to fetch its rows into a new query (or passing exact=False). Ignored if exact=False.
- Returns:
- n_rows
int
Number of rows in the relation, or an upper bound. This includes duplicates, if there are any.
- Raises:
- RuntimeError
Raised if an exact count was requested and could not be obtained without fetching and discarding rows.
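The exact and discard flags can be illustrated with the same kind of toy model. Again this is a sketch under assumptions, not the library's code: count_rows is hypothetical, sql_rows stands in for what SELECT COUNT(*) would count, and postprocess stands in for client-side filtering.

```python
from __future__ import annotations

from collections.abc import Callable, Sequence

Row = dict[str, int]


def count_rows(
    sql_rows: Sequence[Row],
    postprocess: Callable[[Row], bool] | None = None,
    *,
    exact: bool = True,
    discard: bool = False,
) -> int:
    """Toy model of the documented semantics (hypothetical, not the library code)."""
    if postprocess is None or not exact:
        # Either the SQL count is already exact, or an upper bound is acceptable.
        return len(sql_rows)
    if not discard:
        raise RuntimeError("Exact count requires fetching and discarding rows.")
    # Run the "full query", count the surviving rows, and throw them away.
    return sum(1 for row in sql_rows if postprocess(row))


def is_odd(row: Row) -> bool:
    return row["x"] % 2 == 1


rows = [{"x": 1}, {"x": 2}, {"x": 3}]
upper_bound = count_rows(rows, is_odd, exact=False)
exact_count = count_rows(rows, is_odd, discard=True)
```

Calling count_rows(rows, is_odd) with the defaults raises RuntimeError in this model, mirroring the documented error case.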
- drop_invalidated_postprocessing(relation: Relation, new_columns: Set[ColumnTag]) Relation ¶
Return a modified relation tree without iteration-engine operations that require columns that are not in the given set.
- Parameters:
- relation
Relation
Original relation tree.
- new_columns
Set[ColumnTag]
The only columns that postprocessing operations may require if they are to be retained in the returned relation tree.
- Returns:
- modified
Relation
Modified relation tree with postprocessing operations incompatible with new_columns removed.
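The filtering rule can be sketched with plain Python sets. PostOp and drop_invalidated are hypothetical stand-ins, and a flat list of operations replaces the real relation tree, so this only illustrates the subset test implied by the description.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PostOp:
    """Hypothetical stand-in for an iteration-engine operation."""

    name: str
    columns_required: frozenset[str]


def drop_invalidated(ops: list[PostOp], new_columns: set[str]) -> list[PostOp]:
    """Keep only operations whose required columns are all in new_columns."""
    return [op for op in ops if op.columns_required <= new_columns]


ops = [
    PostOp("region_overlap", frozenset({"visit_region", "tract_region"})),
    PostOp("timespan_check", frozenset({"timespan"})),
]
kept = drop_invalidated(ops, {"timespan", "visit"})
```

Here only the operation that needs the timespan column survives, because the region columns are not in the allowed set.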
- fetch_iterable(relation: Relation) RowIterable ¶
Execute the given relation and return its rows as an iterable of mappings.
- Parameters:
- relation
Relation
Relation representing the query to execute.
- Returns:
- rows
RowIterable
An iterable over rows, with each row a mapping from ColumnTag to column value.
Notes
A transfer to iteration_engine will be added to the root (end) of the relation tree if the root is not already in the iteration engine.
Any transfers from other engines or persistent materializations will be handled by delegating to process before execution in the iteration engine.
To obtain a multi-pass Python collection in memory, make sure the given tree ends with a materialization operation in the iteration engine.
- make_data_coordinate_predicate(data_coordinate: DataCoordinate, full: bool | None = None) Predicate ¶
Return a Predicate that represents a data ID constraint.
- Parameters:
- data_coordinate
DataCoordinate
Data ID whose keys and values should be transformed to predicate equality constraints.
- full
bool, optional
Whether to include constraints on implied dimensions (default is to include implied dimensions if data_coordinate has them).
- Returns:
- predicate
lsst.daf.relation.column_expressions.Predicate
New predicate.
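The idea of turning data ID keys and values into equality constraints can be sketched with dictionaries. make_equality_predicate is a hypothetical helper that works on plain mappings rather than DataCoordinate and Predicate objects, and it does not model the full parameter or implied dimensions.

```python
from __future__ import annotations

from collections.abc import Callable, Mapping


def make_equality_predicate(
    data_id: Mapping[str, object],
) -> Callable[[Mapping[str, object]], bool]:
    """Build a row test that requires every data ID key to match (hypothetical)."""
    constraints = dict(data_id)

    def predicate(row: Mapping[str, object]) -> bool:
        # All key == value constraints are combined with logical AND.
        return all(
            key in row and row[key] == value for key, value in constraints.items()
        )

    return predicate


# "HypoCam" is a made-up instrument name used only for illustration.
matches = make_equality_predicate({"instrument": "HypoCam", "visit": 42})
rows = [
    {"instrument": "HypoCam", "visit": 42, "detector": 1},
    {"instrument": "HypoCam", "visit": 43, "detector": 1},
]
selected = [row for row in rows if matches(row)]
```

Only the first row is selected, since the second one fails the visit constraint.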
- make_data_id_relation(data_ids: Set[DataCoordinate], dimension_names: Iterable[str]) Relation ¶
Transform a set of data IDs into a relation.
- Parameters:
- data_ids
Set[DataCoordinate]
Data IDs to upload. All must have at least the dimensions given, but may have more.
- dimension_names
Iterable[str]
Names of dimensions that will be the columns of the relation.
- Returns:
- relation
Relation
Relation in the iteration engine.
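The projection of data IDs onto the requested dimensions can be sketched with plain dictionaries. data_ids_to_rows is a hypothetical helper; real data IDs are DataCoordinate objects and the result is a Relation, and the removal of duplicate rows after projection is an assumption of this sketch rather than documented behavior.

```python
from __future__ import annotations

from collections.abc import Iterable, Mapping


def data_ids_to_rows(
    data_ids: Iterable[Mapping[str, object]],
    dimension_names: Iterable[str],
) -> list[dict[str, object]]:
    """Project each data ID onto dimension_names, one row per distinct result."""
    names = tuple(dimension_names)
    # dict.fromkeys drops duplicates while keeping first-seen order (an assumption
    # of this sketch, not a statement about the library).
    unique = dict.fromkeys(
        tuple(data_id[name] for name in names) for data_id in data_ids
    )
    return [dict(zip(names, values)) for values in unique]


# "HypoCam" is a made-up instrument name; each data ID has more dimensions
# than the two that are requested as columns.
data_ids = [
    {"instrument": "HypoCam", "visit": 1, "detector": 1},
    {"instrument": "HypoCam", "visit": 1, "detector": 2},
    {"instrument": "HypoCam", "visit": 2, "detector": 1},
]
rows = data_ids_to_rows(data_ids, ["instrument", "visit"])
```

The extra detector dimension is allowed in the inputs but does not become a column of the result.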
- make_initial_relation(relation: Relation | None = None) Relation ¶
Construct an initial relation suitable for this context.
- Parameters:
- relation
Relation, optional
A user-provided initial relation. Must be included by implementations when provided, but may be modified (e.g. by adding a transfer to a new engine) when needed to satisfy the context’s invariants.
- make_spatial_region_overlap_predicate(lhs: ColumnExpression, rhs: ColumnExpression) Predicate ¶
Return a Predicate that tests whether two regions overlap.
This operation only works with iteration engines; it is usually used to refine the result of a join or constraint on SkyPixDimension columns in SQL.
- Parameters:
- lhs
lsst.daf.relation.column_expressions.ColumnExpression
Expression for one spatial region.
- rhs
lsst.daf.relation.column_expressions.ColumnExpression
Expression for the other spatial region.
- Returns:
- predicate
lsst.daf.relation.column_expressions.Predicate
New predicate with lhs and rhs as its required columns.
- make_spatial_region_skypix_predicate(dimension: SkyPixDimension, region: Region) Predicate ¶
Return a Predicate that tests whether two region columns overlap.
This operation only works with iteration engines; it is usually used to refine the result of a join on SkyPixDimension columns in SQL.
- Parameters:
- dimension
SkyPixDimension
Dimension whose key column is being constrained.
- region
lsst.sphgeom.Region
Spatial region constraint to test against.
- Returns:
- predicate
lsst.daf.relation.column_expressions.Predicate
New predicate with the DimensionKeyColumn associated with dimension as its only required column.
- make_timespan_overlap_predicate(tag: ColumnTag, timespan: Timespan) Predicate ¶
Return a Predicate that tests whether a timespan column overlaps a timespan literal.
- Parameters:
- tag
ColumnTag
Identifier for a timespan column.
- timespan
Timespan
Timespan literal selected rows must overlap.
- Returns:
- predicate
lsst.daf.relation.column_expressions.Predicate
New predicate.
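An overlap test between a timespan column and a literal can be sketched as follows. Span, overlaps and make_overlap_predicate are hypothetical stand-ins for Timespan and Predicate, integers replace real time values, and the half-open [begin, end) convention is an assumption of this sketch.

```python
from __future__ import annotations

from collections.abc import Callable, Mapping
from typing import NamedTuple


class Span(NamedTuple):
    """Hypothetical half-open interval [begin, end)."""

    begin: int
    end: int


def overlaps(a: Span, b: Span) -> bool:
    """Two half-open intervals overlap when each begins before the other ends."""
    return a.begin < b.end and b.begin < a.end


def make_overlap_predicate(
    column: str, literal: Span
) -> Callable[[Mapping[str, Span]], bool]:
    """Build a row test comparing the named span column against a literal."""

    def predicate(row: Mapping[str, Span]) -> bool:
        return overlaps(row[column], literal)

    return predicate


rows = [{"timespan": Span(0, 10)}, {"timespan": Span(10, 20)}]
in_window = make_overlap_predicate("timespan", Span(5, 10))
selected = [row for row in rows if in_window(row)]
```

Under the half-open assumption, the second row is excluded because it starts exactly where the literal ends.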
- materialize(target: Relation, name: str) Any ¶
Hook for implementing materialization operations.
This method should be called only by the Processor base class.
- Parameters:
- target
Relation
Relation to be materialized. Any upstream Transfer operations in this tree are guaranteed to already have a payload attached (or some intervening relation does), so the relation’s own engine should be capable of processing it on its own.
- name
str
The name of the Materialization operation, to be used as needed in the engine-specific payload.
- Returns:
- payload
Payload for this relation that should be cached.
- process(relation: Relation) Relation ¶
Main entry point for processing a relation tree.
- Parameters:
- relation
Relation
Root of the relation tree to process. On return, relations that hold a Materialization relation will have a new payload attached, if they did not have one already.
- Returns:
- processed
Relation
A version of the relation tree in which any relation with a Transfer operation has a copy of the original Transfer that has a payload attached.
- restore_columns(relation: Relation, columns_required: Set[ColumnTag]) tuple[Relation, set[ColumnTag]] ¶
Return a modified relation tree that attempts to restore columns that were dropped by a projection operation.
- Parameters:
- relation
Relation
Original relation tree.
- columns_required
Set[ColumnTag]
Columns to attempt to restore. May include columns already present in the relation.
- Returns:
- modified
Relation
Possibly-modified tree with any projections that had dropped requested columns replaced by projections that do not drop these columns. Care is taken to ensure that join common columns and deduplication behavior is preserved, even if that means some columns are not restored.
- columns_found
set[ColumnTag]
Columns from those requested that are present in modified.
- strip_postprocessing(relation: Relation) tuple[Relation, list[UnaryOperation]] ¶
Return a modified relation tree without any iteration-engine operations and any transfer to the iteration engine at the end.
- Parameters:
- relation
Relation
Original relation tree.
- Returns:
- modified
Relation
Stripped relation tree, with engine != iteration_engine.
- stripped
list[UnaryOperation]
Operations that were stripped, in the same order they should be reapplied (with transfer=True, preferred_engine=iteration_engine) to recover the original tree.
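The split into a stripped tree and a list of operations to reapply can be sketched with a linear chain of steps. strip_trailing is a hypothetical helper, each step is an (engine, operation) pair ordered from leaf to root, and the transfer into the iteration engine is modelled only implicitly by the change of engine name.

```python
from __future__ import annotations

Step = tuple[str, str]  # (engine name, operation name)


def strip_trailing(
    steps: list[Step], iteration_engine: str = "iteration"
) -> tuple[list[Step], list[str]]:
    """Split off the run of iteration-engine steps at the end of the chain."""
    cut = len(steps)
    while cut > 0 and steps[cut - 1][0] == iteration_engine:
        cut -= 1
    # The stripped names keep their original order, which is also the order
    # in which they would have to be reapplied.
    return steps[:cut], [name for _, name in steps[cut:]]


chain = [
    ("sql", "join"),
    ("sql", "selection"),
    ("iteration", "region_overlap"),
    ("iteration", "sort"),
]
modified, stripped = strip_trailing(chain)
```

Reapplying the stripped names in list order on top of the SQL-only chain recovers the original sequence in this model.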
- transfer(source: Relation, destination: Engine, materialize_as: str | None) Any ¶
Hook for implementing transfers between engines.
This method should be called only by the Processor base class.
- Parameters:
- source
Relation
Relation to be transferred. Any upstream Transfer operations in this tree are guaranteed to already have a payload attached (or some intervening relation does), so the relation’s own engine should be capable of processing it on its own.
- destination
Engine
Engine the relation is being transferred to.
- materialize_as
str or None
If not None, the name of a Materialization operation that immediately follows the transfer being implemented, in which case the returned payload should be appropriate for caching with the Materialization.
- Returns:
- payload
Payload for this relation in the destination engine.