SQLAlchemy query generation (lsst.daf.relation.sql
)¶
Overview¶
The Engine
provided by the lsst.daf.relation.sql
package translates Relation
trees to SQLAlchemy expressions.
It doesn’t actually execute those expressions, however; running a SQL query and fetching its results back to a Python client is really best considered a Transfer
between engines, with a SQL engine like this one as the source and an engine backed by in-memory Python objects (such as lsst.daf.relation.iteration
) as its destination.
Multi-engine relation trees can be executed by a Processor
subclass.
This engine flattens back-to-back Join
and Chain
operations, and it actually reorders any combination of adjacent Join
, Selection
, Calculation
, and Projection
into a single SELECT...FROM...WHERE
statement via the commutation rules for these operations.
It assumes the database query optimizer will reorder at least these operations itself anyway, so the goal is to keep the query as simple as possible to stay out of its way and aid human readers.
The Engine.to_executable
method transforms a Relation
tree to a SQL SELECT
or UNION
thereof, as represented by SQLAlchemy.
The engine’s payload type is the Payload
dataclass
, which better maps to SQL tables or very simple subqueries that can be used like tables.
Support for custom UnaryOperation
subclasses can be added by reimplementing Engine.to_payload
while delegating to super
for the standard operation classes.
Logical Columns¶
The Engine
and Payload
classes are generic over a parameter we refer to as the “logical column type”.
In the simplest case, the logical column type is just sqlalchemy.sql.ColumnElement
, and this is what the default implementations of most Engine
methods assume.
Custom subclasses of the SQL Engine
can use other types, such as wrappers that hold one or more sqlalchemy.sql.ColumnElement
objects, as long as they override a few Engine
methods to handle them.
This allows one column tag and logical column in the Relation
representation of a query to map to multiple columns in the SQL representation.
Operation Ordering and Select
¶
The SQL engine has a non-trivial Engine.conform
method and overrides other methods to keep relation trees in a certain order when applying new operations.
It uses a custom Select
marker type to partition the tree into subqueries and mark a tree as conformed.
This order attempts to avoid subqueries whenever possible by pulling Projection
, Deduplication
, Sort
, and Slice
operations downstream when commutation rules permit, both to keep the SQL output simple and to reduce the loss of row-ordering imposed by Sort
operations whenever possible (i.e. a Sort
can only be applied in the outermost query if it is to have any effect).
API reference¶
Classes¶
|
A concrete engine class for relations backed by a SQL database. |
|
A struct that represents a SQL table or simple |
|
A marker operation used by a the SQL engine to group relation trees into SELECT statements. |