Native iteration (lsst.daf.relation.iteration)
The iteration module provides a simple Engine intended primarily to serve as a “final transfer destination” for relation trees that are mostly defined in other engines (e.g. sql), as a way to enable iteration over rows in Python and limited Python-side postprocessing.
That can include:
- applying predicates defined as regular Python callables;
- concatenating, sorting, and deduplicating results in memory.
The iteration engine does not currently support join operations.
The execute method is the main entry point for evaluating trees of relations that are purely in this engine, transforming them into its payload type, RowIterable, which represents each row as a collections.abc.Mapping with ColumnTag keys.
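As a rough illustration of rows as mappings and of predicates as regular Python callables, the following sketch filters rows lazily with a generator. It does not use the lsst.daf.relation API: plain strings stand in for ColumnTag keys, and `select_rows` is a hypothetical helper introduced only for this example.

```python
from collections.abc import Callable, Iterable, Iterator, Mapping
from typing import Any

# Plain strings stand in for ColumnTag keys here; this is an assumption made
# for illustration, not the library's own column type.
Row = Mapping[str, Any]


def select_rows(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Lazily yield only the rows for which ``predicate`` returns True."""
    for row in rows:
        if predicate(row):
            yield row


rows = [
    {"visit": 1, "band": "g"},
    {"visit": 2, "band": "r"},
    {"visit": 3, "band": "g"},
]

# The predicate is an ordinary Python callable applied to each row mapping.
g_band = list(select_rows(rows, lambda row: row["band"] == "g"))
```

Because `select_rows` is a generator function, no row is examined until the result is iterated.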
All operations other than Reordering preserve order.
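Order preservation under concatenation can be seen with itertools.chain, which ChainRowIterable is documented as wrapping. The sketch below uses itertools.chain directly on two hypothetical generator functions; string keys again stand in for ColumnTag.

```python
import itertools
from collections.abc import Iterator, Mapping
from typing import Any

Row = Mapping[str, Any]  # string keys stand in for ColumnTag (assumption)


def first_batch() -> Iterator[Row]:
    """Hypothetical source of rows, produced one at a time."""
    yield {"id": 1}
    yield {"id": 2}


def second_batch() -> Iterator[Row]:
    """A second hypothetical source of rows."""
    yield {"id": 3}


# chain() yields every row of the first iterable, then every row of the
# second, without building an intermediate list.
chained = itertools.chain(first_batch(), second_batch())
ids = [row["id"] for row in chained]
```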
This engine supports “backtracking insertion”, in which an operation that is logically appended to an iteration-engine relation is actually inserted in a different upstream engine, as long as it commutes with the intervening iteration-engine operations.
This is enabled by passing the preferred_engine argument to UnaryOperation.apply or the various Relation convenience methods that forward to it, with backtrack=True (the default).
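The idea behind backtracking insertion can be sketched with a toy model. Everything here is hypothetical (`Op`, `commutes_with`, and `append_with_backtracking` are not part of lsst.daf.relation): a relation is reduced to a flat list of operations, and a new operation is moved upstream past operations in other engines only when it is declared to commute with each of them. What the real engine does when an operation cannot be moved is not modeled; the toy simply appends.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """A toy operation: its name, its engine, and the op names it commutes with."""

    name: str
    engine: str
    commutes_with: frozenset = frozenset()


def append_with_backtracking(chain: list[Op], new_op: Op) -> list[Op]:
    """Insert ``new_op`` next to its own engine's ops if commutation allows."""
    position = len(chain)
    # Walk upstream past ops in other engines, checking commutation each time.
    while position > 0 and chain[position - 1].engine != new_op.engine:
        if chain[position - 1].name not in new_op.commutes_with:
            # Blocked by a non-commuting op: plain append (a toy choice).
            return chain + [new_op]
        position -= 1
    if position == 0:
        # No upstream op in the preferred engine was found: plain append.
        return chain + [new_op]
    return chain[:position] + [new_op] + chain[position:]


chain = [Op("scan", "sql"), Op("selection", "iteration")]

# Sorting commutes with a row filter, so the sort can move into the SQL part.
sort = Op("sort", "sql", frozenset({"selection"}))
moved = [op.name for op in append_with_backtracking(chain, sort)]

# A slice does not commute with a row filter, so it stays at the end.
slice_op = Op("slice", "sql")
blocked = [op.name for op in append_with_backtracking(chain, slice_op)]
```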
Generally, execution is lazy; operations are performed row-by-row by returning RowIterable (the engine’s payload type) instances backed by generators, with a few exceptions:
- Deduplication operations gather all unique rows into a dict (inside a RowMapping);
- Sort operations gather all rows into a list (inside a RowSequence);
- Materialization operations gather all rows into a list unless they are already in a dict or list (via RowIterable.materialized);
- Slice operations on a RowSequence are computed directly, creating another RowSequence (all other slices are lazy).
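The split between lazy and eager operations listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the engine's implementation: the helper names are hypothetical, string keys stand in for ColumnTag, and a plain dict and list stand in for RowMapping and RowSequence.

```python
from __future__ import annotations

import itertools
from collections.abc import Iterable, Iterator, Mapping, Sequence
from typing import Any, Optional

Row = Mapping[str, Any]  # string keys stand in for ColumnTag (assumption)


def deduplicate(rows: Iterable[Row], unique_key: Sequence[str]) -> dict[tuple, Row]:
    """Eager: gather unique rows into a dict keyed by the unique-key values."""
    unique: dict[tuple, Row] = {}
    for row in rows:
        # setdefault keeps the first row seen for each key, in first-seen order.
        unique.setdefault(tuple(row[c] for c in unique_key), row)
    return unique


def sort_rows(rows: Iterable[Row], column: str) -> list[Row]:
    """Eager: sorting needs every row, so gather them all into a list."""
    return sorted(rows, key=lambda row: row[column])


def materialize_rows(rows: Iterable[Row]) -> Iterable[Row]:
    """Gather rows into a list unless they are already in a dict or list."""
    if isinstance(rows, (dict, list)):
        return rows
    return list(rows)


def slice_rows(rows: Iterable[Row], start: int, stop: Optional[int]) -> Iterable[Row]:
    """Slice a sequence directly; otherwise return a lazy slice."""
    if isinstance(rows, Sequence):
        return rows[start:stop]
    return itertools.islice(rows, start, stop)


def generate() -> Iterator[Row]:
    """Hypothetical lazy row source containing one duplicate."""
    yield {"id": 2, "name": "b"}
    yield {"id": 1, "name": "a"}
    yield {"id": 2, "name": "b"}


unique = deduplicate(generate(), ["id"])
ordered = sort_rows(unique.values(), "id")
first = slice_rows(ordered, 0, 1)         # computed directly from the list
lazy_first = slice_rows(generate(), 0, 1)  # nothing is read until iterated
```

In this sketch only the generator-backed slice defers work; the dict and list results have a known length as soon as they are returned.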
All other operations provided by the lsst.daf.relation package are guaranteed to iterate over their targets only once and to yield results row-by-row.
Custom unary operations can be supported by implementing Engine.apply_custom_unary_operation.
API reference
Classes
CalculationRowIterable(target, tag, …) | A RowIterable implementation that implements a calculation operation.
ChainRowIterable(chain) | A RowIterable implementation that wraps itertools.chain.
Engine(*, name, functions, …) | A concrete engine that treats relations as iterables with Mapping rows.
MaterializedRowIterable | A RowIterable that is not lazy and has a known length.
ProjectionRowIterable(target, columns) | A RowIterable implementation that implements a projection operation.
RowIterable | An abstract base class for iterables that use mappings for rows.
RowMapping(unique_key, rows, …) | A RowIterable backed by a Mapping.
RowSequence(rows) | A RowIterable backed by a Sequence.
SelectionRowIterable(target, callable, …) | A RowIterable implementation that implements a selection operation.