Native iteration (lsst.daf.relation.iteration)

The iteration module provides a simple Engine intended primarily to serve as a “final transfer destination” for relation trees that are mostly defined in other engines (e.g. sql), as a way to enable iteration over rows in Python and limited Python-side postprocessing. That can include:

  • applying predicates defined as regular Python callables;

  • concatenating, sorting, and deduplicating results in memory.

The iteration engine does not currently support join operations. The execute method is the main entry point for evaluating trees of relations that are purely in this engine, transforming them into its payload type, RowIterable, an iterable in which each row is a collections.abc.Mapping with ColumnTag keys. All operations other than Reordering preserve order.
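As a rough illustration of the kind of Python-side postprocessing described above, the following sketch uses only the standard library. It does not call the lsst.daf.relation API: plain strings stand in for ColumnTag keys, and all names (rows_a, is_early, and so on) are hypothetical.

```python
from itertools import chain

# Rows are mappings; plain strings stand in for ColumnTag keys (an assumption).
rows_a = [{"visit": 2, "band": "r"}, {"visit": 1, "band": "g"}]
rows_b = [{"visit": 1, "band": "g"}, {"visit": 3, "band": "r"}]

# Concatenate two row streams lazily.
combined = chain(rows_a, rows_b)


# A predicate defined as a regular Python callable.
def is_early(row):
    return row["visit"] < 3


# Apply the predicate row-by-row; nothing is evaluated until iteration.
selected = (row for row in combined if is_early(row))

# Deduplicate on all columns; this gathers the unique rows in a dict.
unique = {}
for row in selected:
    unique.setdefault((row["visit"], row["band"]), row)

# Sort in memory; this is the only step that does not preserve input order.
result = sorted(unique.values(), key=lambda row: row["visit"])
```

Only the deduplication and sort steps need to hold rows in memory; the concatenation and selection steps stream rows one at a time.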

This engine supports “backtracking insertion”, in which an operation that is logically appended to an iteration-engine relation is actually inserted in a different upstream engine, as long as it commutes with the intervening iteration-engine operations. This is enabled by passing the preferred_engine argument to UnaryOperation.apply or the various Relation convenience methods that forward to it, with backtrack=True (the default).
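The idea of backtracking insertion can be sketched with a toy model. The code below is not the library's implementation: Op, insert_selection, and the commutes_with_selection flag are hypothetical, and the fallback of appending the operation in the last engine when backtracking is blocked is an assumption of this sketch only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical record of one operation in a linear pipeline."""

    name: str
    engine: str
    commutes_with_selection: bool = True


def insert_selection(pipeline, name, preferred_engine):
    """Place a selection as far upstream as the preferred engine allows."""
    fallback_engine = pipeline[-1].engine if pipeline else preferred_engine
    position = len(pipeline)
    # Walk upstream past operations that belong to other engines.
    while position > 0 and pipeline[position - 1].engine != preferred_engine:
        if not pipeline[position - 1].commutes_with_selection:
            # Blocked: append at the end instead (assumed fallback).
            return pipeline + [Op(name, fallback_engine)]
        position -= 1
    if position == 0:
        # No operation in the preferred engine was found upstream.
        return pipeline + [Op(name, fallback_engine)]
    new_op = Op(name, preferred_engine)
    return pipeline[:position] + [new_op] + pipeline[position:]


pipeline = [
    Op("leaf", "sql"),
    Op("transfer", "iteration"),
    Op("projection", "iteration"),
]
moved = insert_selection(pipeline, "selection", "sql")

blocked = insert_selection(
    [Op("leaf", "sql"), Op("slice", "iteration", commutes_with_selection=False)],
    "selection",
    "sql",
)
```

In the first call the selection lands directly after the sql leaf, ahead of the iteration-engine operations; in the second, the non-commuting slice prevents the move.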

Generally, execution is lazy: operations are performed row-by-row by returning RowIterable (the engine’s payload type) instances backed by generators, with a few exceptions.

All other operations provided by the lsst.daf.relation package are guaranteed to iterate only once over their targets and to yield results row-by-row. Custom unary operations can be supported by implementing Engine.apply_custom_unary_operation.
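The lazy, single-pass behavior described above can be demonstrated with a small standalone sketch. LazySelection and upstream are hypothetical names and are not part of lsst.daf.relation; the pulled list exists only to record when upstream rows are actually consumed.

```python
pulled = []


def upstream():
    """Hypothetical row source that records each row it hands out."""
    for i in range(4):
        pulled.append(i)
        yield {"id": i}


class LazySelection:
    """Hypothetical generator-backed iterable that filters rows lazily."""

    def __init__(self, target, predicate):
        self.target = target
        self.predicate = predicate

    def __iter__(self):
        return (row for row in self.target if self.predicate(row))


lazy = LazySelection(upstream(), lambda row: row["id"] % 2 == 0)
pulled_before = list(pulled)  # nothing has been consumed yet

iterator = iter(lazy)
first = next(iterator)  # consumes only as many upstream rows as needed
pulled_after_first = list(pulled)

# Gathering the remaining rows into a list consumes the rest of the source.
materialized = [first, *iterator]
```

Because the source here is a generator, it can be iterated only once, which is why a single-pass guarantee matters for operations layered on top of it.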

API reference

Classes

CalculationRowIterable(target, tag, callable)

A RowIterable implementation that implements a calculation operation.
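A minimal sketch of what a calculation over mapping rows might look like, assuming the callable receives the whole row and its result is stored under the given tag. CalculationRows is a hypothetical stand-in whose arguments mirror the signature above; it is not the library class, and string keys replace ColumnTag objects.

```python
class CalculationRows:
    """Hypothetical lazy iterable that adds one computed column per row."""

    def __init__(self, target, tag, func):
        self.target = target
        self.tag = tag
        self.func = func

    def __iter__(self):
        for row in self.target:
            # Build a new mapping so the upstream row is not mutated.
            yield {**row, self.tag: self.func(row)}


rows = [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}]
with_sum = list(CalculationRows(rows, "sum", lambda row: row["x"] + row["y"]))
```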

ChainRowIterable(chain)

A RowIterable implementation that wraps itertools.chain.

Engine(*, name, functions, ...)

A concrete engine that treats relations as iterables with Mapping rows.

MaterializedRowIterable()

A RowIterable that is not lazy and has a known length.

ProjectionRowIterable(target, columns)

A RowIterable implementation that implements a projection operation.
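A projection over mapping rows can be sketched as follows, assuming it keeps only the requested columns of each row. ProjectionRows is a hypothetical stand-in, not the library class, and string keys replace ColumnTag objects.

```python
class ProjectionRows:
    """Hypothetical lazy iterable that keeps a subset of each row's columns."""

    def __init__(self, target, columns):
        self.target = target
        self.columns = tuple(columns)

    def __iter__(self):
        for row in self.target:
            yield {column: row[column] for column in self.columns}


rows = [
    {"visit": 1, "band": "g", "exptime": 30.0},
    {"visit": 2, "band": "r", "exptime": 15.0},
]
projected = list(ProjectionRows(rows, ["visit", "band"]))
```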

RowIterable()

An abstract base class for iterables that use mappings for rows.

RowMapping(unique_key, rows)

A RowIterable backed by a Mapping.
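One way a mapping-backed row container could support deduplication is sketched below. It assumes that the mapping's keys are tuples of the unique-key column values, which is a guess at the layout rather than something stated here; KeyedRows and deduplicate are hypothetical names.

```python
class KeyedRows:
    """Hypothetical row container backed by a mapping from key tuple to row."""

    def __init__(self, unique_key, rows):
        self.unique_key = tuple(unique_key)
        self.rows = rows

    def __iter__(self):
        return iter(self.rows.values())

    def __len__(self):
        return len(self.rows)


def deduplicate(source, unique_key):
    """Gather the first row seen for each distinct key into a KeyedRows."""
    unique_key = tuple(unique_key)
    rows = {}
    for row in source:
        rows.setdefault(tuple(row[column] for column in unique_key), row)
    return KeyedRows(unique_key, rows)


source = [
    {"visit": 1, "band": "g"},
    {"visit": 2, "band": "r"},
    {"visit": 1, "band": "g"},
]
keyed = deduplicate(source, ["visit", "band"])
```

Because a dict preserves insertion order, iterating over the result yields rows in first-seen order, and the length is known without another pass.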

RowSequence(rows)

A RowIterable backed by a Sequence.

SelectionRowIterable(target, callable)

A RowIterable implementation that implements a selection operation.