Materialization

final class lsst.daf.relation.Materialization(target: Relation, payload: Any = None, *, name: str)

Bases: MarkerRelation

A marker operation that indicates that the upstream tree should be evaluated only once, with the results saved and reused for subsequent processing.

Materialization is the only provided operation for which UnaryOperationRelation.is_locked defaults to True.

Also unlike most operations, the payload value for a Materialization is frequently not None, as this is where engine-specific state is cached for future reuse.
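
The evaluate-once behavior described above can be illustrated with a minimal sketch in plain Python. It does not use lsst.daf.relation: `ToyMaterialization`, `compute`, and `rows` are hypothetical names, and a real engine decides for itself what it stores in the payload.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToyMaterialization:
    """Hypothetical stand-in for a materialization marker (not the library class)."""

    compute: Callable[[], list]  # evaluates the upstream tree
    name: str
    payload: Optional[Any] = None  # cached result; None until first evaluation

    def rows(self) -> list:
        # Evaluate the upstream tree only on first use, then reuse the cache.
        if self.payload is None:
            self.payload = self.compute()
        return self.payload


calls = []  # records how many times the upstream tree is evaluated


def upstream() -> list:
    calls.append(1)
    return [{"a": 1}, {"a": 2}]


m = ToyMaterialization(compute=upstream, name="cached_rows")
first = m.rows()
second = m.rows()
```

In this sketch the second call returns the cached list without invoking `upstream` again.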

Attributes Summary

`columns`	The columns in this relation (`Set` [ `ColumnTag` ] ).
`engine`	The engine that is responsible for interpreting this relation (`Engine`).
`is_join_identity`	Whether a `join` to this relation will result in the other relation being returned directly (`bool`).
`is_locked`	Whether this relation and those upstream of it should be considered fixed by tree-manipulation algorithms (`bool`).
`is_trivial`	Whether this relation has no real content (`bool`).
`max_rows`	The maximum number of rows this relation might have (`int` or `None`).
`min_rows`	The minimum number of rows this relation might have (`int`).
`payload`	The engine-specific contents of the relation.

Methods Summary

`attach_payload`(payload)	Attach an engine-specific `payload` to this relation.
`chain`(rhs)	Return a new relation with all rows from this relation and another.
`join`(rhs[, predicate, backtrack, transfer])	Return a new relation that joins this one to the given one.
`materialized`([name, name_prefix])	Return a new relation that indicates that this relation's payload should be cached after it is first processed.
`reapply`(target[, payload])	Mark a new target relation, returning a new instance of the same type.
`simplify`(target)
`sorted`(terms, *[, preferred_engine, ...])	Return a new relation that sorts rows according to a sequence of column expressions.
`transferred_to`(destination)	Return a new relation that transfers this relation to a new engine.
`with_calculated_column`(tag, expression, *[, ...])	Return a new relation that adds a calculated column to this one.
`with_only_columns`(columns, *[, ...])	Return a new relation whose columns are a subset of this relation's.
`with_rows_satisfying`(predicate, *[, ...])	Return a new relation that filters out rows via a boolean expression.
`without_duplicates`(*[, preferred_engine, ...])	Return a new relation that removes any duplicate rows from this one.

Attributes Documentation

columns: The columns in this relation (Set[ColumnTag]).

engine: The engine that is responsible for interpreting this relation (Engine).

is_join_identity

Whether a join to this relation will result in the other relation being returned directly (bool).

Join identity relations have exactly one row and no columns.
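
Why a relation with exactly one row and no columns acts as a join identity can be seen in a toy natural join over lists of dicts. This is a plain-Python sketch, not the library's join implementation; `natural_join` is a hypothetical helper.

```python
def natural_join(lhs, rhs):
    """Toy natural join: combine rows that agree on all shared columns."""
    out = []
    for left in lhs:
        for right in rhs:
            common = left.keys() & right.keys()
            if all(left[k] == right[k] for k in common):
                out.append({**left, **right})
    return out


join_identity = [{}]  # exactly one row, no columns
rows = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]
joined = natural_join(rows, join_identity)
```

Each row pairs with the single empty row and gains no columns, so `joined` has the same content as `rows`.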

See also

Processor.materialize

reapply(target: Relation, payload: Any | None = None) → MarkerRelation

Mark a new target relation, returning a new instance of the same type.

Parameters:

target (Relation): New relation to mark.
payload (Any, optional): Payload to attach to the new relation.

Returns:

relation (MarkerRelation): A new relation with the given target.

Notes

This method is primarily intended for use by operations that “unroll” a relation tree to perform some modification upstream and then “replay” the operations and markers that were downstream. MarkerRelation implementations with state that depends on the target will need to override this method to update that state accordingly.
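
The "unroll" and "replay" pattern mentioned in these notes can be sketched with a toy chain of unary nodes. This is an illustration in plain Python under simplifying assumptions: `Node` and `replace_leaf` are hypothetical, and real relation trees also contain binary operations that this sketch ignores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Toy unary node: a label wrapping an optional upstream target."""

    label: str
    target: Optional["Node"] = None

    def reapply(self, target: "Node") -> "Node":
        # Return a new node of the same kind around a different target.
        return Node(self.label, target)


def replace_leaf(tree: Node, new_leaf: Node) -> Node:
    # Unroll: walk upstream, remembering every downstream node.
    downstream = []
    current = tree
    while current.target is not None:
        downstream.append(current)
        current = current.target
    # Replay: rebuild from the new leaf outward, innermost node first.
    result = new_leaf
    for node in reversed(downstream):
        result = node.reapply(result)
    return result


tree = Node("materialize", Node("sort", Node("leaf_a")))
rebuilt = replace_leaf(tree, Node("leaf_b"))
```

The rebuilt tree keeps the same downstream nodes in the same order, with only the leaf swapped.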

classmethod simplify(target: Relation) → bool

sorted(terms: Sequence[SortTerm], *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) → Relation

Return a new relation that sorts rows according to a sequence of column expressions.

This is a convenience method that constructs and applies a Sort operation.

Parameters:

terms (Sequence[SortTerm]): Ordered sequence of column expressions to sort on, with whether to apply them in ascending or descending order.
preferred_engine (Engine, optional): Engine that the operation would ideally be performed in. If this is not equal to self.engine, the backtrack, transfer, and require_preferred_engine arguments control the behavior.
backtrack (bool, optional): If True (default) and the current engine is not the preferred engine, attempt to insert this sort before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
transfer (bool, optional): If True (False is default) and the current engine is not the preferred engine, insert a new Transfer before the Sort. If backtrack is also true, the transfer is added only if the backtrack attempt fails.
require_preferred_engine (bool, optional): If True (False is default) and the current engine is not the preferred engine, raise EngineError. If backtrack is also true, the exception is only raised if the backtrack attempt fails. Ignored if transfer is true.
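
The precedence among preferred_engine, backtrack, transfer, and require_preferred_engine described above can be summarized as a small decision function. This is a sketch of the documented rules only, not the library's code: `choose_strategy`, `ToyEngineError`, and the `backtrack_possible` flag (standing in for whether the upstream insertion would succeed) are hypothetical, and falling back to the current engine when no option applies is an assumption.

```python
class ToyEngineError(RuntimeError):
    """Hypothetical stand-in for the library's EngineError."""


def choose_strategy(
    current,
    preferred=None,
    *,
    backtrack=True,
    transfer=False,
    require_preferred_engine=False,
    backtrack_possible=False,
):
    """Return a label describing where the operation would be applied."""
    if preferred is None or preferred == current:
        return "apply_in_current_engine"
    if backtrack and backtrack_possible:
        # Backtracking is tried first whenever it is enabled.
        return "insert_upstream_of_transfer"
    if transfer:
        # A transfer takes priority over require_preferred_engine.
        return "transfer_then_apply"
    if require_preferred_engine:
        raise ToyEngineError("could not apply in the preferred engine")
    # Assumption: with no other option, the current engine is used.
    return "apply_in_current_engine"


strategy = choose_strategy("python", "sql", transfer=True)
```

The same precedence applies to the other convenience methods on this page that accept these four arguments.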

Returns:

relation (Relation): New relation with sorted rows. Will be self if terms is empty. If self is already a sort operation relation, the operations will be merged by concatenating their terms, which may result in duplicate sort terms that have no effect.

Raises:

ColumnError: Raised if any column required by a SortTerm is not present in self.columns.
EngineError: Raised if require_preferred_engine=True and it was impossible to insert this operation in the preferred engine, or if a SortTerm expression was not supported by the engine.
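
To illustrate why duplicate sort terms left over from a merge have no effect, here is a toy multi-term sort over lists of dicts. It is a plain-Python sketch: `sort_rows` and the `(column, ascending)` tuples are hypothetical stand-ins for SortTerm objects, and the sketch makes no claim about the order in which the library concatenates merged terms.

```python
def sort_rows(rows, terms):
    """Sort rows by (column, ascending) pairs, most significant first."""
    result = list(rows)
    # Python's sort is stable, so apply the least significant term first.
    for column, ascending in reversed(terms):
        result.sort(key=lambda row: row[column], reverse=not ascending)
    return result


rows = [
    {"a": 2, "b": 1},
    {"a": 1, "b": 1},
    {"a": 1, "b": 2},
    {"a": 2, "b": 3},
]
terms = [("a", True), ("b", False)]
ordered = sort_rows(rows, terms)
# Appending a term that repeats an earlier column changes nothing.
ordered_with_duplicate = sort_rows(rows, terms + [("a", True)])
```

Rows that tie on every earlier term also tie on the repeated term, so the redundant term cannot reorder anything.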

transferred_to(destination: Engine) → Relation

Return a new relation that transfers this relation to a new engine.

This is a convenience method that constructs and applies a Transfer operation.

Parameters:

destination (Engine): Engine for the new relation.

Returns:

relation (Relation): New relation in the given engine. Will be self if self.engine == destination.

with_calculated_column(tag: ColumnTag, expression: ColumnExpression, *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) → Relation

Return a new relation that adds a calculated column to this one.

This is a convenience method that constructs and applies a Calculation operation.

Parameters:

tag (ColumnTag): Identifier for the new column.
expression (ColumnExpression): Expression used to populate the new column.
preferred_engine (Engine, optional): Engine that the operation would ideally be performed in. If this is not equal to self.engine, the backtrack, transfer, and require_preferred_engine arguments control the behavior.
backtrack (bool, optional): If True (default) and the current engine is not the preferred engine, attempt to insert this calculation before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
transfer (bool, optional): If True (False is default) and the current engine is not the preferred engine, insert a new Transfer before the Calculation. If backtrack is also true, the transfer is added only if the backtrack attempt fails.
require_preferred_engine (bool, optional): If True (False is default) and the current engine is not the preferred engine, raise EngineError. If backtrack is also true, the exception is only raised if the backtrack attempt fails. Ignored if transfer is true.

Returns:

relation (Relation): Relation that contains the calculated column.

Raises:

ColumnError: Raised if the expression requires columns that are not present in self.columns, or if tag is already present in self.columns.
EngineError: Raised if require_preferred_engine=True and it was impossible to insert this operation in the preferred engine, or if the expression was not supported by the engine.
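
The two ColumnError conditions above can be mirrored in a toy calculation over lists of dicts. This is a plain-Python sketch: `add_calculated_column` is a hypothetical helper that raises ValueError where the library would raise ColumnError, and a plain function stands in for a ColumnExpression.

```python
def add_calculated_column(rows, columns, tag, required, func):
    """Toy calculation: add column `tag` computed by `func` to each row."""
    if tag in columns:
        raise ValueError(f"column {tag!r} already exists")
    if not required <= columns:
        raise ValueError(f"missing columns: {sorted(required - columns)}")
    return [{**row, tag: func(row)} for row in rows]


rows = [{"x": 1, "y": 2}, {"x": 3, "y": 4}]
columns = {"x", "y"}
with_sum = add_calculated_column(
    rows, columns, "sum", {"x", "y"}, lambda row: row["x"] + row["y"]
)
```

The input rows are left untouched, and each output row carries the original columns plus the new one.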

with_only_columns(columns: Set[ColumnTag], *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) → Relation

Return a new relation whose columns are a subset of this relation’s.

This is a convenience method that constructs and applies a Projection operation.

Parameters:

columns (Set[ColumnTag]): Columns to be propagated to the new relation; must be a subset of self.columns.
preferred_engine (Engine, optional): Engine that the operation would ideally be performed in. If this is not equal to self.engine, the backtrack, transfer, and require_preferred_engine arguments control the behavior.
backtrack (bool, optional): If True (default) and the current engine is not the preferred engine, attempt to insert this projection before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
transfer (bool, optional): If True (False is default) and the current engine is not the preferred engine, insert a new Transfer before the Projection. If backtrack is also true, the transfer is added only if the backtrack attempt fails.
require_preferred_engine (bool, optional): If True (False is default) and the current engine is not the preferred engine, raise EngineError. If backtrack is also true, the exception is only raised if the backtrack attempt fails. Ignored if transfer is true.

Returns:

relation (Relation): New relation with only the given columns. Will be self if columns == self.columns.

Raises:

ColumnError: Raised if columns is not a subset of self.columns.
EngineError: Raised if require_preferred_engine=True and it was impossible to insert this operation in the preferred engine.
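
The subset requirement and the "returns self" shortcut can be sketched with a toy projection over lists of dicts. This is plain Python, not the library's Projection: `project` is a hypothetical helper that raises ValueError where the library would raise ColumnError.

```python
def project(rows, columns, keep):
    """Toy projection: keep only the columns named in `keep`."""
    if not keep <= columns:
        raise ValueError(f"not a subset: {sorted(keep - columns)}")
    if keep == columns:
        return rows  # nothing to drop, so hand back the same object
    return [{name: row[name] for name in keep} for row in rows]


rows = [{"a": 1, "b": 2, "c": 3}, {"a": 4, "b": 5, "c": 6}]
columns = {"a", "b", "c"}
narrowed = project(rows, columns, {"a", "c"})
unchanged = project(rows, columns, {"a", "b", "c"})
```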

with_rows_satisfying(predicate: Predicate, *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) → Relation

Return a new relation that filters out rows via a boolean expression.

This is a convenience method that constructs and applies a Selection operation.

Parameters:

predicate (Predicate): Boolean expression that evaluates to True for rows that should be included and False for rows that should be filtered out.
preferred_engine (Engine, optional): Engine that the operation would ideally be performed in. If this is not equal to self.engine, the backtrack, transfer, and require_preferred_engine arguments control the behavior.
backtrack (bool, optional): If True (default) and the current engine is not the preferred engine, attempt to insert this selection before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
transfer (bool, optional): If True (False is default) and the current engine is not the preferred engine, insert a new Transfer before the Selection. If backtrack is also true, the transfer is added only if the backtrack attempt fails.
require_preferred_engine (bool, optional): If True (False is default) and the current engine is not the preferred engine, raise EngineError. If backtrack is also true, the exception is only raised if the backtrack attempt fails. Ignored if transfer is true.

Returns:

relation (Relation): New relation with only the rows that satisfy the given predicate. May be self if the predicate is trivially True.

Raises:

ColumnError: Raised if predicate.columns_required is not a subset of self.columns.
EngineError: Raised if require_preferred_engine=True and it was impossible to insert this operation in the preferred engine, or if the expression was not supported by the engine.
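
The selection semantics (rows are kept where the predicate is true and dropped otherwise) can be shown with a toy filter over lists of dicts. This is a plain-Python sketch: `select_rows` is a hypothetical helper, and an ordinary function stands in for a Predicate.

```python
def select_rows(rows, predicate):
    """Toy selection: keep the rows for which `predicate` returns True."""
    return [row for row in rows if predicate(row)]


rows = [{"a": 1}, {"a": 5}, {"a": 10}]
kept = select_rows(rows, lambda row: row["a"] >= 5)
everything = select_rows(rows, lambda row: True)
```

A predicate that is true for every row keeps the full content, which is the case where the library may simply return self.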

without_duplicates(*, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) → Relation

Return a new relation that removes any duplicate rows from this one.

This is a convenience method that constructs and applies a Deduplication operation.

Parameters:

preferred_engine (Engine, optional): Engine that the operation would ideally be performed in. If this is not equal to self.engine, the backtrack, transfer, and require_preferred_engine arguments control the behavior.
backtrack (bool, optional): If True (default) and the current engine is not the preferred engine, attempt to insert this deduplication before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
transfer (bool, optional): If True (False is default) and the current engine is not the preferred engine, insert a new Transfer before the Deduplication. If backtrack is also true, the transfer is added only if the backtrack attempt fails.
require_preferred_engine (bool, optional): If True (False is default) and the current engine is not the preferred engine, raise EngineError. If backtrack is also true, the exception is only raised if the backtrack attempt fails. Ignored if transfer is true.

Returns:

relation (Relation): Relation with no duplicate rows. This may be self if it can be determined that there is no duplication already, but this is not guaranteed.

Raises:

EngineError: Raised if require_preferred_engine=True and it was impossible to insert this operation in the preferred engine.
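
A toy deduplication over lists of dicts shows the intended result. This is a plain-Python sketch, not the library's Deduplication: `drop_duplicates` is a hypothetical helper, it assumes hashable column values, and keeping the first occurrence in input order is a choice made for this example.

```python
def drop_duplicates(rows):
    """Toy deduplication: keep the first occurrence of each distinct row."""
    seen = set()
    unique = []
    for row in rows:
        key = frozenset(row.items())  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


rows = [{"a": 1, "b": 2}, {"a": 1, "b": 2}, {"a": 3, "b": 4}, {"b": 2, "a": 1}]
unique_rows = drop_duplicates(rows)
```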
