Materialization¶
- final class lsst.daf.relation.Materialization(target: Relation, payload: Any = None, *, name: str)¶
- Bases: MarkerRelation
- A marker operation that indicates that the upstream tree should be evaluated only once, with the results saved and reused for subsequent processing.
- Materialization is the only provided operation for which `UnaryOperationRelation.is_locked` defaults to `True`.
- Also unlike most operations, the `payload` value for a `Materialization` is frequently not `None`, as this is where engine-specific state is cached for future reuse.
- Attributes Summary
- engine: The engine that is responsible for interpreting this relation (`Engine`).
- is_join_identity: Whether a `join` to this relation will result in the other relation being returned directly (`bool`).
- is_locked: Whether this relation and those upstream of it should be considered fixed by tree-manipulation algorithms (`bool`).
- is_trivial: Whether this relation has no real content (`bool`).
- max_rows: The maximum number of rows this relation might have (`int` or `None`).
- min_rows: The minimum number of rows this relation might have (`int`).
- payload: The engine-specific contents of the relation.
- Methods Summary
- attach_payload(payload): Attach an engine-specific `payload` to this relation.
- chain(rhs): Return a new relation with all rows from this relation and another.
- join(rhs[, predicate, backtrack, transfer]): Return a new relation that joins this one to the given one.
- materialized([name, name_prefix]): Return a new relation that indicates that this relation's payload should be cached after it is first processed.
- reapply(target[, payload]): Mark a new target relation, returning a new instance of the same type.
- simplify(target)
- sorted(terms, *[, preferred_engine, ...]): Return a new relation that sorts rows according to a sequence of column expressions.
- transferred_to(destination): Return a new relation that transfers this relation to a new engine.
- with_calculated_column(tag, expression, *[, ...]): Return a new relation that adds a calculated column to this one.
- with_only_columns(columns, *[, ...]): Return a new relation whose columns are a subset of this relation's.
- with_rows_satisfying(predicate, *[, ...]): Return a new relation that filters out rows via a boolean expression.
- without_duplicates(*[, preferred_engine, ...]): Return a new relation that removes any duplicate rows from this one.
- Attributes Documentation
 - is_join_identity¶
- Whether a `join` to this relation will result in the other relation being returned directly (`bool`).
- Join identity relations have exactly one row and no columns.
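 - is_locked¶
- Whether this relation and those upstream of it should be considered fixed by tree-manipulation algorithms (`bool`).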
 - is_trivial¶
- Whether this relation has no real content (`bool`).
- A trivial relation is either a join identity with no columns and exactly one row, or a relation with an arbitrary number of columns and no rows (i.e. `min_rows == max_rows == 0`).
 - max_rows¶
- The maximum number of rows this relation might have (`int` or `None`).
- This is `None` for relations whose size is not bounded from above.
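The way the row-count attributes combine into `is_join_identity` and `is_trivial` can be shown with a small stand-alone sketch. It does not import `lsst.daf.relation`; `ToyRelation` is a hypothetical class written only to mirror the definitions given in the text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRelation:
    """Hypothetical stand-in that only tracks columns and row bounds."""

    columns: frozenset[str]
    min_rows: int
    max_rows: int | None  # None means "not bounded from above"

    @property
    def is_join_identity(self) -> bool:
        # Exactly one row and no columns.
        return not self.columns and self.min_rows == 1 and self.max_rows == 1

    @property
    def is_trivial(self) -> bool:
        # Either a join identity, or any columns with no rows at all.
        return self.is_join_identity or (self.min_rows == 0 and self.max_rows == 0)


unit = ToyRelation(frozenset(), min_rows=1, max_rows=1)
empty = ToyRelation(frozenset({"visit", "band"}), min_rows=0, max_rows=0)
unbounded = ToyRelation(frozenset({"visit"}), min_rows=0, max_rows=None)
```

In this sketch `unit` is both a join identity and trivial, `empty` is trivial only, and `unbounded` is neither.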
 - payload: Any = None¶
- The engine-specific contents of the relation. 
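The idea of evaluating an upstream tree once and reusing the saved payload can be illustrated without the library. In the sketch below, `ToyMaterialization`, its `evaluate` method, and `expensive_upstream` are all hypothetical names invented for this illustration; how a real engine fills and reads `payload` is engine-specific and not shown here.

```python
from __future__ import annotations

from typing import Any, Callable


class ToyMaterialization:
    """Hypothetical marker that caches the result of its upstream computation."""

    def __init__(self, compute: Callable[[], Any], name: str) -> None:
        self._compute = compute  # stands in for evaluating the upstream tree
        self.name = name
        self.payload: Any = None  # cached result; None until first evaluation

    def evaluate(self) -> Any:
        # Run the upstream computation only on the first call, then reuse it.
        if self.payload is None:
            self.payload = self._compute()
        return self.payload


calls: list[int] = []


def expensive_upstream() -> list[int]:
    calls.append(1)  # record every real evaluation
    return [1, 2, 3]


marker = ToyMaterialization(expensive_upstream, name="toy_materialization")
first = marker.evaluate()
second = marker.evaluate()
```

After both calls, `expensive_upstream` has run once and `second` is the same object as `first`.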
- Methods Documentation
 - attach_payload(payload: Any) -> None¶
- Attach an engine-specific `payload` to this relation.
- This method may be called exactly once on a `MarkerRelation` instance that was not initialized with a `payload`, despite the fact that `Relation` objects are otherwise considered immutable.
- Parameters:
- payload
- Engine-specific content to attach.

- Raises:
- TypeError
- Raised if this relation already has a payload, or if this marker subclass can never have a payload. `TypeError` is used here for consistency with other attempts to assign to an attribute of an immutable object.
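The attach-once behavior described above can be sketched with a plain Python class. `ToyMarker` is a hypothetical stand-in, not the library's `MarkerRelation`; it only mirrors the rule that a payload may be attached when none was given at construction and that a second attempt raises `TypeError`.

```python
from typing import Any


class ToyMarker:
    """Hypothetical marker whose payload may be attached at most once."""

    def __init__(self, payload: Any = None) -> None:
        self._payload = payload

    @property
    def payload(self) -> Any:
        # Read-only view; there is deliberately no setter.
        return self._payload

    def attach_payload(self, payload: Any) -> None:
        if self._payload is not None:
            raise TypeError("payload is already set and cannot be replaced")
        self._payload = payload


marker = ToyMarker()
marker.attach_payload({"table": "tmp_1"})

try:
    marker.attach_payload({"table": "tmp_2"})
except TypeError:
    second_attach_failed = True
else:
    second_attach_failed = False
```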
 
 
 - chain(rhs: Relation) -> Relation¶
- Return a new relation with all rows from this relation and another.
- This is a convenience method that constructs and applies a `Chain` operation.
- Parameters:
- rhs : Relation
- Other relation to chain to `self`. Must have the same columns and engine as `self`.

- Returns:
- relation : Relation
- New relation with all rows from both relations. This method never returns an operand directly, even if the other has `max_rows == 0`, as it is assumed that even relations with no rows are useful to preserve in the tree for diagnostics.

- Raises:
- ColumnError
- Raised if the two relations do not have the same columns.
- EngineError
- Raised if the two relations do not have the same engine.
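As a rough illustration of the row-level effect of chaining, the sketch below concatenates two row lists after checking that their columns agree. It is a toy model on lists of dictionaries: `toy_chain` and `ToyColumnError` are hypothetical names, and the engine check described above is omitted.

```python
from __future__ import annotations


class ToyColumnError(ValueError):
    """Local stand-in for the ColumnError named in the text."""


def toy_chain(
    lhs_columns: set[str],
    lhs_rows: list[dict],
    rhs_columns: set[str],
    rhs_rows: list[dict],
) -> list[dict]:
    # Both operands must expose exactly the same columns.
    if lhs_columns != rhs_columns:
        raise ToyColumnError("both operands must have the same columns")
    # Always build a new list, even when one side has no rows.
    return list(lhs_rows) + list(rhs_rows)


columns = {"visit"}
chained = toy_chain(columns, [{"visit": 1}], columns, [{"visit": 2}, {"visit": 1}])
```

Duplicate rows are kept, which matches the "all rows from both relations" wording.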
 
 
 - join(rhs: Relation, predicate: Predicate | None = None, *, backtrack: bool = True, transfer: bool = False) -> Relation¶
- Return a new relation that joins this one to the given one.
- This is a convenience method that constructs and applies a `Join` operation, via `PartialJoin.apply`.
- Parameters:
- rhs : Relation
- Relation to join to `self`.
- predicate : Predicate, optional
- Boolean expression that must evaluate to true in order to join a pair of rows, in addition to an implicit equality constraint on any columns in both relations.
- backtrack : bool, optional
- If `True` (default) and `self.engine != rhs.engine`, attempt to insert this join before a transfer upstream of `self`, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If `True` (`False` is default) and `self.engine != rhs.engine`, insert a new `Transfer` before the `Join`. If `backtrack` is also true, the transfer is added only if the backtrack attempt fails.

- Returns:
- relation : Relation
- New relation that joins `self` to `rhs`. May be `self` or `rhs` if the other is a join identity.

- Raises:
- ColumnError
- Raised if the given predicate requires columns not present in `self` or `rhs`.
- EngineError
- Raised if it was impossible to insert this operation in `rhs.engine` via backtracks or transfers on `self`, or if the predicate was not supported by the engine.

- Notes
- This method does not treat `self` and `rhs` symmetrically: it always considers `rhs` fixed, and only backtracks into or considers applying transfers to `self`.
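The join semantics described above (an implicit equality constraint on shared columns, plus an optional extra predicate) can be sketched on plain lists of dictionaries. `toy_join` is a hypothetical helper written for this illustration; it ignores engines, backtracking, and transfers entirely.

```python
from __future__ import annotations

from typing import Callable, Optional


def toy_join(
    lhs: list[dict],
    rhs: list[dict],
    predicate: Optional[Callable[[dict], bool]] = None,
) -> list[dict]:
    result = []
    for left in lhs:
        for right in rhs:
            # Implicit constraint: columns present in both rows must be equal.
            common = left.keys() & right.keys()
            if any(left[name] != right[name] for name in common):
                continue
            merged = {**left, **right}
            # Optional extra condition on the combined row.
            if predicate is None or predicate(merged):
                result.append(merged)
    return result


visits = [{"visit": 1, "band": "g"}, {"visit": 2, "band": "r"}]
detectors = [
    {"visit": 1, "detector": 10},
    {"visit": 1, "detector": 11},
    {"visit": 2, "detector": 10},
]
joined = toy_join(visits, detectors, predicate=lambda row: row["detector"] > 10)

# A single row with no columns behaves as a join identity in this model.
identity = [{}]
unchanged = toy_join(identity, visits)
```

Joining against the one-row, zero-column list reproduces the other operand's rows, which is the row-level reason a join identity can be dropped from a join.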
 - materialized(name: str | None = None, *, name_prefix: str = 'materialization') -> Relation¶
- Return a new relation that indicates that this relation’s payload should be cached after it is first processed.
- This is a convenience method that constructs and applies a `Materialization` operation.
- Parameters:
- name : str, optional
- Name to use for the cached payload within the engine (e.g. the name for a temporary table in SQL). If not provided, a name will be created via a call to `Engine.get_relation_name`.
- name_prefix : str, optional
- Prefix to pass to `Engine.get_relation_name`; ignored if `name` is provided. Unlike most operations, `Materialization` relations are locked by default, since they reflect user intent to mark a specific tree as cacheable.

- Returns:
- relation : Relation
- New relation that marks its upstream tree for caching. May be `self` if it is already a `LeafRelation` or another materialization (in which case the given name or name prefix will be ignored).
 - reapply(target: Relation, payload: Any | None = None) -> MarkerRelation¶
- Mark a new target relation, returning a new instance of the same type.
- Parameters:
- target : Relation
- New relation to mark.
- payload, optional
- Payload to attach to the new relation.

- Returns:
- relation : MarkerRelation
- A new relation with the given target.

- Notes
- This method is primarily intended for use by operations that “unroll” a relation tree to perform some modification upstream and then “replay” the operations and markers that were downstream. `MarkerRelation` implementations with state that depends on the target will need to override this method to update that state accordingly.
 - sorted(terms: Sequence[SortTerm], *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) -> Relation¶
- Return a new relation that sorts rows according to a sequence of column expressions.
- This is a convenience method that constructs and applies a `Sort` operation.
- Parameters:
- terms : Sequence[SortTerm]
- Ordered sequence of column expressions to sort on, with whether to apply them in ascending or descending order.
- preferred_engine : Engine, optional
- Engine that the operation would ideally be performed in. If this is not equal to `self.engine`, the `backtrack`, `transfer`, and `require_preferred_engine` arguments control the behavior.
- backtrack : bool, optional
- If `True` (default) and the current engine is not the preferred engine, attempt to insert this sort before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If `True` (`False` is default) and the current engine is not the preferred engine, insert a new `Transfer` before the `Sort`. If `backtrack` is also true, the transfer is added only if the backtrack attempt fails.
- require_preferred_engine : bool, optional
- If `True` (`False` is default) and the current engine is not the preferred engine, raise `EngineError`. If `backtrack` is also true, the exception is only raised if the backtrack attempt fails. Ignored if `transfer` is true.

- Returns:
- relation : Relation
- New relation with sorted rows. Will be `self` if `terms` is empty. If `self` is already a sort operation relation, the operations will be merged by concatenating their terms, which may result in duplicate sort terms that have no effect.
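The remark that duplicate sort terms have no effect can be checked with a small multi-key sort on plain dictionaries. `toy_sort` and its `(column, ascending)` term tuples are hypothetical and much simpler than the library's `SortTerm`; the sketch relies on the stability of Python's `list.sort`.

```python
from __future__ import annotations


def toy_sort(rows: list[dict], terms: list[tuple[str, bool]]) -> list[dict]:
    """Sort by (column, ascending) terms, most significant term first."""
    result = list(rows)
    # Apply the least significant term first; because list.sort is stable,
    # terms applied later (the more significant ones) take precedence.
    for column, ascending in reversed(terms):
        result.sort(key=lambda row: row[column], reverse=not ascending)
    return result


rows = [
    {"band": "r", "visit": 1},
    {"band": "g", "visit": 1},
    {"band": "g", "visit": 2},
]
terms = [("band", True), ("visit", False)]

base = toy_sort(rows, terms)
# Appending a term that is already present cannot change the order.
with_duplicate = toy_sort(rows, terms + [("band", True)])
```

An empty term list leaves the input order untouched in this model, which parallels the statement that the method returns `self` when `terms` is empty.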
 
- Raises:
 
 - transferred_to(destination: Engine) -> Relation¶
- Return a new relation that transfers this relation to a new engine.
- This is a convenience method that constructs and applies a `Transfer` operation.
 - with_calculated_column(tag: ColumnTag, expression: ColumnExpression, *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) -> Relation¶
- Return a new relation that adds a calculated column to this one.
- This is a convenience method that constructs and applies a `Calculation` operation.
- Parameters:
- tag : ColumnTag
- Identifier for the new column.
- expression : ColumnExpression
- Expression used to populate the new column.
- preferred_engine : Engine, optional
- Engine that the operation would ideally be performed in. If this is not equal to `self.engine`, the `backtrack`, `transfer`, and `require_preferred_engine` arguments control the behavior.
- backtrack : bool, optional
- If `True` (default) and the current engine is not the preferred engine, attempt to insert this calculation before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If `True` (`False` is default) and the current engine is not the preferred engine, insert a new `Transfer` before the `Calculation`. If `backtrack` is also true, the transfer is added only if the backtrack attempt fails.
- require_preferred_engine : bool, optional
- If `True` (`False` is default) and the current engine is not the preferred engine, raise `EngineError`. If `backtrack` is also true, the exception is only raised if the backtrack attempt fails. Ignored if `transfer` is true.

- Returns:
- relation : Relation
- Relation that contains the calculated column.

- Raises:
- ColumnError
- Raised if the expression requires columns that are not present in `self.columns`, or if `tag` is already present in `self.columns`.
- EngineError
- Raised if `require_preferred_engine=True` and it was impossible to insert this operation in the preferred engine, or if the expression was not supported by the engine.
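The precedence among `preferred_engine`, `backtrack`, `transfer`, and `require_preferred_engine` described in the parameters above can be summarized as a small decision function. This is only a reading of the documented rules, not the library's code: `choose_placement`, `ToyEngineError`, and the `backtrack_succeeds` flag (which stands in for the outcome of a real backtrack attempt) are hypothetical, and the final fallback to the current engine is an assumption.

```python
from __future__ import annotations


class ToyEngineError(RuntimeError):
    """Local stand-in for the EngineError named in the text."""


def choose_placement(
    current_engine: str,
    preferred_engine: str | None = None,
    *,
    backtrack: bool = True,
    transfer: bool = False,
    require_preferred_engine: bool = False,
    backtrack_succeeds: bool = False,  # hypothetical: result of the backtrack attempt
) -> str:
    if preferred_engine is None or preferred_engine == current_engine:
        return "apply in current engine"
    # Backtracking is tried first when enabled.
    if backtrack and backtrack_succeeds:
        return "insert upstream of an existing transfer"
    # A transfer is added only if backtracking was disabled or failed.
    if transfer:
        return "insert new transfer, then apply in preferred engine"
    # The requirement is ignored when transfer is true, so it is checked last.
    if require_preferred_engine:
        raise ToyEngineError("operation could not be placed in the preferred engine")
    # Assumption: with nothing else requested, fall back to the current engine.
    return "apply in current engine"


plan = choose_placement("iteration", "sql", transfer=True, require_preferred_engine=True)
```

The same four keyword arguments appear on `sorted`, `with_only_columns`, `with_rows_satisfying`, and `without_duplicates`, with the same documented meaning.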
 
 
 - with_only_columns(columns: Set[ColumnTag], *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) -> Relation¶
- Return a new relation whose columns are a subset of this relation’s.
- This is a convenience method that constructs and applies a `Projection` operation.
- Parameters:
- columns : Set[ColumnTag]
- Columns to be propagated to the new relation; must be a subset of `self.columns`.
- preferred_engine : Engine, optional
- Engine that the operation would ideally be performed in. If this is not equal to `self.engine`, the `backtrack`, `transfer`, and `require_preferred_engine` arguments control the behavior.
- backtrack : bool, optional
- If `True` (default) and the current engine is not the preferred engine, attempt to insert this projection before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If `True` (`False` is default) and the current engine is not the preferred engine, insert a new `Transfer` before the `Projection`. If `backtrack` is also true, the transfer is added only if the backtrack attempt fails.
- require_preferred_engine : bool, optional
- If `True` (`False` is default) and the current engine is not the preferred engine, raise `EngineError`. If `backtrack` is also true, the exception is only raised if the backtrack attempt fails. Ignored if `transfer` is true.

- Returns:
- relation : Relation
- New relation with only the given columns. Will be `self` if `columns == self.columns`.

- Raises:
- ColumnError
- Raised if `columns` is not a subset of `self.columns`.
- EngineError
- Raised if `require_preferred_engine=True` and it was impossible to insert this operation in the preferred engine.
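A projection's subset check and its "return the input unchanged" shortcut can be sketched on rows stored as dictionaries. `toy_with_only_columns` and `ToyColumnError` are hypothetical names for this illustration and do not involve engines.

```python
from __future__ import annotations


class ToyColumnError(ValueError):
    """Local stand-in for the ColumnError named in the text."""


def toy_with_only_columns(
    rows: list[dict], all_columns: set[str], columns: set[str]
) -> list[dict]:
    # The requested columns must be a subset of what is available.
    if not columns <= all_columns:
        raise ToyColumnError(f"unknown columns: {sorted(columns - all_columns)}")
    # Nothing to drop: hand back the input object itself.
    if columns == all_columns:
        return rows
    return [{name: row[name] for name in columns} for row in rows]


rows = [{"visit": 1, "band": "g"}, {"visit": 2, "band": "r"}]
all_columns = {"visit", "band"}

projected = toy_with_only_columns(rows, all_columns, {"visit"})
same = toy_with_only_columns(rows, all_columns, {"visit", "band"})
```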
 
 
 - with_rows_satisfying(predicate: Predicate, *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) -> Relation¶
- Return a new relation that filters out rows via a boolean expression.
- This is a convenience method that constructs and applies a `Selection` operation.
- Parameters:
- predicate : Predicate
- Boolean expression that evaluates to `True` for rows that should be included and `False` for rows that should be filtered out.
- preferred_engine : Engine, optional
- Engine that the operation would ideally be performed in. If this is not equal to `self.engine`, the `backtrack`, `transfer`, and `require_preferred_engine` arguments control the behavior.
- backtrack : bool, optional
- If `True` (default) and the current engine is not the preferred engine, attempt to insert this selection before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If `True` (`False` is default) and the current engine is not the preferred engine, insert a new `Transfer` before the `Selection`. If `backtrack` is also true, the transfer is added only if the backtrack attempt fails.
- require_preferred_engine : bool, optional
- If `True` (`False` is default) and the current engine is not the preferred engine, raise `EngineError`. If `backtrack` is also true, the exception is only raised if the backtrack attempt fails. Ignored if `transfer` is true.

- Returns:
- relation : Relation
- New relation with only the rows that satisfy the given predicate. May be `self` if the predicate is trivially True.

- Raises:
- ColumnError
- Raised if `predicate.columns_required` is not a subset of `self.columns`.
- EngineError
- Raised if `require_preferred_engine=True` and it was impossible to insert this operation in the preferred engine, or if the expression was not supported by the engine.
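At the row level, a selection keeps the rows for which the predicate is true. The sketch below shows this with an ordinary Python callable as the predicate; `toy_with_rows_satisfying` is a hypothetical helper, and real `Predicate` objects are engine-aware expressions rather than lambdas.

```python
from __future__ import annotations

from typing import Callable


def toy_with_rows_satisfying(
    rows: list[dict], predicate: Callable[[dict], bool]
) -> list[dict]:
    # Keep a row when the predicate is true; drop it otherwise.
    return [row for row in rows if predicate(row)]


rows = [
    {"visit": 1, "band": "g"},
    {"visit": 2, "band": "r"},
    {"visit": 3, "band": "g"},
]
g_band = toy_with_rows_satisfying(rows, lambda row: row["band"] == "g")
everything = toy_with_rows_satisfying(rows, lambda row: True)
```

A predicate that is always true keeps every row, which is why the method is allowed to return `self` in that case.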
 
 
 - without_duplicates(*, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) -> Relation¶
- Return a new relation that removes any duplicate rows from this one.
- This is a convenience method that constructs and applies a `Deduplication` operation.
- Parameters:
- preferred_engine : Engine, optional
- Engine that the operation would ideally be performed in. If this is not equal to `self.engine`, the `backtrack`, `transfer`, and `require_preferred_engine` arguments control the behavior.
- backtrack : bool, optional
- If `True` (default) and the current engine is not the preferred engine, attempt to insert this deduplication before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If `True` (`False` is default) and the current engine is not the preferred engine, insert a new `Transfer` before the `Deduplication`. If `backtrack` is also true, the transfer is added only if the backtrack attempt fails.
- require_preferred_engine : bool, optional
- If `True` (`False` is default) and the current engine is not the preferred engine, raise `EngineError`. If `backtrack` is also true, the exception is only raised if the backtrack attempt fails. Ignored if `transfer` is true.

- Returns:
- relation : Relation
- Relation with no duplicate rows. This may be `self` if it can be determined that there is no duplication already, but this is not guaranteed.

- Raises:
- EngineError
- Raised if `require_preferred_engine=True` and it was impossible to insert this operation in the preferred engine.
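The row-level effect of deduplication can be sketched as follows. `toy_without_duplicates` is a hypothetical helper that keeps the first occurrence of each row; it assumes string column names and hashable values, and it says nothing about how a real engine orders or detects duplicates.

```python
from __future__ import annotations


def toy_without_duplicates(rows: list[dict]) -> list[dict]:
    seen: set[tuple] = set()
    unique: list[dict] = []
    for row in rows:
        # Build a hashable, order-independent key for the row.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


rows = [
    {"visit": 1, "band": "g"},
    {"band": "g", "visit": 1},  # same content, different key order
    {"visit": 2, "band": "r"},
]
deduplicated = toy_without_duplicates(rows)
```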