Relation
class lsst.daf.relation.Relation(*args, **kwargs)
- Bases: typing.Protocol

An abstract interface for expression trees on tabular data.

Notes

This ABC is a typing.Protocol, which means that classes that implement its interface can be recognized as such by static type checkers without actually inheriting from it. In fact, all concrete relation types inherit only from BaseRelation (which provides implementations of many Relation methods, but does not include the complete interface or inherit from Relation itself). This split allows subclasses to implement attributes that are defined as properties here as dataclass attributes instead of true properties, something typing.Protocol explicitly permits and recommends, but that works only if the protocol is not actually inherited from.

In almost all cases, users should use Relation instead of BaseRelation: the only exception is when writing an isinstance check to see if a type is a relation at all, rather than a particular relation subclass. BaseRelation may become an alias to Relation itself in the future if typing.Protocol inheritance interaction with properties is improved.

All concrete Relation types are frozen, equality-comparable dataclasses.
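The protocol/dataclass split described in these notes can be illustrated with a minimal, self-contained sketch. The names HasMaxRows, ToyLeaf, and describe are hypothetical and are not part of lsst.daf.relation; the sketch only shows the general typing.Protocol pattern of declaring a property in the protocol and satisfying it with a plain dataclass field, without inheritance.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class HasMaxRows(Protocol):
    """Hypothetical protocol that declares an attribute as a read-only property."""

    @property
    def max_rows(self) -> Optional[int]: ...


@dataclass(frozen=True)
class ToyLeaf:
    """Hypothetical concrete type; it does NOT inherit from HasMaxRows.

    The protocol's property is satisfied structurally by a dataclass field.
    """

    max_rows: Optional[int] = None


def describe(relation: HasMaxRows) -> str:
    # A static type checker accepts a ToyLeaf here purely by structure.
    if relation.max_rows is None:
        return "unbounded"
    return f"<= {relation.max_rows} rows"


leaf = ToyLeaf(max_rows=10)
summary = describe(leaf)
```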
Concrete relation types also provide a very concise str representation (in addition to the dataclass-provided repr) suitable for summarizing an entire relation tree.

Attributes Summary

- columns: The columns in this relation (Set[ColumnTag]).
- engine: The engine that is responsible for interpreting this relation (Engine).
- is_join_identity: Whether a join to this relation will result in the other relation being returned directly (bool).
- is_locked: Whether this relation and those upstream of it should be considered fixed by tree-manipulation algorithms (bool).
- is_trivial: Whether this relation has no real content (bool).
- max_rows: The maximum number of rows this relation might have (int or None).
- min_rows: The minimum number of rows this relation might have (int).
- payload: The engine-specific contents of the relation.

Methods Summary

- attach_payload(payload): Attach an engine-specific payload to this relation.
- chain(rhs): Return a new relation with all rows from this relation and another.
- join(rhs, predicate, *, backtrack, transfer): Return a new relation that joins this one to the given one.
- materialized(name, *, name_prefix): Return a new relation that indicates that this relation’s payload should be cached after it is first processed.
- sorted(terms, *, preferred_engine, …): Return a new relation that sorts rows according to a sequence of column expressions.
- transferred_to(destination): Return a new relation that transfers this relation to a new engine.
- with_calculated_column(tag, expression, *, …): Return a new relation that adds a calculated column to this one.
- with_only_columns(columns, *, …): Return a new relation whose columns are a subset of this relation’s.
- with_rows_satisfying(predicate, *, …): Return a new relation that filters out rows via a boolean expression.
- without_duplicates(*, preferred_engine, …): Return a new relation that removes any duplicate rows from this one.

Attributes Documentation
 - 
is_join_identity
- Whether a join to this relation will result in the other relation being returned directly (bool).

Join identity relations have exactly one row and no columns.
 - 
is_locked
- Whether this relation and those upstream of it should be considered fixed by tree-manipulation algorithms (bool).
 - 
is_trivial
- Whether this relation has no real content (bool).

A trivial relation is either a join identity with no columns and exactly one row, or a relation with an arbitrary number of columns and no rows (i.e. min_rows == max_rows == 0).
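As a rough restatement of the rule above, the following sketch computes triviality from the three quantities involved. The function is a hypothetical helper written for illustration, not the library's implementation.

```python
from typing import Optional


def toy_is_trivial(is_join_identity: bool, min_rows: int, max_rows: Optional[int]) -> bool:
    """Toy restatement of the is_trivial rule (hypothetical helper)."""
    # "No rows" means both bounds are exactly zero; max_rows=None is unbounded.
    has_no_rows = min_rows == 0 and max_rows == 0
    return is_join_identity or has_no_rows


join_identity_case = toy_is_trivial(True, 1, 1)
empty_case = toy_is_trivial(False, 0, 0)
unbounded_case = toy_is_trivial(False, 0, None)
```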
 - 
max_rows
- The maximum number of rows this relation might have (int or None).

This is None for relations whose size is not bounded from above.
 - 
payload
- The engine-specific contents of the relation.

This is None in the common case that engine-specific contents are to be computed on-the-fly. Relation payloads permit “deferred initialization”: while relation objects are otherwise immutable, the payload may be set (once) after construction, via attach_payload.

Methods Documentation
 - 
attach_payload(payload: Any) → None
- Attach an engine-specific payload to this relation.

This method may be called exactly once on a Relation instance that was not initialized with a payload, despite the fact that Relation objects are otherwise considered immutable.

Parameters:
- payload
- Engine-specific content to attach.

Raises:
- TypeError
- Raised if this relation already has a payload, or can never have a payload. TypeError is used here for consistency with other attempts to assign to an attribute of an immutable object.
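One way a frozen dataclass can allow a set-once attribute is sketched below. ToyRelation is a hypothetical stand-in, and the use of object.__setattr__ and of compare=False on the payload field are assumptions of this sketch, not a description of how the library implements attach_payload.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ToyRelation:
    """Hypothetical stand-in for a frozen relation type."""

    name: str
    # Assumption: the payload is excluded from equality comparisons.
    payload: Any = field(default=None, compare=False)

    def attach_payload(self, payload: Any) -> None:
        if self.payload is not None:
            raise TypeError(f"Relation {self.name!r} already has a payload.")
        # Frozen dataclasses block normal assignment, so bypass it explicitly.
        object.__setattr__(self, "payload", payload)


rel = ToyRelation("a")
rel.attach_payload([{"x": 1}])

try:
    rel.attach_payload([])
    second_attach_failed = False
except TypeError:
    second_attach_failed = True
```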
 
 - 
chain(rhs: Relation) → Relation
- Return a new relation with all rows from this relation and another.

This is a convenience method that constructs and applies a Chain operation.

Parameters:
- rhs : Relation
- Other relation to chain to self. Must have the same columns and engine as self.

Returns:
- relation : Relation
- New relation with all rows from both relations. If the engine preserves order for chains, all rows from self will appear before all rows from rhs, in their original order. This method never returns an operand directly, even if the other has max_rows == 0, as it is assumed that even relations with no rows are useful to preserve in the tree for diagnostics.

Raises:
- ColumnError
- Raised if the two relations do not have the same columns.
- EngineError
- Raised if the two relations do not have the same engine.
- RowOrderError
- Raised if self or rhs is unnecessarily ordered; see expect_unordered.
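The row-level meaning of chaining can be sketched on plain Python data. The helper toy_chain is hypothetical, represents rows as dicts, and raises ValueError as a stand-in for the ColumnError described above; it assumes an engine that preserves order.

```python
def toy_chain(lhs_columns, lhs_rows, rhs_columns, rhs_rows):
    """Concatenate two toy relations whose column sets must match."""
    if set(lhs_columns) != set(rhs_columns):
        # Stand-in for ColumnError in this sketch.
        raise ValueError("Relations must have the same columns to be chained.")
    # Rows from the left operand come first, each side in its original order.
    return list(lhs_rows) + list(rhs_rows)


combined = toy_chain({"visit"}, [{"visit": 1}], {"visit"}, [{"visit": 2}, {"visit": 3}])
```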
 - 
join(rhs: Relation, predicate: Predicate | None = None, *, backtrack: bool = True, transfer: bool = False) → Relation
- Return a new relation that joins this one to the given one.

This is a convenience method that constructs and applies a Join operation, via PartialJoin.apply.

Parameters:
- rhs : Relation
- Relation to join to self.
- predicate : Predicate, optional
- Boolean expression that must evaluate to true in order to join a pair of rows, in addition to an implicit equality constraint on any columns in both relations.
- backtrack : bool, optional
- If True (default) and self.engine != rhs.engine, attempt to insert this join before a transfer upstream of self, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If True (False is default) and self.engine != rhs.engine, insert a new Transfer before the Join. If backtrack is also true, the transfer is added only if the backtrack attempt fails.

Returns:
- relation : Relation
- New relation that joins self to rhs. May be self or rhs if the other is a join identity.

Raises:
- ColumnError
- Raised if the given predicate requires columns not present in self or rhs.
- EngineError
- Raised if it was impossible to insert this operation in rhs.engine via backtracks or transfers on self, or if the predicate was not supported by the engine.
- RowOrderError
- Raised if self or rhs is unnecessarily ordered; see expect_unordered.

Notes

This method does not treat self and rhs symmetrically: it always considers rhs fixed, and only backtracks into or considers applying transfers to self.
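The row-level semantics described above (implicit equality on shared columns plus an optional predicate) can be sketched with plain dicts. The function toy_join and the sample data are hypothetical; the sketch ignores engines, backtracking, and transfers entirely.

```python
def toy_join(lhs_rows, rhs_rows, predicate=None):
    """Join dict rows on all shared keys, then apply an optional predicate."""
    result = []
    for a in lhs_rows:
        for b in rhs_rows:
            common = a.keys() & b.keys()
            # Implicit equality constraint on columns present in both rows.
            if all(a[key] == b[key] for key in common):
                merged = {**a, **b}
                if predicate is None or predicate(merged):
                    result.append(merged)
    return result


visits = [{"visit": 1, "band": "g"}, {"visit": 2, "band": "r"}]
exposures = [{"visit": 1, "exp": 10}, {"visit": 1, "exp": 11}, {"visit": 2, "exp": 20}]
joined = toy_join(visits, exposures, predicate=lambda row: row["exp"] != 11)

# A join identity has exactly one row and no columns.
join_identity = [{}]
unchanged = toy_join(join_identity, exposures)
```

In this toy model, joining against the join identity reproduces the other operand's rows, which is consistent with the documented behavior of returning the other relation directly.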
 - 
materialized(name: str | None = None, *, name_prefix: str = 'materialization') → Relation
- Return a new relation that indicates that this relation’s payload should be cached after it is first processed.

This is a convenience method that constructs and applies a Materialization operation.

Parameters:
- name : str, optional
- Name to use for the cached payload within the engine (e.g. the name for a temporary table in SQL). If not provided, a name will be created via a call to Engine.get_relation_name.
- name_prefix : str, optional
- Prefix to pass to Engine.get_relation_name; ignored if name is provided. Unlike most operations, Materialization relations are locked by default, since they reflect user intent to mark a specific tree as cacheable.

Returns:
- relation : Relation
- New relation that marks its upstream tree for caching. May be self if it is already a LeafRelation or another materialization (in which case the given name or name prefix will be ignored).
 - 
sorted(terms: Sequence[SortTerm], *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) → Relation
- Return a new relation that sorts rows according to a sequence of column expressions.

This is a convenience method that constructs and applies a Sort operation.

Parameters:
- terms : Sequence[SortTerm]
- Ordered sequence of column expressions to sort on, with whether to apply them in ascending or descending order.
- preferred_engine : Engine, optional
- Engine that the operation would ideally be performed in. If this is not equal to self.engine, the backtrack, transfer, and require_preferred_engine arguments control the behavior.
- backtrack : bool, optional
- If True (default) and the current engine is not the preferred engine, attempt to insert this sort before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If True (False is default) and the current engine is not the preferred engine, insert a new Transfer before the Sort. If backtrack is also true, the transfer is added only if the backtrack attempt fails.
- require_preferred_engine : bool, optional
- If True (False is default) and the current engine is not the preferred engine, raise EngineError. If backtrack is also true, the exception is only raised if the backtrack attempt fails. Ignored if transfer is true.

Returns:
- relation : Relation
- New relation with sorted rows. Will be self if terms is empty. If self is already a sort operation relation, the operations will be merged by concatenating their terms, which may result in duplicate sort terms that have no effect.
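To make the idea of an ordered sequence of sort terms concrete, here is a small sketch on plain dict rows. It models each term as a hypothetical (column, ascending) pair rather than a real SortTerm, and it shows why a duplicate term appended after an earlier term on the same column has no effect. The order in which the library concatenates merged terms is not stated here, so the sketch makes no claim about it.

```python
def toy_sorted(rows, terms):
    """Sort dict rows by (column, ascending) terms, first term most significant."""
    result = list(rows)
    # Python's sort is stable, so apply the least significant term first.
    for column, ascending in reversed(terms):
        result.sort(key=lambda row: row[column], reverse=not ascending)
    return result


rows = [{"a": 1, "b": 2}, {"a": 2, "b": 1}, {"a": 1, "b": 1}]
terms = [("a", True), ("b", False)]

ordered = toy_sorted(rows, terms)
# A later term on a column that an earlier term already orders changes nothing.
ordered_with_duplicate = toy_sorted(rows, terms + [("a", False)])
unsorted_copy = toy_sorted(rows, [])
```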
 - 
transferred_to(destination: Engine) → Relation
- Return a new relation that transfers this relation to a new engine.

This is a convenience method that constructs and applies a Transfer operation.

Parameters:
- destination : Engine
- Engine for the new relation.

Returns:
- relation : Relation
- New relation in the given engine. Will be self if self.engine == destination.
 - 
with_calculated_column(tag: ColumnTag, expression: ColumnExpression, *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) → Relation
- Return a new relation that adds a calculated column to this one.

This is a convenience method that constructs and applies a Calculation operation.

Parameters:
- tag : ColumnTag
- Identifier for the new column.
- expression : ColumnExpression
- Expression used to populate the new column.
- preferred_engine : Engine, optional
- Engine that the operation would ideally be performed in. If this is not equal to self.engine, the backtrack, transfer, and require_preferred_engine arguments control the behavior.
- backtrack : bool, optional
- If True (default) and the current engine is not the preferred engine, attempt to insert this calculation before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If True (False is default) and the current engine is not the preferred engine, insert a new Transfer before the Calculation. If backtrack is also true, the transfer is added only if the backtrack attempt fails.
- require_preferred_engine : bool, optional
- If True (False is default) and the current engine is not the preferred engine, raise EngineError. If backtrack is also true, the exception is only raised if the backtrack attempt fails. Ignored if transfer is true.

Returns:
- relation : Relation
- Relation that contains the calculated column.

Raises:
- ColumnError
- Raised if the expression requires columns that are not present in self.columns, or if tag is already present in self.columns.
- EngineError
- Raised if require_preferred_engine=True and it was impossible to insert this operation in the preferred engine, or if the expression was not supported by the engine.
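The interplay of preferred_engine, backtrack, transfer, and require_preferred_engine described above can be summarized as a small decision function. Everything in this sketch is hypothetical: plan_operation, ToyEngineError, the string results, and the backtrack_succeeds flag (which stands in for whether an upstream insertion point exists). The final fallback, applying the operation in the current engine when nothing else is requested, is an assumption inferred from the parameter descriptions.

```python
class ToyEngineError(RuntimeError):
    """Hypothetical stand-in for EngineError."""


def plan_operation(
    current_engine,
    preferred_engine=None,
    *,
    backtrack=True,
    transfer=False,
    require_preferred_engine=False,
    backtrack_succeeds=False,
):
    """Return a label for where a unary operation would be applied."""
    if preferred_engine is None or preferred_engine == current_engine:
        return "apply in current engine"
    # Backtracking is tried first; transfer and raising are fallbacks.
    if backtrack and backtrack_succeeds:
        return "insert upstream of an existing transfer"
    if transfer:
        return "transfer to preferred engine, then apply"
    if require_preferred_engine:
        raise ToyEngineError("Could not apply operation in the preferred engine.")
    # Assumption: otherwise the operation stays in the current engine.
    return "apply in current engine"


same_engine = plan_operation("sql", "sql")
backtracked = plan_operation("iteration", "sql", backtrack_succeeds=True)
transferred = plan_operation("iteration", "sql", transfer=True, require_preferred_engine=True)
fallback = plan_operation("iteration", "sql")
```

Note that in this sketch transfer=True takes precedence over require_preferred_engine=True, matching the statement that the latter is ignored if transfer is true.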
 - 
with_only_columns(columns: Set[ColumnTag], *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) → Relation
- Return a new relation whose columns are a subset of this relation’s.

This is a convenience method that constructs and applies a Projection operation.

Parameters:
- columns : Set[ColumnTag]
- Columns to be propagated to the new relation; must be a subset of self.columns.
- preferred_engine : Engine, optional
- Engine that the operation would ideally be performed in. If this is not equal to self.engine, the backtrack, transfer, and require_preferred_engine arguments control the behavior.
- backtrack : bool, optional
- If True (default) and the current engine is not the preferred engine, attempt to insert this projection before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If True (False is default) and the current engine is not the preferred engine, insert a new Transfer before the Projection. If backtrack is also true, the transfer is added only if the backtrack attempt fails.
- require_preferred_engine : bool, optional
- If True (False is default) and the current engine is not the preferred engine, raise EngineError. If backtrack is also true, the exception is only raised if the backtrack attempt fails. Ignored if transfer is true.

Returns:
- relation : Relation
- New relation with only the given columns. Will be self if columns == self.columns.

Raises:
- ColumnError
- Raised if columns is not a subset of self.columns.
- EngineError
- Raised if require_preferred_engine=True and it was impossible to insert this operation in the preferred engine.
 - 
with_rows_satisfying(predicate: Predicate, *, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) → Relation
- Return a new relation that filters out rows via a boolean expression.

This is a convenience method that constructs and applies a Selection operation.

Parameters:
- predicate : Predicate
- Boolean expression that evaluates to True for rows that should be included and False for rows that should be filtered out.
- preferred_engine : Engine, optional
- Engine that the operation would ideally be performed in. If this is not equal to self.engine, the backtrack, transfer, and require_preferred_engine arguments control the behavior.
- backtrack : bool, optional
- If True (default) and the current engine is not the preferred engine, attempt to insert this selection before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If True (False is default) and the current engine is not the preferred engine, insert a new Transfer before the Selection. If backtrack is also true, the transfer is added only if the backtrack attempt fails.
- require_preferred_engine : bool, optional
- If True (False is default) and the current engine is not the preferred engine, raise EngineError. If backtrack is also true, the exception is only raised if the backtrack attempt fails. Ignored if transfer is true.

Returns:
- relation : Relation
- New relation with only the rows that satisfy the given predicate. May be self if the predicate is trivially True.

Raises:
- ColumnError
- Raised if predicate.columns_required is not a subset of self.columns.
- EngineError
- Raised if require_preferred_engine=True and it was impossible to insert this operation in the preferred engine, or if the expression was not supported by the engine.
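The predicate convention (True keeps a row, False drops it) can be shown on plain dict rows. The helper toy_with_rows_satisfying is hypothetical and uses an ordinary Python callable in place of a real Predicate.

```python
def toy_with_rows_satisfying(rows, predicate):
    """Keep only the rows for which the predicate evaluates to True."""
    return [row for row in rows if predicate(row)]


rows = [{"visit": 1, "band": "g"}, {"visit": 2, "band": "r"}, {"visit": 3, "band": "g"}]
g_band_only = toy_with_rows_satisfying(rows, lambda row: row["band"] == "g")

# A predicate that is always True leaves the rows unchanged.
everything = toy_with_rows_satisfying(rows, lambda row: True)
```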
 - 
without_duplicates(*, preferred_engine: Engine | None = None, backtrack: bool = True, transfer: bool = False, require_preferred_engine: bool = False) → Relation
- Return a new relation that removes any duplicate rows from this one.

This is a convenience method that constructs and applies a Deduplication operation.

Parameters:
- preferred_engine : Engine, optional
- Engine that the operation would ideally be performed in. If this is not equal to self.engine, the backtrack, transfer, and require_preferred_engine arguments control the behavior.
- backtrack : bool, optional
- If True (default) and the current engine is not the preferred engine, attempt to insert this deduplication before a transfer upstream of the current relation, as long as this can be done without breaking up any locked relations or changing the resulting relation content.
- transfer : bool, optional
- If True (False is default) and the current engine is not the preferred engine, insert a new Transfer before the Deduplication. If backtrack is also true, the transfer is added only if the backtrack attempt fails.
- require_preferred_engine : bool, optional
- If True (False is default) and the current engine is not the preferred engine, raise EngineError. If backtrack is also true, the exception is only raised if the backtrack attempt fails. Ignored if transfer is true.

Returns:
- relation : Relation
- Relation with no duplicate rows. This may be self if it can be determined that there is no duplication already, but this is not guaranteed.

Raises:
- EngineError
- Raised if require_preferred_engine=True and it was impossible to insert this operation in the preferred engine.
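At the row level, deduplication can be sketched as follows. The helper toy_without_duplicates is hypothetical; it assumes rows are dicts with hashable, mutually comparable keys and hashable values, and it keeps the first occurrence of each row, which is a choice of this sketch rather than documented behavior.

```python
def toy_without_duplicates(rows):
    """Drop repeated dict rows, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for row in rows:
        # Sorted items give an order-independent, hashable fingerprint.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


rows = [{"visit": 1, "band": "g"}, {"band": "g", "visit": 1}, {"visit": 2, "band": "r"}]
deduplicated = toy_without_duplicates(rows)
```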
 