DataCoordinateQueryResults
-
class lsst.daf.butler.registry.queries.DataCoordinateQueryResults(db: lsst.daf.butler.registry.interfaces._database.Database, query: lsst.daf.butler.registry.queries._query.Query, *, records: Optional[Mapping[str, Mapping[tuple, lsst.daf.butler.core.dimensions._records.DimensionRecord]]] = None)
Bases: lsst.daf.butler.DataCoordinateIterable
An enhanced implementation of DataCoordinateIterable that represents data IDs retrieved from a database query.
Parameters:
- db : Database
  Database engine used to execute queries.
- query : Query
  Low-level representation of the query that backs this result object.
- records : Mapping, optional
  A nested mapping containing DimensionRecord objects for all dimensions and all data IDs this query will yield. If None (default), DataCoordinateIterable.hasRecords will return False. The outer mapping has str keys (the names of dimension elements). The inner mapping has tuple keys representing data IDs (tuple conversions of DataCoordinate.values()) and DimensionRecord values.
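The nesting described for records can be sketched with plain Python containers. This is only an illustration of the shape: FakeRecord, the dimension names, and the key values are hypothetical stand-ins and are not part of lsst.daf.butler.

```python
from typing import Dict, Mapping, NamedTuple, Tuple


class FakeRecord(NamedTuple):
    """Hypothetical stand-in for DimensionRecord (illustration only)."""

    name: str
    detail: str


# Outer keys: dimension element names (str).
# Inner keys: data ID values converted to tuples.
records: Dict[str, Dict[Tuple, FakeRecord]] = {
    "instrument": {
        ("HypoCam",): FakeRecord("HypoCam", "made-up instrument"),
    },
    "detector": {
        ("HypoCam", 1): FakeRecord("1", "made-up detector"),
        ("HypoCam", 2): FakeRecord("2", "made-up detector"),
    },
}


def lookup(
    records: Mapping[str, Mapping[tuple, FakeRecord]],
    element: str,
    values: tuple,
) -> FakeRecord:
    """Fetch the record for one dimension element and one data ID tuple."""
    return records[element][values]


found = lookup(records, "detector", ("HypoCam", 2))
```

A two-level lookup like this is what makes it cheap to attach records to each data ID once the mapping has been built.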
Notes
Constructing an instance of this does nothing; the query is not executed until it is iterated over (or some other operation is performed that involves iteration).
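The deferred-execution behavior can be illustrated with a minimal sketch. LazyResults and fake_query are hypothetical names invented for this example; the sketch shows only the general pattern of storing a query and running it on iteration, not how the butler implements it.

```python
from typing import Callable, Iterator, List, Tuple


class LazyResults:
    """Hypothetical stand-in for a lazy query-results object."""

    def __init__(self, run_query: Callable[[], List[Tuple]]) -> None:
        # Only store the callable; do not run it yet.
        self._run_query = run_query

    def __iter__(self) -> Iterator[Tuple]:
        # The query runs here, once per iteration pass.
        return iter(self._run_query())


calls: List[str] = []


def fake_query() -> List[Tuple]:
    """Pretend to hit the database and log that it happened."""
    calls.append("executed")
    return [("HypoCam", 1), ("HypoCam", 2)]


results = LazyResults(fake_query)
calls_after_construction = len(calls)  # construction ran nothing
rows = list(results)  # iteration triggers execution
calls_after_iteration = len(calls)
```

In this sketch, iterating a second time would run the query again, which is the cost that materializing results is meant to avoid.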
Instances should generally only be constructed by Registry methods or the methods of other query result objects.
Attributes Summary
graph  The dimensions identified by these data IDs (DimensionGraph).
universe  The universe that defines all known dimensions compatible with this iterable (DimensionUniverse).
Methods Summary
constrain(query, columns, …)  Constrain a SQL query to include or relate to only data IDs in this iterable.
expanded()  Return a results object for which hasRecords returns True.
findDatasets(datasetType, collections, …)  Find datasets using the data IDs identified by this query.
fromScalar(dataId)  Return a DataCoordinateIterable containing the single data ID given.
hasFull()  Return whether all data IDs in this iterable identify all dimensions, not just required dimensions.
hasRecords()  Return whether all data IDs in this iterable contain DimensionRecord instances.
materialize()  Insert this query’s results into a temporary table.
subset(graph, *, unique)  Return a results object containing a subset of the dimensions of this one, and/or a unique near-subset of its rows.
toSequence()  Transform this iterable into a DataCoordinateSequence.
toSet()  Transform this iterable into a DataCoordinateSet.
Attributes Documentation
-
graph
The dimensions identified by these data IDs (DimensionGraph).
-
universe
The universe that defines all known dimensions compatible with this iterable (DimensionUniverse).
Methods Documentation
-
constrain(query: lsst.daf.butler.core.simpleQuery.SimpleQuery, columns: Callable[[str], sqlalchemy.sql.elements.ColumnElement]) → None
Constrain a SQL query to include or relate to only data IDs in this iterable.
Parameters:
- query : SimpleQuery
  Struct that represents the SQL query to constrain, either by appending to its WHERE clause, joining a new table or subquery, or both.
- columns : Callable
  A callable that accepts str dimension names and returns SQLAlchemy objects representing a column for that dimension’s primary key value in the query.
-
expanded() → lsst.daf.butler.registry.queries._results.DataCoordinateQueryResults
Return a results object for which hasRecords returns True.
This method may involve actually executing database queries to fetch DimensionRecord objects.
Returns:
- results : DataCoordinateQueryResults
  A results object for which hasRecords returns True. May be self if that is already the case.
Notes
For very large result sets, it may be much more efficient to call materialize before calling expanded, to avoid performing the original query multiple times (as a subquery) in the follow-up queries that fetch dimension records. For example:

    with registry.queryDataIds(...).materialize() as tempDataIds:
        dataIdsWithRecords = tempDataIds.expanded()
        for dataId in dataIdsWithRecords:
            ...
-
findDatasets(datasetType: Union[lsst.daf.butler.core.datasets.type.DatasetType, str], collections: Any, *, findFirst: bool = True) → lsst.daf.butler.registry.queries._results.ParentDatasetQueryResults
Find datasets using the data IDs identified by this query.
Parameters:
- datasetType : DatasetType or str
  Dataset type or the name of one to search for. Must have dimensions that are a subset of self.graph.
- collections : Any
  An expression that fully or partially identifies the collections to search for the dataset, such as a str, re.Pattern, or iterable thereof. ... can be used to return all collections. See Collection expressions for more information.
- findFirst : bool, optional
  If True (default), for each result data ID, only yield one DatasetRef, from the first collection in which a dataset of that dataset type appears (according to the order of collections passed in). If True, collections must not contain regular expressions and may not be ....
Returns:
- datasets : ParentDatasetQueryResults
  A lazy-evaluation object representing dataset query results, iterable over DatasetRef objects. If self.hasRecords(), all nested data IDs in those dataset references will have records as well.
Raises:
- ValueError
  Raised if datasetType.dimensions.issubset(self.graph) is False.
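The effect of findFirst can be sketched with ordinary dictionaries. The function find_datasets, the collection names, and the dataset labels below are hypothetical and exist only for this illustration; the sketch models the ordered search described above, not the actual query the butler issues.

```python
from typing import Dict, Iterator, Sequence, Tuple

DataId = Tuple  # hypothetical: a data ID reduced to a plain tuple


def find_datasets(
    data_ids: Sequence[DataId],
    collections: Sequence[str],
    contents: Dict[str, Dict[DataId, str]],
    find_first: bool = True,
) -> Iterator[Tuple[DataId, str, str]]:
    """Yield (data ID, collection, dataset label) matches.

    ``contents`` maps collection name -> data ID -> dataset label.
    """
    for data_id in data_ids:
        for collection in collections:  # search order matters
            dataset = contents.get(collection, {}).get(data_id)
            if dataset is not None:
                yield data_id, collection, dataset
                if find_first:
                    break  # stop at the first collection that has it


contents = {
    "run_b": {(1,): "b1"},
    "run_a": {(1,): "a1", (2,): "a2"},
}
order = ["run_b", "run_a"]
first_only = list(find_datasets([(1,), (2,)], order, contents))
all_matches = list(find_datasets([(1,), (2,)], order, contents, find_first=False))
```

Because the first match depends on the order of the collections, an unordered expression such as a regular expression or ... cannot define which dataset is "first".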
-
static fromScalar(dataId: lsst.daf.butler.core.dimensions._coordinate.DataCoordinate) → lsst.daf.butler.core.dimensions._dataCoordinateIterable._ScalarDataCoordinateIterable
Return a DataCoordinateIterable containing the single data ID given.
Parameters:
- dataId : DataCoordinate
  Data ID to adapt. Must be a true DataCoordinate instance, not an arbitrary mapping. No runtime checking is performed.
Returns:
- iterable : DataCoordinateIterable
  A DataCoordinateIterable instance of unspecified (i.e. implementation-detail) subclass. Guaranteed to implement the collections.abc.Sized (i.e. __len__) and collections.abc.Container (i.e. __contains__) interfaces as well as that of DataCoordinateIterable.
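What the Sized and Container guarantees mean in practice can be shown with a small sketch. ScalarIterable is a hypothetical class written for this example and is not the subclass the butler returns; it only demonstrates a one-element iterable that supports len() and the in operator.

```python
from collections.abc import Container, Sized
from typing import Iterator, Tuple


class ScalarIterable:
    """Hypothetical single-element iterable (not the butler class)."""

    def __init__(self, data_id: Tuple) -> None:
        self._data_id = data_id

    def __iter__(self) -> Iterator[Tuple]:
        # Yield the one wrapped data ID.
        yield self._data_id

    def __len__(self) -> int:
        # Sized: there is always exactly one element.
        return 1

    def __contains__(self, item: object) -> bool:
        # Container: membership is a single equality check.
        return item == self._data_id


scalar = ScalarIterable(("HypoCam", 1))
is_sized = isinstance(scalar, Sized)
is_container = isinstance(scalar, Container)
```

Both abstract base classes recognize any class that defines the corresponding special method, so no explicit inheritance is needed in this sketch.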
-
hasFull() → bool
Return whether all data IDs in this iterable identify all dimensions, not just required dimensions.
Returns:
-
hasRecords() → bool
Return whether all data IDs in this iterable contain DimensionRecord instances.
Returns:
-
materialize() → Iterator[lsst.daf.butler.registry.queries._results.DataCoordinateQueryResults]
Insert this query’s results into a temporary table.
Returns:
- context : typing.ContextManager[DataCoordinateQueryResults]
  A context manager that ensures the temporary table is created and populated in __enter__ (returning a results object backed by that table), and dropped in __exit__. If self is already materialized, the context manager may do nothing (reflecting the fact that an outer context manager should already take care of everything else).
Notes
When using a very large result set to perform multiple queries (e.g. multiple calls to subset with different arguments, or even a single call to expanded), it may be much more efficient to start by materializing the query and only then performing the follow-up queries. It may also be less efficient, depending on how well the database engine’s query optimizer can simplify those particular follow-up queries and how efficiently it caches query results even when they are not explicitly inserted into a temporary table. See expanded and subset for examples.
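The create-on-enter, drop-on-exit pattern can be sketched with the standard-library sqlite3 module. This is a simplified illustration under stated assumptions: the materialize function, the table names, and the sample rows below are hypothetical, and the sketch does not reproduce how the butler's Database layer manages temporary tables.

```python
import sqlite3
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def materialize(
    conn: sqlite3.Connection, select_sql: str, name: str = "tmp_results"
) -> Iterator[str]:
    """Store the rows of ``select_sql`` in a temporary table for the block."""
    # Entering: create and populate the temporary table.
    conn.execute(f"CREATE TEMPORARY TABLE {name} AS {select_sql}")
    try:
        yield name  # callers query this table instead of re-running select_sql
    finally:
        # Exiting: drop the table even if the block raised.
        conn.execute(f"DROP TABLE {name}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exposure (instrument TEXT, detector INTEGER)")
conn.executemany(
    "INSERT INTO exposure VALUES (?, ?)",
    [("HypoCam", 1), ("HypoCam", 1), ("HypoCam", 2)],
)

with materialize(conn, "SELECT instrument, detector FROM exposure") as table:
    # Two follow-up queries reuse the stored rows.
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchall()[0][0]
    distinct = conn.execute(
        f"SELECT COUNT(*) FROM (SELECT DISTINCT instrument, detector FROM {table})"
    ).fetchall()[0][0]

# After the block the temporary table no longer exists.
remaining = conn.execute(
    "SELECT COUNT(*) FROM sqlite_temp_master WHERE name = 'tmp_results'"
).fetchall()[0][0]
```

Whether this is faster than repeating the original query as a subquery depends on the database engine, as the note above explains.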
-
subset(graph: Optional[lsst.daf.butler.core.dimensions._graph.DimensionGraph] = None, *, unique: bool = False) → lsst.daf.butler.registry.queries._results.DataCoordinateQueryResults
Return a results object containing a subset of the dimensions of this one, and/or a unique near-subset of its rows.
This method may involve actually executing database queries to fetch DimensionRecord objects.
Parameters:
- graph : DimensionGraph, optional
  Dimensions to include in the new results object. If None, self.graph is used.
- unique : bool, optional
  If True (False is default), the query should only return unique data IDs. This is implemented in the database; to obtain unique results via Python-side processing (which may be more efficient in some cases), use toSet to construct a DataCoordinateSet from this results object instead.
Returns:
- results : DataCoordinateQueryResults
  A results object corresponding to the given criteria. May be self if it already qualifies.
Notes
This method can only return a “near-subset” of the original result rows in general because of subtleties in how spatial overlaps are implemented; see Query.subset for more information.
When calling subset multiple times on the same very large result set, it may be much more efficient to call materialize first. For example:

    dimensions1 = DimensionGraph(...)
    dimensions2 = DimensionGraph(...)
    with registry.queryDataIds(...).materialize() as tempDataIds:
        for dataId1 in tempDataIds.subset(
                graph=dimensions1,
                unique=True):
            ...
        for dataId2 in tempDataIds.subset(
                graph=dimensions2,
                unique=True):
            ...
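The difference between database-side and Python-side uniqueness can be sketched with sqlite3 and a built-in set. The table, column names, and rows are hypothetical; SELECT DISTINCT stands in for unique=True and set() stands in for toSet, as an analogy rather than a description of the butler's SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_ids (instrument TEXT, visit INTEGER, detector INTEGER)"
)
conn.executemany(
    "INSERT INTO data_ids VALUES (?, ?, ?)",
    [("HypoCam", 10, 1), ("HypoCam", 10, 2), ("HypoCam", 11, 1)],
)

# Projecting onto fewer dimensions can introduce duplicate rows.
projected = conn.execute("SELECT instrument, visit FROM data_ids").fetchall()

# Option 1: deduplicate in the database (analogous to unique=True).
unique_in_db = conn.execute(
    "SELECT DISTINCT instrument, visit FROM data_ids"
).fetchall()

# Option 2: deduplicate in Python after fetching (analogous to toSet()).
unique_in_python = set(projected)
```

Both options yield the same data IDs here; they differ in where the work happens and in how many rows cross the database connection.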
-
toSequence() → lsst.daf.butler.core.dimensions._dataCoordinateIterable.DataCoordinateSequence
Transform this iterable into a DataCoordinateSequence.
Returns:
- seq : DataCoordinateSequence
  A new DataCoordinateSequence with the same elements as self, in the same order. May be self if it is already a DataCoordinateSequence.
-
toSet() → lsst.daf.butler.core.dimensions._dataCoordinateIterable.DataCoordinateSet
Transform this iterable into a DataCoordinateSet.
Returns:
- set : DataCoordinateSet
  A DataCoordinateSet instance with the same elements as self, after removing any duplicates. May be self if it is already a DataCoordinateSet.