Registry¶
- class lsst.daf.butler.Registry¶
Bases: ABC
Abstract Registry interface.
All subclasses should store RegistryDefaults in a _defaults property. No other properties are assumed shared between implementations.
Attributes Summary
- defaults: Default collection search path and/or output RUN collection (RegistryDefaults).
- dimensions: Definitions of all dimensions recognized by this Registry (DimensionUniverse).
- obsCoreTableManager: The ObsCore manager instance for this registry (ObsCoreTableManager or None).
Methods Summary
- associate(collection, refs): Add existing datasets to a TAGGED collection.
- caching_context(): Context manager that enables caching.
- certify(collection, refs, timespan): Associate one or more datasets with a calibration collection and a validity range within it.
- decertify(collection, datasetType, timespan, *): Remove or adjust datasets to clear a validity range within a calibration collection.
- disassociate(collection, refs): Remove existing datasets from a TAGGED collection.
- expandDataId([dataId, dimensions, graph, ...]): Expand a dimension-based data ID to include additional information.
- findDataset(datasetType[, dataId, ...]): Find a dataset given its DatasetType and data ID.
- getCollectionChain(parent): Return the child collections in a CHAINED collection.
- getCollectionDocumentation(collection): Retrieve the documentation string for a collection.
- getCollectionParentChains(collection): Return the CHAINED collections that directly contain the given one.
- getCollectionSummary(collection): Return a summary for the given collection.
- getCollectionType(name): Return an enumeration value indicating the type of the given collection.
- getDataset(id): Retrieve a Dataset entry.
- getDatasetLocations(ref): Retrieve datastore locations for a given dataset.
- getDatasetType(name): Get the DatasetType.
- insertDatasets(datasetType, dataIds[, run, ...]): Insert one or more datasets into the Registry.
- insertDimensionData(element, *data[, ...]): Insert one or more dimension records into the database.
- isWriteable(): Return True if this registry allows write operations, and False otherwise.
- queryCollections([expression, datasetType, ...]): Iterate over the collections whose names match an expression.
- queryDataIds(dimensions, *[, dataId, ...]): Query for data IDs matching user-provided criteria.
- queryDatasetAssociations(datasetType[, ...]): Iterate over dataset-collection combinations where the dataset is in the collection.
- queryDatasetTypes([expression, components, ...]): Iterate over the dataset types whose names match an expression.
- queryDatasets(datasetType, *[, collections, ...]): Query for and iterate over dataset references matching user-provided criteria.
- queryDimensionRecords(element, *[, dataId, ...]): Query for dimension information matching user-provided criteria.
- refresh(): Refresh all in-memory state by querying the database.
- registerCollection(name[, type, doc]): Add a new collection if one with the given name does not exist.
- registerDatasetType(datasetType): Add a new DatasetType to the Registry.
- registerRun(name[, doc]): Add a new run if one with the given name does not exist.
- removeCollection(name): Remove the given collection from the registry.
- removeDatasetType(name): Remove the named DatasetType from the registry.
- removeDatasets(refs): Remove datasets from the Registry.
- resetConnectionPool(): Reset connection pool for registry if relevant.
- setCollectionChain(parent, children, *[, ...]): Define or redefine a CHAINED collection.
- setCollectionDocumentation(collection, doc): Set the documentation string for a collection.
- supportsIdGenerationMode(mode): Test whether the given dataset ID generation mode is supported by insertDatasets.
- syncDimensionData(element, row[, conform, ...]): Synchronize the given dimension record with the database, inserting if it does not already exist and comparing values if it does.
- transaction(*[, savepoint]): Return a context manager that represents a transaction.
Attributes Documentation
- defaults¶
Default collection search path and/or output RUN collection (RegistryDefaults).
This is an immutable struct whose components may not be set individually, but the entire struct can be set by assigning to this property.
- dimensions¶
Definitions of all dimensions recognized by this Registry (DimensionUniverse).
- obsCoreTableManager¶
The ObsCore manager instance for this registry (ObsCoreTableManager or None).
The ObsCore manager may not be implemented for all registry backends, or may not be enabled for many repositories.
Methods Documentation
- abstract associate(collection: str, refs: Iterable[DatasetRef]) -> None¶
Add existing datasets to a TAGGED collection.
If a DatasetRef with the same exact ID is already in the collection, nothing is changed. If a DatasetRef with the same DatasetType and data ID but with a different ID exists in the collection, ConflictingDefinitionError is raised.
- Parameters:
- collection : str
Indicates the collection the datasets should be associated with.
- refs : Iterable[DatasetRef]
An iterable of resolved DatasetRef instances that already exist in this Registry.
- Raises:
- lsst.daf.butler.registry.ConflictingDefinitionError
Raised if a dataset with the same DatasetType and data ID but a different ID already exists in the given collection.
- lsst.daf.butler.registry.MissingCollectionError
Raised if collection does not exist in the registry.
- lsst.daf.butler.registry.CollectionTypeError
Raised if adding new datasets to the given collection is not allowed.
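The ID rules above can be illustrated with a small pure-Python sketch. It is a toy model rather than the registry implementation: the dictionary layout, the tuple form of a ref, and the local ConflictingDefinitionError class are all hypothetical stand-ins.

```python
class ConflictingDefinitionError(Exception):
    """Local stand-in for the registry exception of the same name."""


def associate(tagged, refs):
    """Add refs to a toy TAGGED collection.

    ``tagged`` maps (dataset_type, data_id) -> dataset_id.
    ``refs`` is an iterable of (dataset_id, dataset_type, data_id) tuples.
    """
    refs = list(refs)
    # Check every ref before changing anything, so a conflict leaves the
    # collection untouched (an assumption of this sketch).
    for dataset_id, dataset_type, data_id in refs:
        existing = tagged.get((dataset_type, data_id))
        if existing is not None and existing != dataset_id:
            raise ConflictingDefinitionError(
                f"{dataset_type} {data_id} is already tagged with {existing}"
            )
    for dataset_id, dataset_type, data_id in refs:
        # Re-adding the same ID is a no-op.
        tagged[(dataset_type, data_id)] = dataset_id


tagged = {}
bias = ("id-1", "bias", (("detector", 1),))
associate(tagged, [bias])
associate(tagged, [bias])  # same ID again: nothing changes
```

Passing a ref with the same dataset type and data ID but a different ID (for example "id-2") raises the stand-in ConflictingDefinitionError in this model.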
- abstract caching_context() -> AbstractContextManager[None]¶
Context manager that enables caching.
- abstract certify(collection: str, refs: Iterable[DatasetRef], timespan: Timespan) -> None¶
Associate one or more datasets with a calibration collection and a validity range within it.
- Parameters:
- collection : str
The name of an already-registered CALIBRATION collection.
- refs : Iterable[DatasetRef]
Datasets to be associated.
- timespan : Timespan
The validity range for these datasets within the collection.
- Raises:
- lsst.daf.butler.AmbiguousDatasetError
Raised if any of the given DatasetRef instances is unresolved.
- lsst.daf.butler.registry.ConflictingDefinitionError
Raised if the collection already contains a different dataset with the same DatasetType and data ID and an overlapping validity range.
- lsst.daf.butler.registry.CollectionTypeError
Raised if collection is not a CALIBRATION collection or if one or more datasets are of a dataset type for which DatasetType.isCalibration returns False.
- abstract decertify(collection: str, datasetType: str | DatasetType, timespan: Timespan, *, dataIds: Iterable[DataCoordinate | Mapping[str, Any]] | None = None) -> None¶
Remove or adjust datasets to clear a validity range within a calibration collection.
- Parameters:
- collection : str
The name of an already-registered CALIBRATION collection.
- datasetType : str or DatasetType
Name or DatasetType instance for the datasets to be decertified.
- timespan : Timespan
The validity range to remove datasets from within the collection. Datasets that overlap this range but are not contained by it will have their validity ranges adjusted to not overlap it, which may split a single dataset validity range into two.
- dataIds : iterable [dict or DataCoordinate], optional
Data IDs that should be decertified within the given validity range. If None, all data IDs for datasetType will be decertified.
- Raises:
- lsst.daf.butler.registry.CollectionTypeError
Raised if collection is not a CALIBRATION collection or if datasetType.isCalibration() is False.
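The range adjustment described for timespan can be sketched with plain interval arithmetic. This is only an illustration: it assumes half-open (begin, end) tuples of comparable values in place of Timespan objects, and clear_range is a hypothetical helper, not part of the registry API.

```python
def clear_range(valid, cleared):
    """Return the parts of the half-open interval ``valid`` that lie
    outside the half-open interval ``cleared``.

    The result has zero, one, or two intervals.
    """
    v_begin, v_end = valid
    c_begin, c_end = cleared
    if c_end <= v_begin or v_end <= c_begin:
        return [valid]  # no overlap: the validity range is untouched
    remaining = []
    if v_begin < c_begin:
        remaining.append((v_begin, c_begin))  # part before the cleared range
    if c_end < v_end:
        remaining.append((c_end, v_end))  # part after the cleared range
    return remaining


split = clear_range((0, 10), (3, 6))      # cleared range strictly inside
trimmed = clear_range((0, 10), (5, 20))   # cleared range overlaps the end
removed = clear_range((0, 10), (0, 10))   # cleared range covers everything
```

In this model a range that fully contains the cleared interval is split in two, a partial overlap is trimmed, and a range contained by the cleared interval disappears.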
- abstract disassociate(collection: str, refs: Iterable[DatasetRef]) -> None¶
Remove existing datasets from a TAGGED collection.
collection and ref combinations that are not currently associated are silently ignored.
- Parameters:
- collection : str
The collection the datasets should no longer be associated with.
- refs : Iterable[DatasetRef]
An iterable of resolved DatasetRef instances that already exist in this Registry.
- Raises:
- lsst.daf.butler.AmbiguousDatasetError
Raised if any of the given dataset references is unresolved.
- lsst.daf.butler.registry.MissingCollectionError
Raised if collection does not exist in the registry.
- lsst.daf.butler.registry.CollectionTypeError
Raised if removing datasets from the given collection is not allowed.
- abstract expandDataId(dataId: DataCoordinate | Mapping[str, Any] | None = None, *, dimensions: Iterable[str] | DimensionGroup | DimensionGraph | None = None, graph: DimensionGraph | None = None, records: NamedKeyMapping[DimensionElement, DimensionRecord | None] | Mapping[str, DimensionRecord | None] | None = None, withDefaults: bool = True, **kwargs: Any) -> DataCoordinate¶
Expand a dimension-based data ID to include additional information.
- Parameters:
- dataId : DataCoordinate or dict, optional
Data ID to be expanded; augmented and overridden by kwargs.
- dimensions : Iterable[str], DimensionGroup, or DimensionGraph, optional
The dimensions to be identified by the new DataCoordinate. If not provided, will be inferred from the keys of dataId and **kwargs.
- graph : DimensionGraph, optional
Like dimensions, but as a DimensionGraph instance. Ignored if dimensions is provided. Deprecated and will be removed after v27.
- records : Mapping[str, DimensionRecord], optional
Dimension record data to use before querying the database for that data, keyed by element name.
- withDefaults : bool, optional
Utilize self.defaults.dataId to fill in missing governor dimension key-value pairs. Defaults to True (i.e. defaults are used).
- **kwargs
Additional keywords are treated like additional key-value pairs for dataId, extending and overriding.
- Returns:
- expanded : DataCoordinate
A data ID that includes full metadata for all of the dimensions it identifies, i.e. guarantees that expanded.hasRecords() and expanded.hasFull() both return True.
- Raises:
- lsst.daf.butler.registry.DataIdError
Raised when dataId or keyword arguments specify unknown dimensions or values, or when a resulting data ID contains contradictory key-value pairs, according to dimension relationships.
Notes
This method cannot be relied upon to reject invalid data ID values for dimensions that do not actually have any record columns. For efficiency reasons the records for these dimensions (which have only dimension key values that are given by the caller) may be constructed directly rather than obtained from the registry database.
- abstract findDataset(datasetType: DatasetType | str, dataId: DataCoordinate | Mapping[str, Any] | None = None, *, collections: str | Pattern | Iterable[str | Pattern] | ellipsis | CollectionWildcard | None = None, timespan: Timespan | None = None, datastore_records: bool = False, **kwargs: Any) -> DatasetRef | None¶
Find a dataset given its DatasetType and data ID.
This can be used to obtain a DatasetRef that permits the dataset to be read from a Datastore. If the dataset is a component and cannot be found using the provided dataset type, a dataset ref for the parent will be returned instead but with the correct dataset type.
- Parameters:
- datasetType : DatasetType or str
A DatasetType or the name of one. If this is a DatasetType instance, its storage class will be respected and propagated to the output, even if it differs from the dataset type definition in the registry, as long as the storage classes are convertible.
- dataId : dict or DataCoordinate, optional
A dict-like object containing the Dimension links that identify the dataset within a collection.
- collections : collection expression, optional
An expression that fully or partially identifies the collections to search for the dataset; see Collection expressions for more information. Defaults to self.defaults.collections.
- timespan : Timespan, optional
A timespan that the validity range of the dataset must overlap. If not provided, any CALIBRATION collections matched by the collections argument will not be searched.
- datastore_records : bool, optional
Whether to attach datastore records to the DatasetRef.
- **kwargs
Additional keyword arguments passed to DataCoordinate.standardize to convert dataId to a true DataCoordinate or augment an existing one.
- Returns:
- ref : DatasetRef
A reference to the dataset, or None if no matching Dataset was found.
- Raises:
- lsst.daf.butler.registry.NoDefaultCollectionError
Raised if collections is None and self.defaults.collections is None.
- LookupError
Raised if one or more data ID keys are missing.
- lsst.daf.butler.registry.MissingDatasetTypeError
Raised if the dataset type does not exist.
- lsst.daf.butler.registry.MissingCollectionError
Raised if any of collections does not exist in the registry.
Notes
This method simply returns None and does not raise an exception even when the set of collections searched is intrinsically incompatible with the dataset type, e.g. if datasetType.isCalibration() is False, but only CALIBRATION collections are being searched. This may make it harder to debug some lookup failures, but the behavior is intentional; we consider it more important that failed searches are reported consistently, regardless of the reason, and that adding additional collections that do not contain a match to the search path never changes the behavior.
This method handles component dataset types automatically, though most other registry operations do not.
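The consistency property described in the Notes can be seen in a toy model of an ordered collection search. This is a sketch under stated assumptions, not the registry implementation: find_dataset and the nested-dictionary layout are hypothetical, and data IDs are reduced to plain integers.

```python
def find_dataset(contents, dataset_type, data_id, collections):
    """Search ``collections`` in order and return the first match, or None.

    ``contents`` maps collection name -> {(dataset_type, data_id): ref}.
    """
    for name in collections:
        # An unknown collection name raises KeyError in this toy model.
        ref = contents[name].get((dataset_type, data_id))
        if ref is not None:
            return ref
    return None  # a failed search is always reported the same way


contents = {
    "runA": {("flat", 1): "flat@runA"},
    "runB": {("flat", 1): "flat@runB"},
    "empty": {},
}
first = find_dataset(contents, "flat", 1, ["runB", "runA"])
padded = find_dataset(contents, "flat", 1, ["empty", "runA"])
missing = find_dataset(contents, "bias", 1, ["runA", "runB"])
```

Adding the collection "empty" to the search path does not change what is found, and a search for a dataset type that no collection holds yields None rather than an exception.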
- abstract getCollectionChain(parent: str) -> Sequence[str]¶
Return the child collections in a CHAINED collection.
- Parameters:
- parent : str
Name of the chained collection. Must have already been added via a call to Registry.registerCollection.
- Returns:
- Raises:
- abstract getCollectionDocumentation(collection: str) -> str | None¶
Retrieve the documentation string for a collection.
- abstract getCollectionParentChains(collection: str) -> set[str]¶
Return the CHAINED collections that directly contain the given one.
- abstract getCollectionSummary(collection: str) -> CollectionSummary¶
Return a summary for the given collection.
- Parameters:
- collection : str
Name of the collection for which a summary is to be retrieved.
- Returns:
- summary : CollectionSummary
Summary of the dataset types and governor dimension values in this collection.
- abstract getCollectionType(name: str) -> CollectionType¶
Return an enumeration value indicating the type of the given collection.
- Parameters:
- name : str
The name of the collection.
- Returns:
- type : CollectionType
Enum value indicating the type of this collection.
- Raises:
- lsst.daf.butler.registry.MissingCollectionError
Raised if no collection with the given name exists.
- abstract getDataset(id: UUID) -> DatasetRef | None¶
Retrieve a Dataset entry.
- Parameters:
- id : DatasetId
The unique identifier for the dataset.
- Returns:
- ref : DatasetRef or None
A ref to the Dataset, or None if no matching Dataset was found.
- abstract getDatasetLocations(ref: DatasetRef) -> Iterable[str]¶
Retrieve datastore locations for a given dataset.
- Parameters:
- ref : DatasetRef
A reference to the dataset for which to retrieve storage information.
- Returns:
- Raises:
- lsst.daf.butler.AmbiguousDatasetError
Raised if ref.id is None.
- abstract getDatasetType(name: str) -> DatasetType¶
Get the DatasetType.
- Parameters:
- name : str
Name of the type.
- Returns:
- type : DatasetType
The DatasetType associated with the given name.
- Raises:
- lsst.daf.butler.registry.MissingDatasetTypeError
Raised if the requested dataset type has not been registered.
Notes
This method handles component dataset types automatically, though most other registry operations do not.
- abstract insertDatasets(datasetType: DatasetType | str, dataIds: Iterable[DataCoordinate | Mapping[str, Any]], run: str | None = None, expand: bool = True, idGenerationMode: DatasetIdGenEnum = DatasetIdGenEnum.UNIQUE) -> list[DatasetRef]¶
Insert one or more datasets into the Registry.
This always adds new datasets; to associate existing datasets with a new collection, use associate.
- Parameters:
- datasetType : DatasetType or str
A DatasetType or the name of one.
- dataIds : Iterable of dict or DataCoordinate
Dimension-based identifiers for the new datasets.
- run : str, optional
The name of the run that produced the datasets. Defaults to self.defaults.run.
- expand : bool, optional
If True (default), expand data IDs as they are inserted. This is necessary in general to allow datastore to generate file templates, but it may be disabled if the caller can guarantee this is unnecessary.
- idGenerationMode : DatasetIdGenEnum, optional
Specifies option for generating dataset IDs. By default unique IDs are generated for each inserted dataset.
- Returns:
- refs : list of DatasetRef
Resolved DatasetRef instances for all given data IDs (in the same order).
- Raises:
- lsst.daf.butler.registry.DatasetTypeError
Raised if datasetType is not known to registry.
- lsst.daf.butler.registry.CollectionTypeError
Raised if run collection type is not RUN.
- lsst.daf.butler.registry.NoDefaultCollectionError
- lsst.daf.butler.registry.ConflictingDefinitionError
If a dataset with the same dataset type and data ID as one of those given already exists in run.
- lsst.daf.butler.registry.MissingCollectionError
Raised if run does not exist in the registry.
- abstract insertDimensionData(element: DimensionElement | str, *data: Mapping[str, Any] | DimensionRecord, conform: bool = True, replace: bool = False, skip_existing: bool = False) -> None¶
Insert one or more dimension records into the database.
- Parameters:
- element : DimensionElement or str
The DimensionElement or name thereof that identifies the table records will be inserted into.
- *data : dict or DimensionRecord
One or more records to insert.
- conform : bool, optional
If False (True is default) perform no checking or conversions, and assume that element is a DimensionElement instance and data is one or more DimensionRecord instances of the appropriate subclass.
- replace : bool, optional
If True (False is default), replace existing records in the database if there is a conflict.
- skip_existing : bool, optional
If True (False is default), skip insertion if a record with the same primary key values already exists. Unlike syncDimensionData, this will not detect when the given record differs from what is in the database, and should not be used when this is a concern.
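The difference between replace and skip_existing can be illustrated with a dictionary standing in for a dimension table. This is a toy model only: insert_records is a hypothetical helper, raising KeyError on an unhandled conflict is an assumption (the text above does not say what the real method does in that case), and so is giving replace precedence when both flags are set.

```python
def insert_records(table, records, replace=False, skip_existing=False):
    """Insert records into ``table``, a dict keyed by each record's "id"."""
    for record in records:
        key = record["id"]
        if key not in table:
            table[key] = dict(record)
        elif replace:
            table[key] = dict(record)  # overwrite the stored record
        elif skip_existing:
            continue  # keep the stored record, even if it differs
        else:
            # Assumed behavior for this sketch only.
            raise KeyError(f"record {key!r} already exists")


table = {1: {"id": 1, "name": "old"}}
insert_records(table, [{"id": 1, "name": "new"}], skip_existing=True)
after_skip = table[1]["name"]   # the differing record went unnoticed
insert_records(table, [{"id": 1, "name": "new"}], replace=True)
after_replace = table[1]["name"]
```

The skip_existing call leaves the stored record untouched even though the incoming record differs, which is the situation the parameter description warns about.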
- abstract isWriteable() -> bool¶
Return True if this registry allows write operations, and False otherwise.
- abstract queryCollections(expression: Any = Ellipsis, datasetType: DatasetType | None = None, collectionTypes: Iterable[CollectionType] | CollectionType = frozenset({CollectionType.RUN, CollectionType.TAGGED, CollectionType.CHAINED, CollectionType.CALIBRATION}), flattenChains: bool = False, includeChains: bool | None = None) -> Sequence[str]¶
Iterate over the collections whose names match an expression.
- Parameters:
- expression : collection expression, optional
An expression that identifies the collections to return, such as a str (for full matches or partial matches via globs), re.Pattern (for partial matches), or iterable thereof. ... can be used to return all collections, and is the default. See Collection expressions for more information.
- datasetType : DatasetType, optional
If provided, only yield collections that may contain datasets of this type. This is a conservative approximation in general; it may yield collections that do not have any such datasets.
- collectionTypes : Set[CollectionType] or CollectionType, optional
If provided, only yield collections of these types.
- flattenChains : bool, optional
If True (False is default), recursively yield the child collections of matching CHAINED collections.
- includeChains : bool, optional
If True, yield records for matching CHAINED collections. Default is the opposite of flattenChains: include either CHAINED collections or their children, but not both.
- Returns:
- Raises:
- lsst.daf.butler.registry.CollectionExpressionError
Raised when expression is invalid.
Notes
The order in which collections are returned is unspecified, except that the children of a CHAINED collection are guaranteed to be in the order in which they are searched. When multiple parent CHAINED collections match the same criteria, the order in which the two lists appear is unspecified, and the lists of children may be incomplete if a child has multiple parents.
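The interaction between flattenChains and includeChains can be sketched with a toy model. It is an illustration under stated assumptions rather than the registry implementation: query_collections is hypothetical, expression matching is assumed to have already produced the list of matching names, and dropping duplicates is a choice made for this sketch.

```python
def query_collections(matches, chains, flatten_chains=False, include_chains=None):
    """Return collection names from ``matches`` under the chain flags.

    ``matches`` is the list of names already selected by the expression.
    ``chains`` maps each CHAINED collection to its ordered children.
    """
    if include_chains is None:
        # Default is the opposite of flatten_chains.
        include_chains = not flatten_chains
    result = []
    seen = set()

    def visit(name):
        is_chain = name in chains
        if (include_chains or not is_chain) and name not in seen:
            seen.add(name)
            result.append(name)
        if is_chain and flatten_chains:
            for child in chains[name]:  # children keep their search order
                visit(child)

    for name in matches:
        visit(name)
    return result


chains = {"defaults": ["run2", "calib", "run1"]}
only_chain = query_collections(["defaults"], chains)
only_children = query_collections(["defaults"], chains, flatten_chains=True)
both = query_collections(
    ["defaults"], chains, flatten_chains=True, include_chains=True
)
```

With the defaults only the chain itself is returned; with flatten_chains=True only its children are returned, in search order; setting both flags returns the chain followed by its children.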
- abstract queryDataIds(dimensions: DimensionGroup | Iterable[Dimension | str] | Dimension | str, *, dataId: DataCoordinate | Mapping[str, Any] | None = None, datasets: Any = None, collections: str | Pattern | Iterable[str | Pattern] | ellipsis | CollectionWildcard | None = None, where: str = '', components: bool = False, bind: Mapping[str, Any] | None = None, check: bool = True, **kwargs: Any) -> DataCoordinateQueryResults¶
Query for data IDs matching user-provided criteria.
- Parameters:
- dimensions : DimensionGroup, Dimension, or str, or Iterable[Dimension or str]
The dimensions of the data IDs to yield, as either Dimension instances or str. Will be automatically expanded to a complete DimensionGroup. Passing Dimension instances is deprecated and will not be supported after v27.
- dataId : dict or DataCoordinate, optional
A data ID whose key-value pairs are used as equality constraints in the query.
- datasets : dataset type expression, optional
An expression that fully or partially identifies dataset types that should constrain the yielded data IDs. For example, including “raw” here would constrain the yielded instrument, exposure, detector, and physical_filter values to only those for which at least one “raw” dataset exists in collections. Allowed types include DatasetType, str, and iterables thereof. Regular expression objects (i.e. re.Pattern) are deprecated and will be removed after the v26 release. See DatasetType expressions for more information.
- collections : collection expression, optional
An expression that identifies the collections to search for datasets, such as a str (for full matches or partial matches via globs), re.Pattern (for partial matches), or iterable thereof. ... can be used to search all collections (actually just all RUN collections, because this will still find all datasets). If not provided, self.defaults.collections is used. Ignored unless datasets is also passed. See Collection expressions for more information.
- where : str, optional
A string expression similar to a SQL WHERE clause. May involve any column of a dimension table or (as a shortcut for the primary key column of a dimension table) dimension name. See Dimension expressions for more information.
- components : bool, optional
Must be False. Provided only for backwards compatibility. After v27 this argument will be removed entirely.
- bind : Mapping, optional
Mapping containing literal values that should be injected into the where expression, keyed by the identifiers they replace. Values of collection type can be expanded in some cases; see Identifiers for more information.
- check : bool, optional
If True (default) check the query for consistency before executing it. This may reject some valid queries that resemble common mistakes (e.g. queries for visits without specifying an instrument).
- **kwargs
Additional keyword arguments are forwarded to DataCoordinate.standardize when processing the dataId argument (and may be used to provide a constraining data ID even when the dataId argument is None).
- Returns:
- dataIds : queries.DataCoordinateQueryResults
Data IDs matching the given query parameters. These are guaranteed to identify all dimensions (DataCoordinate.hasFull returns True), but will not contain DimensionRecord objects (DataCoordinate.hasRecords returns False). Call expanded on the returned object to fetch those (and consider using materialize on the returned object first if the expected number of rows is very large). See documentation for those methods for additional information.
- Raises:
- lsst.daf.butler.registry.NoDefaultCollectionError
Raised if collections is None and self.defaults.collections is None.
- lsst.daf.butler.registry.CollectionExpressionError
Raised when collections expression is invalid.
- lsst.daf.butler.registry.DataIdError
Raised when dataId or keyword arguments specify unknown dimensions or values, or when they contain inconsistent values.
- lsst.daf.butler.registry.DatasetTypeExpressionError
Raised when datasets expression is invalid.
- lsst.daf.butler.registry.UserExpressionError
Raised when where expression is invalid.
- abstract queryDatasetAssociations(datasetType: str | DatasetType, collections: str | Pattern | Iterable[str | Pattern] | ellipsis | CollectionWildcard | None = Ellipsis, *, collectionTypes: Iterable[CollectionType] = frozenset({CollectionType.RUN, CollectionType.TAGGED, CollectionType.CHAINED, CollectionType.CALIBRATION}), flattenChains: bool = False) -> Iterator[DatasetAssociation]¶
Iterate over dataset-collection combinations where the dataset is in the collection.
This method is a temporary placeholder for better support for association results in queryDatasets. It will probably be removed in the future, and should be avoided in production code whenever possible.
- Parameters:
- datasetType : DatasetType or str
A dataset type object or the name of one.
- collections : collection expression, optional
An expression that identifies the collections to search for datasets, such as a str (for full matches or partial matches via globs), re.Pattern (for partial matches), or iterable thereof. ... can be used to search all collections (actually just all RUN collections, because this will still find all datasets). If not provided, self.defaults.collections is used. See Collection expressions for more information.
- collectionTypes : Set[CollectionType], optional
If provided, only yield associations from collections of these types.
- flattenChains : bool, optional
If True, search in the children of CHAINED collections. If False, CHAINED collections are ignored.
- Yields:
- association : DatasetAssociation
Object representing the relationship between a single dataset and a single collection.
- Raises:
- abstract queryDatasetTypes(expression: Any = Ellipsis, *, components: bool = False, missing: list[str] | None = None) -> Iterable[DatasetType]¶
Iterate over the dataset types whose names match an expression.
- Parameters:
- expression : dataset type expression, optional
An expression that fully or partially identifies the dataset types to return, such as a str, re.Pattern, or iterable thereof. ... can be used to return all dataset types, and is the default. See DatasetType expressions for more information.
- components : bool, optional
Must be False. Provided only for backwards compatibility. After v27 this argument will be removed entirely.
- missing : list of str, optional
String dataset type names that were explicitly given (i.e. not regular expression patterns) but not found will be appended to this list, if it is provided.
- Returns:
- dataset_types : Iterable[DatasetType]
An Iterable of DatasetType instances whose names match expression.
- Raises:
- lsst.daf.butler.registry.DatasetTypeExpressionError
Raised when expression is invalid.
- abstract queryDatasets(datasetType: Any, *, collections: str | Pattern | Iterable[str | Pattern] | ellipsis | CollectionWildcard | None = None, dimensions: Iterable[Dimension | str] | None = None, dataId: DataCoordinate | Mapping[str, Any] | None = None, where: str = '', findFirst: bool = False, components: bool = False, bind: Mapping[str, Any] | None = None, check: bool = True, **kwargs: Any) -> DatasetQueryResults¶
Query for and iterate over dataset references matching user-provided criteria.
- Parameters:
- datasetType : dataset type expression
An expression that fully or partially identifies the dataset types to be queried. Allowed types include DatasetType, str, re.Pattern, and iterables thereof. The special value ... can be used to query all dataset types. See DatasetType expressions for more information.
- collections : collection expression, optional
An expression that identifies the collections to search, such as a str (for full matches or partial matches via globs), re.Pattern (for partial matches), or iterable thereof. ... can be used to search all collections (actually just all RUN collections, because this will still find all datasets). If not provided, self.defaults.collections is used. See Collection expressions for more information.
- dimensions : Iterable of Dimension or str
Dimensions to include in the query (in addition to those used to identify the queried dataset type(s)), either to constrain the resulting datasets to those for which a matching dimension exists, or to relate the dataset type’s dimensions to dimensions referenced by the dataId or where arguments.
- dataId : dict or DataCoordinate, optional
A data ID whose key-value pairs are used as equality constraints in the query.
- where : str, optional
A string expression similar to a SQL WHERE clause. May involve any column of a dimension table or (as a shortcut for the primary key column of a dimension table) dimension name. See Dimension expressions for more information.
- findFirst : bool, optional
If True (False is default), for each result data ID, only yield one DatasetRef of each DatasetType, from the first collection in which a dataset of that dataset type appears (according to the order of collections passed in). If True, collections must not contain regular expressions and may not be ....
- components : bool, optional
Must be False. Provided only for backwards compatibility. After v27 this argument will be removed entirely.
- bind : Mapping, optional
Mapping containing literal values that should be injected into the where expression, keyed by the identifiers they replace. Values of collection type can be expanded in some cases; see Identifiers for more information.
- check : bool, optional
If True (default) check the query for consistency before executing it. This may reject some valid queries that resemble common mistakes (e.g. queries for visits without specifying an instrument).
- **kwargs
Additional keyword arguments are forwarded to DataCoordinate.standardize when processing the dataId argument (and may be used to provide a constraining data ID even when the dataId argument is None).
- Returns:
- refs : queries.DatasetQueryResults
Dataset references matching the given query criteria. Nested data IDs are guaranteed to include values for all implied dimensions (i.e. DataCoordinate.hasFull will return True), but will not include dimension records (DataCoordinate.hasRecords will be False) unless expanded is called on the result object (which returns a new one).
- Raises:
- lsst.daf.butler.registry.DatasetTypeExpressionError
Raised when datasetType expression is invalid.
- TypeError
Raised when the arguments are incompatible, such as when a collection wildcard is passed when findFirst is True, or when collections is None and self.defaults.collections is also None.
- lsst.daf.butler.registry.DataIdError
Raised when dataId or keyword arguments specify unknown dimensions or values, or when they contain inconsistent values.
- lsst.daf.butler.registry.UserExpressionError
Raised when where expression is invalid.
Notes
When multiple dataset types are queried in a single call, the results of this operation are equivalent to querying for each dataset type separately in turn, and no information about the relationships between datasets of different types is included. In contexts where that kind of information is important, the recommended pattern is to use queryDataIds to first obtain data IDs (possibly with the desired dataset types and collections passed as constraints to the query), and then use multiple (generally much simpler) calls to queryDatasets with the returned data IDs passed as constraints.
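The effect of findFirst can be sketched with a toy model that scans collections in the order given. This is not the registry implementation: query_datasets and the nested-dictionary layout are hypothetical, a single dataset type is queried, and data IDs are reduced to plain integers.

```python
def query_datasets(contents, dataset_type, collections, find_first=False):
    """Return refs of ``dataset_type`` found in ``collections``, in order.

    ``contents`` maps collection name -> {(dataset_type, data_id): ref}.
    With ``find_first`` only the ref from the first collection that holds
    each data ID is kept.
    """
    results = []
    seen = set()
    for name in collections:
        for (dtype, data_id), ref in contents[name].items():
            if dtype != dataset_type:
                continue
            if find_first and data_id in seen:
                continue  # an earlier collection already supplied this data ID
            seen.add(data_id)
            results.append(ref)
    return results


contents = {
    "rerun": {("calexp", 1): "calexp-1@rerun"},
    "base": {("calexp", 1): "calexp-1@base", ("calexp", 2): "calexp-2@base"},
}
everything = query_datasets(contents, "calexp", ["rerun", "base"])
first_only = query_datasets(contents, "calexp", ["rerun", "base"], find_first=True)
```

Without find_first both copies of data ID 1 are returned; with it, the copy in "base" is shadowed by the one in "rerun", which is why the order of collections matters and why wildcards are not allowed in that mode.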
- abstract queryDimensionRecords(element: DimensionElement | str, *, dataId: DataCoordinate | Mapping[str, Any] | None = None, datasets: Any = None, collections: str | Pattern | Iterable[str | Pattern] | ellipsis | CollectionWildcard | None = None, where: str = '', components: bool = False, bind: Mapping[str, Any] | None = None, check: bool = True, **kwargs: Any) DimensionRecordQueryResults¶
Query for dimension information matching user-provided criteria.
- Parameters:
- element
DimensionElement or str The dimension element to obtain records for.
- dataId
dict or DataCoordinate, optional A data ID whose key-value pairs are used as equality constraints in the query.
- datasets
dataset type expression, optional An expression that fully or partially identifies dataset types that should constrain the yielded records. See queryDataIds and DatasetType expressions for more information.
- collections
collection expression, optional An expression that identifies the collections to search for datasets, such as a str (for full matches or partial matches via globs), re.Pattern (for partial matches), or iterable thereof. ... can be used to search all collections (actually just all RUN collections, because this will still find all datasets). If not provided, self.defaults.collections is used. Ignored unless datasets is also passed. See Collection expressions for more information.
- where
str, optional A string expression similar to a SQL WHERE clause. See queryDataIds and Dimension expressions for more information.
- components
bool, optional Must be False. Provided only for backwards compatibility. After v27 this argument will be removed entirely.
- bind
Mapping, optional Mapping containing literal values that should be injected into the where expression, keyed by the identifiers they replace. Values of collection type can be expanded in some cases; see Identifiers for more information.
- check
bool, optional If True (default), check the query for consistency before executing it. This may reject some valid queries that resemble common mistakes (e.g. queries for visits without specifying an instrument).
- **kwargs
Additional keyword arguments are forwarded to DataCoordinate.standardize when processing the dataId argument (and may be used to provide a constraining data ID even when the dataId argument is None).
- Returns:
- dataIds
queries.DimensionRecordQueryResults Dimension records matching the given query parameters.
- Raises:
- lsst.daf.butler.registry.NoDefaultCollectionError
Raised if collections is None and self.defaults.collections is None.
- lsst.daf.butler.registry.CollectionExpressionError
Raised when the collections expression is invalid.
- lsst.daf.butler.registry.DataIdError
Raised when dataId or keyword arguments specify unknown dimensions or values, or when they contain inconsistent values.
- lsst.daf.butler.registry.DatasetTypeExpressionError
Raised when the datasets expression is invalid.
- lsst.daf.butler.registry.UserExpressionError
Raised when the where expression is invalid.
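A short sketch of a typical call may help here. It assumes that "detector" and "instrument" are dimension names defined in the registry's dimension universe, which depends on the repository configuration; detector_records is a hypothetical helper, not part of the butler API.

```python
def detector_records(registry, instrument, where="", bind=None):
    """Return the detector dimension records for one instrument.

    Hypothetical helper; the dimension names "detector" and
    "instrument" are assumptions about the dimension universe.
    """
    # The ``instrument=...`` keyword is one of the ``**kwargs`` that are
    # forwarded to data ID standardization, so it acts as an equality
    # constraint even though no ``dataId`` argument is passed.
    results = registry.queryDimensionRecords(
        "detector",
        where=where,
        bind=bind,
        instrument=instrument,
    )
    # Materialize the lazy result object into a plain list.
    return list(results)
```

A call such as detector_records(registry, "HSC", where="detector > min_id", bind={"min_id": 100}) would then pass the literal 100 through bind instead of embedding it in the expression string. The instrument name and the identifier min_id are illustrative only.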
- abstract refresh() -> None
Refresh all in-memory state by querying the database.
This may be necessary to enable querying for entities added by other registry instances after this one was constructed.
- abstract registerCollection(name: str, type: CollectionType = CollectionType.TAGGED, doc: str | None = None) -> bool
Add a new collection if one with the given name does not exist.
- Parameters:
- name
str The name of the collection to create.
- type
CollectionType Enum value indicating the type of collection to create.
- doc
str, optional Documentation string for the collection.
- Returns:
- registered
bool Boolean indicating whether the collection was already registered or was created by this call.
Notes
This method cannot be called within transactions, because it must run its own transaction to remain safe under concurrent use.
- abstract registerDatasetType(datasetType: DatasetType) -> bool
Add a new DatasetType to the Registry.
It is not an error to register the same DatasetType twice.
- Parameters:
- datasetType
DatasetType The DatasetType to be added.
- Returns:
- inserted
bool True if datasetType was inserted, False if an identical existing DatasetType was found. Note that in either case the DatasetType is guaranteed to be defined in the Registry consistently with the given definition.
- Raises:
- ValueError
Raised if the dimensions or storage class are invalid.
- lsst.daf.butler.registry.ConflictingDefinitionError
Raised if this DatasetType is already registered with a different definition.
Notes
This method cannot be called within transactions, because it must run its own transaction to remain safe under concurrent use.
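Because registering the same definition twice is not an error, a setup routine can call registerDatasetType for every type it needs and use the return value to report which ones were new. The sketch below assumes each dataset type object has a name attribute; register_all is a hypothetical helper.

```python
def register_all(registry, dataset_types):
    """Register dataset types and return the names that were newly inserted.

    Hypothetical helper; it must be called outside any registry
    transaction, as required by ``registerDatasetType``.
    """
    inserted = []
    for dataset_type in dataset_types:
        # True means the definition was inserted by this call; False
        # means an identical definition was already registered.
        if registry.registerDatasetType(dataset_type):
            inserted.append(dataset_type.name)
    return inserted
```

A conflicting definition is not handled here; in that case the documented ConflictingDefinitionError propagates to the caller.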
- abstract registerRun(name: str, doc: str | None = None) -> bool
Add a new run if one with the given name does not exist.
- Parameters:
- Returns:
Notes
This method cannot be called within transactions, because it must run its own transaction to remain safe under concurrent use.
- abstract removeCollection(name: str) -> None
Remove the given collection from the registry.
- Parameters:
- name
str The name of the collection to remove.
- Raises:
- lsst.daf.butler.registry.MissingCollectionError
Raised if no collection with the given name exists.
- sqlalchemy.exc.IntegrityError
Raised if the database rows associated with the collection are still referenced by some other table, such as a dataset in a datastore (for RUN collections only) or a CHAINED collection of which this collection is a child.
Notes
If this is a RUN collection, all datasets and quanta in it will be removed from the Registry database. This requires that those datasets be removed (or at least trashed) from any datastores that hold them first.
A collection may not be deleted as long as it is referenced by a CHAINED collection; the CHAINED collection must be deleted or redefined first.
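The requirement to redefine parent chains first can be sketched with the other Registry methods listed on this page. The helper name unchain_and_remove is hypothetical, and the sketch assumes that a CHAINED collection may be redefined with a shorter (possibly empty) list of children.

```python
def unchain_and_remove(registry, name):
    """Drop ``name`` from every chain that contains it, then remove it.

    Hypothetical helper. For RUN collections the caller is still
    responsible for removing the datasets from any datastores first.
    """
    # Redefine each CHAINED collection that directly contains ``name``
    # so that it no longer references it.
    for parent in registry.getCollectionParentChains(name):
        children = [
            child for child in registry.getCollectionChain(parent) if child != name
        ]
        registry.setCollectionChain(parent, children)
    # With no parent chains left, the collection itself can be removed.
    registry.removeCollection(name)
```

This is a destructive operation, so a real script would likely confirm the list of affected chains with the user before redefining them.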
- abstract removeDatasetType(name: str | tuple[str, ...]) -> None
Remove the named DatasetType from the registry.
Warning
Registry implementations can cache the dataset type definitions. This means that deleting the dataset type definition may result in unexpected behavior from other active butler processes that have not seen the deletion.
- Parameters:
- Raises:
- lsst.daf.butler.registry.OrphanedRecordError
Raised if an attempt is made to remove the dataset type definition when there are already datasets associated with it.
Notes
If the dataset type is not registered the method will return without action.
- abstract removeDatasets(refs: Iterable[DatasetRef]) -> None
Remove datasets from the Registry.
The datasets will be removed unconditionally from all collections, and any Quantum that consumed this dataset will instead be marked as having a NULL input. Datastore records will not be deleted; the caller is responsible for ensuring that the dataset has already been removed from all Datastores.
- Parameters:
- refs
Iterable[DatasetRef] References to the datasets to be removed. Must include a valid id attribute, and should be considered invalidated upon return.
- Raises:
- resetConnectionPool() -> None
Reset connection pool for registry if relevant.
This operation can be used to reset connections to servers when using registry with fork-based multiprocessing. This method should usually be called by the child process immediately after the fork.
The base class implementation is a no-op.
- abstract setCollectionChain(parent: str, children: Any, *, flatten: bool = False) -> None
Define or redefine a CHAINED collection.
- Parameters:
- parent
str Name of the chained collection. Must have already been added via a call to Registry.registerCollection.
- children
collection expression An expression defining an ordered search of child collections, generally an iterable of str; see Collection expressions for more information.
- flatten
bool, optional If True (False is default), recursively flatten out any nested CHAINED collections in children first.
- Raises:
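As an illustration of redefining a chain, the sketch below moves one child to the front of an existing CHAINED collection so that it is searched first. prepend_to_chain is a hypothetical helper, and it assumes the parent collection has already been registered as CHAINED.

```python
def prepend_to_chain(registry, parent, child):
    """Make ``child`` the first collection searched in ``parent``.

    Hypothetical helper; ``parent`` is assumed to be an existing
    CHAINED collection.
    """
    current = list(registry.getCollectionChain(parent))
    # Put the new child first and drop any earlier occurrence so that
    # the chain does not list the same collection twice.
    new_chain = [child] + [name for name in current if name != child]
    registry.setCollectionChain(parent, new_chain)
    return new_chain
```

Because the children define an ordered search, placing a collection first gives its datasets priority over those in later collections.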
- abstract setCollectionDocumentation(collection: str, doc: str | None) -> None
Set the documentation string for a collection.
- abstract supportsIdGenerationMode(mode: DatasetIdGenEnum) -> bool
Test whether the given dataset ID generation mode is supported by insertDatasets.
- Parameters:
- mode
DatasetIdGenEnum Enum value for the mode to test.
- Returns:
- supported
bool Whether the given mode is supported.
- abstract syncDimensionData(element: DimensionElement | str, row: Mapping[str, Any] | DimensionRecord, conform: bool = True, update: bool = False) -> bool | dict[str, Any]
Synchronize the given dimension record with the database, inserting if it does not already exist and comparing values if it does.
- Parameters:
- element
DimensionElement or str The DimensionElement or name thereof that identifies the table records will be inserted into.
- row
dict or DimensionRecord The record to insert.
- conform
bool, optional If False (True is default), perform no checking or conversions, and assume that element is a DimensionElement instance and row is a DimensionRecord instance of the appropriate subclass.
- update
bool, optional If True (False is default), update the existing record in the database if there is a conflict.
- Returns:
- Raises:
- lsst.daf.butler.registry.ConflictingDefinitionError
Raised if the record exists in the database (according to primary key lookup) but is inconsistent with the given one.