SqlRegistry¶
- 
class lsst.daf.butler.registries.sqlRegistry.SqlRegistry(registryConfig, schemaConfig, dimensionConfig, create=False, butlerRoot=None)¶
- Bases: lsst.daf.butler.Registry - Registry backed by a SQL database. - Parameters: - registryConfig : SqlRegistryConfig or str
- Load configuration 
- schemaConfig : SchemaConfig or str
- Definition of the schema to use. 
- dimensionConfig : DimensionConfig or Config
- DimensionGraph configuration.
- create : bool
- Assume registry is empty and create a new one. 
 - Attributes Summary
- defaultConfigFile - Path to configuration defaults.
 - Methods Summary
- addDataset(datasetType, dataId, run[, …]) - Adds a Dataset entry to the Registry.
- addDatasetLocation(ref, datastoreName) - Add datastore name locating a given dataset.
- addExecution(execution) - Add a new Execution to the Registry.
- addRun(run) - Add a new Run to the Registry.
- associate(collection, refs) - Add existing Datasets to a collection, implicitly creating the collection if it does not already exist.
- attachComponent(name, parent, component) - Attach a component to a dataset.
- deleteOpaqueData(name, **where) - Remove records from an opaque table.
- disassociate(collection, refs) - Remove existing Datasets from a collection.
- ensureRun(run) - Conditionally add a new Run to the Registry.
- expandDataId(dataId, Mapping[str, Any], …) - Expand a dimension-based data ID to include additional information.
- fetchOpaqueData(name, **where) - Retrieve records from an opaque table.
- find(collection, datasetType[, dataId]) - Lookup a dataset.
- fromConfig(registryConfig[, schemaConfig, …]) - Create Registry subclass instance from config.
- getAllCollections() - Get names of all the collections found in this repository.
- getAllDatasetTypes() - Get every registered DatasetType.
- getDataset(id[, datasetType, dataId]) - Retrieve a Dataset entry.
- getDatasetLocations(ref) - Retrieve datastore locations for a given dataset.
- getDatasetType(name) - Get the DatasetType.
- getExecution(id) - Retrieve an Execution.
- getRun([id, collection]) - Get a Run corresponding to its collection or id.
- insertDimensionData(element, str], *data, …) - Insert one or more dimension records into the database.
- insertOpaqueData(name, *data) - Insert records into an opaque table.
- makeQueryBuilder(summary) - Return a QueryBuilder instance capable of constructing and managing more complex queries than those obtainable via Registry interfaces.
- makeRun(collection) - Create a new Run in the Registry and return it.
- query(sql, **params) - Execute a SQL SELECT statement directly.
- queryDatasets(datasetType, str, …) - Query for and iterate over dataset references matching user-provided criteria.
- queryDimensions(dimensions, str]], …) - Query for and iterate over data IDs matching user-provided criteria.
- registerDatasetType(datasetType) - Add a new DatasetType to the Registry.
- registerOpaqueTable(name, spec) - Add an opaque (to the Registry) table for use by a Datastore or other data repository client.
- removeDataset(ref) - Remove a dataset from the Registry.
- removeDatasetLocation(datastoreName, ref) - Remove datastore location associated with this dataset.
- setConfigRoot(root, config, full[, overwrite]) - Set any filesystem-dependent config options for this Registry to be appropriate for a new empty repository with the given root.
- transaction() - Context manager that implements SQL transactions.
 - Attributes Documentation - 
defaultConfigFile= None¶
- Path to configuration defaults. Relative to $DAF_BUTLER_DIR/config or absolute path. Can be None if no defaults specified. 
 - Methods Documentation - 
addDataset(datasetType, dataId, run, producer=None, recursive=False, **kwds)¶
- Adds a Dataset entry to the Registry. This always adds a new Dataset; to associate an existing Dataset with a new collection, use associate. - Parameters: - datasetType : DatasetType or str
- A DatasetType or the name of one.
- dataId : dict or DataCoordinate
- A dict-like object containing the Dimension links that identify the dataset within a collection.
- run : Run
- The Run instance that produced the Dataset. Ignored if producer is passed (producer.run is then used instead). A Run must be provided by one of the two arguments.
- producer : Quantum
- Unit of work that produced the Dataset. May be None to store no provenance information, but if present the Quantum must already have been added to the Registry.
- recursive : bool
- If True, recursively add Dataset and attach entries for component Datasets as well. 
- kwds
- Additional keyword arguments passed to DataCoordinate.standardize to convert dataId to a true DataCoordinate or augment an existing one.
 - Returns: - ref : DatasetRef
- A newly-created DatasetRef instance.
 - Raises: - ConflictingDefinitionError
- If a Dataset with the given DatasetRef already exists in the given collection.
- Exception
- If dataId contains unknown or invalid Dimension entries.
 
 - 
addDatasetLocation(ref, datastoreName)¶
- Add datastore name locating a given dataset. Typically used by Datastore. - Parameters: - ref : DatasetRef
- A reference to the dataset for which to add storage information. 
- datastoreName : str
- Name of the datastore holding this dataset. 
 - Raises: - AmbiguousDatasetError
- Raised if ref.id is None.
 
 - 
addExecution(execution)¶
- Add a new Execution to the Registry. If execution.id is None the Registry will update it to that of the newly inserted entry. - Parameters: - execution : Execution
- Instance to add to the Registry. The given Execution must not already be present in the Registry.
 - Raises: - ConflictingDefinitionError
- If execution is already present in the Registry.
 
 - 
addRun(run)¶
- Add a new Run to the Registry. - Parameters: - Raises: - ConflictingDefinitionError
- If a run already exists with this collection. 
 
 - 
associate(collection, refs)¶
- Add existing Datasets to a collection, implicitly creating the collection if it does not already exist. If a DatasetRef with the same exact dataset_id is already in a collection nothing is changed. If a DatasetRef with the same DatasetType and dimension values but with different dataset_id exists in the collection, ValueError is raised. - Parameters: - collection : str
- Indicates the collection the Datasets should be associated with. 
- refs : iterable of DatasetRef
- An iterable of DatasetRef instances that already exist in this Registry. All component datasets will be associated with the collection as well.
 - Raises: - ConflictingDefinitionError
- If a Dataset with the given DatasetRef already exists in the given collection.
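The two association rules described above (the same dataset_id is a no-op; the same DatasetType and dimension values with a different dataset_id is a conflict) can be sketched with plain dictionaries. This is an illustration only, not the SqlRegistry implementation: the associate function, the ConflictError class, and the tuple-based refs are all hypothetical stand-ins.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical stand-in for a DatasetRef: (dataset_id, dataset_type, data_id).
Ref = Tuple[int, str, Tuple[Tuple[str, object], ...]]


class ConflictError(ValueError):
    """Hypothetical error raised by this sketch for conflicting associations."""


def associate(collections: Dict[str, Dict[tuple, int]], collection: str,
              refs: Iterable[Ref]) -> None:
    """Associate refs with a collection, creating the collection if needed."""
    members = collections.setdefault(collection, {})
    for dataset_id, dataset_type, data_id in refs:
        key = (dataset_type, data_id)
        existing = members.get(key)
        if existing is None:
            members[key] = dataset_id
        elif existing != dataset_id:
            raise ConflictError(
                f"{dataset_type} with data ID {dict(data_id)} is already in "
                f"{collection!r} with a different dataset_id"
            )
        # existing == dataset_id: already associated, so nothing changes.


collections: Dict[str, Dict[tuple, int]] = {}
ref = (1, "calexp", (("visit", 42),))
associate(collections, "demo", [ref])
associate(collections, "demo", [ref])  # repeating the call changes nothing
```

Associating a second dataset with the same type and data ID but a different id would raise ConflictError in this sketch.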
 
 - 
attachComponent(name, parent, component)¶
- Attach a component to a dataset. - Parameters: - name : str
- Name of the component. 
- parent : DatasetRef
- A reference to the parent dataset. Will be updated to reference the component. 
- component : DatasetRef
- A reference to the component dataset. 
 - Raises: - AmbiguousDatasetError
- Raised if parent.id or component.id is None.
 
 - 
deleteOpaqueData(name: str, **where)¶
- Remove records from an opaque table. - Parameters: - name : str
- Logical name of the opaque table. Must match the name used in a previous call to registerOpaqueTable.
- where
- Additional keyword arguments are interpreted as equality constraints that restrict the deleted rows (combined with AND); keyword arguments are column names and values are the values they must have. 
 
 - 
disassociate(collection, refs)¶
- Remove existing Datasets from a collection. collection and ref combinations that are not currently associated are silently ignored. - Parameters: - Raises: - AmbiguousDatasetError
- Raised if any(ref.id is None for ref in refs).
 
 - 
ensureRun(run)¶
- Conditionally add a new Run to the Registry. If the run.id is None or a Run with this id or collection doesn’t exist in the Registry yet, add it. Otherwise, ensure the provided run is identical to the one already in the registry. - Parameters: - run : Run
- Instance to add to the Registry.
 - Raises: - ConflictingDefinitionError
- If run already exists, but is not identical.
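The add-or-verify behavior described above can be sketched as follows. This is one reading of the description, under stated assumptions: the Run dataclass, ConflictError, add_run, and ensure_run below are hypothetical simplifications, and the sequential id assignment is invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Run:
    """Hypothetical, simplified stand-in for a Run record."""
    collection: str
    id: Optional[int] = None


class ConflictError(Exception):
    """Hypothetical error raised by this sketch for conflicting runs."""


def add_run(runs: Dict[str, Run], run: Run) -> None:
    """Add a run, failing if its collection is already taken."""
    if run.collection in runs:
        raise ConflictError(f"A run already exists for {run.collection!r}")
    if run.id is None:
        # Assumption for the sketch: ids are assigned sequentially.
        run.id = max((r.id for r in runs.values()), default=0) + 1
    runs[run.collection] = run


def ensure_run(runs: Dict[str, Run], run: Run) -> None:
    """Add the run if it is new; otherwise require it to be identical."""
    match = None
    if run.id is not None:
        match = runs.get(run.collection) or next(
            (r for r in runs.values() if r.id == run.id), None
        )
    if match is None:
        add_run(runs, run)
    elif match != run:
        raise ConflictError(f"Run {run} differs from existing {match}")


runs: Dict[str, Run] = {}
first = Run("demo")
ensure_run(runs, first)              # new run: added and given an id
ensure_run(runs, Run("demo", id=1))  # identical to the stored run: accepted
```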
 
 - 
expandDataId(dataId: Union[lsst.daf.butler.core.dimensions.coordinate.DataCoordinate, Mapping[str, Any], None] = None, *, graph: Optional[lsst.daf.butler.core.dimensions.graph.DimensionGraph] = None, records: Optional[Mapping[lsst.daf.butler.core.dimensions.elements.DimensionElement, lsst.daf.butler.core.dimensions.records.DimensionRecord]] = None, **kwds)¶
- Expand a dimension-based data ID to include additional information. 
 - 
fetchOpaqueData(name: str, **where) → Iterator[dict]¶
- Retrieve records from an opaque table. - Parameters: - name : str
- Logical name of the opaque table. Must match the name used in a previous call to registerOpaqueTable.
- where
- Additional keyword arguments are interpreted as equality constraints that restrict the returned rows (combined with AND); keyword arguments are column names and values are the values they must have. 
 - Yields: - row : dict
- A dictionary representing a single result row. 
 
 - 
find(collection, datasetType, dataId=None, **kwds)¶
- Lookup a dataset. This can be used to obtain a DatasetRef that permits the dataset to be read from a Datastore. - Parameters: - collection : str
- Identifies the collection to search. 
- datasetType : DatasetType or str
- A DatasetType or the name of one.
- dataId : dict or DataCoordinate, optional
- A dict-like object containing the Dimension links that identify the dataset within a collection.
- kwds
- Additional keyword arguments passed to DataCoordinate.standardize to convert dataId to a true DataCoordinate or augment an existing one.
 - Returns: - ref : DatasetRef
- A ref to the Dataset, or None if no matching Dataset was found.
 - Raises: - LookupError
- If one or more data ID keys are missing. 
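The lookup contract above (keyword arguments augment the data ID, a missing key raises LookupError, and an unmatched lookup returns None) can be sketched without a database. Everything here is a hypothetical stand-in: the REQUIRED_KEYS table, the find function, and the dictionary that plays the role of one collection.

```python
from typing import Any, Dict, Mapping, Optional, Tuple

# Hypothetical: the dimension keys each dataset type requires in a data ID.
REQUIRED_KEYS: Dict[str, Tuple[str, ...]] = {"calexp": ("instrument", "visit")}


def find(contents: Mapping[tuple, int], dataset_type: str,
         data_id: Optional[Mapping[str, Any]] = None,
         **kwds: Any) -> Optional[int]:
    """Return the dataset id stored for the data ID, or None if absent."""
    # Keyword arguments augment (or replace) entries of the data ID.
    merged = {**(data_id or {}), **kwds}
    required = REQUIRED_KEYS[dataset_type]
    missing = [key for key in required if key not in merged]
    if missing:
        raise LookupError(f"Missing data ID keys: {missing}")
    key = (dataset_type,) + tuple(merged[k] for k in required)
    return contents.get(key)


# One collection, mapping (dataset type, instrument, visit) to a dataset id.
collection = {("calexp", "HSC", 42): 7}
found = find(collection, "calexp", {"instrument": "HSC"}, visit=42)
absent = find(collection, "calexp", instrument="HSC", visit=43)
```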
 
 - 
static fromConfig(registryConfig, schemaConfig=None, dimensionConfig=None, create=False, butlerRoot=None)¶
- Create Registry subclass instance from config. Uses registry.cls from config to determine which subclass to instantiate. - Parameters: - registryConfig : ButlerConfig, RegistryConfig, Config or str
- Registry configuration 
- schemaConfig : SchemaConfig, Config or str, optional
- Schema configuration. Can be read from supplied registryConfig if the relevant component is defined and schemaConfig is None.
- dimensionConfig : DimensionConfig or Config or str, optional
- DimensionGraph configuration. Can be read from supplied registryConfig if the relevant component is defined and dimensionConfig is None.
- create : bool
- Assume empty Registry and create a new one. 
 - Returns: - registry : Registry (subclass)
- A new Registry subclass instance.
 
 - 
getAllCollections()¶
- Get names of all the collections found in this repository. - Returns: 
 - 
getAllDatasetTypes()¶
- Get every registered DatasetType. - Returns: - types : frozenset of DatasetType
- Every DatasetType in the registry.
 
 - 
getDataset(id, datasetType=None, dataId=None)¶
- Retrieve a Dataset entry. - Parameters: - id : int
- The unique identifier for the Dataset. 
- datasetType : DatasetType, optional
- The DatasetType of the dataset to retrieve. This is used to short-circuit retrieving the DatasetType, so if provided, the caller is guaranteeing that it is what would have been retrieved.
- dataId : DataCoordinate, optional
- A Dimension-based identifier for the dataset within a collection, possibly containing additional metadata. This is used to short-circuit retrieving the dataId, so if provided, the caller is guaranteeing that it is what would have been retrieved.
 - Returns: - ref : DatasetRef
- A ref to the Dataset, or None if no matching Dataset was found.
 
 - 
getDatasetLocations(ref)¶
- Retrieve datastore locations for a given dataset. Typically used by Datastore. - Parameters: - ref : DatasetRef
- A reference to the dataset for which to retrieve storage information. 
 - Returns: - Raises: - AmbiguousDatasetError
- Raised if ref.id is None.
 
 - 
getDatasetType(name)¶
- Get the DatasetType. - Parameters: - name : str
- Name of the type. 
 - Returns: - type : DatasetType
- The DatasetType associated with the given name.
 - Raises: - KeyError
- Requested named DatasetType could not be found in registry. 
 
 - 
getExecution(id)¶
- Retrieve an Execution. - Parameters: - id : int
- The unique identifier for the Execution. 
 
 - 
getRun(id=None, collection=None)¶
- Get a Run corresponding to its collection or id. - Parameters: - Returns: - run : Run
- The Run instance.
 - Raises: - ValueError
- Must supply one of collection or id.
 
 - 
insertDimensionData(element: Union[lsst.daf.butler.core.dimensions.elements.DimensionElement, str], *data, conform: bool = True)¶
- Insert one or more dimension records into the database. - Parameters: - element : DimensionElement or str
- The DimensionElement or name thereof that identifies the table records will be inserted into.
- data : dict or DimensionRecord (variadic)
- One or more records to insert. 
- conform : bool, optional
- If False (True is default) perform no checking or conversions, and assume that element is a DimensionElement instance and data is one or more DimensionRecord instances of the appropriate subclass.
 
 - 
insertOpaqueData(name: str, *data)¶
- Insert records into an opaque table. - Parameters: - name : str
- Logical name of the opaque table. Must match the name used in a previous call to registerOpaqueTable.
- data
- Each additional positional argument is a dictionary that represents a single row to be added. 
 
 - 
makeQueryBuilder(summary: lsst.daf.butler.core.queries.structs.QuerySummary) → lsst.daf.butler.core.queries.builder.QueryBuilder¶
- Return a QueryBuilder instance capable of constructing and managing more complex queries than those obtainable via Registry interfaces. This is an advanced SqlRegistry-only interface; downstream code should prefer Registry.queryDimensions and Registry.queryDatasets whenever those are sufficient. - Parameters: - summary : QuerySummary
- Object describing and categorizing the full set of dimensions that will be included in the query. 
 - Returns: - builder : QueryBuilder
- Object that can be used to construct and perform advanced queries. 
 
 - 
makeRun(collection)¶
- Create a new Run in the Registry and return it. If a run with this collection already exists, return that instead. - Parameters: - collection : str
- The collection used to identify all inputs and outputs of the Run.
 - Returns: - run : Run
- A new Run instance.
 
 - 
query(sql, **params)¶
- Execute a SQL SELECT statement directly. - Named parameters are specified in the SQL query string by preceding them with a colon. Parameter values are provided as additional keyword arguments. For example: - registry.query("SELECT * FROM instrument WHERE instrument=:name",
- name="HSC")
 - Parameters: - sql : str
- SQL query string. Must be a SELECT statement. 
- **params
- Parameter name-value pairs to insert into the query. 
 - Yields: - row : dict
- The next row result from executing the query. 
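The colon-prefixed named-parameter style described above can be tried against an in-memory SQLite database using only the Python standard library. This is a sketch of the calling convention, not of SqlRegistry itself: the query function below is a hypothetical stand-in, and the instrument table with its detector_max column is invented for the example.

```python
import sqlite3
from typing import Any, Dict, Iterator


def query(connection: sqlite3.Connection, sql: str,
          **params: Any) -> Iterator[Dict[str, Any]]:
    """Yield each row of a SELECT statement as a dict (hypothetical stand-in)."""
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("Only SELECT statements are supported")
    # sqlite3 binds ":name" placeholders from a mapping of parameter values.
    cursor = connection.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        yield dict(zip(columns, row))


connection = sqlite3.connect(":memory:")
# Hypothetical table layout, for illustration only.
connection.execute(
    "CREATE TABLE instrument (instrument TEXT PRIMARY KEY, detector_max INTEGER)"
)
connection.executemany(
    "INSERT INTO instrument VALUES (?, ?)", [("HSC", 200), ("DECam", 100)]
)

rows = list(
    query(connection, "SELECT * FROM instrument WHERE instrument=:name", name="HSC")
)
```

Binding values as parameters, rather than formatting them into the SQL string, leaves quoting to the database driver.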
 
 - 
queryDatasets(datasetType: Union[lsst.daf.butler.core.datasets.type.DatasetType, str, lsst.daf.butler.core.queries.datasets.Like, ellipsis], *, collections: Union[Sequence[Union[str, lsst.daf.butler.core.queries.datasets.Like]], ellipsis], dimensions: Optional[Iterable[Union[lsst.daf.butler.core.dimensions.elements.Dimension, str]]] = None, dataId: Union[lsst.daf.butler.core.dimensions.coordinate.DataCoordinate, Mapping[str, Any], None] = None, where: Optional[str] = None, deduplicate: bool = False, expand: bool = True, **kwds) → Iterator[lsst.daf.butler.core.datasets.ref.DatasetRef]¶
- Query for and iterate over dataset references matching user-provided criteria. - Parameters: - datasetType : DatasetType, str, Like, or ...
- An expression indicating type(s) of datasets to query for. Pass ... to query for all known DatasetTypes. Multiple explicitly-provided dataset types cannot be queried in a single call to queryDatasets even though wildcard expressions can, because the results would be identical to chaining the iterators produced by multiple calls to queryDatasets.
- collections : Sequence of str or Like, or ...
- An expression indicating the collections to be searched for datasets. Pass ... to search all collections.
- dimensions : Iterable of Dimension or str
- Dimensions to include in the query (in addition to those used to identify the queried dataset type(s)), either to constrain the resulting datasets to those for which a matching dimension exists, or to relate the dataset type’s dimensions to dimensions referenced by the dataId or where arguments.
- dataId : dict or DataCoordinate, optional
- A data ID whose key-value pairs are used as equality constraints in the query. 
- where : str, optional
- A string expression similar to a SQL WHERE clause. May involve any column of a dimension table or (as a shortcut for the primary key column of a dimension table) dimension name. 
- deduplicate : bool, optional
- If True (False is default), for each result data ID, only yield one DatasetRef of each DatasetType, from the first collection in which a dataset of that dataset type appears (according to the order of collections passed in). Cannot be used if any element in collections is an expression.
- expand : bool, optional
- If True (default) attach ExpandedDataCoordinate instead of minimal DataCoordinate base-class instances.
- kwds
- Additional keyword arguments are forwarded to DataCoordinate.standardize when processing the dataId argument (and may be used to provide a constraining data ID even when the dataId argument is None).
 - Yields: - ref : DatasetRef
- Dataset references matching the given query criteria. These are grouped by DatasetType if the query evaluates to multiple dataset types, but order is otherwise unspecified.
 - Raises: - TypeError
- Raised when the arguments are incompatible, such as when a collection wildcard is passed when deduplicate is True.
 - Notes - When multiple dataset types are queried via a wildcard expression, the results of this operation are equivalent to querying for each dataset type separately in turn, and no information about the relationships between datasets of different types is included. In contexts where that kind of information is important, the recommended pattern is to use queryDimensions to first obtain data IDs (possibly with the desired dataset types and collections passed as constraints to the query), and then use multiple (generally much simpler) calls to queryDatasets with the returned data IDs passed as constraints.
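The effect of the deduplicate option can be illustrated in plain Python: for each dataset type and data ID, keep only the reference from the earliest collection in the order given. This is a sketch of the documented behavior, not the SQL that SqlRegistry generates; the Ref tuple and the deduplicate function are hypothetical.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Sequence, Tuple


class Ref(NamedTuple):
    """Hypothetical, simplified stand-in for a DatasetRef."""
    dataset_type: str
    data_id: Tuple[Tuple[str, object], ...]
    collection: str


def deduplicate(refs: Iterable[Ref], collections: Sequence[str]) -> Iterator[Ref]:
    """Yield one ref per (dataset type, data ID), preferring earlier collections."""
    rank = {name: position for position, name in enumerate(collections)}
    best: Dict[tuple, Ref] = {}
    for ref in refs:
        if ref.collection not in rank:
            continue  # not in a searched collection
        key = (ref.dataset_type, ref.data_id)
        current = best.get(key)
        if current is None or rank[ref.collection] < rank[current.collection]:
            best[key] = ref
    yield from best.values()


visit_1 = (("visit", 1),)
visit_2 = (("visit", 2),)
candidates = [
    Ref("calexp", visit_1, "shared"),
    Ref("calexp", visit_1, "mine"),
    Ref("calexp", visit_2, "shared"),
]
# "mine" is searched first, so it wins wherever both collections have a match.
unique = list(deduplicate(candidates, ["mine", "shared"]))
```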
 - 
queryDimensions(dimensions: Union[Iterable[Union[lsst.daf.butler.core.dimensions.elements.Dimension, str]], lsst.daf.butler.core.dimensions.elements.Dimension, str], *, dataId: Union[lsst.daf.butler.core.dimensions.coordinate.DataCoordinate, Mapping[str, Any], None] = None, datasets: Optional[Mapping[Union[lsst.daf.butler.core.datasets.type.DatasetType, str, lsst.daf.butler.core.queries.datasets.Like, ellipsis], Union[Sequence[Union[str, lsst.daf.butler.core.queries.datasets.Like]], ellipsis]]] = None, where: Optional[str] = None, expand: bool = True, **kwds) → Iterator[lsst.daf.butler.core.dimensions.coordinate.DataCoordinate]¶
- Query for and iterate over data IDs matching user-provided criteria. - Parameters: - dimensions : Dimension or str, or iterable thereof
- The dimensions of the data IDs to yield, as either Dimension instances or str. Will be automatically expanded to a complete DimensionGraph.
- dataId : dict or DataCoordinate, optional
- A data ID whose key-value pairs are used as equality constraints in the query. 
- datasets : Mapping, optional
- Datasets whose existence in the registry constrains the set of data IDs returned. This is a mapping from a dataset type expression (a str name, a true DatasetType instance, a Like pattern for the name, or ... for all DatasetTypes) to a collections expression (a sequence of str or Like patterns, or ... for all collections).
- where : str, optional
- A string expression similar to a SQL WHERE clause. May involve any column of a dimension table or (as a shortcut for the primary key column of a dimension table) dimension name. 
- expand : bool, optional
- If True (default) yield ExpandedDataCoordinate instead of minimal DataCoordinate base-class instances.
- kwds
- Additional keyword arguments are forwarded to DataCoordinate.standardize when processing the dataId argument (and may be used to provide a constraining data ID even when the dataId argument is None).
 - Yields: - dataId : DataCoordinate
- Data IDs matching the given query parameters. Order is unspecified. 
 
 - 
registerDatasetType(datasetType)¶
- Add a new DatasetType to the Registry. It is not an error to register the same DatasetType twice. - Parameters: - datasetType : DatasetType
- The DatasetType to be added.
 - Returns: - Raises: - ValueError
- Raised if the dimensions or storage class are invalid. 
- ConflictingDefinitionError
- Raised if this DatasetType is already registered with a different definition. 
 
 - 
registerOpaqueTable(name: str, spec: lsst.daf.butler.core.schema.TableSpec)¶
- Add an opaque (to the Registry) table for use by a Datastore or other data repository client. Opaque table records can be added via insertOpaqueData, retrieved via fetchOpaqueData, and removed via deleteOpaqueData. - Parameters: - name : str
- Logical name of the opaque table. This may differ from the actual name used in the database by a prefix and/or suffix. 
- spec : TableSpec
- Specification for the table to be added. 
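The insert, fetch, and delete pattern for opaque tables, including the way keyword arguments become equality constraints joined with AND, can be sketched against SQLite. This is not the SqlRegistry implementation: the three helper functions, the datastore_records table, and its columns are hypothetical, and column names are assumed to come from trusted code because they are interpolated into the SQL text.

```python
import sqlite3
from typing import Any, Dict, Iterator


def _where_clause(where: Dict[str, Any]) -> str:
    """Build ' WHERE "a" = :a AND "b" = :b' from keyword constraints."""
    if not where:
        return ""
    return " WHERE " + " AND ".join(f'"{column}" = :{column}' for column in where)


def insert_opaque_data(db: sqlite3.Connection, table: str,
                       *data: Dict[str, Any]) -> None:
    """Insert each positional dict as one row."""
    for row in data:
        columns = ", ".join(f'"{column}"' for column in row)
        placeholders = ", ".join(f":{column}" for column in row)
        db.execute(f'INSERT INTO "{table}" ({columns}) VALUES ({placeholders})', row)


def fetch_opaque_data(db: sqlite3.Connection, table: str,
                      **where: Any) -> Iterator[Dict[str, Any]]:
    """Yield rows matching every equality constraint."""
    cursor = db.execute(f'SELECT * FROM "{table}"' + _where_clause(where), where)
    names = [description[0] for description in cursor.description]
    for values in cursor:
        yield dict(zip(names, values))


def delete_opaque_data(db: sqlite3.Connection, table: str, **where: Any) -> None:
    """Delete rows matching every equality constraint."""
    db.execute(f'DELETE FROM "{table}"' + _where_clause(where), where)


db = sqlite3.connect(":memory:")
# Hypothetical table; in the real API its layout would come from a TableSpec.
db.execute('CREATE TABLE "datastore_records" (dataset_id INTEGER, path TEXT)')
insert_opaque_data(
    db, "datastore_records",
    {"dataset_id": 1, "path": "a.fits"},
    {"dataset_id": 2, "path": "b.fits"},
)
matches = list(fetch_opaque_data(db, "datastore_records", dataset_id=1, path="a.fits"))
delete_opaque_data(db, "datastore_records", dataset_id=1)
remaining = list(fetch_opaque_data(db, "datastore_records"))
```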
 
 - 
removeDataset(ref)¶
- Remove a dataset from the Registry. The dataset and all components will be removed unconditionally from all collections, and any associated Quantum records will also be removed. Datastore records will not be deleted; the caller is responsible for ensuring that the dataset has already been removed from all Datastores. - Parameters: - ref : DatasetRef
- Reference to the dataset to be removed. Must include a valid id attribute, and should be considered invalidated upon return.
 - Raises: - AmbiguousDatasetError
- Raised if ref.id is None.
- OrphanedRecordError
- Raised if the dataset is still present in any Datastore.
 
 - 
removeDatasetLocation(datastoreName, ref)¶
- Remove datastore location associated with this dataset. Typically used by Datastore when a dataset is removed. - Parameters: - datastoreName : str
- Name of this Datastore.
- ref : DatasetRef
- A reference to the dataset for which information is to be removed. 
 - Raises: - AmbiguousDatasetError
- Raised if ref.id is None.
 
 - 
classmethod setConfigRoot(root, config, full, overwrite=True)¶
- Set any filesystem-dependent config options for this Registry to be appropriate for a new empty repository with the given root. - Parameters: - root : str
- Filesystem path to the root of the data repository. 
- config : Config
- A Config to update. Only the subset understood by this component will be updated. Will not expand defaults.
- full : Config
- A complete config with all defaults expanded that can be converted to a RegistryConfig. Read-only and will not be modified by this method. Repository-specific options that should not be obtained from defaults when Butler instances are constructed should be copied from full to config.
- overwrite : bool, optional
- If False, do not modify a value in config if the value already exists. Default is always to overwrite with the provided root.
 - Notes - If a keyword is explicitly defined in the supplied config it will not be overridden by this method if overwrite is False. This allows explicit values set in external configs to be retained.
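The overwrite rule in the note above can be sketched with ordinary dictionaries standing in for Config objects. The key names db and cls, the SQLite URL layout, and the set_config_root function are assumptions made for this example; they are not taken from the SqlRegistry source.

```python
from typing import Dict


def set_config_root(root: str, config: Dict[str, str], full: Dict[str, str],
                    overwrite: bool = True) -> None:
    """Point the registry config at a new repository root (hypothetical sketch)."""
    # Filesystem-dependent option: replaced unless the caller asked to keep
    # an explicit value that is already present in config.
    if overwrite or "db" not in config:
        config["db"] = f"sqlite:///{root}/registry.sqlite3"
    # Repository-specific option copied from the fully expanded config so it
    # does not depend on defaults later; full itself is never modified.
    if "cls" in full and "cls" not in config:
        config["cls"] = full["cls"]


config = {"db": "postgresql://example/registry"}
full = {"db": "sqlite:///:memory:", "cls": "SqlRegistry"}

set_config_root("/repo", config, full, overwrite=False)
kept = dict(config)  # explicit db value retained, cls copied from full

set_config_root("/repo", config, full)  # default: overwrite with the new root
```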
 - 
transaction()¶
- Context manager that implements SQL transactions. Will roll back any changes to the SqlRegistry database in case an exception is raised in the enclosed block. This context manager may be nested.
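A nestable, roll-back-on-exception context manager of this kind can be sketched with SQLite savepoints. This illustrates the general technique only and makes no claim about how SqlRegistry implements it; the transaction function and the run table below are hypothetical.

```python
import sqlite3
from contextlib import contextmanager
from itertools import count
from typing import Iterator

_savepoint_ids = count()


@contextmanager
def transaction(connection: sqlite3.Connection) -> Iterator[None]:
    """Run the enclosed block in a savepoint; roll it back on any exception."""
    name = f"sp_{next(_savepoint_ids)}"
    connection.execute(f"SAVEPOINT {name}")
    try:
        yield
    except BaseException:
        # Undo only the work done since this savepoint, then re-raise.
        connection.execute(f"ROLLBACK TO SAVEPOINT {name}")
        connection.execute(f"RELEASE SAVEPOINT {name}")
        raise
    else:
        # Releasing the outermost savepoint commits the transaction.
        connection.execute(f"RELEASE SAVEPOINT {name}")


# isolation_level=None stops sqlite3 from opening transactions implicitly,
# so the savepoints above are the only transaction control in play.
connection = sqlite3.connect(":memory:", isolation_level=None)
connection.execute("CREATE TABLE run (collection TEXT)")

with transaction(connection):
    connection.execute("INSERT INTO run VALUES ('outer')")
    try:
        with transaction(connection):
            connection.execute("INSERT INTO run VALUES ('inner')")
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass  # the inner block was rolled back; the outer block continues

collections = [row[0] for row in connection.execute("SELECT collection FROM run")]
```

Only the outer insert survives: the failed inner block is undone without discarding the work of the enclosing block.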
 