SqlRegistry
-
class lsst.daf.butler.registries.sqlRegistry.SqlRegistry(registryConfig, schemaConfig, dimensionConfig, create=False, butlerRoot=None)
Bases: lsst.daf.butler.Registry
Registry backed by a SQL database.
Parameters:
- registryConfig : SqlRegistryConfig or str
  Load configuration.
- schemaConfig : SchemaConfig or str
  Definition of the schema to use.
- dimensionConfig : DimensionConfig or Config
  DimensionGraph configuration.
- create : bool
  Assume registry is empty and create a new one.
Attributes Summary
- defaultConfigFile: Path to configuration defaults.
- limited: If True, this Registry does not maintain Dimension metadata or relationships (bool).
- pixelization: Object that interprets skypix Dimension values (lsst.sphgeom.Pixelization).
Methods Summary
- addDataset(datasetType, dataId, run[, …]): Adds a Dataset entry to the Registry.
- addDatasetLocation(ref, datastoreName): Add datastore name locating a given dataset.
- addDimensionEntry(dimension[, dataId, entry]): Add a new Dimension entry.
- addDimensionEntryList(dimension, dataIdList): Add a new Dimension entry.
- addExecution(execution): Add a new Execution to the Registry.
- addRun(run): Add a new Run to the Registry.
- associate(collection, refs): Add existing Datasets to a collection, implicitly creating the collection if it does not already exist.
- attachComponent(name, parent, component): Attach a component to a dataset.
- disassociate(collection, refs): Remove existing Datasets from a collection.
- ensureRun(run): Conditionally add a new Run to the Registry.
- expandDataId([dataId, dimension, metadata, …]): Expand a data ID to include additional information.
- find(collection, datasetType[, dataId]): Lookup a dataset.
- findDimensionEntries(dimension): Return all Dimension entries corresponding to the named dimension.
- findDimensionEntry(dimension[, dataId]): Return a Dimension entry corresponding to a DataId.
- fromConfig(registryConfig[, schemaConfig, …]): Create Registry subclass instance from config.
- getAllCollections(): Get names of all the collections found in this repository.
- getAllDatasetTypes(): Get every registered DatasetType.
- getDataset(id[, datasetType, dataId]): Retrieve a Dataset entry.
- getDatasetLocations(ref): Retrieve datastore locations for a given dataset.
- getDatasetType(name): Get the DatasetType.
- getExecution(id): Retrieve an Execution.
- getRun([id, collection]): Get a Run corresponding to its collection or id.
- makeDataIdPacker(name[, dataId]): Create an object that can pack certain data IDs into integers.
- makeDatabaseDict(table, types, key, value[, …]): Construct a DatabaseDict backed by a table in the same database as this Registry.
- makeRun(collection): Create a new Run in the Registry and return it.
- packDataId(name[, dataId, returnMaxBits]): Pack the given DataId into an integer.
- query(sql, **params): Execute a SQL SELECT statement directly.
- registerDatasetType(datasetType): Add a new DatasetType to the Registry.
- removeDataset(ref): Remove a dataset from the Registry.
- removeDatasetLocation(datastoreName, ref): Remove datastore location associated with this dataset.
- setConfigRoot(root, config, full[, overwrite]): Set any filesystem-dependent config options for this Registry to be appropriate for a new empty repository with the given root.
- setDimensionRegion([dataId, update, region]): Set the region field for a Dimension instance or a combination thereof and update associated spatial join tables.
- transaction(): Context manager that implements SQL transactions.
Attributes Documentation
-
defaultConfigFile = None
Path to configuration defaults. Relative to $DAF_BUTLER_DIR/config or absolute path. Can be None if no defaults specified.
-
pixelization
Object that interprets skypix Dimension values (lsst.sphgeom.Pixelization). None for limited registries.
Methods Documentation
-
addDataset(datasetType, dataId, run, producer=None, recursive=False, **kwds)
Adds a Dataset entry to the Registry.
This always adds a new Dataset; to associate an existing Dataset with a new collection, use associate.
Parameters:
- datasetType : DatasetType or str
  A DatasetType or the name of one.
- dataId : dict or DataId
  A dict-like object containing the Dimension links that identify the dataset within a collection.
- run : Run
  The Run instance that produced the Dataset. Ignored if producer is passed (producer.run is then used instead). A Run must be provided by one of the two arguments.
- producer : Quantum
  Unit of work that produced the Dataset. May be None to store no provenance information, but if present the Quantum must already have been added to the Registry.
- recursive : bool
  If True, recursively add Dataset and attach entries for component Datasets as well.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
Returns:
- ref : DatasetRef
  A newly-created DatasetRef instance.
Raises:
- ConflictingDefinitionError
  If a Dataset with the given DatasetRef already exists in the given collection.
- Exception
  If dataId contains unknown or invalid Dimension entries.
-
addDatasetLocation(ref, datastoreName)
Add datastore name locating a given dataset.
Typically used by Datastore.
Parameters:
- ref : DatasetRef
  A reference to the dataset for which to add storage information.
- datastoreName : str
  Name of the datastore holding this dataset.
Raises:
- AmbiguousDatasetError
  Raised if ref.id is None.
-
addDimensionEntry(dimension, dataId=None, entry=None, **kwds)
Add a new Dimension entry.
Parameters:
- dimension : str or Dimension
  Either a Dimension object or the name of one.
- dataId : dict or DataId, optional
  A dict-like object containing the Dimension links that form the primary key of the row to insert. If this is a full DataId object, dataId.entries[dimension] will be updated with entry and then inserted into the Registry.
- entry : dict
  Dictionary that maps column name to column value.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
If entry includes a "region" key, setDimensionRegion will automatically be called to set it and update any associated spatial join tables. Region fields associated with a combination of Dimensions must be explicitly set separately.
Returns:
- dataId : DataId
  A Data ID for exactly the given dimension that includes the added entry.
Raises:
-
addDimensionEntryList(dimension, dataIdList, entry=None, **kwds)
Add a new Dimension entry.
Parameters:
- dimension : str or Dimension
  Either a Dimension object or the name of one.
- dataIdList : list of dict or DataId
  A list of dict-like objects containing the Dimension links that form the primary key of the rows to insert. If these are full DataId objects, dataId.entries[dimension] will be updated with entry and then inserted into the Registry.
- entry : dict
  Dictionary that maps column name to column value.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
If entry includes a "region" key, the region will automatically be set and added to any associated spatial join tables. Region fields associated with a combination of Dimensions must be explicitly set separately.
Returns:
- dataId : DataId
  A Data ID for exactly the given dimension that includes the added entry.
Raises:
-
addExecution(execution)
Add a new Execution to the Registry.
If execution.id is None the Registry will update it to that of the newly inserted entry.
Parameters:
- execution : Execution
  Instance to add to the Registry. The given Execution must not already be present in the Registry.
Raises:
- ConflictingDefinitionError
  If execution is already present in the Registry.
-
addRun(run)
Add a new Run to the Registry.
Parameters:
Raises:
- ConflictingDefinitionError
  If a run already exists with this collection.
-
associate(collection, refs)
Add existing Datasets to a collection, implicitly creating the collection if it does not already exist.
If a DatasetRef with exactly the same dataset_id is already in the collection, nothing is changed. If a DatasetRef with the same DatasetType and dimension values but a different dataset_id exists in the collection, ValueError is raised.
Parameters:
- collection : str
  Indicates the collection the Datasets should be associated with.
- refs : iterable of DatasetRef
  An iterable of DatasetRef instances that already exist in this Registry. All component datasets will be associated with the collection as well.
Raises:
- ConflictingDefinitionError
  If a Dataset with the given DatasetRef already exists in the given collection.
-
attachComponent(name, parent, component)
Attach a component to a dataset.
Parameters:
- name : str
  Name of the component.
- parent : DatasetRef
  A reference to the parent dataset. Will be updated to reference the component.
- component : DatasetRef
  A reference to the component dataset.
Raises:
- AmbiguousDatasetError
  Raised if parent.id or component.id is None.
-
disassociate(collection, refs)
Remove existing Datasets from a collection.
collection and ref combinations that are not currently associated are silently ignored.
Parameters:
Raises:
- AmbiguousDatasetError
  Raised if any(ref.id is None for ref in refs).
-
ensureRun(run)
Conditionally add a new Run to the Registry.
If run.id is None or a Run with this id doesn't exist in the Registry yet, add it. Otherwise, ensure the provided run is identical to the one already in the registry.
Parameters:
- run : Run
  Instance to add to the Registry.
Raises:
- ConflictingDefinitionError
  If run already exists, but is not identical.
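The conditional logic described for ensureRun can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the SqlRegistry implementation: ToyRun, ToyRunTable, ensure_run and the local ConflictingDefinitionError class are hypothetical stand-ins, and the rule that a missing id is assigned on insertion is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class ConflictingDefinitionError(Exception):
    """Local stand-in for the exception named in the text."""


@dataclass
class ToyRun:
    """Hypothetical, minimal substitute for a Run."""
    collection: str
    id: Optional[int] = None


class ToyRunTable:
    """Hypothetical in-memory table of runs keyed by id."""

    def __init__(self) -> None:
        self._runs: Dict[int, ToyRun] = {}

    def ensure_run(self, run: ToyRun) -> None:
        if run.id is None:
            # Assumption: a run without an id is new and gets the next free id.
            run.id = max(self._runs, default=0) + 1
            self._runs[run.id] = ToyRun(run.collection, run.id)
        elif run.id not in self._runs:
            # An id that is not stored yet: add the run.
            self._runs[run.id] = ToyRun(run.collection, run.id)
        elif self._runs[run.id] != run:
            # Same id, different content: refuse to overwrite.
            raise ConflictingDefinitionError(f"run {run.id} differs from the stored one")


table = ToyRunTable()
first = ToyRun("calib")
table.ensure_run(first)                         # no id yet, so it is inserted
table.ensure_run(ToyRun("calib", id=first.id))  # identical to the stored run, accepted
```

Calling ensure_run with the same id but a different collection raises the stand-in ConflictingDefinitionError, mirroring the "exists, but is not identical" case.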
-
expandDataId(dataId=None, *, dimension=None, metadata=None, region=False, update=False, **kwds)
Expand a data ID to include additional information.
expandDataId always returns a true DataId and ensures that its entries dict contains (at least) values for all implied dependencies.
Parameters:
- dataId : dict or DataId
  A dict-like object containing the Dimension links that include the primary keys of the rows to query. If this is a true DataId, the object will be updated in-place.
- dimension : Dimension or str
  A dimension passed to the DataId constructor to create a true DataId or augment an existing one.
- metadata : collections.abc.Mapping, optional
  A mapping from Dimension or str name to column name, indicating fields to read into dataId.entries. If dimension is provided, may instead be a sequence of column names for that dimension.
- region : bool
  If True and the given DataId is uniquely associated with a region on the sky, obtain that region from the Registry and attach it as dataId.region.
- update : bool
  If True, assume existing entries and regions in the given DataId are out-of-date and should be updated by values in the database. If False, existing values will be assumed to be correct and database queries will only be executed if they are missing.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
Returns:
- dataId : DataId
  A Data ID with all requested data populated.
Raises:
-
find(collection, datasetType, dataId=None, **kwds)
Lookup a dataset.
This can be used to obtain a DatasetRef that permits the dataset to be read from a Datastore.
Parameters:
- collection : str
  Identifies the collection to search.
- datasetType : DatasetType or str
  A DatasetType or the name of one.
- dataId : dict or DataId, optional
  A dict-like object containing the Dimension links that identify the dataset within a collection.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
Returns:
- ref : DatasetRef
  A ref to the Dataset, or None if no matching Dataset was found.
Raises:
- LookupError
  If one or more data ID keys are missing.
-
findDimensionEntries(dimension)
Return all Dimension entries corresponding to the named dimension.
Parameters:
- dimension : str or Dimension
  Either a Dimension object or the name of one.
Returns:
Raises:
-
findDimensionEntry(dimension, dataId=None, **kwds)
Return a Dimension entry corresponding to a DataId.
Parameters:
- dimension : str or Dimension
  Either a Dimension object or the name of one.
- dataId : dict or DataId, optional
  A dict-like object containing the Dimension links that form the primary key of the row to retrieve. If this is a full DataId object, dataId.entries[dimension] will be updated with the entry obtained from the Registry.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
Returns:
Raises:
-
static fromConfig(registryConfig, schemaConfig=None, dimensionConfig=None, create=False, butlerRoot=None)
Create Registry subclass instance from config.
Uses registry.cls from config to determine which subclass to instantiate.
Parameters:
- registryConfig : ButlerConfig, RegistryConfig, Config or str
  Registry configuration.
- schemaConfig : SchemaConfig, Config or str, optional
  Schema configuration. Can be read from supplied registryConfig if the relevant component is defined and schemaConfig is None.
- dimensionConfig : DimensionConfig or Config or str, optional
  DimensionGraph configuration. Can be read from supplied registryConfig if the relevant component is defined and dimensionConfig is None.
- create : bool
  Assume empty Registry and create a new one.
Returns:
- registry : Registry (subclass)
  A new Registry subclass instance.
-
getAllCollections()
Get names of all the collections found in this repository.
Returns:
-
getAllDatasetTypes()
Get every registered DatasetType.
Returns:
- types : frozenset of DatasetType
  Every DatasetType in the registry.
-
getDataset(id, datasetType=None, dataId=None)
Retrieve a Dataset entry.
Parameters:
- id : int
  The unique identifier for the Dataset.
- datasetType : DatasetType, optional
  The DatasetType of the dataset to retrieve. This is used to short-circuit retrieving the DatasetType, so if provided, the caller is guaranteeing that it is what would have been retrieved.
- dataId : DataId, optional
  A Dimension-based identifier for the dataset within a collection, possibly containing additional metadata. This is used to short-circuit retrieving the DataId, so if provided, the caller is guaranteeing that it is what would have been retrieved.
Returns:
- ref : DatasetRef
  A ref to the Dataset, or None if no matching Dataset was found.
-
getDatasetLocations(ref)
Retrieve datastore locations for a given dataset.
Typically used by Datastore.
Parameters:
- ref : DatasetRef
  A reference to the dataset for which to retrieve storage information.
Returns:
Raises:
- AmbiguousDatasetError
  Raised if ref.id is None.
-
getDatasetType(name)
Get the DatasetType.
Parameters:
- name : str
  Name of the type.
Returns:
- type : DatasetType
  The DatasetType associated with the given name.
Raises:
- KeyError
  Requested named DatasetType could not be found in registry.
-
getExecution(id)
Retrieve an Execution.
Parameters:
- id : int
  The unique identifier for the Execution.
-
getRun(id=None, collection=None)
Get a Run corresponding to its collection or id.
Parameters:
Returns:
- run : Run
  The Run instance.
Raises:
- ValueError
  Must supply one of collection or id.
-
makeDataIdPacker(name, dataId=None, **kwds)
Create an object that can pack certain data IDs into integers.
Parameters:
Returns:
- packer : DataIdPacker
  Instance of a subclass of DataIdPacker.
-
makeDatabaseDict(table, types, key, value, lengths=None)
Construct a DatabaseDict backed by a table in the same database as this Registry.
Parameters:
- table : table
  Name of the table that backs the returned DatabaseDict. If this table already exists, its schema must include at least everything in types.
- types : dict
  A dictionary mapping str field names to type objects, containing all fields to be held in the database.
- key : str
  The name of the field to be used as the dictionary key. Must not be present in value._fields.
- value : type
  The type used for the dictionary's values, typically a namedtuple. Must have a _fields class attribute that is a tuple of field names (i.e. as defined by namedtuple); these field names must also appear in the types arg, and a _make attribute to construct it from a sequence of values (again, as defined by namedtuple).
- lengths : dict, optional
  Specific lengths of string fields. Defaults will be used if not specified.
Returns:
- databaseDict : DatabaseDict
  DatabaseDict backed by this registry.
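The constraints on types, key and value listed above can be spelled out with a standard-library namedtuple. The following is a minimal sketch, not the library's own validation: check_database_dict_args, FileInfo and the column names are hypothetical, and requiring key itself to appear in types is an assumption drawn from "containing all fields to be held in the database".

```python
from collections import namedtuple


def check_database_dict_args(types, key, value):
    """Check arguments against the constraints described for makeDatabaseDict.

    Hypothetical helper for illustration; the real method may check more,
    less, or differently.
    """
    fields = getattr(value, "_fields", None)
    if not isinstance(fields, tuple) or not callable(getattr(value, "_make", None)):
        raise TypeError("value must provide namedtuple-style _fields and _make")
    if key in fields:
        raise ValueError(f"key {key!r} must not be one of value._fields")
    # Assumption: the key column and every value field need an entry in types.
    missing = [name for name in (key, *fields) if name not in types]
    if missing:
        raise ValueError(f"fields missing from types: {missing}")


# Hypothetical record type and column types.
FileInfo = namedtuple("FileInfo", ["path", "size"])
types = {"dataset_id": int, "path": str, "size": int}

check_database_dict_args(types, key="dataset_id", value=FileInfo)

# _make builds a value from a plain sequence, much as a row read from a table would be.
record = FileInfo._make(["a/b.fits", 2880])
```

Passing key="path" here would fail the check, because path is already one of the value fields.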
-
makeRun(collection)
Create a new Run in the Registry and return it.
If a run with this collection already exists, return that instead.
Parameters:
- collection : str
  The collection used to identify all inputs and outputs of the Run.
Returns:
- run : Run
  A new Run instance.
-
packDataId(name, dataId=None, *, returnMaxBits=False, **kwds)
Pack the given DataId into an integer.
Parameters:
- name : str
  Name of the packer, as given in the Registry configuration.
- dataId : dict or DataId, optional
  Data ID that identifies at least the "required" dimensions of the packer.
- returnMaxBits : bool
  If True, return a tuple of (packed, self.maxBits).
- kwds
  Additional keyword arguments used to augment or override the given data ID.
Returns:
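To make the idea of packing a data ID into an integer concrete, here is a minimal bit-packing sketch. It does not reproduce any packer shipped with the Registry: pack_data_id, the bit_widths mapping and the dimension names are hypothetical, and fixed-width bit fields are only one plausible packing scheme.

```python
def pack_data_id(data_id, bit_widths, return_max_bits=False):
    """Pack integer data ID values into one integer using fixed-width bit fields.

    Hypothetical illustration: ``bit_widths`` maps each required dimension
    name to the number of bits reserved for it, most significant first.
    """
    packed = 0
    for name, width in bit_widths.items():
        value = data_id[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        # Shift what is already packed to make room, then add this field.
        packed = (packed << width) | value
    max_bits = sum(bit_widths.values())
    return (packed, max_bits) if return_max_bits else packed


widths = {"visit": 8, "detector": 4}
packed, max_bits = pack_data_id({"visit": 5, "detector": 3}, widths, return_max_bits=True)
```

With these widths the result is (5 << 4) | 3, and max_bits is the total number of bits the packed value can occupy, which corresponds to the role of returnMaxBits described above.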
-
query(sql, **params)
Execute a SQL SELECT statement directly.
Named parameters are specified in the SQL query string by preceding them with a colon. Parameter values are provided as additional keyword arguments. For example:
    registry.query("SELECT * FROM instrument WHERE instrument=:name",
                   name="HSC")
Parameters:
- sql : str
  SQL query string. Must be a SELECT statement.
- **params
  Parameter name-value pairs to insert into the query.
Yields:
- row : dict
  The next row result from executing the query.
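The colon-prefixed named-parameter style described above is the same placeholder style that Python's built-in sqlite3 module accepts, so the calling convention can be tried without a Registry. The sketch below uses sqlite3 as a stand-in database; the run_select helper, the table contents and the visit_max column are assumptions made purely for illustration.

```python
import sqlite3


def run_select(conn, sql, **params):
    """Execute a SELECT with named parameters and yield each row as a dict.

    Hypothetical helper that mimics the documented calling convention.
    """
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("Only SELECT statements are supported")
    # sqlite3 binds ":name" placeholders from a mapping.
    cursor = conn.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        yield dict(zip(columns, row))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instrument (instrument TEXT PRIMARY KEY, visit_max INTEGER)")
conn.executemany("INSERT INTO instrument VALUES (?, ?)", [("HSC", 100), ("DECam", 200)])

rows = list(run_select(conn, "SELECT * FROM instrument WHERE instrument=:name", name="HSC"))
```

Binding values through named parameters rather than string formatting also leaves quoting to the database driver.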
-
registerDatasetType(datasetType)
Add a new DatasetType to the Registry.
It is not an error to register the same DatasetType twice.
Parameters:
- datasetType : DatasetType
  The DatasetType to be added.
Returns:
Raises:
- ValueError
  Raised if the dimensions or storage class are invalid.
- ConflictingDefinitionError
  Raised if this DatasetType is already registered with a different definition.
-
removeDataset(ref)
Remove a dataset from the Registry.
The dataset and all components will be removed unconditionally from all collections, and any associated Quantum records will also be removed. Datastore records will not be deleted; the caller is responsible for ensuring that the dataset has already been removed from all Datastores.
Parameters:
- ref : DatasetRef
  Reference to the dataset to be removed. Must include a valid id attribute, and should be considered invalidated upon return.
Raises:
- AmbiguousDatasetError
  Raised if ref.id is None.
- OrphanedRecordError
  Raised if the dataset is still present in any Datastore.
-
removeDatasetLocation(datastoreName, ref)
Remove datastore location associated with this dataset.
Typically used by Datastore when a dataset is removed.
Parameters:
- datastoreName : str
  Name of this Datastore.
- ref : DatasetRef
  A reference to the dataset for which information is to be removed.
Raises:
- AmbiguousDatasetError
  Raised if ref.id is None.
-
classmethod setConfigRoot(root, config, full, overwrite=True)
Set any filesystem-dependent config options for this Registry to be appropriate for a new empty repository with the given root.
Parameters:
- root : str
  Filesystem path to the root of the data repository.
- config : Config
  A Config to update. Only the subset understood by this component will be updated. Will not expand defaults.
- full : Config
  A complete config with all defaults expanded that can be converted to a RegistryConfig. Read-only and will not be modified by this method. Repository-specific options that should not be obtained from defaults when Butler instances are constructed should be copied from full to config.
- overwrite : bool, optional
  If False, do not modify a value in config if the value already exists. Default is always to overwrite with the provided root.
Notes
If a keyword is explicitly defined in the supplied config it will not be overridden by this method if overwrite is False. This allows explicit values set in external configs to be retained.
-
setDimensionRegion(dataId=None, *, update=True, region=None, **kwds)
Set the region field for a Dimension instance or a combination thereof and update associated spatial join tables.
Parameters:
- dataId : dict or DataId
  A dict-like object containing the Dimension links that form the primary key of the row to insert or update. If this is a full DataId, dataId.region will be set to region (if region is not None) and then used to update or insert into the Registry.
- update : bool
  If True, existing region information for these Dimensions is being replaced. This is usually required because Dimension entries are assumed to be pre-inserted prior to calling this function.
- region : lsst.sphgeom.ConvexPolygon, optional
  The region to update or insert into the Registry. If not present dataId.region must not be None.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
Returns:
- dataId : DataId
  A Data ID with its region attribute set.
Raises:
-
transaction()
Context manager that implements SQL transactions.
Will roll back any changes to the SqlRegistry database in case an exception is raised in the enclosed block.
This context manager may be nested.
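One common way to get a nestable, rolling-back context manager is to map each nesting level onto a SQL SAVEPOINT. The sketch below shows that pattern with the standard-library sqlite3 module; it is an assumption about how such behaviour can be built, not a description of how SqlRegistry does it, and ToyRegistry, add_run and collections are hypothetical names.

```python
import sqlite3
from contextlib import contextmanager


class ToyRegistry:
    """Hypothetical stand-in showing nested transactions via savepoints."""

    def __init__(self):
        # isolation_level=None stops sqlite3 from opening transactions
        # implicitly, so the SAVEPOINT statements below are in full control.
        self._conn = sqlite3.connect(":memory:", isolation_level=None)
        self._conn.execute("CREATE TABLE run (collection TEXT PRIMARY KEY)")
        self._depth = 0

    @contextmanager
    def transaction(self):
        name = f"sp_{self._depth}"
        self._conn.execute(f"SAVEPOINT {name}")
        self._depth += 1
        try:
            yield
        except BaseException:
            # Undo everything done since this savepoint, then discard it.
            self._conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
            self._conn.execute(f"RELEASE SAVEPOINT {name}")
            raise
        else:
            # Releasing the outermost savepoint commits the work.
            self._conn.execute(f"RELEASE SAVEPOINT {name}")
        finally:
            self._depth -= 1

    def add_run(self, collection):
        self._conn.execute("INSERT INTO run VALUES (?)", (collection,))

    def collections(self):
        cursor = self._conn.execute("SELECT collection FROM run ORDER BY collection")
        return [row[0] for row in cursor]


registry = ToyRegistry()
with registry.transaction():
    registry.add_run("outer")
    try:
        with registry.transaction():
            registry.add_run("inner")
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass  # only the inner block's changes are rolled back

saved = registry.collections()
```

In this sketch the row inserted by the outer block survives while the inner block's row is discarded; an exception that escapes the outermost block would discard both.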