Registry
-
class lsst.daf.butler.Registry(registryConfig, schemaConfig=None, dimensionConfig=None, create=False, butlerRoot=None)
Bases: object
Registry interface.
Parameters:
- registryConfig : RegistryConfig
  Registry configuration.
- schemaConfig : SchemaConfig, optional
  Schema configuration.
- dimensionConfig : DimensionConfig or Config
  DimensionGraph configuration.
Attributes Summary
- defaultConfigFile: Path to configuration defaults.
- limited: If True, this Registry does not maintain Dimension metadata or relationships (bool).
- pixelization: Object that interprets skypix Dimension values (lsst.sphgeom.Pixelization).
Methods Summary
- addDataset(datasetType, dataId, run[, …]): Adds a Dataset entry to the Registry.
- addDatasetLocation(ref, datastoreName): Add datastore name locating a given dataset.
- addDimensionEntry(dimension[, dataId, entry]): Add a new Dimension entry.
- addDimensionEntryList(dimension, dataIdList): Add a new Dimension entry.
- addExecution(execution): Add a new Execution to the Registry.
- addRun(run): Add a new Run to the Registry.
- associate(collection, refs): Add existing Datasets to a collection, implicitly creating the collection if it does not already exist.
- attachComponent(name, parent, component): Attach a component to a dataset.
- disassociate(collection, refs): Remove existing Datasets from a collection.
- ensureRun(run): Conditionally add a new Run to the Registry.
- expandDataId([dataId, dimension, metadata, …]): Expand a data ID to include additional information.
- find(collection, datasetType[, dataId]): Lookup a dataset.
- findDimensionEntries(dimension): Return all Dimension entries corresponding to the named dimension.
- findDimensionEntry(dimension[, dataId]): Return a Dimension entry corresponding to a DataId.
- fromConfig(registryConfig[, schemaConfig, …]): Create Registry subclass instance from config.
- getAllCollections(): Get names of all the collections found in this repository.
- getAllDatasetTypes(): Get every registered DatasetType.
- getDataset(id[, datasetType, dataId]): Retrieve a Dataset entry.
- getDatasetLocations(ref): Retrieve datastore locations for a given dataset.
- getDatasetType(name): Get the DatasetType.
- getExecution(id): Retrieve an Execution.
- getRun([id, collection]): Get a Run corresponding to its collection or id.
- makeDataIdPacker(name[, dataId]): Create an object that can pack certain data IDs into integers.
- makeDatabaseDict(table, types, key, value[, …]): Construct a DatabaseDict backed by a table in the same database as this Registry.
- makeRun(collection): Create a new Run in the Registry and return it.
- packDataId(name[, dataId, returnMaxBits]): Pack the given DataId into an integer.
- registerDatasetType(datasetType): Add a new DatasetType to the Registry.
- removeDataset(ref): Remove a dataset from the Registry.
- removeDatasetLocation(datastoreName, ref): Remove datastore location associated with this dataset.
- setConfigRoot(root, config, full[, overwrite]): Set any filesystem-dependent config options for this Registry to be appropriate for a new empty repository with the given root.
- setDimensionRegion([dataId, update, region]): Set the region field for a Dimension instance or a combination thereof and update associated spatial join tables.
- transaction(): Optionally implemented in Registry subclasses to provide exception safety guarantees in case an exception is raised in the enclosed block.
Attributes Documentation
-
defaultConfigFile = None
Path to configuration defaults. Relative to $DAF_BUTLER_DIR/config or absolute path. Can be None if no defaults specified.
-
pixelization
Object that interprets skypix Dimension values (lsst.sphgeom.Pixelization). None for limited registries.
Methods Documentation
-
addDataset(datasetType, dataId, run, producer=None, recursive=False, **kwds)
Adds a Dataset entry to the Registry.
This always adds a new Dataset; to associate an existing Dataset with a new collection, use associate.
Parameters:
- datasetType : DatasetType or str
  A DatasetType or the name of one.
- dataId : dict or DataId
  A dict-like object containing the Dimension links that identify the dataset within a collection.
- run : Run
  The Run instance that produced the Dataset. Ignored if producer is passed (producer.run is then used instead). A Run must be provided by one of the two arguments.
- producer : Quantum
  Unit of work that produced the Dataset. May be None to store no provenance information, but if present the Quantum must already have been added to the Registry.
- recursive : bool
  If True, recursively add Dataset and attach entries for component Datasets as well.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
Returns:
- ref : DatasetRef
  A newly-created DatasetRef instance.
Raises:
- ConflictingDefinitionError
  If a Dataset with the given DatasetRef already exists in the given collection.
- Exception
  If dataId contains unknown or invalid Dimension entries.
-
addDatasetLocation(ref, datastoreName)
Add datastore name locating a given dataset.
Typically used by Datastore.
Parameters:
- ref : DatasetRef
  A reference to the dataset for which to add storage information.
- datastoreName : str
  Name of the datastore holding this dataset.
Raises:
- AmbiguousDatasetError
  Raised if ref.id is None.
-
addDimensionEntry(dimension, dataId=None, entry=None, **kwds)
Add a new Dimension entry.
Parameters:
- dimension : str or Dimension
  Either a Dimension object or the name of one.
- dataId : dict or DataId, optional
  A dict-like object containing the Dimension links that form the primary key of the row to insert. If this is a full DataId object, dataId.entries[dimension] will be updated with entry and then inserted into the Registry.
- entry : dict
  Dictionary that maps column name to column value.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
If entry includes a “region” key, setDimensionRegion will automatically be called to set it in any associated spatial join tables. Region fields associated with a combination of Dimensions must be explicitly set separately.
Returns:
- dataId : DataId
  A Data ID for exactly the given dimension that includes the added entry.
Raises:
-
addDimensionEntryList(dimension, dataIdList, entry=None, **kwds)
Add a new Dimension entry.
Parameters:
- dimension : str or Dimension
  Either a Dimension object or the name of one.
- dataIdList : list of dict or DataId
  A list of dict-like objects containing the Dimension links that form the primary key of the rows to insert. If these are full DataId objects, dataId.entries[dimension] will be updated with entry and then inserted into the Registry.
- entry : dict
  Dictionary that maps column name to column value.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
If entry includes a “region” key, regions will automatically be added to any associated spatial join tables. Region fields associated with a combination of Dimensions must be explicitly set separately.
Returns:
- dataId : DataId
  A Data ID for exactly the given dimension that includes the added entry.
Raises:
-
addExecution(execution)
Add a new Execution to the Registry.
If execution.id is None the Registry will update it to that of the newly inserted entry.
Parameters:
Raises:
- ConflictingDefinitionError
  If execution is already present in the Registry.
-
addRun(run)
Add a new Run to the Registry.
Parameters:
Raises:
- ConflictingDefinitionError
  If a run already exists with this collection.
-
associate(collection, refs)
Add existing Datasets to a collection, implicitly creating the collection if it does not already exist.
If a DatasetRef with the same exact dataset_id is already in a collection nothing is changed. If a DatasetRef with the same DatasetType and dimension values but with different dataset_id exists in the collection, ValueError is raised.
Parameters:
- collection : str
  Indicates the collection the Datasets should be associated with.
- refs : iterable of DatasetRef
  An iterable of DatasetRef instances that already exist in this Registry. All component datasets will be associated with the collection as well.
Raises:
- ConflictingDefinitionError
  If a Dataset with the given DatasetRef already exists in the given collection.
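The association rules described for associate can be sketched with a small in-memory model. This is not the Registry implementation: ToyRef, ToyCollections, and their fields are hypothetical stand-ins invented for illustration. The sketch raises ValueError, following the prose description; how that relates to the ConflictingDefinitionError named under Raises is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRef:
    """Hypothetical stand-in for a DatasetRef (not the real class)."""
    dataset_id: int
    dataset_type: str
    data_id: tuple  # sorted (dimension, value) pairs


class ToyCollections:
    """Toy in-memory model of collection membership."""

    def __init__(self):
        # collection name -> {(dataset_type, data_id): dataset_id}
        self._members = {}

    def associate(self, collection, refs):
        # The collection is created implicitly on first use.
        members = self._members.setdefault(collection, {})
        for ref in refs:
            key = (ref.dataset_type, ref.data_id)
            existing = members.get(key)
            if existing is None:
                members[key] = ref.dataset_id
            elif existing != ref.dataset_id:
                # Same type and dimension values, different dataset_id.
                raise ValueError(f"conflict for {key} in {collection!r}")
            # Otherwise the same dataset is already present: nothing changes.

    def names(self):
        return sorted(self._members)


toy = ToyCollections()
first = ToyRef(1, "calexp", (("detector", 4), ("visit", 42)))
toy.associate("run/a", [first])
toy.associate("run/a", [first])  # repeating the association is a no-op

clash = ToyRef(2, "calexp", (("detector", 4), ("visit", 42)))
try:
    toy.associate("run/a", [clash])
    conflict_raised = False
except ValueError:
    conflict_raised = True
```

Keying membership on the dataset type plus dimension values is what lets the toy distinguish a harmless repeat from a genuine conflict.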
-
attachComponent(name, parent, component)
Attach a component to a dataset.
Parameters:
- name : str
  Name of the component.
- parent : DatasetRef
  A reference to the parent dataset. Will be updated to reference the component.
- component : DatasetRef
  A reference to the component dataset.
Raises:
- AmbiguousDatasetError
  Raised if parent.id or component.id is None.
-
disassociate(collection, refs)
Remove existing Datasets from a collection.
collection and ref combinations that are not currently associated are silently ignored.
Parameters:
- collection : str
  The collection the Datasets should no longer be associated with.
- refs : list of DatasetRef
  A list of DatasetRef instances that already exist in this Registry. All component datasets will also be removed.
Raises:
- AmbiguousDatasetError
  Raised if any(ref.id is None for ref in refs).
-
ensureRun(run)
Conditionally add a new Run to the Registry.
If the run.id is None or a Run with this id doesn’t exist in the Registry yet, add it. Otherwise, ensure the provided run is identical to the one already in the registry.
Parameters:
Raises:
- ConflictingDefinitionError
  If run already exists, but is not identical.
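The conditional logic described for ensureRun can be sketched as follows. ToyRun, ToyRunTable, and ToyConflictError are hypothetical stand-ins (the last one playing the role of ConflictingDefinitionError), and assigning a fresh id when run.id is None is an assumption of this sketch rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional


class ToyConflictError(Exception):
    """Hypothetical stand-in for ConflictingDefinitionError."""


@dataclass
class ToyRun:
    """Hypothetical stand-in for a Run; the fields are illustrative only."""
    collection: str
    id: Optional[int] = None


class ToyRunTable:
    """Toy in-memory table of runs keyed by id."""

    def __init__(self):
        self._runs = {}
        self._next_id = 1

    def ensure_run(self, run):
        if run.id is None or run.id not in self._runs:
            # Unknown run: add it (assumption: assign an id if missing).
            if run.id is None:
                run.id = self._next_id
            self._next_id = max(self._next_id, run.id) + 1
            self._runs[run.id] = ToyRun(run.collection, run.id)
        elif self._runs[run.id] != run:
            # Known id, but the stored definition differs.
            raise ToyConflictError(f"run {run.id} is not identical")


table = ToyRunTable()
run = ToyRun("run/a")
table.ensure_run(run)                      # added
table.ensure_run(ToyRun("run/a", run.id))  # identical: accepted silently
try:
    table.ensure_run(ToyRun("run/b", run.id))
    mismatch_raised = False
except ToyConflictError:
    mismatch_raised = True
```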
-
expandDataId(dataId=None, *, dimension=None, metadata=None, region=False, update=False, **kwds)
Expand a data ID to include additional information.
expandDataId always returns a true DataId and ensures that its entries dict contains (at least) values for all implied dependencies.
Parameters:
- dataId : dict or DataId
  A dict-like object containing the Dimension links that include the primary keys of the rows to query. If this is a true DataId, the object will be updated in-place.
- dimension : Dimension or str
  A dimension passed to the DataId constructor to create a true DataId or augment an existing one.
- metadata : collections.abc.Mapping, optional
  A mapping from Dimension or str name to column name, indicating fields to read into dataId.entries. If dimension is provided, may instead be a sequence of column names for that dimension.
- region : bool
  If True and the given DataId is uniquely associated with a region on the sky, obtain that region from the Registry and attach it as dataId.region.
- update : bool
  If True, assume existing entries and regions in the given DataId are out-of-date and should be updated by values in the database. If False, existing values will be assumed to be correct and database queries will only be executed if they are missing.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
Returns:
- dataId : DataId
  A Data ID with all requested data populated.
Raises:
-
find(collection, datasetType, dataId=None, **kwds)
Lookup a dataset.
This can be used to obtain a DatasetRef that permits the dataset to be read from a Datastore.
Parameters:
- collection : str
  Identifies the collection to search.
- datasetType : DatasetType or str
  A DatasetType or the name of one.
- dataId : dict or DataId, optional
  A dict-like object containing the Dimension links that identify the dataset within a collection.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
Returns:
- ref : DatasetRef
  A ref to the Dataset, or None if no matching Dataset was found.
Raises:
- LookupError
  If one or more data ID keys are missing.
-
findDimensionEntries(dimension)
Return all Dimension entries corresponding to the named dimension.
Parameters:
Returns:
Raises:
-
findDimensionEntry(dimension, dataId=None, **kwds)
Return a Dimension entry corresponding to a DataId.
Parameters:
- dimension : str or Dimension
  Either a Dimension object or the name of one.
- dataId : dict or DataId, optional
  A dict-like object containing the Dimension links that form the primary key of the row to retrieve. If this is a full DataId object, dataId.entries[dimension] will be updated with the entry obtained from the Registry.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
Returns:
Raises:
-
static fromConfig(registryConfig, schemaConfig=None, dimensionConfig=None, create=False, butlerRoot=None)
Create Registry subclass instance from config.
Uses registry.cls from config to determine which subclass to instantiate.
Parameters:
- registryConfig : ButlerConfig, RegistryConfig, Config or str
  Registry configuration.
- schemaConfig : SchemaConfig, Config or str, optional
  Schema configuration. Can be read from supplied registryConfig if the relevant component is defined and schemaConfig is None.
- dimensionConfig : DimensionConfig or Config or str, optional
  DimensionGraph configuration. Can be read from supplied registryConfig if the relevant component is defined and dimensionConfig is None.
- create : bool
  Assume empty Registry and create a new one.
Returns:
-
getAllCollections()
Get names of all the collections found in this repository.
Returns:
-
getAllDatasetTypes()
Get every registered DatasetType.
Returns:
- types : frozenset of DatasetType
  Every DatasetType in the registry.
-
getDataset(id, datasetType=None, dataId=None)
Retrieve a Dataset entry.
Parameters:
- id : int
  The unique identifier for the Dataset.
- datasetType : DatasetType, optional
  The DatasetType of the dataset to retrieve. This is used to short-circuit retrieving the DatasetType, so if provided, the caller is guaranteeing that it is what would have been retrieved.
- dataId : DataId, optional
  A Dimension-based identifier for the dataset within a collection, possibly containing additional metadata. This is used to short-circuit retrieving the DataId, so if provided, the caller is guaranteeing that it is what would have been retrieved.
Returns:
- ref : DatasetRef
  A ref to the Dataset, or None if no matching Dataset was found.
-
getDatasetLocations(ref)
Retrieve datastore locations for a given dataset.
Typically used by Datastore.
Parameters:
- ref : DatasetRef
  A reference to the dataset for which to retrieve storage information.
Returns:
Raises:
- AmbiguousDatasetError
  Raised if ref.id is None.
-
getDatasetType(name)
Get the DatasetType.
Parameters:
- name : str
  Name of the type.
Returns:
- type : DatasetType
  The DatasetType associated with the given name.
Raises:
- KeyError
  Requested named DatasetType could not be found in registry.
-
getExecution(id)
Retrieve an Execution.
Parameters:
- id : int
  The unique identifier for the Execution.
-
getRun(id=None, collection=None)
Get a Run corresponding to its collection or id.
Parameters:
Returns:
Raises:
- ValueError
  Must supply one of collection or id.
-
makeDataIdPacker(name, dataId=None, **kwds)
Create an object that can pack certain data IDs into integers.
Parameters:
Returns:
- packer : DataIdPacker
  Instance of a subclass of DataIdPacker.
-
makeDatabaseDict(table, types, key, value, lengths=None)
Construct a DatabaseDict backed by a table in the same database as this Registry.
Parameters:
- table : table
  Name of the table that backs the returned DatabaseDict. If this table already exists, its schema must include at least everything in types.
- types : dict
  A dictionary mapping str field names to type objects, containing all fields to be held in the database.
- key : str
  The name of the field to be used as the dictionary key. Must not be present in value._fields.
- value : type
  The type used for the dictionary’s values, typically a namedtuple. Must have a _fields class attribute that is a tuple of field names (i.e. as defined by namedtuple); these field names must also appear in the types arg, and a _make attribute to construct it from a sequence of values (again, as defined by namedtuple).
- lengths : dict, optional
  Specific lengths of string fields. Defaults will be used if not specified.
Returns:
- databaseDict : DatabaseDict
  DatabaseDict backed by this registry.
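The constraints on types, key, and value can be checked before calling makeDatabaseDict. The sketch below uses only the standard library; the StoredFile type, its field names, and the check_database_dict_args helper are hypothetical and exist only to illustrate the documented requirements.

```python
from collections import namedtuple

# Hypothetical value type and field names, chosen only for illustration.
StoredFile = namedtuple("StoredFile", ["path", "checksum", "size"])

# Every field held in the table, including the key field.
types = {"dataset_id": int, "path": str, "checksum": str, "size": int}
key = "dataset_id"


def check_database_dict_args(types, key, value):
    """Check the documented constraints on types, key, and value."""
    fields = value._fields
    if key in fields:
        raise ValueError(f"key {key!r} must not appear in value._fields")
    missing = [name for name in fields if name not in types]
    if missing:
        raise ValueError(f"value fields missing from types: {missing}")
    if not callable(getattr(value, "_make", None)):
        raise TypeError("value must provide _make, as namedtuple does")
    return True


args_ok = check_database_dict_args(types, key, StoredFile)

# _make builds a value from a plain sequence, in field order.
row = StoredFile._make(["a/b.fits", "abc123", 1024])
```

A namedtuple satisfies both requirements on value automatically, which is presumably why the description calls it the typical choice.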
-
makeRun(collection)
Create a new Run in the Registry and return it.
If a run with this collection already exists, return that instead.
Parameters:
Returns:
-
packDataId(name, dataId=None, *, returnMaxBits=False, **kwds)
Pack the given DataId into an integer.
Parameters:
- name : str
  Name of the packer, as given in the Registry configuration.
- dataId : dict or DataId, optional
  Data ID that identifies at least the “required” dimensions of the packer.
- returnMaxBits : bool
  If True, return a tuple of (packed, self.maxBits).
- kwds
  Additional keyword arguments used to augment or override the given data ID.
Returns:
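To make the idea of packing a data ID into an integer concrete, here is a minimal sketch using mixed-radix encoding. The encoding scheme, the pack_data_id function, and the sizes mapping are all assumptions made for illustration; the packers configured in a real Registry may work differently.

```python
def pack_data_id(data_id, sizes, return_max_bits=False, **kwds):
    """Pack a data ID into one integer (hypothetical mixed-radix scheme).

    sizes maps each required dimension name to the number of distinct
    values it may take; keyword arguments augment or override data_id.
    """
    merged = {**data_id, **kwds}
    packed = 0
    total = 1
    for name, size in sizes.items():
        value = merged[name]  # KeyError if a required dimension is missing
        if not 0 <= value < size:
            raise ValueError(f"{name}={value} is outside [0, {size})")
        packed = packed * size + value
        total *= size
    # Number of bits needed to hold the largest possible packed value.
    max_bits = (total - 1).bit_length()
    return (packed, max_bits) if return_max_bits else packed


sizes = {"visit": 1000, "detector": 200}
packed = pack_data_id({"visit": 42}, sizes, detector=7)
packed_with_bits = pack_data_id({"visit": 42, "detector": 7}, sizes,
                                return_max_bits=True)
```

Here 42 * 200 + 7 gives 8407, and 200000 possible combinations fit in 18 bits, so the second call returns (8407, 18).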
-
registerDatasetType(datasetType)
Add a new DatasetType to the Registry.
It is not an error to register the same DatasetType twice.
Parameters:
- datasetType : DatasetType
  The DatasetType to be added.
Returns:
Raises:
- ValueError
  Raised if the dimensions or storage class are invalid.
- ConflictingDefinitionError
  Raised if this DatasetType is already registered with a different definition.
-
removeDataset(ref)
Remove a dataset from the Registry.
The dataset and all components will be removed unconditionally from all collections, and any associated Quantum records will also be removed. Datastore records will not be deleted; the caller is responsible for ensuring that the dataset has already been removed from all Datastores.
Parameters:
- ref : DatasetRef
  Reference to the dataset to be removed. Must include a valid id attribute, and should be considered invalidated upon return.
Raises:
-
removeDatasetLocation(datastoreName, ref)
Remove datastore location associated with this dataset.
Typically used by Datastore when a dataset is removed.
Parameters:
- datastoreName : str
  Name of this Datastore.
- ref : DatasetRef
  A reference to the dataset for which information is to be removed.
Raises:
- AmbiguousDatasetError
  Raised if ref.id is None.
-
classmethod setConfigRoot(root, config, full, overwrite=True)
Set any filesystem-dependent config options for this Registry to be appropriate for a new empty repository with the given root.
Parameters:
- root : str
  Filesystem path to the root of the data repository.
- config : Config
  A Config to update. Only the subset understood by this component will be updated. Will not expand defaults.
- full : Config
  A complete config with all defaults expanded that can be converted to a RegistryConfig. Read-only and will not be modified by this method. Repository-specific options that should not be obtained from defaults when Butler instances are constructed should be copied from full to config.
- overwrite : bool, optional
  If False, do not modify a value in config if the value already exists. Default is always to overwrite with the provided root.
Notes
If a keyword is explicitly defined in the supplied config it will not be overridden by this method if overwrite is False. This allows explicit values set in external configs to be retained.
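The effect of overwrite can be illustrated with a plain dict standing in for a Config. The set_root_option helper, the "db" key, and the connection strings are hypothetical; which options a given Registry subclass actually rewrites is not specified here.

```python
def set_root_option(config, key, value, overwrite=True):
    """Set config[key] unless overwrite is False and the key already exists."""
    if overwrite or key not in config:
        config[key] = value
    return config


root_value = "sqlite:////repo/gen3.sqlite3"  # hypothetical root-derived value

# An explicit value from an external config survives when overwrite=False.
external = {"db": "sqlite:///elsewhere/registry.sqlite3"}
set_root_option(external, "db", root_value, overwrite=False)

# A missing value is filled in either way.
fresh = set_root_option({}, "db", root_value, overwrite=False)

# With the default overwrite=True the root-derived value always wins.
forced = set_root_option({"db": "sqlite:///old.sqlite3"}, "db", root_value)
```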
-
setDimensionRegion(dataId=None, *, update=True, region=None, **kwds)
Set the region field for a Dimension instance or a combination thereof and update associated spatial join tables.
Parameters:
- dataId : dict or DataId
  A dict-like object containing the Dimension links that form the primary key of the row to insert or update. If this is a full DataId, dataId.region will be set to region (if region is not None) and then used to update or insert into the Registry.
- update : bool
  If True, existing region information for these Dimensions is being replaced. This is usually required because Dimension entries are assumed to be pre-inserted prior to calling this function.
- region : lsst.sphgeom.ConvexPolygon, optional
  The region to update or insert into the Registry. If not present dataId.region must not be None.
- kwds
  Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
Returns:
- dataId : DataId
  A Data ID with its region attribute set.
Raises:
-
transaction()
Optionally implemented in Registry subclasses to provide exception safety guarantees in case an exception is raised in the enclosed block.
This context manager may be nested (i.e. any implementation by a Registry subclass must nest properly).
Warning
The level of exception safety is not guaranteed by this API. Depending on the subclass implementation, it may provide strong exception safety and roll back any changes, leaving the state unchanged, or it may do nothing, leaving the underlying Registry corrupted.
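What strong exception safety with proper nesting could look like is sketched below with a toy in-memory store. ToyStore and its snapshot-based rollback are hypothetical; a real Registry subclass would typically rely on its database's own transaction support, and, as the warning says, may offer weaker guarantees.

```python
import copy
from contextlib import contextmanager


class ToyStore:
    """Toy in-memory store with snapshot-based rollback."""

    def __init__(self):
        self.rows = []

    @contextmanager
    def transaction(self):
        # Remember the state on entry so it can be restored on failure.
        snapshot = copy.deepcopy(self.rows)
        try:
            yield
        except BaseException:
            self.rows = snapshot  # roll back, then propagate the error
            raise


store = ToyStore()
with store.transaction():
    store.rows.append("outer")
    try:
        with store.transaction():
            store.rows.append("inner")
            raise RuntimeError("boom")
    except RuntimeError:
        pass  # only the inner block was rolled back; the outer continues

rows_after_nested = list(store.rows)
```

Because each level restores only its own snapshot, a failure handled inside the outer block undoes the inner changes while the outer changes survive.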