SqlRegistry
-
class lsst.daf.butler.registries.sqlRegistry.SqlRegistry(registryConfig, schemaConfig, create=False)
Bases: lsst.daf.butler.Registry
Registry backed by a SQL database.
Parameters:
- registryConfig : SqlRegistryConfig or str
  Load configuration.
- schemaConfig : SchemaConfig or str
  Definition of the schema to use.
- create : bool
  Assume registry is empty and create a new one.
Attributes Summary
- defaultConfigFile: Path to configuration defaults.
Methods Summary
- addDataUnitEntry(dataUnitName, values): Add a new DataUnit entry.
- addDataset(datasetType, dataId, run[, …]): Adds a Dataset entry to the Registry.
- addDatasetLocation(ref, datastoreName): Add datastore name locating a given dataset.
- addExecution(execution): Add a new Execution to the SqlRegistry.
- addQuantum(quantum): Add a new Quantum to the SqlRegistry.
- addRun(run): Add a new Run to the SqlRegistry.
- associate(collection, refs): Add existing Datasets to a collection, possibly creating the collection in the process.
- attachComponent(name, parent, component): Attach a component to a dataset.
- disassociate(collection, refs[, remove]): Remove existing Datasets from a collection.
- ensureRun(run): Conditionally add a new Run to the SqlRegistry.
- export(expr): Export contents of the SqlRegistry, limited to those reachable from the Datasets identified by the expression expr, into a TableSet format such that it can be imported into a different database.
- find(collection, datasetType, dataId): Look up a dataset.
- findDataUnitEntry(dataUnitName, value): Return a DataUnit entry corresponding to a value.
- getDataUnitDefinition(dataUnitName): Return the definition of a DataUnit (an actual DataUnit object).
- getDataset(id): Retrieve a Dataset entry.
- getDatasetLocations(ref): Retrieve datastore locations for a given dataset.
- getDatasetType(name): Get the DatasetType.
- getExecution(id): Retrieve an Execution.
- getQuantum(id): Retrieve a Quantum.
- getRegion(dataId): Get region associated with a dataId.
- getRun([id, collection]): Get a Run corresponding to its collection or id.
- import_(tables, collection): Import (previously exported) contents into the (possibly empty) SqlRegistry.
- makeDatabaseDict(table, types, key, value): Construct a DatabaseDict backed by a table in the same database as this Registry.
- makeProvenanceGraph(expr[, types]): Make a QuantumGraph that contains the full provenance of all Datasets matching an expression.
- makeRun(collection): Create a new Run in the SqlRegistry and return it.
- markInputUsed(quantum, ref): Record the given DatasetRef as an actual (not just predicted) input of the given Quantum.
- merge(outputCollection, inputCollections): Create a new collection from a series of existing ones.
- query(sql, **params): Execute a SQL SELECT statement directly.
- registerDatasetType(datasetType): Add a new DatasetType to the SqlRegistry.
- removeDatasetLocation(datastoreName, ref): Remove datastore location associated with this dataset.
- selectDataUnits(collections, expr, …): Evaluate a filter expression and lists of DatasetTypes and return a set of data unit values.
- setDataUnitRegion(dataUnitNames, value, region): Set the region field for a DataUnit instance or a combination thereof and update associated spatial join tables.
- subset(collection, expr, datasetTypes): Create a new collection by subsetting an existing one.
- transaction(): Context manager that implements SQL transactions.
Attributes Documentation
-
defaultConfigFile = None
Path to configuration defaults. Relative to $DAF_BUTLER_DIR/config or absolute path. Can be None if no defaults specified.
Methods Documentation
-
addDataUnitEntry(dataUnitName, values)
Add a new DataUnit entry.
Parameters:
- dataUnitName : str
  Name of the DataUnit (e.g. "Camera").
- values : dict
  Dictionary of columnName, columnValue pairs.
If values includes a "region" key, setDataUnitRegion will automatically be called to set it and update any associated spatial join tables. Region fields associated with a combination of DataUnits must be explicitly set separately.
Raises:
- TypeError
  If the given DataUnit does not have explicit entries in the registry.
- ValueError
  If an entry with the primary key defined in values is already present.
-
addDataset(datasetType, dataId, run, producer=None, recursive=False)
Adds a Dataset entry to the Registry.
This always adds a new Dataset; to associate an existing Dataset with a new collection, use associate.
Parameters:
- datasetType : DatasetType
  Type of the Dataset.
- dataId : dict
  A dict of DataUnit link name, value pairs that label the DatasetRef within a collection.
- run : Run
  The Run instance that produced the Dataset. Ignored if producer is passed (producer.run is then used instead). A Run must be provided by one of the two arguments.
- producer : Quantum
  Unit of work that produced the Dataset. May be None to store no provenance information, but if present the Quantum must already have been added to the SqlRegistry.
- recursive : bool
  If True, recursively add Dataset and attach entries for component Datasets as well.
Returns:
- ref : DatasetRef
  A newly-created DatasetRef instance.
Raises:
- ValueError
  If a Dataset with the given DatasetRef already exists in the given collection.
- Exception
  If dataId contains unknown or invalid DataUnit entries.
-
addDatasetLocation(ref, datastoreName)
Add datastore name locating a given dataset.
Typically used by Datastore.
Parameters:
- ref : DatasetRef
  A reference to the dataset for which to add storage information.
- datastoreName : str
  Name of the datastore holding this dataset.
-
addExecution(execution)
Add a new Execution to the SqlRegistry.
If execution.id is None, the SqlRegistry will update it to that of the newly inserted entry.
Parameters:
- execution : Execution
  Instance to add to the SqlRegistry. The given Execution must not already be present in the SqlRegistry.
Raises:
- Exception
  If Execution is already present in the SqlRegistry.
-
addQuantum(quantum)
Add a new Quantum to the SqlRegistry.
Parameters:
- quantum : Quantum
  Instance to add to the SqlRegistry. The given Quantum must not already be present in the SqlRegistry (or any other), therefore its:
  - run attribute must be set to an existing Run;
  - predictedInputs attribute must be fully populated with DatasetRefs;
  - actualInputs and outputs will be ignored.
-
addRun(run)
Add a new Run to the SqlRegistry.
Parameters:
- run : Run
  Instance to add to the SqlRegistry. The given Run must not already be present in the SqlRegistry (or any other). Therefore its id must be None and its collection must not be associated with any existing Run.
Raises:
- ValueError
  If a run already exists with this collection.
-
associate(collection, refs)
Add existing Datasets to a collection, possibly creating the collection in the process.
Parameters:
- collection : str
  Indicates the collection the Datasets should be associated with.
- refs : list of DatasetRef
  A list of DatasetRef instances that already exist in this SqlRegistry.
-
attachComponent(name, parent, component)
Attach a component to a dataset.
Parameters:
- name : str
  Name of the component.
- parent : DatasetRef
  A reference to the parent dataset. Will be updated to reference the component.
- component : DatasetRef
  A reference to the component dataset.
-
disassociate(collection, refs, remove=True)
Remove existing Datasets from a collection.
collection and ref combinations that are not currently associated are silently ignored.
Parameters:
- collection : str
  The collection the Datasets should no longer be associated with.
- refs : list of DatasetRef
  A list of DatasetRef instances that already exist in this SqlRegistry.
- remove : bool
  If True, remove Datasets from the SqlRegistry if they are not associated with any collection (including via any composites).
Returns:
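The two behaviours described for disassociate (unassociated combinations are ignored, and orphaned Datasets are dropped when remove is True) can be sketched on plain Python sets. This is only an illustration: the function name, the dict-of-sets layout and the returned list are assumptions made for the example, composites are not modelled, and nothing here reflects how SqlRegistry stores its data.

```python
def disassociate_sketch(collections, datasets, collection, refs, remove=True):
    """Hypothetical in-memory analogue of disassociate.

    collections maps a collection name to a set of dataset ids;
    datasets is the set of all dataset ids known to the registry.
    """
    members = collections.get(collection, set())
    removed = []
    for ref in refs:
        # discard() does nothing if ref is not a member: silently ignored.
        members.discard(ref)
        # A dataset is orphaned once no collection refers to it any more.
        orphaned = not any(ref in m for m in collections.values())
        if remove and orphaned and ref in datasets:
            datasets.discard(ref)
            removed.append(ref)
    return removed


collections = {"a": {1, 2}, "b": {2}}
datasets = {1, 2, 3}
# 99 was never associated with "a", so it is ignored without error.
removed = disassociate_sketch(collections, datasets, "a", [1, 2, 99])
```

Dataset 2 survives because collection "b" still refers to it; dataset 1 is removed because nothing refers to it after the call.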
-
ensureRun(run)
Conditionally add a new Run to the SqlRegistry.
If the run.id is None or a Run with this id doesn't exist in the Registry yet, add it. Otherwise, ensure the provided run is identical to the one already in the registry.
Parameters:
- run : Run
  Instance to add to the SqlRegistry.
Raises:
- ValueError
  If run already exists, but is not identical.
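The decision logic described for ensureRun can be sketched with an in-memory table. The Run dataclass and RunTable class below are hypothetical stand-ins written for this example; the real Run has more attributes, and how SqlRegistry assigns ids is not stated here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Run:
    """Hypothetical, minimal stand-in for the real Run class."""
    collection: str
    id: Optional[int] = None


class RunTable:
    """Hypothetical in-memory store; not the SqlRegistry implementation."""

    def __init__(self) -> None:
        self._runs: Dict[int, Run] = {}
        self._next_id = 1

    def ensureRun(self, run: Run) -> None:
        if run.id is None or run.id not in self._runs:
            # Unknown run: add it, assigning an id if none was given.
            if run.id is None:
                run.id = self._next_id
            self._next_id = max(self._next_id, run.id) + 1
            self._runs[run.id] = Run(run.collection, run.id)
        elif self._runs[run.id] != run:
            # Known id, different content: refuse.
            raise ValueError(f"Run {run.id} already exists but is not identical")


table = RunTable()
first = Run("ingest")
table.ensureRun(first)                     # id is None: added, id assigned
table.ensureRun(Run("ingest", first.id))   # identical: nothing to do
try:
    table.ensureRun(Run("other", first.id))
    conflict = False
except ValueError:
    conflict = True
```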
-
export(expr)
Export contents of the SqlRegistry, limited to those reachable from the Datasets identified by the expression expr, into a TableSet format such that it can be imported into a different database.
Parameters:
- expr : str
  An expression (SQL query that evaluates to a list of Dataset primary keys) that selects the Datasets, or a QuantumGraph that can be similarly interpreted.
Returns:
- ts : TableSet
  Containing all rows, from all tables in the SqlRegistry that are reachable from the selected Datasets.
-
find(collection, datasetType, dataId)
Look up a dataset.
This can be used to obtain a DatasetRef that permits the dataset to be read from a Datastore.
Parameters:
Returns:
- ref : DatasetRef
  A ref to the Dataset, or None if no matching Dataset was found.
Raises:
- ValueError
  If dataId is invalid.
-
findDataUnitEntry(dataUnitName, value)
Return a DataUnit entry corresponding to a value.
Parameters:
Returns:
-
getDataUnitDefinition(dataUnitName)
Return the definition of a DataUnit (an actual DataUnit object).
Parameters:
- dataUnitName : str
  Name of the DataUnit, e.g. “Camera”, “Tract”, etc.
-
getDataset(id)
Retrieve a Dataset entry.
Parameters:
- id : int
  The unique identifier for the Dataset.
Returns:
- ref : DatasetRef
  A ref to the Dataset, or None if no matching Dataset was found.
-
getDatasetLocations(ref)
Retrieve datastore locations for a given dataset.
Typically used by Datastore.
Parameters:
- ref : DatasetRef
  A reference to the dataset for which to retrieve storage information.
Returns:
-
getDatasetType(name)
Get the DatasetType.
Parameters:
- name : str
  Name of the type.
Returns:
- type : DatasetType
  The DatasetType associated with the given name.
Raises:
- KeyError
  If the requested DatasetType could not be found in the registry.
-
getExecution(id)
Retrieve an Execution.
Parameters:
- id : int
  The unique identifier for the Execution.
-
getRegion(dataId)
Get region associated with a dataId.
Parameters:
Returns:
- region : lsst.sphgeom.ConvexPolygon
  The region associated with a dataId or None if not present.
Raises:
- KeyError
  If the set of dataunits for the dataId does not correspond to a unique spatial lookup.
-
getRun(id=None, collection=None)
Get a Run corresponding to its collection or id.
Parameters:
Returns:
- run : Run
  The Run instance.
Raises:
- ValueError
  If neither collection nor id is supplied; one of the two must be given.
-
import_(tables, collection)
Import (previously exported) contents into the (possibly empty) SqlRegistry.
Parameters:
- ts : TableSet
  Contains the previously exported content.
- collection : str
  An additional collection assigned to the newly imported Datasets.
-
makeDatabaseDict(table, types, key, value)
Construct a DatabaseDict backed by a table in the same database as this Registry.
Parameters:
- table : table
  Name of the table that backs the returned DatabaseDict. If this table already exists, its schema must include at least everything in types.
- types : dict
  A dictionary mapping str field names to type objects, containing all fields to be held in the database.
- key : str
  The name of the field to be used as the dictionary key. Must not be present in value._fields.
- value : type
  The type used for the dictionary’s values, typically a namedtuple. Must have a _fields class attribute that is a tuple of field names (i.e. as defined by namedtuple); these field names must also appear in the types arg, and a _make attribute to construct it from a sequence of values (again, as defined by namedtuple).
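The constraints on types, key and value are easier to see with a concrete namedtuple. The helper below is a hypothetical checker written for this example (it is not part of the Registry API), and the field names are made up; it only restates the documented requirements as code.

```python
from collections import namedtuple


def check_database_dict_args(types, key, value):
    """Hypothetical helper: validate arguments against the documented constraints."""
    if key in value._fields:
        raise ValueError(f"key {key!r} must not appear in value._fields")
    missing = [f for f in value._fields if f not in types]
    if missing:
        raise ValueError(f"fields missing from types: {missing}")
    if not callable(getattr(value, "_make", None)):
        raise TypeError("value type must provide a _make constructor")


# Illustrative field names only.
Location = namedtuple("Location", ["path", "checksum"])
types = {"dataset_id": int, "path": str, "checksum": str}

check_database_dict_args(types, key="dataset_id", value=Location)

# _make builds an instance from a sequence of values, as a row would provide.
row = Location._make(["a/b.fits", "abc123"])

try:
    # "path" is a value field, so it cannot also be the key.
    check_database_dict_args(types, key="path", value=Location)
    rejected = False
except ValueError:
    rejected = True
```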
-
makeProvenanceGraph(expr, types=None)
Make a QuantumGraph that contains the full provenance of all Datasets matching an expression.
Parameters:
- expr : str
  An expression (SQL query that evaluates to a list of Dataset primary keys) that selects the Datasets.
Returns:
- graph : QuantumGraph
  Instance (with units set to None).
-
makeRun(collection)
Create a new Run in the SqlRegistry and return it.
If a run with this collection already exists, return that instead.
Parameters:
- collection : str
  The collection used to identify all inputs and outputs of the Run.
Returns:
- run : Run
  A new Run instance.
-
markInputUsed(quantum, ref)
Record the given DatasetRef as an actual (not just predicted) input of the given Quantum.
This updates both the SqlRegistry's Quantum table and the Python Quantum.actualInputs attribute.
Parameters:
- quantum : Quantum
  Producer to update. Will be updated in this call.
- ref : DatasetRef
  To set as actually used input.
Raises:
- KeyError
  If quantum is not a predicted consumer for ref.
-
merge(outputCollection, inputCollections)
Create a new collection from a series of existing ones.
Entries earlier in the list will be used in preference to later entries when both contain Datasets with the same DatasetRef.
Parameters:
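The precedence rule for merge (earlier input collections win) can be illustrated with plain dictionaries. This is a sketch under assumptions: each collection is modelled as a dict from a (dataset type name, data id) tuple to a dataset id, which is an invented stand-in for "the same DatasetRef", and merge_sketch is a hypothetical name.

```python
def merge_sketch(input_collections):
    """Hypothetical analogue of merge: first occurrence of a key wins."""
    merged = {}
    for collection in input_collections:
        for key, dataset_id in collection.items():
            # setdefault keeps the value from the earliest collection.
            merged.setdefault(key, dataset_id)
    return merged


preferred = {("calexp", 1): 101}
fallback = {("calexp", 1): 7, ("calexp", 2): 8}
merged = merge_sketch([preferred, fallback])
```

Both inputs contain an entry for ("calexp", 1); the merged result keeps dataset 101 from the collection listed first and takes ("calexp", 2) from the second.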
-
query(sql, **params)
Execute a SQL SELECT statement directly.
Named parameters are specified in the SQL query string by preceding them with a colon. Parameter values are provided as additional keyword arguments. For example:
    registry.query("SELECT * FROM Camera WHERE camera=:name", name="HSC")
Parameters:
- sql : str
  SQL query string. Must be a SELECT statement.
- **params
  Parameter name-value pairs to insert into the query.
Yields:
- row : dict
  The next row result from executing the query.
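The colon-prefixed named-parameter style can be tried without a Registry by using Python's built-in sqlite3 module, which accepts the same :name placeholders. The function below is a stand-alone sketch that mimics the documented behaviour (SELECT only, rows yielded as dicts); the table contents are invented, and it is not the SqlRegistry implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows expose column names
conn.execute("CREATE TABLE Camera (camera TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO Camera VALUES (?)", [("HSC",), ("DECam",)])


def query(sql, **params):
    """Yield each result row as a dict, binding :name placeholders from params."""
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("Only SELECT statements are supported")
    for row in conn.execute(sql, params):
        yield dict(row)


rows = list(query("SELECT * FROM Camera WHERE camera=:name", name="HSC"))
```

Passing values as bound parameters rather than formatting them into the SQL string avoids quoting problems and SQL injection.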
-
registerDatasetType(datasetType)
Add a new DatasetType to the SqlRegistry.
It is not an error to register the same DatasetType twice.
Parameters:
- datasetType : DatasetType
  The DatasetType to be added.
Returns:
- inserted : bool
  True if datasetType was inserted, False if an identical existing DatasetType was found.
Raises:
- ValueError
  If the DatasetType is not valid for this registry, or is already registered but not identical.
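The three outcomes of registerDatasetType (inserted, already present and identical, already present but different) can be sketched with a dictionary. The function and the tuple-based definitions below are hypothetical and only mirror the documented return value and error.

```python
def register_sketch(registered, name, definition):
    """Hypothetical analogue: return True if inserted, False if identical exists."""
    existing = registered.get(name)
    if existing is None:
        registered[name] = definition
        return True
    if existing != definition:
        raise ValueError(f"{name!r} is already registered with a different definition")
    return False


registered = {}
# Definitions are modelled as (data unit names, storage class) tuples for illustration.
first = register_sketch(registered, "calexp", (("Visit", "Sensor"), "Exposure"))
second = register_sketch(registered, "calexp", (("Visit", "Sensor"), "Exposure"))
try:
    register_sketch(registered, "calexp", (("Visit",), "Exposure"))
    conflict = False
except ValueError:
    conflict = True
```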
-
removeDatasetLocation(datastoreName, ref)
Remove datastore location associated with this dataset.
Typically used by Datastore when a dataset is removed.
Parameters:
- datastoreName : str
  Name of this Datastore.
- ref : DatasetRef
  A reference to the dataset for which information is to be removed.
-
selectDataUnits(collections, expr, neededDatasetTypes, futureDatasetTypes)
Evaluate a filter expression and lists of DatasetTypes and return a set of data unit values.
The returned set consists of combinations of units participating in data transformation from neededDatasetTypes to futureDatasetTypes, restricted by existing data and filter expression.
Parameters:
- collections : list of str
  An ordered list of collections indicating the collections to search for Datasets.
- expr : str
  An expression that limits the DataUnits and (indirectly) the Datasets returned.
- neededDatasetTypes : list of DatasetType
  The list of DatasetTypes whose instances should be included in the graph and limit its extent.
- futureDatasetTypes : list of DatasetType
  The list of DatasetTypes whose instances may be added to the graph later, which requires that their DataUnit types must be present in the graph.
Returns:
- header : tuple of tuple
  Length of tuple equals the number of columns in the returned result set. Each item is a tuple with two elements: DataUnit name (e.g. “Visit”) and unit value name (e.g. “visit”).
- rows : sequence of tuple
  Result set; this can be a single-pass iterator. Each tuple contains unit values corresponding to units in the header.
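Because header and rows are returned separately, a caller will often want to pair them up. The sketch below uses made-up header and row values in the documented shape (it does not call selectDataUnits) and turns each row into a dict keyed by unit value name; rows_as_dicts is a hypothetical helper.

```python
def rows_as_dicts(header, rows):
    """Pair each row with the unit value names from the header."""
    names = [value_name for _unit_name, value_name in header]
    for row in rows:
        yield dict(zip(names, row))


# Invented values in the documented shape.
header = (("Visit", "visit"), ("Sensor", "sensor"))
rows = iter([(42, 1), (42, 2)])  # may be a single-pass iterator

# Materialise once, since rows may not be iterable a second time.
data_ids = list(rows_as_dicts(header, rows))
```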
-
setDataUnitRegion(dataUnitNames, value, region, update=True)
Set the region field for a DataUnit instance or a combination thereof and update associated spatial join tables.
Parameters:
- dataUnitNames : sequence
  A sequence of DataUnit names whose instances are jointly associated with a region on the sky. This must not include dependencies that are implied, e.g. “Patch” must not include “Tract”, but “Sensor” needs to add “Visit”.
- value : dict
  A dictionary of values that uniquely identify the DataUnits.
- region : sphgeom.ConvexPolygon
  Region on the sky.
- update : bool
  If True, existing region information for these DataUnits is replaced. This is usually required because DataUnit entries are assumed to be pre-inserted prior to calling this function.
-
subset(collection, expr, datasetTypes)
Create a new collection by subsetting an existing one.
Parameters:
Returns:
- collection : str
  The newly created collection.
-
transaction()
Context manager that implements SQL transactions.
Will roll back any changes to the SqlRegistry database in case an exception is raised in the enclosed block.
This context manager may be nested.
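One common way to obtain a nestable, rollback-on-exception context manager is to map each nesting level to a SQL savepoint. The sketch below does this with Python's sqlite3 module; TinyRegistry and its methods are hypothetical, and whether SqlRegistry uses savepoints internally is not stated in this documentation.

```python
import sqlite3
from contextlib import contextmanager


class TinyRegistry:
    """Hypothetical stand-in; not the SqlRegistry implementation."""

    def __init__(self):
        # isolation_level=None disables implicit BEGIN, so savepoints are
        # issued explicitly below.
        self._conn = sqlite3.connect(":memory:", isolation_level=None)
        self._conn.execute("CREATE TABLE Run (collection TEXT PRIMARY KEY)")
        self._depth = 0

    @contextmanager
    def transaction(self):
        name = f"sp{self._depth}"
        self._conn.execute(f"SAVEPOINT {name}")
        self._depth += 1
        try:
            yield
        except BaseException:
            # Undo everything since this savepoint, then drop it and re-raise.
            self._conn.execute(f"ROLLBACK TO {name}")
            self._conn.execute(f"RELEASE {name}")
            raise
        else:
            # Releasing the outermost savepoint commits.
            self._conn.execute(f"RELEASE {name}")
        finally:
            self._depth -= 1

    def addRun(self, collection):
        self._conn.execute("INSERT INTO Run VALUES (?)", (collection,))

    def collections(self):
        cursor = self._conn.execute("SELECT collection FROM Run ORDER BY collection")
        return [row[0] for row in cursor]


reg = TinyRegistry()
with reg.transaction():
    reg.addRun("a")
    try:
        with reg.transaction():
            reg.addRun("b")
            raise RuntimeError("inner block fails")
    except RuntimeError:
        pass  # the inner changes were rolled back; the outer block continues

result = reg.collections()
```

Only the inner block's insert is undone; the outer transaction still commits its own change when it exits normally.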