SqlRegistry
- 
class lsst.daf.butler.registries.sqlRegistry.SqlRegistry(registryConfig, schemaConfig, create=False)
Bases: lsst.daf.butler.Registry
Registry backed by a SQL database.
Parameters:
- registryConfig : SqlRegistryConfig or str
  Load configuration.
- schemaConfig : SchemaConfig or str
  Definition of the schema to use.
- create : bool
  Assume registry is empty and create a new one.
Attributes Summary
- defaultConfigFile
  Path to configuration defaults.
Methods Summary
- addDataUnitEntry(dataUnitName, values)
  Add a new DataUnit entry.
- addDataset(datasetType, dataId, run[, …])
  Adds a Dataset entry to the Registry.
- addDatasetLocation(ref, datastoreName)
  Add datastore name locating a given dataset.
- addExecution(execution)
  Add a new Execution to the SqlRegistry.
- addQuantum(quantum)
  Add a new Quantum to the SqlRegistry.
- addRun(run)
  Add a new Run to the SqlRegistry.
- associate(collection, refs)
  Add existing Datasets to a collection, possibly creating the collection in the process.
- attachComponent(name, parent, component)
  Attach a component to a dataset.
- disassociate(collection, refs[, remove])
  Remove existing Datasets from a collection.
- ensureRun(run)
  Conditionally add a new Run to the SqlRegistry.
- export(expr)
  Export contents of the SqlRegistry, limited to those reachable from the Datasets identified by the expression expr, into a TableSet format such that it can be imported into a different database.
- find(collection, datasetType, dataId)
  Look up a dataset.
- findDataUnitEntry(dataUnitName, value)
  Return a DataUnit entry corresponding to a value.
- getDataUnitDefinition(dataUnitName)
  Return the definition of a DataUnit (an actual DataUnit object).
- getDataset(id)
  Retrieve a Dataset entry.
- getDatasetLocations(ref)
  Retrieve datastore locations for a given dataset.
- getDatasetType(name)
  Get the DatasetType.
- getExecution(id)
  Retrieve an Execution.
- getQuantum(id)
  Retrieve a Quantum.
- getRegion(dataId)
  Get region associated with a dataId.
- getRun([id, collection])
  Get a Run corresponding to its collection or id.
- import_(tables, collection)
  Import (previously exported) contents into the (possibly empty) SqlRegistry.
- makeDatabaseDict(table, types, key, value)
  Construct a DatabaseDict backed by a table in the same database as this Registry.
- makeProvenanceGraph(expr[, types])
  Make a QuantumGraph that contains the full provenance of all Datasets matching an expression.
- makeRun(collection)
  Create a new Run in the SqlRegistry and return it.
- markInputUsed(quantum, ref)
  Record the given DatasetRef as an actual (not just predicted) input of the given Quantum.
- merge(outputCollection, inputCollections)
  Create a new collection from a series of existing ones.
- query(sql, **params)
  Execute a SQL SELECT statement directly.
- registerDatasetType(datasetType)
  Add a new DatasetType to the SqlRegistry.
- removeDatasetLocation(datastoreName, ref)
  Remove datastore location associated with this dataset.
- selectDataUnits(collections, expr, …)
  Evaluate a filter expression and lists of DatasetTypes and return a set of data unit values.
- setDataUnitRegion(dataUnitNames, value, region)
  Set the region field for a DataUnit instance or a combination thereof and update associated spatial join tables.
- subset(collection, expr, datasetTypes)
  Create a new collection by subsetting an existing one.
- transaction()
  Context manager that implements SQL transactions.
Attributes Documentation
- 
defaultConfigFile = None
Path to configuration defaults. Relative to $DAF_BUTLER_DIR/config or absolute path. Can be None if no defaults are specified.
Methods Documentation
- 
addDataUnitEntry(dataUnitName, values)
Add a new DataUnit entry.
Parameters:
- dataUnitName : str
  Name of the DataUnit (e.g. "Camera").
- values : dict
  Dictionary of columnName, columnValue pairs.
If values includes a “region” key, setDataUnitRegion will automatically be called to set it and update any associated spatial join tables. Region fields associated with a combination of DataUnits must be explicitly set separately.
Raises:
- TypeError
  If the given DataUnit does not have explicit entries in the registry.
- ValueError
  If an entry with the primary-key defined in values is already present.
 
- 
addDataset(datasetType, dataId, run, producer=None, recursive=False)
Adds a Dataset entry to the Registry.
This always adds a new Dataset; to associate an existing Dataset with a new collection, use associate.
Parameters:
- datasetType : DatasetType
  Type of the Dataset.
- dataId : dict
  A dict of DataUnit link name, value pairs that label the DatasetRef within a collection.
- run : Run
  The Run instance that produced the Dataset. Ignored if producer is passed (producer.run is then used instead). A Run must be provided by one of the two arguments.
- producer : Quantum
  Unit of work that produced the Dataset. May be None to store no provenance information, but if present the Quantum must already have been added to the SqlRegistry.
- recursive : bool
  If True, recursively add Dataset and attach entries for component Datasets as well.
Returns:
- ref : DatasetRef
  A newly-created DatasetRef instance.
Raises:
- ValueError
  If a Dataset with the given DatasetRef already exists in the given collection.
- Exception
  If dataId contains unknown or invalid DataUnit entries.
 
- 
addDatasetLocation(ref, datastoreName)
Add datastore name locating a given dataset.
Typically used by Datastore.
Parameters:
- ref : DatasetRef
  A reference to the dataset for which to add storage information.
- datastoreName : str
  Name of the datastore holding this dataset.
 
- 
addExecution(execution)
Add a new Execution to the SqlRegistry.
If execution.id is None the SqlRegistry will update it to that of the newly inserted entry.
Parameters:
- execution : Execution
  Instance to add to the SqlRegistry. The given Execution must not already be present in the SqlRegistry.
Raises:
- Exception
  If Execution is already present in the SqlRegistry.
 
- 
addQuantum(quantum)
Add a new Quantum to the SqlRegistry.
Parameters:
- quantum : Quantum
  Instance to add to the SqlRegistry. The given Quantum must not already be present in the SqlRegistry (or any other), therefore its:
  - run attribute must be set to an existing Run.
  - predictedInputs attribute must be fully populated with DatasetRefs.
  - actualInputs and outputs will be ignored.
 
- 
addRun(run)
Add a new Run to the SqlRegistry.
Parameters:
- run : Run
  Instance to add to the SqlRegistry. The given Run must not already be present in the SqlRegistry (or any other). Therefore its id must be None and its collection must not be associated with any existing Run.
Raises:
- ValueError
  If a run already exists with this collection.
 
- 
associate(collection, refs)
Add existing Datasets to a collection, possibly creating the collection in the process.
Parameters:
- collection : str
  Indicates the collection the Datasets should be associated with.
- refs : list of DatasetRef
  A list of DatasetRef instances that already exist in this SqlRegistry.
 
- 
attachComponent(name, parent, component)
Attach a component to a dataset.
Parameters:
- name : str
  Name of the component.
- parent : DatasetRef
  A reference to the parent dataset. Will be updated to reference the component.
- component : DatasetRef
  A reference to the component dataset.
 
- 
disassociate(collection, refs, remove=True)
Remove existing Datasets from a collection.
collection and ref combinations that are not currently associated are silently ignored.
Parameters:
- collection : str
  The collection the Datasets should no longer be associated with.
- refs : list of DatasetRef
  A list of DatasetRef instances that already exist in this SqlRegistry.
- remove : bool
  If True, remove Datasets from the SqlRegistry if they are not associated with any collection (including via any composites).
Returns:
 
- 
ensureRun(run)
Conditionally add a new Run to the SqlRegistry.
If the run.id is None or a Run with this id doesn’t exist in the Registry yet, add it. Otherwise, ensure the provided run is identical to the one already in the registry.
Parameters:
- run : Run
  Instance to add to the SqlRegistry.
Raises:
- ValueError
  If run already exists, but is not identical.
 
- 
export(expr)
Export contents of the SqlRegistry, limited to those reachable from the Datasets identified by the expression expr, into a TableSet format such that it can be imported into a different database.
Parameters:
- expr : str
  An expression (SQL query that evaluates to a list of Dataset primary keys) that selects the Datasets, or a QuantumGraph that can be similarly interpreted.
Returns:
- ts : TableSet
  Containing all rows, from all tables in the SqlRegistry, that are reachable from the selected Datasets.
 
- 
find(collection, datasetType, dataId)
Look up a dataset.
This can be used to obtain a DatasetRef that permits the dataset to be read from a Datastore.
Parameters:
Returns:
- ref : DatasetRef
  A ref to the Dataset, or None if no matching Dataset was found.
Raises:
- ValueError
  If dataId is invalid.
 
- 
findDataUnitEntry(dataUnitName, value)
Return a DataUnit entry corresponding to a value.
Parameters:
Returns:
- 
getDataUnitDefinition(dataUnitName)
Return the definition of a DataUnit (an actual DataUnit object).
Parameters:
- dataUnitName : str
  Name of the DataUnit, e.g. “Camera”, “Tract”, etc.
 
- 
getDataset(id)
Retrieve a Dataset entry.
Parameters:
- id : int
  The unique identifier for the Dataset.
Returns:
- ref : DatasetRef
  A ref to the Dataset, or None if no matching Dataset was found.
 
- 
getDatasetLocations(ref)
Retrieve datastore locations for a given dataset.
Typically used by Datastore.
Parameters:
- ref : DatasetRef
  A reference to the dataset for which to retrieve storage information.
Returns:
 
- 
getDatasetType(name)
Get the DatasetType.
Parameters:
- name : str
  Name of the type.
Returns:
- type : DatasetType
  The DatasetType associated with the given name.
Raises:
- KeyError
  Requested named DatasetType could not be found in registry.
 
- 
getExecution(id)
Retrieve an Execution.
Parameters:
- id : int
  The unique identifier for the Execution.
 
- 
getRegion(dataId)
Get region associated with a dataId.
Parameters:
Returns:
- region : lsst.sphgeom.ConvexPolygon
  The region associated with a dataId or None if not present.
Raises:
- KeyError
  If the set of DataUnits for the dataId does not correspond to a unique spatial lookup.
 
- 
getRun(id=None, collection=None)
Get a Run corresponding to its collection or id.
Parameters:
Returns:
- run : Run
  The Run instance.
Raises:
- ValueError
  Must supply one of collection or id.
 
- 
import_(tables, collection)
Import (previously exported) contents into the (possibly empty) SqlRegistry.
Parameters:
- ts : TableSet
  Contains the previously exported content.
- collection : str
  An additional collection assigned to the newly imported Datasets.
 
- 
makeDatabaseDict(table, types, key, value)
Construct a DatabaseDict backed by a table in the same database as this Registry.
Parameters:
- table : table
  Name of the table that backs the returned DatabaseDict. If this table already exists, its schema must include at least everything in types.
- types : dict
  A dictionary mapping str field names to type objects, containing all fields to be held in the database.
- key : str
  The name of the field to be used as the dictionary key. Must not be present in value._fields.
- value : type
  The type used for the dictionary’s values, typically a namedtuple. Must have a _fields class attribute that is a tuple of field names (i.e. as defined by namedtuple); these field names must also appear in the types arg, and a _make attribute to construct it from a sequence of values (again, as defined by namedtuple).
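The constraints on key, value, and types can be checked before any table is touched. The following is a minimal sketch and not part of SqlRegistry: the helper check_database_dict_args, the FileInfo type, and the field names (dataset_id, path, size) are hypothetical and exist only to illustrate the rules stated above.

```python
from collections import namedtuple


def check_database_dict_args(types, key, value):
    """Check the documented constraints (hypothetical helper, not a butler API)."""
    # The key field must not double as one of the value's fields.
    if key in value._fields:
        raise ValueError(f"key {key!r} must not be present in value._fields")
    # Every field of the value type must be declared in ``types``.
    missing = [name for name in value._fields if name not in types]
    if missing:
        raise ValueError(f"fields missing from types: {missing}")
    # ``_make`` rebuilds a value from a sequence of column values.
    if not callable(getattr(value, "_make", None)):
        raise TypeError("value must provide a _make constructor")


# Hypothetical record type and column types, for illustration only.
FileInfo = namedtuple("FileInfo", ["path", "size"])
types = {"dataset_id": int, "path": str, "size": int}

check_database_dict_args(types, key="dataset_id", value=FileInfo)

# A row read back from the table can be turned into a value with _make.
record = FileInfo._make(["a/b.fits", 1024])
```

A namedtuple satisfies both the _fields and the _make requirement without extra code, which is presumably why it is named as the typical choice for value.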
 
- 
makeProvenanceGraph(expr, types=None)
Make a QuantumGraph that contains the full provenance of all Datasets matching an expression.
Parameters:
- expr : str
  An expression (SQL query that evaluates to a list of Dataset primary keys) that selects the Datasets.
Returns:
- graph : QuantumGraph
  Instance (with units set to None).
 
- 
makeRun(collection)
Create a new Run in the SqlRegistry and return it.
If a run with this collection already exists, return that instead.
Parameters:
- collection : str
  The collection used to identify all inputs and outputs of the Run.
Returns:
- run : Run
  A new Run instance.
 
- 
markInputUsed(quantum, ref)
Record the given DatasetRef as an actual (not just predicted) input of the given Quantum.
This updates both the SqlRegistry's Quantum table and the Python Quantum.actualInputs attribute.
Parameters:
- quantum : Quantum
  Producer to update. Will be updated in this call.
- ref : DatasetRef
  To set as actually used input.
Raises:
- KeyError
  If quantum is not a predicted consumer for ref.
 
- 
merge(outputCollection, inputCollections)
Create a new collection from a series of existing ones.
Entries earlier in the list will be used in preference to later entries when both contain Datasets with the same DatasetRef.
Parameters:
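The precedence rule can be illustrated without a database. This is a minimal sketch under stated assumptions, not the SqlRegistry implementation: each collection is modelled as a plain dict from a hypothetical (dataset type name, data id) key to a dataset id, and merge_collections is an illustrative name.

```python
def merge_collections(input_collections):
    """Merge dict-modelled collections; earlier entries win on conflicts."""
    merged = {}
    for contents in input_collections:
        for ref_key, dataset_id in contents.items():
            # setdefault keeps the value from the earliest collection
            # that contains this key and ignores later duplicates.
            merged.setdefault(ref_key, dataset_id)
    return merged


# Hypothetical contents: both collections contain ("calexp", 2).
first = {("calexp", 1): 101, ("calexp", 2): 102}
second = {("calexp", 2): 202, ("calexp", 3): 203}

merged = merge_collections([first, second])
```

Here ("calexp", 2) resolves to 102 because first precedes second in the input list; reversing the list would select 202 instead.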
- 
query(sql, **params)
Execute a SQL SELECT statement directly.
Named parameters are specified in the SQL query string by preceding them with a colon. Parameter values are provided as additional keyword arguments. For example:
  registry.query("SELECT * FROM Camera WHERE camera=:name", name="HSC")
Parameters:
- sql : str
  SQL query string. Must be a SELECT statement.
- **params
  Parameter name-value pairs to insert into the query.
Yields:
- row : dict
  The next row result from executing the query.
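The colon-prefixed parameter style and the dict-per-row results can be tried out with the standard library alone. The sketch below uses sqlite3 as a stand-in for the registry database; the free function query, the table contents, and the module column are assumptions made for illustration and do not reproduce the SqlRegistry internals.

```python
import sqlite3


def query(connection, sql, **params):
    """Yield each result row of a SELECT statement as a dict (illustrative stand-in)."""
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("Only SELECT statements are supported")
    # sqlite3 binds ":name" placeholders from a mapping of parameter names.
    cursor = connection.execute(sql, params)
    columns = [column[0] for column in cursor.description]
    for row in cursor:
        yield dict(zip(columns, row))


# Hypothetical table contents; the "module" column is invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Camera (camera TEXT PRIMARY KEY, module TEXT)")
conn.execute("INSERT INTO Camera VALUES ('HSC', 'lsst.obs.hsc')")
conn.execute("INSERT INTO Camera VALUES ('DECam', 'lsst.obs.decam')")

rows = list(query(conn, "SELECT * FROM Camera WHERE camera=:name", name="HSC"))
```

Binding values through named parameters, rather than formatting them into the SQL string, leaves quoting and escaping to the database driver.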
 
- 
registerDatasetType(datasetType)
Add a new DatasetType to the SqlRegistry.
It is not an error to register the same DatasetType twice.
Parameters:
- datasetType : DatasetType
  The DatasetType to be added.
Returns:
- inserted : bool
  True if datasetType was inserted, False if an identical existing DatasetType was found.
Raises:
- ValueError
  DatasetType is not valid for this registry or is already registered but not identical.
 
- 
removeDatasetLocation(datastoreName, ref)
Remove datastore location associated with this dataset.
Typically used by Datastore when a dataset is removed.
Parameters:
- datastoreName : str
  Name of this Datastore.
- ref : DatasetRef
  A reference to the dataset for which information is to be removed.
 
- 
selectDataUnits(collections, expr, neededDatasetTypes, futureDatasetTypes)
Evaluate a filter expression and lists of DatasetTypes and return a set of data unit values.
Returned set consists of combinations of units participating in data transformation from neededDatasetTypes to futureDatasetTypes, restricted by existing data and filter expression.
Parameters:
- collections : list of str
  An ordered list of collections indicating the collections to search for Datasets.
- expr : str
  An expression that limits the DataUnits and (indirectly) the Datasets returned.
- neededDatasetTypes : list of DatasetType
  The list of DatasetTypes whose instances should be included in the graph and limit its extent.
- futureDatasetTypes : list of DatasetType
  The list of DatasetTypes whose instances may be added to the graph later, which requires that their DataUnit types must be present in the graph.
Returns:
- header : tuple of tuple
  Length of tuple equals the number of columns in the returned result set. Each item is a tuple with two elements: DataUnit name (e.g. “Visit”) and unit value name (e.g. “visit”).
- rows : sequence of tuple
  Result set, this can be a single-pass iterator. Each tuple contains unit values corresponding to units in a header.
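Because rows may be a single-pass iterator, it is safest to consume it once and pair each value with its header column at that point. The sketch below works on hand-written stand-ins for the two return values; the header entries and unit values are invented for illustration and no registry call is made.

```python
# Stand-ins for the documented return values (hypothetical data).
header = (("Visit", "visit"), ("Sensor", "sensor"))
rows = iter([(1, 10), (1, 11), (2, 10)])  # single-pass, like the real result may be

# Consume the iterator exactly once, labelling every value with its
# (DataUnit name, unit value name) pair from the header.
records = [dict(zip(header, row)) for row in rows]

# A second pass over the same iterator yields nothing.
leftover = list(rows)

visits = sorted({record[("Visit", "visit")] for record in records})
```

Materialising the rows into a list costs memory proportional to the result set, so streaming consumers may prefer to process each row inside the loop instead.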
 
- 
setDataUnitRegion(dataUnitNames, value, region, update=True)
Set the region field for a DataUnit instance or a combination thereof and update associated spatial join tables.
Parameters:
- dataUnitNames : sequence
  A sequence of DataUnit names whose instances are jointly associated with a region on the sky. This must not include dependencies that are implied, e.g. “Patch” must not include “Tract”, but “Sensor” needs to add “Visit”.
- value : dict
  A dictionary of values that uniquely identify the DataUnits.
- region : sphgeom.ConvexPolygon
  Region on the sky.
- update : bool
  If True, existing region information for these DataUnits is being replaced. This is usually required because DataUnit entries are assumed to be pre-inserted prior to calling this function.
- 
subset(collection, expr, datasetTypes)
Create a new collection by subsetting an existing one.
Parameters:
Returns:
- collection : str
  The newly created collection.
 
- 
transaction()
Context manager that implements SQL transactions.
Will roll back any changes to the SqlRegistry database in case an exception is raised in the enclosed block.
This context manager may be nested.
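The rollback-on-exception and nesting behaviour can be sketched with sqlite3 savepoints. This is an illustration under assumptions, not the SqlRegistry implementation: the free function transaction, the savepoint naming scheme, and the one-column Run table are all hypothetical.

```python
import sqlite3
from contextlib import contextmanager
from itertools import count

_savepoint_ids = count()  # gives each nesting level a unique savepoint name


@contextmanager
def transaction(connection):
    """Roll back the enclosed block's changes if it raises (illustrative stand-in)."""
    name = f"sp_{next(_savepoint_ids)}"
    connection.execute(f"SAVEPOINT {name}")
    try:
        yield
    except BaseException:
        # Undo only the work done since this savepoint, then re-raise.
        connection.execute(f"ROLLBACK TO SAVEPOINT {name}")
        connection.execute(f"RELEASE SAVEPOINT {name}")
        raise
    else:
        connection.execute(f"RELEASE SAVEPOINT {name}")


# isolation_level=None stops sqlite3 from opening transactions implicitly,
# so the savepoints above are the only transaction control in play.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Run (collection TEXT)")

with transaction(conn):
    conn.execute("INSERT INTO Run VALUES ('kept')")
    try:
        with transaction(conn):
            conn.execute("INSERT INTO Run VALUES ('discarded')")
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass  # the inner block was rolled back; the outer one continues

collections = [row[0] for row in conn.execute("SELECT collection FROM Run")]
```

Catching the exception between the two levels keeps the outer block's work; letting it propagate would roll back the outer block as well.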