SqlRegistry
- 
class lsst.daf.butler.registries.sqlRegistry.SqlRegistry(registryConfig, schemaConfig, dimensionConfig, create=False, butlerRoot=None)
- Bases: lsst.daf.butler.Registry
- Registry backed by a SQL database.
- Parameters:
- registryConfig : SqlRegistryConfig or str
- Load configuration.
- schemaConfig : SchemaConfig or str
- Definition of the schema to use. 
- dimensionConfig : DimensionConfig or Config
- DimensionGraph configuration.
- create : bool
- Assume registry is empty and create a new one. 
 - Attributes Summary
- defaultConfigFile: Path to configuration defaults.
- limited: If True, this Registry does not maintain Dimension metadata or relationships (bool).
- pixelization: Object that interprets skypix Dimension values (lsst.sphgeom.Pixelization).
 - Methods Summary
- addDataset(datasetType, dataId, run[, …]): Adds a Dataset entry to the Registry.
- addDatasetLocation(ref, datastoreName): Add datastore name locating a given dataset.
- addDimensionEntry(dimension[, dataId, entry]): Add a new Dimension entry.
- addDimensionEntryList(dimension, dataIdList): Add a new Dimension entry.
- addExecution(execution): Add a new Execution to the Registry.
- addQuantum(quantum): Add a new Quantum to the Registry.
- addRun(run): Add a new Run to the Registry.
- associate(collection, refs): Add existing Datasets to a collection, implicitly creating the collection if it does not already exist.
- attachComponent(name, parent, component): Attach a component to a dataset.
- disassociate(collection, refs): Remove existing Datasets from a collection.
- ensureRun(run): Conditionally add a new Run to the Registry.
- expandDataId([dataId, dimension, metadata, …]): Expand a data ID to include additional information.
- find(collection, datasetType[, dataId]): Look up a dataset.
- findDimensionEntries(dimension): Return all Dimension entries corresponding to the named dimension.
- findDimensionEntry(dimension[, dataId]): Return a Dimension entry corresponding to a DataId.
- fromConfig(registryConfig[, schemaConfig, …]): Create Registry subclass instance from config.
- getAllCollections(): Get names of all the collections found in this repository.
- getAllDatasetTypes(): Get every registered DatasetType.
- getDataset(id[, datasetType, dataId]): Retrieve a Dataset entry.
- getDatasetLocations(ref): Retrieve datastore locations for a given dataset.
- getDatasetType(name): Get the DatasetType.
- getExecution(id): Retrieve an Execution.
- getQuantum(id): Retrieve a Quantum.
- getRun([id, collection]): Get a Run corresponding to its collection or id.
- makeDataIdPacker(name[, dataId]): Create an object that can pack certain data IDs into integers.
- makeDatabaseDict(table, types, key, value[, …]): Construct a DatabaseDict backed by a table in the same database as this Registry.
- makeRun(collection): Create a new Run in the Registry and return it.
- markInputUsed(quantum, ref): Record the given DatasetRef as an actual (not just predicted) input of the given Quantum.
- packDataId(name[, dataId, returnMaxBits]): Pack the given DataId into an integer.
- query(sql, **params): Execute a SQL SELECT statement directly.
- registerDatasetType(datasetType): Add a new DatasetType to the Registry.
- removeDataset(ref): Remove a dataset from the Registry.
- removeDatasetLocation(datastoreName, ref): Remove datastore location associated with this dataset.
- selectMultipleDatasetTypes(originInfo[, …]): Evaluate a filter expression and lists of DatasetTypes and return a set of dimension values.
- setConfigRoot(root, config, full[, overwrite]): Set any filesystem-dependent config options for this Registry to be appropriate for a new empty repository with the given root.
- setDimensionRegion([dataId, update, region]): Set the region field for a Dimension instance or a combination thereof and update associated spatial join tables.
- transaction(): Context manager that implements SQL transactions.
 - Attributes Documentation
 - 
defaultConfigFile = None
- Path to configuration defaults. Relative to $DAF_BUTLER_DIR/config or absolute path. Can be None if no defaults specified.
 - 
pixelization
- Object that interprets skypix Dimension values (lsst.sphgeom.Pixelization). None for limited registries.
 - Methods Documentation
 - 
addDataset(datasetType, dataId, run, producer=None, recursive=False, **kwds)
- Adds a Dataset entry to the Registry.
- This always adds a new Dataset; to associate an existing Dataset with a new collection, use associate.
- Parameters:
- datasetType : DatasetType or str
- A DatasetType or the name of one.
- dataId : dict or DataId
- A dict-like object containing the Dimension links that identify the dataset within a collection.
- run : Run
- The Run instance that produced the Dataset. Ignored if producer is passed (producer.run is then used instead). A Run must be provided by one of the two arguments.
- producer : Quantum
- Unit of work that produced the Dataset. May be None to store no provenance information, but if present the Quantum must already have been added to the Registry.
- recursive : bool
- If True, recursively add Dataset and attach entries for component Datasets as well.
- kwds
- Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
- Returns:
- ref : DatasetRef
- A newly-created DatasetRef instance.
- Raises:
- ConflictingDefinitionError
- If a Dataset with the given DatasetRef already exists in the given collection.
- Exception
- If dataId contains unknown or invalid Dimension entries.
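The interplay between run and producer in addDataset can be sketched in plain Python. This is an illustration only: resolve_run is a hypothetical helper, the SimpleNamespace objects stand in for Run and Quantum, and the ValueError raised when neither argument supplies a Run is an assumption, since the text does not name the exception.

```python
from types import SimpleNamespace


def resolve_run(run=None, producer=None):
    """Pick the Run for a new Dataset, following the rule described for addDataset."""
    if producer is not None:
        # producer.run takes precedence; an explicitly passed run is ignored.
        return producer.run
    if run is None:
        # Assumed error type: the documentation only says a Run must be provided.
        raise ValueError("a Run must be provided via `run` or `producer`")
    return run


# Hypothetical stand-ins for Run and Quantum instances.
run_a = SimpleNamespace(collection="a")
run_b = SimpleNamespace(collection="b")
quantum = SimpleNamespace(run=run_b)

chosen = resolve_run(run=run_a, producer=quantum)
print(chosen.collection)
```

Here the Run attached to the producer is selected even though run_a was also passed.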
 - 
addDatasetLocation(ref, datastoreName)
- Add datastore name locating a given dataset.
- Typically used by Datastore.
- Parameters:
- ref : DatasetRef
- A reference to the dataset for which to add storage information.
- datastoreName : str
- Name of the datastore holding this dataset.
- Raises:
- AmbiguousDatasetError
- Raised if ref.id is None.
 - 
addDimensionEntry(dimension, dataId=None, entry=None, **kwds)
- Add a new Dimension entry.
- Parameters:
- dimension : str or Dimension
- Either a Dimension object or the name of one.
- dataId : dict or DataId, optional
- A dict-like object containing the Dimension links that form the primary key of the row to insert. If this is a full DataId object, dataId.entries[dimension] will be updated with entry and then inserted into the Registry.
- entry : dict
- Dictionary that maps column name to column value.
- kwds
- Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
- If values includes a “region” key, setDimensionRegion will automatically be called to set it and update any associated spatial join tables. Region fields associated with a combination of Dimensions must be explicitly set separately.
- Returns:
- dataId : DataId
- A Data ID for exactly the given dimension that includes the added entry.
- Raises:
 - 
addDimensionEntryList(dimension, dataIdList, entry=None, **kwds)
- Add a new Dimension entry.
- Parameters:
- dimension : str or Dimension
- Either a Dimension object or the name of one.
- dataIdList : list of dict or DataId
- A list of dict-like objects containing the Dimension links that form the primary key of the rows to insert. If these are full DataId objects, dataId.entries[dimension] will be updated with entry and then inserted into the Registry.
- entry : dict
- Dictionary that maps column name to column value.
- kwds
- Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
- If values includes a “region” key, regions will automatically be added to any associated spatial join tables. Region fields associated with a combination of Dimensions must be explicitly set separately.
- Returns:
- dataId : DataId
- A Data ID for exactly the given dimension that includes the added entry.
- Raises:
 - 
addExecution(execution)
- Add a new Execution to the Registry.
- If execution.id is None, the Registry will update it to that of the newly inserted entry.
- Parameters:
- execution : Execution
- Instance to add to the Registry. The given Execution must not already be present in the Registry.
- Raises:
- ConflictingDefinitionError
- If execution is already present in the Registry.
 - 
addQuantum(quantum)
- Add a new Quantum to the Registry.
- Parameters:
- quantum : Quantum
- Instance to add to the Registry. The given Quantum must not already be present in the Registry (or any other); therefore its:
- run attribute must be set to an existing Run;
- predictedInputs attribute must be fully populated with DatasetRefs; and its
- actualInputs and outputs will be ignored.
 - 
addRun(run)
- Add a new Run to the Registry.
- Parameters:
- Raises:
- ConflictingDefinitionError
- If a run already exists with this collection.
 
 - 
associate(collection, refs)
- Add existing Datasets to a collection, implicitly creating the collection if it does not already exist.
- If a DatasetRef with the exact same dataset_id is already in a collection, nothing is changed. If a DatasetRef with the same DatasetType and dimension values but with a different dataset_id exists in the collection, ValueError is raised.
- Parameters:
- collection : str
- Indicates the collection the Datasets should be associated with.
- refs : iterable of DatasetRef
- An iterable of DatasetRef instances that already exist in this Registry. All component datasets will be associated with the collection as well.
- Raises:
- ConflictingDefinitionError
- If a Dataset with the given DatasetRef already exists in the given collection.
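The association rule described for associate (same dataset_id is a no-op, same DatasetType and dimension values with a different dataset_id is a conflict) can be modelled with a dictionary. This is a toy sketch, not the SQL implementation: the function, the tuple layout of refs, and the ConflictError class are all hypothetical.

```python
class ConflictError(Exception):
    """Hypothetical stand-in for the conflict error described for associate."""


def associate(collections, collection, refs):
    """Toy model: `collections` maps name -> {(datasetType, dataId): dataset_id}."""
    members = collections.setdefault(collection, {})  # implicit collection creation
    for dataset_id, dataset_type, data_id in refs:
        key = (dataset_type, tuple(sorted(data_id.items())))
        existing = members.get(key)
        if existing is None:
            members[key] = dataset_id
        elif existing != dataset_id:
            # Same DatasetType and dimension values, different dataset_id.
            raise ConflictError(f"{key} already associated with dataset {existing}")
        # Same dataset_id already present: nothing changes.


collections = {}
associate(collections, "calib", [(1, "bias", {"detector": 4})])
associate(collections, "calib", [(1, "bias", {"detector": 4})])  # no-op
```

Associating a second dataset (for example id 2) with the same type and data ID would raise ConflictError in this sketch.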
 - 
attachComponent(name, parent, component)
- Attach a component to a dataset.
- Parameters:
- name : str
- Name of the component.
- parent : DatasetRef
- A reference to the parent dataset. Will be updated to reference the component.
- component : DatasetRef
- A reference to the component dataset.
- Raises:
- AmbiguousDatasetError
- Raised if parent.id or component.id is None.
 - 
disassociate(collection, refs)
- Remove existing Datasets from a collection.
- collection and ref combinations that are not currently associated are silently ignored.
- Parameters:
- Raises:
- AmbiguousDatasetError
- Raised if any(ref.id is None for ref in refs).
 
 - 
ensureRun(run)
- Conditionally add a new Run to the Registry.
- If run.id is None or a Run with this id doesn’t exist in the Registry yet, add it. Otherwise, ensure the provided run is identical to the one already in the registry.
- Parameters:
- run : Run
- Instance to add to the Registry.
- Raises:
- ConflictingDefinitionError
- If run already exists, but is not identical.
 - 
expandDataId(dataId=None, *, dimension=None, metadata=None, region=False, update=False, **kwds)
- Expand a data ID to include additional information.
- expandDataId always returns a true DataId and ensures that its entries dict contains (at least) values for all implied dependencies.
- Parameters:
- dataId : dict or DataId
- A dict-like object containing the Dimension links that include the primary keys of the rows to query. If this is a true DataId, the object will be updated in-place.
- dimension : Dimension or str
- A dimension passed to the DataId constructor to create a true DataId or augment an existing one.
- metadata : collections.abc.Mapping, optional
- A mapping from Dimension or str name to column name, indicating fields to read into dataId.entries. If dimension is provided, may instead be a sequence of column names for that dimension.
- region : bool
- If True and the given DataId is uniquely associated with a region on the sky, obtain that region from the Registry and attach it as dataId.region.
- update : bool
- If True, assume existing entries and regions in the given DataId are out-of-date and should be updated by values in the database. If False, existing values will be assumed to be correct and database queries will only be executed if they are missing.
- kwds
- Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
- Returns:
- dataId : DataId
- A Data ID with all requested data populated.
- Raises:
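The effect of the update flag in expandDataId can be illustrated with a small stand-alone function. This is a sketch under assumptions: expand_entries, the fetch callable, and the field names are hypothetical and only mimic the "query only what is missing unless update is True" behaviour described above.

```python
def expand_entries(entries, fields, fetch, update=False):
    """Fill `entries` in place for each requested field.

    `fetch(field)` stands in for a database query (hypothetical).
    """
    for field in fields:
        if update or field not in entries:
            # Either the value is missing, or we were told to distrust it.
            entries[field] = fetch(field)
    return entries


database = {"exposure_time": 30.0, "filter": "r"}
queried = []


def fetch(field):
    queried.append(field)  # record which fields triggered a query
    return database[field]


entries = {"filter": "stale"}
expand_entries(entries, ["exposure_time", "filter"], fetch, update=False)
first_pass = dict(entries)
expand_entries(entries, ["exposure_time", "filter"], fetch, update=True)
```

With update=False only the missing field is fetched and the existing value is trusted; with update=True both fields are refreshed from the stand-in database.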
 - 
find(collection, datasetType, dataId=None, **kwds)
- Look up a dataset.
- This can be used to obtain a DatasetRef that permits the dataset to be read from a Datastore.
- Parameters:
- collection : str
- Identifies the collection to search.
- datasetType : DatasetType or str
- A DatasetType or the name of one.
- dataId : dict or DataId, optional
- A dict-like object containing the Dimension links that identify the dataset within a collection.
- kwds
- Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
- Returns:
- ref : DatasetRef
- A ref to the Dataset, or None if no matching Dataset was found.
- Raises:
- LookupError
- If one or more data ID keys are missing.
 - 
findDimensionEntries(dimension)
- Return all Dimension entries corresponding to the named dimension.
- Parameters:
- dimension : str or Dimension
- Either a Dimension object or the name of one.
- Returns:
- Raises:
 - 
findDimensionEntry(dimension, dataId=None, **kwds)
- Return a Dimension entry corresponding to a DataId.
- Parameters:
- dimension : str or Dimension
- Either a Dimension object or the name of one.
- dataId : dict or DataId, optional
- A dict-like object containing the Dimension links that form the primary key of the row to retrieve. If this is a full DataId object, dataId.entries[dimension] will be updated with the entry obtained from the Registry.
- kwds
- Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
- Returns:
- Raises:
 - 
static fromConfig(registryConfig, schemaConfig=None, dimensionConfig=None, create=False, butlerRoot=None)
- Create Registry subclass instance from config.
- Uses registry.cls from config to determine which subclass to instantiate.
- Parameters:
- registryConfig : ButlerConfig, RegistryConfig, Config or str
- Registry configuration.
- schemaConfig : SchemaConfig, Config or str, optional
- Schema configuration. Can be read from supplied registryConfig if the relevant component is defined and schemaConfig is None.
- dimensionConfig : DimensionConfig or Config or str, optional
- DimensionGraph configuration. Can be read from supplied registryConfig if the relevant component is defined and dimensionConfig is None.
- create : bool
- Assume empty Registry and create a new one.
- Returns:
- registry : Registry (subclass)
- A new Registry subclass instance.
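Selecting a subclass from a configuration value, as fromConfig does with registry.cls, is commonly done by importing a class from its fully qualified name. The sketch below assumes the value is such a dotted path; that format, the resolve_class helper, and the nested dict standing in for the config are illustrative assumptions, and a standard-library class is used so the example runs without the LSST stack.

```python
import importlib


def resolve_class(dotted_path):
    """Import and return the class named by a 'package.module.Class' string."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Hypothetical configuration; a real one would name a Registry subclass.
config = {"registry": {"cls": "collections.OrderedDict"}}

cls = resolve_class(config["registry"]["cls"])
instance = cls()
print(type(instance).__name__)
```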
 - 
getAllCollections()
- Get names of all the collections found in this repository.
- Returns:
 - 
getAllDatasetTypes()
- Get every registered DatasetType.
- Returns:
- types : frozenset of DatasetType
- Every DatasetType in the registry.
 - 
getDataset(id, datasetType=None, dataId=None)
- Retrieve a Dataset entry.
- Parameters:
- id : int
- The unique identifier for the Dataset.
- datasetType : DatasetType, optional
- The DatasetType of the dataset to retrieve. This is used to short-circuit retrieving the DatasetType, so if provided, the caller is guaranteeing that it is what would have been retrieved.
- dataId : DataId, optional
- A Dimension-based identifier for the dataset within a collection, possibly containing additional metadata. This is used to short-circuit retrieving the DataId, so if provided, the caller is guaranteeing that it is what would have been retrieved.
- Returns:
- ref : DatasetRef
- A ref to the Dataset, or None if no matching Dataset was found.
 - 
getDatasetLocations(ref)
- Retrieve datastore locations for a given dataset.
- Typically used by Datastore.
- Parameters:
- ref : DatasetRef
- A reference to the dataset for which to retrieve storage information.
- Returns:
- Raises:
- AmbiguousDatasetError
- Raised if ref.id is None.
 - 
getDatasetType(name)
- Get the DatasetType.
- Parameters:
- name : str
- Name of the type.
- Returns:
- type : DatasetType
- The DatasetType associated with the given name.
- Raises:
- KeyError
- Requested named DatasetType could not be found in registry.
 - 
getExecution(id)
- Retrieve an Execution.
- Parameters:
- id : int
- The unique identifier for the Execution.
 - 
getRun(id=None, collection=None)
- Get a Run corresponding to its collection or id.
- Parameters:
- Returns:
- run : Run
- The Run instance.
- Raises:
- ValueError
- Must supply one of collection or id.
 - 
makeDataIdPacker(name, dataId=None, **kwds)
- Create an object that can pack certain data IDs into integers.
- Parameters:
- Returns:
- packer : DataIdPacker
- Instance of a subclass of DataIdPacker.
 - 
makeDatabaseDict(table, types, key, value, lengths=None)
- Construct a DatabaseDict backed by a table in the same database as this Registry.
- Parameters:
- table : str
- Name of the table that backs the returned DatabaseDict. If this table already exists, its schema must include at least everything in types.
- types : dict
- A dictionary mapping str field names to type objects, containing all fields to be held in the database.
- key : str
- The name of the field to be used as the dictionary key. Must not be present in value._fields.
- value : type
- The type used for the dictionary’s values, typically a namedtuple. Must have a _fields class attribute that is a tuple of field names (i.e. as defined by namedtuple); these field names must also appear in the types arg. Must also have a _make attribute to construct it from a sequence of values (again, as defined by namedtuple).
- lengths : dict, optional
- Specific lengths of string fields. Defaults will be used if not specified.
- Returns:
- databaseDict : DatabaseDict
- DatabaseDict backed by this registry.
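The constraints that makeDatabaseDict places on types, key, and value can be checked with ordinary namedtuple code. The sketch below does not call the Registry; the Location type, the field names, and check_database_dict_args are hypothetical and exist only to show arguments that would satisfy the documented requirements.

```python
from collections import namedtuple

# Hypothetical value type and schema, for illustration only.
Location = namedtuple("Location", ["path", "checksum"])
types = {"dataset_id": int, "path": str, "checksum": str}
key = "dataset_id"


def check_database_dict_args(types, key, value):
    """Return a list of violations of the documented argument constraints."""
    problems = []
    if key in value._fields:
        problems.append(f"key {key!r} must not be one of value._fields")
    for field in value._fields:
        if field not in types:
            problems.append(f"value field {field!r} is missing from types")
    if not callable(getattr(value, "_make", None)):
        problems.append("value type needs a _make constructor")
    return problems


problems = check_database_dict_args(types, key, Location)

# _make builds a value from a plain sequence, e.g. a row read from a table.
row = Location._make(["a/b.fits", "abc123"])
print(problems, row.path)
```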
 - 
makeRun(collection)
- Create a new Run in the Registry and return it.
- If a run with this collection already exists, return that instead.
- Parameters:
- collection : str
- The collection used to identify all inputs and outputs of the Run.
- Returns:
- run : Run
- A new Run instance.
 - 
markInputUsed(quantum, ref)
- Record the given DatasetRef as an actual (not just predicted) input of the given Quantum.
- This updates both the Registry’s Quantum table and the Python Quantum.actualInputs attribute.
- Parameters:
- quantum : Quantum
- Producer to update. Will be updated in this call.
- ref : DatasetRef
- To set as actually used input.
- Raises:
- KeyError
- If quantum is not a predicted consumer for ref.
 - 
packDataId(name, dataId=None, *, returnMaxBits=False, **kwds)
- Pack the given DataId into an integer.
- Parameters:
- name : str
- Name of the packer, as given in the Registry configuration.
- dataId : dict or DataId, optional
- Data ID that identifies at least the “required” dimensions of the packer.
- returnMaxBits : bool
- If True, return a tuple of (packed, self.maxBits).
- kwds
- Additional keyword arguments used to augment or override the given data ID.
- Returns:
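One common way to pack several bounded integer dimension values into a single integer is mixed-radix encoding. The sketch below illustrates that general idea and the shape of the returnMaxBits result; it is not the algorithm used by any DataIdPacker subclass, and make_packer, the dimension names, and their sizes are hypothetical.

```python
def make_packer(sizes):
    """Build a packing function from an ordered mapping of name -> value count."""
    total = 1
    for count in sizes.values():
        total *= count
    max_bits = (total - 1).bit_length()  # bits needed for the largest packed value

    def pack(data_id, returnMaxBits=False):
        packed = 0
        for name, count in sizes.items():
            value = data_id[name]
            if not 0 <= value < count:
                raise ValueError(f"{name}={value} is outside [0, {count})")
            packed = packed * count + value  # mixed-radix accumulation
        return (packed, max_bits) if returnMaxBits else packed

    return pack


# Hypothetical dimensions: up to 1000 visits and 200 detectors.
pack = make_packer({"visit": 1000, "detector": 200})
packed = pack({"visit": 12, "detector": 7})
packed_with_bits = pack({"visit": 12, "detector": 7}, returnMaxBits=True)
print(packed, packed_with_bits)
```

In this sketch the packed value is 12 * 200 + 7 = 2407, and 18 bits are enough for every combination because the largest packed value is 199999.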
 - 
query(sql, **params)
- Execute a SQL SELECT statement directly.
- Named parameters are specified in the SQL query string by preceding them with a colon. Parameter values are provided as additional keyword arguments. For example:
- registry.query("SELECT * FROM instrument WHERE instrument=:name", name="HSC")
- Parameters:
- sql : str
- SQL query string. Must be a SELECT statement.
- **params
- Parameter name-value pairs to insert into the query.
- Yields:
- row : dict
- The next row result from executing the query.
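The colon-prefixed named-parameter style used by query can be tried without a Butler repository, because Python's built-in sqlite3 module accepts the same placeholder syntax. The sketch below is a stand-in, not SqlRegistry.query: the module-level query function, the in-memory table, and the visit_max column are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.execute("CREATE TABLE instrument (instrument TEXT, visit_max INTEGER)")
conn.executemany(
    "INSERT INTO instrument VALUES (?, ?)",
    [("HSC", 100), ("DECam", 50)],
)


def query(sql, **params):
    """Yield each result row as a dict, mimicking the documented interface."""
    for row in conn.execute(sql, params):
        yield dict(row)


rows = list(query("SELECT * FROM instrument WHERE instrument=:name", name="HSC"))
print(rows)
```

Passing values as parameters rather than formatting them into the SQL string leaves quoting to the database driver.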
 
 - 
registerDatasetType(datasetType)
- Add a new DatasetType to the Registry.
- It is not an error to register the same DatasetType twice.
- Parameters:
- datasetType : DatasetType
- The DatasetType to be added.
- Returns:
- Raises:
- ValueError
- Raised if the dimensions or storage class are invalid.
- ConflictingDefinitionError
- Raised if this DatasetType is already registered with a different definition.
 - 
removeDataset(ref)
- Remove a dataset from the Registry.
- The dataset and all components will be removed unconditionally from all collections, and any associated Quantum records will also be removed. Datastore records will not be deleted; the caller is responsible for ensuring that the dataset has already been removed from all Datastores.
- Parameters:
- ref : DatasetRef
- Reference to the dataset to be removed. Must include a valid id attribute, and should be considered invalidated upon return.
- Raises:
- AmbiguousDatasetError
- Raised if ref.id is None.
- OrphanedRecordError
- Raised if the dataset is still present in any Datastore.
 - 
removeDatasetLocation(datastoreName, ref)
- Remove datastore location associated with this dataset.
- Typically used by Datastore when a dataset is removed.
- Parameters:
- datastoreName : str
- Name of this Datastore.
- ref : DatasetRef
- A reference to the dataset for which information is to be removed.
- Raises:
- AmbiguousDatasetError
- Raised if ref.id is None.
 - 
selectMultipleDatasetTypes(originInfo, expression=None, required=(), optional=(), prerequisite=(), perDatasetTypeDimensions=(), expandDataIds=True)
- Evaluate a filter expression and lists of DatasetTypes and return a set of dimension values.
- The returned rows consist of combinations of dimensions participating in the transformation from required to optional dataset types, restricted by existing datasets and filter expression.
- Parameters:
- originInfo : DatasetOriginInfo
- Object which provides names of the input/output collections.
- expression : str
- An expression that limits the Dimensions and (indirectly) the Datasets returned.
- required : iterable of DatasetType or str
- The list of DatasetTypes whose Dimensions will be included in the returned column set. Output is limited to the Datasets of these DatasetTypes which already exist in the registry.
- optional : iterable of DatasetType or str
- The list of DatasetTypes whose Dimensions will be included in the returned column set. Datasets of these types may or may not exist in the registry.
- prerequisite : iterable of DatasetType or str
- DatasetTypes that should not constrain the query results, but must be present for all result rows. These are included with a LEFT OUTER JOIN, but the results are checked for NULL. Unlike regular inputs, prerequisite input lookups may be deferred (by some Registry implementations). Any DatasetTypes that are present in both required and prerequisite are considered prerequisite.
- perDatasetTypeDimensions : iterable of Dimension or str, optional
- Dimensions (or str names thereof) for which different dataset types do not need to have the same values in each result row.
- expandDataIds : bool
- If True (default), expand all data IDs when returning them.
- Yields:
- row : MultipleDatasetQueryRow
- Single row is a unique combination of units in a transform.
- Raises:
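The prerequisite behaviour described above (a LEFT OUTER JOIN followed by a NULL check) can be demonstrated with sqlite3. This is a sketch of the SQL pattern only; the raw and calib tables and their contents are hypothetical and do not reflect the Registry schema or the query that selectMultipleDatasetTypes actually builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE raw (visit INTEGER);    -- hypothetical required datasets
    CREATE TABLE calib (visit INTEGER);  -- hypothetical prerequisite datasets
    INSERT INTO raw VALUES (1), (2), (3);
    INSERT INTO calib VALUES (1), (3);
    """
)

# The LEFT OUTER JOIN keeps every raw row, so a missing prerequisite does
# not silently shrink the result set; it shows up as NULL (None) instead.
rows = conn.execute(
    "SELECT raw.visit, calib.visit FROM raw "
    "LEFT OUTER JOIN calib ON raw.visit = calib.visit "
    "ORDER BY raw.visit"
).fetchall()

missing = [visit for visit, calib_visit in rows if calib_visit is None]
print(rows, missing)
```

An inner join would have dropped visit 2 entirely, which is why the outer join plus an explicit NULL check is needed to detect an absent prerequisite.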
 - 
classmethod setConfigRoot(root, config, full, overwrite=True)
- Set any filesystem-dependent config options for this Registry to be appropriate for a new empty repository with the given root.
- Parameters:
- root : str
- Filesystem path to the root of the data repository.
- config : Config
- A Config to update. Only the subset understood by this component will be updated. Will not expand defaults.
- full : Config
- A complete config with all defaults expanded that can be converted to a RegistryConfig. Read-only and will not be modified by this method. Repository-specific options that should not be obtained from defaults when Butler instances are constructed should be copied from full to config.
- overwrite : bool, optional
- If False, do not modify a value in config if the value already exists. Default is always to overwrite with the provided root.
- Notes
- If a keyword is explicitly defined in the supplied config it will not be overridden by this method if overwrite is False. This allows explicit values set in external configs to be retained.
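The overwrite semantics of setConfigRoot can be shown with a plain dictionary. In this sketch set_root_option, the "db" key, and the connection strings are hypothetical; only the rule that an existing value survives when overwrite is False comes from the text above.

```python
def set_root_option(config, key, value, overwrite=True):
    """Set config[key] unless it already exists and overwrite is False."""
    if overwrite or key not in config:
        config[key] = value
    return config


# Hypothetical option that an external config has already set explicitly.
explicit = {"db": "sqlite:///custom/location.sqlite3"}
kept = set_root_option(dict(explicit), "db", "sqlite:///new/root.sqlite3", overwrite=False)
replaced = set_root_option(dict(explicit), "db", "sqlite:///new/root.sqlite3")
filled = set_root_option({}, "db", "sqlite:///new/root.sqlite3", overwrite=False)
print(kept, replaced, filled)
```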
 - 
setDimensionRegion(dataId=None, *, update=True, region=None, **kwds)
- Set the region field for a Dimension instance or a combination thereof and update associated spatial join tables.
- Parameters:
- dataId : dict or DataId
- A dict-like object containing the Dimension links that form the primary key of the row to insert or update. If this is a full DataId, dataId.region will be set to region (if region is not None) and then used to update or insert into the Registry.
- update : bool
- If True, existing region information for these Dimensions is being replaced. This is usually required because Dimension entries are assumed to be pre-inserted prior to calling this function.
- region : lsst.sphgeom.ConvexPolygon, optional
- The region to update or insert into the Registry. If not present, dataId.region must not be None.
- kwds
- Additional keyword arguments passed to the DataId constructor to convert dataId to a true DataId or augment an existing one.
- Returns:
- dataId : DataId
- A Data ID with its region attribute set.
- Raises:
 - 
transaction()
- Context manager that implements SQL transactions.
- Will roll back any changes to the SqlRegistry database in case an exception is raised in the enclosed block.
- This context manager may be nested.
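A nestable, rollback-on-exception context manager of the kind described for transaction can be built on SQL savepoints. The sketch below uses sqlite3 and a hypothetical ToyRegistry class; it shows one possible way to get the documented behaviour and makes no claim about how SqlRegistry implements it.

```python
import sqlite3
from contextlib import contextmanager


class ToyRegistry:
    """Hypothetical stand-in used only to illustrate nesting; not SqlRegistry."""

    def __init__(self):
        # isolation_level=None: the sqlite3 module issues no implicit BEGIN.
        self._conn = sqlite3.connect(":memory:", isolation_level=None)
        self._conn.execute("CREATE TABLE run (collection TEXT)")
        self._depth = 0

    @contextmanager
    def transaction(self):
        name = f"level_{self._depth}"
        self._conn.execute(f"SAVEPOINT {name}")
        self._depth += 1
        try:
            yield
        except BaseException:
            # Undo everything since this savepoint, then discard it.
            self._conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
            self._conn.execute(f"RELEASE SAVEPOINT {name}")
            raise
        else:
            # Releasing the outermost savepoint commits the changes.
            self._conn.execute(f"RELEASE SAVEPOINT {name}")
        finally:
            self._depth -= 1

    def add(self, collection):
        self._conn.execute("INSERT INTO run VALUES (?)", (collection,))

    def collections(self):
        cursor = self._conn.execute("SELECT collection FROM run ORDER BY collection")
        return [row[0] for row in cursor]


registry = ToyRegistry()
with registry.transaction():
    registry.add("kept")
    try:
        with registry.transaction():
            registry.add("discarded")
            raise RuntimeError("inner block failed")
    except RuntimeError:
        pass  # only the inner block's insert is rolled back

print(registry.collections())
```

Because the inner exception is caught inside the outer block, only the inner insert is undone; an exception that escaped the outer block would roll back both.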