Datastore¶
- 
class lsst.daf.butler.Datastore(config: Union[Config, str], bridgeManager: DatastoreRegistryBridgeManager, butlerRoot: Optional[ResourcePathExpression] = None)¶
- Bases: object - Datastore interface. - Parameters: - config : DatastoreConfig or str
- Load configuration either from an existing config instance or by referring to a configuration file. 
- bridgeManager : DatastoreRegistryBridgeManager
- Object that manages the interface between Registry and datastores.
- butlerRoot : str, optional
- New datastore root to use to override the configuration value. 
 - Attributes Summary
- containerKey : Name of the key containing a list of subconfigurations that also need to be merged with defaults and will likely use different Python datastore classes (but all using DatastoreConfig).
- defaultConfigFile : Path to configuration defaults.
- isEphemeral : Indicate whether this Datastore is ephemeral or not.
- names : Names associated with this datastore returned as a list.
 - Methods Summary
- emptyTrash(ignore_errors) : Remove all datasets from the trash.
- exists(datasetRef) : Check if the dataset exists in the datastore.
- export(refs, *, directory, transfer) : Export datasets for transfer to another data repository.
- forget(refs) : Indicate to the Datastore that it should remove all records of the given datasets, without actually deleting them.
- fromConfig(config, bridgeManager, butlerRoot) : Create datastore from type specified in config file.
- get(datasetRef, parameters) : Load an InMemoryDataset from the store.
- getLookupKeys() : Return all the lookup keys relevant to this datastore.
- getURI(datasetRef, predict) : URI to the Dataset.
- getURIs(datasetRef, predict) : Return URIs associated with dataset.
- ingest(*datasets, transfer) : Ingest one or more files into the datastore.
- knows(ref) : Check if the dataset is known to the datastore.
- mexists(refs, artifact_existence) : Check the existence of multiple datasets at once.
- needs_expanded_data_ids(transfer, entity, …) : Test whether this datastore needs expanded data IDs to ingest.
- put(inMemoryDataset, datasetRef) : Write an InMemoryDataset with a given DatasetRef to the store.
- remove(datasetRef) : Indicate to the Datastore that a Dataset can be removed.
- retrieveArtifacts(refs, destination, …) : Retrieve the artifacts associated with the supplied refs.
- setConfigRoot(root, config, full, overwrite) : Set filesystem-dependent config options for this datastore.
- transaction() : Context manager supporting Datastore transactions.
- transfer(inputDatastore, datasetRef) : Transfer a dataset from another datastore to this datastore.
- transfer_from(source_datastore, refs, …) : Transfer dataset artifacts from another datastore to this one.
- trash(ref, ignore_errors) : Indicate to the Datastore that a Dataset can be moved to the trash.
- validateConfiguration(entities, …) : Validate some of the configuration for this datastore.
- validateKey(lookupKey, entity, …) : Validate a specific look up key with supplied entity.
 - Attributes Documentation - 
containerKey= None¶
- Name of the key containing a list of subconfigurations that also need to be merged with defaults and will likely use different Python datastore classes (but all using DatastoreConfig). Assumed to be a list of configurations that can be represented in a DatastoreConfig and containing a “cls” definition. None indicates that no containers are expected in this Datastore. 
 - 
defaultConfigFile= None¶
- Path to configuration defaults. Accessed within the config resource or relative to a search path. Can be None if no defaults specified.
 - 
isEphemeral= False¶
- Indicate whether this Datastore is ephemeral or not. An ephemeral datastore is one where the contents of the datastore will not exist across process restarts. This value can change per-instance. 
 - 
names¶
- Names associated with this datastore returned as a list. - Can be different to name for a chaining datastore.
 - Methods Documentation - 
emptyTrash(ignore_errors: bool = True) → None¶
- Remove all datasets from the trash. - Parameters: - ignore_errors : bool, optional
- Determine whether errors should be ignored. 
 - Notes - Some Datastores may implement this method as a silent no-op to disable Dataset deletion through standard interfaces. 
 - 
exists(datasetRef: DatasetRef) → bool¶
- Check if the dataset exists in the datastore. - Parameters: - datasetRef : DatasetRef
- Reference to the required dataset. 
 - Returns: 
 - 
export(refs: Iterable[DatasetRef], *, directory: Optional[str] = None, transfer: Optional[str] = None) → Iterable[FileDataset]¶
- Export datasets for transfer to another data repository. - Parameters: - refs : iterable of DatasetRef
- Dataset references to be exported. 
- directory : str, optional
- Path to a directory that should contain files corresponding to output datasets. Ignored if transfer is None.
- transfer : str, optional
- Mode that should be used to move datasets out of the repository. Valid options are the same as those of the transfer argument to ingest, and datastores may similarly signal that a transfer mode is not supported by raising NotImplementedError.
 - Returns: - dataset : iterable of FileDataset
- Structs containing information about the exported datasets, in the same order as refs.
 - Raises: - NotImplementedError
- Raised if the given transfer mode is not supported. 
 
 - 
forget(refs: Iterable[DatasetRef]) → None¶
- Indicate to the Datastore that it should remove all records of the given datasets, without actually deleting them. - Parameters: - refs : Iterable[DatasetRef]
- References to the datasets being forgotten. 
 - Notes - Asking a datastore to forget a DatasetRef it does not hold should be a silent no-op, not an error.
 - 
static fromConfig(config: Config, bridgeManager: DatastoreRegistryBridgeManager, butlerRoot: Optional[ResourcePathExpression] = None) → 'Datastore'¶
- Create datastore from type specified in config file. - Parameters: 
 - 
get(datasetRef: DatasetRef, parameters: Mapping[str, Any] = None) → Any¶
- Load an InMemoryDataset from the store. - Parameters: - datasetRef : DatasetRef
- Reference to the required Dataset. 
- parameters : dict
- StorageClass-specific parameters that specify a slice of the Dataset to be loaded.
 - Returns: - inMemoryDataset : object
- Requested Dataset or slice thereof as an InMemoryDataset. 
 
 - 
getLookupKeys() → Set[LookupKey]¶
- Return all the lookup keys relevant to this datastore. - Returns: - keys : set of LookupKey
- The keys stored internally for looking up information based on DatasetType name or StorageClass.
 
 - 
getURI(datasetRef: DatasetRef, predict: bool = False) → ResourcePath¶
- URI to the Dataset. - Parameters: - datasetRef : DatasetRef
- Reference to the required Dataset. 
- predict : bool
- If True, attempt to predict the URI for a dataset if it does not exist in the datastore.
 - Returns: - uri : str
- URI string pointing to the Dataset within the datastore. If the Dataset does not exist in the datastore, the URI may be a guess. If the datastore does not have entities that relate well to the concept of a URI the returned URI string will be descriptive. The returned URI is not guaranteed to be obtainable. 
 - Raises: - FileNotFoundError
- A URI has been requested for a dataset that does not exist and guessing is not allowed. 
 
 - 
getURIs(datasetRef: DatasetRef, predict: bool = False) → Tuple[Optional[ResourcePath], Dict[str, ResourcePath]]¶
- Return URIs associated with dataset. - Parameters: - datasetRef : DatasetRef
- Reference to the required dataset. 
- predict : bool, optional
- If the datastore does not know about the dataset, should it return a predicted URI or not? 
 - Returns: - primary : lsst.resources.ResourcePath
- The URI to the primary artifact associated with this dataset. If the dataset was disassembled within the datastore this may be None.
- components : dict
- URIs to any components associated with the dataset artifact. Can be empty if there are no components. 
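Because getURIs returns a pair, a caller has to handle both the single-artifact case and the disassembled case. The following is a minimal sketch, assuming plain strings stand in for lsst.resources.ResourcePath; the helper name flatten_uris is hypothetical and not part of the Datastore API.

```python
from typing import Dict, List, Optional, Tuple


def flatten_uris(uris: Tuple[Optional[str], Dict[str, str]]) -> List[str]:
    """Return every artifact URI from a (primary, components) pair.

    Hypothetical helper: strings are used here in place of ResourcePath.
    """
    primary, components = uris
    if primary is not None:
        # The dataset was stored as a single artifact.
        return [primary]
    # The dataset was disassembled: one URI per component, in name order.
    return [components[name] for name in sorted(components)]
```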
 
 - 
ingest(*datasets, transfer: Optional[str] = None) → None¶
- Ingest one or more files into the datastore. - Parameters: - datasets : FileDataset
- Each positional argument is a struct containing information about a file to be ingested, including its path (either absolute or relative to the datastore root, if applicable), a complete DatasetRef (with dataset_id not None), and optionally a formatter class or its fully-qualified string name. If a formatter is not provided, the one the datastore would use for put on that dataset is assumed.
- transfer : str, optional
- How (and whether) the dataset should be added to the datastore. If None (default), the file must already be in a location appropriate for the datastore (e.g. within its root directory), and will not be modified. Other choices include “move”, “copy”, “link”, “symlink”, “relsymlink”, and “hardlink”. “link” is a special transfer mode that first tries to make a hardlink and, if that fails, uses a symlink instead. “relsymlink” creates a relative symlink rather than using an absolute path. Most datastores do not support all transfer modes. “auto” is a special option that lets the datastore choose the most natural option for itself.
 - Raises: - NotImplementedError
- Raised if the datastore does not support the given transfer mode (including the case where ingest is not supported at all). 
- DatasetTypeNotSupportedError
- Raised if one or more files to be ingested have a dataset type that is not supported by the datastore. 
- FileNotFoundError
- Raised if one of the given files does not exist. 
- FileExistsError
- Raised if transfer is not None but the (internal) location the file would be moved to is already occupied.
 - Notes - Subclasses should implement _prepIngest and _finishIngest instead of implementing ingest directly. Datastores that hold and delegate to child datastores may want to call those methods as well. - Subclasses are encouraged to document their supported transfer modes in their class documentation. 
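The transfer modes listed for ingest can be illustrated with standard-library calls. This is only a sketch under stated assumptions, not the lsst.daf.butler implementation: the function name transfer_file is hypothetical, “auto” is deliberately left unsupported, and the mapping of each mode to an os or shutil call is an assumption made for illustration.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def transfer_file(src: Path, dest: Path, transfer: Optional[str]) -> Path:
    """Place ``src`` at ``dest`` using the named mode (hypothetical helper)."""
    if not src.exists():
        raise FileNotFoundError(src)
    if transfer is None:
        # No transfer: the file is used where it already is, unmodified.
        return src
    if dest.exists():
        raise FileExistsError(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if transfer == "copy":
        shutil.copy2(src, dest)
    elif transfer == "move":
        shutil.move(str(src), str(dest))
    elif transfer == "hardlink":
        os.link(src, dest)
    elif transfer == "symlink":
        os.symlink(src.resolve(), dest)
    elif transfer == "relsymlink":
        # Link target expressed relative to the directory holding the link.
        os.symlink(os.path.relpath(src.resolve(), dest.parent.resolve()), dest)
    elif transfer == "link":
        # Try a hardlink first and fall back to a symlink if that fails.
        try:
            os.link(src, dest)
        except OSError:
            os.symlink(src.resolve(), dest)
    else:
        # Mirrors the documented behaviour for unsupported modes.
        raise NotImplementedError(f"Transfer mode {transfer!r} is not supported")
    return dest
```

Raising NotImplementedError for any unrecognised mode follows the Raises section above; a real datastore would typically support only a subset of these branches.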
 - 
knows(ref: DatasetRef) → bool¶
- Check if the dataset is known to the datastore. - Does not check for existence of any artifact. - Parameters: - ref : DatasetRef
- Reference to the required dataset. 
 - Returns: 
 - 
mexists(refs: Iterable[DatasetRef], artifact_existence: Optional[Dict[ResourcePath, bool]] = None) → Dict[DatasetRef, bool]¶
- Check the existence of multiple datasets at once. - Parameters: - refs : iterable of DatasetRef
- The datasets to be checked. 
- artifact_existence : dict [lsst.resources.ResourcePath, bool]
- Optional mapping of datastore artifact to existence. Updated by this method with details of all artifacts tested. Can be None if the caller is not interested.
 - Returns: - existence : dict of [DatasetRef, bool]
- Mapping from dataset to boolean indicating existence. 
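The role of artifact_existence as a caller-owned cache can be sketched as follows. This is a toy illustration, assuming datasets are identified by strings and artifacts are local files; the function name check_many and the locations mapping are hypothetical and do not reflect how a real datastore looks up its records.

```python
from pathlib import Path
from typing import Dict, Iterable, Optional


def check_many(
    locations: Dict[str, Path],
    refs: Iterable[str],
    artifact_existence: Optional[Dict[Path, bool]] = None,
) -> Dict[str, bool]:
    """Check several datasets, reusing and updating a caller-supplied cache."""
    # Use "is not None" so that an empty dict passed by the caller is still updated.
    cache = artifact_existence if artifact_existence is not None else {}
    result: Dict[str, bool] = {}
    for ref in refs:
        path = locations.get(ref)
        if path is None:
            # No record of this dataset at all.
            result[ref] = False
            continue
        if path not in cache:
            cache[path] = path.exists()
        result[ref] = cache[path]
    return result
```

Passing the same dictionary to successive calls avoids re-testing artifacts that were already checked.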
 
 - 
needs_expanded_data_ids(transfer: Optional[str], entity: Optional[Union[DatasetRef, DatasetType, StorageClass]] = None) → bool¶
- Test whether this datastore needs expanded data IDs to ingest. - Parameters: - Returns: 
 - 
put(inMemoryDataset: Any, datasetRef: DatasetRef) → None¶
- Write an InMemoryDataset with a given DatasetRef to the store. - Parameters: - inMemoryDataset : object
- The Dataset to store. 
- datasetRef : DatasetRef
- Reference to the associated Dataset. 
 
 - 
remove(datasetRef: DatasetRef) → None¶
- Indicate to the Datastore that a Dataset can be removed. - Parameters: - datasetRef : DatasetRef
- Reference to the required Dataset. 
 - Raises: - FileNotFoundError
- When Dataset does not exist. 
 - Notes - Some Datastores may implement this method as a silent no-op to disable Dataset deletion through standard interfaces. 
 - 
retrieveArtifacts(refs: Iterable[DatasetRef], destination: ResourcePath, transfer: str = 'auto', preserve_path: bool = True, overwrite: bool = False) → List[ResourcePath]¶
- Retrieve the artifacts associated with the supplied refs. - Parameters: - refs : iterable of DatasetRef
- The datasets for which artifacts are to be retrieved. A single ref can result in multiple artifacts. The refs must be resolved. 
- destination : lsst.resources.ResourcePath
- Location to write the artifacts. 
- transfer : str, optional
- Method to use to transfer the artifacts. Must be one of the options supported by lsst.resources.ResourcePath.transfer_from(). “move” is not allowed.
- preserve_path : bool, optional
- If True, the full path of the artifact within the datastore is preserved. If False, only the final file component of the path is used.
- overwrite : bool, optional
- If True, allow transfers to overwrite existing files at the destination.
 - Returns: - targets : list of lsst.resources.ResourcePath
- URIs of file artifacts in destination location. Order is not preserved. 
 - Notes - For non-file datastores the artifacts written to the destination may not match the representation inside the datastore. For example, a hierarchical data structure in a NoSQL database may well be stored as a JSON file. 
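The effect of preserve_path on where an artifact lands can be sketched with pathlib. This is an assumption-laden illustration: the function name target_path is hypothetical, POSIX-style paths stand in for ResourcePath, and the artifact path is assumed to be relative to the datastore root.

```python
from pathlib import PurePosixPath


def target_path(destination: str, path_in_store: str, preserve_path: bool = True) -> PurePosixPath:
    """Compute the destination of one artifact (hypothetical helper)."""
    relative = PurePosixPath(path_in_store)
    if not preserve_path:
        # Keep only the final file component of the path.
        relative = PurePosixPath(relative.name)
    return PurePosixPath(destination) / relative
```

With preserve_path set to False, two artifacts that share a file name in different datastore directories would map to the same destination, which is where the overwrite flag becomes relevant.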
 - 
classmethod setConfigRoot(root: str, config: lsst.daf.butler.core.config.Config, full: lsst.daf.butler.core.config.Config, overwrite: bool = True) → None¶
- Set filesystem-dependent config options for this datastore. - The options will be appropriate for a new empty repository with the given root. - Parameters: - root : str
- Filesystem path to the root of the data repository. 
- config : Config
- A Config to update. Only the subset understood by this component will be updated. Will not expand defaults.
- full : Config
- A complete config with all defaults expanded that can be converted to a DatastoreConfig. Read-only and will not be modified by this method. Repository-specific options that should not be obtained from defaults when Butler instances are constructed should be copied from full to config.
- overwrite : bool, optional
- If False, do not modify a value in config if the value already exists. Default is always to overwrite with the provided root.
 - Notes - If a keyword is explicitly defined in the supplied config it will not be overridden by this method if overwrite is False. This allows explicit values set in external configs to be retained.
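The overwrite semantics described in these notes can be sketched with plain dictionaries. This is a simplified illustration, not the library's code: plain dicts stand in for Config, and the function name set_config_root, the "root" key, and the to_copy default are all assumptions.

```python
from typing import Any, Dict, Iterable


def set_config_root(
    root: str,
    config: Dict[str, Any],
    full: Dict[str, Any],
    to_copy: Iterable[str] = ("cls",),
    overwrite: bool = True,
) -> None:
    """Update ``config`` in place; ``full`` is only read (hypothetical helper)."""
    # Set the root unless an explicit value must be retained.
    if overwrite or "root" not in config:
        config["root"] = root
    # Copy repository-specific options from the fully expanded config.
    for key in to_copy:
        if key in full and (overwrite or key not in config):
            config[key] = full[key]
```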
 - 
transaction() → Iterator[lsst.daf.butler.core.datastore.DatastoreTransaction]¶
- Context manager supporting Datastore transactions. - Transactions can be nested, and are to be used in combination with Registry.transaction.
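One way nested transactions of this kind can behave is sketched below with an undo log. This is a toy model under stated assumptions, not the DatastoreTransaction implementation: the classes ToyTransaction and ToyStore and the method register_undo are hypothetical names chosen for illustration.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List, Optional


class ToyTransaction:
    """Undo log for one transaction level (hypothetical)."""

    def __init__(self, parent: Optional["ToyTransaction"] = None) -> None:
        self.parent = parent
        self._undo: List[Callable[[], None]] = []

    def register_undo(self, func: Callable[[], None]) -> None:
        self._undo.append(func)

    def rollback(self) -> None:
        # Undo actions in reverse order of registration.
        while self._undo:
            self._undo.pop()()

    def commit(self) -> None:
        if self.parent is not None:
            # Hand undo actions to the parent so an outer failure still reverts them.
            self.parent._undo.extend(self._undo)
        self._undo.clear()


class ToyStore:
    """In-memory store whose writes can be rolled back (hypothetical)."""

    def __init__(self) -> None:
        self.data: Dict[str, object] = {}
        self._txn: Optional[ToyTransaction] = None

    @contextmanager
    def transaction(self) -> Iterator[ToyTransaction]:
        txn = ToyTransaction(parent=self._txn)
        self._txn = txn
        try:
            yield txn
        except BaseException:
            txn.rollback()
            raise
        else:
            txn.commit()
        finally:
            # Restore the enclosing transaction, if any.
            self._txn = txn.parent

    def put(self, key: str, value: object) -> None:
        self.data[key] = value
        if self._txn is not None:
            self._txn.register_undo(lambda: self.data.pop(key, None))
```

In this sketch an inner transaction that completes normally passes its undo actions to the enclosing one, so a later failure in the outer block reverts both writes.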
 - 
transfer(inputDatastore: Datastore, datasetRef: DatasetRef) → None¶
- Transfer a dataset from another datastore to this datastore. - Parameters: - inputDatastore : Datastore
- The external Datastore from which to retrieve the Dataset.
- datasetRef : DatasetRef
- Reference to the required Dataset. 
 
 - 
transfer_from(source_datastore: Datastore, refs: Iterable[DatasetRef], local_refs: Optional[Iterable[DatasetRef]] = None, transfer: str = 'auto', artifact_existence: Optional[Dict[ResourcePath, bool]] = None) → None¶
- Transfer dataset artifacts from another datastore to this one. - Parameters: - source_datastore : Datastore
- The datastore from which to transfer artifacts. That datastore must be compatible with this datastore receiving the artifacts. 
- refs : iterable of DatasetRef
- The datasets to transfer from the source datastore. 
- local_refs : iterable of DatasetRef, optional
- The dataset refs known to the registry associated with this datastore. Can be None if the source and target datastore are using UUIDs.
- transfer : str, optional
- How (and whether) the dataset should be added to the datastore. Choices include “move”, “copy”, “link”, “symlink”, “relsymlink”, and “hardlink”. “link” is a special transfer mode that first tries to make a hardlink and, if that fails, uses a symlink instead. “relsymlink” creates a relative symlink rather than using an absolute path. Most datastores do not support all transfer modes. “auto” (the default) is a special option that lets the datastore choose the most natural option for itself. If the source location and transfer location are identical the transfer mode will be ignored. 
- artifact_existence : dict [lsst.resources.ResourcePath, bool]
- Optional mapping of datastore artifact to existence. Updated by this method with details of all artifacts tested. Can be None if the caller is not interested.
 - Raises: - TypeError
- Raised if the two datastores are not compatible. 
 
 - 
trash(ref: Union[DatasetRef, Iterable[DatasetRef]], ignore_errors: bool = True) → None¶
- Indicate to the Datastore that a Dataset can be moved to the trash. - Parameters: - ref : DatasetRef or iterable thereof
- Reference(s) to the required Dataset. 
- ignore_errors : bool, optional
- Determine whether errors should be ignored. When multiple refs are being trashed there will be no per-ref check. 
 - Raises: - FileNotFoundError
- When Dataset does not exist and errors are not ignored. Only checked if a single ref is supplied (and not in a list). 
 - Notes - Some Datastores may implement this method as a silent no-op to disable Dataset deletion through standard interfaces. 
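The relationship between trash, emptyTrash, forget, knows, and exists can be illustrated with an in-memory toy. This is a sketch under stated assumptions, not a real Datastore subclass: strings stand in for DatasetRef, the class name ToyDatastore is hypothetical, and treating a trashed dataset as still present until emptyTrash runs is a simplification that real datastores may not share.

```python
from typing import Dict, Iterable, Set, Union


class ToyDatastore:
    """In-memory stand-in for a datastore (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._artifacts: Dict[str, bytes] = {}  # ref -> stored bytes
        self._records: Set[str] = set()         # refs the datastore knows about
        self._trash: Set[str] = set()           # refs waiting to be deleted

    def put(self, data: bytes, ref: str) -> None:
        self._artifacts[ref] = data
        self._records.add(ref)

    def knows(self, ref: str) -> bool:
        # Record lookup only; the artifact itself is not checked.
        return ref in self._records

    def exists(self, ref: str) -> bool:
        return ref in self._records and ref in self._artifacts

    def trash(self, ref: Union[str, Iterable[str]], ignore_errors: bool = True) -> None:
        if isinstance(ref, str):
            # The existence check applies only to a single ref.
            if not self.exists(ref) and not ignore_errors:
                raise FileNotFoundError(ref)
            refs = [ref]
        else:
            refs = list(ref)  # no per-ref check for multiple refs
        self._trash.update(r for r in refs if r in self._records)

    def emptyTrash(self, ignore_errors: bool = True) -> None:
        # ignore_errors is accepted for signature parity; nothing here can fail.
        for ref in self._trash:
            self._artifacts.pop(ref, None)
            self._records.discard(ref)
        self._trash.clear()

    def forget(self, refs: Iterable[str]) -> None:
        # Drop records but leave artifacts in place; unknown refs are a no-op.
        for ref in refs:
            self._records.discard(ref)
            self._trash.discard(ref)
```

Separating trash from emptyTrash in this way lets the actual deletion be deferred until the caller is sure it should happen.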
 - 
validateConfiguration(entities: Iterable[Union[DatasetRef, DatasetType, StorageClass]], logFailures: bool = False) → None¶
- Validate some of the configuration for this datastore. - Parameters: - entities : iterable of DatasetRef, DatasetType, or StorageClass
- Entities to test against this configuration. Can be differing types. 
- logFailures : bool, optional
- If True, output a log message for every validation error detected.
 - Raises: - DatastoreValidationError
- Raised if there is a validation problem with a configuration. 
 - Notes - Which parts of the configuration are validated is at the discretion of each Datastore implementation. 
 - 
validateKey(lookupKey: LookupKey, entity: Union[DatasetRef, DatasetType, StorageClass]) → None¶
- Validate a specific look up key with supplied entity. - Parameters: - lookupKey : LookupKey
- Key to use to retrieve information from the datastore configuration. 
- entity : DatasetRef, DatasetType, or StorageClass
- Entity to compare with configuration retrieved using the specified lookup key. 
 - Raises: - DatastoreValidationError
- Raised if there is a problem with the combination of entity and lookup key. 
 - Notes - Bypasses the normal selection priorities by allowing a key that would normally not be selected to be validated. 
 