ChainedDatastore
- class lsst.daf.butler.datastores.chainedDatastore.ChainedDatastore(config: Union[Config, str], bridgeManager: DatastoreRegistryBridgeManager, butlerRoot: str = None)
- Bases: Datastore
- Chained Datastores to allow reads and writes from multiple datastores.
- A ChainedDatastore is configured with multiple datastore configurations. A put() is always sent to each datastore. A get() operation is sent to each datastore in turn and the first datastore to return a valid dataset is used.
- Parameters:
- config : DatastoreConfig or str
- Configuration. This configuration must include a datastores field as a sequence of datastore configurations. The order in this sequence indicates the order to use for read operations.
- bridgeManager : DatastoreRegistryBridgeManager
- Object that manages the interface between Registry and datastores.
- butlerRoot : str, optional
- New datastore root to use to override the configuration value. This root is sent to each child datastore.
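The put-to-all, get-from-first behaviour described above can be sketched with plain Python objects. This is a minimal illustration only: ToyStore and ToyChain are hypothetical stand-ins, not classes from lsst.daf.butler, and string keys stand in for DatasetRef.

```python
class ToyStore:
    """Hypothetical stand-in for a child datastore: a dict keyed by ref."""

    def __init__(self, name):
        self.name = name
        self._data = {}

    def put(self, obj, ref):
        self._data[ref] = obj

    def get(self, ref):
        if ref not in self._data:
            raise FileNotFoundError(f"{ref} not in {self.name}")
        return self._data[ref]


class ToyChain:
    """Hypothetical chain: writes go to every child, reads stop at the first hit."""

    def __init__(self, children):
        self.children = list(children)  # list order is the read order

    def put(self, obj, ref):
        for child in self.children:
            child.put(obj, ref)

    def get(self, ref):
        for child in self.children:
            try:
                return child.get(ref)
            except FileNotFoundError:
                continue  # try the next child in the chain
        raise FileNotFoundError(f"{ref} not found in any child datastore")


memory, disk = ToyStore("memory"), ToyStore("disk")
chain = ToyChain([memory, disk])
chain.put({"flux": 1.5}, "ref-1")    # lands in both children
disk.put("only on disk", "ref-2")    # present in the second child only
```

Here chain.get("ref-2") falls through the first child and is served by the second, mirroring the read order given by the datastores sequence.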
 
 - Notes
- ChainedDatastore never supports None or "move" as an ingest transfer mode. It supports "copy", "symlink", "relsymlink" and "hardlink" if and only if all its child datastores do.
 - Attributes Summary
- containerKey: Key to specify where child datastores are configured.
- defaultConfigFile: Path to configuration defaults.
- isEphemeral: Indicate whether this Datastore is ephemeral or not.
- names: Names associated with this datastore returned as a list.
 - Methods Summary
- emptyTrash([ignore_errors]): Remove all datasets from the trash.
- exists(ref): Check if the dataset exists in one of the datastores.
- export(refs, *[, directory, transfer]): Export datasets for transfer to another data repository.
- forget(refs): Indicate to the Datastore that it should remove all records of the given datasets, without actually deleting them.
- fromConfig(config, bridgeManager[, butlerRoot]): Create datastore from type specified in config file.
- get(ref[, parameters]): Load an InMemoryDataset from the store.
- getLookupKeys(): Return all the lookup keys relevant to this datastore.
- getURI(ref[, predict]): URI to the Dataset.
- getURIs(ref[, predict]): Return URIs associated with dataset.
- ingest(*datasets[, transfer]): Ingest one or more files into the datastore.
- knows(ref): Check if the dataset is known to any of the datastores.
- mexists(refs[, artifact_existence]): Check the existence of multiple datasets at once.
- needs_expanded_data_ids(transfer[, entity]): Test whether this datastore needs expanded data IDs to ingest.
- put(inMemoryDataset, ref): Write an InMemoryDataset with a given DatasetRef to each datastore.
- remove(ref): Indicate to the datastore that a dataset can be removed.
- retrieveArtifacts(refs, destination[, ...]): Retrieve the file artifacts associated with the supplied refs.
- setConfigRoot(root, config, full[, overwrite]): Set any filesystem-dependent config options for child Datastores to be appropriate for a new empty repository with the given root.
- transaction(): Context manager supporting Datastore transactions.
- transfer(inputDatastore, ref): Retrieve a dataset from an input Datastore, and store the result in this Datastore.
- transfer_from(source_datastore, refs[, ...]): Transfer dataset artifacts from another datastore to this one.
- trash(ref[, ignore_errors]): Indicate to the Datastore that a Dataset can be moved to the trash.
- validateConfiguration(entities[, logFailures]): Validate some of the configuration for this datastore.
- validateKey(lookupKey, entity): Validate a specific look up key with supplied entity.
 - Attributes Documentation
 - containerKey: ClassVar[Optional[str]] = 'datastores'
- Key to specify where child datastores are configured. 
 - defaultConfigFile: ClassVar[Optional[str]] = 'datastores/chainedDatastore.yaml'
- Path to configuration defaults. Accessed within the configs resource or relative to a search path. Can be None if no defaults are specified.
 - isEphemeral: bool = False
- Indicate whether this Datastore is ephemeral or not. An ephemeral datastore is one where the contents of the datastore will not exist across process restarts. This value can change per-instance. 
 - names
 - Methods Documentation
 - emptyTrash(ignore_errors: bool = True) -> None
- Remove all datasets from the trash.
- Parameters:
- ignore_errors : bool, optional
- Determine whether errors should be ignored.
- Notes
- Some Datastores may implement this method as a silent no-op to disable Dataset deletion through standard interfaces.
 - exists(ref: DatasetRef) -> bool
- Check if the dataset exists in one of the datastores. 
 - export(refs: Iterable[DatasetRef], *, directory: Optional[str] = None, transfer: Optional[str] = None) -> Iterable[FileDataset]
- Export datasets for transfer to another data repository.
- Parameters:
- refs : iterable of DatasetRef
- Dataset references to be exported.
- directory : str, optional
- Path to a directory that should contain files corresponding to output datasets. Ignored if transfer is None.
- transfer : str, optional
- Mode that should be used to move datasets out of the repository. Valid options are the same as those of the transfer argument to ingest, and datastores may similarly signal that a transfer mode is not supported by raising NotImplementedError.
- Returns:
- dataset : iterable of DatasetTransfer
- Structs containing information about the exported datasets, in the same order as refs.
- Raises:
- NotImplementedError
- Raised if the given transfer mode is not supported.
 
 
 - forget(refs: Iterable[DatasetRef]) -> None
- Indicate to the Datastore that it should remove all records of the given datasets, without actually deleting them.
- Parameters:
- refs : Iterable[DatasetRef]
- References to the datasets being forgotten.
- Notes
- Asking a datastore to forget a DatasetRef it does not hold should be a silent no-op, not an error.
 - static fromConfig(config: Config, bridgeManager: DatastoreRegistryBridgeManager, butlerRoot: Optional[Union[str, ButlerURI]] = None) -> Datastore
- Create datastore from type specified in config file.
- Parameters:
- config : Config
- Configuration instance.
- bridgeManager : DatastoreRegistryBridgeManager
- Object that manages the interface between Registry and datastores.
- butlerRoot : str, optional
- Butler root directory.
 
 - get(ref: DatasetRef, parameters: Optional[Mapping[str, Any]] = None) -> Any
- Load an InMemoryDataset from the store.
- The dataset is returned from the first datastore that has the dataset.
- Parameters:
- ref : DatasetRef
- Reference to the required Dataset.
- parameters : dict
- StorageClass-specific parameters that specify, for example, a slice of the dataset to be loaded.
- Returns:
- inMemoryDataset : object
- Requested dataset or slice thereof as an InMemoryDataset.
- Raises:
- FileNotFoundError
- Requested dataset cannot be retrieved.
- TypeError
- Return value from formatter has unexpected type.
- ValueError
- Formatter failed to process the dataset.
 
 
 - getLookupKeys() -> Set[LookupKey]
- Return all the lookup keys relevant to this datastore.
- Returns:
- keys : set of LookupKey
- The keys stored internally for looking up information based on DatasetType name or StorageClass.
 
 - getURI(ref: DatasetRef, predict: bool = False) -> ButlerURI
- URI to the Dataset.
- The returned URI is from the first datastore in the list that has the dataset, with preference given to the first dataset coming from a permanent datastore. If no datastores have the dataset and prediction is allowed, the predicted URI for the first datastore in the list will be returned.
- Parameters:
- Returns:
- uri : ButlerURI
- URI pointing to the dataset within the datastore. If the dataset does not exist in the datastore, and if predict is True, the URI will be a prediction and will include a URI fragment “#predicted”.
- Raises:
- FileNotFoundError
- A URI has been requested for a dataset that does not exist and guessing is not allowed.
- RuntimeError
- Raised if a request is made for a single URI but multiple URIs are associated with this dataset.
- Notes
- If the datastore does not have entities that relate well to the concept of a URI, the returned URI string will be descriptive. The returned URI is not guaranteed to be obtainable.
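One possible reading of the selection rule for getURI (first permanent datastore holding the dataset, else first ephemeral one, else a prediction from the first child) is sketched below. UriStore, chained_uri and the URI strings are hypothetical; the real method returns ButlerURI objects and its internals may differ.

```python
from typing import NamedTuple


class UriStore(NamedTuple):
    """Hypothetical child datastore described by name, ephemerality and contents."""

    name: str
    is_ephemeral: bool
    contents: frozenset

    def uri(self, ref, predicted=False):
        suffix = "#predicted" if predicted else ""
        return f"{self.name}://{ref}{suffix}"


def chained_uri(children, ref, predict=False):
    """Pick a URI following the preference order described in the text."""
    first_hit = None
    for child in children:
        if ref in child.contents:
            if not child.is_ephemeral:
                return child.uri(ref)  # a permanent datastore wins immediately
            if first_hit is None:
                first_hit = child      # remember the first ephemeral hit
    if first_hit is not None:
        return first_hit.uri(ref)
    if predict and children:
        return children[0].uri(ref, predicted=True)
    raise FileNotFoundError(f"no URI for {ref} and prediction is not allowed")


mem = UriStore("mem", True, frozenset({"a", "b"}))
disk = UriStore("file", False, frozenset({"b"}))
children = [mem, disk]
```

With these inputs, "b" resolves to the permanent store even though the ephemeral store is listed first, while "a" resolves to the ephemeral store because no permanent store holds it.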
 - getURIs(ref: DatasetRef, predict: bool = False) -> Tuple[Optional[ButlerURI], Dict[str, ButlerURI]]
- Return URIs associated with dataset.
- Parameters:
- ref : DatasetRef
- Reference to the required dataset.
- predict : bool, optional
- If the datastore does not know about the dataset, should it return a predicted URI or not?
- Returns:
- Notes
- The returned URI is from the first datastore in the list that has the dataset, with preference given to the first dataset coming from a permanent datastore. If no datastores have the dataset and prediction is allowed, the predicted URI for the first datastore in the list will be returned.
 - ingest(*datasets: FileDataset, transfer: Optional[str] = None) -> None
- Ingest one or more files into the datastore.
- Parameters:
- datasets : FileDataset
- Each positional argument is a struct containing information about a file to be ingested, including its path (either absolute or relative to the datastore root, if applicable), a complete DatasetRef (with dataset_id not None), and optionally a formatter class or its fully-qualified string name. If a formatter is not provided, the one the datastore would use for put on that dataset is assumed.
- transfer : str, optional
- How (and whether) the dataset should be added to the datastore. If None (default), the file must already be in a location appropriate for the datastore (e.g. within its root directory), and will not be modified. Other choices include “move”, “copy”, “link”, “symlink”, “relsymlink”, and “hardlink”. “link” is a special transfer mode that will first try to make a hardlink and if that fails a symlink will be used instead. “relsymlink” creates a relative symlink rather than using an absolute path. Most datastores do not support all transfer modes. “auto” is a special option that will let the datastore choose the most natural option for itself.
- Raises:
- NotImplementedError
- Raised if the datastore does not support the given transfer mode (including the case where ingest is not supported at all).
- DatasetTypeNotSupportedError
- Raised if one or more files to be ingested have a dataset type that is not supported by the datastore.
- FileNotFoundError
- Raised if one of the given files does not exist.
- FileExistsError
- Raised if transfer is not None but the (internal) location the file would be moved to is already occupied.
- Notes
- Subclasses should implement _prepIngest and _finishIngest instead of implementing ingest directly. Datastores that hold and delegate to child datastores may want to call those methods as well.
- Subclasses are encouraged to document their supported transfer modes in their class documentation.
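The “link” transfer mode described for ingest (try a hardlink, fall back to a symlink) can be illustrated with the standard library alone. This is a sketch of the idea, not the datastore's implementation; link_file and the file names are hypothetical.

```python
import os
import tempfile


def link_file(src, dst):
    """Try a hard link first; fall back to an absolute symlink if that fails."""
    try:
        os.link(src, dst)
        return "hardlink"
    except OSError:
        # Hard links can fail, for example across filesystems.
        os.symlink(os.path.abspath(src), dst)
        return "symlink"


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "raw.fits")
    with open(src, "w") as fh:
        fh.write("pixels")
    dst = os.path.join(tmp, "ingested.fits")
    mode_used = link_file(src, dst)
    with open(dst) as fh:
        linked_content = fh.read()
```

Either outcome leaves the destination readable without copying the file contents, which is the point of the link-style modes.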
 - knows(ref: DatasetRef) -> bool
- Check if the dataset is known to any of the datastores.
- Does not check for existence of any artifact.
 - mexists(refs: Iterable[DatasetRef], artifact_existence: Optional[Dict[ButlerURI, bool]] = None) -> Dict[DatasetRef, bool]
- Check the existence of multiple datasets at once.
- Parameters:
- Returns:
 
 - needs_expanded_data_ids(transfer: Optional[str], entity: Optional[Union[DatasetRef, DatasetType, StorageClass]] = None) -> bool
- Test whether this datastore needs expanded data IDs to ingest.
- Parameters:
- Returns:
 
 - put(inMemoryDataset: Any, ref: DatasetRef) -> None
- Write an InMemoryDataset with a given DatasetRef to each datastore.
- The put() to child datastores can fail with DatasetTypeNotSupportedError. The put() for this datastore will be deemed to have succeeded so long as at least one child datastore accepted the inMemoryDataset.
- Parameters:
- inMemoryDataset : object
- The dataset to store.
- ref : DatasetRef
- Reference to the associated Dataset.
- Raises:
- TypeError
- Supplied object and storage class are inconsistent.
- DatasetTypeNotSupportedError
- All datastores reported DatasetTypeNotSupportedError.
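The success rule for put() (at least one child must accept the dataset) might look like the following sketch. The exception class here is a local stand-in with the same name as the butler one, and chained_put, the child callables and the returned count are all hypothetical.

```python
class DatasetTypeNotSupportedError(RuntimeError):
    """Local stand-in for the butler exception of the same name."""


def chained_put(children, obj, ref):
    """Send obj to every child; fail only if no child accepts it."""
    accepted = 0
    for child_put in children:
        try:
            child_put(obj, ref)
        except DatasetTypeNotSupportedError:
            continue  # this child rejects the dataset type; try the others
        accepted += 1
    if accepted == 0:
        raise DatasetTypeNotSupportedError(f"no child datastore accepted {ref}")
    return accepted


stored = {}


def accepting_child(obj, ref):
    stored[ref] = obj


def rejecting_child(obj, ref):
    raise DatasetTypeNotSupportedError(f"{ref} not supported here")


count = chained_put([rejecting_child, accepting_child], "catalog", "ref-1")
```

The real put() returns None; the count is returned here only to make the outcome easy to inspect.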
 
 
 - remove(ref: DatasetRef) -> None
- Indicate to the datastore that a dataset can be removed.
- The dataset will be removed from each datastore. The dataset is not required to exist in every child datastore.
- Parameters:
- ref : DatasetRef
- Reference to the required dataset.
- Raises:
- FileNotFoundError
- Attempt to remove a dataset that does not exist. Raised if none of the child datastores removed the dataset.
 
 
 - retrieveArtifacts(refs: Iterable[DatasetRef], destination: ButlerURI, transfer: str = 'auto', preserve_path: bool = True, overwrite: bool = False) -> List[ButlerURI]
- Retrieve the file artifacts associated with the supplied refs.
- Parameters:
- refs : iterable of DatasetRef
- The datasets for which file artifacts are to be retrieved. A single ref can result in multiple files. The refs must be resolved.
- destination : ButlerURI
- Location to write the file artifacts.
- transfer : str, optional
- Method to use to transfer the artifacts. Must be one of the options supported by ButlerURI.transfer_from(). “move” is not allowed.
- preserve_path : bool, optional
- If True the full path of the file artifact within the datastore is preserved. If False the final file component of the path is used.
- overwrite : bool, optional
- If True allow transfers to overwrite existing files at the destination.
- Returns:
- targets : list of ButlerURI
- URIs of file artifacts in destination location. Order is not preserved.
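The effect of preserve_path on the destination layout can be shown with pathlib. This sketch assumes the artifact path is relative to the datastore root; target_path and the example paths are hypothetical and no files are touched.

```python
from pathlib import PurePosixPath


def target_path(destination, artifact_path, preserve_path=True):
    """Compute where an artifact would land under the destination directory."""
    artifact = PurePosixPath(artifact_path)
    # Keep the full in-datastore path, or only the final file component.
    relative = artifact if preserve_path else PurePosixPath(artifact.name)
    return PurePosixPath(destination) / relative


kept = target_path("/tmp/out", "raw/2020/exp_1.fits")
flattened = target_path("/tmp/out", "raw/2020/exp_1.fits", preserve_path=False)
```

Flattening with preserve_path=False makes name collisions between artifacts from different directories possible, which is one reason the overwrite flag matters.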
 
 
 - classmethod setConfigRoot(root: str, config: Config, full: Config, overwrite: bool = True) -> None
- Set any filesystem-dependent config options for child Datastores to be appropriate for a new empty repository with the given root.
- Parameters:
- root : str
- Filesystem path to the root of the data repository.
- config : Config
- A Config to update. Only the subset understood by this component will be updated. Will not expand defaults.
- full : Config
- A complete config with all defaults expanded that can be converted to a DatastoreConfig. Read-only and will not be modified by this method. Repository-specific options that should not be obtained from defaults when Butler instances are constructed should be copied from full to config.
- overwrite : bool, optional
- If False, do not modify a value in config if the value already exists. Default is always to overwrite with the provided root.
- Notes
- If a keyword is explicitly defined in the supplied config it will not be overridden by this method if overwrite is False. This allows explicit values set in external configs to be retained.
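The overwrite rule can be illustrated on a plain dictionary. This sketch assumes a single hypothetical "root" key; the real method works on Config objects and updates the configuration of each child datastore.

```python
def apply_root(config, root, overwrite=True):
    """Set config["root"], keeping an existing value when overwrite is False."""
    if overwrite or "root" not in config:
        config["root"] = root
    return config


explicit = apply_root({"root": "/data/custom"}, "/repo", overwrite=False)
replaced = apply_root({"root": "/data/custom"}, "/repo")
filled = apply_root({}, "/repo", overwrite=False)
```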
 - transaction() -> Iterator[DatastoreTransaction]
- Context manager supporting Datastore transactions.
- Transactions can be nested, and are to be used in combination with Registry.transaction.
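Nested transactions can be modelled with a stack of undo actions, as in the sketch below. ToyTransactionalStore is hypothetical and says nothing about how DatastoreTransaction is actually implemented; it only shows why an inner block that completed can still be rolled back by a failure in the outer block.

```python
from contextlib import contextmanager


class ToyTransactionalStore:
    """Hypothetical store whose writes are undone if a transaction fails."""

    def __init__(self):
        self.data = {}
        self._undo_stack = []  # one list of undo callables per open transaction

    @contextmanager
    def transaction(self):
        self._undo_stack.append([])
        try:
            yield
        except BaseException:
            # Roll back this level's writes in reverse order, then re-raise.
            for undo in reversed(self._undo_stack.pop()):
                undo()
            raise
        else:
            undos = self._undo_stack.pop()
            if self._undo_stack:
                # Inner success: the enclosing transaction can still undo these.
                self._undo_stack[-1].extend(undos)

    def put(self, obj, ref):
        self.data[ref] = obj
        if self._undo_stack:
            self._undo_stack[-1].append(lambda: self.data.pop(ref, None))


store = ToyTransactionalStore()
try:
    with store.transaction():
        store.put(1, "a")
        with store.transaction():
            store.put(2, "b")  # inner block completes normally
        raise RuntimeError("outer failure")
except RuntimeError:
    pass
```

After the outer failure both writes are gone, including the one made inside the inner block.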
 - transfer(inputDatastore: Datastore, ref: DatasetRef) -> None
- Retrieve a dataset from an input Datastore, and store the result in this Datastore.
- Parameters:
- inputDatastore : Datastore
- The external Datastore from which to retrieve the Dataset.
- ref : DatasetRef
- Reference to the required dataset in the input datastore.
- Returns:
- results : list
- List containing the return value from the put() to each child datastore.
 
 - transfer_from(source_datastore: Datastore, refs: Iterable[DatasetRef], local_refs: Optional[Iterable[DatasetRef]] = None, transfer: str = 'auto', artifact_existence: Optional[Dict[ButlerURI, bool]] = None) -> None
- Transfer dataset artifacts from another datastore to this one.
- Parameters:
- source_datastore : Datastore
- The datastore from which to transfer artifacts. That datastore must be compatible with this datastore receiving the artifacts.
- refs : iterable of DatasetRef
- The datasets to transfer from the source datastore.
- local_refs : iterable of DatasetRef, optional
- The dataset refs associated with the registry associated with this datastore. Can be None if the source and target datastore are using UUIDs.
- transfer : str, optional
- How (and whether) the dataset should be added to the datastore. Choices include “move”, “copy”, “link”, “symlink”, “relsymlink”, and “hardlink”. “link” is a special transfer mode that will first try to make a hardlink and if that fails a symlink will be used instead. “relsymlink” creates a relative symlink rather than using an absolute path. Most datastores do not support all transfer modes. “auto” (the default) is a special option that will let the datastore choose the most natural option for itself. If the source location and transfer location are identical the transfer mode will be ignored.
- artifact_existence : dict of [ButlerURI, bool], optional
- Mapping of datastore artifact to existence. Updated by this method with details of all artifacts tested. Can be None if the caller is not interested.
- Raises:
- TypeError
- Raised if the two datastores are not compatible.
 
 
 - trash(ref: Union[DatasetRef, Iterable[DatasetRef]], ignore_errors: bool = True) -> None
- Indicate to the Datastore that a Dataset can be moved to the trash.
- Parameters:
- ref : DatasetRef or iterable thereof
- Reference(s) to the required Dataset.
- ignore_errors : bool, optional
- Determine whether errors should be ignored. When multiple refs are being trashed there will be no per-ref check.
- Raises:
- FileNotFoundError
- When Dataset does not exist and errors are not ignored. Only checked if a single ref is supplied (and not in a list).
- Notes
- Some Datastores may implement this method as a silent no-op to disable Dataset deletion through standard interfaces.
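The difference between trashing a single ref and an iterable of refs can be sketched as follows. Strings stand in for DatasetRef, and toy_trash with its known and trashed sets is hypothetical; only the error-checking rule is taken from the text above.

```python
def toy_trash(known, trashed, ref, ignore_errors=True):
    """Mark refs as trashed; only a single missing ref can raise."""
    if isinstance(ref, str):  # a single ref (strings stand in for DatasetRef)
        if ref in known:
            trashed.add(ref)
        elif not ignore_errors:
            raise FileNotFoundError(f"{ref} is not known to this datastore")
        return
    # Several refs: no per-ref existence check, unknown refs are skipped.
    for single in ref:
        if single in known:
            trashed.add(single)


known = {"a", "b"}
trashed = set()
toy_trash(known, trashed, "a", ignore_errors=False)
toy_trash(known, trashed, ["b", "missing"], ignore_errors=False)
```

The second call does not raise even though "missing" is unknown, because a list was supplied rather than a single ref.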
 - validateConfiguration(entities: Iterable[Union[DatasetRef, DatasetType, StorageClass]], logFailures: bool = False) -> None
- Validate some of the configuration for this datastore.
- Parameters:
- Raises:
- DatastoreValidationError
- Raised if there is a validation problem with a configuration. All the problems are reported in a single exception.
- Notes
- This method checks each datastore in turn.
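Checking each child in turn while reporting every problem in one exception might be structured as below. The exception class is a local stand-in with the same name as the butler one; validate_all, the child validators and their messages are hypothetical.

```python
class DatastoreValidationError(ValueError):
    """Local stand-in for the butler exception of the same name."""


def validate_all(validators, entities):
    """Run every child validator and report all failures together."""
    problems = []
    for validate in validators:
        try:
            validate(entities)
        except DatastoreValidationError as err:
            problems.append(str(err))  # keep going so nothing is hidden
    if problems:
        raise DatastoreValidationError("; ".join(problems))


def reject_flat(entities):
    if "flat" in entities:
        raise DatastoreValidationError("child 0: no formatter for 'flat'")


def reject_bias(entities):
    if "bias" in entities:
        raise DatastoreValidationError("child 1: no template for 'bias'")


try:
    validate_all([reject_flat, reject_bias], ["flat", "bias"])
except DatastoreValidationError as err:
    combined_message = str(err)
```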
 - validateKey(lookupKey: LookupKey, entity: Union[DatasetRef, DatasetType, StorageClass]) -> None
- Validate a specific look up key with supplied entity.
- Parameters:
- lookupKey : LookupKey
- Key to use to retrieve information from the datastore configuration.
- entity : DatasetRef, DatasetType, or StorageClass
- Entity to compare with configuration retrieved using the specified lookup key.
- Raises:
- DatastoreValidationError
- Raised if there is a problem with the combination of entity and lookup key.
- Notes
- Bypasses the normal selection priorities by allowing a key that would normally not be selected to be validated.