ChainedDatastore¶
- class lsst.daf.butler.datastores.chainedDatastore.ChainedDatastore(config: DatastoreConfig, bridgeManager: DatastoreRegistryBridgeManager, datastores: list[Datastore])¶
- Bases: Datastore

  Chained Datastores to allow reads and writes from multiple datastores.

  A ChainedDatastore is configured with multiple datastore configurations. A put() is always sent to each datastore. A get() operation is sent to each datastore in turn and the first datastore to return a valid dataset is used.

  Parameters:
- config : DatastoreConfig or str
  Configuration. This configuration must include a datastores field as a sequence of datastore configurations. The order in this sequence indicates the order to use for read operations.
- bridgeManager : DatastoreRegistryBridgeManager
  Object that manages the interface between Registry and datastores.
- datastores : list[Datastore]
  All the child datastores known to this datastore.
Notes

ChainedDatastore never supports None or "move" as an ingest transfer mode. It supports "copy", "symlink", "relsymlink" and "hardlink" if and only if all its child datastores do.

Attributes Summary

- containerKey
  Key to specify where child datastores are configured.
- defaultConfigFile
  Path to configuration defaults.
- names
  Names associated with this datastore returned as a list.
- roots
  Return the root URIs for each named datastore.

Methods Summary

- clone(bridgeManager)
  Make an independent copy of this Datastore with a different DatastoreRegistryBridgeManager instance.
- emptyTrash([ignore_errors])
  Remove all datasets from the trash.
- exists(ref)
  Check if the dataset exists in one of the datastores.
- export(refs, *[, directory, transfer])
  Export datasets for transfer to another data repository.
- export_records(refs)
  Export datastore records and locations to an in-memory data structure.
- forget(refs)
  Indicate to the Datastore that it should remove all records of the given datasets, without actually deleting them.
- get(ref[, parameters, storageClass])
  Load an InMemoryDataset from the store.
- getLookupKeys()
  Return all the lookup keys relevant to this datastore.
- getManyURIs(refs[, predict, allow_missing])
  Return URIs associated with many datasets.
- getURI(ref[, predict])
  URI to the Dataset.
- getURIs(ref[, predict])
  Return URIs associated with dataset.
- get_opaque_table_definitions()
  Make definitions of the opaque tables used by this Datastore.
- import_records(data)
  Import datastore location and record data from an in-memory data structure.
- knows(ref)
  Check if the dataset is known to any of the datastores.
- knows_these(refs)
  Check which of the given datasets are known to this datastore.
- mexists(refs[, artifact_existence])
  Check the existence of multiple datasets at once.
- needs_expanded_data_ids(transfer[, entity])
  Test whether this datastore needs expanded data IDs to ingest.
- prepare_get_for_external_client(ref)
  Retrieve serializable data that can be used to execute a get().
- put(inMemoryDataset, ref)
  Write an InMemoryDataset with a given DatasetRef to each datastore.
- put_new(in_memory_dataset, ref)
  Write an InMemoryDataset with a given DatasetRef to the store.
- remove(ref)
  Indicate to the datastore that a dataset can be removed.
- retrieveArtifacts(refs, destination[, ...])
  Retrieve the file artifacts associated with the supplied refs.
- setConfigRoot(root, config, full[, overwrite])
  Set any filesystem-dependent config options for child Datastores to be appropriate for a new empty repository with the given root.
- transfer(inputDatastore, ref)
  Retrieve a dataset from an input Datastore, and store the result in this Datastore.
- transfer_from(source_datastore, refs[, ...])
  Transfer dataset artifacts from another datastore to this one.
- trash(ref[, ignore_errors])
  Indicate to the Datastore that a Dataset can be moved to the trash.
- validateConfiguration(entities[, logFailures])
  Validate some of the configuration for this datastore.
- validateKey(lookupKey, entity)
  Validate a specific look up key with supplied entity.

Attributes Documentation

- containerKey: ClassVar[str | None] = 'datastores'¶
- Key to specify where child datastores are configured. 
 - defaultConfigFile: ClassVar[str | None] = 'datastores/chainedDatastore.yaml'¶
- Path to configuration defaults. Accessed within the configs resource or relative to a search path. Can be None if no defaults are specified.
- names¶
  Names associated with this datastore returned as a list.

- roots¶
  Return the root URIs for each named datastore.
Methods Documentation

- clone(bridgeManager: DatastoreRegistryBridgeManager) Datastore¶

  Make an independent copy of this Datastore with a different DatastoreRegistryBridgeManager instance.

  Parameters:
  - bridgeManager : DatastoreRegistryBridgeManager
    New DatastoreRegistryBridgeManager object to use when instantiating managers.

  Returns:
  - datastore : Datastore
    New Datastore instance with the same configuration as the existing instance.
 
 - emptyTrash(ignore_errors: bool = True) None¶
- Remove all datasets from the trash.

  Parameters:
  - ignore_errors : bool, optional
    Determine whether errors should be ignored.

  Notes

  Some Datastores may implement this method as a silent no-op to disable Dataset deletion through standard interfaces.
 - exists(ref: DatasetRef) bool¶
- Check if the dataset exists in one of the datastores. 
 - export(refs: Iterable[DatasetRef], *, directory: ResourcePathExpression | None = None, transfer: str | None = 'auto') Iterable[FileDataset]¶
- Export datasets for transfer to another data repository.

  Parameters:
  - refs : iterable of DatasetRef
    Dataset references to be exported.
  - directory : str, optional
    Path to a directory that should contain files corresponding to output datasets. Ignored if transfer is explicitly None.
  - transfer : str, optional
    Mode that should be used to move datasets out of the repository. Valid options are the same as those of the transfer argument to ingest, and datastores may similarly signal that a transfer mode is not supported by raising NotImplementedError. If "auto" is given and no directory is specified, None will be implied.

  Returns:
  - dataset : iterable of FileDataset
    Structs containing information about the exported datasets, in the same order as refs.

  Raises:
  - NotImplementedError
    Raised if the given transfer mode is not supported.
 
 
 - export_records(refs: Iterable[DatasetIdRef]) Mapping[str, DatastoreRecordData]¶
- Export datastore records and locations to an in-memory data structure. 
 - forget(refs: Iterable[DatasetRef]) None¶
- Indicate to the Datastore that it should remove all records of the given datasets, without actually deleting them.

  Parameters:
  - refs : Iterable[DatasetRef]
    References to the datasets being forgotten.

  Notes

  Asking a datastore to forget a DatasetRef it does not hold should be a silent no-op, not an error.
 - get(ref: DatasetRef, parameters: Mapping[str, Any] | None = None, storageClass: StorageClass | str | None = None) Any¶
- Load an InMemoryDataset from the store.

  The dataset is returned from the first datastore that has the dataset.

  Parameters:
  - ref : DatasetRef
    Reference to the required Dataset.
  - parameters : dict
    StorageClass-specific parameters that specify, for example, a slice of the dataset to be loaded.
  - storageClass : StorageClass or str, optional
    The storage class to be used to override the Python type returned by this method. By default the returned type matches the dataset type definition for this dataset. Specifying a read StorageClass can force a different type to be returned. This type must be compatible with the original type.

  Returns:
  - inMemoryDataset : object
    Requested dataset or slice thereof as an InMemoryDataset.

  Raises:
  - FileNotFoundError
    Requested dataset can not be retrieved.
  - TypeError
    Return value from formatter has unexpected type.
  - ValueError
    Formatter failed to process the dataset.
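The fall-through read behaviour can be sketched in plain Python. This is an illustrative toy, not the real lsst.daf.butler API: MemoryStore and chained_get are invented stand-ins for child datastores and for the chained get() logic.

```python
# Toy model of chained reads: each child is tried in configuration order
# and the first one that can serve the dataset wins.

class MemoryStore:
    """Invented stand-in for a child datastore, backed by a dict."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def get(self, ref):
        if ref not in self._data:
            raise FileNotFoundError(ref)
        return self._data[ref]

def chained_get(datastores, ref):
    """Return the dataset from the first child that holds it."""
    for store in datastores:
        try:
            return store.get(ref)
        except FileNotFoundError:
            continue
    raise FileNotFoundError(f"{ref!r} not found in any child datastore")

fast = MemoryStore({"calexp": "cached-pixels"})
slow = MemoryStore({"calexp": "archived-pixels", "raw": "raw-pixels"})

# "calexp" is served by the first (fast) child even though both hold it;
# "raw" falls through to the second child.
print(chained_get([fast, slow], "calexp"))  # cached-pixels
print(chained_get([fast, slow], "raw"))     # raw-pixels
```

Note how a child that cannot serve the dataset is simply skipped; only when every child fails does the chained read raise FileNotFoundError.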
 
 
 - getLookupKeys() set[LookupKey]¶
- Return all the lookup keys relevant to this datastore.

  Returns:
  - keys : set of LookupKey
    The keys stored internally for looking up information based on DatasetType name or StorageClass.
 
 - getManyURIs(refs: Iterable[DatasetRef], predict: bool = False, allow_missing: bool = False) dict[lsst.daf.butler._dataset_ref.DatasetRef, lsst.daf.butler.datastore._datastore.DatasetRefURIs]¶
- Return URIs associated with many datasets. - Parameters:
- Returns:
- URIs : dict of [DatasetRef, DatasetRefURIs]
  A dict of primary and component URIs, indexed by the passed-in refs.
- Raises:
- FileNotFoundError
- A URI has been requested for a dataset that does not exist and guessing is not allowed. 
 
Notes

In file-based datastores, getManyURIs does not check that the file is really there; it assumes that if the datastore is aware of the file then it actually exists.
 - getURI(ref: DatasetRef, predict: bool = False) ResourcePath¶
- URI to the Dataset.

  The returned URI is from the first datastore in the list that has the dataset, with preference given to the first dataset coming from a permanent datastore. If no datastores have the dataset and prediction is allowed, the predicted URI for the first datastore in the list will be returned.

  Parameters:
  - ref : DatasetRef
    Reference to the required dataset.
  - predict : bool, optional
    If the datastore does not know about the dataset, controls whether it should return a predicted URI or not.

  Returns:
  - uri : lsst.resources.ResourcePath
    URI pointing to the dataset within the datastore. If the dataset does not exist in the datastore, and if predict is True, the URI will be a prediction and will include a URI fragment "#predicted".
- Raises:
- FileNotFoundError
- A URI has been requested for a dataset that does not exist and guessing is not allowed. 
- RuntimeError
- Raised if a request is made for a single URI but multiple URIs are associated with this dataset. 
 
Notes

If the datastore does not have entities that relate well to the concept of a URI the returned URI string will be descriptive. The returned URI is not guaranteed to be obtainable.
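The predicted-URI behaviour above can be sketched as a toy (not the real API; MemoryStore and chained_get_uri are invented, and the real method additionally prefers permanent datastores, which this simplification omits):

```python
# Toy model of chained getURI(): return the URI from the first child that
# has the dataset; with predict=True and no child holding it, predict a
# location in the first child and flag it with a "#predicted" fragment.

class MemoryStore:
    """Invented stand-in for a child datastore."""

    def __init__(self, name, refs=()):
        self.name = name
        self._refs = set(refs)

    def exists(self, ref):
        return ref in self._refs

    def uri(self, ref):
        # A real datastore would return an lsst.resources.ResourcePath.
        return f"store://{self.name}/{ref}"

def chained_get_uri(datastores, ref, predict=False):
    for store in datastores:
        if store.exists(ref):
            return store.uri(ref)
    if predict:
        # Predicted location in the first child, flagged with a fragment.
        return datastores[0].uri(ref) + "#predicted"
    raise FileNotFoundError(ref)

cache = MemoryStore("cache", {"calexp"})
archive = MemoryStore("archive", {"raw"})
print(chained_get_uri([cache, archive], "raw"))                 # store://archive/raw
print(chained_get_uri([cache, archive], "dark", predict=True))  # store://cache/dark#predicted
```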
 - getURIs(ref: DatasetRef, predict: bool = False) DatasetRefURIs¶
- Return URIs associated with dataset.

  Parameters:
  - ref : DatasetRef
    Reference to the required dataset.
  - predict : bool, optional
    If the datastore does not know about the dataset, controls whether it should return a predicted URI or not.

  Returns:
  - uris : DatasetRefURIs
    The URI to the primary artifact associated with this dataset (if the dataset was disassembled within the datastore this may be None), and the URIs to any components associated with the dataset artifact (can be empty if there are no components).
Notes

The returned URI is from the first datastore in the list that has the dataset, with preference given to the first dataset coming from a permanent datastore. If no datastores have the dataset and prediction is allowed, the predicted URI for the first datastore in the list will be returned.
 - get_opaque_table_definitions() Mapping[str, DatastoreOpaqueTable]¶
- Make definitions of the opaque tables used by this Datastore. - Returns:
- tables : Mapping[str, ddl.TableSpec]
  Mapping of opaque table names to their definitions. This can be an empty mapping if Datastore does not use opaque tables to keep datastore records.
 
 - import_records(data: Mapping[str, DatastoreRecordData]) None¶
- Import datastore location and record data from an in-memory data structure.

  Parameters:
  - data : Mapping[str, DatastoreRecordData]
    Datastore records indexed by datastore name.

  Notes

  Implementations should generally not check that any external resources (e.g. files) referred to by these records actually exist, for performance reasons; we expect higher-level code to guarantee that they do.

  Implementations are responsible for calling DatastoreRegistryBridge.insert on all datasets in data.locations where the key is in names, as well as loading any opaque table data.

  Implementations may assume that datasets are either fully present or not at all (single-component exports are not permitted).
 - knows(ref: DatasetRef) bool¶
- Check if the dataset is known to any of the datastores. - Does not check for existence of any artifact. 
 - knows_these(refs: Iterable[DatasetRef]) dict[lsst.daf.butler._dataset_ref.DatasetRef, bool]¶
- Check which of the given datasets are known to this datastore.

  This is like mexists() but does not check that the file exists.

  Parameters:
  - refs : iterable of DatasetRef
    The datasets to check.

  Returns:
  - exists : dict[DatasetRef, bool]
    Mapping of dataset to boolean indicating whether the dataset is known to the datastore.
 
 - mexists(refs: Iterable[DatasetRef], artifact_existence: dict[lsst.resources._resourcePath.ResourcePath, bool] | None = None) dict[lsst.daf.butler._dataset_ref.DatasetRef, bool]¶
- Check the existence of multiple datasets at once.

  Parameters:
  - refs : iterable of DatasetRef
    The datasets to be checked.
  - artifact_existence : dict[lsst.resources.ResourcePath, bool]
    Optional mapping of datastore artifact to existence. Updated by this method with details of all artifacts tested. Can be None if the caller is not interested.

  Returns:
  - existence : dict of [DatasetRef, bool]
    Mapping of dataset to boolean indicating existence.
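The way existence results combine across children can be sketched as a toy (not the real API; MemoryStore and chained_mexists are invented stand-ins):

```python
# Toy model of chained mexists(): a dataset exists if any child holds it,
# so per-child results are OR-combined across the chain.

class MemoryStore:
    """Invented stand-in for a child datastore."""

    def __init__(self, refs=()):
        self._refs = set(refs)

    def exists(self, ref):
        return ref in self._refs

def chained_mexists(datastores, refs):
    existence = {ref: False for ref in refs}
    for store in datastores:
        for ref in refs:
            # Skip refs already confirmed by an earlier child.
            if not existence[ref]:
                existence[ref] = store.exists(ref)
    return existence

a = MemoryStore({"bias"})
b = MemoryStore({"flat"})
print(chained_mexists([a, b], ["bias", "flat", "dark"]))
# {'bias': True, 'flat': True, 'dark': False}
```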
 
 - needs_expanded_data_ids(transfer: str | None, entity: DatasetRef | DatasetType | StorageClass | None = None) bool¶
- Test whether this datastore needs expanded data IDs to ingest. - Parameters:
- Returns:
 
 - prepare_get_for_external_client(ref: DatasetRef) object | None¶
- Retrieve serializable data that can be used to execute a get().

  Parameters:
  - ref : DatasetRef
    Reference to the required dataset.
- Returns:
 
 - put(inMemoryDataset: Any, ref: DatasetRef) None¶
- Write an InMemoryDataset with a given DatasetRef to each datastore.

  The put() to child datastores can fail with DatasetTypeNotSupportedError. The put() for this datastore will be deemed to have succeeded so long as at least one child datastore accepted the inMemoryDataset.

  Parameters:
  - inMemoryDataset : object
    The dataset to store.
  - ref : DatasetRef
    Reference to the associated Dataset.

  Raises:
  - TypeError
    Supplied object and storage class are inconsistent.
  - DatasetTypeNotSupportedError
    All datastores reported DatasetTypeNotSupportedError.
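The at-least-one-child-accepts rule can be sketched as a toy (not the real API; MemoryStore, chained_put, and the (dataset_type, data_id) ref tuple are invented for illustration):

```python
# Toy model of chained put(): the write is offered to every child; the
# overall put() succeeds if at least one child accepts the dataset and
# raises only when every child rejects it.

class DatasetTypeNotSupportedError(RuntimeError):
    """Invented stand-in for the real exception class."""

class MemoryStore:
    """Invented stand-in for a child datastore that accepts some dataset types."""

    def __init__(self, accepted_types):
        self.accepted_types = set(accepted_types)
        self.contents = {}

    def put(self, obj, ref):
        dataset_type, _ = ref  # ref modelled as (dataset_type, data_id)
        if dataset_type not in self.accepted_types:
            raise DatasetTypeNotSupportedError(dataset_type)
        self.contents[ref] = obj

def chained_put(datastores, obj, ref):
    n_accepted = 0
    for store in datastores:
        try:
            store.put(obj, ref)
            n_accepted += 1
        except DatasetTypeNotSupportedError:
            continue  # this child declines; others may still accept
    if n_accepted == 0:
        raise DatasetTypeNotSupportedError(f"no child accepted {ref!r}")

images = MemoryStore({"calexp"})
tables = MemoryStore({"sourceTable"})
ref = ("calexp", "visit-42")
chained_put([images, tables], "pixels", ref)
print(ref in images.contents, ref in tables.contents)  # True False
```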
 
 
 - put_new(in_memory_dataset: Any, ref: DatasetRef) Mapping[str, DatasetRef]¶
- Write an InMemoryDataset with a given DatasetRef to the store.

  Parameters:
  - in_memory_dataset : object
    The Dataset to store.
  - ref : DatasetRef
    Reference to the associated Dataset.
- Returns:
 
 - remove(ref: DatasetRef) None¶
- Indicate to the datastore that a dataset can be removed.

  The dataset will be removed from each datastore. The dataset is not required to exist in every child datastore.

  Parameters:
  - ref : DatasetRef
    Reference to the required dataset.
- Raises:
- FileNotFoundError
- Attempt to remove a dataset that does not exist. Raised if none of the child datastores removed the dataset. 
 
 
 - retrieveArtifacts(refs: Iterable[DatasetRef], destination: ResourcePath, transfer: str = 'auto', preserve_path: bool = True, overwrite: bool = False) list[lsst.resources._resourcePath.ResourcePath]¶
- Retrieve the file artifacts associated with the supplied refs.

  Parameters:
  - refs : iterable of DatasetRef
    The datasets for which file artifacts are to be retrieved. A single ref can result in multiple files. The refs must be resolved.
  - destination : lsst.resources.ResourcePath
    Location to write the file artifacts.
  - transfer : str, optional
    Method to use to transfer the artifacts. Must be one of the options supported by lsst.resources.ResourcePath.transfer_from(). "move" is not allowed.
  - preserve_path : bool, optional
    If True the full path of the file artifact within the datastore is preserved. If False the final file component of the path is used.
  - overwrite : bool, optional
    If True allow transfers to overwrite existing files at the destination.

  Returns:
  - targets : list of lsst.resources.ResourcePath
    URIs of file artifacts in destination location. Order is not preserved.
 
 - classmethod setConfigRoot(root: str, config: Config, full: Config, overwrite: bool = True) None¶
- Set any filesystem-dependent config options for child Datastores to be appropriate for a new empty repository with the given root.

  Parameters:
  - root : str
    Filesystem path to the root of the data repository.
  - config : Config
    A Config to update. Only the subset understood by this component will be updated. Will not expand defaults.
  - full : Config
    A complete config with all defaults expanded that can be converted to a DatastoreConfig. Read-only and will not be modified by this method. Repository-specific options that should not be obtained from defaults when Butler instances are constructed should be copied from full to config.
  - overwrite : bool, optional
    If False, do not modify a value in config if the value already exists. Default is always to overwrite with the provided root.

  Notes

  If a keyword is explicitly defined in the supplied config it will not be overridden by this method if overwrite is False. This allows explicit values set in external configs to be retained.
 - transfer(inputDatastore: Datastore, ref: DatasetRef) None¶
- Retrieve a dataset from an input Datastore, and store the result in this Datastore.

  Parameters:
  - inputDatastore : Datastore
    The external Datastore from which to retrieve the Dataset.
  - ref : DatasetRef
    Reference to the required dataset in the input data store.

  Returns:
  - results : list
    List containing the return value from the put() to each child datastore.
 
 - transfer_from(source_datastore: Datastore, refs: Collection[DatasetRef], transfer: str = 'auto', artifact_existence: dict[lsst.resources._resourcePath.ResourcePath, bool] | None = None, dry_run: bool = False) tuple[set[lsst.daf.butler._dataset_ref.DatasetRef], set[lsst.daf.butler._dataset_ref.DatasetRef]]¶
- Transfer dataset artifacts from another datastore to this one.

  Parameters:
  - source_datastore : Datastore
    The datastore from which to transfer artifacts. That datastore must be compatible with this datastore receiving the artifacts.
  - refs : Collection of DatasetRef
    The datasets to transfer from the source datastore.
  - transfer : str, optional
    How (and whether) the dataset should be added to the datastore. Choices include "move", "copy", "link", "symlink", "relsymlink", and "hardlink". "link" is a special transfer mode that will first try to make a hardlink and if that fails a symlink will be used instead. "relsymlink" creates a relative symlink rather than an absolute path. Most datastores do not support all transfer modes. "auto" (the default) is a special option that will let the data store choose the most natural option for itself. If the source location and transfer location are identical the transfer mode will be ignored.
  - artifact_existence : dict[lsst.resources.ResourcePath, bool]
    Optional mapping of datastore artifact to existence. Updated by this method with details of all artifacts tested. Can be None if the caller is not interested.
  - dry_run : bool, optional
    Process the supplied source refs without updating the target datastore.
- Returns:
- Raises:
- TypeError
- Raised if the two datastores are not compatible. 
 
 
 - trash(ref: DatasetRef | Iterable[DatasetRef], ignore_errors: bool = True) None¶
- Indicate to the Datastore that a Dataset can be moved to the trash.

  Parameters:
  - ref : DatasetRef or iterable thereof
    Reference(s) to the required Dataset.
  - ignore_errors : bool, optional
    Determine whether errors should be ignored. When multiple refs are being trashed there will be no per-ref check.
- Raises:
- FileNotFoundError
- When Dataset does not exist and errors are not ignored. Only checked if a single ref is supplied (and not in a list). 
 
Notes

Some Datastores may implement this method as a silent no-op to disable Dataset deletion through standard interfaces.
 - validateConfiguration(entities: Iterable[DatasetRef | DatasetType | StorageClass], logFailures: bool = False) None¶
- Validate some of the configuration for this datastore. - Parameters:
- Raises:
- DatastoreValidationError
- Raised if there is a validation problem with a configuration. All the problems are reported in a single exception. 
 
Notes

This method checks each datastore in turn.
 - validateKey(lookupKey: LookupKey, entity: DatasetRef | DatasetType | StorageClass) None¶
- Validate a specific look up key with supplied entity.

  Parameters:
  - lookupKey : LookupKey
    Key to use to retrieve information from the datastore configuration.
  - entity : DatasetRef, DatasetType, or StorageClass
    Entity to compare with configuration retrieved using the specified lookup key.
- Raises:
- DatastoreValidationError
- Raised if there is a problem with the combination of entity and lookup key. 
 
Notes

Bypasses the normal selection priorities by allowing a key that would normally not be selected to be validated.