DatastoreCacheManager
- class lsst.daf.butler.DatastoreCacheManager(config: str | DatastoreCacheManagerConfig, universe: DimensionUniverse)
- Bases: AbstractDatastoreCacheManager
- A class for managing caching in a Datastore using local files.
- Parameters:
- config : str or DatastoreCacheManagerConfig
- Configuration to control caching.
- universe : DimensionUniverse
- Set of all known dimensions, used to expand and validate any used in lookup keys.
- Notes
- Two environment variables can be used to override the cache directory and expiration configuration:
- $DAF_BUTLER_CACHE_DIRECTORY
- $DAF_BUTLER_CACHE_EXPIRATION_MODE
- The expiration mode should take the form mode=threshold. For example, to limit the cache directory to 5 datasets the value would be datasets=5.
- Additionally, the $DAF_BUTLER_CACHE_DIRECTORY_IF_UNSET environment variable can be used to specify a directory that should be used if no explicit directory has been specified from configuration or from the $DAF_BUTLER_CACHE_DIRECTORY environment variable.
- Attributes Summary
- cache_directory
- cache_size: Size of the cache in bytes.
- file_count: Return number of cached files tracked by registry.
- Methods Summary
- find_in_cache(ref, extension): Look for a dataset in the cache and return its location.
- known_to_cache(ref[, extension]): Report if the dataset is known to the cache.
- move_to_cache(uri, ref): Move a file to the cache.
- remove_from_cache(refs): Remove the specified datasets from the cache.
- Scan the cache directory and record information about files.
- set_fallback_cache_directory_if_unset(): Define a fallback cache directory if a fallback is not set already.
- should_be_cached(entity): Indicate whether the entity should be added to the cache.
- Attributes Documentation
- cache_directory
- cache_size
- file_count
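The environment variables described in the class Notes can be set from Python before a cache manager is created. The following is a minimal sketch, not the library's own parsing logic: parse_expiration_mode is a hypothetical helper that only illustrates the mode=threshold form, the directory path is a placeholder, and the threshold is assumed to be an integer.

```python
import os


def parse_expiration_mode(value: str) -> tuple[str, int]:
    """Split a 'mode=threshold' string into (mode, threshold).

    Hypothetical helper for illustration; not part of lsst.daf.butler.
    """
    mode, sep, threshold = value.partition("=")
    if not sep or not mode or not threshold:
        raise ValueError(f"Expected 'mode=threshold', got {value!r}")
    return mode, int(threshold)


# Placeholder directory; substitute a real writable path.
os.environ["DAF_BUTLER_CACHE_DIRECTORY"] = "/tmp/butler_cache"
# Limit the cache directory to 5 datasets, as in the example in the Notes.
os.environ["DAF_BUTLER_CACHE_EXPIRATION_MODE"] = "datasets=5"

mode, threshold = parse_expiration_mode(os.environ["DAF_BUTLER_CACHE_EXPIRATION_MODE"])
```

Setting the variables in the environment, rather than in configuration, means the override applies without editing any configuration file.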
- Methods Documentation
- find_in_cache(ref: DatasetRef, extension: str) -> Iterator[ResourcePath | None]
- Look for a dataset in the cache and return its location.
- Parameters:
- ref : DatasetRef
- Dataset to locate in the cache.
- extension : str
- File extension expected. Should include the leading “.”.
- Yields:
- uri : lsst.resources.ResourcePath or None
- The URI to the cached file, or None if the file has not been cached.
- Notes
- Should be used as a context manager in order to prevent this file from being removed from the cache for that context.
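The context-manager usage described in the Notes for find_in_cache() might look like the sketch below. The function name read_from_cache is hypothetical, cache_manager is assumed to be an object that behaves like DatastoreCacheManager, and the yielded URI is assumed to offer a read() method that returns the file contents as bytes.

```python
def read_from_cache(cache_manager, ref, extension):
    """Return the cached file's bytes, or None if it is not cached.

    Hypothetical helper; cache_manager is assumed to behave like
    DatastoreCacheManager.
    """
    # The file is protected from removal only inside the with block,
    # so finish reading before leaving it.
    with cache_manager.find_in_cache(ref, extension) as uri:
        if uri is None:
            # The dataset has not been cached.
            return None
        return uri.read()
```

Reading inside the with block matters because the URI may no longer point at an existing file once the context exits.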
- known_to_cache(ref: DatasetRef, extension: str | None = None) -> bool
- Report if the dataset is known to the cache.
- Parameters:
- ref : DatasetRef
- Dataset to check for in the cache.
- extension : str, optional
- File extension expected. Should include the leading “.”. If None, the extension is ignored and the dataset ID alone is used to check in the cache. The extension must be defined if a specific component is being checked.
- Returns:
- bool
- Notes
- This method can only report whether the dataset is known to the cache at this specific instant and does not indicate whether the file can be read from the cache later. find_in_cache() should be called if the cached file is to be used.
- This method does not force the cache to be re-scanned and so can miss cached datasets that have recently been written by other processes.
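Because known_to_cache() only reports a snapshot, it is best suited to diagnostics rather than to deciding whether a file can be read. A small sketch of such a diagnostic follows; split_by_cache_knowledge is a hypothetical name, and cache_manager is assumed to behave like DatastoreCacheManager.

```python
def split_by_cache_knowledge(cache_manager, refs):
    """Partition refs into (known, unknown) lists at this instant.

    Hypothetical helper. The result is a snapshot only: a dataset in
    the known list may be expired before it is read, and datasets
    recently cached by other processes may appear in the unknown list.
    """
    known, unknown = [], []
    for ref in refs:
        if cache_manager.known_to_cache(ref):
            known.append(ref)
        else:
            unknown.append(ref)
    return known, unknown
```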
- move_to_cache(uri: ResourcePath, ref: DatasetRef) -> ResourcePath | None
- Move a file to the cache.
- Move the given file into the cache, using the supplied DatasetRef for naming. A call is made to should_be_cached() and, if the DatasetRef should not be accepted, None will be returned.
- Cache expiry can occur during this call.
- Parameters:
- uri : lsst.resources.ResourcePath
- Location of the file to be relocated to the cache. Will be moved.
- ref : DatasetRef
- Ref associated with this file. Will be used to determine the name of the file within the cache.
- Returns:
- new : lsst.resources.ResourcePath or None
- URI to the file within the cache, or None if the dataset was not accepted by the cache.
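Callers of move_to_cache() need to handle the None return value. The sketch below shows one way to do so; cache_or_keep is a hypothetical name, cache_manager is assumed to behave like DatastoreCacheManager, and the sketch assumes that a file rejected by the cache is left at its original location.

```python
def cache_or_keep(cache_manager, uri, ref):
    """Return the location to read the file from after a caching attempt.

    Hypothetical helper. Assumes a rejected file stays where it was.
    """
    cached_uri = cache_manager.move_to_cache(uri, ref)
    if cached_uri is None:
        # The cache did not accept this dataset; keep using the original.
        return uri
    # The file was moved, so the original URI should no longer be used.
    return cached_uri
```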
- remove_from_cache(refs: DatasetRef | Iterable[DatasetRef]) -> None
- Remove the specified datasets from the cache.
- It is not an error for these datasets to be missing from the cache.
- Parameters:
- refs : DatasetRef or iterable of DatasetRef
- The datasets to remove from the cache.
- classmethod set_fallback_cache_directory_if_unset() -> tuple[bool, str]
- Define a fallback cache directory if a fallback is not set already.
- Returns:
- defined : bool
- True if the fallback directory was newly defined in this method. False if it had already been set.
- cache_dir : str
- The path to the cache directory that will be used if it’s needed. This allows the caller to run a directory cleanup when it’s no longer needed (something that the cache manager cannot do because forks should not clean up directories defined by the parent process).
- Notes
- The fallback directory will not be defined if one has already been defined. This method sets the DAF_BUTLER_CACHE_DIRECTORY_IF_UNSET environment variable only if a value has not previously been stored in that environment variable. Setting the environment variable allows this value to survive into spawned subprocesses. Calling this method will lead to all subsequently created cache managers sharing the same cache.
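The cleanup responsibility described for set_fallback_cache_directory_if_unset() can be sketched as follows. This is one possible pattern under stated assumptions, not a prescribed one: run_with_fallback_cache and work are hypothetical names, and cache_manager_cls is assumed to be DatastoreCacheManager or a class offering the same classmethod.

```python
import shutil


def run_with_fallback_cache(cache_manager_cls, work):
    """Run work() with a shared fallback cache, then clean up if we own it.

    Hypothetical helper; work is any zero-argument callable.
    """
    defined, cache_dir = cache_manager_cls.set_fallback_cache_directory_if_unset()
    try:
        return work()
    finally:
        # Only the process that newly defined the directory removes it;
        # if it was already set, another process is responsible for it.
        if defined:
            shutil.rmtree(cache_dir, ignore_errors=True)
```

Checking the defined flag before deleting keeps a child process from removing a directory that its parent created.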
- should_be_cached(entity: DatasetRef | DatasetType | StorageClass) -> bool
- Indicate whether the entity should be added to the cache.
- This is relevant when reading or writing.
- Parameters:
- entity : StorageClass or DatasetType or DatasetRef
- Thing to test against the configuration. The name property is used to determine a match. A DatasetType will first check its name, before checking its StorageClass. If there are no matches the default will be returned.
- Returns: