AbstractDatastoreCacheManager

class lsst.daf.butler.AbstractDatastoreCacheManager(config: str | DatastoreCacheManagerConfig, universe: DimensionUniverse)

Bases: ABC

An abstract base class for managing caching in a Datastore.

Parameters:
config : str or DatastoreCacheManagerConfig

Configuration to control caching.

universe : DimensionUniverse

Set of all known dimensions, used to expand and validate any dimensions used in lookup keys.

Attributes Summary

cache_size

Size of the cache in bytes.

file_count

Return the number of cached files tracked by the registry.

Methods Summary

find_in_cache(ref, extension)

Look for a dataset in the cache and return its location.

known_to_cache(ref[, extension])

Report if the dataset is known to the cache.

move_to_cache(uri, ref)

Move a file to the cache.

remove_from_cache(ref)

Remove the specified datasets from the cache.

should_be_cached(entity)

Indicate whether the entity should be added to the cache.

Attributes Documentation

cache_size

Size of the cache in bytes.

file_count

Return the number of cached files tracked by the registry.

Methods Documentation

abstract find_in_cache(ref: DatasetRef, extension: str) → Iterator[ResourcePath | None]

Look for a dataset in the cache and return its location.

Parameters:
ref : DatasetRef

Dataset to locate in the cache.

extension : str

File extension expected. Should include the leading “.”.

Yields:
uri : lsst.resources.ResourcePath or None

The URI to the cached file, or None if the file has not been cached.

Notes

This method should be used as a context manager so that the file cannot be removed from the cache while the context is active.
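The context-manager behaviour described in the note can be illustrated with a minimal sketch. It does not use lsst.daf.butler: ToyCache, its add() and expire() methods, and the string dataset IDs are hypothetical stand-ins chosen for illustration, and the real implementation may protect files in a different way.

```python
import contextlib
from collections.abc import Iterator
from pathlib import Path
from typing import Optional


class ToyCache:
    """Hypothetical in-memory stand-in, not the lsst.daf.butler implementation."""

    def __init__(self) -> None:
        self._files = {}      # (dataset_id, extension) -> Path
        self._in_use = set()  # keys currently held open by a caller

    def add(self, dataset_id: str, extension: str, path: Path) -> None:
        self._files[(dataset_id, extension)] = path

    @contextlib.contextmanager
    def find_in_cache(self, dataset_id: str, extension: str) -> Iterator[Optional[Path]]:
        key = (dataset_id, extension)
        path = self._files.get(key)
        if path is None:
            # Cache miss: yield None, as the documented interface does.
            yield None
            return
        self._in_use.add(key)
        try:
            yield path
        finally:
            # Leaving the context makes the file eligible for expiry again.
            self._in_use.discard(key)

    def expire(self) -> list:
        """Drop every entry that is not currently in use; return the dropped keys."""
        removed = [key for key in self._files if key not in self._in_use]
        for key in removed:
            del self._files[key]
        return removed


cache = ToyCache()
cache.add("abc123", ".fits", Path("/tmp/cache/abc123.fits"))

with cache.find_in_cache("abc123", ".fits") as uri:
    held = uri
    removed_while_held = cache.expire()  # the held file is skipped

removed_after = cache.expire()  # the file can now be expired

with cache.find_in_cache("unknown", ".fits") as missing:
    miss = missing
```

In this sketch the extension includes the leading ".", matching the parameter description above.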

abstract known_to_cache(ref: DatasetRef, extension: str | None = None) → bool

Report if the dataset is known to the cache.

Parameters:
ref : DatasetRef

Dataset to check for in the cache.

extension : str, optional

File extension expected. Should include the leading “.”. If None, the extension is ignored and the dataset ID alone is used to check the cache. The extension must be defined if a specific component is being checked.

Returns:
known : bool

Returns True if the dataset is currently known to the cache and False otherwise.

Notes

This method can only report whether the dataset is known to the cache at this specific instant; it does not indicate whether the file can be read from the cache later. find_in_cache() should be called if the cached file is to be used.

abstract move_to_cache(uri: ResourcePath, ref: DatasetRef) → ResourcePath | None

Move a file to the cache.

Move the given file into the cache, using the supplied DatasetRef for naming. A call is made to should_be_cached() and, if the DatasetRef should not be accepted, None will be returned.

Cache expiry can occur during this call.

Parameters:
uri : lsst.resources.ResourcePath

Location of the file to be relocated to the cache. Will be moved.

ref : DatasetRef

Ref associated with this file. Will be used to determine the name of the file within the cache.

Returns:
new : lsst.resources.ResourcePath or None

URI to the file within the cache, or None if the dataset was not accepted by the cache.
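The move-or-reject contract can be sketched with the standard library alone. The function below is a hypothetical toy and makes several assumptions: the cacheable flag stands in for the should_be_cached() call, the "dataset ID plus original suffix" file name is an illustrative naming scheme rather than the one a real cache manager uses, and leaving a rejected file untouched is a choice of this toy only.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def move_to_cache(src: Path, dataset_id: str, cache_dir: Path, cacheable: bool) -> Optional[Path]:
    """Toy version: move ``src`` into ``cache_dir`` under a name derived from the ID."""
    if not cacheable:
        # Stands in for should_be_cached() rejecting the ref.
        # In this toy the source file is simply left untouched.
        return None
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{dataset_id}{src.suffix}"
    shutil.move(str(src), str(target))  # the source file no longer exists afterwards
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    accepted_src = root / "download.fits"
    accepted_src.write_bytes(b"pixels")
    cached = move_to_cache(accepted_src, "abc123", root / "cache", cacheable=True)
    moved_ok = cached is not None and cached.exists() and not accepted_src.exists()
    cached_name = cached.name if cached is not None else None

    rejected_src = root / "other.fits"
    rejected_src.write_bytes(b"pixels")
    rejected = move_to_cache(rejected_src, "def456", root / "cache", cacheable=False)
    left_in_place = rejected_src.exists()
```

Callers therefore need to handle a None return value and should not assume the original location is still valid after a successful move.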

abstract remove_from_cache(ref: DatasetRef | Iterable[DatasetRef]) → None

Remove the specified datasets from the cache.

It is not an error for these datasets to be missing from the cache.

Parameters:
ref : DatasetRef or iterable of DatasetRef

The datasets to remove from the cache.
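One way to accept either a single ref or an iterable of refs, while tolerating refs that are not cached, is sketched below. This is a hypothetical toy: plain strings stand in for DatasetRef objects and a dict stands in for the cache, so the type check would differ in a real implementation.

```python
from collections.abc import Iterable
from typing import Union


def remove_from_cache(cache: dict, ref: Union[str, Iterable]) -> None:
    """Toy version: ``cache`` maps a string dataset ID to a cached path."""
    # Normalize a single ref to a one-element list.
    refs = [ref] if isinstance(ref, str) else list(ref)
    for r in refs:
        # pop() with a default makes a missing entry a no-op, not an error.
        cache.pop(r, None)


cache = {"a": "/cache/a.fits", "b": "/cache/b.fits", "c": "/cache/c.fits"}
remove_from_cache(cache, "a")               # single ref
remove_from_cache(cache, ["b", "missing"])  # iterable, one entry not cached
```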

abstract should_be_cached(entity: DatasetRef | DatasetType | StorageClass) → bool

Indicate whether the entity should be added to the cache.

This is relevant when reading or writing.

Parameters:
entity : StorageClass or DatasetType or DatasetRef

Thing to test against the configuration. The name property is used to determine a match. A DatasetType will first check its name, before checking its StorageClass. If there are no matches, the default will be returned.

Returns:
should_cache : bool

Returns True if the dataset should be cached; False otherwise.
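The lookup order described for entity (dataset type name first, then storage class name, then the default) can be sketched as follows. ToyStorageClass, ToyDatasetType, the rules dictionary, and the default argument are hypothetical stand-ins; the actual configuration format of DatastoreCacheManagerConfig is not shown in this page and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyStorageClass:
    name: str


@dataclass(frozen=True)
class ToyDatasetType:
    name: str
    storageClass: ToyStorageClass


def should_be_cached(entity, rules: dict, default: bool) -> bool:
    """Toy version: ``rules`` maps a name to True/False; first match wins."""
    names = [entity.name]
    # A dataset type also falls back to the name of its storage class.
    storage_class = getattr(entity, "storageClass", None)
    if storage_class is not None:
        names.append(storage_class.name)
    for name in names:
        if name in rules:
            return rules[name]
    return default


rules = {"calexp": False, "ExposureF": True}
exposure = ToyStorageClass("ExposureF")

by_type_name = should_be_cached(ToyDatasetType("calexp", exposure), rules, default=False)
by_storage_class = should_be_cached(ToyDatasetType("deepCoadd", exposure), rules, default=False)
no_match = should_be_cached(ToyStorageClass("Catalog"), rules, default=True)
```

Here the dataset type name takes precedence: "calexp" is rejected even though its storage class would be accepted on its own.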