Butler

class lsst.daf.butler.Butler(config=None, collection=None, run=None)

Bases: object

Main entry point for the data access system.

Parameters:
config : Config

Configuration.

collection : str, optional

Collection to use for all input lookups, overriding config["collection"] if provided.

run : str, Run, optional

Collection associated with the Run to use for outputs, overriding config["run"]. If a Run associated with the given Collection does not exist, it will be created. If collection is None, this collection will be used for input lookups as well; otherwise, collection must have the same value as run.

Raises:
ValueError

Raised if neither “collection” nor “run” is provided by argument or config, or if both are provided and are inconsistent.
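
The collection/run rule described above can be sketched in plain Python. This is only a minimal illustration of the documented behavior, not Butler's actual implementation: the helper name resolve_collection_and_run and the use of a plain dict for config are assumptions made for this example.

```python
def resolve_collection_and_run(config, collection=None, run=None):
    """Hypothetical helper mirroring the documented collection/run rule.

    `config` is assumed to be a plain dict here, not a real Butler Config.
    """
    # Explicit arguments override the corresponding config entries.
    if collection is None:
        collection = config.get("collection")
    if run is None:
        run = config.get("run")

    if collection is None and run is None:
        raise ValueError("Either 'collection' or 'run' must be provided.")
    if collection is None:
        # No input collection given: the run's collection serves for inputs too.
        collection = run
    elif run is not None and collection != run:
        raise ValueError(
            f"Inconsistent collection {collection!r} and run {run!r}."
        )
    return collection, run
```

With only run given, the same name is returned for both roles; with only collection given, run stays None, which matches the read-only case described under put.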

Attributes:
config : str, ButlerConfig or Config, optional

Configuration, or the filename of a configuration. If this is not a ButlerConfig, defaults will be read. If a str, it may be the path to a directory containing a “butler.yaml” file.

datastore : Datastore

Datastore to use for storage.

registry : Registry

Registry to use for lookups.

Methods Summary

datasetExists(datasetType, dataId) Return True if the Dataset is actually present in the Datastore.
get(datasetRefOrType[, dataId]) Retrieve a stored dataset.
getDirect(ref) Retrieve a stored dataset.
getUri(datasetType, dataId[, predict]) Return the URI to the Dataset.
makeRepo(root[, config, standalone, …]) Create an empty data repository by adding a butler.yaml config to a repository root directory.
put(obj, datasetRefOrType[, dataId, producer]) Store and register a dataset.
transaction() Context manager supporting Butler transactions.

Methods Documentation

datasetExists(datasetType, dataId)

Return True if the Dataset is actually present in the Datastore.

Parameters:
datasetType : DatasetType instance or str

The DatasetType.

dataId : dict

A dict mapping DataUnit link names to values, which together label the DatasetRef within a Collection.

Raises:
LookupError

Raised if the Dataset is not even present in the Registry.
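
The three possible outcomes (True, False, or LookupError) can be illustrated with a toy model. In this sketch the registry and datastore are plain Python sets, and the function name, the key layout, and the dataId keys are all hypothetical stand-ins rather than anything from lsst.daf.butler.

```python
def dataset_exists(registry, datastore, dataset_type, data_id):
    """Toy model of the documented behavior, using sets of keys."""
    # Build a hashable key from the dataset type name and the dataId dict.
    key = (dataset_type, tuple(sorted(data_id.items())))
    if key not in registry:
        # The dataset is not even known to the registry.
        raise LookupError(f"{dataset_type} with {data_id} is not registered")
    # Registered, but it may or may not actually be stored.
    return key in datastore


# Hypothetical dataId keys, for illustration only.
stored = ("calexp", (("camera", "HSC"), ("visit", 1)))
registered_only = ("calexp", (("camera", "HSC"), ("visit", 2)))
registry = {stored, registered_only}
datastore = {stored}
```

Here visit 1 yields True, visit 2 yields False (registered but not stored), and any other visit raises LookupError.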

get(datasetRefOrType, dataId=None)

Retrieve a stored dataset.

Parameters:
datasetRefOrType : DatasetRef, DatasetType instance or str

When a DatasetRef is given, dataId should be None. Otherwise, the DatasetType or its name.

dataId : dict, optional

A dict mapping DataUnit link names to values, which together label the DatasetRef within a Collection. When None, a DatasetRef should be supplied as the first argument (datasetRefOrType).

Returns:
obj : object

The dataset.
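
The two calling conventions (a reference alone, or a dataset type plus a dataId) can be sketched as a small normalization step. Everything here is hypothetical: Ref is a stand-in for DatasetRef, to_ref is an invented helper, and raising ValueError on misuse is an assumption, since the text above only says what "should" be passed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for DatasetRef."""
    dataset_type: str
    data_id: tuple


def to_ref(dataset_ref_or_type, data_id=None):
    """Normalize either calling convention to a single Ref."""
    if isinstance(dataset_ref_or_type, Ref):
        # A reference already carries its own dataId.
        if data_id is not None:
            raise ValueError("data_id must be None when a Ref is given")
        return dataset_ref_or_type
    if data_id is None:
        raise ValueError("data_id is required with a dataset type or name")
    return Ref(str(dataset_ref_or_type), tuple(sorted(data_id.items())))
```

Both to_ref("calexp", {"visit": 1}) and to_ref(existing_ref) then produce the same kind of object for the lookup that follows.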

getDirect(ref)

Retrieve a stored dataset.

Unlike Butler.get, this method allows datasets outside the Butler’s collection to be read as long as the DatasetRef that identifies them can be obtained separately.

Parameters:
ref : DatasetRef

Reference to an already stored dataset.

Returns:
obj : object

The dataset.

getUri(datasetType, dataId, predict=False)

Return the URI to the Dataset.

Parameters:
datasetType : DatasetType instance or str

The DatasetType.

dataId : dict

A dict mapping DataUnit link names to values, which together label the DatasetRef within a Collection.

predict : bool

If True, allow URIs to be returned for datasets that have not yet been written.

Returns:
uri : str

URI string pointing to the Dataset within the datastore. If the Dataset does not exist in the datastore and predict is True, the URI will be a prediction and will include the URI fragment “#predicted”. If the datastore does not have entities that map well onto the concept of a URI, the returned URI string will be descriptive. The returned URI is not guaranteed to be obtainable.

Raises:
FileNotFoundError

Raised if a URI is requested for a dataset that does not exist and prediction is not allowed (predict is False).
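
A caller that receives a URI string may want to check for the “#predicted” fragment before trying to open it. The following is a minimal sketch using only the Python standard library; the helper name split_predicted and the example URI are assumptions made for illustration.

```python
from urllib.parse import urldefrag


def split_predicted(uri):
    """Return (uri_without_fragment, is_predicted) for a URI string."""
    base, fragment = urldefrag(uri)
    # Per the description above, predicted URIs carry the "predicted" fragment.
    return base, fragment == "predicted"


# Hypothetical URI, for illustration only.
base, predicted = split_predicted("file:///repo/calexp/v1.fits#predicted")
```

Since the returned URI is not guaranteed to be obtainable, a check like this only distinguishes predicted locations from existing ones; it says nothing about whether the location can actually be read.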

static makeRepo(root, config=None, standalone=False, createRegistry=True)

Create an empty data repository by adding a butler.yaml config to a repository root directory.

Parameters:
root : str

Filesystem path to the root of the new repository. Will be created if it does not exist.

config : Config, optional

Configuration to write to the repository, after setting any root-dependent Registry or Datastore config options. If None, default configuration will be used.

standalone : bool

If True, write all expanded defaults, not just customized or repository-specific settings. This (mostly) decouples the repository from the default configuration, insulating it from changes to the defaults (which may be good or bad, depending on the nature of the changes). Future additions to the defaults will still be picked up when initializing Butlers to repos created with standalone=True.

createRegistry : bool

If True, create a new Registry.

Note that when standalone=False (the default), the configuration search path (see ConfigSubset.defaultSearchPaths) that was used to construct the repository should also be used to construct any Butlers to it to avoid configuration inconsistencies.

Returns:
config : Config

The updated Config instance written to the repo.

Raises:
ValueError

Raised if a ButlerConfig or ConfigSubset is passed instead of a regular Config (as these subclasses would make it impossible to support standalone=False).

os.error

Raised if the directory does not exist, exists but is not a directory, or cannot be created.
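
The effect of the standalone flag can be pictured with a toy model in which configurations are flat dicts. This is only a sketch of the idea described above, not how makeRepo is implemented; the function name config_to_write and the configuration keys are hypothetical.

```python
def config_to_write(defaults, overrides, standalone=False):
    """Toy model of which settings end up in the repository's butler.yaml."""
    if standalone:
        # All expanded defaults, with repository-specific settings on top.
        return {**defaults, **overrides}
    # Only the customized or repository-specific settings; defaults are
    # looked up again each time a Butler is constructed.
    return dict(overrides)


# Hypothetical keys, for illustration only.
defaults = {"datastore.cls": "DefaultDatastore", "registry.db": "sqlite"}
overrides = {"datastore.root": "/data/repo"}
```

In this model, a repository written with standalone=False picks up any later change to the defaults, which is why the same configuration search path should be used when constructing Butlers for it.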

put(obj, datasetRefOrType, dataId=None, producer=None)

Store and register a dataset.

Parameters:
obj : object

The dataset.

datasetRefOrType : DatasetRef, DatasetType instance or str

When a DatasetRef is given, dataId should be None. Otherwise, the DatasetType or its name.

dataId : dict, optional

An identifier with DataUnit names and values. When None, a DatasetRef should be supplied as the second argument (datasetRefOrType).

producer : Quantum, optional

The producer.

Returns:
ref : DatasetRef

A reference to the stored dataset, updated with the correct id if given.

Raises:
TypeError

Raised if the butler was not constructed with a Run, and is hence read-only.

transaction()

Context manager supporting Butler transactions.

Transactions can be nested.
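
How nested transaction context managers can compose is sketched below with a toy undo log. The text above states only that transactions can be nested; the rollback-on-exception behavior shown here is an assumption, and UndoLog is a hypothetical class that is not part of lsst.daf.butler.

```python
from contextlib import contextmanager


class UndoLog:
    """Hypothetical stand-in illustrating nested transactions."""

    def __init__(self):
        self.undo = []  # callables that reverse completed operations

    @contextmanager
    def transaction(self):
        mark = len(self.undo)  # where this transaction's entries begin
        try:
            yield self
        except BaseException:
            # Assumed semantics: undo everything recorded since this
            # transaction started, newest first, then re-raise.
            while len(self.undo) > mark:
                self.undo.pop()()
            raise


store = []
log = UndoLog()
with log.transaction():
    store.append("a")
    log.undo.append(store.pop)
    try:
        with log.transaction():
            store.append("b")
            log.undo.append(store.pop)
            raise RuntimeError("inner step failed")
    except RuntimeError:
        pass  # inner work has been undone; the outer transaction continues
```

In this model a failure in the inner block undoes only the inner work, while a failure that escapes the outer block undoes everything recorded in both.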