Butler
class lsst.daf.butler.Butler(config=None, butler=None, collection=None, run=None, searchPaths=None)
Bases: object
Main entry point for the data access system.
Parameters:
- config : ButlerConfig, Config or str, optional
  Configuration. Anything acceptable to the ButlerConfig constructor. If a directory path is given, the configuration will be read from a butler.yaml file in that location. If None is given, default values will be used.
- butler : Butler, optional
  If provided, construct a new Butler that uses the same registry and datastore as the given one, but with the given collection and run. Incompatible with the config and searchPaths arguments.
- collection : str, optional
  Collection to use for all input lookups, overriding config["collection"] if provided.
- run : str, Run, optional
  Collection associated with the Run to use for outputs, overriding config["run"]. If a Run associated with the given Collection does not exist, it will be created. If "collection" is None, this collection will be used for input lookups as well; if not, it must have the same value as "run".
- searchPaths : list of str, optional
  Directory paths to search when calculating the full Butler configuration. Not used if the supplied config is already a ButlerConfig.
Raises:
- ValueError
  Raised if neither "collection" nor "run" is provided by argument or config, or if both are provided and are inconsistent.
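The interaction between "collection" and "run" described above can be summarized in a few lines. This is a minimal sketch of the documented rules only, not the library's implementation: `resolve_collection_and_run` is a hypothetical helper, a plain dict stands in for the configuration, and `run` is treated as a collection name rather than a Run object.

```python
def resolve_collection_and_run(collection=None, run=None, config=None):
    """Apply the documented collection/run rules (illustrative only).

    Hypothetical helper: ``config`` is a plain dict standing in for the
    Butler configuration, and ``run`` is treated as a collection name.
    """
    config = config or {}
    # Explicit arguments override config["collection"] and config["run"].
    if collection is None:
        collection = config.get("collection")
    if run is None:
        run = config.get("run")

    if collection is None and run is None:
        raise ValueError("Either 'collection' or 'run' must be provided.")
    if collection is None:
        # With no input collection, the run's collection serves inputs too.
        collection = run
    elif run is not None and collection != run:
        raise ValueError(
            f"Inconsistent collection ({collection!r}) and run ({run!r})."
        )
    return collection, run
```

Under these assumptions, passing only `run="r1"` yields `("r1", "r1")`, while passing only a collection yields a lookup collection with no output run.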
Attributes:
- config : str, ButlerConfig or Config, optional
  (Filename to) configuration. If this is not a ButlerConfig, defaults will be read. If a str, may be the path to a directory containing a "butler.yaml" file.
- datastore : Datastore
  Datastore to use for storage.
- registry : Registry
  Registry to use for lookups.
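The directory convention for a string `config` can be illustrated with a small helper. This is a sketch of the documented convention only; `locate_butler_config` is a hypothetical name and does not reproduce how ButlerConfig actually reads or merges configuration.

```python
import os


def locate_butler_config(config):
    """Return the file path implied by a string ``config`` value.

    Hypothetical helper illustrating the documented convention only:
    a directory means "<directory>/butler.yaml"; anything else is
    returned unchanged.
    """
    if os.path.isdir(config):
        return os.path.join(config, "butler.yaml")
    return config
```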
Attributes Summary
- GENERATION: This is a Generation 3 Butler.
Methods Summary
- datasetExists(datasetRefOrType[, dataId]): Return True if the Dataset is actually present in the Datastore.
- get(datasetRefOrType[, dataId, parameters]): Retrieve a stored dataset.
- getDirect(ref[, parameters]): Retrieve a stored dataset.
- getUri(datasetRefOrType[, dataId, predict]): Return the URI to the Dataset.
- ingest(path, datasetRefOrType[, dataId, …]): Store and register a dataset that already exists on disk.
- makeRepo(root[, config, standalone, …]): Create an empty data repository by adding a butler.yaml config to a repository root directory.
- put(obj, datasetRefOrType[, dataId, producer]): Store and register a dataset.
- remove(datasetRefOrType[, dataId, delete, …]): Remove a dataset from the collection and possibly the repository.
- transaction(): Context manager supporting Butler transactions.
- validateConfiguration([logFailures, …]): Validate butler configuration.
Attributes Documentation
GENERATION = 3
This is a Generation 3 Butler.
This attribute may be removed in the future, once the Generation 2 Butler interface has been fully retired; it should only be used in transitional code.
Methods Documentation
datasetExists(datasetRefOrType, dataId=None, **kwds)
Return True if the Dataset is actually present in the Datastore.
Parameters:
- datasetRefOrType : DatasetRef, DatasetType, or str
  When a DatasetRef is provided, dataId should be None. Otherwise the DatasetType or name thereof.
- dataId : dict or DataId
  A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the first argument.
- kwds
  Additional keyword arguments used to augment or construct a DataId. See DataId parameters.
Raises:
- LookupError
  Raised if the Dataset is not even present in the Registry.
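Because datasetExists distinguishes "unknown to the Registry" (LookupError) from "registered but not in the Datastore" (False), callers may want to fold both into one status. The sketch below assumes only the signature documented above; `dataset_status` and its three status strings are hypothetical names chosen for illustration.

```python
def dataset_status(butler, dataset_type, data_id):
    """Classify a dataset using Butler.datasetExists (illustrative).

    ``butler`` is any object with the datasetExists method documented
    above; the three status strings are this sketch's own labels.
    """
    try:
        stored = butler.datasetExists(dataset_type, data_id)
    except LookupError:
        # Not even known to the Registry.
        return "unregistered"
    # Known to the Registry; True only if the Datastore actually holds it.
    return "stored" if stored else "registered-only"
```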
get(datasetRefOrType, dataId=None, parameters=None, **kwds)
Retrieve a stored dataset.
Parameters:
- datasetRefOrType : DatasetRef, DatasetType, or str
  When a DatasetRef is provided, dataId should be None. Otherwise the DatasetType or name thereof.
- dataId : dict or DataId
  A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the first argument.
- parameters : dict
  Additional StorageClass-defined options to control reading, typically used to efficiently read only a subset of the dataset.
- kwds
  Additional keyword arguments used to augment or construct a DataId. See DataId parameters.
Returns:
- obj : object
  The dataset.
getDirect(ref, parameters=None)
Retrieve a stored dataset.
Unlike Butler.get, this method allows datasets outside the Butler's collection to be read as long as the DatasetRef that identifies them can be obtained separately.
Parameters:
- ref : DatasetRef
  Reference to an already stored dataset.
- parameters : dict
  Additional StorageClass-defined options to control reading, typically used to efficiently read only a subset of the dataset.
Returns:
- obj : object
  The dataset.
getUri(datasetRefOrType, dataId=None, predict=False, **kwds)
Return the URI to the Dataset.
Parameters:
- datasetRefOrType : DatasetRef, DatasetType, or str
  When a DatasetRef is provided, dataId should be None. Otherwise the DatasetType or name thereof.
- dataId : dict or DataId
  A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the first argument.
- predict : bool
  If True, allow URIs to be returned for datasets that have not been written.
- kwds
  Additional keyword arguments used to augment or construct a DataId. See DataId parameters.
Returns:
- uri : str
  URI string pointing to the Dataset within the datastore. If the Dataset does not exist in the datastore, and if predict is True, the URI will be a prediction and will include a URI fragment "#predicted". If the datastore does not have entities that relate well to the concept of a URI, the returned URI string will be descriptive. The returned URI is not guaranteed to be obtainable.
Raises:
- FileNotFoundError
  A URI has been requested for a dataset that does not exist and guessing is not allowed.
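When predict=True is used, a caller typically needs to know whether the returned string points at a real dataset or is only a prediction. The following sketch relies solely on the "#predicted" fragment convention described above; `split_predicted` is a hypothetical helper, not part of the Butler API.

```python
from urllib.parse import urldefrag


def split_predicted(uri):
    """Split a URI string into (location, was_predicted).

    Relies only on the documented "#predicted" fragment convention.
    """
    location, fragment = urldefrag(uri)
    return location, fragment == "predicted"
```

It would be applied to the string returned by a call such as getUri(datasetRefOrType, dataId, predict=True). Since the returned URI may be merely descriptive for some datastores, the location part should not be assumed to be openable.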
ingest(path, datasetRefOrType, dataId=None, *, formatter=None, transfer=None, **kwds)
Store and register a dataset that already exists on disk.
Parameters:
- path : str
  Path to the file containing the dataset.
- datasetRefOrType : DatasetRef, DatasetType, or str
  When a DatasetRef is provided, dataId should be None. Otherwise the DatasetType or name thereof.
- dataId : dict or DataId
  A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the second argument.
- formatter : Formatter, optional
  Formatter that should be used to retrieve the Dataset. If not provided, the formatter will be constructed according to Datastore configuration.
- transfer : str, optional
  If not None, must be one of 'move', 'copy', 'hardlink', or 'symlink', indicating how to transfer the file.
- kwds
  Additional keyword arguments used to augment or construct a DataId. See DataId parameters.
Returns:
- ref : DatasetRef
  A reference to the stored dataset, updated with the correct id if given.
Raises:
- TypeError
  Raised if the butler was not constructed with a Run, and is hence read-only.
- NotImplementedError
  Raised if the Datastore does not support the given transfer mode.
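The four transfer mode names map naturally onto ordinary filesystem operations. The sketch below shows one plausible reading for a file-based datastore; it is not the Datastore's implementation, `transfer_file` is a hypothetical helper, and returning the source path unchanged for transfer=None is an assumption of this sketch.

```python
import os
import shutil


def transfer_file(src, dst, transfer=None):
    """Place ``src`` at ``dst`` using one of the documented mode names.

    Illustrative stand-in for what a file-based datastore might do; the
    real behavior is defined by the Datastore. Returning ``src``
    unchanged for ``transfer=None`` is an assumption of this sketch.
    """
    if transfer is None:
        return src
    if transfer == "move":
        shutil.move(src, dst)
    elif transfer == "copy":
        shutil.copy2(src, dst)
    elif transfer == "hardlink":
        os.link(src, dst)
    elif transfer == "symlink":
        # Link to the absolute path so the link does not depend on the
        # directory that contains ``dst``.
        os.symlink(os.path.abspath(src), dst)
    else:
        # Mirrors the documented NotImplementedError for unsupported modes.
        raise NotImplementedError(f"Transfer mode {transfer!r} not supported.")
    return dst
```

The practical difference is what happens to the original file: 'move' removes it, 'copy' duplicates it, and the two link modes leave the data in place while making it reachable from the new location.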
static makeRepo(root, config=None, standalone=False, createRegistry=True, searchPaths=None, forceConfigRoot=True, outfile=None)
Create an empty data repository by adding a butler.yaml config to a repository root directory.
Parameters:
- root : str
  Filesystem path to the root of the new repository. Will be created if it does not exist.
- config : Config or str, optional
  Configuration to write to the repository, after setting any root-dependent Registry or Datastore config options. Cannot be a ButlerConfig or a ConfigSubset. If None, default configuration will be used. Root-dependent config options specified in this config are overwritten if forceConfigRoot is True.
- standalone : bool
  If True, write all expanded defaults, not just customized or repository-specific settings. This (mostly) decouples the repository from the default configuration, insulating it from changes to the defaults (which may be good or bad, depending on the nature of the changes). Future additions to the defaults will still be picked up when initializing Butlers to repos created with standalone=True.
- createRegistry : bool, optional
  If True, create a new Registry.
- searchPaths : list of str, optional
  Directory paths to search when calculating the full butler configuration.
- forceConfigRoot : bool, optional
  If False, any values present in the supplied config that would normally be reset are not overridden and will appear directly in the output config. This allows non-standard overrides of the root directory for a datastore or registry to be given. If this parameter is True, the values for root will be forced into the resulting config if appropriate.
- outfile : str, optional
  If not None, the output configuration will be written to this location rather than into the repository itself.
Raises:
- ValueError
  Raised if a ButlerConfig or ConfigSubset is passed instead of a regular Config (as these subclasses would make it impossible to support standalone=False).
- os.error
  Raised if the directory does not exist, exists but is not a directory, or cannot be created.
Notes
Note that when standalone=False (the default), the configuration search path (see ConfigSubset.defaultSearchPaths) that was used to construct the repository should also be used to construct any Butlers to avoid configuration inconsistencies.
put(obj, datasetRefOrType, dataId=None, producer=None, **kwds)
Store and register a dataset.
Parameters:
- obj : object
  The dataset.
- datasetRefOrType : DatasetRef, DatasetType, or str
  When a DatasetRef is provided, dataId should be None. Otherwise the DatasetType or name thereof.
- dataId : dict or DataId
  A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the second argument.
- producer : Quantum, optional
  The producer.
- kwds
  Additional keyword arguments used to augment or construct a DataId. See DataId parameters.
Returns:
- ref : DatasetRef
  A reference to the stored dataset, updated with the correct id if given.
Raises:
- TypeError
  Raised if the butler was not constructed with a Run, and is hence read-only.
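A put followed by a get with the same dataset type and data ID is the basic round trip. The sketch below assumes only the put and get signatures documented on this page; `store_and_reload` is a hypothetical helper, and the dataset type and data ID values are supplied by the caller.

```python
def store_and_reload(butler, obj, dataset_type, data_id):
    """Write ``obj`` with put, then read it back with get (illustrative).

    ``butler`` is any object exposing the put/get signatures documented
    above. A Butler constructed without a run is read-only, in which
    case put raises TypeError and nothing is read back.
    """
    # put returns a DatasetRef for the newly stored dataset.
    ref = butler.put(obj, dataset_type, data_id)
    # Look the dataset up again by type and data ID.
    return ref, butler.get(dataset_type, data_id)
```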
remove(datasetRefOrType, dataId=None, *, delete=True, remember=True, **kwds)
Remove a dataset from the collection and possibly the repository.
The identified dataset is always at least removed from the Butler's collection. By default it is also deleted from the Datastore (e.g. files are actually deleted), but the dataset is "remembered" by retaining its row in the dataset and provenance tables in the registry.
If the dataset is a composite, all components will also be removed.
Parameters:
- datasetRefOrType : DatasetRef, DatasetType, or str
  When a DatasetRef is provided, dataId should be None. Otherwise the DatasetType or name thereof.
- dataId : dict or DataId
  A dict of Dimension link name, value pairs that label the DatasetRef within a Collection. When None, a DatasetRef should be provided as the first argument.
- delete : bool
  If True (default), actually delete the dataset from the Datastore (i.e. actually remove files).
- remember : bool
  If True (default), retain dataset and provenance records in the Registry for this dataset.
- kwds
  Additional keyword arguments used to augment or construct a DataId. See DataId parameters.
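The effect of the delete and remember flags on remove can be tabulated. The sketch below only restates the description above as a lookup; `removal_effects` is a hypothetical helper, not library code. The combination in which both flags are False is not described in this section, so the sketch declines to model it rather than guess.

```python
def removal_effects(delete=True, remember=True):
    """Summarize what Butler.remove does for the documented flag values.

    Illustrative summary of the text above, not library code. The case
    where both flags are False is not described in this section, so the
    sketch declines to guess.
    """
    if not delete and not remember:
        raise NotImplementedError("Behavior not described in this section.")
    return {
        # The dataset always leaves the Butler's collection.
        "removed_from_collection": True,
        # Files are deleted from the Datastore only when delete=True.
        "deleted_from_datastore": delete,
        # Dataset and provenance rows stay in the Registry when remember=True.
        "registry_records_kept": remember,
    }
```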
validateConfiguration(logFailures=False, datasetTypeNames=None, ignore=None)
Validate butler configuration.
Checks that each DatasetType can be stored in the Datastore.
Parameters:
- logFailures : bool, optional
  If True, output a log message for every validation error detected.
- datasetTypeNames : iterable of str, optional
  The DatasetType names that should be checked. This allows only a subset to be selected.
- ignore : iterable of str, optional
  Names of DatasetTypes to skip over. This can be used to skip known problems. If a named DatasetType corresponds to a composite, all components of that DatasetType will also be ignored.
Raises:
- ButlerValidationError
  Raised if there is some inconsistency with how this Butler is configured.