lsst.daf.butler¶
Using the Butler¶
This module provides an abstracted data access interface, known as the Butler. It can be used to read and write data without having to know the details of file formats or locations.
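As a rough illustration of the read and write pattern described above, the sketch below wraps `Butler.get` and `Butler.put` in two small helpers. The helper names, repository path, collection, dataset type, and data ID values are hypothetical placeholders, and the import is deferred so the snippet still loads where `lsst.daf.butler` is not installed.

```python
def read_dataset(repo, collection, dataset_type, data_id):
    """Read one dataset from a repository (hypothetical helper)."""
    # Deferred import: lsst.daf.butler is only needed when the helper is called.
    from lsst.daf.butler import Butler

    # Butler that searches the given collection when reading.
    butler = Butler(repo, collections=collection)
    # The caller names a dataset type and data ID, not a file path or format.
    return butler.get(dataset_type, dataId=data_id)


def write_dataset(repo, run, obj, dataset_type, data_id):
    """Write one in-memory object to a repository (hypothetical helper)."""
    from lsst.daf.butler import Butler

    # Supplying an output run requests a butler that can write.
    # The dataset type is assumed to be registered in the repository already.
    butler = Butler(repo, run=run)
    return butler.put(obj, dataset_type, dataId=data_id)


# Hypothetical usage; every value below is a placeholder:
# image = read_dataset("/path/to/repo", "some/collection", "calexp",
#                      {"instrument": "MyCam", "visit": 1, "detector": 0})
```

Neither helper mentions a file name or format, which is the point of the abstraction: the repository configuration decides where and how the data are stored.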
Design and Development¶
lsst.daf.butler is developed at https://github.com/lsst/daf_butler.
You can find Jira issues for this module under the daf_butler component.
Butler Command Line Interface Development¶
The Dimensions System¶
Butler Command Reference¶
Python API reference¶
lsst.daf.butler Package¶
Functions¶
addDimensionForeignKey (tableSpec, dimension, …) |
Add a field and possibly a foreign key to a table specification that reference the table for the given Dimension. |
Classes¶
AmbiguousDatasetError |
Exception raised when a DatasetRef is not resolved (has no ID or run), but the requested operation requires one of them. |
Butler (config, str, None] = None, *, butler, …) |
Main entry point for the data access system. |
ButlerConfig (other, …) |
Contains the configuration for a Butler. |
ButlerURI |
Convenience wrapper around URI parsers. |
ButlerValidationError |
There is a problem with the Butler configuration. |
CollectionSearch (collections, …]) |
An ordered search path of collections. |
CollectionType |
Enumeration used to label different types of collections. |
CompositesConfig ([other, validate, …]) |
|
CompositesMap (config, ButlerConfig, …) |
Determine whether a specific datasetType or StorageClass should be disassembled. |
Config ([other]) |
Implements a datatype that is used by Butler for configuration parameters. |
ConfigSubset ([other, validate, …]) |
Config representing a subset of a more general configuration. |
Constraints (config, str]], *, universe) |
Determine whether a DatasetRef, DatasetType, or StorageClass is allowed to be handled. |
ConstraintsConfig ([other]) |
Configuration information for Constraints. |
ConstraintsValidationError |
Exception thrown when a constraints list has mutually exclusive definitions. |
DataCoordinate |
An immutable data ID dictionary that guarantees that its key-value pairs identify at least all required dimensions in a DimensionGraph. |
DataCoordinateIterable |
An abstract base class for homogeneous iterables of data IDs. |
DataCoordinateSequence (dataIds, graph, *, …) |
A DataCoordinateIterable implementation that supports the full collections.abc.Sequence interface. |
DataCoordinateSet (dataIds, graph, *, …) |
A DataCoordinateIterable implementation that adds some set-like functionality, and is backed by a true set-like object. |
DatabaseDimension (name, storage, *, …) |
A Dimension implementation that maps directly to a database table or query. |
DatabaseDimensionCombination (name, storage, …) |
A DimensionCombination implementation that maps directly to a database table or query. |
DatabaseDimensionElement (name, storage, *, …) |
An intermediate base class for DimensionElement classes whose instances map directly to a database table or query. |
DatabaseTopologicalFamily (name, space, *, …) |
A TopologicalFamily implementation for the DatabaseDimension and DatabaseDimensionCombination objects that have direct database representations. |
DatasetAssociation (ref, collection, timespan) |
A struct that represents the membership of a single dataset in a single collection. |
DatasetComponent (name, storageClass, component) |
Component of a dataset and associated information. |
DatasetRef (datasetType, dataId, *, id, run, …) |
Reference to a Dataset in a Registry. |
DatasetType (name, dimensions, …) |
A named category of Datasets that defines how they are organized, related, and stored. |
DatasetTypeNotSupportedError |
A DatasetType is not handled by this routine. |
Datastore (config, str], bridgeManager, …) |
Datastore interface. |
DatastoreConfig ([other, validate, …]) |
|
DatastoreValidationError |
There is a problem with the Datastore configuration. |
DeferredDatasetHandle (butler, ref, parameters) |
Proxy class that provides deferred loading of a dataset from a butler. |
Dimension |
A named data-organization concept that can be used as a key in a data ID. |
DimensionCombination |
A DimensionElement that provides extra metadata and/or relationship endpoint information for a combination of dimensions. |
DimensionConfig (other, …) |
Configuration that defines a DimensionUniverse. |
DimensionElement |
A named data-organization concept that defines a label and/or metadata in the dimensions system. |
DimensionGraph |
An immutable, dependency-complete collection of dimensions. |
DimensionPacker (fixed, dimensions) |
An abstract base class for bidirectional mappings between a DataCoordinate and a packed integer ID. |
DimensionRecord (**kwargs) |
Base class for the Python representation of database records for a DimensionElement. |
DimensionUniverse |
A parent class that represents a complete, self-consistent set of dimensions and their relationships. |
FileDataset (path, refs, …) |
A struct that represents a dataset exported to a file. |
FileDescriptor (location, storageClass, …) |
Describes a particular file. |
FileTemplate (template) |
Format a path template into a fully expanded path. |
FileTemplateValidationError |
Exception thrown when a file template is not consistent with the associated DatasetType. |
FileTemplates (config, str], default, *, universe) |
Collection of FileTemplate templates. |
FileTemplatesConfig ([other]) |
Configuration information for FileTemplates. |
Formatter (fileDescriptor, dataId, …) |
Interface for reading and writing Datasets with a particular StorageClass. |
FormatterFactory () |
Factory for Formatter instances. |
GovernorDimension (name, storage, *, …) |
A special Dimension with no dependencies and a small number of rows, used to group the dimensions that depend on it. |
Location (datastoreRootUri, …) |
Identifies a location within the Datastore. |
LocationFactory (datastoreRoot, str]) |
Factory for Location instances. |
LookupKey (name, dimensions, …) |
Representation of a key that can be used to look up information based on dataset type name, storage class name, or dimensions. |
MappingFactory (refType) |
Register the mapping of some key to a python type and retrieve instances. |
NameMappingSetView (mapping, K_co]) |
A lightweight implementation of NamedValueAbstractSet backed by a mapping from name to named object. |
NamedKeyDict (*args) |
A dictionary wrapper that requires keys to have a .name attribute, and permits lookups using either key objects or their names. |
NamedKeyMapping |
An abstract base class for custom mappings whose keys are objects with a str name attribute, for which lookups on the name as well as the object are permitted. |
NamedValueAbstractSet |
An abstract base class for custom sets whose elements are objects with a str name attribute, allowing some dict-like operations and views to be supported. |
NamedValueMutableSet |
An abstract base class that adds mutation interfaces to NamedValueAbstractSet. |
NamedValueSet (elements) |
A custom mutable set class that requires elements to have a .name attribute, which can then be used as keys in dict-like lookup. |
PruneCollectionsArgsError |
Base class for errors relating to Butler.pruneCollections input arguments. |
PurgeUnsupportedPruneCollectionsError (…) |
Raised when purge is True but is not supported for the given collection. |
PurgeWithoutUnstorePruneCollectionsError () |
Raised when purge and unstore are both required to be True, and purge is True but unstore is False. |
Quantum (*, taskName, taskClass, dataId, …) |
A discrete unit of work that may depend on one or more datasets and produces one or more datasets. |
Registry (database, defaults, managers) |
Registry interface. |
RegistryConfig ([other, validate, …]) |
|
RunWithoutPurgePruneCollectionsError (…) |
Raised when pruning a RUN collection but purge is False. |
SimpleQuery () |
A struct that combines SQLAlchemy objects representing SELECT, FROM, and WHERE clauses. |
SkyPixDimension (system, level) |
A special Dimension subclass for hierarchical pixelizations of the sky at a particular level. |
SkyPixSystem (name, *, maxLevel, …) |
A TopologicalFamily that represents a hierarchical pixelization of the sky. |
SpatialRegionDatabaseRepresentation (column, name) |
An object that encapsulates how spatial regions on the sky are represented in a database engine. |
StorageClass (name, pytype, str, …) |
Class describing how a label maps to a particular Python type. |
StorageClassConfig ([other, validate, …]) |
|
StorageClassDelegate (storageClass) |
Class to delegate the handling of components and parameters for the python type associated with a particular StorageClass. |
StorageClassFactory (config, str, None] = None) |
Factory for StorageClass instances. |
StoredDatastoreItemInfo |
Internal information associated with a stored dataset in a Datastore. |
StoredFileInfo (formatter, …) |
Datastore-private metadata associated with a file stored in a Datastore. |
Timespan (begin, …) |
A half-open time interval with nanosecond precision. |
TimespanDatabaseRepresentation |
An interface that encapsulates how timespans are represented in a database engine. |
TopologicalExtentDatabaseRepresentation |
An abstract base class whose subclasses provide a mapping from the in-memory representation of a TopologicalSpace region to a database-storage representation, and whose instances act like a SQLAlchemy-based column expression. |
TopologicalFamily (name, space) |
A grouping of TopologicalRelationshipEndpoint objects whose regions form a hierarchy in which one endpoint’s rows always contain another’s in a predefined way. |
TopologicalRelationshipEndpoint |
An abstract base class whose instances represent a logical table that may participate in overlap joins defined by a TopologicalSpace. |
TopologicalSpace |
Enumeration of the different categories of continuous-variable relationships supported by the dimensions system. |
ValidationError |
Some sort of validation error has occurred. |
YamlRepoExportBackend (stream) |
A repository export implementation that saves to a YAML file. |
YamlRepoImportBackend (stream, registry) |
A repository import implementation that reads from a YAML file. |
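The Timespan entry above describes a half-open interval. The following standalone sketch illustrates what half-open semantics imply for containment and overlap tests; `HalfOpenSpan` is a hypothetical toy class over integer nanoseconds and does not reproduce the `Timespan` API or its handling of unbounded or empty spans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HalfOpenSpan:
    """Toy half-open interval [begin, end) over integer nanoseconds."""

    begin: int
    end: int

    def contains(self, t: int) -> bool:
        # The begin bound is included, the end bound is excluded.
        return self.begin <= t < self.end

    def overlaps(self, other: "HalfOpenSpan") -> bool:
        # Spans that only touch (one ends where the other begins) do not overlap.
        return self.begin < other.end and other.begin < self.end


first = HalfOpenSpan(0, 1_000)
second = HalfOpenSpan(1_000, 2_000)
```

Because the end bound is excluded, adjacent spans such as `first` and `second` can tile a time range without any instant belonging to both.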
Class Inheritance Diagram¶
lsst.daf.butler.registry Package¶
Classes¶
CollectionSearch (collections, …]) |
An ordered search path of collections. |
CollectionType |
Enumeration used to label different types of collections. |
ConflictingDefinitionError |
Exception raised when trying to insert a database record when a conflicting record already exists. |
DbAuth (path, envVar, authList, str]]] = None) |
Retrieves authentication information for database connections. |
DbAuthError |
A problem has occurred retrieving database authentication information. |
DbAuthPermissionsError |
Credentials file has incorrect permissions. |
InconsistentDataIdError |
Exception raised when a data ID contains contradictory key-value pairs, according to dimension relationships. |
MissingCollectionError |
Exception raised when an operation attempts to use a collection that does not exist. |
OrphanedRecordError |
Exception raised when trying to remove or modify a database record that is still being used in some other table. |
Registry (database, defaults, managers) |
Registry interface. |
RegistryConfig ([other, validate, …]) |
|
RegistryDefaults (collections, run, infer, …) |
A struct used to provide the default collections searched or written to by a Registry or Butler instance. |
Class Inheritance Diagram¶
lsst.daf.butler.registry.interfaces Package¶
Classes¶
ButlerAttributeExistsError |
Exception raised when trying to update existing attribute without specifying force option. |
ButlerAttributeManager |
An interface for managing butler attributes in a Registry. |
ChainedCollectionRecord (key, name, universe) |
A subclass of CollectionRecord that adds the list of child collections in a CHAINED collection. |
CollectionManager |
An interface for managing the collections (including runs) in a Registry. |
CollectionRecord (key, name, type) |
A struct used to represent a collection in internal Registry APIs. |
Database (*, origin, connection, namespace) |
An abstract interface that represents a particular database engine’s representation of a single schema/namespace/database. |
DatabaseConflictError |
Exception raised when database content (row values or schema entities) are inconsistent with what this client expects. |
DatabaseDimensionOverlapStorage |
A base class for objects that manage overlaps between a pair of database-backed dimensions. |
DatabaseDimensionRecordStorage |
Intermediate interface for DimensionRecordStorage objects that provide storage for DatabaseDimensionElement instances. |
DatasetRecordStorage (datasetType) |
An interface that manages the records associated with a particular DatasetType. |
DatasetRecordStorageManager |
An interface that manages the tables that describe datasets. |
DatastoreRegistryBridge (datastoreName) |
An abstract base class that defines the interface that a Datastore uses to communicate with a Registry. |
DatastoreRegistryBridgeManager (*, opaque, …) |
An abstract base class that defines the interface between Registry and Datastore when a new Datastore is constructed. |
DimensionRecordStorage |
An abstract base class that represents a way of storing the records associated with a single DimensionElement . |
DimensionRecordStorageManager (*, universe) |
An interface for managing the dimension records in a Registry. |
FakeDatasetRef (id) |
A fake DatasetRef that can be used internally by butler where only the dataset ID is available. |
GovernorDimensionRecordStorage |
Intermediate interface for DimensionRecordStorage objects that provide storage for GovernorDimension instances. |
OpaqueTableStorage (name) |
An interface that manages the records associated with a particular opaque table in a Registry. |
OpaqueTableStorageManager |
An interface that manages the opaque tables in a Registry. |
ReadOnlyDatabaseError |
Exception raised when a write operation is called on a read-only Database. |
RunRecord (key, name, type) |
A subclass of CollectionRecord that adds execution information and an interface for updating it. |
SchemaAlreadyDefinedError |
Exception raised when trying to initialize database schema when some tables already exist. |
SkyPixDimensionRecordStorage |
Intermediate interface for DimensionRecordStorage objects that provide storage for SkyPixDimension instances. |
StaticTablesContext (db) |
Helper class used to declare the static schema for a registry layer in a database. |
VersionTuple |
Class representing a version number. |
VersionedExtension |
Interface for extension classes with versions. |
Class Inheritance Diagram¶
lsst.daf.butler.registry.queries Package¶
Classes¶
ChainedDatasetQueryResults (chain) |
A DatasetQueryResults implementation that simply chains together other results objects, each for a different parent dataset type. |
DataCoordinateQueryResults (db, query, *, …) |
An enhanced implementation of DataCoordinateIterable that represents data IDs retrieved from a database query. |
DatasetQueryResults |
An interface for objects that represent the results of queries for datasets. |
ParentDatasetQueryResults (db, query, *, …) |
An object that represents results from a query for datasets with a single parent DatasetType. |
Query (*, graph, whereRegion, managers) |
An abstract base class for queries that return some combination of DatasetRef and DataCoordinate objects. |
QueryBuilder (summary, managers) |
A builder for potentially complex queries that join tables based on dimension relationships. |
QuerySummary (requested, *, dataId, …) |
A struct that holds and categorizes the dimensions involved in a query. |
RegistryManagers (collections, datasets, …) |
Struct used to pass around the manager objects that back a Registry and are used internally by the query system. |
Class Inheritance Diagram¶
lsst.daf.butler.registry.wildcards Module¶
Classes¶
CategorizedWildcard (strings, patterns, …) |
The results of preprocessing a wildcard expression to separate match patterns from strings. |
CollectionQuery (search, …) |
An unordered query for collections and dataset type restrictions. |
CollectionSearch (collections, …]) |
An ordered search path of collections. |
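To make the idea of an ordered search path concrete, here is a minimal standalone sketch of first-match lookup. It does not use `CollectionSearch` itself: the function `find_first`, the collection names, and the nested dictionaries standing in for registry contents are all hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


def find_first(
    search_path: Iterable[str],
    contents: Dict[str, Dict[str, str]],
    key: str,
) -> Optional[Tuple[str, str]]:
    """Return (collection, value) from the first collection that has key."""
    for name in search_path:
        # Collections missing from `contents` are treated as empty.
        found = contents.get(name, {}).get(key)
        if found is not None:
            return name, found
    return None


# Toy stand-in for what each collection holds.
contents = {
    "calib": {"bias": "bias-v1"},
    "run2": {"bias": "bias-v2", "flat": "flat-v2"},
}
```

With these toy contents, searching `["run2", "calib"]` for `"bias"` stops at `run2`, while reversing the path stops at `calib`; the order of the path decides which match wins.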
Class Inheritance Diagram¶
Example datastores¶
lsst.daf.butler.datastores.chainedDatastore Module¶
Classes¶
ChainedDatastore (config, str], …) |
Chained Datastores to allow reads and writes from multiple datastores. |
Class Inheritance Diagram¶
lsst.daf.butler.datastores.inMemoryDatastore Module¶
Classes¶
StoredMemoryItemInfo (timestamp, …) |
Internal InMemoryDatastore Metadata associated with a stored DatasetRef. |
InMemoryDatastore (config, str], …) |
Basic Datastore for writing to an in-memory cache. |
Class Inheritance Diagram¶
lsst.daf.butler.datastores.posixDatastore Module¶
Classes¶
PosixDatastore (config, str], bridgeManager, …) |
Basic POSIX filesystem backed Datastore. |
Class Inheritance Diagram¶
lsst.daf.butler.datastores.s3Datastore Module¶
Classes¶
S3Datastore (config, str], bridgeManager, …) |
Basic S3 Object Storage backed Datastore. |
Class Inheritance Diagram¶
lsst.daf.butler.datastores.webdavDatastore Module¶
Classes¶
WebdavDatastore (config, str], bridgeManager, …) |
Basic Webdav Storage backed Datastore. |
Class Inheritance Diagram¶
Example formatters¶
lsst.daf.butler.formatters.file Module¶
Classes¶
FileFormatter (fileDescriptor, dataId, …) |
Interface for reading and writing files on a POSIX file system. |
Class Inheritance Diagram¶
lsst.daf.butler.formatters.json Module¶
Classes¶
JsonFormatter (fileDescriptor, dataId, …) |
Interface for reading and writing Python objects to and from JSON files. |
Class Inheritance Diagram¶
lsst.daf.butler.formatters.matplotlib Module¶
Classes¶
MatplotlibFormatter (fileDescriptor, dataId, …) |
Interface for writing matplotlib figures. |
Class Inheritance Diagram¶
lsst.daf.butler.formatters.parquet Module¶
Classes¶
ParquetFormatter (fileDescriptor, dataId, …) |
Interface for reading and writing Pandas DataFrames to and from Parquet files. |
Class Inheritance Diagram¶
lsst.daf.butler.formatters.pickle Module¶
Classes¶
PickleFormatter (fileDescriptor, dataId, …) |
Interface for reading and writing Python objects to and from pickle files. |
Class Inheritance Diagram¶
lsst.daf.butler.formatters.yaml Module¶
Classes¶
YamlFormatter (fileDescriptor, dataId, …) |
Interface for reading and writing Python objects to and from YAML files. |
Class Inheritance Diagram¶
Database backends¶
lsst.daf.butler.registry.databases.sqlite Module¶
Classes¶
SqliteDatabase (*, connection, origin, …) |
An implementation of the Database interface for SQLite3. |
Class Inheritance Diagram¶
lsst.daf.butler.registry.databases.postgresql Module¶
Classes¶
PostgresqlDatabase (*, connection, origin, …) |
An implementation of the Database interface for PostgreSQL. |
Class Inheritance Diagram¶
Support API¶
lsst.daf.butler.core.utils Module¶
Functions¶
allSlots (self) |
Return combined __slots__ for all classes in the object's MRO. |
getClassOf (typeOrName, str]) |
Given the type name or a type, return the python type. |
getFullTypeName (cls) |
Return full type name of the supplied entity. |
getInstanceOf (typeOrName, str], *args, **kwargs) |
Given the type name or a type, instantiate an object of that type. |
immutable (cls) |
A class decorator that simulates a simple form of immutability for the decorated class. |
iterable (a) |
Make input iterable. |
safeMakeDir (directory) |
Make a directory in a manner that avoids race conditions. |
stripIfNotNone (s) |
Strip leading and trailing whitespace if the given object is not None. |
transactional (func) |
Decorator that wraps a method and makes it transactional. |
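The type-name helpers listed above (getFullTypeName, getClassOf, getInstanceOf) convert between a type and its dotted name. The sketch below shows one way such a round trip can work using only the standard library; the function names are hypothetical, and the real functions may handle cases this sketch ignores, such as nested classes.

```python
import importlib
from collections import OrderedDict


def full_type_name(cls: type) -> str:
    """Return 'module.QualName' for a class (hypothetical helper)."""
    return f"{cls.__module__}.{cls.__qualname__}"


def class_of(type_or_name):
    """Return the type itself, or import it from a dotted name."""
    if isinstance(type_or_name, str):
        # Split at the last dot; nested classes are not handled by this sketch.
        module_name, _, attr = type_or_name.rpartition(".")
        return getattr(importlib.import_module(module_name), attr)
    return type_or_name


def instance_of(type_or_name, *args, **kwargs):
    """Instantiate the resolved type with the given arguments."""
    return class_of(type_or_name)(*args, **kwargs)


name = full_type_name(OrderedDict)
```

Accepting either a type or a string lets configuration files refer to classes by name while Python callers pass the class directly.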
Class Inheritance Diagram¶
lsst.daf.butler.core.repoRelocation Module¶
Functions¶
replaceRoot (configRoot, butlerRoot, str, None]) |
Update a configuration root with the butler root location. |
Variables¶
BUTLER_ROOT_TAG |
The special string to be used in configuration files to indicate that the butler root location should be used. |
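The substitution described above can be sketched as a plain string replacement. This is a standalone illustration, not the library's implementation: `replace_root_tag` is a hypothetical name, and the tag value `"<butlerRoot>"` is an assumption; consult `BUTLER_ROOT_TAG` for the value actually used.

```python
from typing import Optional

# Assumed placeholder string; consult BUTLER_ROOT_TAG for the real value.
ROOT_TAG = "<butlerRoot>"


def replace_root_tag(config_root: str, butler_root: Optional[str]) -> str:
    """Substitute the butler root for the tag, if a root is known."""
    if butler_root is None:
        # Nothing to substitute; return the configured value unchanged.
        return config_root
    return config_root.replace(ROOT_TAG, butler_root)


expanded = replace_root_tag("<butlerRoot>/datastore", "/repo/main")
```

A configuration written with the tag stays relocatable: moving the repository changes only the butler root passed in, not the stored configuration.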
Test utilities¶
lsst.daf.butler.tests Package¶
Functions¶
addDataIdValue (butler, dimension, value, …) |
Add a new data ID to a repository. |
addDatasetType (butler, name, dimensions, …) |
Add a new dataset type to a repository. |
expandUniqueId (butler, partialId, Any]) |
Return a complete data ID matching some criterion. |
makeTestCollection (repo) |
Create a read/write Butler to a fresh collection. |
makeTestRepo (root, dataIds, …) |
Create an empty test repository. |
registerMetricsExample (butler) |
Modify a repository to support reading and writing MetricsExample objects. |
Classes¶
BadNoWriteFormatter (fileDescriptor, dataId, …) |
A formatter that always fails without writing anything. |
BadWriteFormatter (fileDescriptor, dataId, …) |
A formatter that never works but does leave a file behind. |
CliCmdTestBase |
A test case base that is used to verify click command functions import and call their respective script functions correctly. |
CliLogTestBase |
Tests log initialization, reset, and setting log levels. |
DatasetTestHelper |
Helper methods for Datasets. |
DatastoreMock |
Mocks a butler datastore. |
DatastoreTestHelper |
Helper methods for Datastore tests. |
DummyRegistry () |
Dummy Registry, for Datastore test purposes. |
ListDelegate (storageClass) |
Parameter handler for list parameters. |
MetricsDelegate (storageClass) |
Parameter handler for parameters using Metrics. |
MetricsExample ([summary, output, data]) |
Smorgasbord of information that might be the result of some processing. |
MultiDetectorFormatter (fileDescriptor, …) |