lsst.daf.butler
Using the Butler
This module provides an abstracted data access interface, known as the Butler. It can be used to read and write data without having to know the details of file formats or locations.
The Dimensions System
Butler Command-Line Reference
Design and Development
lsst.daf.butler is developed at https://github.com/lsst/daf_butler.
You can find Jira issues for this module under the daf_butler component.
Butler Command Line Interface Development
Python API reference
lsst.daf.butler Package
Functions
addDimensionForeignKey (tableSpec, dimension, …) |
Add a field and possibly a foreign key to a table specification. |
Classes
AbstractDatastoreCacheManager (config, …) |
An abstract base class for managing caching in a Datastore. |
AmbiguousDatasetError |
Raised when a DatasetRef is not resolved but should be. |
Butler (config=None, *, butler, …) |
Main entry point for the data access system. |
ButlerConfig (other, …) |
Contains the configuration for a Butler. |
ButlerURI |
Convenience wrapper around URI parsers. |
ButlerValidationError |
There is a problem with the Butler configuration. |
CollectionSearch |
An ordered search path of collections. |
CollectionType |
Enumeration used to label different types of collections. |
CompositesConfig ([other, validate, …]) |
Configuration specifics for Composites. |
CompositesMap (config, ButlerConfig, …) |
Determine whether something should be disassembled. |
Config ([other]) |
Implements a datatype that is used by Butler for configuration. |
ConfigSubset ([other, validate, …]) |
Config representing a subset of a more general configuration. |
Constraints (config, *, universe) |
Determine whether an entity is allowed to be handled. |
ConstraintsConfig ([other]) |
Configuration information for Constraints. |
ConstraintsValidationError |
Thrown when a constraints list has mutually exclusive definitions. |
DataCoordinate |
Data ID dictionary. |
DataCoordinateIterable |
An abstract base class for homogeneous iterables of data IDs. |
DataCoordinateSequence (dataIds, graph, *, …) |
Iterable supporting the full Sequence interface. |
DataCoordinateSet (dataIds, graph, *, …) |
Iterable supporting a set-like interface. |
DatabaseDimension (name, storage, *, …) |
A Dimension class that maps directly to a database table or query. |
DatabaseDimensionCombination (name, storage, …) |
A combination class that maps directly to a database table or query. |
DatabaseDimensionElement (name, storage, *, …) |
An intermediate base class for DimensionElement database classes. |
DatabaseTopologicalFamily (name, space, *, …) |
Database topological family implementation. |
DatasetAssociation (ref, collection, timespan) |
Class representing the membership of a dataset in a single collection. |
DatasetComponent (name, storageClass, component) |
Component of a dataset and associated information. |
DatasetIdGenEnum |
This enum is used to specify dataset ID generation options for the insert() method. |
DatasetRef (datasetType, dataId, *, id, …) |
Reference to a Dataset in a Registry. |
DatasetType (name, dimensions, …) |
A named category of Datasets. |
DatasetTypeNotSupportedError |
A DatasetType is not handled by this routine. |
Datastore (config, bridgeManager, …) |
Datastore interface. |
DatastoreCacheManager (config, …) |
A class for managing caching in a Datastore using local files. |
DatastoreCacheManagerConfig ([other, …]) |
Configuration information for DatastoreCacheManager. |
DatastoreConfig ([other, validate, …]) |
Configuration for Datastores. |
DatastoreDisabledCacheManager (config, …) |
A variant of the datastore cache where no cache is enabled. |
DatastoreValidationError |
There is a problem with the Datastore configuration. |
DeferredDatasetHandle (butler, ref, parameters) |
Proxy class that provides deferred loading of a dataset from a butler. |
Dimension |
A dimension. |
DimensionCombination |
Element with extra information. |
DimensionConfig (other, …) |
Configuration that defines a DimensionUniverse. |
DimensionElement |
A label and/or metadata in the dimensions system. |
DimensionGraph |
An immutable, dependency-complete collection of dimensions. |
DimensionPacker (fixed, dimensions) |
Class for going from DataCoordinate to packed integer ID and back. |
DimensionRecord (**kwargs) |
Base class for the Python representation of database records. |
DimensionUniverse |
Self-consistent set of dimensions. |
FileDataset (path, refs, …) |
A struct that represents a dataset exported to a file. |
FileDescriptor (location, storageClass, …) |
Describes a particular file. |
FileTemplate (template) |
Format a path template into a fully expanded path. |
FileTemplateValidationError |
Exception for file template inconsistent with associated DatasetType. |
FileTemplates (config, default, *, universe) |
Collection of FileTemplate templates. |
FileTemplatesConfig ([other]) |
Configuration information for FileTemplates. |
Formatter (fileDescriptor, dataId, …) |
Interface for reading and writing Datasets. |
FormatterFactory () |
Factory for Formatter instances. |
GovernorDimension (name, storage, *, …) |
Governor dimension. |
Location (datastoreRootUri, …) |
Identifies a location within the Datastore. |
LocationFactory (datastoreRoot) |
Factory for Location instances. |
LookupKey (name, dimensions, …) |
Representation of a key that can be used to look up information. |
MappingFactory (refType) |
Register the mapping of some key to a python type and retrieve instances. |
NameMappingSetView (mapping) |
A lightweight implementation of NamedValueAbstractSet. |
NamedKeyDict (*args) |
Dictionary wrapper for named keys. |
NamedKeyMapping |
Custom mapping class. |
NamedValueAbstractSet |
Custom sets with named elements. |
NamedValueMutableSet |
Mutable variant of NamedValueAbstractSet. |
NamedValueSet (elements) |
Custom mutable set class. |
Progress (name, level) |
Public interface for reporting incremental progress in the butler and related tools. |
PruneCollectionsArgsError |
Base class for errors relating to Butler.pruneCollections input arguments. |
PurgeUnsupportedPruneCollectionsError (…) |
Raised when purge is True but is not supported for the given collection. |
PurgeWithoutUnstorePruneCollectionsError () |
Raised when purge and unstore are both required to be True, and purge is True but unstore is False. |
Quantum (*, taskName, taskClass, dataId, …) |
Class representing a discrete unit of work. |
Registry |
Abstract Registry interface. |
RegistryConfig ([other, validate, …]) |
|
RunWithoutPurgePruneCollectionsError (…) |
Raised when pruning a RUN collection but purge is False. |
SerializedDataCoordinate |
Simplified model for serializing a DataCoordinate. |
SerializedDatasetRef |
Simplified model of a DatasetRef suitable for serialization. |
SerializedDatasetType |
Simplified model of a DatasetType suitable for serialization. |
SerializedDimensionGraph |
Simplified model of a DimensionGraph suitable for serialization. |
SerializedDimensionRecord |
Simplified model for serializing a DimensionRecord. |
SimpleQuery () |
A struct that combines SQLAlchemy objects. |
SkyPixDimension (system, level) |
Special dimension for sky pixelizations. |
SkyPixSystem (name, *, maxLevel, …) |
Class for hierarchical pixelization of the sky. |
SpatialRegionDatabaseRepresentation (column, name) |
Class reflecting how spatial regions are represented inside the DB. |
StorageClass (name, pytype, …) |
Class describing how a label maps to a particular Python type. |
StorageClassConfig ([other, validate, …]) |
Configuration class for defining Storage Classes. |
StorageClassDelegate (storageClass) |
Delegate class for StorageClass components and parameters. |
StorageClassFactory (config=None) |
Factory for StorageClass instances. |
StoredDatastoreItemInfo |
Internal information associated with a stored dataset in a Datastore. |
StoredFileInfo (formatter, …) |
Datastore-private metadata associated with a Datastore file. |
Timespan (begin, …) |
A half-open time interval with nanosecond precision. |
TimespanDatabaseRepresentation |
Representation of a time span within a database engine. |
TopologicalExtentDatabaseRepresentation |
Mapping of in-memory representation of a region to DB representation. |
TopologicalFamily (name, space) |
A grouping of TopologicalRelationshipEndpoint objects. |
TopologicalRelationshipEndpoint |
Representation of a logical table that can participate in overlap joins. |
TopologicalSpace |
Enumeration of continuous-variable relationships for dimensions. |
ValidationError |
Some sort of validation error has occurred. |
YamlRepoExportBackend (stream) |
A repository export implementation that saves to a YAML file. |
YamlRepoImportBackend (stream, registry) |
A repository import implementation that reads from a YAML file. |
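The Timespan entry in the table above describes a half-open time interval. The following is a minimal sketch of what half-open semantics imply for containment and overlap. It uses a hypothetical HalfOpenSpan class with integer nanosecond bounds; the class and its method names are assumptions made for illustration and are not the lsst.daf.butler implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HalfOpenSpan:
    """Hypothetical stand-in for a half-open interval [begin, end) in integer nanoseconds."""

    begin: int
    end: int

    def contains(self, t: int) -> bool:
        # The lower bound is included; the upper bound is excluded.
        return self.begin <= t < self.end

    def overlaps(self, other: "HalfOpenSpan") -> bool:
        # Two half-open intervals overlap only if each begins before the other ends.
        return self.begin < other.end and other.begin < self.end


# Two back-to-back one-second spans that share a single boundary value.
first = HalfOpenSpan(0, 1_000_000_000)
second = HalfOpenSpan(1_000_000_000, 2_000_000_000)
```

Because the upper bound is excluded, adjacent spans that share an endpoint do not overlap, so consecutive validity ranges can tile the time axis without claiming the boundary twice.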
Class Inheritance Diagram
lsst.daf.butler.registry Package
Classes
CollectionSearch |
An ordered search path of collections. |
CollectionType |
Enumeration used to label different types of collections. |
ConflictingDefinitionError |
Exception raised when trying to insert a database record when a conflicting record already exists. |
DatasetIdGenEnum |
This enum is used to specify dataset ID generation options for the insert() method. |
DbAuth (path, envVar, authList=None) |
Retrieves authentication information for database connections. |
DbAuthError |
A problem has occurred retrieving database authentication information. |
DbAuthPermissionsError |
Credentials file has incorrect permissions. |
InconsistentDataIdError |
Exception raised when a data ID contains contradictory key-value pairs, according to dimension relationships. |
MissingCollectionError |
Exception raised when an operation attempts to use a collection that does not exist. |
OrphanedRecordError |
Exception raised when trying to remove or modify a database record that is still being used in some other table. |
Registry |
Abstract Registry interface. |
RegistryConfig ([other, validate, …]) |
|
RegistryDefaults (collections, run, infer, …) |
A struct used to provide the default collections searched or written to by a Registry or Butler instance. |
UnsupportedIdGeneratorError |
Exception raised when an unsupported DatasetIdGenEnum option is used for insert/import. |
Class Inheritance Diagram
lsst.daf.butler.registry.interfaces Package
Classes
ButlerAttributeExistsError |
Exception raised when trying to update existing attribute without specifying force option. |
ButlerAttributeManager |
An interface for managing butler attributes in a Registry. |
ChainedCollectionRecord (key, name, universe) |
A subclass of CollectionRecord that adds the list of child collections in a CHAINED collection. |
CollectionManager |
An interface for managing the collections (including runs) in a Registry. |
CollectionRecord (key, name, type) |
A struct used to represent a collection in internal Registry APIs. |
Database (*, origin, engine, namespace) |
An abstract interface that represents a particular database engine’s representation of a single schema/namespace/database. |
DatabaseConflictError |
Exception raised when database content (row values or schema entities) is inconsistent with what this client expects. |
DatabaseDimensionOverlapStorage |
A base class for objects that manage overlaps between a pair of database-backed dimensions. |
DatabaseDimensionRecordStorage |
Intermediate interface for DimensionRecordStorage objects that provide storage for DatabaseDimensionElement instances. |
DatasetIdGenEnum |
This enum is used to specify dataset ID generation options for the insert() method. |
DatasetRecordStorage (datasetType) |
An interface that manages the records associated with a particular DatasetType. |
DatasetRecordStorageManager |
An interface that manages the tables that describe datasets. |
DatastoreRegistryBridge (datastoreName) |
An abstract base class that defines the interface that a Datastore uses to communicate with a Registry. |
DatastoreRegistryBridgeManager (*, opaque, …) |
An abstract base class that defines the interface between Registry and Datastore when a new Datastore is constructed. |
DimensionRecordStorage |
An abstract base class that represents a way of storing the records associated with a single DimensionElement. |
DimensionRecordStorageManager (*, universe) |
An interface for managing the dimension records in a Registry. |
FakeDatasetRef (id) |
A fake DatasetRef that can be used internally by butler where only the dataset ID is available. |
GovernorDimensionRecordStorage |
Intermediate interface for DimensionRecordStorage objects that provide storage for GovernorDimension instances. |
OpaqueTableStorage (name) |
An interface that manages the records associated with a particular opaque table in a Registry. |
OpaqueTableStorageManager |
An interface that manages the opaque tables in a Registry. |
ReadOnlyDatabaseError |
Exception raised when a write operation is called on a read-only Database. |
RunRecord (key, name, type) |
A subclass of CollectionRecord that adds execution information and an interface for updating it. |
SchemaAlreadyDefinedError |
Exception raised when trying to initialize database schema when some tables already exist. |
SkyPixDimensionRecordStorage |
Intermediate interface for DimensionRecordStorage objects that provide storage for SkyPixDimension instances. |
StaticTablesContext (db) |
Helper class used to declare the static schema for a registry layer in a database. |
VersionTuple |
Class representing a version number. |
VersionedExtension |
Interface for extension classes with versions. |
Class Inheritance Diagram
lsst.daf.butler.registry.queries Package
Classes
ChainedDatasetQueryResults (chain) |
A DatasetQueryResults implementation that simply chains together other results objects, each for a different parent dataset type. |
DataCoordinateQueryResults (db, query, *, …) |
An enhanced implementation of DataCoordinateIterable that represents data IDs retrieved from a database query. |
DatasetQueryResults |
An interface for objects that represent the results of queries for datasets. |
ParentDatasetQueryResults (db, query, *, …) |
An object that represents results from a query for datasets with a single parent DatasetType. |
Query (*, graph, whereRegion, managers) |
An abstract base class for queries that return some combination of DatasetRef and DataCoordinate objects. |
QueryBuilder (summary, managers) |
A builder for potentially complex queries that join tables based on dimension relationships. |
QuerySummary (requested, *, dataId, …) |
A struct that holds and categorizes the dimensions involved in a query. |
RegistryManagers (collections, datasets, …) |
Struct used to pass around the manager objects that back a Registry and are used internally by the query system. |
Class Inheritance Diagram
lsst.daf.butler.registry.wildcards Module
Classes
CategorizedWildcard (strings, patterns, …) |
The results of preprocessing a wildcard expression to separate match patterns from strings. |
CollectionQuery (search, …) |
An unordered query for collections and dataset type restrictions. |
CollectionSearch |
An ordered search path of collections. |
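The CategorizedWildcard entry above refers to preprocessing a wildcard expression so that match patterns are separated from plain strings. The following is a rough sketch of that idea, not the library's code: the categorize function and the example collection names are hypothetical.

```python
import re
from typing import Iterable, List, Pattern, Tuple, Union


def categorize(
    expression: Iterable[Union[str, Pattern[str]]],
) -> Tuple[List[str], List[Pattern[str]]]:
    """Hypothetical helper: split a wildcard expression into literal strings and patterns."""
    strings: List[str] = []
    patterns: List[Pattern[str]] = []
    for item in expression:
        if isinstance(item, re.Pattern):
            # Compiled regular expressions are kept for later matching.
            patterns.append(item)
        else:
            # Plain strings are treated as exact names.
            strings.append(item)
    return strings, patterns


# Illustrative collection names only; they do not refer to a real repository.
strings, patterns = categorize(["HSC/defaults", re.compile(r"u/.+/run\d+")])
```

Keeping the two categories apart lets exact names be looked up directly, while patterns are matched against the candidate names afterwards.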
Class Inheritance Diagram
Example datastores
lsst.daf.butler.datastores.chainedDatastore Module
Classes
ChainedDatastore (config, …) |
Chains datastores to allow reads and writes across multiple datastores. |
Class Inheritance Diagram
lsst.daf.butler.datastores.inMemoryDatastore Module
Classes
StoredMemoryItemInfo (timestamp, …) |
Internal InMemoryDatastore Metadata associated with a stored DatasetRef. |
InMemoryDatastore (config, …) |
Basic Datastore for writing to an in-memory cache. |
Class Inheritance Diagram
lsst.daf.butler.datastores.fileDatastore Module
Classes
FileDatastore (config, bridgeManager, …) |
Generic Datastore for file-based implementations. |
Class Inheritance Diagram
Example formatters
lsst.daf.butler.formatters.file Module
Classes
FileFormatter (fileDescriptor, dataId, …) |
Interface for reading and writing files on a POSIX file system. |
Class Inheritance Diagram
lsst.daf.butler.formatters.json Module
Classes
JsonFormatter (fileDescriptor, dataId, …) |
Interface for reading and writing Python objects to and from JSON files. |
Class Inheritance Diagram
lsst.daf.butler.formatters.matplotlib Module
Classes
MatplotlibFormatter (fileDescriptor, dataId, …) |
Interface for writing matplotlib figures. |
Class Inheritance Diagram
lsst.daf.butler.formatters.parquet Module
Classes
ParquetFormatter (fileDescriptor, dataId, …) |
Interface for reading and writing Pandas DataFrames to and from Parquet files. |
Class Inheritance Diagram
lsst.daf.butler.formatters.pickle Module
Classes
PickleFormatter (fileDescriptor, dataId, …) |
Interface for reading and writing Python objects to and from pickle files. |
Class Inheritance Diagram
lsst.daf.butler.formatters.yaml Module
Classes
YamlFormatter (fileDescriptor, dataId, …) |
Interface for reading and writing Python objects to and from YAML files. |
Class Inheritance Diagram
Database backends
lsst.daf.butler.registry.databases.sqlite Module
Classes
SqliteDatabase (*, engine, origin, namespace, …) |
An implementation of the Database interface for SQLite3. |
Class Inheritance Diagram
lsst.daf.butler.registry.databases.postgresql Module
Classes
PostgresqlDatabase (*, engine, origin, …) |
An implementation of the Database interface for PostgreSQL. |
Class Inheritance Diagram
Support API
lsst.daf.butler.core.utils Module
Functions
allSlots (self) |
Return combined __slots__ for all classes in the object's MRO. |
getClassOf (typeOrName) |
Given the type name or a type, return the python type. |
getFullTypeName (cls) |
Return full type name of the supplied entity. |
getInstanceOf (typeOrName, *args, **kwargs) |
Given the type name or a type, instantiate an object of that type. |
immutable (cls) |
Decorate a class to simulate a simple form of immutability. |
iterable (a) |
Make input iterable. |
safeMakeDir (directory) |
Make a directory in a manner avoiding race conditions. |
stripIfNotNone (s) |
Strip leading and trailing whitespace if the given object is not None. |
transactional (func) |
Decorate a method to make it transactional. |
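The one-line summary of iterable in the table above ("Make input iterable") leaves open how strings and non-iterable values are handled. The following sketch shows one plausible reading, under the stated assumption that strings and mappings count as single values. The function iterable_sketch is hypothetical and is not the library's code.

```python
from collections.abc import Mapping
from typing import Any, Iterator


def iterable_sketch(a: Any) -> Iterator[Any]:
    """Illustrative only: yield the elements of ``a``, or ``a`` itself as a single value."""
    # Assumption: strings and mappings are treated as single values,
    # not iterated character by character or key by key.
    if isinstance(a, (str, Mapping)):
        yield a
        return
    try:
        yield from a
    except TypeError:
        # The input is not iterable, so present it as a one-element sequence.
        yield a


names = list(iterable_sketch("raw"))
numbers = list(iterable_sketch([1, 2, 3]))
single = list(iterable_sketch(42))
```

A helper of this kind lets calling code accept either one value or a collection of values without special-casing each call site.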
Class Inheritance Diagram
lsst.daf.butler.core.repoRelocation Module
Functions
replaceRoot (configRoot, butlerRoot) |
Update a configuration root with the butler root location. |
Variables
BUTLER_ROOT_TAG |
The special string to be used in configuration files to indicate that the butler root location should be used. |
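The replaceRoot and BUTLER_ROOT_TAG entries above describe substituting a placeholder in a configuration value with the butler root location. The following is a minimal sketch of that substitution. The tag literal, the function name replace_root_sketch, and the example paths are assumptions for illustration; the actual tag value is whatever lsst.daf.butler defines.

```python
from typing import Optional

# Assumption: a placeholder literal chosen for this sketch; the real value
# of BUTLER_ROOT_TAG is defined by lsst.daf.butler.
ROOT_TAG = "<butlerRoot>"


def replace_root_sketch(config_root: str, butler_root: Optional[str]) -> str:
    """Substitute the placeholder tag in ``config_root`` with ``butler_root``, if given."""
    if butler_root is None:
        # Without a butler root there is nothing to substitute.
        return config_root
    return config_root.replace(ROOT_TAG, str(butler_root))


# Hypothetical paths used only to show the substitution.
expanded = replace_root_sketch("<butlerRoot>/datastore", "/repo/main")
```

Writing configuration values relative to a tag in this way allows a repository to be moved without editing every path stored in its configuration.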
Test utilities
lsst.daf.butler.tests Package
Functions
addDataIdValue (butler, dimension, value, …) |
Add a new data ID to a repository. |
addDatasetType (butler, name, dimensions, …) |
Add a new dataset type to a repository. |
expandUniqueId (butler, partialId) |
Return a complete data ID matching some criterion. |
makeTestCollection (repo, uniqueId) |
Create a read/write Butler to a fresh collection. |
makeTestRepo (root, dataIds, …) |
Create an empty test repository. |
registerMetricsExample (butler) |
Modify a repository to support reading and writing MetricsExample objects. |
Classes
BadNoWriteFormatter (fileDescriptor, dataId, …) |
A formatter that always fails without writing anything. |
BadWriteFormatter (fileDescriptor, dataId, …) |
A formatter that never works but does leave a file behind. |
CliCmdTestBase |
A test case base that is used to verify click command functions import and call their respective script functions correctly. |
CliLogTestBase |
Tests log initialization, reset, and setting log levels. |
DatasetTestHelper |
Helper methods for Datasets. |
DatastoreMock |
Mocks a butler datastore. |
DatastoreTestHelper |
Helper methods for Datastore tests. |
DummyRegistry () |
Dummy Registry, for Datastore test purposes. |
ListDelegate (storageClass) |
Parameter handler for list parameters. |
MetricsDelegate (storageClass) |
Parameter handler for parameters using Metrics. |
MetricsExample ([summary, output, data]) |
Smorgasbord of information that might be the result of some processing. |
MultiDetectorFormatter (fileDescriptor, …) |