lsst.daf.butler

This module provides an abstracted data access interface, known as the Butler. It can be used to read and write data without having to know the details of file formats or locations.

Contributing

lsst.daf.butler is developed at https://github.com/lsst/daf_butler. You can find Jira issues for this module under the daf_butler component.

Command Line Scripts

makeButlerRepo.py

Create an empty Gen3 Butler repository.

usage: makeButlerRepo.py [-h] [-c CONFIG] [--standalone] [--outfile OUTFILE]
                         [--verbose] [--override]
                         root

positional arguments

root

Filesystem path for the new repository. Will be created if it does not exist.

optional arguments

-h, --help

show this help message and exit

-c <config>, --config <config>

Path to an existing YAML config file to apply (on top of defaults).

--standalone

Include all defaults in the config file in the repo, insulating the repo from changes in package defaults.

--outfile <outfile>, -f <outfile>

Name of output file to receive repository configuration. Default is to write butler.yaml into the specified root.

--verbose, -v

Turn on debug reporting.

--override, -o

Allow values in the supplied config to override any root settings.
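
The option list above can be sketched as an argparse parser. This is only an illustration of the documented interface, not the script's actual source; the function name build_parser and the example path are assumptions made for this sketch.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented makeButlerRepo.py options."""
    parser = argparse.ArgumentParser(
        prog="makeButlerRepo.py",
        description="Create an empty Gen3 Butler repository.",
    )
    parser.add_argument("root", help="Filesystem path for the new repository.")
    parser.add_argument("-c", "--config", default=None,
                        help="Existing YAML config file to apply on top of defaults.")
    parser.add_argument("--standalone", action="store_true",
                        help="Include all defaults in the repo config file.")
    parser.add_argument("--outfile", "-f", default=None,
                        help="Output file for the repository configuration.")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Turn on debug reporting.")
    parser.add_argument("--override", "-o", action="store_true",
                        help="Let the supplied config override root settings.")
    return parser


# Parse an example command line (the path is an illustrative assumption).
args = build_parser().parse_args(["--standalone", "-v", "/tmp/example_repo"])
print(args.root, args.standalone, args.verbose, args.outfile)
```

Flags that take no value are modelled with store_true, so they default to False when omitted.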

dumpButlerConfig.py

Dump either a subset or full Butler configuration to standard output.

usage: dumpButlerConfig.py [-h] [--subset SUBSET] [--searchpath SEARCHPATH]
                           [--verbose]
                           root

positional arguments

root

Filesystem path for an existing Butler repository or path to config file.

optional arguments

-h, --help

show this help message and exit

--subset <subset>, -s <subset>

Subset of a configuration to report. This can be any key in the hierarchy, such as ‘.datastore.root’, where the leading ‘.’ specifies the delimiter for the hierarchy.

--searchpath <searchpath>, -p <searchpath>

Additional search paths to use for configuration overrides.

--verbose, -v

Turn on debug reporting.
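
The --subset key convention, in which a leading non-alphanumeric character names the delimiter, might be interpreted roughly as follows. This is a minimal sketch over a plain nested dict; split_key, get_subset, and the sample configuration are hypothetical and do not represent the Config class itself.

```python
from typing import Any, List, Mapping


def split_key(key: str) -> List[str]:
    """Split a hierarchical key, treating a leading non-alphanumeric
    character as the delimiter (assumed default: '.')."""
    delimiter = "."
    if key and not key[0].isalnum():
        delimiter, key = key[0], key[1:]
    return key.split(delimiter)


def get_subset(config: Mapping[str, Any], key: str) -> Any:
    """Walk a nested mapping following the components of ``key``."""
    node: Any = config
    for part in split_key(key):
        node = node[part]
    return node


# Hypothetical configuration contents, for illustration only.
config = {"datastore": {"root": "/data/repo", "checksum": True}}
print(get_subset(config, ".datastore.root"))
print(get_subset(config, "/datastore/checksum"))
```

Letting the key carry its own delimiter means keys whose components contain a '.' can still be addressed by choosing a different leading character.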

validateButlerConfiguration.py

Validate the configuration files for a Gen3 Butler repository.

usage: validateButlerConfiguration.py [-h] [--collection COLLECTION] [--quiet]
                                      [--datasettype DATASETTYPE]
                                      [--ignore IGNORE]
                                      root

positional arguments

root

Filesystem path for an existing Butler repository.

optional arguments

-h, --help

show this help message and exit

--collection <collection>, -c <collection>

Collection to refer to in this repository.

--quiet, -q

Do not report individual failures.

--datasettype <datasettype>, -d <datasettype>

Specific DatasetType(s) to validate (can be comma-separated).

--ignore <ignore>, -i <ignore>

DatasetType(s) to ignore for validation (can be comma-separated).
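
One way the comma-separated --datasettype and --ignore values could be combined is sketched below. The helper names and dataset type names are hypothetical, and the rule that ignored names win over requested ones is an assumption of this sketch rather than documented behaviour.

```python
from typing import Iterable, List, Optional


def parse_csv(value: Optional[str]) -> List[str]:
    """Split a comma-separated option value into trimmed, non-empty names."""
    if not value:
        return []
    return [name.strip() for name in value.split(",") if name.strip()]


def select_dataset_types(available: Iterable[str],
                         wanted: Optional[str] = None,
                         ignored: Optional[str] = None) -> List[str]:
    """Keep requested names (all if none requested), then drop ignored ones."""
    keep = set(parse_csv(wanted))
    skip = set(parse_csv(ignored))
    return [name for name in available
            if (not keep or name in keep) and name not in skip]


# Hypothetical dataset type names.
names = ["raw", "calexp", "src"]
print(select_dataset_types(names, wanted="raw, calexp", ignored="calexp"))
```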

Concrete Storage Classes

Python API reference

lsst.daf.butler Package

Functions

makeBoxWcsRegion(box, wcs, margin) Construct a spherical ConvexPolygon from a WCS and a bounding box.

Classes

Butler([config, butler, …]) Main entry point for the data access system.
ButlerConfig([other, searchPaths]) Contains the configuration for a Butler
ButlerURI(uri[, root, forceAbsolute]) Convenience wrapper around URI parsers.
ButlerValidationError There is a problem with the Butler configuration.
CompositeAssembler(storageClass) Class for providing assembler and disassembler support for composites.
CompositesConfig([other, validate, …])
CompositesMap(config, *, universe) Determine whether a specific datasetType or StorageClass should be disassembled.
Config([other]) Implements a datatype that is used by Butler for configuration parameters.
ConfigSubset([other, validate, …]) Config representing a subset of a more general configuration.
Constraints(config, *, universe) Determine whether a DatasetRef, DatasetType, or StorageClass is allowed to be handled.
ConstraintsConfig([other]) Configuration information for Constraints
ConstraintsValidationError Exception thrown when a constraints list has mutually exclusive definitions.
DataCoordinate An immutable data ID dictionary that guarantees that its key-value pairs identify all required dimensions in a DimensionGraph.
DatasetComponent(name, storageClass, component) Component of a dataset and associated information.
DatasetRef Reference to a Dataset in a Registry.
DatasetType(name, dimensions, storageClass, *) A named category of Datasets that defines how they are organized, related, and stored.
DatasetTypeNotSupportedError A DatasetType is not handled by this routine.
Datastore(config, registry[, butlerRoot]) Datastore interface.
DatastoreConfig([other, validate, …])
DatastoreValidationError There is a problem with the Datastore configuration.
DeferredDatasetHandle(butler, ref, parameters) Proxy class that provides deferred loading of a dataset from a butler.
Dimension(name, *, uniqueKeys, **kwds) A named data-organization concept that can be used as a key in a data ID.
DimensionConfig([other, validate, …]) Configuration that defines a DimensionUniverse.
DimensionElement(name, *, …) A named data-organization concept that defines a label and/or metadata in the dimensions system.
DimensionGraph An immutable, dependency-complete collection of dimensions.
DimensionPacker(fixed, dimensions) An abstract base class for bidirectional mappings between a DataCoordinate and a packed integer ID.
DimensionRecord(*args) Base class for the Python representation of database records for a DimensionElement.
DimensionUniverse A special DimensionGraph that constructs and manages a complete set of compatible dimensions.
ExpandedDataCoordinate A data ID that has been expanded to include all relevant metadata.
FileDataset(path, refs, …) A struct that represents a dataset exported to a file.
FileDescriptor(location, storageClass[, …]) Describes a particular file.
FileTemplate(template) Format a path template into a fully expanded path.
FileTemplateValidationError Exception thrown when a file template is not consistent with the associated DatasetType.
FileTemplates(config[, default]) Collection of FileTemplate templates.
FileTemplatesConfig([other]) Configuration information for FileTemplates
Formatter(fileDescriptor, dataId) Interface for reading and writing Datasets with a particular StorageClass.
FormatterFactory() Factory for Formatter instances.
Location(datastoreRootUri, path) Identifies a location within the Datastore.
LocationFactory(datastoreRoot) Factory for Location instances.
LookupKey([name, dimensions, dataId, universe]) Representation of a key that can be used to look up information based on dataset type name, storage class name, or dimensions.
MappingFactory(refType) Register the mapping of some key to a python type and retrieve instances.
Quantum(*[, taskName, taskClass, dataId, …]) A discrete unit of work that may depend on one or more datasets and produces one or more datasets.
Registry(database, dimensions, *, opaque, create) Registry interface.
RepoExport(registry, datastore, backend, *, …) Public interface for exporting a subset of a data repository.
RepoExportBackend An abstract interface for data repository export implementations.
RepoImportBackend An abstract interface for data repository import implementations.
RepoTransferFormatConfig([other, validate, …]) The section of butler configuration that associates repo import/export backends with file formats.
SkyPixDimension(name, pixelization) A special Dimension subclass for hierarchical pixelizations of the sky.
StorageClass([name, pytype, components, …]) Class describing how a label maps to a particular Python type.
StorageClassConfig([other, validate, …])
StorageClassFactory([config]) Factory for StorageClass instances.
StoredDatastoreItemInfo Internal information associated with a stored dataset in a Datastore.
StoredFileInfo(formatter, path, …) Datastore-private metadata associated with a file stored in a Datastore.
Timespan
ValidationError Some sort of validation error has occurred.
YamlRepoExportBackend(stream) A repository export implementation that saves to a YAML file.
YamlRepoImportBackend(stream, registry) A repository import implementation that reads from a YAML file.
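
To illustrate the idea behind DimensionPacker, a bidirectional mapping between a data ID and a packed integer, here is a minimal mixed-radix sketch. MixedRadixPacker, its constructor argument, and the dimension sizes are hypothetical; the sketch uses plain dicts rather than DataCoordinate and is not the library's interface.

```python
from typing import Dict


class MixedRadixPacker:
    """Pack a data ID with bounded integer values into one integer and back."""

    def __init__(self, sizes: Dict[str, int]) -> None:
        # Maps each dimension name to the number of values it can take.
        self._sizes = dict(sizes)

    def pack(self, data_id: Dict[str, int]) -> int:
        """Combine the values, most significant dimension first."""
        packed = 0
        for name, size in self._sizes.items():
            value = data_id[name]
            if not 0 <= value < size:
                raise ValueError(f"{name}={value} outside [0, {size})")
            packed = packed * size + value
        return packed

    def unpack(self, packed: int) -> Dict[str, int]:
        """Recover the values by peeling off the least significant first."""
        values = {}
        for name, size in reversed(list(self._sizes.items())):
            packed, values[name] = divmod(packed, size)
        return values


# Hypothetical dimension names and sizes.
packer = MixedRadixPacker({"visit": 1000, "detector": 200})
data_id = {"visit": 3, "detector": 7}
packed = packer.pack(data_id)
print(packed, packer.unpack(packed) == data_id)
```

Because every value is range-checked before packing, unpack(pack(x)) returns x for any valid data ID.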

Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler._butler.Butler, lsst.daf.butler._butlerConfig.ButlerConfig, lsst.daf.butler.core.location.ButlerURI, lsst.daf.butler._butler.ButlerValidationError, lsst.daf.butler.core.assembler.CompositeAssembler, lsst.daf.butler.core.composites.CompositesConfig, lsst.daf.butler.core.composites.CompositesMap, lsst.daf.butler.core.config.Config, lsst.daf.butler.core.config.ConfigSubset, lsst.daf.butler.core.constraints.Constraints, lsst.daf.butler.core.constraints.ConstraintsConfig, lsst.daf.butler.core.constraints.ConstraintsValidationError, lsst.daf.butler.core.dimensions.coordinate.DataCoordinate, lsst.daf.butler.core.assembler.DatasetComponent, lsst.daf.butler.core.datasets.ref.DatasetRef, lsst.daf.butler.core.datasets.type.DatasetType, lsst.daf.butler.core.exceptions.DatasetTypeNotSupportedError, lsst.daf.butler.core.datastore.Datastore, lsst.daf.butler.core.datastore.DatastoreConfig, lsst.daf.butler.core.datastore.DatastoreValidationError, lsst.daf.butler._deferredDatasetHandle.DeferredDatasetHandle, lsst.daf.butler.core.dimensions.elements.Dimension, lsst.daf.butler.core.dimensions.config.DimensionConfig, lsst.daf.butler.core.dimensions.elements.DimensionElement, lsst.daf.butler.core.dimensions.graph.DimensionGraph, lsst.daf.butler.core.dimensions.packer.DimensionPacker, lsst.daf.butler.core.dimensions.records.DimensionRecord, lsst.daf.butler.core.dimensions.universe.DimensionUniverse, lsst.daf.butler.core.dimensions.coordinate.ExpandedDataCoordinate, lsst.daf.butler.core.repoTransfers.FileDataset, lsst.daf.butler.core.fileDescriptor.FileDescriptor, lsst.daf.butler.core.fileTemplates.FileTemplate, lsst.daf.butler.core.fileTemplates.FileTemplateValidationError, lsst.daf.butler.core.fileTemplates.FileTemplates, lsst.daf.butler.core.fileTemplates.FileTemplatesConfig, lsst.daf.butler.core.formatter.Formatter, lsst.daf.butler.core.formatter.FormatterFactory, lsst.daf.butler.core.location.Location, 
lsst.daf.butler.core.location.LocationFactory, lsst.daf.butler.core.configSupport.LookupKey, lsst.daf.butler.core.mappingFactory.MappingFactory, lsst.daf.butler.core.quantum.Quantum, lsst.daf.butler.registry._registry.Registry, lsst.daf.butler.core.repoTransfers.RepoExport, lsst.daf.butler.core.repoTransfers.RepoExportBackend, lsst.daf.butler.core.repoTransfers.RepoImportBackend, lsst.daf.butler.core.repoTransfers.RepoTransferFormatConfig, lsst.daf.butler.core.dimensions.elements.SkyPixDimension, lsst.daf.butler.core.storageClass.StorageClass, lsst.daf.butler.core.storageClass.StorageClassConfig, lsst.daf.butler.core.storageClass.StorageClassFactory, lsst.daf.butler.core.storedFileInfo.StoredDatastoreItemInfo, lsst.daf.butler.core.storedFileInfo.StoredFileInfo, lsst.daf.butler.core.timespan.Timespan, lsst.daf.butler.core.exceptions.ValidationError, lsst.daf.butler.core.repoTransfers.YamlRepoExportBackend, lsst.daf.butler.core.repoTransfers.YamlRepoImportBackend

lsst.daf.butler.registry Package

Classes

AmbiguousDatasetError Exception raised when a DatasetRef has no ID and a Registry operation requires one.
ConflictingDefinitionError Exception raised when trying to insert a database record when a conflicting record already exists.
DbAuth([path, envVar, authList]) Retrieves authentication information for database connections.
DbAuthError A problem has occurred retrieving database authentication information.
DbAuthPermissionsError Credentials file has incorrect permissions.
OrphanedRecordError Exception raised when trying to remove or modify a database record that is still being used in some other table.
Registry(database, dimensions, *, opaque, create) Registry interface.
RegistryConfig([other, validate, …])

Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.registry._registry.AmbiguousDatasetError, lsst.daf.butler.registry._registry.ConflictingDefinitionError, lsst.daf.butler.registry._dbAuth.DbAuth, lsst.daf.butler.registry._dbAuth.DbAuthError, lsst.daf.butler.registry._dbAuth.DbAuthPermissionsError, lsst.daf.butler.registry._registry.OrphanedRecordError, lsst.daf.butler.registry._registry.Registry, lsst.daf.butler.registry._config.RegistryConfig

lsst.daf.butler.registry.interfaces Package

Classes

Database(*, origin, connection, namespace) An abstract interface that represents a particular database engine’s representation of a single schema/namespace/database.
DatabaseConflictError Exception raised when database content (row values or schema entities) are inconsistent with what this client expects.
OpaqueTableStorage(name) An interface that manages the records associated with a particular opaque table in a Registry.
OpaqueTableStorageManager An interface that manages the opaque tables in a Registry.
ReadOnlyDatabaseError Exception raised when a write operation is called on a read-only Database.
StaticTablesContext(db) Helper class used to declare the static schema for a registry layer in a database.

Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.registry.interfaces._database.Database, lsst.daf.butler.registry.interfaces._database.DatabaseConflictError, lsst.daf.butler.registry.interfaces._opaque.OpaqueTableStorage, lsst.daf.butler.registry.interfaces._opaque.OpaqueTableStorageManager, lsst.daf.butler.registry.interfaces._database.ReadOnlyDatabaseError, lsst.daf.butler.registry.interfaces._database.StaticTablesContext

lsst.daf.butler.registry.queries Package

Classes

DatasetRegistryStorage(connection, universe, …) An object managing dataset and related tables in a Registry.
Like(pattern) Simple wrapper around a string pattern used to indicate that a string is a pattern to be used with the SQL LIKE operator rather than a complete name.
Query(*, connection, sql, summary, columns, …) A wrapper for a SQLAlchemy query that knows how to re-bind parameters and transform result rows into data IDs and dataset references.
QueryBuilder(connection, summary, …) A builder for potentially complex queries that join tables based on dimension relationships.
QuerySummary(requested, *, dataId, …) A struct that holds and categorizes the dimensions involved in a query.

Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.registry.queries._datasets.DatasetRegistryStorage, lsst.daf.butler.registry.queries._datasets.Like, lsst.daf.butler.registry.queries._query.Query, lsst.daf.butler.registry.queries._builder.QueryBuilder, lsst.daf.butler.registry.queries._structs.QuerySummary
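
The distinction that Like draws between a complete name and a SQL LIKE pattern can be illustrated with the standard library sqlite3 module. LikePattern is a stand-in defined for this sketch, and the table and dataset type names are hypothetical; none of this is the registry's query code.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class LikePattern:
    """Marks a string as a SQL LIKE pattern rather than a complete name."""
    pattern: str


def find_names(conn: sqlite3.Connection,
               name: Union[str, LikePattern]) -> List[str]:
    """Match exactly for plain strings and with LIKE for wrapped patterns."""
    if isinstance(name, LikePattern):
        sql = "SELECT name FROM dataset_type WHERE name LIKE ? ORDER BY name"
        arg = name.pattern
    else:
        sql = "SELECT name FROM dataset_type WHERE name = ? ORDER BY name"
        arg = name
    return [row[0] for row in conn.execute(sql, (arg,))]


# In-memory database with a few hypothetical dataset type names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset_type (name TEXT)")
conn.executemany("INSERT INTO dataset_type VALUES (?)",
                 [("calexp",), ("calexp_background",), ("raw",)])
print(find_names(conn, "calexp"))
print(find_names(conn, LikePattern("calexp%")))
```

Wrapping the pattern in its own type keeps a literal '%' in an ordinary name from being treated as a wildcard by accident.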

Example datastores

lsst.daf.butler.datastores.posixDatastore Module

Classes
PosixDatastore(config, registry[, butlerRoot]) Basic POSIX filesystem backed Datastore.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.datastores.posixDatastore.PosixDatastore

lsst.daf.butler.datastores.inMemoryDatastore Module

Classes
StoredMemoryItemInfo(timestamp, …) Internal InMemoryDatastore Metadata associated with a stored DatasetRef.
InMemoryDatastore(config[, registry, butlerRoot]) Basic Datastore for writing to an in-memory cache.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.datastores.inMemoryDatastore.StoredMemoryItemInfo, lsst.daf.butler.datastores.inMemoryDatastore.InMemoryDatastore

lsst.daf.butler.datastores.chainedDatastore Module

Classes
ChainedDatastore(config[, registry, butlerRoot]) Chained Datastores to allow reads and writes from multiple datastores.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.datastores.chainedDatastore.ChainedDatastore

Example formatters

lsst.daf.butler.formatters.fileFormatter Module

Classes
FileFormatter(fileDescriptor, dataId) Interface for reading and writing files on a POSIX file system.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.formatters.fileFormatter.FileFormatter

lsst.daf.butler.formatters.jsonFormatter Module

Classes
JsonFormatter(fileDescriptor, dataId) Interface for reading and writing Python objects to and from JSON files.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.formatters.jsonFormatter.JsonFormatter

lsst.daf.butler.formatters.yamlFormatter Module

Classes
YamlFormatter(fileDescriptor, dataId) Interface for reading and writing Python objects to and from YAML files.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.formatters.yamlFormatter.YamlFormatter

lsst.daf.butler.formatters.pickleFormatter Module

Classes
PickleFormatter(fileDescriptor, dataId) Interface for reading and writing Python objects to and from pickle files.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.formatters.pickleFormatter.PickleFormatter

Database backends

lsst.daf.butler.registry.databases.sqlite Module

Classes
SqliteDatabase(*, connection, origin, …) An implementation of the Database interface for SQLite3.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.registry.databases.sqlite.SqliteDatabase

lsst.daf.butler.registry.databases.postgresql Module

Classes
PostgresqlDatabase(*, connection, origin, …) An implementation of the Database interface for PostgreSQL.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.registry.databases.postgresql.PostgresqlDatabase

lsst.daf.butler.registry.databases.oracle Module

Classes
OracleDatabase(*, connection, origin, …) An implementation of the Database interface for Oracle.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.registry.databases.oracle.OracleDatabase

Support API

lsst.daf.butler.core.safeFileIo Module

Functions
safeMakeDir(directory) Make a directory in a manner avoiding race conditions
setFileMode(filename) Set a file mode according to the user’s umask
FileForWriteOnceCompareSame(name) Context manager to get a file that can be written only once and all other writes will succeed only if they match the initial write.
SafeFile(name) Context manager to create a file in a manner avoiding race conditions
SafeFilename(name) Context manager for creating a file in a manner avoiding race conditions.
SafeLockedFileForRead(name) Context manager for reading a file that may be locked with an exclusive lock via SafeLockedFileForWrite.
Classes
DoNotWrite
FileForWriteOnceCompareSameFailure
SafeLockedFileForWrite(name) File-like object that is used to create a file if needed, lock it with an exclusive lock, and contain file descriptors to readable and writable versions of the file.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.core.safeFileIo.DoNotWrite, lsst.daf.butler.core.safeFileIo.FileForWriteOnceCompareSameFailure, lsst.daf.butler.core.safeFileIo.SafeLockedFileForWrite
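
The write-once, compare-same behaviour described for FileForWriteOnceCompareSame can be sketched with the standard library alone. The names write_once_compare_same and WriteOnceMismatch are hypothetical, and the approach shown (temporary file plus hard link) is an assumption of this sketch, not necessarily how the module implements it.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator, TextIO


class WriteOnceMismatch(RuntimeError):
    """Raised when a later write differs from the first one."""


@contextmanager
def write_once_compare_same(name: str) -> Iterator[TextIO]:
    """Yield a temporary file; on exit, install it as ``name`` if ``name``
    does not exist yet, otherwise require identical contents."""
    directory = os.path.dirname(os.path.abspath(name))
    fd, temp_name = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as handle:
            yield handle
        try:
            # Hard-linking fails if the target exists, so the first writer wins.
            os.link(temp_name, name)
        except FileExistsError:
            with open(temp_name) as new, open(name) as old:
                if new.read() != old.read():
                    raise WriteOnceMismatch(
                        f"{name} already has different contents")
    finally:
        # The temporary name is always cleaned up, whatever happened above.
        os.remove(temp_name)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "config.yaml")
    with write_once_compare_same(target) as f:
        f.write("a: 1\n")
    with write_once_compare_same(target) as f:  # identical content is accepted
        f.write("a: 1\n")
    with open(target) as f:
        print(repr(f.read()))
```

Writing to a temporary file in the same directory and linking it into place means readers never observe a partially written target.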

lsst.daf.butler.core.utils Module

Functions
allSlots(self) Return combined __slots__ for all classes in the object's MRO.
getClassOf(typeOrName) Given the type name or a type, return the python type.
getFullTypeName(cls) Return full type name of the supplied entity.
getInstanceOf(typeOrName, *args, **kwargs) Given the type name or a type, instantiate an object of that type.
getObjectSize(obj[, seen]) Recursively finds size of objects.
immutable(cls) A class decorator that simulates a simple form of immutability for the decorated class.
iterable(a) Make input iterable.
slotValuesAreEqual(self, other) Test for equality by the contents of all slots, including those of its parents.
slotValuesToHash(self) Generate a hash from slot values.
stripIfNotNone(s) Strip leading and trailing whitespace if the given object is not None.
transactional(func) Decorator that wraps a method and makes it transactional.
Classes
IndexedTupleDict An immutable mapping that combines a tuple of values with a (possibly shared) mapping from key to tuple index.
NamedKeyDict(*args) A dictionary wrapper that requires keys to have a .name attribute, and permits lookups using either key objects or their names.
NamedValueSet(elements) A custom mutable set class that requires elements to have a .name attribute, which can then be used as keys in dict-like lookup.
PrivateConstructorMeta A metaclass that disables regular construction syntax.
Singleton Metaclass to convert a class to a Singleton.
Class Inheritance Diagram

Inheritance diagram of lsst.daf.butler.core.utils.IndexedTupleDict, lsst.daf.butler.core.utils.NamedKeyDict, lsst.daf.butler.core.utils.NamedValueSet, lsst.daf.butler.core.utils.PrivateConstructorMeta, lsst.daf.butler.core.utils.Singleton
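
As an illustration of what a metaclass such as Singleton does, the following sketch caches one instance per class. SingletonMeta and Factory are hypothetical names used only here; this is a common Python idiom and is not claimed to match the module's implementation.

```python
from typing import Any, Dict


class SingletonMeta(type):
    """Metaclass that returns the same instance on every construction."""

    # Shared cache mapping each class to its single instance.
    _instances: Dict[type, Any] = {}

    def __call__(cls, *args: Any, **kwargs: Any) -> Any:
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class Factory(metaclass=SingletonMeta):
    """Hypothetical class used only to demonstrate the metaclass."""

    def __init__(self) -> None:
        self.registered: Dict[str, type] = {}


first = Factory()
first.registered["int"] = int
second = Factory()  # __init__ is not run again; the cached instance is returned
print(first is second, second.registered)
```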

lsst.daf.butler.core.repoRelocation Module

Functions
replaceRoot(configRoot, butlerRoot) Update a configuration root with the butler root location.
Variables
BUTLER_ROOT_TAG The special string to be used in configuration files to indicate that the butler root location should be used.
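
The substitution performed by replaceRoot can be pictured as a simple tag replacement. In this sketch the tag value, the function name replace_root, and the handling of a missing butler root are all assumptions made for illustration; consult the module for the actual value of BUTLER_ROOT_TAG and the exact behaviour.

```python
from typing import Optional

# Placeholder value chosen for this sketch; the real BUTLER_ROOT_TAG may differ.
ROOT_TAG = "<butlerRoot>"


def replace_root(config_root: str, butler_root: Optional[str]) -> str:
    """Substitute the tag in ``config_root`` with ``butler_root``, if given."""
    if butler_root is None or ROOT_TAG not in config_root:
        return config_root
    return config_root.replace(ROOT_TAG, butler_root)


# Hypothetical paths.
print(replace_root("<butlerRoot>/datastore", "/repo/main"))
print(replace_root("/absolute/elsewhere", "/repo/main"))
```

Storing the tag instead of an absolute path in configuration files is what allows a repository to be relocated without editing its configuration.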

Test utilities

lsst.daf.butler.tests Package

Functions

addDatasetType(butler, name, dimensions, …) Add a new dataset type to a repository.
expandUniqueId(butler, partialId) Return a complete data ID matching some criterion.
makeTestCollection(repo) Create a read/write Butler to a fresh collection.
makeTestRepo(root, dataIds, *[, config]) Create an empty repository with dummy data IDs.
registerMetricsExample(butler) Modify a repository to support reading and writing MetricsExample objects.

Classes

BadNoWriteFormatter(fileDescriptor, dataId) A formatter that always fails without writing anything.
BadWriteFormatter(fileDescriptor, dataId) A formatter that never works but does leave a file behind.
DatasetTestHelper Helper methods for Datasets
DatastoreTestHelper Helper methods for Datastore tests
DummyRegistry() Dummy Registry, for Datastore test purposes.
FitsCatalogDatasetsHelper
ListAssembler(storageClass) Parameter handler for list parameters
MetricsAssembler(storageClass) Parameter handler for parameters using Metrics
MetricsExample([summary, output, data]) Smorgasbord of information that might be the result of some processing.
MultiDetectorFormatter(fileDescriptor, dataId)