Database¶
-
class
lsst.daf.butler.registry.interfaces.
Database
(*, origin: int, connection: sqlalchemy.engine.base.Connection, namespace: Optional[str] = None)¶ Bases:
abc.ABC
An abstract interface that represents a particular database engine’s representation of a single schema/namespace/database.
Parameters: - origin :
int
An integer ID that should be used as the default for any datasets, quanta, or other entities that use a (autoincrement, origin) compound primary key.
- connection :
sqlalchemy.engine.Connection
The SQLAlchemy connection this
Database
wraps.- namespace :
str
, optional Name of the schema or namespace this instance is associated with. This is passed as the
schema
argument when constructing asqlalchemy.schema.MetaData
instance. We usenamespace
instead to avoid confusion between “schema means namespace” and “schema means table definitions”.
Notes
Database
requires all write operations to go through its special named methods. Our write patterns are sufficiently simple that we don’t really need the full flexibility of SQL insert/update/delete syntax, and we need non-standard (but common) functionality in these operations sufficiently often that it seems worthwhile to provide our own generic API.In contrast,
Database.query
allows arbitrarySELECT
queries (via their SQLAlchemy representation) to be run, as we expect these to require significantly more sophistication while still being limited to standard SQL.Database
itself has several underscore-prefixed attributes:_connection
: SQLAlchemy object representing the connection._metadata
: thesqlalchemy.schema.MetaData
object representing- the tables and other schema entities.
These are considered protected (derived classes may access them, but other code should not), and read-only, aside from executing SQL via
_connection
.Attributes Summary
dialect
The SQLAlchemy dialect for this database engine ( sqlalchemy.engine.Dialect
).Methods Summary
assertTableWriteable
(table, msg)Raise if the given table is not writeable, either because the database connection is read-write or the table is a temporary table. connect
(uri, *, writeable)Create a sqlalchemy.engine.Connection
from a SQLAlchemy URI.declareStaticTables
(*, create)Return a context manager in which the database’s static DDL schema can be declared. delete
(table, columns, *rows)Delete one or more rows from a table. dropTemporaryTable
(table)Drop a temporary table. ensure
(table, *rows)Insert one or more rows into a table, skipping any rows for which insertion would violate any constraint. ensureTableExists
(name, spec)Ensure that a table with the given name and specification exists, creating it if necessary. expandDatabaseEntityName
(shrunk)Retrieve the original name for a database entity that was too long to fit within the database engine’s limits. fromConnection
(connection, *, origin, …)Create a new Database
from an existingsqlalchemy.engine.Connection
.fromUri
(uri, *, origin, namespace, writeable)Construct a database from a SQLAlchemy URI. getExistingTable
(name, spec)Obtain an existing table with the given name and specification. getTimespanRepresentation
()Return a type
that encapsulates the wayTimespan
objects are recommended to be stored in this database.insert
(table, *rows, returnIds, select, names)Insert one or more rows into a table, optionally returning autoincrement primary key values. isTableWriteable
(table)Check whether a table is writeable, either because the database connection is read-write or the table is a temporary table. isWriteable
()Return True
if this database can be modified by this client.makeDefaultUri
(root)Create a default connection URI appropriate for the given root directory, or None
if there can be no such default.makeTemporaryTable
(spec, name)Create a temporary table. query
(sql, *args, **kwds)Run a SELECT query against the database. replace
(table, *rows)Insert one or more rows into a table, replacing any existing rows for which insertion of a new row would violate the primary key constraint. shrinkDatabaseEntityName
(original)Return a version of the given name that fits within this database engine’s length limits for table, constraint, indexes, and sequence names. sync
(table, *, keys, Any], compared, …)Insert into a table as necessary to ensure database contains values equivalent to the given ones. transaction
(*, interrupting, savepoint, lock)Return a context manager that represents a transaction. update
(table, where, str], *rows)Update one or more rows in a table. Attributes Documentation
-
dialect
¶ The SQLAlchemy dialect for this database engine (
sqlalchemy.engine.Dialect
).
Methods Documentation
-
assertTableWriteable
(table: sqlalchemy.sql.schema.Table, msg: str) → None¶ Raise if the given table is not writeable, either because the database connection is read-write or the table is a temporary table.
Parameters: - table :
sqlalchemy.schema.Table
SQLAlchemy table object to check.
- msg :
str
, optional If provided, raise
ReadOnlyDatabaseError
instead of returningFalse
, with this message.
- table :
-
classmethod
connect
(uri: str, *, writeable: bool = True) → sqlalchemy.engine.base.Connection¶ Create a
sqlalchemy.engine.Connection
from a SQLAlchemy URI.Parameters: - uri :
str
A SQLAlchemy URI connection string.
- origin :
int
An integer ID that should be used as the default for any datasets, quanta, or other entities that use a (autoincrement, origin) compound primary key.
- writeable :
bool
, optional If
True
, allow write operations on the database, includingCREATE TABLE
.
Returns: - connection :
sqlalchemy.engine.Connection
A database connection.
Notes
Subclasses that support other ways to connect to a database are encouraged to add optional arguments to their implementation of this method, as long as they maintain compatibility with the base class call signature.
- uri :
-
declareStaticTables
(*, create: bool) → Iterator[lsst.daf.butler.registry.interfaces._database.StaticTablesContext]¶ Return a context manager in which the database’s static DDL schema can be declared.
Parameters: Returns: - schema :
StaticTablesContext
A helper object that is used to add new tables.
Raises: - ReadOnlyDatabaseError
Raised if
create
isTrue
,Database.isWriteable
isFalse
, and one or more declared tables do not already exist.
Notes
A database’s static DDL schema must be declared before any dynamic tables are managed via calls to
ensureTableExists
orgetExistingTable
. The order in which static schema tables are added inside the context block is unimportant; they will automatically be sorted and added in an order consistent with their foreign key relationships.Examples
Given a
Database
instancedb
:with db.declareStaticTables(create=True) as schema: schema.addTable("table1", TableSpec(...)) schema.addTable("table2", TableSpec(...))
- schema :
-
delete
(table: sqlalchemy.sql.schema.Table, columns: Iterable[str], *rows) → int¶ Delete one or more rows from a table.
Parameters: - table :
sqlalchemy.schema.Table
Table that rows should be deleted from.
- columns: `~collections.abc.Iterable` of `str`
The names of columns that will be used to constrain the rows to be deleted; these will be combined via
AND
to form theWHERE
clause of the delete query.- *rows
Positional arguments are the keys of rows to be deleted, as dictionaries mapping column name to value. The keys in all dictionaries must exactly the names in
columns
.
Returns: - count :
int
Number of rows deleted.
Raises: - ReadOnlyDatabaseError
Raised if
isWriteable
returnsFalse
when this method is called.
Notes
May be used inside transaction contexts, so implementations may not perform operations that interrupt transactions.
The default implementation should be sufficient for most derived classes.
- table :
-
dropTemporaryTable
(table: sqlalchemy.sql.schema.Table) → None¶ Drop a temporary table.
Parameters: - table :
sqlalchemy.schema.Table
A SQLAlchemy object returned by a previous call to
makeTemporaryTable
.
- table :
-
ensure
(table: sqlalchemy.sql.schema.Table, *rows) → int¶ Insert one or more rows into a table, skipping any rows for which insertion would violate any constraint.
Parameters: - table :
sqlalchemy.schema.Table
Table rows should be inserted into.
- *rows
Positional arguments are the rows to be inserted, as dictionaries mapping column name to value. The keys in all dictionaries must be the same.
Returns: - count :
int
The number of rows actually inserted.
Raises: - ReadOnlyDatabaseError
Raised if
isWriteable
returnsFalse
when this method is called. This is raised even if the operation would do nothing even on a writeable database.
Notes
May be used inside transaction contexts, so implementations may not perform operations that interrupt transactions.
Implementations are not required to support
ensure
on tables with autoincrement keys.- table :
-
ensureTableExists
(name: str, spec: lsst.daf.butler.core.ddl.TableSpec) → sqlalchemy.sql.schema.Table¶ Ensure that a table with the given name and specification exists, creating it if necessary.
Parameters: - name :
str
Name of the table (not including namespace qualifiers).
- spec :
TableSpec
Specification for the table. This will be used when creating the table, and may be used when obtaining an existing table to check for consistency, but no such check is guaranteed.
Returns: - table :
sqlalchemy.schema.Table
SQLAlchemy representation of the table.
Raises: - ReadOnlyDatabaseError
Raised if
isWriteable
returnsFalse
, and the table does not already exist.- DatabaseConflictError
Raised if the table exists but
spec
is inconsistent with its definition.
Notes
This method may not be called within transactions. It may be called on read-only databases if and only if the table does in fact already exist.
Subclasses may override this method, but usually should not need to.
- name :
-
expandDatabaseEntityName
(shrunk: str) → str¶ Retrieve the original name for a database entity that was too long to fit within the database engine’s limits.
Parameters: - original :
str
The original name.
Returns: - shrunk :
str
The new, possibly shortened name.
- original :
-
classmethod
fromConnection
(connection: sqlalchemy.engine.base.Connection, *, origin: int, namespace: Optional[str] = None, writeable: bool = True) → lsst.daf.butler.registry.interfaces._database.Database¶ Create a new
Database
from an existingsqlalchemy.engine.Connection
.Parameters: - connection :
sqllachemy.engine.Connection
The connection for the the database. May be shared between
Database
instances.- origin :
int
An integer ID that should be used as the default for any datasets, quanta, or other entities that use a (autoincrement, origin) compound primary key.
- namespace :
str
, optional A different database namespace (i.e. schema) the new instance should be associated with. If
None
(default), the namespace (if any) is inferred from the connection.- writeable :
bool
, optional If
True
, allow write operations on the database, includingCREATE TABLE
.
Returns: Notes
This method allows different
Database
instances to share the same connection, which is desirable when they represent different namespaces can be queried together. This also ties their transaction state, however; starting a transaction in any database automatically starts on in all other databases.- connection :
-
classmethod
fromUri
(uri: str, *, origin: int, namespace: Optional[str] = None, writeable: bool = True) → lsst.daf.butler.registry.interfaces._database.Database¶ Construct a database from a SQLAlchemy URI.
Parameters: - uri :
str
A SQLAlchemy URI connection string.
- origin :
int
An integer ID that should be used as the default for any datasets, quanta, or other entities that use a (autoincrement, origin) compound primary key.
- namespace :
str
, optional A database namespace (i.e. schema) the new instance should be associated with. If
None
(default), the namespace (if any) is inferred from the URI.- writeable :
bool
, optional If
True
, allow write operations on the database, includingCREATE TABLE
.
Returns: - uri :
-
getExistingTable
(name: str, spec: lsst.daf.butler.core.ddl.TableSpec) → Optional[sqlalchemy.sql.schema.Table]¶ Obtain an existing table with the given name and specification.
Parameters: - name :
str
Name of the table (not including namespace qualifiers).
- spec :
TableSpec
Specification for the table. This will be used when creating the SQLAlchemy representation of the table, and it is used to check that the actual table in the database is consistent.
Returns: Raises: - DatabaseConflictError
Raised if the table exists but
spec
is inconsistent with its definition.
Notes
This method can be called within transactions and never modifies the database.
Subclasses may override this method, but usually should not need to.
- name :
-
classmethod
getTimespanRepresentation
() → Type[lsst.daf.butler.core.timespan.DatabaseTimespanRepresentation]¶ Return a
type
that encapsulates the wayTimespan
objects are recommended to be stored in this database.Database
does not automatically use the return type of this method anywhere else; calling code is responsible for making sure that DDL and queries are consistent with it.Returns: - tsRepr :
type
(DatabaseTimespanRepresention
subclass) A type that encapsultes the way
Timespan
objects should be stored in this database.
Notes
There are two big reasons we’ve decided to keep timespan-mangling logic outside the
Database
implementations, even though the choice of representation is ultimately up to aDatabase
implementation:- Timespans appear in relatively few tables and queries in our
typical usage, and the code that operates on them is already aware
that it is working with timespans. In contrast, a
timespan-representation-aware implementation of, say,
insert
, would need to have extra logic to identify when timespan-mangling needed to occur, which would usually be useless overhead. - SQLAlchemy’s rich SELECT query expression system has no way to wrap multiple columns in a single expression object (the ORM does, but we are not using the ORM). So we would have to wrap _much_ more of that code in our own interfaces to encapsulate timespan representations there.
- tsRepr :
-
insert
(table: sqlalchemy.sql.schema.Table, *rows, returnIds: bool = False, select: Optional[sqlalchemy.sql.selectable.Select] = None, names: Optional[Iterable[str]] = None) → Optional[List[int]]¶ Insert one or more rows into a table, optionally returning autoincrement primary key values.
Parameters: - table :
sqlalchemy.schema.Table
Table rows should be inserted into.
- returnIds: `bool`
If
True
(False
is default), return the values of the table’s autoincrement primary key field (which much exist).- select :
sqlalchemy.sql.Select
, optional A SELECT query expression to insert rows from. Cannot be provided with either
rows
orreturnIds=True
.- names :
Iterable
[str
], optional Names of columns in
table
to be populated, ordered to match the columns returned byselect
. Ignored ifselect
isNone
. If not provided, the columns returned byselect
must be named to match the desired columns oftable
.- *rows
Positional arguments are the rows to be inserted, as dictionaries mapping column name to value. The keys in all dictionaries must be the same.
Returns: Raises: - ReadOnlyDatabaseError
Raised if
isWriteable
returnsFalse
when this method is called.
Notes
The default implementation uses bulk insert syntax when
returnIds
isFalse
, and a loop over single-row insert operations when it isTrue
.Derived classes should reimplement when they can provide a more efficient implementation (especially for the latter case).
May be used inside transaction contexts, so implementations may not perform operations that interrupt transactions.
- table :
-
isTableWriteable
(table: sqlalchemy.sql.schema.Table) → bool¶ Check whether a table is writeable, either because the database connection is read-write or the table is a temporary table.
Parameters: - table :
sqlalchemy.schema.Table
SQLAlchemy table object to check.
Returns: - writeable :
bool
Whether this table is writeable.
- table :
-
classmethod
makeDefaultUri
(root: str) → Optional[str]¶ Create a default connection URI appropriate for the given root directory, or
None
if there can be no such default.
-
makeTemporaryTable
(spec: lsst.daf.butler.core.ddl.TableSpec, name: Optional[str] = None) → sqlalchemy.sql.schema.Table¶ Create a temporary table.
Parameters: - spec :
TableSpec
Specification for the table.
- name :
str
, optional A unique (within this session/connetion) name for the table. Subclasses may override to modify the actual name used. If not provided, a unique name will be generated.
Returns: - table :
sqlalchemy.schema.Table
SQLAlchemy representation of the table.
Notes
Temporary tables may be created, dropped, and written to even in read-only databases - at least according to the Python-level protections in the
Database
classes. Server permissions may say otherwise, but in that case they probably need to be modified to support the full range of expected read-only butler behavior.Temporary table rows are guaranteed to be dropped when a connection is closed.
Database
implementations are permitted to allow the table to remain as long as this is transparent to the user (i.e. “creating” the temporary table in a new session should not be an error, even if it does nothing).It may not be possible to use temporary tables within transactions with some database engines (or configurations thereof).
- spec :
-
query
(sql: sqlalchemy.sql.selectable.FromClause, *args, **kwds) → sqlalchemy.engine.result.ResultProxy¶ Run a SELECT query against the database.
Parameters: - sql :
sqlalchemy.sql.FromClause
A SQLAlchemy representation of a
SELECT
query.- *args
Additional positional arguments are forwarded to
sqlalchemy.engine.Connection.execute
.- **kwds
Additional keyword arguments are forwarded to
sqlalchemy.engine.Connection.execute
.
Returns: - result :
sqlalchemy.engine.ResultProxy
Query results.
Notes
The default implementation should be sufficient for most derived classes.
- sql :
-
replace
(table: sqlalchemy.sql.schema.Table, *rows) → None¶ Insert one or more rows into a table, replacing any existing rows for which insertion of a new row would violate the primary key constraint.
Parameters: - table :
sqlalchemy.schema.Table
Table rows should be inserted into.
- *rows
Positional arguments are the rows to be inserted, as dictionaries mapping column name to value. The keys in all dictionaries must be the same.
Raises: - ReadOnlyDatabaseError
Raised if
isWriteable
returnsFalse
when this method is called.
Notes
May be used inside transaction contexts, so implementations may not perform operations that interrupt transactions.
Implementations should raise a
sqlalchemy.exc.IntegrityError
exception when a constraint other than the primary key would be violated.Implementations are not required to support
replace
on tables with autoincrement keys.- table :
-
shrinkDatabaseEntityName
(original: str) → str¶ Return a version of the given name that fits within this database engine’s length limits for table, constraint, indexes, and sequence names.
Implementations should not assume that simple truncation is safe, because multiple long names often begin with the same prefix.
The default implementation simply returns the given name.
Parameters: - original :
str
The original name.
Returns: - shrunk :
str
The new, possibly shortened name.
- original :
-
sync
(table: sqlalchemy.sql.schema.Table, *, keys: Dict[str, Any], compared: Optional[Dict[str, Any]] = None, extra: Optional[Dict[str, Any]] = None, returning: Optional[Sequence[str]] = None) → Tuple[Optional[Dict[str, Any]], bool]¶ Insert into a table as necessary to ensure database contains values equivalent to the given ones.
Parameters: - table :
sqlalchemy.schema.Table
Table to be queried and possibly inserted into.
- keys :
dict
Column name-value pairs used to search for an existing row; must be a combination that can be used to select a single row if one exists. If such a row does not exist, these values are used in the insert.
- compared :
dict
, optional Column name-value pairs that are compared to those in any existing row. If such a row does not exist, these rows are used in the insert.
- extra :
dict
, optional Column name-value pairs that are ignored if a matching row exists, but used in an insert if one is necessary.
- returning :
Sequence
ofstr
, optional The names of columns whose values should be returned.
Returns: Raises: - DatabaseConflictError
Raised if the values in
compared
do not match the values in the database.- ReadOnlyDatabaseError
Raised if
isWriteable
returnsFalse
, and no matching record already exists.
Notes
May be used inside transaction contexts, so implementations may not perform operations that interrupt transactions.
It may be called on read-only databases if and only if the matching row does in fact already exist.
- table :
-
transaction
(*, interrupting: bool = False, savepoint: bool = False, lock: Iterable[sqlalchemy.sql.schema.Table] = ()) → Iterator[T_co]¶ Return a context manager that represents a transaction.
Parameters: - interrupting :
bool
, optional If
True
(False
is default), this transaction block may not be nested without an outer one, and attempting to do so is a logic (i.e. assertion) error.- savepoint :
bool
, optional If
True
(False
is default), create aSAVEPOINT
, allowing exceptions raised by the database (e.g. due to constraint violations) during this transaction’s context to be caught outside it without also rolling back all operations in an outer transaction block. IfFalse
, transactions may still be nested, but a rollback may be generated at any level and affects all levels, and commits are deferred until the outermost block completes. If any outer transaction block was created withsavepoint=True
, all inner blocks will be as well (regardless of the actual value passed). This has no effect if this is the outermost transaction.- lock :
Iterable
[sqlalchemy.schema.Table
], optional A list of tables to lock for the duration of this transaction. These locks are guaranteed to prevent concurrent writes, but only prevent concurrent reads if the database engine requires that in order to block concurrent writes.
Notes
All transactions on a connection managed by one or more
Database
instances _must_ go through this method, or transaction state will not be correctly managed.- interrupting :
-
update
(table: sqlalchemy.sql.schema.Table, where: Dict[str, str], *rows) → int¶ Update one or more rows in a table.
Parameters: - table :
sqlalchemy.schema.Table
Table containing the rows to be updated.
- where :
dict
[str
,str
] A mapping from the names of columns that will be used to search for existing rows to the keys that will hold these values in the
rows
dictionaries. Note that these may not be the same due to SQLAlchemy limitations.- *rows
Positional arguments are the rows to be updated. The keys in all dictionaries must be the same, and may correspond to either a value in the
where
dictionary or the name of a column to be updated.
Returns: - count :
int
Number of rows matched (regardless of whether the update actually modified them).
Raises: - ReadOnlyDatabaseError
Raised if
isWriteable
returnsFalse
when this method is called.
Notes
May be used inside transaction contexts, so implementations may not perform operations that interrupt transactions.
The default implementation should be sufficient for most derived classes.
- table :
- origin :