ApdbCassandra
- class lsst.dax.apdb.ApdbCassandra(config: ApdbCassandraConfig)
Bases: Apdb
Implementation of APDB database on top of Apache Cassandra.
The implementation is configured via the standard pex_config mechanism using the ApdbCassandraConfig configuration class. For examples of different configurations, check the config/ folder.
- Parameters:
- config
ApdbCassandraConfig Configuration object.
Attributes Summary
- metadata: Object controlling access to APDB metadata (ApdbMetadata).
- metadataCodeVersionKey: Name of the metadata key to store code version number.
- metadataSchemaVersionKey: Name of the metadata key to store schema version number.
- partition_zero_epoch: Start time for partition 0, this should never be changed.
Methods Summary
- apdbImplementationVersion(): Return version number for current APDB implementation.
- apdbSchemaVersion(): Return schema version number as defined in config file.
- containsVisitDetector(visit, detector): Test whether data for a given visit-detector is present in the APDB.
- countUnassociatedObjects(): Return the number of DiaObjects that have only one DiaSource associated with them.
- dailyJob(): Implement daily activities like cleanup/vacuum.
- deleteInsertIds(ids): Remove insert identifiers from the database.
- getDiaForcedSources(region, object_ids, ...): Return catalog of DiaForcedSource instances from a given region.
- getDiaForcedSourcesHistory(ids): Return catalog of DiaForcedSource instances from a given time period.
- getDiaObjects(region): Return catalog of DiaObject instances from a given region.
- getDiaObjectsHistory(ids): Return catalog of DiaObject instances from a given time period including the history of each DiaObject.
- getDiaSources(region, object_ids, visit_time): Return catalog of DiaSource instances from a given region.
- getDiaSourcesHistory(ids): Return catalog of DiaSource instances from a given time period.
- getInsertIds(): Return collection of insert identifiers known to the database.
- getSSObjects(): Return catalog of SSObject instances.
- makeField(doc): Make a ConfigurableField for Apdb.
- makeSchema([drop]): Create or re-create whole database schema.
- reassignDiaSources(idMap): Associate DiaSources with SSObjects, dis-associating them from DiaObjects.
- store(visit_time, objects[, sources, ...]): Store all three types of catalogs in the database.
- storeSSObjects(objects): Store or update SSObject catalog.
- tableDef(table): Return table schema definition for a given table.
Attributes Documentation
- metadata
Object controlling access to APDB metadata (ApdbMetadata).
- metadataCodeVersionKey = 'version:ApdbCassandra'
Name of the metadata key to store code version number.
- metadataSchemaVersionKey = 'version:schema'
Name of the metadata key to store schema version number.
- partition_zero_epoch = DateTime("1970-01-01T00:00:00.000000000", TAI)
Start time for partition 0, this should never be changed.
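The role of a fixed partition-zero start time can be illustrated with a small sketch. It assumes that time partitions are fixed-length intervals counted from the epoch, which this page does not state; the 30-day length and the `partition_index` helper are hypothetical, and the TAI/UTC offset is ignored.

```python
from datetime import datetime, timedelta

# Mirrors the documented partition_zero_epoch value (TAI/UTC offset ignored).
PARTITION_ZERO_EPOCH = datetime(1970, 1, 1)

# Hypothetical partition length; a real deployment would take it from configuration.
PARTITION_LENGTH = timedelta(days=30)


def partition_index(when: datetime) -> int:
    """Return the index of the fixed-length partition that contains ``when``."""
    # Floor division places times before the epoch in negative partitions.
    return (when - PARTITION_ZERO_EPOCH) // PARTITION_LENGTH
```

Under that assumption, moving the epoch would shift every computed index, which is consistent with the warning that the value should never be changed.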
Methods Documentation
- classmethod apdbImplementationVersion() → VersionTuple
Return version number for current APDB implementation.
- Returns:
- version
VersionTuple Version of the code defined in implementation class.
- apdbSchemaVersion() → VersionTuple
Return schema version number as defined in config file.
- Returns:
- version
VersionTuple Version of the schema defined in schema config file.
- containsVisitDetector(visit: int, detector: int) → bool
Test whether data for a given visit-detector is present in the APDB.
- countUnassociatedObjects() → int
Return the number of DiaObjects that have only one DiaSource associated with them.
Used as part of ap_verify metrics.
- Returns:
- count
int Number of DiaObjects with exactly one associated DiaSource.
Notes
This method can be very inefficient or slow in some implementations.
- dailyJob() → None
Implement daily activities like cleanup/vacuum.
What is done during daily activities is determined by the specific implementation.
- deleteInsertIds(ids: Iterable[ApdbInsertId]) → None
Remove insert identifiers from the database.
- Parameters:
- ids
iterable[ApdbInsertId] Insert identifiers, can include items returned from getInsertIds.
Notes
This method causes Apdb to forget about the specified identifiers. If any auxiliary data is associated with the identifiers, it is also removed from the database (but data in regular tables is not removed). This method should be called after a successful transfer of data from APDB to PPDB to free the space used by history.
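The transfer-then-delete sequence described in the notes can be sketched as follows. Only the `Apdb` methods documented on this page are called; the `transfer_and_forget` helper and the `send_to_ppdb` callback are hypothetical stand-ins for a real PPDB upload, which is outside the scope of this page.

```python
from typing import Any, Callable


def transfer_and_forget(apdb: Any, send_to_ppdb: Callable[[str, Any], None]) -> int:
    """Copy history for all known insert IDs, then delete those IDs.

    ``send_to_ppdb`` is a hypothetical callback; it is expected to raise on
    failure so that identifiers are never deleted after a failed upload.
    """
    ids = apdb.getInsertIds()
    if not ids:
        # None means insert IDs are not stored; an empty list means no work.
        return 0
    send_to_ppdb("DiaObject", apdb.getDiaObjectsHistory(ids))
    send_to_ppdb("DiaSource", apdb.getDiaSourcesHistory(ids))
    send_to_ppdb("DiaForcedSource", apdb.getDiaForcedSourcesHistory(ids))
    # Reached only if every upload above returned without raising.
    apdb.deleteInsertIds(ids)
    return len(ids)
```

Deleting last keeps the history available for a retry if any upload fails part-way through.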
- getDiaForcedSources(region: Region, object_ids: Iterable[int] | None, visit_time: DateTime) → DataFrame | None
Return catalog of DiaForcedSource instances from a given region.
- Parameters:
- region
lsst.sphgeom.Region Region to search for DIASources.
- object_ids
iterable [int], optional List of DiaObject IDs to further constrain the set of returned sources. If the list is empty then an empty catalog is returned with a correct schema. If None then returned sources are not constrained. Some implementations may not support the latter case.
- visit_time
lsst.daf.base.DateTime Time of the current visit.
- Returns:
- catalog
pandas.DataFrame, or None Catalog containing DiaForcedSource records. None is returned if the read_forced_sources_months configuration parameter is set to 0.
- Raises:
- NotImplementedError
May be raised by some implementations if object_ids is None.
Notes
This method returns a DiaForcedSource catalog for a region with additional filtering based on DiaObject IDs. Only a subset of DiaForcedSource history is returned, limited by the read_forced_sources_months config parameter, w.r.t. visit_time. If object_ids is empty then an empty catalog is always returned with the correct schema (columns/types). If object_ids is None then no filtering is performed and some of the returned records may be outside the specified region.
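A caller has to handle three documented outcomes: a normal catalog, a None result when reading is disabled, and NotImplementedError when object_ids is None. The wrapper below is a minimal sketch of one way to do that; `read_forced_sources` is a hypothetical helper, and turning the exception into a ValueError is a design choice of this example, not library behaviour.

```python
from typing import Any, Iterable, Optional


def read_forced_sources(
    apdb: Any, region: Any, object_ids: Optional[Iterable[int]], visit_time: Any
) -> Optional[Any]:
    """Call getDiaForcedSources and surface the documented outcomes."""
    if object_ids is not None:
        # Materialise once so that a generator is not consumed before the call.
        object_ids = list(object_ids)
    try:
        catalog = apdb.getDiaForcedSources(region, object_ids, visit_time)
    except NotImplementedError:
        if object_ids is None:
            # Documented: some implementations reject unconstrained reads.
            raise ValueError("this APDB needs an explicit object_ids list") from None
        raise
    # None signals that read_forced_sources_months is set to 0.
    return catalog
```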
- getDiaForcedSourcesHistory(ids: Iterable[ApdbInsertId]) → ApdbTableData
Return catalog of DiaForcedSource instances from a given time period.
- Parameters:
- ids
iterable[ApdbInsertId] Insert identifiers, can include items returned from getInsertIds.
- Returns:
- data
ApdbTableData Catalog containing DiaForcedSource records. In addition to all regular columns it will contain the insert_id column.
Notes
This part of the API may not be very stable and can change before the implementation is finalized.
- getDiaObjects(region: Region) → DataFrame
Return catalog of DiaObject instances from a given region.
This method returns only the last version of each DiaObject. Some records in the returned catalog may be outside the specified region; it is up to the client to ignore those records or clean up the catalog before further use.
- Parameters:
- region
lsst.sphgeom.Region Region to search for DIAObjects.
- Returns:
- catalog
pandas.DataFrame Catalog containing DiaObject records for a region that may be a superset of the specified region.
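Because the returned catalog may cover a superset of the requested region, a client may want to trim it. The sketch below shows the idea with plain Python stand-ins: records are dictionaries rather than a pandas.DataFrame, the region is a simple circle rather than an lsst.sphgeom.Region, and the `ra`/`dec` column names (in degrees) are assumptions of this example.

```python
import math
from typing import Dict, Iterable, List


def angular_separation_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding just above the valid domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def keep_inside_circle(
    records: Iterable[Dict[str, float]],
    center_ra: float,
    center_dec: float,
    radius_deg: float,
) -> List[Dict[str, float]]:
    """Drop records whose position lies outside the circular region."""
    return [
        rec
        for rec in records
        if angular_separation_deg(rec["ra"], rec["dec"], center_ra, center_dec)
        <= radius_deg
    ]
```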
- getDiaObjectsHistory(ids: Iterable[ApdbInsertId]) → ApdbTableData
Return catalog of DiaObject instances from a given time period including the history of each DiaObject.
- Parameters:
- ids
iterable[ApdbInsertId] Insert identifiers, can include items returned from getInsertIds.
- Returns:
- data
ApdbTableData Catalog containing DiaObject records. In addition to all regular columns it will contain the insert_id column.
Notes
This part of the API may not be very stable and can change before the implementation is finalized.
- getDiaSources(region: Region, object_ids: Iterable[int] | None, visit_time: DateTime) → DataFrame | None
Return catalog of DiaSource instances from a given region.
- Parameters:
- region
lsst.sphgeom.Region Region to search for DIASources.
- object_ids
iterable [int], optional List of DiaObject IDs to further constrain the set of returned sources. If None then returned sources are not constrained. If the list is empty then an empty catalog is returned with a correct schema.
- visit_time
lsst.daf.base.DateTime Time of the current visit.
- Returns:
- catalog
pandas.DataFrame, or None Catalog containing DiaSource records. None is returned if the read_sources_months configuration parameter is set to 0.
Notes
This method returns a DiaSource catalog for a region with additional filtering based on DiaObject IDs. Only a subset of DiaSource history is returned, limited by the read_sources_months config parameter, w.r.t. visit_time. If object_ids is empty then an empty catalog is always returned with the correct schema (columns/types). If object_ids is None then no filtering is performed and some of the returned records may be outside the specified region.
- getDiaSourcesHistory(ids: Iterable[ApdbInsertId]) → ApdbTableData
Return catalog of DiaSource instances from a given time period.
- Parameters:
- ids
iterable[ApdbInsertId] Insert identifiers, can include items returned from getInsertIds.
- Returns:
- data
ApdbTableData Catalog containing DiaSource records. In addition to all regular columns it will contain the insert_id column.
Notes
This part of the API may not be very stable and can change before the implementation is finalized.
- getInsertIds() → list[lsst.dax.apdb.apdb.ApdbInsertId] | None
Return collection of insert identifiers known to the database.
- Returns:
- ids
list[ApdbInsertId] or None List of identifiers; they may be time-ordered if the database supports ordering. None is returned if the database is not configured to store insert identifiers.
- getSSObjects() → DataFrame
Return catalog of SSObject instances.
- Returns:
- catalog
pandas.DataFrame Catalog containing SSObject records; all existing records are returned.
- classmethod makeField(doc: str) → ConfigurableField
Make a ConfigurableField for Apdb.
- Parameters:
- doc
str Help text for the field.
- Returns:
- configurableField
lsst.pex.config.ConfigurableField A ConfigurableField for Apdb.
- makeSchema(drop: bool = False) → None
Create or re-create whole database schema.
- Parameters:
- drop
bool If True then drop all tables before creating new ones.
- reassignDiaSources(idMap: Mapping[int, int]) → None
Associate DiaSources with SSObjects, dis-associating them from DiaObjects.
- Parameters:
- idMap
Mapping Maps DiaSource IDs to their new SSObject IDs.
- Raises:
- ValueError
Raised if a DiaSource ID does not exist in the database.
- store(visit_time: DateTime, objects: DataFrame, sources: DataFrame | None = None, forced_sources: DataFrame | None = None) → None
Store all three types of catalogs in the database.
- Parameters:
- visit_time
lsst.daf.base.DateTime Time of the visit.
- objects
pandas.DataFrame Catalog with DiaObject records.
- sources
pandas.DataFrame, optional Catalog with DiaSource records.
- forced_sources
pandas.DataFrame, optional Catalog with DiaForcedSource records.
Notes
This method takes DataFrame catalogs; their schema must be compatible with the schema of the APDB tables:
- column names must correspond to database table columns
- types and units of the columns must match database definitions; no unit conversion is performed presently
- columns that have default values in the database schema can be omitted from the catalog
- this method knows how to fill interval-related columns of DiaObject (validityStart, validityEnd), so they do not need to appear in a catalog
- source catalogs have a diaObjectId column associating sources with objects
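The column-name rules listed above can be checked before calling store. The function below is a sketch of such a pre-flight check and is not part of the library: `check_catalog_columns` is a hypothetical helper, the caller supplies the table's column names and the set of columns that may be omitted (those with defaults plus validityStart and validityEnd), and types and units are not checked.

```python
from typing import Iterable, List


def check_catalog_columns(
    catalog_columns: Iterable[str],
    table_columns: Iterable[str],
    optional_columns: Iterable[str],
) -> List[str]:
    """Return a list of naming problems; an empty list means compatible."""
    catalog = set(catalog_columns)
    table = set(table_columns)
    problems = []
    # Every catalog column must correspond to a database table column.
    for name in sorted(catalog - table):
        problems.append(f"unknown column: {name}")
    # Table columns may be absent only if they are optional.
    for name in sorted(table - catalog - set(optional_columns)):
        problems.append(f"missing required column: {name}")
    return problems
```

With a pandas catalog, `catalog_columns` would be the DataFrame's `columns`.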
- storeSSObjects(objects: DataFrame) → None
Store or update SSObject catalog.
- Parameters:
- objects
pandas.DataFrame Catalog with SSObject records.
Notes
If SSObjects with matching IDs already exist in the database, their records will be updated with the information from the provided records.
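The store-or-update behaviour can be pictured with an in-memory model. This is only an illustration of upsert semantics, not how the Cassandra backend works: records are plain dictionaries, the `ssObjectId` key name is an assumption, and whether columns absent from a provided record keep their old values is not specified on this page (this sketch keeps them).

```python
from typing import Any, Dict, Iterable


def upsert_by_id(
    existing: Dict[int, Dict[str, Any]],
    incoming: Iterable[Dict[str, Any]],
    key: str = "ssObjectId",  # hypothetical ID column name
) -> Dict[int, Dict[str, Any]]:
    """Return a copy of ``existing`` with ``incoming`` records merged in."""
    merged = {obj_id: dict(record) for obj_id, record in existing.items()}
    for record in incoming:
        # New IDs are inserted; matching IDs have the provided fields overwritten.
        merged.setdefault(record[key], {}).update(record)
    return merged
```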
- tableDef(table: ApdbTables) → Table | None
Return table schema definition for a given table.
- Parameters:
- table
ApdbTables One of the known APDB tables.
- Returns: