Apdb

class lsst.dax.apdb.Apdb

Bases: ABC

Abstract interface for APDB.

Attributes Summary

admin

Object providing administrative interface for APDB (ApdbAdmin).

metadata

Object controlling access to APDB metadata (ApdbMetadata).

schema

APDB table schema from sdm_schemas (ApdbSchema).

Methods Summary

containsVisitDetector(visit, detector, ...)

Test whether any sources for a given visit-detector are present in the APDB.

countUnassociatedObjects()

Return the number of DiaObjects that have only one DiaSource associated with them.

from_config(config)

Create Apdb instance from configuration object.

from_uri(uri)

Make Apdb instance from a serialized configuration.

getConfig()

Return APDB configuration for this instance, including any updates that may be read from database.

getDiaForcedSources(region, object_ids, ...)

Return catalog of DiaForcedSource instances from a given region.

getDiaObjects(region)

Return catalog of DiaObject instances from a given region.

getDiaObjectsForDedup([since])

Return catalog of DiaObject stored in APDB since specified time.

getDiaSources(region, object_ids, visit_time)

Return catalog of DiaSource instances from a given region.

getDiaSourcesForDiaObjects(objects, start_time)

Return catalog of DiaSources associated with given DiaObjects.

reassignDiaSources(idMap)

Associate DiaSources with SSObjects, disassociating them from DiaObjects.

reassignDiaSourcesToDiaObjects(idMap, *[, ...])

Re-assign DiaSources from one DiaObject to another, typically during deduplication.

resetDedup([dedup_time])

Delete deduplication-related data and remember deduplication time.

setValidityEnd(objects, validityEnd[, ...])

Close validity interval for specified DiaObjects.

store(visit_time, objects[, sources, ...])

Store all three types of catalogs in the database.

tableDef(table)

Return table schema definition for a given table.

Attributes Documentation

admin

Object providing administrative interface for APDB (ApdbAdmin).

metadata

Object controlling access to APDB metadata (ApdbMetadata).

schema

APDB table schema from sdm_schemas (ApdbSchema).

Methods Documentation

abstract containsVisitDetector(visit: int, detector: int, region: Region, visit_time: Time) -> bool

Test whether any sources for a given visit-detector are present in the APDB.

Parameters:
visit, detector : int

The ID of the visit-detector to search for.

region : lsst.sphgeom.Region

Region corresponding to the visit/detector combination.

visit_time : astropy.time.Time

Visit time (as opposed to visit processing time). This can be any timestamp in the visit timespan, e.g. its begin or end time.

Returns:
present : bool

True if at least one DiaSource or DiaForcedSource record may exist for the specified observation, False otherwise.
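
The return value is worded as "may exist", which is read here as: False is definitive, while True is only a hint. A minimal sketch of a caller that uses the check to decide whether a visit-detector still needs processing. `FakeApdb` and `needs_processing` are hypothetical stand-ins written for this example and are not part of lsst.dax.apdb; `region` and `visit_time` are passed as placeholders.

```python
class FakeApdb:
    """Hypothetical stand-in exposing only containsVisitDetector."""

    def __init__(self, stored):
        # Set of (visit, detector) pairs that already have source records.
        self._stored = set(stored)

    def containsVisitDetector(self, visit, detector, region, visit_time):
        # A real backend would also use region and visit_time to narrow
        # the search; this stand-in ignores them.
        return (visit, detector) in self._stored


def needs_processing(apdb, visit, detector, region, visit_time):
    """Return True when no sources are recorded for this visit-detector."""
    return not apdb.containsVisitDetector(visit, detector, region, visit_time)


apdb = FakeApdb({(1001, 42)})
needs_1001 = needs_processing(apdb, 1001, 42, region=None, visit_time=None)
needs_1002 = needs_processing(apdb, 1002, 42, region=None, visit_time=None)
```

With a real implementation, `region` would be an lsst.sphgeom.Region and `visit_time` an astropy.time.Time, as described in the parameter list.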

abstract countUnassociatedObjects() -> int

Return the number of DiaObjects that have only one DiaSource associated with them.

Used as part of ap_verify metrics.

Returns:
count : int

Number of DiaObjects with exactly one associated DiaSource.

Notes

This method can be very inefficient or slow in some implementations.
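
To make the counted quantity concrete, the following sketch computes it from a plain list of `diaObjectId` values, one per DiaSource row. It illustrates the definition only and is not how any APDB backend computes the number; `count_unassociated` and the sample data are made up for this example.

```python
from collections import Counter


def count_unassociated(source_object_ids):
    """Count DiaObject IDs that appear exactly once among DiaSource rows."""
    per_object = Counter(source_object_ids)
    return sum(1 for n in per_object.values() if n == 1)


# diaObjectId value of each DiaSource row (illustrative data).
dia_source_object_ids = [10, 10, 11, 12, 12, 12, 13]
count = count_unassociated(dia_source_object_ids)
```

In this data, objects 11 and 13 each have a single DiaSource, so they are the ones counted.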

classmethod from_config(config: ApdbConfig) -> Apdb

Create Apdb instance from configuration object.

Parameters:
config : ApdbConfig

Configuration object. The type of this object determines the type of the Apdb implementation.

Returns:
apdb : Apdb

Instance of Apdb class.

classmethod from_uri(uri: str | ParseResult | ResourcePath | Path) -> Apdb

Make Apdb instance from a serialized configuration.

Parameters:
uri : ResourcePathExpression

URI or local file path pointing to a file with serialized configuration, or a string with a “label:” prefix. In the latter case, the configuration will be looked up from an APDB index file using the label name that follows the prefix. The APDB index file’s location is determined by the DAX_APDB_INDEX_URI environment variable.

Returns:
apdb : Apdb

Instance of Apdb class. The type of the returned instance is determined by the configuration.
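
The two forms of the `uri` argument can be told apart as in the sketch below. This is an illustration of the rule described above, not the library's own resolution code: `describe_uri`, the label `dev`, and both file paths are hypothetical, and the format of the index file is not covered here.

```python
import os

LABEL_PREFIX = "label:"


def describe_uri(uri, environ=os.environ):
    """Classify a from_uri argument as a label lookup or a direct location."""
    if uri.startswith(LABEL_PREFIX):
        label = uri[len(LABEL_PREFIX):]
        # Labels are resolved through the index named by this variable.
        index_uri = environ.get("DAX_APDB_INDEX_URI")
        if index_uri is None:
            raise RuntimeError("DAX_APDB_INDEX_URI is not set; cannot resolve label")
        return ("label", label, index_uri)
    # Anything else is treated as the location of a serialized configuration.
    return ("config", uri, None)


by_label = describe_uri(
    "label:dev", environ={"DAX_APDB_INDEX_URI": "/path/to/index.yaml"}
)
by_path = describe_uri("/path/to/apdb_config.yaml", environ={})
```

In normal use the same strings would be passed directly to `Apdb.from_uri`, which performs the lookup itself.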

abstract getConfig() -> ApdbConfig

Return APDB configuration for this instance, including any updates that may be read from database.

Returns:
config : ApdbConfig

APDB configuration.

abstract getDiaForcedSources(region: Region, object_ids: Iterable[int] | None, visit_time: Time, start_time: Time | None = None) -> DataFrame | None

Return catalog of DiaForcedSource instances from a given region.

Parameters:
region : lsst.sphgeom.Region

Region to search for DiaForcedSources.

object_ids : iterable [ int ] or None

List of DiaObject IDs to further constrain the set of returned sources. If list is empty then empty catalog is returned with a correct schema. If None then returned sources are not constrained.

visit_time : astropy.time.Time

Time of the current visit. If APDB contains records later than this time they may also be returned.

start_time : astropy.time.Time, optional

Lower bound of time window for the query. If not specified then it is calculated using visit_time and read_forced_sources_months configuration parameter.

Returns:
catalog : pandas.DataFrame or None

Catalog containing DiaForcedSource records. None is returned if start_time is not specified and read_forced_sources_months configuration parameter is set to 0.

Raises:
NotImplementedError

May be raised by some implementations if object_ids is None.

Notes

This method returns a DiaForcedSource catalog for a region with additional filtering based on DiaObject IDs. Only a subset of DiaForcedSource history is returned, limited by the read_forced_sources_months configuration parameter relative to visit_time. If object_ids is empty then an empty catalog is always returned with the correct schema (columns/types). If object_ids is None then no filtering is performed and some of the returned records may be outside the specified region.

abstract getDiaObjects(region: Region) -> DataFrame

Return catalog of DiaObject instances from a given region.

This method returns only the last version of each DiaObject, and may return only the subset of the DiaObject columns needed for AP association. Some records in a returned catalog may be outside the specified region; it is up to the client to ignore those records or clean up the catalog before further use.

Parameters:
region : lsst.sphgeom.Region

Region to search for DiaObjects.

Returns:
catalog : pandas.DataFrame

Catalog containing DiaObject records for a region that may be a superset of the specified region.
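
Since the returned catalog may cover a superset of the requested region, a client that needs an exact match has to filter it. A minimal sketch of such a clean-up for a circular region, using plain dictionaries instead of a DataFrame. The column names `ra` and `dec` (in degrees), the helper names, and the sample records are assumptions made for this example.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h)))


def inside_circle(records, center_ra, center_dec, radius_deg):
    """Keep only records whose (ra, dec) fall inside the circle."""
    return [
        r for r in records
        if angular_separation_deg(r["ra"], r["dec"], center_ra, center_dec)
        <= radius_deg
    ]


# Records as they might come back for a region slightly larger than requested.
records = [
    {"diaObjectId": 1, "ra": 10.00, "dec": -5.00},
    {"diaObjectId": 2, "ra": 10.05, "dec": -5.02},
    {"diaObjectId": 3, "ra": 11.50, "dec": -5.00},
]
kept = inside_circle(records, center_ra=10.0, center_dec=-5.0, radius_deg=0.1)
```

With a real catalog the same test could be applied row by row, or delegated to the containment methods of the lsst.sphgeom.Region that was used for the query.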

abstract getDiaObjectsForDedup(since: Time | None = None) -> DataFrame

Return catalog of DiaObject stored in APDB since specified time.

This method should be used by the deduplication algorithm to retrieve DiaObject records added to APDB since the previous deduplication (typically during the previous night). The returned catalog will have only the small subset of DiaObject attributes required by the deduplication algorithm.

Parameters:
since : astropy.time.Time, optional

Starting search time (time of previous deduplication). If not provided the time of the last deduplication stored in metadata by resetDedup method is used.

Returns:
catalog : pandas.DataFrame

Catalog containing DiaObject records. Only a subset of attributes is returned.

abstract getDiaSources(region: Region, object_ids: Iterable[int] | None, visit_time: Time, start_time: Time | None = None) -> DataFrame | None

Return catalog of DiaSource instances from a given region.

Parameters:
region : lsst.sphgeom.Region

Region to search for DiaSources.

object_ids : iterable [ int ] or None

List of DiaObject IDs to further constrain the set of returned sources. If None then returned sources are not constrained. If list is empty then empty catalog is returned with a correct schema.

visit_time : astropy.time.Time

Time of the current visit. If APDB contains records later than this time they may also be returned.

start_time : astropy.time.Time, optional

Lower bound of time window for the query. If not specified then it is calculated using visit_time and read_sources_months configuration parameter.

Returns:
catalog : pandas.DataFrame or None

Catalog containing DiaSource records. None is returned if start_time is not specified and read_sources_months configuration parameter is set to 0.

Notes

This method returns a DiaSource catalog for a region with additional filtering based on DiaObject IDs. Only a subset of DiaSource history is returned, limited by the read_sources_months configuration parameter relative to visit_time. If object_ids is empty then an empty catalog is always returned with the correct schema (columns/types). If object_ids is None then no filtering is performed and some of the returned records may be outside the specified region.
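
The interaction between `start_time`, `visit_time`, and the months setting can be sketched as follows. This page does not state how a number of months is converted to a time span, so the 30-day month used here is an assumption, and `query_window_start` is a hypothetical helper that uses `datetime` in place of astropy.time.Time.

```python
from datetime import datetime, timedelta, timezone

DAYS_PER_MONTH = 30  # assumption made for this sketch only


def query_window_start(visit_time, months, start_time=None):
    """Return the lower bound of the query window, or None if reads are off."""
    if start_time is not None:
        # An explicit start_time always wins over the configured history.
        return start_time
    if months == 0:
        # No explicit bound and zero months of history: nothing is read.
        return None
    return visit_time - timedelta(days=months * DAYS_PER_MONTH)


visit_time = datetime(2025, 1, 31, tzinfo=timezone.utc)
window_start = query_window_start(visit_time, months=12)
disabled = query_window_start(visit_time, months=0)
explicit = query_window_start(
    visit_time, months=0, start_time=datetime(2024, 6, 1, tzinfo=timezone.utc)
)
```

The `disabled` case corresponds to the situation in which the method returns None instead of a catalog.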

abstract getDiaSourcesForDiaObjects(objects: list[lsst.dax.apdb.recordIds.DiaObjectId], start_time: Time, max_dist_arcsec: float = 1.0) -> DataFrame

Return catalog of DiaSources associated with given DiaObjects.

Parameters:
objects : list [DiaObjectId]

DiaObjects associated with returned DiaSources.

start_time : astropy.time.Time

Lower bound for midpointMjdTai for returned DiaSources.

max_dist_arcsec : float, optional

Maximum expected distance in arcsec between DiaSource and DiaObject. This parameter is used to optimize spatial queries in cases when DiaObject is located near the partition boundary. If the distance from DiaObject to the boundary is smaller than max_dist_arcsec, then the neighbor partition will be included in search too.

Returns:
catalog : pandas.DataFrame

Catalog containing DiaSource records associated with the given DiaObjects.

Notes

The primary purpose of this method is to support the deduplication algorithm. Its implementation is likely to be very slow and inefficient, so it should not be used for regular queries.

abstract reassignDiaSources(idMap: Mapping[int, int]) -> None

Associate DiaSources with SSObjects, disassociating them from DiaObjects.

Parameters:
idMap : Mapping

Maps DiaSource IDs to their new SSObject IDs.

Raises:
ValueError

Raised if DiaSource ID does not exist in the database.

abstract reassignDiaSourcesToDiaObjects(idMap: Mapping[DiaSourceId, int], *, increment_nDiaSources: bool = True, decrement_nDiaSources: bool = True) -> None

Re-assign DiaSources from one DiaObject to another, typically during deduplication.

Parameters:
idMap : Mapping [DiaSourceId, int]

Mapping from DiaSources to their new diaObjectId values.

increment_nDiaSources : bool, optional

If True then increment the value of nDiaSources in DiaObjects that DiaSources are reassigned to.

decrement_nDiaSources : bool, optional

If True then decrement the value of nDiaSources in DiaObjects that DiaSources are reassigned from.

Raises:
LookupError

Raised if some of the DiaSources or DiaObjects are not found.

Notes

DiaSources initially could be associated with SSObjects. This method needs to be called before setValidityEnd.
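
The effect of the two counter flags can be illustrated with an in-memory model. This is a sketch of the bookkeeping described above, not the behavior of any particular backend: `reassign` is a hypothetical function, plain integers stand in for DiaSourceId records, and the validate-first ordering is a choice made for this example.

```python
def reassign(sources, objects, id_map, *, increment=True, decrement=True):
    """Move sources to new objects, adjusting nDiaSources counters.

    sources: dict mapping diaSourceId -> current diaObjectId
    objects: dict mapping diaObjectId -> nDiaSources
    id_map:  dict mapping diaSourceId -> new diaObjectId
    """
    # Validate everything first so a failure leaves the data untouched.
    for source_id, new_object_id in id_map.items():
        if source_id not in sources or new_object_id not in objects:
            raise LookupError(
                f"unknown source {source_id} or object {new_object_id}"
            )
    for source_id, new_object_id in id_map.items():
        old_object_id = sources[source_id]
        if decrement and old_object_id in objects:
            objects[old_object_id] -= 1
        if increment:
            objects[new_object_id] += 1
        sources[source_id] = new_object_id


# Source 503 moves from object 2 to object 1 (illustrative data).
sources = {501: 1, 502: 1, 503: 2}
objects = {1: 2, 2: 1}
reassign(sources, objects, {503: 1})
```

After the call, object 1 counts three sources and object 2 counts none, which makes object 2 a candidate for having its validity interval closed.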

abstract resetDedup(dedup_time: Time | None = None) -> None

Delete deduplication-related data and remember deduplication time. Deduplication data generated before dedup_time will be erased.

Parameters:
dedup_time : astropy.time.Time, optional

Time of the last deduplication. The current time is used if not provided.

abstract setValidityEnd(objects: list[lsst.dax.apdb.recordIds.DiaObjectId], validityEnd: Time, raise_on_missing_id: bool = False) -> int

Close validity interval for specified DiaObjects.

Parameters:
objects : list [DiaObjectId]

DiaObjects which will have their validityEnd updated, if their current validityEnd is NULL.

validityEnd : astropy.time.Time

Value for validityEnd.

raise_on_missing_id : bool, optional

If True then LookupError will be raised if any object in the list is missing from the database.

Returns:
count : int

Actual number of records for which validityEnd was updated.

Raises:
LookupError

Raised if raise_on_missing_id is True and some of the specified DiaObjects could not be found in the database.

Notes

This method has to be called after reassignDiaSourcesToDiaObjects.
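
The update rule and the returned count can be modeled in memory as below. This is a sketch under stated assumptions, not backend code: `set_validity_end` is a hypothetical function, integers stand in for DiaObjectId records, timestamps are plain strings, and None plays the role of a NULL validityEnd.

```python
def set_validity_end(table, object_ids, validity_end, raise_on_missing_id=False):
    """Close open validity intervals and return how many rows changed.

    table: dict mapping diaObjectId -> validityEnd (None means still open)
    """
    missing = [oid for oid in object_ids if oid not in table]
    if missing and raise_on_missing_id:
        raise LookupError(f"DiaObjects not found: {missing}")
    updated = 0
    for oid in object_ids:
        # Only rows whose interval is still open are touched.
        if oid in table and table[oid] is None:
            table[oid] = validity_end
            updated += 1
    return updated


table = {1: None, 2: "2025-01-01T00:00:00", 3: None}
count = set_validity_end(table, [1, 2, 99], "2025-02-01T00:00:00")
```

Here only object 1 is updated: object 2 already has a closed interval and object 99 does not exist, so the returned count is smaller than the number of requested IDs.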

abstract store(visit_time: Time, objects: DataFrame, sources: DataFrame | None = None, forced_sources: DataFrame | None = None) -> None

Store all three types of catalogs in the database.

Parameters:
visit_time : astropy.time.Time

Time of the visit.

objects : pandas.DataFrame

Catalog with DiaObject records.

sources : pandas.DataFrame, optional

Catalog with DiaSource records.

forced_sources : pandas.DataFrame, optional

Catalog with DiaForcedSource records.

Notes

This method takes DataFrame catalogs. Their schema must be compatible with the schema of the corresponding APDB tables:

  • column names must correspond to database table columns

  • types and units of the columns must match database definitions, no unit conversion is performed presently

  • columns that have default values in database schema can be omitted from catalog

  • this method knows how to fill the interval-related columns of DiaObject (validityStart, validityEnd), so they do not need to appear in a catalog

  • source catalogs have diaObjectId column associating sources with objects

This operation need not be atomic, but DiaSources and DiaForcedSources will not be stored until all DiaObjects are stored.
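
The column rules in the list above lend themselves to a quick pre-flight check before calling `store`. A minimal sketch that works on column names only: `check_catalog_columns` is a hypothetical helper, the table column list is illustrative rather than the real DiaObject schema, and treating `nDiaSources` as a defaulted column is an assumption.

```python
def check_catalog_columns(catalog_columns, table_columns, defaulted=(), auto_filled=()):
    """Return (unknown, missing) column names for a catalog.

    unknown: catalog columns that have no matching table column
    missing: table columns absent from the catalog that are neither
             defaulted in the schema nor filled in automatically
    """
    catalog = set(catalog_columns)
    table = set(table_columns)
    unknown = sorted(catalog - table)
    optional = set(defaulted) | set(auto_filled)
    missing = sorted(table - catalog - optional)
    return unknown, missing


# Illustrative table definition, not the actual APDB schema.
table_columns = [
    "diaObjectId", "ra", "dec", "nDiaSources", "validityStart", "validityEnd",
]
unknown, missing = check_catalog_columns(
    catalog_columns=["diaObjectId", "ra", "decl"],
    table_columns=table_columns,
    defaulted=["nDiaSources"],
    auto_filled=["validityStart", "validityEnd"],
)
```

The misspelled `decl` column is reported as unknown and `dec` as missing. With a pandas catalog, `catalog_columns` would simply be the DataFrame's `columns`; types and units still have to be checked separately, since no conversion is performed.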

abstract tableDef(table: ApdbTables) -> Table | None

Return table schema definition for a given table.

Parameters:
table : ApdbTables

One of the known APDB tables.

Returns:
tableSchema : schema_model.Table or None

Table schema description. None is returned if the table is not defined by this implementation.