ApdbCassandra¶
- class lsst.dax.apdb.ApdbCassandra(config: ApdbCassandraConfig)¶
- Bases: - Apdb- Implementation of APDB database on to of Apache Cassandra. - The implementation is configured via standard - pex_configmechanism using- ApdbCassandraConfigconfiguration class. For an example of different configurations check config/ folder.- Parameters:
- configApdbCassandraConfig
- Configuration object. 
 
- config
 - Attributes Summary - Object controlling access to APDB metadata ( - ApdbMetadata).- Name of the metadata key to store code version number. - Name of the metadata key to store code version number. - Name of the metadata key to store replica code version number. - Name of the metadata key to store schema version number. - Start time for partition 0, this should never be changed. - Methods Summary - Return version number for current APDB implementation. - Return schema version number as defined in config file. - containsVisitDetector(visit, detector)- Test whether data for a given visit-detector is present in the APDB. - Return the number of DiaObjects that have only one DiaSource associated with them. - dailyJob()- Implement daily activities like cleanup/vacuum. - from_config(config)- Create Ppdb instance from configuration object. - from_uri(uri)- Make Apdb instance from a serialized configuration. - getDiaForcedSources(region, object_ids, ...)- Return catalog of DiaForcedSource instances from a given region. - getDiaObjects(region)- Return catalog of DiaObject instances from a given region. - getDiaSources(region, object_ids, visit_time)- Return catalog of DiaSource instances from a given region. - Return catalog of SSObject instances. - Return - ApdbReplicainstance for this database.- init_database(hosts, keyspace, *[, ...])- Initialize new APDB instance and make configuration object for it. - makeField(doc)- Make a - ConfigurableFieldfor Apdb.- reassignDiaSources(idMap)- Associate DiaSources with SSObjects, dis-associating them from DiaObjects. - store(visit_time, objects[, sources, ...])- Store all three types of catalogs in the database. - storeSSObjects(objects)- Store or update SSObject catalog. - tableDef(table)- Return table schema definition for a given table. - Attributes Documentation - metadata¶
 - metadataCodeVersionKey = 'version:ApdbCassandra'¶
- Name of the metadata key to store code version number. 
 - metadataConfigKey = 'config:apdb-cassandra.json'¶
- Name of the metadata key to store code version number. 
 - metadataReplicaVersionKey = 'version:ApdbCassandraReplica'¶
- Name of the metadata key to store replica code version number. 
 - metadataSchemaVersionKey = 'version:schema'¶
- Name of the metadata key to store schema version number. 
 - partition_zero_epoch = <Time object: scale='tai' format='unix_tai' value=0.0>¶
- Start time for partition 0, this should never be changed. 
 - Methods Documentation - classmethod apdbImplementationVersion() VersionTuple¶
- Return version number for current APDB implementation. - Returns:
- versionVersionTuple
- Version of the code defined in implementation class. 
 
- version
 
 - apdbSchemaVersion() VersionTuple¶
- Return schema version number as defined in config file. - Returns:
- versionVersionTuple
- Version of the schema defined in schema config file. 
 
- version
 
 - containsVisitDetector(visit: int, detector: int) bool¶
- Test whether data for a given visit-detector is present in the APDB. 
 - countUnassociatedObjects() int¶
- Return the number of DiaObjects that have only one DiaSource associated with them. - Used as part of ap_verify metrics. - Returns:
- countint
- Number of DiaObjects with exactly one associated DiaSource. 
 
- count
 - Notes - This method can be very inefficient or slow in some implementations. 
 - dailyJob() None¶
- Implement daily activities like cleanup/vacuum. - What should be done during daily activities is determined by specific implementation. 
 - classmethod from_config(config: ApdbConfig) Apdb¶
- Create Ppdb instance from configuration object. - Parameters:
- configApdbConfig
- Configuration object, type of this object determines type of the Apdb implementation. 
 
- config
- Returns:
- apdbapdb
- Instance of - Apdbclass.
 
- apdb
 
 - classmethod from_uri(uri: str | ParseResult | ResourcePath | Path) Apdb¶
- Make Apdb instance from a serialized configuration. - Parameters:
- uriResourcePathExpression
- URI or local file path pointing to a file with serialized configuration, or a string with a “label:” prefix. In the latter case, the configuration will be looked up from an APDB index file using the label name that follows the prefix. The APDB index file’s location is determined by the - DAX_APDB_INDEX_URIenvironment variable.
 
- uri
- Returns:
- apdbapdb
- Instance of - Apdbclass, the type of the returned instance is determined by configuration.
 
- apdb
 
 - getDiaForcedSources(region: Region, object_ids: Iterable[int] | None, visit_time: Time) DataFrame | None¶
- Return catalog of DiaForcedSource instances from a given region. - Parameters:
- regionlsst.sphgeom.Region
- Region to search for DIASources. 
- object_idsiterable [ int], optional
- List of DiaObject IDs to further constrain the set of returned sources. If list is empty then empty catalog is returned with a correct schema. If - Nonethen returned sources are not constrained. Some implementations may not support latter case.
- visit_timeastropy.time.Time
- Time of the current visit. 
 
- region
- Returns:
- catalogpandas.DataFrame, orNone
- Catalog containing DiaSource records. - Noneis returned if- read_forced_sources_monthsconfiguration parameter is set to 0.
 
- catalog
- Raises:
- NotImplementedError
- May be raised by some implementations if - object_idsis- None.
 
 - Notes - This method returns DiaForcedSource catalog for a region with additional filtering based on DiaObject IDs. Only a subset of DiaSource history is returned limited by - read_forced_sources_monthsconfig parameter, w.r.t.- visit_time. If- object_idsis empty then an empty catalog is always returned with the correct schema (columns/types). If- object_idsis- Nonethen no filtering is performed and some of the returned records may be outside the specified region.
 - getDiaObjects(region: Region) DataFrame¶
- Return catalog of DiaObject instances from a given region. - This method returns only the last version of each DiaObject. Some records in a returned catalog may be outside the specified region, it is up to a client to ignore those records or cleanup the catalog before futher use. - Parameters:
- regionlsst.sphgeom.Region
- Region to search for DIAObjects. 
 
- region
- Returns:
- catalogpandas.DataFrame
- Catalog containing DiaObject records for a region that may be a superset of the specified region. 
 
- catalog
 
 - getDiaSources(region: Region, object_ids: Iterable[int] | None, visit_time: Time) DataFrame | None¶
- Return catalog of DiaSource instances from a given region. - Parameters:
- regionlsst.sphgeom.Region
- Region to search for DIASources. 
- object_idsiterable [ int], optional
- List of DiaObject IDs to further constrain the set of returned sources. If - Nonethen returned sources are not constrained. If list is empty then empty catalog is returned with a correct schema.
- visit_timeastropy.time.Time
- Time of the current visit. 
 
- region
- Returns:
- catalogpandas.DataFrame, orNone
- Catalog containing DiaSource records. - Noneis returned if- read_sources_monthsconfiguration parameter is set to 0.
 
- catalog
 - Notes - This method returns DiaSource catalog for a region with additional filtering based on DiaObject IDs. Only a subset of DiaSource history is returned limited by - read_sources_monthsconfig parameter, w.r.t.- visit_time. If- object_idsis empty then an empty catalog is always returned with the correct schema (columns/types). If- object_idsis- Nonethen no filtering is performed and some of the returned records may be outside the specified region.
 - getSSObjects() DataFrame¶
- Return catalog of SSObject instances. - Returns:
- catalogpandas.DataFrame
- Catalog containing SSObject records, all existing records are returned. 
 
- catalog
 
 - get_replica() ApdbCassandraReplica¶
- Return - ApdbReplicainstance for this database.
 - classmethod init_database(hosts: list[str], keyspace: str, *, schema_file: str | None = None, schema_name: str | None = None, read_sources_months: int | None = None, read_forced_sources_months: int | None = None, use_insert_id: bool = False, use_insert_id_skips_diaobjects: bool = False, port: int | None = None, username: str | None = None, prefix: str | None = None, part_pixelization: str | None = None, part_pix_level: int | None = None, time_partition_tables: bool = True, time_partition_start: str | None = None, time_partition_end: str | None = None, read_consistency: str | None = None, write_consistency: str | None = None, read_timeout: int | None = None, write_timeout: int | None = None, ra_dec_columns: list[str] | None = None, replication_factor: int | None = None, drop: bool = False) ApdbCassandraConfig¶
- Initialize new APDB instance and make configuration object for it. - Parameters:
- hostslist[str]
- List of host names or IP addresses for Cassandra cluster. 
- keyspacestr
- Name of the keyspace for APDB tables. 
- schema_filestr, optional
- Location of (YAML) configuration file with APDB schema. If not specified then default location will be used. 
- schema_namestr, optional
- Name of the schema in YAML configuration file. If not specified then default name will be used. 
- read_sources_monthsint, optional
- Number of months of history to read from DiaSource. 
- read_forced_sources_monthsint, optional
- Number of months of history to read from DiaForcedSource. 
- use_insert_idbool, optional
- If True, make additional tables used for replication to PPDB. 
- use_insert_id_skips_diaobjectsbool, optional
- If - Truethen do not fill regular- DiaObjecttable when- use_insert_idis- True.
- portint, optional
- Port number to use for Cassandra connections. 
- usernamestr, optional
- User name for Cassandra connections. 
- prefixstr, optional
- Optional prefix for all table names. 
- part_pixelizationstr, optional
- Name of the MOC pixelization used for partitioning. 
- part_pix_levelint, optional
- Pixelization level. 
- time_partition_tablesbool, optional
- Create per-partition tables. 
- time_partition_startstr, optional
- Starting time for per-partition tables, in yyyy-mm-ddThh:mm:ss format, in TAI. 
- time_partition_endstr, optional
- Ending time for per-partition tables, in yyyy-mm-ddThh:mm:ss format, in TAI. 
- read_consistencystr, optional
- Name of the consistency level for read operations. 
- write_consistencystr, optional
- Name of the consistency level for write operations. 
- read_timeoutint, optional
- Read timeout in seconds. 
- write_timeoutint, optional
- Write timeout in seconds. 
- ra_dec_columnslist[str], optional
- Names of ra/dec columns in DiaObject table. 
- replication_factorint, optional
- Replication factor used when creating new keyspace, if keyspace already exists its replication factor is not changed. 
- dropbool, optional
- If - Truethen drop existing tables before re-creating the schema.
 
- hosts
- Returns:
- configApdbCassandraConfig
- Resulting configuration object for a created APDB instance. 
 
- config
 
 - classmethod makeField(doc: str) ConfigurableField¶
- Make a - ConfigurableFieldfor Apdb.- Parameters:
- docstr
- Help text for the field. 
 
- doc
- Returns:
- configurableFieldlsst.pex.config.ConfigurableField
- A - ConfigurableFieldfor Apdb.
 
- configurableField
 
 - reassignDiaSources(idMap: Mapping[int, int]) None¶
- Associate DiaSources with SSObjects, dis-associating them from DiaObjects. - Parameters:
- idMapMapping
- Maps DiaSource IDs to their new SSObject IDs. 
 
- idMap
- Raises:
- ValueError
- Raised if DiaSource ID does not exist in the database. 
 
 
 - store(visit_time: Time, objects: DataFrame, sources: DataFrame | None = None, forced_sources: DataFrame | None = None) None¶
- Store all three types of catalogs in the database. - Parameters:
- visit_timeastropy.time.Time
- Time of the visit. 
- objectspandas.DataFrame
- Catalog with DiaObject records. 
- sourcespandas.DataFrame, optional
- Catalog with DiaSource records. 
- forced_sourcespandas.DataFrame, optional
- Catalog with DiaForcedSource records. 
 
- visit_time
 - Notes - This methods takes DataFrame catalogs, their schema must be compatible with the schema of APDB table: - column names must correspond to database table columns 
- types and units of the columns must match database definitions, no unit conversion is performed presently 
- columns that have default values in database schema can be omitted from catalog 
- this method knows how to fill interval-related columns of DiaObject (validityStart, validityEnd) they do not need to appear in a catalog 
- source catalogs have - diaObjectIdcolumn associating sources with objects
 
 - storeSSObjects(objects: DataFrame) None¶
- Store or update SSObject catalog. - Parameters:
- objectspandas.DataFrame
- Catalog with SSObject records. 
 
- objects
 - Notes - If SSObjects with matching IDs already exist in the database, their records will be updated with the information from provided records. 
 - tableDef(table: ApdbTables) Table | None¶
- Return table schema definition for a given table. - Parameters:
- tableApdbTables
- One of the known APDB tables. 
 
- table
- Returns:
- tableSchemaschema_model.TableorNone
- Table schema description, - Noneis returned if table is not defined by this implementation.
 
- tableSchema