IngestIndexedReferenceTask

IngestIndexedReferenceTask converts an external catalog for use as an LSST Science Pipelines reference catalog, using a Hierarchical Triangular Mesh (HTM) indexing scheme. The format and layout of the input data are configurable. The output data is a collection of lsst.afw.table.SimpleCatalog files identified by their HTM pixel. This task is not available as a command-line task: see How to generate an LSST reference catalog for how to run the task.

Processing summary

IngestIndexedReferenceTask uses Python multiprocessing to ingest multiple files in parallel, configured by n_processes. Once it has generated the necessary multiprocessing file locks (one per output file: ~130,000 files for HTM depth=7), it performs the following steps for each input file:

  1. Reads the file using the configured file_reader subtask (default: ReadTextCatalogTask).
  2. Indexes the coordinates in the input data to determine which mesh pixel each source belongs to, and thus which output file it will be written to.
  3. Loops over the output pixels in this input file (where N is the number of sources in this pixel):
     1. Acquires the lock for this output file.
     2. Reads an existing output file and appends N new empty rows, or generates a new empty catalog with N rows.
     3. Fills in the empty rows of the catalog with the converted values from the input data.
     4. Writes the output file and releases the file lock.
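
The per-file steps above can be sketched in plain Python. This is only an illustration of the lock, read-or-create, append, write pattern, not the task's implementation: toy_pixel_id is a hypothetical stand-in for HTM indexing (it splits the sky into 8 coarse bins), JSON files stand in for SimpleCatalog files, and the calls run sequentially rather than in a multiprocessing pool.

```python
import json
import multiprocessing
import os
import tempfile
from collections import defaultdict


def toy_pixel_id(ra, dec):
    """Hypothetical stand-in for HTM indexing: 4 RA bins x 2 Dec hemispheres."""
    return int(ra // 90) * 2 + (1 if dec >= 0 else 0)


def ingest_rows(rows, out_dir, locks):
    """Append rows to one output file per pixel, holding that file's lock."""
    # Index the coordinates to find which output pixel each row belongs to.
    by_pixel = defaultdict(list)
    for row in rows:
        by_pixel[toy_pixel_id(row["ra"], row["dec"])].append(row)

    # Loop over the output pixels touched by this input file.
    for pixel, new_rows in by_pixel.items():
        path = os.path.join(out_dir, f"{pixel}.json")
        with locks[pixel]:  # acquired here, released on leaving the block
            # Read the existing output file, or start a new empty catalog.
            if os.path.exists(path):
                with open(path) as f:
                    catalog = json.load(f)
            else:
                catalog = []
            catalog.extend(new_rows)  # add the N new rows for this pixel
            with open(path, "w") as f:
                json.dump(catalog, f)
    return sorted(by_pixel)


# One lock per possible output file, created before any ingestion starts.
with tempfile.TemporaryDirectory() as out_dir:
    locks = {pixel: multiprocessing.Lock() for pixel in range(8)}
    ingest_rows([{"ra": 10.0, "dec": 5.0}], out_dir, locks)
    ingest_rows([{"ra": 20.0, "dec": 15.0}, {"ra": 200.0, "dec": -5.0}], out_dir, locks)
    print(sorted(os.listdir(out_dir)))
```

Holding the lock across the read, append, and write keeps two workers from interleaving updates to the same output file when their input files overlap on the sky.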

Python API summary

from lsst.meas.algorithms.ingestIndexReferenceTask import IngestIndexedReferenceTask
class IngestIndexedReferenceTask(*args, butler=None, **kwargs)

Class for producing and loading indexed reference catalogs...

attribute config

Access configuration fields and retargetable subtasks.

See also

See the IngestIndexedReferenceTask API reference for complete details.

Butler datasets

IngestIndexedReferenceTask does not behave in the same manner as most LSST Tasks. When run directly through the createIndexedCatalog method, IngestIndexedReferenceTask reads input from a collection of non-LSST files, and persists outputs to an output Butler data repository. Note that the configuration of IngestIndexedReferenceTask and its subtasks affects the content of the output dataset.

Output datasets

ref_cat
An LSST-style reference catalog, consisting of one lsst.afw.table.SimpleCatalog per HTM pixel.
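
As a rough guide to how many output files to expect, the number of HTM pixels at a given depth can be computed directly. This sketch assumes the standard HTM construction (8 base triangles, each split into 4 at every level) and that the depth counts subdivision levels; htm_pixel_count is a hypothetical helper, not part of the LSST API.

```python
def htm_pixel_count(depth):
    """Number of HTM pixels at a given depth: 8 base triangles, 4x per level."""
    return 8 * 4 ** depth


# Depth 7 gives the "~130,000 files" quoted in the processing summary.
print(htm_pixel_count(7))  # 131072
```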

Retargetable subtasks

file_reader

Default
lsst.meas.algorithms.readTextCatalogTask.ReadTextCatalogTask
Field type
ConfigurableField
Task to use to read the files. Default is to expect text files.

Configuration fields

coord_err_unit

Default
None
Field type
str Field (optional)
Unit of RA/Dec error fields (astropy.units.Unit compatible)

dataset_config

Data type
lsst.meas.algorithms.ingestIndexReferenceTask.DatasetConfig
Field type
ConfigField
Configuration for reading the ingested data

dec_err_name

Default
None
Field type
str Field (optional)
Name of Dec error column

dec_name

Default
None
Field type
str Field
Name of Dec column (values in decimal degrees)

epoch_format

Default
None
Field type
str Field (optional)
Format of epoch column: any value accepted by astropy.time.Time, e.g. ‘iso’ or ‘unix’

epoch_name

Default
None
Field type
str Field (optional)
Name of epoch column

epoch_scale

Default
None
Field type
str Field (optional)
Scale of epoch column: any value accepted by astropy.time.Time, e.g. ‘utc’

extra_col_names

Default
[]
Field type
str ListField
Extra columns to add to the reference catalog.

id_name

Default
None
Field type
str Field (optional)
Name of column to use as an identifier (optional).

is_photometric_name

Default
None
Field type
str Field (optional)
Name of column stating if satisfactory for photometric calibration (optional).

is_resolved_name

Default
None
Field type
str Field (optional)
Name of column stating if the object is resolved (optional).

is_variable_name

Default
None
Field type
str Field (optional)
Name of column stating if the object is measured to be variable (optional).

mag_column_list

Default
None
Field type
str ListField
List of column names to use for photometric information. At least one entry is required. The values in the reference catalog are assumed to be in AB magnitudes.

mag_err_column_map

Default
{}
Field type
DictField
Key type
str
Value type
str
A map of magnitude column name (key) to magnitude error column (value).

n_processes

Default
1
Field type
int Field
Number of Python processes to use when ingesting.

parallax_err_name

Default
None
Field type
str Field (optional)
Name of parallax error column

parallax_name

Default
None
Field type
str Field (optional)
Name of parallax column

parallax_scale

Default
1.0
Field type
float Field
Scale factor by which to multiply parallax values to obtain units of milliarcsec

pm_dec_err_name

Default
None
Field type
str Field (optional)
Name of proper motion Dec error column

pm_dec_name

Default
None
Field type
str Field (optional)
Name of proper motion Dec column

pm_ra_err_name

Default
None
Field type
str Field (optional)
Name of proper motion RA error column

pm_ra_name

Default
None
Field type
str Field (optional)
Name of proper motion RA column

pm_scale

Default
1.0
Field type
float Field
Scale factor by which to multiply proper motion values to obtain units of milliarcsec/year
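
To illustrate how parallax_scale and pm_scale are meant to be chosen, the following sketch applies a scale factor to input values, assuming a hypothetical input catalog that stores proper motions in arcsec/year. The helper name and the sample values are made up for illustration; a catalog already in milliarcsec/year would keep the default of 1.0.

```python
import math


def apply_scale(values, scale):
    """Multiply input values by a scale factor to obtain milliarcsec units."""
    return [value * scale for value in values]


# Hypothetical input proper motions in arcsec/year.
pm_arcsec_per_year = [0.5, -0.25]

# 1 arcsec = 1000 milliarcsec, so this catalog would need pm_scale = 1000.0.
pm_scale = 1000.0
pm_mas_per_year = apply_scale(pm_arcsec_per_year, pm_scale)
print(pm_mas_per_year)
```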

ra_err_name

Default
None
Field type
str Field (optional)
Name of RA error column

ra_name

Default
None
Field type
str Field
Name of RA column (values in decimal degrees)
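
The fields above are typically set together in a configuration override. The sketch below shows what such a set of overrides might look like; all column names are hypothetical, and the SimpleNamespace object is only a stand-in so the snippet runs on its own (in a real override file the config object would be supplied by the task framework). The final check, that every key of mag_err_column_map also appears in mag_column_list, is a sanity check suggested here, not a statement of what the task validates.

```python
from types import SimpleNamespace

# Stand-in for the task's config object (assumption for this sketch only).
config = SimpleNamespace()

# Required coordinate columns, in decimal degrees.
config.ra_name = "ra"
config.dec_name = "dec"
config.id_name = "source_id"

# Photometry: AB magnitudes, with an error column for each magnitude column.
config.mag_column_list = ["g_mag", "r_mag"]
config.mag_err_column_map = {"g_mag": "g_mag_err", "r_mag": "r_mag_err"}

# Proper motions stored in arcsec/year in this hypothetical catalog.
config.pm_ra_name = "pm_ra"
config.pm_dec_name = "pm_dec"
config.pm_scale = 1000.0

# Epoch column and how astropy.time.Time should interpret it.
config.epoch_name = "obs_time"
config.epoch_format = "iso"
config.epoch_scale = "utc"

# Ingest several input files in parallel.
config.n_processes = 4

# Sanity check: error columns should only be mapped from listed magnitudes.
missing = set(config.mag_err_column_map) - set(config.mag_column_list)
assert not missing, f"error columns mapped from unlisted magnitudes: {missing}"
```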

Examples

See How to generate an LSST reference catalog for a description of how to run the task to ingest the Gaia DR2 catalog.