Getting started with the AP pipeline (Gen 2)
This page explains how to set up a Gen 2 data repository that can then be processed with the AP Pipeline (see Running the AP pipeline (Gen 2)). This is the established Science Pipelines workflow, and is compatible with a variety of existing pipelines and tools. However, it is expected to be phased out in the future in favor of the Gen 3 framework.
If you already have a Gen 3 data repository or want to learn the new framework, see Getting started with the AP pipeline (Gen 3).
Installation
lsst.ap.pipe is available from the LSST Science Pipelines. It is installed as part of the lsst_apps and lsst_distrib metapackages.
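As a quick sanity check after installation, the following minimal sketch tests whether lsst.ap.pipe can be located by the current Python interpreter. It assumes the Science Pipelines environment has already been set up in the shell that launches Python; the helper name is_importable is illustrative and not part of any LSST package.

```python
import importlib.util


def is_importable(module_name):
    """Return True if `module_name` can be located by the import system.

    find_spec imports any parent packages (e.g. `lsst` and `lsst.ap`)
    but does not execute the target module itself.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return False


if __name__ == "__main__":
    name = "lsst.ap.pipe"
    status = "available" if is_importable(name) else "NOT available"
    print(f"{name} is {status} in this environment")
```

If the module is reported as not available, the most likely cause is that the Science Pipelines have not been set up in the current shell.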
Ingesting data files
LSST-style image processing typically operates on Butler repositories and does not directly interface with data files. lsst.ap.pipe is no exception. The process of turning a set of raw data files and corresponding calibration products into a format the Butler understands is called ingestion. Ingestion can be somewhat camera-specific, and is outside the scope of the AP Pipeline.
A utility to ingest data before running lsst.ap.pipe is available in ap_verify. However, this works only on datasets that adhere to the ap_verify dataset format. Alternatively, you may use a pre-ingested dataset or manually ingest files yourself following the directions for a given obs_ package.
A standard ingestion workflow for DECam looks something like:
ingestImagesDecam.py input_loc --filetype raw path/to/raw/files --mode=link
ingestCuratedCalibs.py input_loc --calib calib_loc $OBS_DECAM_DATA_DIR/decam/defects
ingestCuratedCalibs.py input_loc --calib calib_loc $OBS_DECAM_DATA_DIR/decam/crosstalk
ingestCalibs.py input_loc --calib calib_loc /path/to/biases/and/flats --mode=link --validity 999
Required data products
For the AP Pipeline to successfully process data, the following are required:
- Raw science images and reference catalogs ingested into a main Butler repository
  - The reference catalogs must be in a directory called ref_cats with subdirectories for each catalog containing the appropriate catalog shards. We recommend using Pan-STARRS for photometry and Gaia for astrometry. An example config file for using these two catalogs can be found in the ap_verify_hits2015 repository.
- Calibration products (biases, flats, and possibly others) ingested into a Butler repository that you must specify with the --calib flag on the command line at runtime
  - To check if this requirement has been satisfied, you can inspect the calibRegistry.sqlite3 created in this repository and ensure the information in the tables is accurate
- Template images (of type deepCoadd by default) for difference imaging must be either in the main Butler repository or in another location you may specify with the --template flag on the command line at runtime
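Two of the checks above (the ref_cats directory layout and the contents of calibRegistry.sqlite3) can be sketched with the Python standard library alone. This is a minimal illustration, not an official tool: the function names are hypothetical, the sketch assumes ref_cats sits directly under the main repository root, and it makes no assumption about which tables the registry contains, so it simply lists whatever tables exist and how many rows each holds.

```python
import sqlite3
from pathlib import Path


def list_refcats(repo_dir):
    """Return the names of the catalog subdirectories under <repo_dir>/ref_cats."""
    ref_cats = Path(repo_dir) / "ref_cats"
    if not ref_cats.is_dir():
        return []
    return sorted(p.name for p in ref_cats.iterdir() if p.is_dir())


def summarize_calib_registry(calib_dir):
    """Return {table_name: row_count} for <calib_dir>/calibRegistry.sqlite3."""
    registry = Path(calib_dir) / "calibRegistry.sqlite3"
    if not registry.is_file():
        raise FileNotFoundError(f"No calibration registry at {registry}")
    conn = sqlite3.connect(str(registry))
    try:
        # Table names are read from the database itself, not assumed.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        return {
            table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            for table in tables
        }
    finally:
        conn.close()


if __name__ == "__main__":
    # "input_loc" and "calib_loc" are the placeholder paths from the
    # ingestion example; substitute your own repository locations.
    print("Reference catalogs:", list_refcats("input_loc"))
    try:
        print("Calibration tables:", summarize_calib_registry("calib_loc"))
    except FileNotFoundError as err:
        print(err)
```

An empty catalog list or a table with zero rows suggests that the corresponding ingestion step did not complete. Row counts alone do not confirm that the entries are accurate, so the table contents should still be inspected.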
A sample dataset from the DECam HiTS survey that works with ap_pipe is available as ap_verify_hits2015; it follows the ap_verify dataset format (see The dataset framework). However, this dataset must be ingested as described in Ingesting data files, and the reference catalog files must be decompressed and extracted.
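The decompress-and-extract step can be sketched as follows. This is an illustration under stated assumptions: it assumes the reference catalogs are shipped as gzip-compressed tar archives with a .tar.gz suffix, and both the function name extract_refcats and the directory arguments are hypothetical. Check the actual archive format and location in your copy of the dataset before relying on it.

```python
import tarfile
from pathlib import Path


def extract_refcats(archive_dir, ref_cats_dir):
    """Extract every *.tar.gz archive in archive_dir into ref_cats_dir.

    Returns the sorted list of archive file names that were extracted.
    Only use this on archives from a source you trust, since tar files
    can contain arbitrary paths.
    """
    dest = Path(ref_cats_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(archive_dir).glob("*.tar.gz")):
        with tarfile.open(archive, mode="r:gz") as tar:
            tar.extractall(path=dest)
        extracted.append(archive.name)
    return extracted
```

Each archive is unpacked into the destination directory as-is, so an archive whose top-level entry is a catalog directory yields one subdirectory per catalog, matching the ref_cats layout described under Required data products.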
Please continue to Pipeline Tutorial for more details about running the AP Pipeline and interpreting the results.