Running ap_verify from the command line
ap_verify.py is a Python script designed to be run on both developer machines and verification servers.
While ap_verify.py is not a command-line task, the command-line interface is designed to resemble that of command-line tasks where practical.
This page describes the most common options used to run ap_verify.
For more details, see the ap_verify command-line reference or run ap_verify.py -h.
How to run ap_verify in a new workspace (Gen 2 pipeline)
Using the Cosmos PDR2 CI dataset as an example, one can run ap_verify.py as follows:
ap_verify.py --dataset ap_verify_ci_cosmos_pdr2 --gen2 --id "visit=59150^59160 filter=HSC-G" --output workspaces/cosmos/
Here the inputs are:
ap_verify_ci_cosmos_pdr2 is the ap_verify dataset to process,
--gen2 specifies that the dataset be processed using the Gen 2 pipeline framework,
visit=59150^59160 filter=HSC-G is the dataId to process,
while the output is:
workspaces/cosmos/ is the location where the pipeline will create any Butler repositories necessary.
This call will create a new directory at workspaces/cosmos, ingest the Cosmos data into a new repository based on <cosmos-data>/repo/, then run visits 59150 and 59160 through the entire AP pipeline.
It’s also possible to run an entire dataset by omitting the --id argument (as some datasets are very large, do this with caution):
ap_verify.py --dataset ap_verify_ci_cosmos_pdr2 --gen2 --output workspaces/cosmos/
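When runs are launched from a script, the commands above can be assembled programmatically. The following is a minimal sketch and not part of ap_verify: the helper name build_gen2_command and its parameters are hypothetical, and it uses only the flags shown in this section. It assumes that multiple visits are joined with ^, as in the example data ID.

```python
import shlex


def build_gen2_command(dataset, output, visits=None, filter_name=None):
    """Return an ap_verify.py argument list for a Gen 2 run.

    Hypothetical helper for illustration; it only uses flags shown on this page.
    """
    command = ["ap_verify.py", "--dataset", dataset, "--gen2"]
    id_terms = []
    if visits:
        # Several values for one key are joined with "^", as in the example above.
        id_terms.append("visit=" + "^".join(str(v) for v in visits))
    if filter_name:
        id_terms.append(f"filter={filter_name}")
    if id_terms:
        command += ["--id", " ".join(id_terms)]
    # With no --id argument, the entire dataset is processed.
    command += ["--output", output]
    return command


gen2_command = build_gen2_command(
    "ap_verify_ci_cosmos_pdr2",
    "workspaces/cosmos/",
    visits=[59150, 59160],
    filter_name="HSC-G",
)
# Print a shell-quoted form of the command for inspection.
print(shlex.join(gen2_command))
```

In an environment where ap_verify is set up, the resulting list could be passed to subprocess.run(gen2_command, check=True).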
Note
The command-line interface for ap_verify.py is at present more limited than those of command-line tasks. See the ap_verify command-line reference for details.
How to run ap_verify in a new workspace (Gen 3 pipeline)
Using the Cosmos PDR2 CI dataset as an example, one can run ap_verify.py as follows:
ap_verify.py --dataset ap_verify_ci_cosmos_pdr2 --gen3 --data-query "visit in (59150, 59160) and band='g'" --output workspaces/cosmos/
Here the inputs are:
ap_verify_ci_cosmos_pdr2 is the ap_verify dataset to process,
--gen3 specifies that the dataset be processed using the Gen 3 pipeline framework,
visit in (59150, 59160) and band='g' is the data ID query to process,
while the output is:
workspaces/cosmos/ is the location where the pipeline will create a Butler repository along with other outputs such as the alert production database.
This call will create a new directory at workspaces/cosmos, ingest the Cosmos data into a new repository, then run visits 59150 and 59160 through the entire AP pipeline.
It’s also possible to run an entire dataset by omitting the --data-query argument (as some datasets are very large, do this with caution):
ap_verify.py --dataset ap_verify_ci_cosmos_pdr2 --gen3 --output workspaces/cosmos/
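The Gen 3 data ID query is an expression string, so it is easy to mistype when written by hand. The sketch below shows one way to build the query and the command line from Python values. The helper names build_data_query and build_gen3_command are hypothetical, and only the visit and band constraints from the example above are covered.

```python
def build_data_query(visits=None, band=None):
    """Return a data ID query string, or None if there are no constraints.

    Hypothetical helper; it reproduces only the query form shown on this page.
    """
    terms = []
    if visits:
        terms.append("visit in (" + ", ".join(str(v) for v in visits) + ")")
    if band:
        terms.append(f"band='{band}'")
    return " and ".join(terms) if terms else None


def build_gen3_command(dataset, output, data_query=None):
    """Return an ap_verify.py argument list for a Gen 3 run."""
    command = ["ap_verify.py", "--dataset", dataset, "--gen3"]
    if data_query:
        command += ["--data-query", data_query]
    # With no --data-query argument, the entire dataset is processed.
    command += ["--output", output]
    return command


query = build_data_query(visits=[59150, 59160], band="g")
gen3_command = build_gen3_command(
    "ap_verify_ci_cosmos_pdr2", "workspaces/cosmos/", query
)
print(query)
```

Passing the command as a list (for example to subprocess.run) avoids shell quoting problems with the parentheses and quotes inside the query.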
Note
Because the science pipelines are still being converted to Gen 3, Gen 3 processing may not be supported for all ap_verify datasets. See the individual dataset’s documentation for more details.
How to run ingestion by itself
ap_verify includes a separate program, ingest_dataset.py, that ingests datasets into repositories but does not run the pipeline on them.
This is useful if the data need special processing or as a precursor to massive processing runs.
Running ap_verify.py with the same arguments as a previous run of ingest_dataset.py will automatically skip ingestion.
Using the Cosmos PDR2 dataset as an example, one can run ingest_dataset.py in Gen 2 as follows:
ingest_dataset.py --dataset ap_verify_ci_cosmos_pdr2 --gen2 --output workspaces/cosmos/
The --dataset, --output, --gen2, --gen3, and --processes arguments behave the same way as for ap_verify.py.
Other options from ap_verify.py are not available.
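Because ap_verify.py skips ingestion only when it receives the same arguments as the earlier ingest_dataset.py run, it can help to define those arguments once and reuse them. The following sketch illustrates the idea; the variable names are hypothetical and the argument values are taken from the example above.

```python
import shlex

# Arguments accepted by both programs; defined once so the two runs match.
shared_args = [
    "--dataset", "ap_verify_ci_cosmos_pdr2",
    "--gen2",
    "--output", "workspaces/cosmos/",
]

# Stage 1: ingest only. Stage 2: run the pipeline on the same workspace.
ingest_command = ["ingest_dataset.py", *shared_args]
verify_command = ["ap_verify.py", *shared_args]

for command in (ingest_command, verify_command):
    print(shlex.join(command))
```

Each list could then be run in turn, for example with subprocess.run(command, check=True), in an environment where ap_verify is set up.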
How to use measurements of metrics (Gen 2 pipeline)
After ap_verify has run, it will produce files named, by default, ap_verify.<dataId>.verify.json in the caller’s directory.
The file name may be customized using the --metrics-file command-line argument.
These files contain metric measurements in lsst.verify format, and can be loaded and read as described in the lsst.verify documentation or in SQR-019.
If the pipeline is interrupted by a fatal error, completed measurements will be saved to metrics files for debugging purposes. See the error-handling policy for details.
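As a first step before loading measurements, the metrics files can be located by their default name pattern. The sketch below uses only the Python standard library and parses each file as plain JSON. The helper names are hypothetical, the demonstration files are empty placeholders rather than real ap_verify output, and the pattern assumes that --metrics-file was not used. Interpreting the contents as measurements is left to lsst.verify, as described in its documentation.

```python
import json
import tempfile
from pathlib import Path

# Default naming scheme described above; <dataId> becomes a wildcard.
DEFAULT_PATTERN = "ap_verify.*.verify.json"


def find_metrics_files(directory="."):
    """Return default-named metrics files in ``directory``, sorted by path."""
    return sorted(Path(directory).glob(DEFAULT_PATTERN))


def load_raw_metrics(path):
    """Parse one metrics file into plain Python objects."""
    with open(path) as stream:
        return json.load(stream)


# Demonstration with placeholder files; real files are written by ap_verify.
with tempfile.TemporaryDirectory() as workdir:
    placeholder_names = (
        "ap_verify.exampleB.verify.json",
        "ap_verify.exampleA.verify.json",
        "unrelated.json",
    )
    for name in placeholder_names:
        Path(workdir, name).write_text("{}")
    found_names = [path.name for path in find_metrics_files(workdir)]
    loaded = [load_raw_metrics(path) for path in find_metrics_files(workdir)]

print(found_names)
```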
How to use measurements of metrics (Gen 3 pipeline)
After ap_verify has run, it will produce Butler datasets named metricValue_<metric package>_<metric>.
These can be queried, like any Butler dataset, using methods like queryDatasetTypes and get.
Note
Not all metric values need have the same data ID as the data run through the pipeline. For example, metrics describing the full focal plane have a visit but no detector.
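The naming template for metric datasets can be illustrated with a small sketch that does not require a Butler repository. The helper names and the package and metric names below are hypothetical placeholders; in practice the list of dataset type names would come from a Butler query rather than being written out by hand.

```python
METRIC_PREFIX = "metricValue_"


def metric_dataset_type_name(package, metric):
    """Compose a dataset type name from the template described above."""
    return f"{METRIC_PREFIX}{package}_{metric}"


def select_metric_dataset_types(names):
    """Keep only the names that follow the metric naming template."""
    return sorted(name for name in names if name.startswith(METRIC_PREFIX))


# Placeholder names for illustration only.
example_name = metric_dataset_type_name("example_package", "exampleMetric")
selected = select_metric_dataset_types(
    ["notAMetric", example_name, "metricValue_other_package_otherMetric"]
)
print(selected)
```

Because package names may themselves contain underscores, splitting a dataset type name back into package and metric is ambiguous; composing the name from known parts, as here, avoids that problem.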