Getting started¶
lsst.faro is part of the LSST Science Pipelines. If you are new to the LSST Science Pipelines, it may be helpful to begin with the Getting started tutorial and installation instructions.
If developing on Rubin computing facilities, a shared version of the software stack should be available for use.
Running faro¶
Running and building lsst.faro locally.
Running faro on a reprocessed Gen3 repository at NCSA.
lsst.faro can be run using pipetask.
Example: rc2_subset¶
Running faro on a small local dataset. The rc2_subset is the smallest CI dataset for which all faro metrics can be run without error and produce meaningful results.
Set up rc2_subset following the instructions here.
Set up the faro package; see setting up.
An example command (update the command):
pipetask run -b $RC2_SUBSET_DIR/SMALL_HSC/butler.yaml -p $FARO_DIR/pipelines/metrics_pipeline_matched.yaml -i u/$USER/single_frame -o u/$USER/faro_matched_visits_r --register-dataset-types -d "instrument='HSC' AND detector=42 AND band='r'"
Documentation for using the pipetask run command and its various options can be found here. Briefly, the example command above uses the -b option to specify the Butler repository, -p to specify the pipeline, -i to specify the input collection, -o to specify the output collection (this should almost always be a user collection prefixed with u/username/ unless you are running in production), and -d to provide a query to select a subset of data on which to compute metrics.
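The options described above can also be assembled programmatically, for instance when the same pipeline is launched over several data selections. The following is a minimal sketch and not part of faro: the helper name build_pipetask_command and its parameters are hypothetical, the paths and collection names are placeholders, and only the flags that appear in the example command are used.

```python
import shlex


def build_pipetask_command(butler, pipeline, input_collection,
                           output_collection, data_query,
                           register_dataset_types=False):
    """Return the argument list for a ``pipetask run`` invocation.

    Hypothetical helper for illustration; it only arranges the flags
    shown in the example command and does not execute anything.
    """
    args = [
        "pipetask", "run",
        "-b", butler,              # Butler repository
        "-p", pipeline,            # pipeline definition
        "-i", input_collection,    # input collection
        "-o", output_collection,   # output collection (usually u/<username>/...)
        "-d", data_query,          # query selecting a subset of data
    ]
    if register_dataset_types:
        # Use with caution: dataset types are global across the repository.
        args.append("--register-dataset-types")
    return args


command = build_pipetask_command(
    butler="SMALL_HSC/butler.yaml",                       # placeholder path
    pipeline="pipelines/metrics_pipeline_matched.yaml",   # placeholder path
    input_collection="u/someuser/single_frame",           # placeholder collection
    output_collection="u/someuser/faro_matched_visits_r",
    data_query="instrument='HSC' AND detector=42 AND band='r'",
)
# shlex.join quotes the data query so the line can be pasted into a shell.
print(shlex.join(command))
```

In an environment where the Science Pipelines are set up, the returned list could be handed to subprocess.run; keeping the arguments as a list avoids quoting problems with the data query.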
Warning
The --register-dataset-types option should be used with caution as this will allow the registration of new dataset types that are global across the repository.
Example: HSC RC2 dataset¶
Running faro on a Gen3 repository at NCSA. The HSC RC2 data, which is reprocessed monthly with the latest version of the Science Pipelines, is a good example; see DMTN-091. Information on the current status of HSC RC2 reprocessing and the latest runs can be found here.
Set up the lsst.faro package; see setting up.
An example command, in this case running metrics on the source catalog of a single visit:
pipetask --long-log run -b /repo/main/butler.yaml --register-dataset-types -p $FARO_DIR/pipelines/measurement/measurement_detector_table.yaml -d "visit=35892 AND skymap='hsc_rings_v1' AND instrument='HSC'" --output u/$USER/faro_test -i HSC/runs/RC2/w_2021_18/DM-29973 --timeout 999999
Use your own username in the output collection (u/$USER/faro_test in the command above).
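The -d data query used in the example commands is a string of constraints joined with AND. As a minimal sketch, assuming only simple equality constraints like those shown above, such a string can be built from keyword arguments. The helper name build_data_query is hypothetical and does not cover the full query syntax.

```python
def build_data_query(**constraints):
    """Join equality constraints with AND, quoting string values.

    Hypothetical helper; it covers only the simple key=value form
    seen in the example commands.
    """
    terms = []
    for key, value in constraints.items():
        if isinstance(value, str):
            terms.append(f"{key}='{value}'")   # strings are single-quoted
        else:
            terms.append(f"{key}={value}")     # numbers are written as-is
    return " AND ".join(terms)


query = build_data_query(visit=35892, skymap="hsc_rings_v1", instrument="HSC")
print(query)
# prints: visit=35892 AND skymap='hsc_rings_v1' AND instrument='HSC'
```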
Example: DRP processing¶
lsst.faro can be run together with other processing steps in a pipeline, e.g., as part of DRP processing.
Examples of this functionality can be found in rc2_subset. The steps in this tutorial provide more information.
Adding a metric to faro¶
Before making contributions to faro, we recommend consulting the LSST DM Developers Guide as a general reference for software development in Rubin DM and, in particular, the best practices covered in the DM development workflow.
Normative Science Verification Metrics¶
lsst.faro is used for science verification as well as scientific validation and characterization.
Normative metrics are associated with science performance requirements defined in the DMSR, OSS, and LSR that will be verified by the Rubin Observatory Construction Project. If you are intending to implement a normative metric, please read the information below; for non-normative metrics skip to the next section.
Please contact the core development team by posting on the #rubinobs-science-verification Slack channel or by reaching out to one of the main developers. This will facilitate coordination and scheduling of work.
Review the detailed metric specification and algorithm definition. Detailed requirement specifications and associated test cases are being developed in the LSST Verification and Validation (LVV) project in JIRA. (For more systems engineering details, see the LSST Verification & Validation Documentation and LSST Verification Architecture.)
Planning Work¶
Create a JIRA ticket. faro development has been tracked in 6-month work cycles, i.e., JIRA epics. There is also a backlog epic. When starting faro development, or making a bugfix, create a JIRA ticket. Include “faro” as a Component and set the team as “DM Science”. It is recommended to contact the faro team to help everyone stay on the same page.
Setting Up¶
Development can be done from the Rubin Science Platform (RSP) notebook aspect, the lsst-devl services, or a Docker image containing the Science Pipelines software. If using the RSP, we suggest reading the tutorial on developing Science Pipelines in the notebook aspect.
Set up Science Pipelines:
source /software/lsstsw/stack/loadLSST.bash
setup lsst_distrib
The example above points to a shared version of the software stack on the GPFS file systems.
Clone the faro package:
git clone https://github.com/lsst/faro.git
This is a local version of the faro package for development work.
Set up the local version of the faro package.
cd faro
setup -k -r .
At this point you can verify that you are using your local version:
eups list -s | grep faro
Create a ticket branch:
git checkout -b tickets/DM-NNNNN
All development should happen on ticket branches (and should have associated JIRA tickets). User branches (e.g., u/jcarlin/) can be used for experimenting/testing.
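A quick check of branch names can catch mistakes before pushing. This sketch is an illustration and not a faro or DM tool: the function name and both regular expressions are assumptions, inferred only from the tickets/DM-NNNNN and u/username/ forms mentioned above.

```python
import re

# Assumed patterns, inferred from the branch names mentioned in the text.
TICKET_BRANCH = re.compile(r"tickets/DM-\d+")
USER_BRANCH = re.compile(r"u/[^/]+/.+")


def classify_branch(name):
    """Return 'ticket', 'user', or 'other' for a git branch name."""
    if TICKET_BRANCH.fullmatch(name):
        return "ticket"
    if USER_BRANCH.fullmatch(name):
        return "user"
    return "other"


for branch in ("tickets/DM-12345", "u/someuser/experiment", "my-feature"):
    print(branch, "->", classify_branch(branch))
```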
Adding a Metric¶
Identify the analysis context. Review the associated connections, config, and task base classes for that analysis context to understand the in-memory Python objects that will be passed to the run method of the metric measurement task and the configuration options. See design concepts for more information. Currently implemented analysis contexts are listed here.
Implement Measurement task. This will be a subclass of lsst.pipe.base.Task that performs the specific operations of a given metric. See NumSourcesTask defined in BaseSubTasks.py for a simple example metric that returns the number of rows in an input source/object catalog. Additional examples of measurement tasks can be found in the python/lsst/faro/measurement directory of the package.
Implement unit tests. All algorithmic code used for metric computation should have associated unit tests. Examples can be found in the package tests directory.
Add metric to a pipeline yaml file. The pipeline yaml contains the configuration information to execute metrics. See measurement_visit_table.yaml for an example that uses VisitTableMeasurementTask to count the number of rows in an input source/object catalog. Additional examples of pipeline files can be found in the pipelines/measurement directory of the package.
Name the metric. Currently each metric is associated with a separately named dataset type that is global (more info here). To date, metric names have followed the pattern “metricvalue_{package}_{metric}” where the “package” and “metric” are given in the yaml configuration file. Metric naming conventions are an area of active development, and it is recommended to contact the faro development team for up-to-date guidance.
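The steps above can be sketched without the Science Pipelines installed. The following is a simplified stand-in and not faro code: count_rows mimics the computation of a row-counting metric, metric_dataset_type_name applies the “metricvalue_{package}_{metric}” pattern quoted above, and the test case shows the kind of unit test expected for algorithmic code. All function names, and the example_package and numRows values, are hypothetical.

```python
import unittest


def count_rows(catalog):
    """Return the number of rows in a catalog-like sequence.

    Stand-in for the computation of a row-counting metric; a real
    measurement task would wrap this in an lsst.pipe.base.Task subclass.
    """
    return len(catalog)


def metric_dataset_type_name(package, metric):
    """Build a dataset type name following metricvalue_{package}_{metric}."""
    return f"metricvalue_{package}_{metric}"


class CountRowsTestCase(unittest.TestCase):
    """Unit tests for the stand-in functions above."""

    def test_count_rows(self):
        catalog = [{"id": 1}, {"id": 2}, {"id": 3}]
        self.assertEqual(count_rows(catalog), 3)
        self.assertEqual(count_rows([]), 0)

    def test_dataset_type_name(self):
        name = metric_dataset_type_name("example_package", "numRows")
        self.assertEqual(name, "metricvalue_example_package_numRows")


# Run the tests explicitly so the script continues after they finish.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountRowsTestCase)
result = unittest.TextTestRunner().run(suite)
```

Keeping the computation in a plain function, separate from the task wrapper, is one way to make the algorithmic part testable on its own.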
Review¶
The following is a brief summary of the steps for review preparation.
Run unit tests with scons. Run scons from the top level directory of the package.
scons
Build package documentation locally. From the top level package directory:
package-docs build
Run continuous integration tests with Jenkins. Now that we have tested the package on its own, it is time to test integration with the rest of the Science Pipelines. When running the Jenkins test, the list of EUPS packages to build should include lsst_distrib lsst_ci ci_hsc_gen3 ci_imsim. The latter two EUPS packages will run CI tests that include executing faro on DRP products.
Merge. Rebase if needed; see pushing code.