Measurement tasks and algorithm plugins

Introduction

The meas_base package is the new home of the source measurement framework, which was formerly part of meas_algorithms.

The source measurement framework is a set of Python modules which allow measurements to be performed on calibrated exposures. The framework assumes that a detection catalog has been prepared to identify the objects to be measured. The detection catalog may have been produced with a detection pass on the exposure itself, but it might also be produced from other exposures, stacks, or even multiple exposures. Deblending may or may not have been performed during the creation of the detection catalog.

The framework steps through the detection catalog and performs a set of measurements, object by object, supplying each measurement plugin with the exposure and catalog information needed for its measurement. The measurement results (values, errors, and flags) are then placed in an output catalog.

The measurement framework includes the following features:

  • lsst.meas.base.sfm.SingleFrameMeasurementTask, a subtask that measures sources after detection and deblending have been performed on the same image.

  • Several tasks for forced photometry.

  • Python Plugin base classes (SingleFramePlugin, ForcedPlugin) for single-frame and forced measurement.

  • Helper code to reduce boilerplate, standardize outputs, and make algorithm code easy to reuse when implementing new measurement algorithms.

Single-frame measurement

The single-frame measurement framework is used when all the information about the sources to be measured comes from a single image, and hence those sources are detected (and possibly deblended) on that image before measurement. This image may be a coadd of other images, or even a difference of images: from the perspective of the measurement framework there is essentially no difference between these cases (though there may be important differences for particular measurement algorithms).

The high-level algorithm for single-frame measurement is:

  • The lsst.meas.base.sfm.SingleFrameMeasurementTask is initialized. This initializes all the configured algorithms, creating a schema for the outputs in the process. After this stage, neither the schema nor the algorithm configuration can be modified.

  • SingleFrameMeasurementTask.run is called on each image to be processed with a SourceCatalog containing all of the sources to be measured. These sources must have Footprints (generated by lsst.meas.algorithms.SourceDetectionTask) and a schema that matches that constructed by the previous step. The fields added to the schema during initialization will then be filled in by the measurement framework.

    • Before measuring any sources, the measurement framework replaces all sources in the catalog with noise (see NoiseReplacer), using the Footprints attached to the SourceCatalog to define their boundaries.

    • We then loop over all “parent” sources in the catalog: both those that were not blended and those that represent the pre-deblend combined state of blends. For each parent, we loop again over all its children (if any), and for each of these, we re-insert the child source into the image (which, recall, currently contains only noise), call measure() on each of the plugins, and then replace the child source with noise again. We then insert the parent source, and again call measure() on all of the plugins. Before replacing the parent with noise again, we call measureN() twice for each plugin: once with the list of all children, and once with a single-element list containing just the parent. This ensures that each source (parent or child) is measured with both measure() and measureN(), with the former preceding the latter.
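The ordering of calls in the loop above can be sketched with a toy model. This is not the meas_base implementation: RecordingPlugin, run_measurement, and the (source_id, parent_id) catalog layout are hypothetical stand-ins, and the "image" is reduced to a set of currently visible source IDs.

```python
# Toy model only: these classes are not meas_base APIs.
class RecordingPlugin:
    """Stand-in plugin that records the order of the calls it receives."""

    def __init__(self):
        self.calls = []

    def measure(self, source_id):
        self.calls.append(("measure", source_id))

    def measureN(self, source_ids):
        self.calls.append(("measureN", tuple(source_ids)))


def run_measurement(catalog, plugins):
    """Measure a catalog of (source_id, parent_id) pairs; parent_id 0 marks a parent."""
    children = {}
    for source_id, parent_id in catalog:
        if parent_id != 0:
            children.setdefault(parent_id, []).append(source_id)

    visible = set()  # sources currently in the image; empty means all are noise
    for source_id, parent_id in catalog:
        if parent_id != 0:
            continue  # children are visited through their parent
        kids = children.get(source_id, [])
        for child in kids:
            visible.add(child)  # re-insert the child into the image
            for plugin in plugins:
                plugin.measure(child)
            visible.remove(child)  # replace the child with noise again
        visible.add(source_id)  # insert the parent
        for plugin in plugins:
            plugin.measure(source_id)
        for plugin in plugins:
            plugin.measureN(kids)  # all children together
            plugin.measureN([source_id])  # then the parent alone
        visible.remove(source_id)  # replace the parent with noise again
    return visible


plugin = RecordingPlugin()
leftover = run_measurement([(1, 0), (2, 1), (3, 1), (4, 0)], [plugin])
```

Here source 1 is a blended parent with children 2 and 3, and source 4 is unblended, so in this sketch its "children" call to measureN() receives an empty list.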

Because measurement plugin algorithms are often dependent on each other (in particular, most measurements require a centroid as an input), they must be run in a particular order, and we need a mechanism for passing information between them. The order is defined by the executionOrder config parameter, which is defined in lsst.meas.base.pluginsBase.BasePluginConfig, and hence present for every plugin. Generally, these will remain at their default values; it is the responsibility of a plugin implementor to ensure the default for that plugin is appropriate relative to any plugins it depends on. See BasePlugin for some guidelines.

The mechanism for passing information between plugins is SourceTable's slot system (see lsst.afw.table.SlotDefinition), in which particular measurements are given one of several predefined aliases (e.g. slotCentroid may be an alias for base_SdssCentroid) which are used to implement getters on lsst.afw.table.SourceRecord (e.g. getCentroid). The measurement framework's configuration defines which measurements are assigned to each slot, and these slot measurements are available to other plugins as soon as the plugin whose outputs are assigned to the slot is run.

All this means that algorithms that need a centroid as input should simply call getCentroid on the lsst.afw.table.SourceRecord with which they are provided, and ensure that their executionOrder is higher than that of centroid algorithms. Similarly, algorithms that want a shape should simply call getShape. Things are a bit trickier for centroid algorithms, which often need to be given an approximate centroid as an input; these should be prepared to look at the Peaks attached to the SourceRecord's lsst.afw.detection.Footprint as an initial value, as the slot centroid may not yet be valid. For wrapped C++ algorithms, this is handled automatically.
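As a rough illustration of how execution order and slots interact, the following self-contained sketch uses stand-in classes. ToyRecord, ToyCentroidPlugin, ToyFluxPlugin, run_plugins, and the numeric executionOrder values are all hypothetical; only the ideas (sort by executionOrder, read upstream results through a slot getter) come from the text above.

```python
class ToyRecord(dict):
    """Stand-in for a SourceRecord: a dict of fields plus slot aliases."""

    def __init__(self, slots):
        super().__init__()
        self.slots = slots  # maps a slot name to the field it aliases

    def getCentroid(self):
        # Resolve the alias; raises KeyError if the slot's plugin has not run yet.
        return self[self.slots["Centroid"]]


class ToyCentroidPlugin:
    executionOrder = 0.0  # hypothetical value: centroids run first
    name = "toy_Centroid"

    def measure(self, record, peak):
        # A centroider cannot rely on the slot, so it starts from the peak.
        record[self.name] = peak


class ToyFluxPlugin:
    executionOrder = 2.0  # hypothetical value: higher than the centroider's
    name = "toy_Flux"

    def measure(self, record, peak):
        x, y = record.getCentroid()  # provided by whichever plugin fills the slot
        record[self.name] = 10.0 * x + y  # placeholder arithmetic, not photometry


def run_plugins(plugins, record, peak):
    """Run plugins in ascending executionOrder, whatever order they were listed in."""
    for plugin in sorted(plugins, key=lambda p: p.executionOrder):
        plugin.measure(record, peak)
    return record


record = run_plugins(
    [ToyFluxPlugin(), ToyCentroidPlugin()],
    ToyRecord({"Centroid": "toy_Centroid"}),
    (3.0, 4.0),
)
```

If the flux plugin were given an executionOrder lower than the centroider's, getCentroid() would fail in this sketch because the aliased field would not yet exist.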

Forced photometry

In forced photometry, an external “reference” catalog is used to constrain measurements on an image. While parts of the forced photometry framework could be used with a reference catalog from virtually any source, a complete system for loading the reference catalogs that correspond to the region of sky being measured is only available when measurements from a coadd are used as the reference.

While essentially any measurement plugin can be run in forced mode, typically only photometric measurements are scientifically useful (though centroids and shapes may be useful for quality metrics). In fact, in forced mode we typically configure pseudo-measurements to provide the shape and centroid slots, and it is these, rather than anything special about the forced measurement framework, that constrain measurements. In particular, we generally use the ForcedTransformedCentroidPlugin and ForcedTransformedShapePlugin to provide the centroid and shape slots. Rather than measure the centroid and shape on the image, these simply transform the centroid and shape slots from the reference catalog to the appropriate coordinate system. This ensures that measurements that use these slots to obtain positions and ellipses use the same quantities used in generating the reference catalog.
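The idea of transforming a reference centroid rather than re-measuring it can be sketched as a round trip through sky coordinates. ToyWcs below is a hypothetical linear stand-in for a real WCS and transform_centroid is an illustrative helper, not the code used by ForcedTransformedCentroidPlugin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyWcs:
    """Hypothetical linear WCS: sky = origin + scale * pixel, per axis."""

    origin: tuple
    scale: float

    def pixelToSky(self, x, y):
        return (self.origin[0] + self.scale * x, self.origin[1] + self.scale * y)

    def skyToPixel(self, ra, dec):
        return ((ra - self.origin[0]) / self.scale, (dec - self.origin[1]) / self.scale)


def transform_centroid(ref_centroid, ref_wcs, target_wcs):
    """Map a reference-frame pixel position into the measurement frame."""
    ra, dec = ref_wcs.pixelToSky(*ref_centroid)
    return target_wcs.skyToPixel(ra, dec)


ref_wcs = ToyWcs(origin=(10.0, 20.0), scale=0.5)
target_wcs = ToyWcs(origin=(11.0, 21.0), scale=0.25)
forced_centroid = transform_centroid((4.0, 6.0), ref_wcs, target_wcs)
```

No pixels from the measurement image are consulted here, which is the point: the position used downstream is fixed entirely by the reference catalog and the two coordinate systems.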

The core of the forced measurement framework is lsst.meas.base.forcedMeasurement.ForcedMeasurementTask and ForcedPlugin, which broadly parallel lsst.meas.base.sfm.SingleFrameMeasurementTask and SingleFramePlugin. The high-level algorithm is essentially the same, but with the SourceCatalog to be measured generated by ForcedMeasurementTask.generateSources from the reference catalog, rather than provided by the user after running detection. The corresponding reference source and the SkyWcs objects that define the mapping between reference and measurement coordinate systems are also provided to each plugin.

The fact that the sources to be measured are generated from the reference catalog means that the lsst.afw.detection.Footprints attached to these sources must be transformed from the reference coordinate system to the measurement coordinate system, and at present that operation turns “heavy” footprints (i.e., including pixel data) into regular lsst.afw.detection.Footprints. Heavy footprints for child sources are necessary in order to correctly replace neighboring children of the same parent with noise prior to measurement (see NoiseReplacer), and the lack of these means that deblended measurement in forced photometry is essentially broken, except for plugins that implement measureN and can hence correctly measure all children simultaneously without having to replace them with noise individually.

In addition to the lsst.meas.base.forcedMeasurement.ForcedMeasurementTask subtask and its plugins, the forced measurement framework also contains a pair of command-line driver tasks, lsst.meas.base.forcedPhotCcd.ForcedPhotCcdTask and lsst.meas.base.forcedPhotCoadd.ForcedPhotCoaddTask. These run forced measurement on CCD-level images and coadd patch images, respectively, using the outputs of a previous single-frame measurement run on coadds as the reference catalog in both cases. These delegate the work of loading (and as necessary, filtering and merging) the appropriate reference catalog for the measurement image to a references subtask. The interface for the reference subtask is defined by lsst.meas.base.references.BaseReferencesTask, with the concrete implementation that utilizes coadd processing outputs in lsst.meas.base.references.CoaddSrcReferencesTask. In general, to use a reference catalog from another source, one should implement a new references subtask, and reuse lsst.meas.base.forcedPhotCcd.ForcedPhotCcdTask and/or lsst.meas.base.forcedPhotCoadd.ForcedPhotCoaddTask. It should only be necessary to replace these and use lsst.meas.base.forcedMeasurement.ForcedMeasurementTask directly if you need to run forced photometry on data that isn’t organized by the Butler or doesn’t correspond to CCD- or patch-level images.

Implementing new plugins and algorithms

The “Plugin” interfaces used directly by the measurement tasks are defined completely in Python, and are rooted in the abstract base classes SingleFramePlugin and ForcedPlugin. There are also analogous C++ base classes, SingleFrameAlgorithm and ForcedAlgorithm, for plugins implemented in C++, as well as SimpleAlgorithm, a C++ base class for simple algorithms in which the same implementation can be used for both single-frame and forced measurement.

For a single-frame plugin or algorithm:

  • Subclass SingleFramePlugin (Python) or SingleFrameAlgorithm (C++).

  • Implement an __init__ method with the same signature as the base class, in which fields saved by the plugin should be added to the schema passed to __init__, with keys saved as instance attributes for future use. In C++, implement a constructor with one of the signatures supported by wrapSingleFrameAlgorithm.

  • Reimplement measure() to perform the actual measurement and save the result in the measRecord argument.

  • Reimplement fail() unless the plugin cannot fail (except for environment errors).

  • Reimplement measureN() if the plugin supports measuring multiple sources simultaneously.

  • Register the plugin with the config mechanism by calling e.g. SingleFramePlugin.registry.register at module scope (so the registration happens at import-time). Or, in C++, expose the algorithm with Pybind11 as you would any normal C++ class, and call wrapSingleFrameAlgorithm to wrap and register the algorithm simultaneously.

For a forced plugin or algorithm:

  • Subclass ForcedPlugin (Python) or ForcedAlgorithm (C++).

  • Implement an __init__ method with the same signature as the base class, in which fields saved by the plugin should be added to the outputSchema of the SchemaMapper passed to __init__, with keys saved as instance attributes for future use. In C++, implement a constructor with one of the signatures supported by wrapForcedAlgorithm.

  • Reimplement measure() (in Python) or measureForced (C++) to perform the actual measurement and save the result in the measRecord argument. Note that the refRecord and refWcs are available during the measurement if needed.

  • Reimplement fail() unless the plugin cannot fail (except for environment errors).

  • Reimplement measureN() (Python) or measureNForced() (C++) if the plugin supports measuring multiple sources simultaneously.

  • Register the plugin with the config mechanism by calling e.g. ForcedPlugin.registry.register at module scope (so the registration happens at import-time). Or, in C++, expose the algorithm with Pybind11 as you would any normal C++ class, and call wrapForcedAlgorithm to wrap and register the algorithm simultaneously.

In C++, one can also implement both interfaces at the same time using SimpleAlgorithm; see that class for more information.

Error handling

When a plugin (or the C++ algorithm it delegates to) raises any exception, the task calling it will catch the exception, and call the fail() method of the plugin, which should cause the plugin to set one or more flags in the output record. If the exception is a MeasurementError, the task will pass this exception back to the fail() method, as MeasurementError contains additional, plugin-specific, information indicating the kind of failure. For most other exceptions, the task will log the exception message as a warning, and pass None as the exception to fail(). In this case, the plugin should just set the primary failure flag. This is handled automatically by the FlagHandler in C++-based plugins. Certain exceptions (in particular, out-of-memory errors) are considered fatal and will always be propagated up out of the task.

Plugin/algorithm code should endeavor to only throw MeasurementError for known failure modes, unless the problem is in the data and can always be fixed there before the measurement framework is invoked. In other words, we want warnings to appear in the logs only when there’s a bug, whether that’s in the processing before the measurement framework or in a particular plugin/algorithm not knowing and documenting its own failure modes. This means that plugin/algorithm implementations should generally have a global try/catch block that re-throws lower-level exceptions corresponding to known failure modes as MeasurementErrors.
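The division of labour described in this section can be sketched as follows. Everything here is a stand-in: ToyMeasurementError, ToyPlugin, call_measure, and the flag name flag_noData are hypothetical, and the real task and MeasurementError carry more information than this.

```python
import logging


class ToyMeasurementError(Exception):
    """Stand-in for MeasurementError: carries the number of a specific flag."""

    def __init__(self, message, flag_number):
        super().__init__(message)
        self.flag_number = flag_number


class ToyPlugin:
    FLAGS = ("flag", "flag_noData")  # index 0 is the general failure flag
    NO_DATA = 1

    def measure(self, record, data):
        try:
            record["value"] = sum(data) / len(data)
        except ZeroDivisionError as exc:
            # Known failure mode: re-raise it as a measurement error.
            raise ToyMeasurementError("no pixels to measure", self.NO_DATA) from exc

    def fail(self, record, error=None):
        record[self.FLAGS[0]] = True  # always set the general flag
        if error is not None:
            record[self.FLAGS[error.flag_number]] = True


def call_measure(plugin, record, data, log):
    """Outline of what the calling task does around each plugin call."""
    try:
        plugin.measure(record, data)
    except ToyMeasurementError as error:
        log.debug("measurement error: %s", error)
        plugin.fail(record, error)
    except MemoryError:
        raise  # fatal: always propagated out of the task
    except Exception as error:
        log.warning("unexpected failure: %s", error)
        plugin.fail(record, None)


log = logging.getLogger("toy.measurement")
ok, known, unexpected = {}, {}, {}
call_measure(ToyPlugin(), ok, [1.0, 3.0], log)
call_measure(ToyPlugin(), known, [], log)  # anticipated failure mode
call_measure(ToyPlugin(), unexpected, None, log)  # TypeError: a bug upstream
```

Only the third call produces a warning in this sketch, which matches the intent that warnings in the logs should point at bugs rather than at anticipated failure modes.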

How plugin errors are logged

A plugin is usually not run by itself, but as a component of a measurement task. The measurement task may also be a component or “subtask” of another task, and so on. When a plugin is run, the measurement task which is running the plugin logs any error which the plugin throws to a log location within the task hierarchy. For example, when the PsfFlux plugin is run within lsst.pipe.tasks.processCcd.ProcessCcdTask, its errors are logged to processCcd.charImage.measurement.base_PsfFlux.

This log hierarchy allows the log for the PsfFlux plugin to be controlled independently of the other plugins, and also independently of the measurement task log. Measurement errors are typically logged at the DEBUG level. When lsst.pipe.tasks.processCcd.ProcessCcdTask is launched, you may selectively modify the log level of any level of the hierarchy:

processCcd.py -L processCcd.charImage.measurement=WARN

will set the logging level of the measurement task and all of its plugins to WARN, whereas:

processCcd.py -L processCcd.charImage.measurement.base_PsfFlux=DEBUG

will selectively set just the PsfFlux algorithm to DEBUG, leaving the task and the other plugins at their default log levels.
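The same inheritance behaviour can be reproduced with the standard Python logging module alone. This sketch assumes that the -L option ultimately sets levels on loggers with these dotted names; the base_SdssShape logger name is used only as an example of "another plugin" and is assumed by analogy with base_PsfFlux.

```python
import logging

task_log = logging.getLogger("processCcd.charImage.measurement")
psf_log = logging.getLogger("processCcd.charImage.measurement.base_PsfFlux")
other_log = logging.getLogger("processCcd.charImage.measurement.base_SdssShape")

# Setting the task's level: plugins with no level of their own inherit it.
task_log.setLevel(logging.WARNING)
inherited = (psf_log.getEffectiveLevel(), other_log.getEffectiveLevel())

# Setting one plugin's level overrides the inherited value for that plugin only.
psf_log.setLevel(logging.DEBUG)
overridden = (psf_log.getEffectiveLevel(), other_log.getEffectiveLevel())
```

After the second step, DEBUG messages from the PsfFlux logger are enabled while the other plugin's logger still suppresses everything below WARNING.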

How a plugin can get its log name

Plugins which do not log internally do not need to know the name of their log. However, if you are writing a plugin and wish to have the plugin log messages to the log described in the previous section, you must add the following to your plugin class (the class, not the instance):

  • The plugin class must have a class attribute named hasLogName.

  • The class attribute hasLogName must be set to True.

  • The class initializer must have a logName parameter.

When all of these conditions are satisfied, the measurement task will initialize the plugin with the logName it uses to log error messages, to enable the plugin to log to the same location. The plugin may then use one of its base class methods, BasePlugin.getLogName, to get the name of its log. The plugin may then get the logger:

logger = logging.getLogger(self.getLogName())

Setup for Python plugins which log

Here is an example of how a Python plugin which logs internally can be constructed. Note the class attribute hasLogName and the initialization with an optional logName parameter.

class SingleFrameTestPlugin(SingleFramePlugin):

    ConfigClass = SingleFrameTestConfig
    hasLogName = True

    def __init__(self, config, name, schema, metadata, logName=None):
        SingleFramePlugin.__init__(self, config, name, schema, metadata, logName=logName)

With this configuration, the running task will set the logName parameter when the plugin is initialized, and the plugin’s getLogName() method may subsequently be used to fetch it.

Though it might be overly verbose, a plugin could log at the INFO level each time its measure() method is invoked, using the same logger as the measurement task:

logging.getLogger(self.getLogName()).info("Starting a measurement.")

Setup for C++ algorithms which log

C++ algorithms which are called from Python tasks can also get the logName. To do so, they must have an optional logName argument in their constructor. Here is an example from PsfFluxAlgorithm:

PsfFluxAlgorithm(Control const & ctrl, std::string const & name, afw::table::Schema & schema,
                 std::string const & logName = "");

The constructor should include the line:

_logName = logName.size() ? logName : name;

which sets the name to be used for the logger of this plugin to either logName, or to the base name of the plugin if the optional logName argument has not been specified.

The following line must also be added to allow this constructor to be accessed from Python.

cls.def(py::init<PsfFluxAlgorithm::Control const &, std::string const &, afw::table::Schema &,
    std::string const &>(),
    "ctrl"_a, "name"_a, "schema"_a, "logName"_a);

And finally, hasLogName=True must be added to the Python wrapper:

wrapSimpleAlgorithm(PsfFluxAlgorithm, Control=PsfFluxControl,
            TransformClass=PsfFluxTransform, executionOrder=BasePlugin.FLUX_ORDER,
            shouldApCorr=True, hasLogName=True)

This constructor allows the algorithm to receive the logName as an optional string, which can later be accessed by its getLogName() method. In C++, an algorithm may then log to this logName as follows:

LOGL_INFO(logger, message...)

where logger can either be the logName string itself, or an lsst::log::Log object returned by

auto logger = lsst::log::Log::getLogger(getLogName());

Using a FlagHandler with Python plugins

Review the SingleFramePlugin requirements for the measure() and fail() methods, which a plugin must implement. When the plugin detects an error, it should raise a MeasurementError, which triggers a call to fail(). The fail() method should set the appropriate failure flags in the output catalog.

A FlagHandler is a convenient way for a plugin to define flags for different error conditions, and to automatically set the correct flags when they occur.

How to define a FlagHandler in Python

The meas_base plugins implemented in C++ use lsst::meas::base::FlagHandler to handle measurement exceptions. In Python, measurement plugins may use a FlagDefinitionList to create an instance of this class.

First examine the code in testFlagHandler.py in the meas_base tests directory. This unit test defines a Python plugin which illustrates the use of the FlagHandler. Its __init__ method defines a list of three failure flags. As each flag is added, a FlagDefinition is returned which can be used to identify the error later.

When the list is complete, a FlagHandler is created, which initializes the flag fields in the output catalog. A Python plugin may do this with code like the following in __init__():

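# Inside __init__(); FlagDefinitionList and FlagHandler come from lsst.meas.base.
flagDefs = FlagDefinitionList()
# Keep each FlagDefinition on the instance so that measure() can refer to it later.
self.FAILURE = flagDefs.add("flag", "General Failure error")
self.CONTAINS_NAN = flagDefs.add("flag_containsNan", "Measurement area contains a nan")
self.EDGE = flagDefs.add("flag_edge", "Measurement area over edge")
self.flagHandler = FlagHandler.addFields(schema, name, flagDefs)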

This code defines the following error flags:

  • a general failure flag which indicates that something has gone wrong during measure() (flag);

  • a specific failure flag to indicate a source which contains one or more NaNs (flag_containsNan);

  • a specific failure flag which indicates that the source is too close to the edge to be measured (flag_edge).

The FlagHandler thus created is then used to implement the fail() method. Recall that you must implement this method in your plugin class if your plugin can fail.

The following addition to your class will correctly implement the error handling:

def fail(self, measRecord, error=None):
    if error is None:
        self.flagHandler.handleFailure(measRecord)
    else:
        self.flagHandler.handleFailure(measRecord, error.cpp)

When the error is one which your plugin code expects, your code will raise a MeasurementError exception. The error is sent to the fail() method as its error argument. The error will indicate which flag should be set in addition to the general failure flag. If the error argument is not supplied, as when the failure is not expected, only the general failure flag will be set.

For example, if the following code condition was encountered during measurement:

if not exposure.getBBox().contains(bbox):
    raise MeasurementError(self.EDGE.doc, self.EDGE.number)

then both the general flag and the more specific flag_edge will be set to True.

Using the SafeCentroidExtractor

The SafeCentroidExtractor may be used to fetch a value for the centroid of a source even if the centroiding algorithm has failed for that source. This is achieved by falling back to the original detection Footprint for the source.

The SafeCentroidExtractor is defined in the __init__ method of the measurement plugin:

def __init__(self, config, name, schema, metadata):
    SingleFramePlugin.__init__(self, config, name, schema, metadata)
    self.centroidExtractor = lsst.meas.base.SafeCentroidExtractor(schema, name)

self.centroidExtractor() may then be called to fetch the centroid for a record:

center = self.centroidExtractor(measRecord, self.flagHandler)

The SafeCentroidExtractor will first try to read the centroid from the centroid slot. If the failure flag on the centroid slot has been set to True, it will try to use the detection Footprint to determine the centroid. This may allow the plugin to complete its measurement if the centroid provided is adequate.

To indicate at the same time that something has gone wrong, the general flag will automatically get set on this source. The SafeCentroidExtractor will also create a flag called flag_badCentroid which points to the centroid slot failure flag, and can be used to distinguish records where the failure flag has been set because the centroid slot measurement was bad.
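The fallback behaviour described above can be sketched in a few lines. This models only the description in this section, not the actual SafeCentroidExtractor code: ToyRecord, extract_centroid, and the use of a single stored peak as "the position from the detection Footprint" are all assumptions made for illustration.

```python
class ToyRecord:
    """Stand-in record holding a slot centroid, its failure flag, and a peak."""

    def __init__(self, slot_centroid, slot_flag, peak):
        self.slot_centroid = slot_centroid
        self.slot_flag = slot_flag  # failure flag of the centroid slot
        self.peak = peak  # position taken from the detection Footprint
        self.flag = False  # general failure flag of the calling plugin

    @property
    def flag_badCentroid(self):
        # Alias: reads through to the centroid slot's failure flag.
        return self.slot_flag


def extract_centroid(record):
    """Return the slot centroid, or fall back to the peak if the slot failed."""
    if not record.slot_flag:
        return record.slot_centroid
    record.flag = True  # signal that this measurement used a fallback position
    return record.peak


good = ToyRecord((5.2, 7.9), False, (5, 8))
bad = ToyRecord((float("nan"), float("nan")), True, (5, 8))
good_center = extract_centroid(good)
bad_center = extract_centroid(bad)
```

A downstream user can then separate "failed because of the centroid" from other failures in this sketch by checking flag_badCentroid alongside the general flag.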