Measurement tasks and algorithm plugins¶
Introduction¶
The meas_base package is the new home of the source measurement framework, which was formerly part of meas_algorithms.
The source measurement framework is a set of Python modules which allow measurements to be performed on calibrated exposures. The framework assumes that a detection catalog has been prepared to identify the objects to be measured. The detection catalog may have been produced with a detection pass on the exposure itself, but it might also be produced from other exposures, stacks, or even multiple exposures. Deblending may or may not have been performed during the creation of the detection catalog.
The framework steps through the detection catalog and performs a set of measurements, object by object, supplying each measurement plugin with the exposure and catalog information needed for its measurement. The measurement results (values, errors, and flags) are then placed in an output catalog.
The measurement framework includes the following features:
- lsst.meas.base.sfm.SingleFrameMeasurementTask, a subtask that measures after performing detection and deblending on the same image.
- Several tasks for forced photometry.
- Python Plugin base classes (SingleFramePlugin, ForcedPlugin) for single-frame and forced measurement.
- Helper code to reduce boilerplate, standardize outputs, and make algorithm code easy to reuse when implementing new measurement algorithms.
Single-frame measurement¶
The single-frame measurement framework is used when all the information about the sources to be measured comes from a single image, and hence those sources are detected (and possibly deblended) on that image before measurement. This image may be a coadd of other images, or even a difference of images: from the perspective of the measurement framework there is essentially no difference between these cases (though there may be important differences for particular measurement algorithms).
The high-level algorithm for single-frame measurement is:
1. The lsst.meas.base.sfm.SingleFrameMeasurementTask is initialized. This initializes all the configured algorithms, creating a schema for the outputs in the process. After this stage, neither the schema nor the algorithm configuration can be modified.
2. SingleFrameMeasurementTask.run is called on each image to be processed with a SourceCatalog containing all of the sources to be measured. These sources must have Footprints (generated by lsst.meas.algorithms.SourceDetectionTask) and a schema that matches that constructed by the previous step. The fields added to the schema during initialization will then be filled in by the measurement framework.
3. Before measuring any sources, the measurement framework replaces all sources in the catalog with noise (see NoiseReplacer), using the Footprints attached to the SourceCatalog to define their boundaries.
4. We then loop over all “parent” sources in the catalog: both those that were not blended, and those that represent the pre-deblend combined state of blends. For each parent, we loop again over all its children (if any), and for each of these, we re-insert the child source into the image (which, recall, currently contains only noise), call measure() on each of the plugins, and then replace the child source with noise again. We then insert the parent source, and again call measure() on all of the plugins. Before replacing the parent with noise again, we call measureN() twice for each plugin: once with the list of all children, and once with a single-element list containing just the parent. This ensures that each source (parent or child) is measured with both measure() and measureN(), with the former preceding the latter.
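The loop in the last step can be sketched in plain Python. This is a toy model that follows the description above literally, not the meas_base implementation: `ToyImage`, `ToyPlugin`, and `run_measurement` are hypothetical names, and sources are represented only by integer ids.

```python
# Toy model of the single-frame measurement loop (hypothetical names throughout).

class ToyImage:
    """Tracks which source ids are currently present (not replaced by noise)."""

    def __init__(self):
        self.inserted = set()

    def insert(self, source_id):
        self.inserted.add(source_id)

    def replace_with_noise(self, source_id):
        self.inserted.discard(source_id)


class ToyPlugin:
    """Records every call so the call order can be inspected afterwards."""

    def __init__(self, name, calls):
        self.name = name
        self.calls = calls

    def measure(self, source_id, image):
        self.calls.append((self.name, "measure", source_id))

    def measureN(self, source_ids, image):
        self.calls.append((self.name, "measureN", tuple(source_ids)))


def run_measurement(parents, children_of, plugins, image):
    """parents: list of parent ids; children_of: dict of parent id -> child ids."""
    for parent in parents:
        children = children_of.get(parent, [])
        # Measure each child alone: every other source is still noise.
        for child in children:
            image.insert(child)
            for plugin in plugins:
                plugin.measure(child, image)
            image.replace_with_noise(child)
        # Measure the parent, then run measureN on the children and the parent.
        image.insert(parent)
        for plugin in plugins:
            plugin.measure(parent, image)
        for plugin in plugins:
            plugin.measureN(children, image)
            plugin.measureN([parent], image)
        image.replace_with_noise(parent)


calls = []
image = ToyImage()
run_measurement([1], {1: [2, 3]}, [ToyPlugin("flux", calls)], image)
```

After the run, `calls` lists the two child `measure()` calls, then the parent `measure()` call, then the two `measureN()` calls, and the image is back to containing only noise.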
Because measurement plugin algorithms are often dependent on each other (in particular, most measurements require a centroid as an input), they must be run in a particular order, and we need a mechanism for passing information between them.
The order is defined by the executionOrder config parameter, which is defined in lsst.meas.base.pluginsBase.BasePluginConfig, and hence present for every plugin.
Generally, these will remain at their default values; it is the responsibility of a plugin implementor to ensure the default for that plugin is appropriate relative to any plugins it depends on.
See BasePlugin for some guidelines.
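As a minimal illustration of ordering by executionOrder, the sketch below sorts a few plugin configurations so that lower values run first. The plugin names, the numeric values, and the `ToyPluginConfig` class are hypothetical; only the idea of a per-plugin executionOrder field comes from the text.

```python
from dataclasses import dataclass


@dataclass
class ToyPluginConfig:
    """Hypothetical stand-in for a plugin config carrying executionOrder."""
    name: str
    executionOrder: float


def sort_plugins(configs):
    """Return plugin configs in the order they would run (lowest value first)."""
    return sorted(configs, key=lambda c: c.executionOrder)


configs = [
    ToyPluginConfig("toy_Flux", 2.0),      # needs a centroid and a shape
    ToyPluginConfig("toy_Centroid", 0.0),  # no dependencies
    ToyPluginConfig("toy_Shape", 1.0),     # needs a centroid
]
run_order = [c.name for c in sort_plugins(configs)]
```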
The mechanism for passing information between plugins is SourceTable's slot system (see lsst.afw.table.SlotDefinition), in which particular measurements are given one of several predefined aliases (e.g. slotCentroid may be an alias for base_SdssCentroid) which are used to implement getters on lsst.afw.table.SourceRecord (e.g. getCentroid).
The measurement framework’s configuration defines which measurements are assigned to each slot, and these slot measurements are available to other plugins as soon as the plugin whose outputs are assigned to the slot is run.
All this means that algorithms that need a centroid as input should simply call getCentroid on the lsst.afw.table.SourceRecord with which they are provided, and ensure that their executionOrder is higher than that of centroid algorithms.
Similarly, algorithms that want a shape should simply call getShape.
Things are a bit trickier for centroid algorithms, which often need to be given an approximate centroid as an input; these should be prepared to look at the Peaks attached to the SourceRecord's lsst.afw.detection.Footprint as an initial value, as the slot centroid may not yet be valid.
For wrapped C++ algorithms, this is handled automatically.
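The slot lookup and the peak fallback can be illustrated with a small toy. `ToyRecord` and `initial_centroid` are hypothetical and do not use lsst.afw.table; the record is just a dictionary of field values plus a slot-to-field mapping.

```python
class ToyRecord:
    """Hypothetical record: field values, slot aliases, and one footprint peak."""

    def __init__(self, peak, slots):
        self.values = {}    # field name -> measured value
        self.slots = slots  # slot alias -> name of the field it points to
        self.peak = peak    # (x, y) of the footprint peak

    def getCentroid(self):
        """Resolve the centroid slot; None if that plugin has not run yet."""
        return self.values.get(self.slots["slotCentroid"])


def initial_centroid(record):
    """Starting guess for a centroid plugin: the slot value if set, else the peak."""
    centroid = record.getCentroid()
    return centroid if centroid is not None else record.peak


record = ToyRecord(peak=(10.0, 20.0), slots={"slotCentroid": "base_SdssCentroid"})
before = initial_centroid(record)  # slot not filled yet, so the peak is used
record.values["base_SdssCentroid"] = (10.3, 19.8)
after = record.getCentroid()       # later plugins see the slot measurement
```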
Forced photometry¶
In forced photometry, an external “reference” catalog is used to constrain measurements on an image. While parts of the forced photometry framework could be used with a reference catalog from virtually any source, a complete system for loading the reference catalogs that correspond to the region of sky being measured is only available when measurements from a coadd are used as the reference.
While essentially any measurement plugin can be run in forced mode, typically only photometric measurements are scientifically useful (though centroids and shapes may be useful for quality metrics).
In fact, in forced mode we typically configure pseudo-measurements to provide the shape and centroid slots, and it is these (rather than anything special about the forced measurement framework) that constrain measurements.
In particular, we generally use the ForcedTransformedCentroidPlugin and ForcedTransformedShapePlugin to provide the centroid and shape slots.
Rather than measure the centroid and shape on the image, these simply transform the centroid and shape slots from the reference catalog to the appropriate coordinate system.
This ensures that measurements that use these slots to obtain positions and ellipses use the same quantities used in generating the reference catalog.
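The idea of transforming a reference centroid rather than re-measuring it can be sketched as a round trip through sky coordinates. The `ToyWcs` class below is a hypothetical linear mapping used only for illustration; it is not SkyWcs and ignores everything a real WCS has to handle.

```python
class ToyWcs:
    """Hypothetical linear WCS: sky = origin + scale * pixel, per axis."""

    def __init__(self, origin, scale):
        self.origin = origin
        self.scale = scale

    def pixel_to_sky(self, xy):
        return tuple(o + self.scale * p for o, p in zip(self.origin, xy))

    def sky_to_pixel(self, sky):
        return tuple((s - o) / self.scale for o, s in zip(self.origin, sky))


def transform_centroid(ref_xy, ref_wcs, meas_wcs):
    """Map a reference-frame pixel position into the measurement frame."""
    return meas_wcs.sky_to_pixel(ref_wcs.pixel_to_sky(ref_xy))


ref_wcs = ToyWcs(origin=(0.0, 0.0), scale=0.5)
meas_wcs = ToyWcs(origin=(10.0, 20.0), scale=0.25)
meas_xy = transform_centroid((100.0, 200.0), ref_wcs, meas_wcs)
```

With these made-up numbers the reference position (100, 200) lands at (160, 320) in the measurement image, and any plugin reading the centroid slot would use that position.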
The core of the forced measurement framework is lsst.meas.base.forcedMeasurement.ForcedMeasurementTask and ForcedPlugin, which broadly parallel lsst.meas.base.sfm.SingleFrameMeasurementTask and SingleFramePlugin.
The high-level algorithm is essentially the same, but with the SourceCatalog to be measured generated by ForcedMeasurementTask.generateSources from the reference catalog, rather than provided by the user after running detection.
The corresponding reference source and the SkyWcs objects that define the mapping between reference and measurement coordinate systems are also provided to each plugin.
The fact that the sources to be measured are generated from the reference catalog means that the lsst.afw.detection.Footprints attached to these sources must be transformed from the reference coordinate system to the measurement coordinate system, and at present that operation turns “heavy” footprints (i.e., including pixel data) into regular lsst.afw.detection.Footprints.
Heavy footprints for child sources are necessary in order to correctly replace neighboring children of the same parent with noise prior to measurement (see NoiseReplacer), and the lack of these means that deblended measurement in forced photometry is essentially broken, except for plugins that implement measureN and can hence correctly measure all children simultaneously without having to replace them with noise individually.
In addition to the lsst.meas.base.forcedMeasurement.ForcedMeasurementTask subtask and its plugins, the forced measurement framework also contains a pair of command-line driver tasks, lsst.meas.base.forcedPhotCcd.ForcedPhotCcdTask and lsst.meas.base.forcedPhotCoadd.ForcedPhotCoaddTask.
These run forced measurement on CCD-level images and coadd patch images, respectively, using the outputs of a previous single-frame measurement run on coadds as the reference catalog in both cases.
These delegate the work of loading (and as necessary, filtering and merging) the appropriate reference catalog for the measurement image to a references subtask.
The interface for the reference subtask is defined by lsst.meas.base.references.BaseReferencesTask, with the concrete implementation that utilizes coadd processing outputs in lsst.meas.base.references.CoaddSrcReferencesTask.
In general, to use a reference catalog from another source, one should implement a new references subtask, and reuse lsst.meas.base.forcedPhotCcd.ForcedPhotCcdTask and/or lsst.meas.base.forcedPhotCoadd.ForcedPhotCoaddTask.
It should only be necessary to replace these and use lsst.meas.base.forcedMeasurement.ForcedMeasurementTask directly if you need to run forced photometry on data that isn’t organized by the Butler or doesn’t correspond to CCD- or patch-level images.
Implementing new plugins and algorithms¶
The “Plugin” interfaces used directly by the measurement tasks are defined completely in Python, and are rooted in the abstract base classes SingleFramePlugin and ForcedPlugin.
There are also analogous C++ base classes, SingleFrameAlgorithm and ForcedAlgorithm, for plugins implemented in C++, as well as SimpleAlgorithm, a C++ base class for simple algorithms in which the same implementation can be used for both single-frame and forced measurement.
For a single-frame plugin or algorithm:
1. Subclass SingleFramePlugin (Python) or SingleFrameAlgorithm (C++).
2. Implement an __init__ method with the same signature as the base class, in which fields saved by the plugin should be added to the schema passed to __init__, with keys saved as instance attributes for future use. In C++, implement a constructor with one of the signatures supported by wrapSingleFrameAlgorithm.
3. Reimplement measure() to perform the actual measurement and save the result in the measRecord argument.
4. Reimplement fail() unless the plugin cannot fail (except for environment errors).
5. Reimplement measureN() if the plugin supports measuring multiple sources simultaneously.
6. Register the plugin with the config mechanism by calling e.g. SingleFramePlugin.registry.register at module scope (so the registration happens at import-time). Or, in C++, expose the algorithm with Pybind11 as you would any normal C++ class, and call wrapSingleFrameAlgorithm to wrap and register the algorithm simultaneously.
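The shape of these steps can be shown with a self-contained toy that runs without the LSST stack. Everything here is a hypothetical stand-in: `ToyRegistry` and `ToySingleFramePlugin` only imitate the roles of the real registry and base class, the schema is a plain list of field names, the record is a dictionary, and the exposure is a nested list of pixel values.

```python
class ToyRegistry(dict):
    """Hypothetical registry mapping plugin names to plugin classes."""

    def register(self, name, plugin_class):
        self[name] = plugin_class


class ToySingleFramePlugin:
    """Hypothetical base class mirroring the constructor signature in the text."""
    registry = ToyRegistry()

    def __init__(self, config, name, schema, metadata):
        self.config = config
        self.name = name

    def measure(self, measRecord, exposure):
        raise NotImplementedError

    def fail(self, measRecord, error=None):
        raise NotImplementedError


class ToyPeakValuePlugin(ToySingleFramePlugin):
    """Stores the pixel value at the source's peak."""

    def __init__(self, config, name, schema, metadata):
        super().__init__(config, name, schema, metadata)
        # Add output fields to the schema and keep their keys for later use.
        self.valueKey = name + "_value"
        self.flagKey = name + "_flag"
        schema.extend([self.valueKey, self.flagKey])

    def measure(self, measRecord, exposure):
        x, y = measRecord["peak"]
        measRecord[self.valueKey] = exposure[y][x]

    def fail(self, measRecord, error=None):
        measRecord[self.flagKey] = True


# Module-scope registration, so it happens at import time.
ToySingleFramePlugin.registry.register("toy_PeakValue", ToyPeakValuePlugin)

schema = []
plugin_class = ToySingleFramePlugin.registry["toy_PeakValue"]
plugin = plugin_class(None, "toy_PeakValue", schema, None)
record = {"peak": (1, 0)}
plugin.measure(record, [[5.0, 7.0], [1.0, 2.0]])
```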
For a forced plugin or algorithm:
1. Subclass ForcedPlugin (Python) or ForcedAlgorithm (C++).
2. Implement an __init__ method with the same signature as the base class, in which fields saved by the plugin should be added to the outputSchema of the SchemaMapper passed to __init__, with keys saved as instance attributes for future use. In C++, implement a constructor with one of the signatures supported by wrapForcedAlgorithm.
3. Reimplement measure() (in Python) or measureForced (C++) to perform the actual measurement and save the result in the measRecord argument. Note that the refRecord and refWcs are available during the measurement if needed.
4. Reimplement fail() unless the plugin cannot fail (except for environment errors).
5. Reimplement measureN() (Python) or measureNForced() (C++) if the plugin supports measuring multiple sources simultaneously.
6. Register the plugin with the config mechanism by calling e.g. ForcedPlugin.registry.register at module scope (so the registration happens at import-time). Or, in C++, expose the algorithm with Pybind11 as you would any normal C++ class, and call wrapForcedAlgorithm to wrap and register the algorithm simultaneously.
In C++, one can also implement both interfaces at the same time using SimpleAlgorithm; see that class for more information.
Error handling¶
When a plugin (or the C++ algorithm it delegates to) raises any exception, the task calling it will catch the exception and call the fail() method of the plugin, which should cause the plugin to set one or more flags in the output record.
If the exception is a MeasurementError, the task will pass this exception back to the fail() method, as MeasurementError contains additional, plugin-specific information indicating the kind of failure.
For most other exceptions, the task will log the exception message as a warning, and pass None as the exception to fail().
In this case, the plugin should just set the primary failure flag.
This is handled automatically by the FlagHandler in C++-based plugins.
Certain exceptions (in particular, out-of-memory errors) are considered fatal and will always be propagated up out of the task.
Plugin/algorithm code should endeavor to only throw MeasurementError for known failure modes, unless the problem is in the data and can always be fixed there before the measurement framework is invoked.
In other words, we want warnings to appear in the logs only when there’s a bug, whether that’s in the processing before the measurement framework or in a particular plugin/algorithm not knowing and documenting its own failure modes.
This means that plugin/algorithm implementations should generally have a global try/catch block that re-throws lower-level exceptions corresponding to known failure modes as MeasurementErrors.
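The dispatch rules above can be sketched as follows. This is an illustration under stated assumptions, not the task's actual code: `ToyMeasurementError`, `call_plugin`, and `ToyEdgePlugin` are hypothetical, the set of fatal exceptions is reduced to MemoryError, and records are plain dictionaries.

```python
import logging


class ToyMeasurementError(Exception):
    """Hypothetical stand-in for MeasurementError; carries a flag name."""

    def __init__(self, message, flag_name):
        super().__init__(message)
        self.flag_name = flag_name


FATAL = (MemoryError,)  # assumed set of exceptions that must always propagate


def call_plugin(plugin, measRecord, exposure, log):
    """Task-side wrapper: route each kind of exception as the text describes."""
    try:
        plugin.measure(measRecord, exposure)
    except FATAL:
        raise
    except ToyMeasurementError as error:
        plugin.fail(measRecord, error)  # known failure mode: pass it on
    except Exception as error:
        log.warning("Exception in %s.measure: %s", plugin.name, error)
        plugin.fail(measRecord, None)   # unknown failure: general flag only


class ToyEdgePlugin:
    name = "toy_Edge"

    def measure(self, measRecord, exposure):
        # Global try/except: translate a known low-level failure.
        try:
            measRecord["value"] = exposure[measRecord["index"]]
        except IndexError as exc:
            raise ToyMeasurementError("source off the image", "flag_edge") from exc

    def fail(self, measRecord, error=None):
        measRecord["flag"] = True
        if error is not None:
            measRecord[error.flag_name] = True


record = {"index": 5}
call_plugin(ToyEdgePlugin(), record, [1.0, 2.0], logging.getLogger("toy"))
```

Here the out-of-range index is a known failure mode, so the record ends up with both the general flag and `flag_edge` set and nothing is logged as a warning.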
How plugin errors are logged¶
A plugin is usually not run by itself, but as a component of a measurement task.
The measurement task may also be a component or “subtask” of another task, and so on.
When a plugin is run, the measurement task which is running the plugin logs any error which the plugin throws to a log location within the task hierarchy.
For example, when the PsfFlux plugin is run within lsst.pipe.tasks.processCcd.ProcessCcdTask, its errors are logged to processCcd.charImage.measurement.base_PsfFlux.
This log hierarchy allows the log for the PsfFlux plugin to be controlled independently of the other plugins, and also independently of the measurement task log.
Measurement errors are typically logged at the DEBUG level.
When lsst.pipe.tasks.processCcd.ProcessCcdTask is launched, you may selectively set the log level at any level of the hierarchy:
processCcd.py -L processCcd.charImage.measurement=WARN
will set the logging level of the measurement task and all of its plugins to WARN, whereas:
processCcd.py -L=processCcd.charImage.measurement.base_PsfFlux=DEBUG
will selectively set just the PsfFlux algorithm to DEBUG, leaving the task and the other plugins at their default log levels.
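The same inheritance behaviour can be reproduced with Python's standard logging module alone, independent of any command-line handling. The logger name ending in `base_SdssShape` is an assumed example of a sibling plugin log, used here only to show that it inherits the task's level.

```python
import logging

task_log = logging.getLogger("processCcd.charImage.measurement")
psf_log = logging.getLogger("processCcd.charImage.measurement.base_PsfFlux")
# Assumed sibling plugin log, for comparison only.
other_log = logging.getLogger("processCcd.charImage.measurement.base_SdssShape")

# Setting the task's level applies to every plugin log beneath it...
task_log.setLevel(logging.WARNING)
# ...unless a plugin log is given its own explicit level.
psf_log.setLevel(logging.DEBUG)

psf_level = psf_log.getEffectiveLevel()      # explicit level on this logger
other_level = other_log.getEffectiveLevel()  # inherited from the task logger
```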
How a plugin can get its log name¶
Plugins which do not log internally do not need to know the name of their log. However, if you are writing a plugin and wish to have the plugin log messages to the log location described in the previous section, your plugin class (the class, not the instance) must meet the following conditions:
- The plugin class must have a class attribute named hasLogName.
- The class attribute hasLogName must be set to True.
- The class initializer must have a logName parameter.
When all of these conditions are satisfied, the measurement task will initialize the plugin with the logName it uses to log error messages, to enable the plugin to log to the same location.
The plugin may then use one of its base class methods, BasePlugin.getLogName, to get the name of its log.
The plugin may then get the logger:
logger = logging.getLogger(self.getLogName())
Setup for Python plugins which log¶
Here is an example of how a Python plugin which logs internally can be constructed.
Note the class attribute hasLogName and the initialization with an optional logName parameter.
class SingleFrameTestPlugin(SingleFramePlugin):
    ConfigClass = SingleFrameTestConfig
    hasLogName = True

    def __init__(self, config, name, schema, metadata, logName=None):
        SingleFramePlugin.__init__(self, config, name, schema, metadata, logName=logName)
With this configuration, the running task will set the logName parameter when the plugin is initialized, and the plugin’s getLogName() method may subsequently be used to fetch it.
Though it might be overly verbose, a plugin could log at the INFO level each time its measure() method is invoked, using the same logger as the measurement task:
logging.getLogger(self.getLogName()).info("Starting a measurement.")
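How a task might honour hasLogName when constructing plugins can be sketched with toy classes. This is an assumption about the general pattern, not the meas_base code: `ToyBasePlugin`, `init_plugin`, and the way the log name is built from the task name are all hypothetical, and the constructor signatures are shortened.

```python
import logging


class ToyBasePlugin:
    """Hypothetical base class that stores the log name, if any."""

    def __init__(self, config, name, logName=None):
        self.name = name
        self.logName = logName

    def getLogName(self):
        return self.logName


class QuietPlugin(ToyBasePlugin):
    """Does not log, so its initializer has no logName parameter."""

    def __init__(self, config, name):
        super().__init__(config, name)


class LoggingPlugin(ToyBasePlugin):
    hasLogName = True  # opt in to receiving a log name

    def __init__(self, config, name, logName=None):
        super().__init__(config, name, logName=logName)

    def measure(self):
        logging.getLogger(self.getLogName()).info("Starting a measurement.")


def init_plugin(plugin_class, config, name, task_log_name):
    """Pass logName only to classes that declare hasLogName = True."""
    if getattr(plugin_class, "hasLogName", False):
        return plugin_class(config, name, logName=f"{task_log_name}.{name}")
    return plugin_class(config, name)


quiet = init_plugin(QuietPlugin, None, "toy_Quiet", "measurement")
chatty = init_plugin(LoggingPlugin, None, "toy_Chatty", "measurement")
chatty.measure()
```

Checking the class attribute before calling the initializer is what lets plugins without a logName parameter keep working unchanged.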
Setup for C++ algorithms which log¶
C++ algorithms which are called from Python tasks can also get the logName.
To do so, they must have an optional logName argument in their constructor.
Here is an example from PsfFluxAlgorithm:
PsfFluxAlgorithm(Control const & ctrl, std::string const & name, afw::table::Schema & schema,
std::string const & logName = "");
The constructor should include the line:
_logName = logName.size() ? logName : name;
which sets the name to be used for the logger of this plugin to either logName, or to the base name of the plugin if the optional logName argument has not been specified.
The following line must also be added to allow this constructor to be accessed from Python.
cls.def(py::init<PsfFluxAlgorithm::Control const &, std::string const &, afw::table::Schema &,
std::string const &>(),
"ctrl"_a, "name"_a, "schema"_a, "logName"_a);
And finally, hasLogName=True must be added to the Python wrapper:
wrapSimpleAlgorithm(PsfFluxAlgorithm, Control=PsfFluxControl,
TransformClass=PsfFluxTransform, executionOrder=BasePlugin.FLUX_ORDER,
shouldApCorr=True, hasLogName=True)
This constructor allows the algorithm to receive the logName as an optional string, which can later be accessed by its getLogName() method.
In C++, an algorithm may then log to this logName as follows:
LOGL_INFO(logger, message...)
where logger can either be the logName string itself, or a logging.Logger object returned by
logger = logging.getLogger(getLogName());
Using a FlagHandler with Python plugins¶
Review the SingleFramePlugin requirements for the measure() and fail() methods, which a plugin must implement.
When the plugin detects an error, it should raise a MeasurementError, which triggers a call to fail().
The fail() method should set the appropriate failure flags in the output catalog.
A FlagHandler is a convenient way for a plugin to define flags for different error conditions, and to automatically set the correct flags when they occur.
How to define a FlagHandler in Python¶
The meas_base plugins implemented in C++ use lsst::meas::base::FlagHandler to handle measurement exceptions.
In Python, measurement plugins may use a FlagDefinitionList to create an instance of this class.
For a worked example, see testFlagHandler.py in the meas_base tests directory. This unit test defines a Python plugin which illustrates the use of the FlagHandler. Its __init__ method defines a list of three failure flags. As each flag is added, a FlagDefinition is returned which can be used to identify the error later.
When the list is complete, a FlagHandler is created, which initializes the flag fields in the output catalog.
During initialization, a Python plugin may create a FlagHandler with code like the following in __init__():
flagDefs = FlagDefinitionList()
# Keep each FlagDefinition on the instance so measure() can refer to it later.
self.FAILURE = flagDefs.add("flag", "General Failure error")
self.CONTAINS_NAN = flagDefs.add("flag_containsNan", "Measurement area contains a nan")
self.EDGE = flagDefs.add("flag_edge", "Measurement area over edge")
self.flagHandler = FlagHandler.addFields(schema, name, flagDefs)
This code defines the following error flags:
- a general failure flag which indicates that something has gone wrong during measure() (flag);
- a specific failure flag to indicate a source which contains one or more NaNs (flag_containsNan);
- a specific failure flag which indicates that the source is too close to the edge to be measured (flag_edge).
The FlagHandler thus created is then used to implement the fail() method.
Recall that you must implement this method in your plugin class if your plugin can fail.
The following addition to your class will correctly implement the error handling:
def fail(self, measRecord, error=None):
    if error is None:
        self.flagHandler.handleFailure(measRecord)
    else:
        self.flagHandler.handleFailure(measRecord, error.cpp)
When the error is one which your plugin code expects, your code will raise a MeasurementError exception.
The error is sent to the fail() method as its error argument.
The error will indicate which flag should be set in addition to the general failure flag.
If the error argument is not supplied, as when the failure is not expected, only the general failure flag will be set.
For example, if the following condition were encountered during measurement:
if not exposure.getBBox().contains(bbox):
    raise MeasurementError(self.EDGE.doc, self.EDGE.number)
then both the general flag and the more specific flag_edge will be set to True.
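The flag bookkeeping described in this section can be imitated in a few lines of plain Python. This toy is not the real FlagHandler: all `Toy*` classes are hypothetical, records are dictionaries, and the general failure flag is assumed to be the first definition in the list.

```python
class ToyFlagDefinition:
    """Hypothetical flag definition: name, doc string, and position in the list."""

    def __init__(self, name, doc, number):
        self.name, self.doc, self.number = name, doc, number


class ToyFlagDefinitionList:
    def __init__(self):
        self.defs = []

    def add(self, name, doc):
        definition = ToyFlagDefinition(name, doc, len(self.defs))
        self.defs.append(definition)
        return definition


class ToyMeasurementError(Exception):
    """Hypothetical error carrying the number of the specific flag to set."""

    def __init__(self, message, flag_number):
        super().__init__(message)
        self.flag_number = flag_number


class ToyFlagHandler:
    def __init__(self, prefix, flag_defs):
        self.keys = [f"{prefix}_{d.name}" for d in flag_defs.defs]

    def handleFailure(self, record, error=None):
        record[self.keys[0]] = True  # general failure flag (assumed first)
        if error is not None:
            record[self.keys[error.flag_number]] = True


flag_defs = ToyFlagDefinitionList()
FAILURE = flag_defs.add("flag", "General Failure error")
CONTAINS_NAN = flag_defs.add("flag_containsNan", "Measurement area contains a nan")
EDGE = flag_defs.add("flag_edge", "Measurement area over edge")
handler = ToyFlagHandler("toy_Plugin", flag_defs)

expected_failure = {}
handler.handleFailure(expected_failure, ToyMeasurementError(EDGE.doc, EDGE.number))
unexpected_failure = {}
handler.handleFailure(unexpected_failure)
```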
Using the SafeCentroidExtractor¶
The SafeCentroidExtractor may be used to fetch a value for the centroid of a source even if the centroiding algorithm has failed for that source.
This is achieved by falling back to the original detection Footprint for the source.
The SafeCentroidExtractor is defined in the __init__ method of the measurement plugin:
def __init__(self, config, name, schema, metadata):
    SingleFramePlugin.__init__(self, config, name, schema, metadata)
    self.centroidExtractor = lsst.meas.base.SafeCentroidExtractor(schema, name)
self.centroidExtractor() may then be called to fetch the centroid for a record:
center = self.centroidExtractor(measRecord, self.flagHandler)
The SafeCentroidExtractor will first try to read the centroid from the centroid slot.
If the failure flag on the centroid slot has been set to True, it will try to use the detection Footprint to determine the centroid.
This might potentially allow the plugin to complete its measurement if the centroid provided is adequate.
To indicate at the same time that something has gone wrong, the general flag will automatically get set on this source.
The SafeCentroidExtractor will also create a flag called flag_badCentroid which points to the centroid slot failure flag, and can be used to distinguish records where the failure flag has been set because the centroid slot measurement was bad.
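The fallback behaviour can be summarised with a toy function. This is a simplified sketch, not SafeCentroidExtractor itself: `safe_centroid` and the record keys are hypothetical, the footprint is reduced to a single peak position, and `flag_badCentroid` is written as a copy of the slot failure flag rather than an alias to it.

```python
def safe_centroid(record, plugin_prefix):
    """Return the slot centroid, or the footprint peak if the slot has failed."""
    if not record["slot_Centroid_flag"]:
        return record["slot_Centroid"]
    # The slot measurement failed: mark this plugin's record and fall back.
    record[plugin_prefix + "_flag"] = True              # general failure flag
    record[plugin_prefix + "_flag_badCentroid"] = True  # mirrors the slot failure flag
    return record["footprint_peak"]


good = {"slot_Centroid": (10.3, 19.8), "slot_Centroid_flag": False,
        "footprint_peak": (10.0, 20.0)}
bad = {"slot_Centroid": (float("nan"), float("nan")), "slot_Centroid_flag": True,
       "footprint_peak": (10.0, 20.0)}

good_center = safe_centroid(good, "toy_Plugin")
bad_center = safe_centroid(bad, "toy_Plugin")
```

A plugin using the fallback position can still produce a value, while the two flags record that the value came from a degraded centroid.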