.. py:currentmodule:: lsst.meas.base

#######################################
Measurement tasks and algorithm plugins
#######################################

Introduction
============

The meas_base package is the new home of the source measurement framework, which was formerly part of meas_algorithms.

The source measurement framework is a set of Python modules which allow measurements to be performed on calibrated exposures.
The framework assumes that a detection catalog has been prepared to identify the objects to be measured.
The detection catalog may have been produced with a detection pass on the exposure itself, but it might also be produced from other exposures, stacks, or even multiple exposures.
Deblending may or may not have been performed during the creation of the detection catalog.

The framework steps through the detection catalog and performs a set of measurements, object by object, supplying each measurement plugin with the exposure and catalog information needed for its measurement.
The measurement results (values, errors, and flags) are then placed in an output catalog.

The measurement framework includes the following features:

- :lsst-task:`lsst.meas.base.sfm.SingleFrameMeasurementTask`, a subtask that measures sources after detection and deblending have been performed on the same image.
- Several tasks for forced photometry.
- Python Plugin base classes (`SingleFramePlugin`, `ForcedPlugin`) for single-frame and forced measurement.
- Helper code to reduce boilerplate, standardize outputs, and make algorithm code easy to reuse when implementing new measurement algorithms.

Single-frame measurement
========================

The single-frame measurement framework is used when all the information about the sources to be measured comes from a single image, and hence those sources are detected (and possibly deblended) on that image before measurement.
This image may be a coadd of other images, or even a difference of images: from the perspective of the measurement framework there is essentially no difference between these cases (though there may be important differences for particular measurement algorithms).

The high-level algorithm for single-frame measurement is:

- The :lsst-task:`lsst.meas.base.sfm.SingleFrameMeasurementTask` is initialized.
  This initializes all the configured algorithms, creating a schema for the outputs in the process.
  After this stage, neither the schema nor the algorithm configuration can be modified.
- `SingleFrameMeasurementTask.run` is called on each image to be processed with a `~lsst.afw.table.SourceCatalog` containing all of the sources to be measured.
  These sources must have `~lsst.afw.detection.Footprint`\s (generated by :lsst-task:`lsst.meas.algorithms.SourceDetectionTask`) and a schema that matches that constructed by the previous step.
  The fields added to the schema during initialization will then be filled in by the measurement framework.
- Before measuring any sources, the measurement framework replaces all sources in the catalog with noise (see `NoiseReplacer`), using the `~lsst.afw.detection.Footprint`\s attached to the `~lsst.afw.table.SourceCatalog` to define their boundaries.
- We then loop over all "parent" sources in the catalog --- both those that were not blended, and those that represent the pre-deblend combined state of blends.
  For each parent, we loop again over all its children (if any), and for each of these, we re-insert the child source into the image (which, recall, currently contains only noise) and call ``measure()`` on each of the plugins, and then replace the child source with noise again.
  We then insert the parent source, and again call ``measure()`` on all of the plugins.
  Before replacing the parent with noise again, we then call ``measureN()`` twice for each plugin: once with the list of all children, and once with a single-element list containing just the parent.
  This ensures that each source (parent or child) is measured with both ``measure()`` and ``measureN()``, with the former preceding the latter.

Because measurement plugin algorithms are often dependent on each other (in particular, most measurements require a centroid as an input), they must be run in a particular order, and we need a mechanism for passing information between them.
The order is defined by the ``executionOrder`` config parameter, which is defined in :lsst-config:`lsst.meas.base.pluginsBase.BasePluginConfig`, and hence present for every plugin.
Generally, these will remain at their default values; it is the responsibility of a plugin implementor to ensure the default for that plugin is appropriate relative to any plugins it depends on.
See `BasePlugin` for some guidelines.

The mechanism for passing information between plugins is `~lsst.afw.table.SourceTable`\'s slot system (see `lsst.afw.table.SlotDefinition`), in which particular measurements are given one of several predefined aliases (e.g. ``slotCentroid`` may be an alias for ``base_SdssCentroid``) which are used to implement getters on `lsst.afw.table.SourceRecord` (e.g. `~lsst.afw.table.SourceRecord.getCentroid`).
The measurement framework's configuration defines which measurements are assigned to each slot, and these slot measurements are available to other plugins as soon as the plugin whose outputs are assigned to the slot is run.

All this means that algorithms that need a centroid as input should simply call `~lsst.afw.table.SourceRecord.getCentroid` on the `lsst.afw.table.SourceRecord` with which they are provided, and ensure that their ``executionOrder`` is higher than that of centroid algorithms.
Similarly, algorithms that want a shape should simply call `~lsst.afw.table.SourceRecord.getShape`.
Things are a bit trickier for centroid algorithms, which often need to be given an approximate centroid as an input; these should be prepared to look at the `~lsst.afw.detection.Peak`\s attached to the `~lsst.afw.table.SourceRecord`\'s `lsst.afw.detection.Footprint` as an initial value, as the slot centroid may not yet be valid.
For wrapped C++ algorithms, this is handled automatically.

Forced photometry
=================

In forced photometry, an external "reference" catalog is used to constrain measurements on an image.
While parts of the forced photometry framework could be used with a reference catalog from virtually any source, a complete system for loading the reference catalogs that correspond to the region of sky being measured is only available when measurements from a coadd are used as the reference.

While essentially any measurement plugin can be run in forced mode, typically only photometric measurements are scientifically useful (though centroids and shapes may be useful for quality metrics).
In fact, in forced mode we typically configure pseudo-measurements to provide the shape and centroid slots, and it is these --- rather than anything special about the forced measurement framework --- that constrain measurements.
In particular, we generally use the `ForcedTransformedCentroidPlugin` and `ForcedTransformedShapePlugin` to provide the centroid and shape slots.
Rather than measure the centroid and shape on the image, these simply transform the centroid and shape slots from the reference catalog to the appropriate coordinate system.
This ensures that measurements that use these slots to obtain positions and ellipses use the same quantities used in generating the reference catalog.

The core of the forced measurement framework is :lsst-task:`lsst.meas.base.forcedMeasurement.ForcedMeasurementTask` and `ForcedPlugin`, which broadly parallel :lsst-task:`lsst.meas.base.sfm.SingleFrameMeasurementTask` and `SingleFramePlugin`.
The high-level algorithm is essentially the same, but with the `~lsst.afw.table.SourceCatalog` to be measured generated by `ForcedMeasurementTask.generateSources` from the reference catalog, rather than provided by the user after running detection.
The corresponding reference source and the `~lsst.afw.geom.SkyWcs` objects that define the mapping between reference and measurement coordinate systems are also provided to each plugin.

The fact that the sources to be measured are generated from the reference catalog means that the `lsst.afw.detection.Footprint`\s attached to these sources must be transformed from the reference coordinate system to the measurement coordinate system, and at present that operation turns "heavy" footprints (i.e., including pixel data) into regular `lsst.afw.detection.Footprint`\s.
Heavy footprints for child sources are necessary in order to correctly replace neighboring children of the same parent with noise prior to measurement (see `NoiseReplacer`), and the lack of these means that deblended measurement in forced photometry is essentially broken, except for plugins that implement ``measureN`` and can hence correctly measure all children simultaneously without having to replace them with noise individually.

In addition to the :lsst-task:`lsst.meas.base.forcedMeasurement.ForcedMeasurementTask` subtask and its plugins, the forced measurement framework also contains a pair of command-line driver tasks, :lsst-task:`lsst.meas.base.forcedPhotCcd.ForcedPhotCcdTask` and :lsst-task:`lsst.meas.base.forcedPhotCoadd.ForcedPhotCoaddTask`.
These run forced measurement on CCD-level images and coadd patch images, respectively, using the outputs of a previous single-frame measurement run on coadds as the reference catalog in both cases.
These delegate the work of loading (and as necessary, filtering and merging) the appropriate reference catalog for the measurement image to a ``references`` subtask.
The interface for the reference subtask is defined by :lsst-task:`lsst.meas.base.references.BaseReferencesTask`, with the concrete implementation that utilizes coadd processing outputs in :lsst-task:`lsst.meas.base.references.CoaddSrcReferencesTask`.
In general, to use a reference catalog from another source, one should implement a new references subtask, and reuse :lsst-task:`lsst.meas.base.forcedPhotCcd.ForcedPhotCcdTask` and/or :lsst-task:`lsst.meas.base.forcedPhotCoadd.ForcedPhotCoaddTask`.
It should only be necessary to replace these and use :lsst-task:`lsst.meas.base.forcedMeasurement.ForcedMeasurementTask` directly if you need to run forced photometry on data that isn't organized by the Butler or doesn't correspond to CCD- or patch-level images.

Implementing new plugins and algorithms
=======================================

The "Plugin" interfaces used directly by the measurement tasks are defined completely in Python, and are rooted in the abstract base classes `SingleFramePlugin` and `ForcedPlugin`.
There are also analogous C++ base classes, ``SingleFrameAlgorithm`` and ``ForcedAlgorithm``, for plugins implemented in C++, as well as ``SimpleAlgorithm``, a C++ base class for simple algorithms in which the same implementation can be used for both single-frame and forced measurement.

For a single-frame plugin or algorithm:

- Subclass `SingleFramePlugin` (Python) or ``SingleFrameAlgorithm`` (C++).
- Implement an ``__init__`` method with the same signature as the base class, in which fields saved by the plugin should be added to the schema passed to ``__init__``, with keys saved as instance attributes for future use.
  In C++, implement a constructor with one of the signatures supported by `wrapSingleFrameAlgorithm`.
- Reimplement ``measure()`` to perform the actual measurement and save the result in the ``measRecord`` argument.
- Reimplement ``fail()`` unless the plugin cannot fail (except for environment errors).
- Reimplement ``measureN()`` if the plugin supports measuring multiple sources simultaneously.
- Register the plugin with the config mechanism by calling e.g. `SingleFramePlugin.registry.register` at module scope (so the registration happens at import-time).
  Or, in C++, expose the algorithm with Pybind11 as you would any normal C++ class, and call `wrapSingleFrameAlgorithm` to wrap and register the algorithm simultaneously.

For a forced plugin or algorithm:

- Subclass `ForcedPlugin` (Python) or ``ForcedAlgorithm`` (C++).
- Implement an ``__init__`` method with the same signature as the base class, in which fields saved by the plugin should be added to the ``outputSchema`` of the `~lsst.afw.table.SchemaMapper` passed to ``__init__``, with keys saved as instance attributes for future use.
  In C++, implement a constructor with one of the signatures supported by `wrapForcedAlgorithm`.
- Reimplement ``measure()`` (in Python) or ``measureForced`` (C++) to perform the actual measurement and save the result in the ``measRecord`` argument.
  Note that the ``refRecord`` and ``refWcs`` are available during the measurement if needed.
- Reimplement ``fail()`` unless the plugin cannot fail (except for environment errors).
- Reimplement ``measureN()`` (Python) or ``measureNForced()`` (C++) if the plugin supports measuring multiple sources simultaneously.
- Register the plugin with the config mechanism by calling e.g. `ForcedPlugin.registry.register` at module scope (so the registration happens at import-time).
  Or, in C++, expose the algorithm with Pybind11 as you would any normal C++ class, and call `wrapForcedAlgorithm` to wrap and register the algorithm simultaneously.

In C++, one can also implement both interfaces at the same time using `SimpleAlgorithm`; see that class for more information.

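
The single-frame steps listed above can be sketched in plain Python.
To keep the sketch runnable without the LSST stack, ``ToySchema``, ``ToyPlugin``, ``ToyMeanPlugin``, and ``runPlugin`` are hypothetical stand-ins invented for this illustration; they are not part of the meas_base API, plain dictionaries stand in for records, and a list of pixel values stands in for the exposure.
Only the overall shape (add fields in ``__init__``, implement ``measure()`` and ``fail()``, register at module scope) mirrors the steps described in this section.

.. code-block:: py

    # Hypothetical stand-ins: none of these names exist in lsst.meas.base.
    class ToySchema:
        """Minimal stand-in for a table schema: maps field names to docs."""

        def __init__(self):
            self.fields = {}

        def addField(self, name, doc):
            self.fields[name] = doc
            return name  # in this toy, the "key" is just the field name


    class ToyPlugin:
        """Stand-in base class holding a registry of plugin classes."""

        registry = {}

        def __init__(self, config, name, schema, metadata):
            self.config = config
            self.name = name

        def fail(self, measRecord, error=None):
            raise NotImplementedError


    class ToyMeanPlugin(ToyPlugin):
        def __init__(self, config, name, schema, metadata):
            super().__init__(config, name, schema, metadata)
            # Add output fields to the schema and keep the keys for later use.
            self.valueKey = schema.addField(name + "_value", "mean pixel value")
            self.flagKey = schema.addField(name + "_flag", "general failure flag")

        def measure(self, measRecord, pixels):
            # Perform the measurement and store the result in the record.
            if not pixels:
                raise ValueError("no pixels to measure")
            measRecord[self.valueKey] = sum(pixels) / len(pixels)

        def fail(self, measRecord, error=None):
            # Set the general failure flag in the output record.
            measRecord[self.flagKey] = True


    # Register at module scope so that registration happens at import time.
    ToyPlugin.registry["toy_Mean"] = ToyMeanPlugin


    def runPlugin(plugin, measRecord, pixels):
        """Mimic the calling task: run measure() and call fail() on error."""
        try:
            plugin.measure(measRecord, pixels)
        except Exception:
            plugin.fail(measRecord)


    schema = ToySchema()
    plugin = ToyPlugin.registry["toy_Mean"](None, "toy_Mean", schema, None)
    good, bad = {}, {}
    runPlugin(plugin, good, [1.0, 2.0, 3.0])
    runPlugin(plugin, bad, [])

In this toy, ``good`` ends up holding the measured value while ``bad`` holds only the failure flag, which is the division of labor between ``measure()`` and ``fail()`` that the real base classes expect.
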

Error handling
==============

When a plugin (or the C++ algorithm it delegates to) raises any exception, the task calling it will catch the exception, and call the ``fail()`` method of the plugin, which should cause the plugin to set one or more flags in the output record.
If the exception is a `MeasurementError`, the task will pass this exception back to the ``fail()`` method, as `MeasurementError` contains additional, plugin-specific, information indicating the kind of failure.
For most other exceptions, the task will log the exception message as a warning, and pass `None` as the exception to ``fail()``.
In this case, the plugin should just set the primary failure flag.
This is handled automatically by the ``FlagHandler`` in C++-based plugins.
Certain exceptions (in particular, out-of-memory errors) are considered fatal and will always be propagated up out of the task.

Plugin/algorithm code should endeavor to only throw `MeasurementError` for known failure modes, unless the problem is in the data and can always be fixed there before the measurement framework is invoked.
In other words, we want warnings to appear in the logs only when there's a bug, whether that's in the processing before the measurement framework or in a particular plugin/algorithm not knowing and documenting its own failure modes.
This means that plugin/algorithm implementations should generally have a global try/catch block that re-throws lower-level exceptions corresponding to known failure modes as `MeasurementError`\s.

How plugin errors are logged
============================

A plugin is usually not run by itself, but as a component of a measurement task.
The measurement task may also be a component or "subtask" of another task, and so on.
When a plugin is run, the measurement task which is running the plugin logs any error which the plugin throws to a log location within the task hierarchy.
For example, when the ``PsfFlux`` plugin is run within :lsst-task:`lsst.pipe.tasks.processCcd.ProcessCcdTask`, its errors are logged to ``processCcd.charImage.measurement.base_PsfFlux``.

This log hierarchy allows the log for the ``PsfFlux`` plugin to be controlled independently of the other plugins, and also independently of the measurement task log.
Measurement errors are typically logged at the ``DEBUG`` level.

When :lsst-task:`lsst.pipe.tasks.processCcd.ProcessCcdTask` is launched, you may selectively modify the log level of any level of the hierarchy::

    processCcd.py -L processCcd.charImage.measurement=WARN

will set the logging level of the measurement task and all of its plugins to ``WARN``, whereas::

    processCcd.py -L=processCcd.charImage.measurement.base_PsfFlux=DEBUG

will selectively set just the ``PsfFlux`` algorithm to ``DEBUG``, leaving the task and the other plugins at their default log levels.

How a plugin can get its log name
=================================

Plugins which do not log internally do not need to know the name of their log.
However, if you are writing a plugin and wish to have the plugin log messages to the log location described in the previous section, you must add the following to your plugin class (the class, not the instance):

- The plugin class must have a class attribute named ``hasLogName``.
- The class attribute ``hasLogName`` must be set to ``True``.
- The class initializer must have a ``logName`` parameter.

When all of these conditions are satisfied, the measurement task will initialize the plugin with the ``logName`` it uses to log error messages, to enable the plugin to log to the same location.
The plugin may then use one of its base class methods, `BasePlugin.getLogName`, to get the name of its log.
The plugin may then get the logger:

.. code-block:: py

    logger = lsst.log.Log.getLogger(self.getLogName())

Setup for Python plugins which log
----------------------------------

Here is an example of how a Python plugin which logs internally can be constructed.
Note the class attribute ``hasLogName`` and the initialization with an optional ``logName`` parameter.

.. code-block:: py

    class SingleFrameTestPlugin(SingleFramePlugin):
        ConfigClass = SingleFrameTestConfig
        hasLogName = True

        def __init__(self, config, name, schema, metadata, logName=None):
            SingleFramePlugin.__init__(self, config, name, schema, metadata, logName=logName)

With this configuration, the running task will set the ``logName`` parameter when the plugin is initialized, and the plugin's ``getLogName()`` method may subsequently be used to fetch it.

Though it might be overly verbose, a plugin could log at the ``INFO`` level each time its ``measure()`` method is invoked, using the same logger as the measurement task:

.. code-block:: py

    lsst.log.Log.getLogger(self.getLogName()).info("Starting a measurement.")

Setup for C++ algorithms which log
----------------------------------

C++ algorithms which are called from Python tasks can also get the ``logName``.
To do so, they must have an optional ``logName`` argument in their constructor.
Here is an example from ``PsfFluxAlgorithm``:

.. code-block:: cpp

    PsfFluxAlgorithm(Control const & ctrl, std::string const & name, afw::table::Schema & schema,
                     std::string const & logName = "");

The constructor should include the line:

.. code-block:: cpp

    _logName = logName.size() ? logName : name;

which sets the name to be used for the logger of this plugin to either ``logName``, or to the base name of the plugin if the optional ``logName`` argument has not been specified.

The following line must also be added to the Pybind11 wrapper to allow this constructor to be accessed from Python.

.. code-block:: cpp

    cls.def(py::init<PsfFluxAlgorithm::Control const &, std::string const &, afw::table::Schema &,
                     std::string const &>(),
            "ctrl"_a, "name"_a, "schema"_a, "logName"_a);

Finally, ``hasLogName=True`` must be added to the Python wrapper:

.. code-block:: py

    wrapSimpleAlgorithm(PsfFluxAlgorithm, Control=PsfFluxControl,
                        TransformClass=PsfFluxTransform, executionOrder=BasePlugin.FLUX_ORDER,
                        shouldApCorr=True, hasLogName=True)

This constructor allows the algorithm to receive the ``logName`` as an optional string, which can later be accessed by its ``getLogName()`` method.

In C++, an algorithm may then log to this ``logName`` as follows:

.. code-block:: cpp

    LOGL_INFO(logger, message...)

where ``logger`` can either be the ``logName`` string itself, or a `lsst.log.Log` object returned by:

.. code-block:: py

    logger = lsst.log.Log.getLogger(getLogName());

Using a `FlagHandler` with Python plugins
=========================================

Review the `SingleFramePlugin` requirements for the ``measure()`` and ``fail()`` methods, which a plugin must implement.
When the plugin detects an error, it should raise a `MeasurementError`, which triggers a call to ``fail()``.
The ``fail()`` method should set the appropriate failure flags in the output catalog.
A `FlagHandler` is a convenient way for a plugin to define flags for different error conditions, and to automatically set the correct flags when they occur.

How to define a `FlagHandler` in Python
---------------------------------------

The meas_base plugins implemented in C++ use :cpp:class:`lsst::meas::base::FlagHandler` to handle measurement exceptions.
In Python, measurement plugins may use a `FlagDefinitionList` to create an instance of this class.

First examine the code in ``testFlagHandler.py`` in the meas_base tests directory.
This unit test defines a Python plugin which illustrates the use of the `FlagHandler`.
The ``__init__`` method shown below defines a list of three failure flags.
As each flag is added, a `FlagDefinition` is returned which can be used to identify the error later.
When the list is complete, a `FlagHandler` is created, which initializes the flag fields in the output catalog.

During initialization, a Python plugin may create a `FlagHandler` with code like the following in ``__init__()``:

.. code-block:: py

    flagDefs = FlagDefinitionList()
    # Keep each FlagDefinition on the instance so measure() can refer to it later.
    self.FAILURE = flagDefs.add("flag", "General Failure error")
    self.CONTAINS_NAN = flagDefs.add("flag_containsNan", "Measurement area contains a NaN")
    self.EDGE = flagDefs.add("flag_edge", "Measurement area over edge")
    self.flagHandler = FlagHandler.addFields(schema, name, flagDefs)

This code defines the following error flags:

- a general failure flag which indicates that something has gone wrong during ``measure()`` (``flag``);
- a specific failure flag to indicate a source which contains one or more NaNs (``flag_containsNan``);
- a specific failure flag which indicates that the source is too close to the edge to be measured (``flag_edge``).

The `FlagHandler` thus created is then used to implement the ``fail()`` method.
Recall that you must implement this method in your plugin class if your plugin can fail.
The following addition to your class will correctly implement the error handling:

.. code-block:: py

    def fail(self, measRecord, error=None):
        if error is None:
            self.flagHandler.handleFailure(measRecord)
        else:
            self.flagHandler.handleFailure(measRecord, error.cpp)

When the error is one which your plugin code expects, your code will raise a `MeasurementError` exception.
The error is sent to the ``fail()`` method as its ``error`` argument.
The error will indicate which flag should be set in addition to the general failure flag.
If the error argument is not supplied, as when the failure is not expected, only the general failure flag will be set.

For example, if the following condition was encountered during measurement:

.. code-block:: py

    if not exposure.getBBox().contains(bbox):
        raise MeasurementError(self.EDGE.doc, self.EDGE.number)

then both the general ``flag`` and the more specific ``flag_edge`` will be set to ``True``.

Using the `SafeCentroidExtractor`
---------------------------------

The `SafeCentroidExtractor` may be used to fetch a value for the centroid of a source even if the centroiding algorithm has failed for that source.
This is achieved by falling back to the original detection `~lsst.afw.detection.Footprint` for the source.

The `SafeCentroidExtractor` is defined in the ``__init__`` method of the measurement plugin:

.. code-block:: py

    def __init__(self, config, name, schema, metadata):
        SingleFramePlugin.__init__(self, config, name, schema, metadata)
        self.centroidExtractor = lsst.meas.base.SafeCentroidExtractor(schema, name)

``self.centroidExtractor()`` may then be called to fetch the centroid for a record:

.. code-block:: py

    center = self.centroidExtractor(measRecord, self.flagHandler)

The `SafeCentroidExtractor` will first try to read the centroid from the centroid slot.
If the failure flag on the centroid slot has been set to ``True``, it will try to use the detection `~lsst.afw.detection.Footprint` to determine the centroid.
This may allow the plugin to complete its measurement if the centroid provided is adequate.
To indicate at the same time that something has gone wrong, the general flag will automatically get set on this source.

The `SafeCentroidExtractor` will also create a flag called ``flag_badCentroid`` which points to the centroid slot failure flag, and can be used to distinguish records where the failure flag has been set because the centroid slot measurement was bad.
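
The fallback order described above can be illustrated with a small, self-contained sketch.
This is not the `SafeCentroidExtractor` implementation: ``extractCentroid`` and every dictionary key below are hypothetical names chosen for this illustration, a plain dictionary stands in for the source record, and using a single stored peak position as "the footprint's centroid" is an assumption of the sketch.

.. code-block:: py

    # Hypothetical stand-in for a source record: a plain dict whose key
    # names are illustrative only, not real schema fields.
    def extractCentroid(record, generalFlagKey):
        """Return an (x, y) centroid, falling back to the footprint if needed."""
        if not record["centroid_flag"]:
            # The centroid slot is valid: use it as-is.
            return record["centroid"]
        # The slot measurement failed: mark this measurement as suspect via
        # the general flag and fall back to a position from the footprint.
        record[generalFlagKey] = True
        return record["footprint_peak"]


    goodRecord = {"centroid": (10.2, 20.7), "centroid_flag": False,
                  "footprint_peak": (10, 21), "toy_flag": False}
    badRecord = {"centroid": (float("nan"), float("nan")), "centroid_flag": True,
                 "footprint_peak": (33, 44), "toy_flag": False}

    goodCenter = extractCentroid(goodRecord, "toy_flag")
    badCenter = extractCentroid(badRecord, "toy_flag")

The record with a valid slot centroid is returned unchanged, while the record with a failed centroid yields the fallback position and has its general flag set, so downstream users can tell that the measurement ran on a degraded input.
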