#########################
Deblending Flags Overview
#########################

When performing any analysis on an `lsst.afw.table.SourceCatalog` it is important to understand the different subsets of sources available, both to ensure that the sources being analyzed are unique and to understand the biases associated with them.
The aim of this document is to help you understand the different collections of sources generated by the deblender and the flags available to select them, with a little background on where the sources come from and why these flags exist.

Quick reference
===============

Below is a lookup table showing the relationship between the different flags and subsets of a `~lsst.afw.table.SourceCatalog`, placed here to make it easy to find after you have read and understood the remainder of this page.

.. list-table:: Deblending flags.
   :header-rows: 1
   :align: center

   * - Datasets
     - isPatchInner
     - isTractInner
     - isIsolated
     - fromBlend
     - isDeblendedSource
     - isDeblendedModelSource
     - isPrimary
   * - isolated sources
     - x
     - x
     - x
     -
     - x
     -
     - x
   * - parents with multiple peaks
     - x
     - x
     -
     -
     -
     -
     -
   * - models of isolated sources
     - x
     - x
     - x
     -
     -
     - x
     -
   * - models of deblended sources
     - x
     - x
     -
     - x
     - x
     - x
     - x
   * - sky objects
     - x
     - x
     - x
     - x
     - x
     - x
     -
   * - patch overlap regions
     -
     - x
     - x
     - x
     - x
     - x
     -
   * - tract overlap regions
     - x
     -
     - x
     - x
     - x
     - x
     -

Deblender input
===============

The input ``mergeDet`` catalog to the deblender contains a list of parent sources, each row consisting of information about the parent, a footprint (a boolean mask of pixels in the input image that were detected as part of the parent blend), and a peak catalog (a list of peak locations and central flux values for all detected peaks in the parent blend).
From here things change depending on which deblender is used, as single visit deblending uses the single-band :doc:`/modules/lsst.meas.deblender` while multi-band deblending uses :doc:`lsst.meas.extensions.scarlet` and generates a slightly different set of outputs.

Single-band deblender
=====================

Prior to the adoption of scarlet as the default deblender for coadds, all deblending was done with `lsst.meas.deblender.SourceDeblendTask`, which is still the single visit (single band) deblender.
The basic idea of the algorithm is that most galaxies roughly exhibit 180 degree symmetry and (at shallow depths) are only blended with one other object in most cases.
So a symmetric template (numerically equivalent to ``x = np.minimum(x, x[::-1])``) is made for all of the sources in a blend that do not fit the PSF, and the flux in the image is re-apportioned to each source based on the ratio of the templates for each pixel.
This algorithm can fail for non-symmetric sources and for any blends where a source is blended with neighbors on both sides, causing inaccuracies in the measurement of all three sources.
As the depth of the images increases, blending becomes more severe and the number of cases where this algorithm fails increases, which is why a different deblender is used for co-added images in multiple bands.

Because the templates generated by the deblender are used to weight the flux from the image, the total flux in the footprints is conserved.
This means that for isolated sources, the template that would be created by `lsst.meas.deblender.SourceDeblendTask` is irrelevant, as all of the flux in the footprint would be returned.

The result is an output catalog (in each band) with all of the parent blends at the top followed by all of the children deblended from one of the parents.
So when this was the only deblender, selecting a set of unique sources was easy: you just cut on ``deblend_nChild == 0``.
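
As a minimal sketch of that cut, assume a toy single-band catalog held as a list of plain dictionaries rather than a real `lsst.afw.table.SourceCatalog`. Only the ``parent`` and ``deblend_nChild`` column names come from the text above; the ids, the values, and the ``select_unique`` helper are hypothetical and exist only for illustration.

```python
# Toy catalog laid out like the single-band output: parents first, then children.
# All ids and values are invented for illustration.
catalog = [
    {"id": 1, "parent": 0, "deblend_nChild": 0},  # isolated source (never deblended)
    {"id": 2, "parent": 0, "deblend_nChild": 2},  # parent blend with two children
    {"id": 3, "parent": 2, "deblend_nChild": 0},  # child deblended from parent 2
    {"id": 4, "parent": 2, "deblend_nChild": 0},  # child deblended from parent 2
]


def select_unique(records):
    """Keep records that have no deblended children (hypothetical helper)."""
    return [rec for rec in records if rec["deblend_nChild"] == 0]


unique_ids = [rec["id"] for rec in select_unique(catalog)]
print(unique_ids)
```

The parent blend (id 2) is dropped because its flux is already represented by its children, while the isolated source and both children are kept.
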
This selected all of the isolated sources (from the parent section) and all of the deblended child sources.

Multi-band deblender (scarlet)
==============================

`lsst.meas.extensions.scarlet.ScarletDeblendTask` is different, as it uses `scarlet `__ to create a *model* for each source in a blend.
This is a philosophically different object, as there is no longer an assumption that all of the flux in the input image will be modeled by one of the children in the blend (this may change in the near future, but this is the current implementation).
The results of the scarlet deblender will be biased by the assumptions that went into making the models, so it was decided that it would be a good idea to (by default) also model all of the isolated sources.
This allows comparisons of scarlet models of isolated sources with the un-modeled isolated source measurements, to investigate the biases that scarlet introduces, and also gives users the option to choose between the un-modeled (parent) isolated source records and the scarlet model version of each isolated source.
However, this flexibility forced a change in the way that you select unique objects in a source catalog.

Flags set by the deblender
==========================

In addition to the flags set by :lsst-task:`~lsst.pipe.tasks.SetPrimaryFlagsTask` it is useful to understand the flags that are set in `lsst.meas.deblender.SourceDeblendTask` and `lsst.meas.extensions.scarlet.ScarletDeblendTask` that relate to source selection.

- ``parent``: the id in the catalog for the parent of this source record.
  This is actually set pre-deblender, where all top level records have ``parent=0``.
- ``deblend_nPeaks``: the number of peaks contained in the source's footprint.
- ``deblend_nChild``: the number of peaks deblended by the deblender from this source and created as new source records in the catalog.
  This is different from ``deblend_nPeaks`` in that isolated sources that are not deblended by `lsst.meas.deblender.SourceDeblendTask` and child peaks that were culled during deblending are not included in this count.
- ``deblend_parentNPeaks``: the number of peaks contained in the parent of this source record.
- ``deblend_parentNChild``: the number of children deblended from the parent of this source record.

isPrimary and other flags added in lsst.pipe.tasks
==================================================

In addition to source records for deblended parents and multiple entries for isolated sources, output catalogs are also not unique because they may contain "pseudo" sources (e.g. sky objects that have been added to assist with calibration but are not output sources) and, if the analysis is done over multiple patches and/or tracts, sources in the overlap region can exist in multiple overlapping patches (but always on the interior of only one).
For this reason the :lsst-task:`~lsst.pipe.tasks.SetPrimaryFlagsTask` task sets a number of useful flags to help you determine a unique output catalog for your analysis.

detect\_isPatchInner and detect\_isTractInner
---------------------------------------------

``True`` when:

- A source is in the inner region of a patch (``detect_isPatchInner``)
- A source is in the inner region of a tract (``detect_isTractInner``)

The ``detect_isPatchInner`` and ``detect_isTractInner`` flags are used to identify sources that are contained in the interior region of a patch (or tract).
By definition every point on the sky is located in the interior of exactly one patch and one tract, but patches and tracts also include an outer region that overlaps with neighboring patches/tracts.
Sources with a ``False`` value for either flag lie in an overlap region and will show up multiple times in a combined catalog.
In practice it would be useful to have a more clever algorithm for choosing which source to use on the edge of a patch/tract, since some sources will be cut off, but these flags give a quick way to ensure that a catalog using multiple tracts/patches is unique.
So an easy way to get unique sources is to select all of the sources with ``detect_isPatchInner & detect_isTractInner``.

sky\_source and merge\_peak\_sky
--------------------------------

``True`` when:

- A source is flagged as a ``sky_source`` in a single visit catalog, or
- A source is flagged as ``merge_peak_sky`` in a ``mergeDet`` coadd catalog.

``sky_source`` is a flag in a single visit catalog to mark sky objects, while ``merge_peak_sky`` is the coadd version (which states that a source was a sky object in at least one band).
Any sources with either of these flags set should be ignored in a final source catalog as they are not astrophysical objects.

detect\_isIsolated
------------------

``True`` when:

- A source only has a single peak (``deblend_nPeaks == 1``)
- A source is a top level parent (``parent == 0``) or its parent only had a single peak (``deblend_parentNPeaks == 1``)

The ``detect_isIsolated`` flag marks sources that are not contained in a blend.
This covers both isolated sources that are not modeled by the deblender (parents) and (in cases where the multi-band deblender is used) scarlet models of the isolated sources.
Note that cutting on this flag will *not* give a unique set of sources, but can be useful for selecting all of the isolated sources to analyze the differences between measurements made on scarlet models and measurements made on the same isolated sources.

detect\_fromBlend
-----------------

``True`` when:

- A source is deblended from a parent that had multiple children (``deblend_parentNChild > 1``)

The ``detect_fromBlend`` flag is used to mark sources that were deblended from a parent that contained multiple children.
This is *not* the opposite of ``detect_isIsolated`` because it does not contain parents that were deblended into multiple sources.

detect_isDeblendedSource
------------------------

``True`` when:

- The source is a top level parent and it is isolated ``(detect_isIsolated & parent==0)`` or
- The source was deblended from a parent with multiple children and has no children of its own ``(detect_fromBlend & deblend_nPeaks == 1)``

Current testing shows that the un-modeled isolated source measurements perform (perhaps unsurprisingly) better than the scarlet models of isolated sources in most cases, so the default set of unique sources uses the unmodeled (parent) isolated sources and scarlet models for sources in blends with multiple children.
These sources are identified using the ``detect_isDeblendedSource`` flag, which is equivalent to ``(detect_isIsolated & parent==0) | (detect_fromBlend & deblend_nPeaks == 1)``.
Checking that deblended sources only have a single peak in their footprints allows for potential hierarchical deblending in the future, where there may be several different hierarchies of deblended sources.

detect\_isDeblendedModelSource
------------------------------

``True`` when:

- The source is not a top level parent (``parent != 0``)
- The source does not have any children (``deblend_nPeaks == 1``)

The ``detect_isDeblendedModelSource`` flag only exists when the multi-band deblender is used, marking sources that were deblended from a parent.
This includes both isolated sources that were modeled by scarlet and sources deblended from a parent with multiple child peaks.
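
The conditions listed in the preceding sections can be checked against each other on a toy catalog. The sketch below is not how :lsst-task:`~lsst.pipe.tasks.SetPrimaryFlagsTask` is implemented; it only re-evaluates the stated conditions in plain Python. The rows are invented, the ``deblend_flags`` and ``ids_with`` helpers are hypothetical, and the zeros used for ``deblend_parentNPeaks`` and ``deblend_parentNChild`` on top level records are an assumption made for this example.

```python
# Column names follow the text above; every row value is invented for illustration.
COLUMNS = ("id", "parent", "deblend_nPeaks", "deblend_parentNPeaks", "deblend_parentNChild")
ROWS = [
    (1, 0, 1, 0, 0),  # isolated top level parent (parent* columns assumed to be 0)
    (2, 0, 2, 0, 0),  # top level parent blend with two peaks
    (3, 1, 1, 1, 1),  # scarlet model of isolated source 1
    (4, 2, 1, 2, 2),  # child deblended from parent 2
    (5, 2, 1, 2, 2),  # child deblended from parent 2
]
catalog = [dict(zip(COLUMNS, row)) for row in ROWS]


def deblend_flags(rec):
    """Evaluate the flag conditions quoted in the text for one record."""
    single_peak = rec["deblend_nPeaks"] == 1
    top_level = rec["parent"] == 0
    is_isolated = single_peak and (top_level or rec["deblend_parentNPeaks"] == 1)
    from_blend = rec["deblend_parentNChild"] > 1
    return {
        "detect_isIsolated": is_isolated,
        "detect_fromBlend": from_blend,
        "detect_isDeblendedSource": (is_isolated and top_level) or (from_blend and single_peak),
        "detect_isDeblendedModelSource": (not top_level) and single_peak,
    }


flags = {rec["id"]: deblend_flags(rec) for rec in catalog}


def ids_with(flag):
    """Return the ids of all toy records for which the given flag is True."""
    return [source_id for source_id, values in flags.items() if values[flag]]


for name in ("detect_isIsolated", "detect_fromBlend",
             "detect_isDeblendedSource", "detect_isDeblendedModelSource"):
    print(name, ids_with(name))
```

In this toy catalog the blended parent (id 2) carries none of the four flags, which is why ``detect_fromBlend`` is not simply the negation of ``detect_isIsolated``. The isolated source appears twice under ``detect_isIsolated`` (ids 1 and 3) but only once under each of the two "deblended source" flags.
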
If your preference is to always use the scarlet model to ensure that the isolated and deblended sources have the same underlying models, then selecting on ``detect_isDeblendedModelSource & detect_isPatchInner & detect_isTractInner & ~merge_peak_sky`` will give a unique set of sources that is the equivalent of ``detect_isPrimary``, only using the scarlet isolated models as opposed to the un-modeled isolated source records.

detect\_isPrimary
-----------------

``True`` when:

- A source is located on the interior of a patch and tract (``detect_isPatchInner & detect_isTractInner``)
- A source is *not* a sky object (``~merge_peak_sky`` for coadds or ``~sky_source`` for single visits)
- A source is either an isolated parent that is un-modeled or deblended from a parent with multiple children (``detect_isDeblendedSource``)

The ``detect_isPrimary`` flag can be thought of as a flag that selects the most common catalog of unique sources that users will want to make measurements on.
However, users are advised to understand the assumptions made in using sources marked with this flag and to check whether it suits their needs.
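
To make the composition concrete, the sketch below evaluates the three ``detect_isPrimary`` conditions, and the scarlet-model alternative described earlier, on a handful of invented coadd records. It assumes the component flags have already been set and are available as plain booleans; the ``is_primary`` and ``is_model_primary`` helpers are hypothetical names for this example, not part of the pipeline.

```python
# Flag names follow the text above; the rows are invented for illustration.
COLUMNS = ("id", "detect_isPatchInner", "detect_isTractInner", "merge_peak_sky",
           "detect_isDeblendedSource", "detect_isDeblendedModelSource")
ROWS = [
    (1, True, True, False, True, False),   # un-modeled isolated parent
    (2, True, True, False, False, True),   # scarlet model of the same isolated source
    (3, True, True, False, True, True),    # child deblended from a multi-peak parent
    (4, False, True, False, True, True),   # deblended child in a patch overlap region
    (5, True, True, True, True, False),    # sky object, un-modeled record
    (6, True, True, True, False, True),    # sky object, model record
]
records = [dict(zip(COLUMNS, row)) for row in ROWS]


def in_inner_region(rec):
    """True when the record lies in the inner region of both its patch and tract."""
    return rec["detect_isPatchInner"] and rec["detect_isTractInner"]


def is_primary(rec):
    """Re-evaluate the detect_isPrimary conditions listed in the text (coadd case)."""
    return (in_inner_region(rec)
            and not rec["merge_peak_sky"]
            and rec["detect_isDeblendedSource"])


def is_model_primary(rec):
    """Same selection, but using the scarlet models for isolated sources."""
    return (in_inner_region(rec)
            and not rec["merge_peak_sky"]
            and rec["detect_isDeblendedModelSource"])


primary_ids = [rec["id"] for rec in records if is_primary(rec)]
model_ids = [rec["id"] for rec in records if is_model_primary(rec)]
print(primary_ids, model_ids)
```

Both selections keep the deblended child (id 3) and drop the overlap-region and sky records. They differ only in which record represents the isolated source: the un-modeled parent (id 1) or its scarlet model (id 2).
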