lsst-pipe-base 27.0.0 (2024-05-29)¶
New Features¶
- Added a manifest checker which walks an executed quantum graph to generate a summary report containing information about produced dataset types, missing data, and failures. (DM-37163) 
- Updated the open-source license to allow for the code to be distributed with either GPLv3 or BSD 3-clause license. (DM-37231) 
- Rewrote quantum graph generation. The new algorithm is much faster, more extensible, and easier to maintain (especially when storage-class conversions are present in a pipeline). It also allows PipelineTasks to raise NoWorkFound or otherwise restrict their outputs during quantum-graph generation and immediately affect the downstream graph. (DM-38498)
- Added a new subpackage, lsst.pipe.base.pipeline_graph, for text-art visualization of pipeline graphs. (DM-39779)
- Added an option to the interface for creating subsets of whole pipelines that controls how named subsets within the pipeline are modified when labels are missing from the new subsetted pipeline. The previous behavior remains the default: drop any named subset that contains a task label for which no task with that label is defined. The new option edits each named subset to remove the missing label, but otherwise leaves the subset in the new subsetted pipeline. The interface has been modified in Pipeline and also in the lower-level PipelineIR, though the latter should rarely be used directly. The new argument is implemented as an enum option, and is most easily accessed from the Pipeline class as Pipeline.PipelineSubsetCtrl.(DROP/EDIT). This interface is available through YAML pipeline specification by specifying the labeledSubsetModifyMode key when writing YAML import directives. New Python interfaces were also added for manipulating labeled subsets in a pipeline. These include Pipeline.subsets, a property returning a dict of subset labels to sets of task labels; Pipeline.addLabeledSubset, to add a new labeled subset to a Pipeline; and Pipeline.removeLabeledSubset, to remove a labeled subset from a pipeline. (DM-41203)
- Added QuantumGraph summary. (DM-41542)
- Added human-readable option to report summary dictionaries. (DM-41606) 
- Added a section to pipelines which allows the explicit declaration of which subsets correspond to steps and the dimensions the step’s quanta can be sharded with. (DM-41650)
- The butler transfer-from-graph command now supports a --dry-run option to allow the transfer to run without updating the target butler. (DM-42306)
- Added TaskMetadata.get_dict and set_dict methods. These provide a consistent way to assign and extract nested dictionaries from TaskMetadata, lsst.daf.base.PropertySet, and lsst.daf.base.PropertyList. (DM-42928)
- Added CachingLimitedButler as a new type of LimitedButler. A CachingLimitedButler caches on both .put() and .get(), and holds a single instance of the most recently used dataset type for that put/get. The dataset types which will be cached on put/get are controlled via the cache_on_put and cache_on_get attributes, respectively. By default, copies of the cached items are returned on get, so that code is free to operate on data in-place. A no_copy_on_cache attribute also exists to tell the CachingLimitedButler not to return copies when it is known that the calling code can be trusted not to change values, e.g., when passing calibs to isrTask. (DM-43060)
- QuantumGraph generation now saves software stack versions in the graph’s metadata. (DM-43225)
- Added support for testing transient error recovery logic to the PipelineTask mock system. (DM-43484)
- Added deferBinding attribute to Input connection, which allows us to have an input connection with the same dataset type as an output. (DM-43572)
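The difference between the DROP and EDIT modes described for DM-41203 can be illustrated with plain dictionaries. This is only a sketch of the described semantics, not the library's implementation: SubsetCtrl and subset_labeled_subsets are hypothetical stand-ins, and keeping a subset that becomes empty under EDIT is an assumption.

```python
from enum import Enum


class SubsetCtrl(Enum):
    """Hypothetical stand-in for Pipeline.PipelineSubsetCtrl."""

    DROP = "drop"
    EDIT = "edit"


def subset_labeled_subsets(labeled_subsets, kept_tasks, ctrl=SubsetCtrl.DROP):
    """Return the labeled subsets left after restricting to kept_tasks."""
    kept = set(kept_tasks)
    result = {}
    for name, labels in labeled_subsets.items():
        missing = set(labels) - kept
        if not missing:
            # Every label still has a task: keep the subset unchanged.
            result[name] = set(labels)
        elif ctrl is SubsetCtrl.EDIT:
            # EDIT: remove only the labels that no longer have a task.
            result[name] = set(labels) & kept
        # DROP: a subset with any missing label is omitted entirely.
    return result


subsets = {"step1": {"isr", "calibrate"}, "step2": {"calibrate", "coadd"}}
dropped = subset_labeled_subsets(subsets, ["isr", "calibrate"])
edited = subset_labeled_subsets(subsets, ["isr", "calibrate"], SubsetCtrl.EDIT)
print(sorted(dropped))
print(sorted(edited))
```

Here "step2" refers to a task ("coadd") that is not in the new pipeline, so it disappears under DROP and shrinks to {"calibrate"} under EDIT.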
API Changes¶
- Deprecated various interfaces that have been obsoleted by PipelineGraph. The most prominent deprecations are:
  - the Pipeline.toExpandedPipeline method, as well as iteration and task-label indexing for Pipeline;
  - the PipelineDatasetTypes and TaskDatasetTypes classes;
  - the old GraphBuilder interface for building QuantumGraph objects. (DM-40441)
 
- Modified the Instrument constructors to be class methods rather than static methods. This means that when you call Subclass.from_string() the returned instrument class is checked to make sure it is a subclass of Subclass and not just a subclass of Instrument. (DM-42636)
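The practical effect of the class-method change in DM-42636 can be sketched with a toy hierarchy. The classes below are illustrative only: the name registry, the Camera and Spectrograph subclasses, and the TypeError are assumptions, and this toy from_string returns a class looked up by name rather than reproducing the real method's signature.

```python
class Instrument:
    """Toy base class; not the real lsst.pipe.base.Instrument."""

    _registry = {}  # hypothetical name -> class lookup table

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Instrument._registry[cls.__name__] = cls

    @classmethod
    def from_string(cls, name):
        found = Instrument._registry[name]
        # Because this is a class method, cls is the class the caller used,
        # so the result can be checked against that specific subclass.
        if not issubclass(found, cls):
            raise TypeError(f"{name} is not a subclass of {cls.__name__}")
        return found


class Camera(Instrument):
    pass


class Spectrograph(Instrument):
    pass


print(Instrument.from_string("Spectrograph").__name__)
try:
    Camera.from_string("Spectrograph")
except TypeError as err:
    print(err)
```

With a static method there is no cls to compare against, so the second call could only have been checked against the base class.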
Bug Fixes¶
- Fixed bug in pipeline mocking triggered by declaring a config as an input connection. (DM-41191) 
- Fixed bug in QuantumGraph generation triggered by an adjustQuantum that modifies input edges when prerequisite input edges are present on that quantum. (DM-41486)
- Fixed bug in metaclass compatibility between Python versions for DatasetQueryConstraints. (DM-41853)
- Fixed bug in DatasetTypeExecutionReport in which extra steps led to miscategorization. The “outputs” section of pipetask report should be correct now. (DM-41898)
- Fixed a QG generation bug involving unusual combinations of dimensions and calibration datasets. (DM-42301) 
- Fixed an incorrect count of previously-successful quanta in QuantumGraphBuilder logging. (DM-42737)
- Fixed component-dataset query bug in execution reports. (DM-42954) 
- Replaced failing QuantumGraph packages equality check with a weaker test. (DM-43538)
- Propagated subsetCtrl into subset_from_labels within the subsetFromLabels pipeline method. (DM-44341)
Other Changes and Additions¶
An API Removal or Deprecation¶
- Removed topLevelOnly parameter from TaskMetadata.names().
- Removed the saveMetadata configuration from PipelineTask.
- Removed lsst.pipe.base.cmdLineTask.profile (use lsst.utils.timer.profile instead).
- Removed ButlerQuantumContext class. Use QuantumContext instead.
- Removed recontitutedDimensions parameter from QuantumNode.from_simple(). (DM-40150)
lsst-pipe-base v26.0.0 (2023-09-22)¶
New Features¶
- Added system for obtaining data ID packer objects from the combination of an Instrument class and configuration. (DM-31924)
- Added a PipelineGraph class that represents a Pipeline with all configuration overrides applied as a graph. (DM-33027)
- Added new command butler transfer-from-graph to transfer results of execution with Quantum-backed butler. (DM-33497)
- buildExecutionButler method now supports input graph with all dataset references resolved. (DM-37582)
- Added convenience methods to the Python API for Pipelines. These methods allow merging pipelines, adding labels to / removing labels from subsets, and finding subsets containing a specified label. (DM-37655)
- An Instrument can now specify the dataset type definition that it would like to use for raw data. This can be done by setting the raw_definition class property to a tuple of the dataset type name, the dimensions to use for this dataset type, and the storage class name. (DM-37950)
- Modified InMemoryDatasetHandle to allow it to be constructed with keyword arguments that will be converted to the relevant DataId. (DM-38091)
- Modified InMemoryDatasetHandle to allow it to be configured to always deep copy the Python object on get(). (DM-38694)
- Revived bit-rotted support for “mocked” PipelineTask execution and moved it here (from ctrl_mpexec). (DM-38952)
- Formalized support for modifying connections in PipelineTaskConnections.__init__ implementations. Connections can now be added, removed, or replaced with normal attribute syntax. Removing entries from e.g. self.inputs in __init__ still works for backwards compatibility, but deleting attributes is generally preferred. The task dimensions can also be replaced or modified in place in __init__. (DM-38953)
- Added a method on PipelineTaskConfig objects named applyConfigOverrides. This method is called by the system executing PipelineTasks within a pipeline, and is passed the instrument and config overrides defined within the pipeline for that task. (DM-39100)
- Added Instrument.make_default_dimension_packer to restore simple access to the default data ID packer for an instrument. (DM-39453)
- The back-end to quantum graph loading has been optimized so that duplicate objects are no longer created in memory; shared references are used instead. This results in a large decrease in memory usage and a decrease in load times. (DM-39582)
- A new class ExecutionResources has been created to record the number of cores and memory that has been allocated for the execution of a quantum.
- QuantumContext (newly renamed from ButlerQuantumContext) now has a resources property that can be queried by a task in runQuantum. This can be used to tell the task that it can use multiple cores or possibly should make a more efficient use of the available memory resources. (DM-39661)
 
- Made it possible to deprecate PipelineTask connections. (DM-39902)
- Parameters defined in a Pipeline can now be used within a config Python block as well as within config files loaded by a Pipeline. (DM-40198) 
- When looking up prerequisite inputs with skypix data IDs (e.g., reference catalogs) for a quantum whose data ID is not spatial, use the union of the spatial regions of the input and output datasets as a constraint. This keeps global sequence-point tasks from being given all such datasets in the input collections. (DM-40243)
- Added support for init-input/output datasets in PipelineTask mocking. (DM-40381) 
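The two InMemoryDatasetHandle changes in the list above (data ID fields passed as keyword arguments, and an optional deep copy on get()) can be sketched with a small stand-in class. ToyDatasetHandle and its copy_on_get option are hypothetical names chosen for illustration; the real class has its own signature, which should be taken from its documentation.

```python
import copy


class ToyDatasetHandle:
    """Stand-in for an in-memory dataset handle; not the real class."""

    def __init__(self, payload, *, copy_on_get=False, **data_id):
        self._payload = payload
        self._copy_on_get = copy_on_get  # hypothetical option name
        # Extra keyword arguments are gathered into a data-ID-like mapping.
        self.data_id = dict(data_id)

    def get(self):
        # Return an independent deep copy when requested, otherwise the
        # stored object itself.
        if self._copy_on_get:
            return copy.deepcopy(self._payload)
        return self._payload


pixels = [[1, 2], [3, 4]]
shared = ToyDatasetHandle(pixels, visit=42, detector=3)
isolated = ToyDatasetHandle(pixels, copy_on_get=True, visit=42, detector=3)

isolated.get()[0][0] = 99  # modifies only the copy
print(pixels[0][0])
shared.get()[0][0] = 99  # modifies the stored object
print(pixels[0][0])
```

Copying on every get() costs time and memory, but it lets a task modify what it receives without affecting other holders of the same handle.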
API Changes¶
- Several changes to API to add support for QuantumBackedButler:
  - Added a globalInitOutputRefs method to the QuantumGraph class which returns global per-graph output dataset references (e.g. for “packages” dataset type).
  - ButlerQuantumContext can work with either Butler or LimitedButler. Its __init__ method should not be used directly; instead, one of the two new class methods should be used: from_full or from_limited.
  - The ButlerQuantumContext.registry attribute was removed, and ButlerQuantumContext.dimensions has been added to hold the DimensionUniverse.
  - The abstract method TaskFactory.makeTask was updated and simplified to accept TaskDef and LimitedButler. (DM-33497)
 
- ButlerQuantumContext was updated to only need a LimitedButler.
- Factory methods from_full and from_limited were dropped; a constructor accepting a LimitedButler instance is now used to make instances. (DM-37704)
 
- Added method QuantumGraph.updateRun. This new method updates run collection name and dataset IDs for all output and intermediate datasets in a graph, allowing the graph to be reused.
- GraphBuilder.makeGraph method dropped the resolveRefs argument; the builder now always makes resolved references. The run argument is now required to be a non-empty string. (DM-38780)
 
Bug Fixes¶
- Fixed a bug that led to valid storage class conversions being rejected when using execution butler. (DM-38614) 
- Fixed a bug related to checking component datasets in execution butler creation, introduced in DM-38614. (DM-38888) 
- Fixed handling of storage classes in QuantumGraph generation. This could lead to a failure downstream in execution butler creation, and would likely have led to problems with Quantum-Backed Butler usage as well. (DM-39198)
- Fixed a bug in QuantumGraph generation that could result in datasets from skip_existing_in collections being used as outputs, and another that prevented QuantumGraph generation when a skip_existing_in collection has some outputs from a failed quantum. (DM-39672)
- Fixed a bug in quantum graph builder which resulted in missing datastore records for calibration datasets. This bug was causing failures for pipetask execution with quantum-backed butler. (DM-40254)
- Ensured QuantumGraphs are built with datastore records for init-input datasets that might have been produced by another task in the pipeline, but will not be because all quanta for that task were skipped due to existing outputs. (DM-40381) 
- QuantumGraph.updateRun() method was fixed to update dataset ID in references which have their run collection changed. (DM-40392)
Other Changes and Additions¶
- Modified the calling signature for the Task constructor such that only the config parameter can be positional. All other parameters must now be keyword parameters. (DM-15325)
- The Struct class is now a subclass of SimpleNamespace. (DM-36649)
- The DuplicateOutputError logger now produces a more helpful error message. (DM-38234)
- Execution butler creation has been changed to use the DatasetRefs from the graph rather than creating new registry entries from the dataIDs. This is possible now that the graph is always created with resolved refs and ensures that provenance is consistent between the graph and the outputs.
- This change to execution butler required that ButlerQuantumContext.put() no longer unresolves the graph DatasetRef (otherwise there would be a dataset ID mismatch). This results in the dataset always using the output run defined in the graph even if the Butler was created with a different default run. (DM-38779)
 
- Stopped sorting Pipeline elements on read. Ordering specified in pipeline files is now preserved instead. (DM-38953)
- Loosened documentation of QuantumGraph.inputQuanta and outputQuanta. They are not guaranteed to be (and currently are not) lists, so the new documentation describes them as iterables. Documented universe constructor parameter to QuantumGraph. Brought QuantumGraph property docs in line with DM standards.
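Two of the changes above (DM-15325 and DM-36649) are easy to picture with standard-library Python. The sketch below uses stand-in classes, ToyTask and ToyStruct, rather than the real Task and Struct, and the parameter names other than config are assumptions made for the example.

```python
from types import SimpleNamespace


class ToyStruct(SimpleNamespace):
    """Stand-in for a result container built on SimpleNamespace."""


class ToyTask:
    """Stand-in task whose constructor takes only config positionally."""

    def __init__(self, config=None, *, name=None, log=None):
        # Everything after the bare * must be passed by keyword.
        self.config = config
        self.name = name
        self.log = log

    def run(self, values):
        return ToyStruct(total=sum(values), count=len(values))


task = ToyTask({"threshold": 5}, name="demo")
result = task.run([1, 2, 3])
print(result.total, vars(result))

try:
    ToyTask({"threshold": 5}, "demo")  # second positional argument
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False
print(positional_rejected)
```

Subclassing SimpleNamespace gives attribute access, a readable repr, and equality comparison without extra code in the container itself.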
An API Removal or Deprecation¶
- Removed deprecated kwargs parameter from in-memory equivalent dataset handle. 
- Removed deprecated pipe_base timer module (it was moved to utils).
- Removed the warning from deprecated PipelineIR._read_imports and replaced with a raise.
- Removed the warning from deprecated Pipeline._parse_file_specifier and replaced with a raise.
- Removed deprecated methods from TaskMetadata. (DM-37534)
 
- The PipelineTaskConfig.saveMetadata field is now deprecated and will be removed after v26. Its value is ignored and task metadata is always saved.
- The ResourceConfig class has been removed; it was never used. (DM-39377)
 
- Deprecated the reconstituteDimensions argument from QuantumNode.from_simple. (DM-39582)
- ButlerQuantumContext has been renamed to QuantumContext. This reflects the additional functionality it now has. (DM-39661)
- Removed support for reading quantum graphs in pickle format. (DM-40032) 
lsst-pipe-base v25.0.0 (2023-02-28)¶
This is the first release without any support for the Generation 2 middleware.
New Features¶
- Added PipelineStepTester class, to enable testing that multi-step pipelines are able to run without error. (DM-33779)
- QuantumGraph now saves the DimensionUniverse it was created with when it is persisted. This removes the need to explicitly pass the DimensionUniverse when loading a saved graph. (DM-35082)
- Added support for transferring files into execution butler. (DM-35494) 
 
- A new class InMemoryDatasetHandle is now available. This class provides a variant of lsst.daf.butler.DeferredDatasetHandle that does not require a butler. It lets you store your in-memory objects in something that looks like a deferred handle, so they can be passed to Task.run() methods that expect to be able to do deferred loading. (DM-35741)
- Add unit test to cover the new getNumberOfQuantaForTask method.
- Add graph interface, getNumberOfQuantaForTask, to determine number of quanta associated with a given taskDef.
- Modifications to getQuantaForTask to support showing additional quanta information in the logger. (DM-36145)
 
- Allow PipelineTasks to provide defaults for the --dataset-query-constraints option for the pipetask tool. (DM-37786)
API Changes¶
- ButlerQuantumContext.get method can accept None as a reference and returns None as a result object. (DM-35752)
- GraphBuilder.makeGraph method adds bind parameter for bind values to use with the user expression. (DM-36487)
- InMemoryDatasetHandle now supports storage class conversion on get(). (DM-4551)
Bug Fixes¶
- lsst.pipe.base.testUtils.makeQuantum no longer crashes if given a connection that is set to a dataset component. (DM-35721)
- Ensure QuantumGraphs are given a DimensionUniverse at construction. This fixes a mostly-spurious dimension universe inconsistency warning when reading QuantumGraphs, introduced on DM-35082. (DM-35681)
- Fixed an error message that says that repository state has changed during QuantumGraph generation when init input datasets are just missing. (DM-37786)
Other Changes and Additions¶
- Make diagnostic logging for empty QuantumGraphs harder to ignore. Log messages have been upgraded from WARNING to FATAL, and an exception traceback that tends to hide them has been removed. (DM-36360)
An API Removal or Deprecation¶
- Removed the Task.getSchemaCatalogs and Task.getAllSchemaCatalogs APIs. These were used by CmdLineTask but are no longer used in the current middleware. (DM-2850)
- Relocated lsst.pipe.base.cmdLineTask.profile to lsst.utils.timer.profile. This was relocated as part of the Gen2 removal that includes the removal of CmdLineTask. (DM-35697)
- ArgumentParser, CmdLineTask, and TaskRunner classes have been removed, along with the associated gen2 documentation.
- The PipelineIR.from_file() method has been removed.
- The getTaskLogger function has been removed. (DM-35917)
 
- Replaced CmdLineTask and ArgumentParser with non-functioning stubs, disabling all Gen2 functionality. A deprecation message is now issued but the classes do nothing. (DM-35675)
lsst-pipe-base v24.0.0 (2022-08-26)¶
New Features¶
- Add the ability for user control over dataset constraints in QuantumGraph creation. (DM-31769)
- Builds using setuptools now calculate versions from the Git repository, including the use of alpha releases for those associated with weekly tags. (DM-32408)
- Improve diagnostics for empty QuantumGraph. (DM-32459)
- A new class has been written for handling Task metadata. lsst.pipe.base.TaskMetadata will in future become the default metadata class for Task, replacing lsst.daf.base.PropertySet. The new metadata class is not yet enabled by default. (DM-32682)
- Add TaskMetadata.to_dict() method (this is now used by the lsst.daf.base.PropertySet.from_mapping() method and triggered by the Butler if type conversion is needed).
- Use the existing metadata storage class definition if one already exists in a repository. 
- Switch Task to use TaskMetadata for storing task metadata, rather than lsst.daf.base.PropertySet. This removes a C++ dependency from the middleware. (DM-33155)
 
- Added lsst.pipe.base.Instrument to represent an instrument in Butler registry.
- Added butler register-instrument command (relocated from obs_base).
 
Bug Fixes¶
- Fixed a bug where imported pipeline parameters were taking precedence over “top-level” preferences. (DM-32080)
Other Changes and Additions¶
- If a PipelineTask has connections that have a different storage class for a dataset type than the one defined in registry, this will now be allowed if the storage classes are compatible. The Task run() method will be given the Python type it expects and can return the Python type it has declared it returns. The Butler will do the type conversion automatically. (DM-33303)
- Topological sorting of pipelines on write has been disabled; the order in which the pipeline tasks were read/added is preserved instead. This makes it unnecessary to import all tasks referenced by the pipeline in order to write it. (DM-34155) 
lsst-pipe-base v23.0.1 (2022-02-02)¶
Miscellaneous Changes of Minor Interest¶
- Execution butler creation time has been reduced significantly by avoiding unnecessary checks for existence of files in the datastore. (DM-33345) 
lsst-pipe-base v23.0.0 (2021-12-10)¶
New Features¶
- Added a new facility for creating “lightweight” (execution) butlers that pre-fills a local SQLite registry. This can allow a pipeline to be executed without talking to the main registry. (DM-28646) 
- Allow PipelineTask inputs and outputs to be optional under certain conditions, so tasks with no work to do can be skipped without blocking downstream tasks from running. (DM-30649)
- Log diagnostic information when QuantumGraphs are empty because the initial query yielded no results. At present, these diagnostics only cover missing input datasets, which is a common way to get an empty QuantumGraph, but not the only way. (DM-31583)
API Changes¶
- GraphBuilder constructor boolean argument skipExisting is replaced with skipExistingIn which accepts collections to check for existing quantum outputs. (DM-27492)
Other Changes and Additions¶
- The logger associated with Task is now derived from a Python logging.Logger and not lsst.log.Log. This logger includes a new verbose() log method as an intermediate between INFO and DEBUG. (DM-30301)
- Added metadata to QuantumGraphs. This changed the on disk save format, but is backwards compatible with graphs saved with previous versions of the QuantumGraph code. (DM-30702) 
- All Doxygen documentation has been removed and replaced by Sphinx. (DM-23330) 
- New documentation on writing pipelines has been added. (DM-27416) 
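A logger with an intermediate verbose() level, as described for DM-30301, can be approximated with the standard logging module alone. The sketch below is not the LSST logger: the VerboseLogger and ListHandler classes are hypothetical, and placing VERBOSE halfway between DEBUG and INFO (numeric value 15) is an assumption made for the example.

```python
import logging

# Assumed numeric value: halfway between DEBUG (10) and INFO (20).
VERBOSE = (logging.INFO + logging.DEBUG) // 2
logging.addLevelName(VERBOSE, "VERBOSE")


class VerboseLogger(logging.Logger):
    """Logger subclass with an extra verbose() convenience method."""

    def verbose(self, msg, *args, **kwargs):
        self.log(VERBOSE, msg, *args, **kwargs)


class ListHandler(logging.Handler):
    """Collects formatted messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


log = VerboseLogger("demo")
handler = ListHandler()
log.addHandler(handler)
log.setLevel(VERBOSE)

log.debug("hidden at this threshold")
log.verbose("shown %d", 1)
log.info("also shown")
print(handler.messages)
```

Setting the threshold to VERBOSE lets through more detail than INFO while still suppressing DEBUG output.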
lsst-pipe-base v22.0 (2021-04-01)¶
New Features¶
- Add ways to test a PipelineTask’s init inputs/outputs [DM-23156] 
- Pipelines can now support URIs [DM-28036] 
- Graph files can now be loaded and saved via URIs [DM-27682] 
- A new format for saving graphs has been developed (with a .qgraph extension). This format supports the ability to read a subset of a graph from an object store. [DM-27784]
- Graph building with a pipeline that specifies an instrument no longer needs an explicit instrument to be given. [DM-27985] 
- A parameters section has been added to pipeline definitions. [DM-27633]