Slurm
- class lsst.ctrl.bps.parsl.sites.Slurm(config: BpsConfig, add_resources: bool = False)
Bases: SiteConfig
Configuration for generic Slurm cluster.
This can be used directly as the site configuration for a Slurm cluster by setting the BPS config, e.g.:
    computeSite: slurm
    site:
      slurm:
        class: lsst.ctrl.bps.parsl.sites.Slurm
        nodes: 3
        cores_per_node: 20
        walltime: "00:59:00"  # Note: always quote walltime in YAML
Alternatively, it can be used as a base class for Slurm cluster configurations.
The following BPS configuration parameters are recognised (and required unless there is a default mentioned here, or provided by a subclass):
- nodes (int): number of nodes for each Slurm job.
- cores_per_node (int): number of cores per node for each Slurm job; by default we use all cores on the node.
- walltime (str): time limit for each Slurm job.
- mem_per_node (int): memory per node (GB) for each Slurm job; by default we use whatever Slurm gives us.
- qos (str): quality of service to request for each Slurm job; by default we use whatever Slurm gives us.
- singleton (bool): allow only one job to run at a time; by default False.
- account (str): account to use for Slurm jobs.
- scheduler_options (str): text to prepend to the Slurm submission script (each line usually starting with #SBATCH).
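To illustrate the kind of text scheduler_options expects, here is a minimal sketch that assembles #SBATCH lines into one string. The helper name build_scheduler_options and the example values ("myaccount", "haswell") are assumptions made for illustration and are not part of lsst.ctrl.bps.parsl; --account, --qos and --constraint are standard sbatch options.

```python
from typing import Optional


def build_scheduler_options(directives: dict[str, Optional[str]]) -> str:
    """Join sbatch options into '#SBATCH' lines, skipping unset values.

    Hypothetical helper for illustration; not part of lsst.ctrl.bps.parsl.
    """
    lines = [
        f"#SBATCH --{name}={value}"
        for name, value in directives.items()
        if value is not None
    ]
    return "\n".join(lines)


# Placeholder values; "qos" is left unset and is therefore omitted.
options = build_scheduler_options(
    {"account": "myaccount", "qos": None, "constraint": "haswell"}
)
print(options)
```

The resulting string could then be supplied as the scheduler_options value in the site configuration.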
Methods Summary
- from_config(config): Get the site configuration nominated in the BPS config.
- get_address(): Return the IP address of the machine hosting the driver/submission.
- get_command_prefix(): Return command(s) to add before each job command.
- get_executors(): Get a list of executors to be used in processing.
- get_monitor(): Get parsl monitor.
- get_parsl_config(): Get Parsl configuration for this site.
- get_site_subconfig(config): Get BPS configuration for the site of interest.
- make_executor(label, *[, nodes, ...]): Return an executor for running on a Slurm cluster.
- select_executor(job): Get the label of the executor to use to execute a job.
Methods Documentation
- classmethod from_config(config: BpsConfig) -> SiteConfig
Get the site configuration nominated in the BPS config.
The computeSite (str) value in the BPS configuration is used to select a site configuration. The site configuration class to use is specified by the BPS configuration as site.<computeSite>.class (str), which should be the fully-qualified name of a Python class that inherits from SiteConfig.
- Parameters:
  - config (BpsConfig): BPS configuration.
- Returns:
  - site_config (subclass of SiteConfig): Site configuration.
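The lookup described for from_config can be sketched as follows. This is not the library's implementation: a plain nested dict stands in for BpsConfig, the function name resolve_site_class is hypothetical, and a standard-library class path is used so the sketch runs without the LSST stack.

```python
import importlib
from collections import OrderedDict


def resolve_site_class(config: dict) -> type:
    """Look up site.<computeSite>.class and import the class it names.

    ``config`` is a plain nested dict standing in for BpsConfig (an
    assumption made so the sketch runs without the LSST stack).
    """
    site_name = config["computeSite"]
    class_path = config["site"][site_name]["class"]
    # Split "package.module.ClassName" into module path and class name.
    module_name, _, class_name = class_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in class path so the example runs anywhere; a real configuration
# would name a SiteConfig subclass such as lsst.ctrl.bps.parsl.sites.Slurm.
example = {
    "computeSite": "demo",
    "site": {"demo": {"class": "collections.OrderedDict"}},
}
assert resolve_site_class(example) is OrderedDict
```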
- get_address() -> str
Return the IP address of the machine hosting the driver/submission.
This address should be accessible from the workers. This should generally be the return value of one of the functions in parsl.addresses.
This is used by the default implementation of get_monitor, but will generally be used by get_executors too.
This default implementation gets the address from the hostname, but that will not work if the workers don’t access the driver/submission node by that address.
- get_command_prefix() -> str
Return command(s) to add before each job command.
These may be used to configure the environment for the job.
This default implementation respects the BPS configuration elements:
- get_executors() -> list[parsl.executors.base.ParslExecutor]
Get a list of executors to be used in processing.
Each executor should have a unique label.
- get_monitor() -> MonitoringHub | None
Get parsl monitor.
The parsl monitor provides a database that tracks the progress of the workflow and the use of resources on the workers.
This implementation respects the BPS configuration elements:
- get_parsl_config() -> Config
Get Parsl configuration for this site.
Subclasses can override this method to build a more specific Parsl configuration, if required.
The retries are set from the site.<computeSite>.retries value found in the BPS configuration file.
- Returns:
  - config (parsl.config.Config): The configuration to be used for Parsl.
- static get_site_subconfig(config: BpsConfig) -> BpsConfig
Get BPS configuration for the site of interest.
We return the BPS sub-configuration for the site indicated by the computeSite value, which is site.<computeSite>.
- Parameters:
  - config (BpsConfig): BPS configuration.
- Returns:
  - site (BpsConfig): Site sub-configuration.
- make_executor(label: str, *, nodes: int | None = None, cores_per_node: int | None = None, walltime: str | None = None, mem_per_node: int | None = None, mem_per_worker: float | None = None, qos: str | None = None, constraint: str | None = None, singleton: bool = False, scheduler_options: str | None = None, provider_options: dict[str, Any] | None = None, executor_options: dict[str, Any] | None = None) -> ParslExecutor
Return an executor for running on a Slurm cluster.
- Parameters:
  - label (str): Label for executor.
  - nodes (int, optional): Default number of nodes for each Slurm job.
  - cores_per_node (int, optional): Default number of cores per node for each Slurm job.
  - walltime (str, optional): Default time limit for each Slurm job.
  - mem_per_node (int, optional): Memory per node (GB) to request for each Slurm job.
  - mem_per_worker (float, optional): Minimum memory per worker (GB), limited by the executor.
  - qos (str, optional): Quality of service for each Slurm job.
  - constraint (str, optional): Node feature(s) to require for each Slurm job.
  - singleton (bool, optional): Whether to allow only a single Slurm job to run at a time.
  - scheduler_options (str, optional): #SBATCH directives to prepend to the Slurm submission script.
  - provider_options (dict, optional): Additional arguments for SlurmProvider constructor.
  - executor_options (dict, optional): Additional arguments for HighThroughputExecutor constructor.
- Returns:
  - executor (HighThroughputExecutor): Executor for Slurm jobs.
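When Slurm is used as a base class, a subclass would typically combine get_executors, make_executor and select_executor. The following is only a sketch under stated assumptions: the class name MyCluster, the label "mycluster" and all resource values are hypothetical, and a small stand-in for Slurm is defined when the LSST stack is not importable so that the snippet can be run anywhere.

```python
try:
    from lsst.ctrl.bps.parsl.sites import Slurm
except ImportError:
    # Stand-in so the sketch can be run without the LSST stack; it only
    # records the arguments that would be passed to make_executor.
    class Slurm:
        def make_executor(self, label, **kwargs):
            return (label, kwargs)


class MyCluster(Slurm):
    """Hypothetical site configuration with a single executor."""

    def get_executors(self):
        # Keyword names follow the make_executor signature documented
        # above; the values are placeholders.
        return [
            self.make_executor(
                "mycluster",
                nodes=2,
                cores_per_node=20,
                walltime="01:00:00",
                mem_per_node=64,
            )
        ]

    def select_executor(self, job):
        # Every job goes to the only executor defined above.
        return "mycluster"
```

Such a class would be selected by giving its fully-qualified name as site.<computeSite>.class in the BPS configuration.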