Slurm
- class lsst.ctrl.bps.parsl.sites.Slurm(config: BpsConfig, add_resources: bool = False)
Bases: SiteConfig
Configuration for a generic Slurm cluster.
This can be used directly as the site configuration for a Slurm cluster by setting the BPS config, e.g.:
    computeSite: slurm
    site:
      slurm:
        class: lsst.ctrl.bps.parsl.sites.Slurm
        nodes: 3
        cores_per_node: 20
        walltime: "00:59:00"  # Note: always quote walltime in YAML
Alternatively, it can be used as a base class for Slurm cluster configurations.
The following BPS configuration parameters are recognised (and required unless there is a default mentioned here, or provided by a subclass):
- nodes (int): number of nodes for each Slurm job.
- cores_per_node (int): number of cores per node for each Slurm job; by default we use all cores on the node.
- walltime (str): time limit for each Slurm job.
- mem_per_node (int): memory per node (GB) for each Slurm job; by default we use whatever Slurm gives us.
- qos (str): quality of service to request for each Slurm job; by default we use whatever Slurm gives us.
- singleton (bool): allow only one job to run at a time; by default False.
- scheduler_options (str): text to prepend to the Slurm submission script (each line usually starting with #SBATCH).
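The required-versus-optional split in the list above can be sketched as a small check over a plain dictionary. This is only an illustration, not the class's own validation logic: `check_slurm_site`, `REQUIRED` and `DEFAULTS` are hypothetical names, and treating `scheduler_options` as optional is an assumption based on the `make_executor` signature rather than on the list itself.

```python
# Hypothetical helper: not part of lsst.ctrl.bps.parsl.
REQUIRED = ("nodes", "walltime")  # no default is mentioned for these
DEFAULTS = {
    "cores_per_node": None,     # None stands for "all cores on the node"
    "mem_per_node": None,       # None stands for "whatever Slurm gives us"
    "qos": None,                # None stands for "whatever Slurm gives us"
    "singleton": False,
    "scheduler_options": None,  # assumed optional, as in make_executor
}


def check_slurm_site(site: dict) -> dict:
    """Return the site settings with defaults filled in.

    Raises KeyError if a parameter without a default is missing.
    """
    missing = [key for key in REQUIRED if key not in site]
    if missing:
        raise KeyError(f"missing required site parameters: {missing}")
    # Values supplied by the user take precedence over the defaults.
    return {**DEFAULTS, **site}


settings = check_slurm_site(
    {"nodes": 3, "cores_per_node": 20, "walltime": "00:59:00"}
)
```

Here `settings` carries the three supplied values plus the defaults for everything else; omitting `walltime` would raise `KeyError`.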
Methods Summary
- from_config(config): Get the site configuration nominated in the BPS config.
- get_address(): Return the IP address of the machine hosting the driver/submission.
- get_command_prefix(): Return command(s) to add before each job command.
- get_executors(): Get a list of executors to be used in processing.
- get_monitor(): Get parsl monitor.
- get_parsl_config(): Get Parsl configuration for this site.
- get_site_subconfig(config): Get BPS configuration for the site of interest.
- make_executor(label, *[, nodes, ...]): Return an executor for running on a Slurm cluster.
- select_executor(job): Get the label of the executor to use to execute a job.
Methods Documentation
- classmethod from_config(config: BpsConfig) -> SiteConfig
Get the site configuration nominated in the BPS config.
The computeSite (str) value in the BPS configuration is used to select a site configuration. The site configuration class to use is specified by the BPS configuration as site.<computeSite>.class (str), which should be the fully-qualified name of a Python class that inherits from SiteConfig.
- Parameters:
  - config (BpsConfig): BPS configuration.
- Returns:
  - site_config (subclass of SiteConfig): Site configuration.
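Resolving a fully-qualified class name, as described for `from_config`, is commonly done with `importlib`. The sketch below shows that general technique only; `load_site_class` is a hypothetical helper, not the method's actual implementation, and `collections.OrderedDict` and `dict` stand in for a `SiteConfig` subclass and `SiteConfig` itself so that the example runs without the LSST stack.

```python
import importlib


def load_site_class(qualified_name: str, base: type) -> type:
    """Import the class named by a dotted path and check its base class."""
    # Split "package.module.ClassName" into module path and class name.
    module_name, _, class_name = qualified_name.rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    if not issubclass(cls, base):
        raise TypeError(f"{cls.__name__} does not inherit from {base.__name__}")
    return cls


# Stand-ins: OrderedDict plays the configured class, dict plays SiteConfig.
selected = load_site_class("collections.OrderedDict", base=dict)
```

The subclass check mirrors the requirement that the configured class inherit from `SiteConfig`; a name that resolves to an unrelated class is rejected with `TypeError`.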
- get_address() -> str
Return the IP address of the machine hosting the driver/submission.
This address should be accessible from the workers. This should generally be the return value of one of the functions in parsl.addresses.
This is used by the default implementation of get_monitor, but will generally be used by get_executors too.
This default implementation gets the address from the hostname, but that will not work if the workers don’t access the driver/submission node by that address.
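A hostname-based default of the kind described for `get_address` can be sketched with the standard library alone. `address_from_hostname` is a hypothetical name used for illustration; the real default may delegate to a function in `parsl.addresses` instead.

```python
import socket


def address_from_hostname() -> str:
    """Return this machine's hostname for workers to connect back to."""
    return socket.gethostname()


address = address_from_hostname()
```

If the workers cannot reach the driver by this name (for example, when compute nodes sit on a separate network), a subclass would need to override `get_address` to return an address the workers can use.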
- get_command_prefix() -> str
Return command(s) to add before each job command.
These may be used to configure the environment for the job.
This default implementation respects the BPS configuration elements:
- get_executors() -> list[parsl.executors.base.ParslExecutor]
Get a list of executors to be used in processing.
Each executor should have a unique label.
- get_monitor() -> MonitoringHub | None
Get parsl monitor.
The parsl monitor provides a database that tracks the progress of the workflow and the use of resources on the workers.
This implementation respects the BPS configuration elements:
- get_parsl_config() -> Config
Get Parsl configuration for this site.
Subclasses can override this method to build a more specific Parsl configuration, if required.
The retries are set from the site.<computeSite>.retries value found in the BPS configuration file.
- Returns:
  - config (parsl.config.Config): The configuration to be used for Parsl.
- static get_site_subconfig(config: BpsConfig) -> BpsConfig
Get BPS configuration for the site of interest.
We return the BPS sub-configuration for the site indicated by the computeSite value, which is site.<computeSite>.
- Parameters:
  - config (BpsConfig): BPS configuration.
- Returns:
  - site (BpsConfig): Site sub-configuration.
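The `site.<computeSite>` lookup can be illustrated with nested dictionaries standing in for `BpsConfig`. `site_subconfig` and the `other` site entry are hypothetical; the real method operates on a `BpsConfig` object, whose access semantics may differ from a plain `dict`.

```python
def site_subconfig(config: dict) -> dict:
    """Return the sub-configuration selected by config["computeSite"]."""
    compute_site = config["computeSite"]
    try:
        return config["site"][compute_site]
    except KeyError:
        raise KeyError(f"no configuration found for site.{compute_site}") from None


# Plain-dict stand-in for a BPS configuration with two sites defined.
config = {
    "computeSite": "slurm",
    "site": {
        "slurm": {"class": "lsst.ctrl.bps.parsl.sites.Slurm", "nodes": 3},
        "other": {"nodes": 1},
    },
}
sub = site_subconfig(config)
```

Only the entry named by `computeSite` is returned, so changing that single value switches the whole site configuration.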
- make_executor(label: str, *, nodes: int | None = None, cores_per_node: int | None = None, walltime: str | None = None, mem_per_node: int | None = None, mem_per_worker: float | None = None, qos: str | None = None, constraint: str | None = None, singleton: bool = False, scheduler_options: str | None = None, provider_options: dict[str, Any] | None = None, executor_options: dict[str, Any] | None = None) -> ParslExecutor
Return an executor for running on a Slurm cluster.
- Parameters:
  - label (str): Label for executor.
  - nodes (int, optional): Default number of nodes for each Slurm job.
  - cores_per_node (int, optional): Default number of cores per node for each Slurm job.
  - walltime (str, optional): Default time limit for each Slurm job.
  - mem_per_node (float, optional): Memory per node (GB) to request for each Slurm job.
  - mem_per_worker (float, optional): Minimum memory per worker (GB), limited by the executor.
  - qos (str, optional): Quality of service for each Slurm job.
  - constraint (str, optional): Node feature(s) to require for each Slurm job.
  - singleton (bool, optional): Whether to allow only a single Slurm job to run at a time.
  - scheduler_options (str, optional): #SBATCH directives to prepend to the Slurm submission script.
  - provider_options (dict, optional): Additional arguments for SlurmProvider constructor.
  - executor_options (dict, optional): Additional arguments for HighThroughputExecutor constructor.
- Returns:
  - executor (HighThroughputExecutor): Executor for Slurm jobs.
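The qos, constraint, singleton and scheduler_options parameters all end up as lines in the Slurm submission script. How `make_executor` combines them is not shown on this page, so the following is only a sketch of one plausible assembly using standard `sbatch` options (`--qos`, `--constraint`, `--job-name`, `--dependency=singleton`). `build_scheduler_options`, the `job_name` default and the account name in the usage line are hypothetical.

```python
from __future__ import annotations


def build_scheduler_options(
    scheduler_options: str | None = None,
    *,
    qos: str | None = None,
    constraint: str | None = None,
    singleton: bool = False,
    job_name: str = "bps_workflow",  # hypothetical default job name
) -> str:
    """Assemble #SBATCH directive lines from the given settings."""
    # Start from any user-supplied directives, one per line.
    lines = scheduler_options.splitlines() if scheduler_options else []
    if qos:
        lines.append(f"#SBATCH --qos={qos}")
    if constraint:
        lines.append(f"#SBATCH --constraint={constraint}")
    if singleton:
        # With --dependency=singleton, Slurm starts a job only after earlier
        # jobs with the same job name and user have finished.
        lines.append(f"#SBATCH --job-name={job_name}")
        lines.append("#SBATCH --dependency=singleton")
    return "\n".join(lines)


text = build_scheduler_options(
    "#SBATCH --account=myproject",  # hypothetical account
    qos="regular",
    constraint="cpu",
    singleton=True,
)
```

With these inputs `text` contains five directive lines, the user-supplied one first; with no arguments the function returns an empty string.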