Parallel tasks started by ParaStation MPI can be suspended by sending the
signal SIGTSTP to the ParaStation MPI Logger process. The signal is
forwarded to all processes of the parallel task and, by default, stops
them.
To continue the task, the signal SIGCONT must be sent to
the ParaStation MPI Logger process. This signal is also
forwarded to all processes of the task.
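The suspend and resume steps can be sketched in C with the standard POSIX kill() call. This is only an illustration: the helper names send_to_logger, suspend_task and resume_task are hypothetical and not part of ParaStation MPI, and the caller is assumed to already know the process ID of the ParaStation MPI Logger process.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical helper: send one signal to the Logger process.
 * Returns 0 on success, or the errno value reported by kill(). */
static int send_to_logger(pid_t logger_pid, int sig)
{
    return kill(logger_pid, sig) == 0 ? 0 : errno;
}

/* Ask the Logger to suspend the task; the Logger is expected to
 * forward SIGTSTP to all processes of the task. */
int suspend_task(pid_t logger_pid)
{
    return send_to_logger(logger_pid, SIGTSTP);
}

/* Ask the Logger to continue a previously suspended task. */
int resume_task(pid_t logger_pid)
{
    return send_to_logger(logger_pid, SIGCONT);
}
```

From a shell, the same effect is obtained with the kill command, using -TSTP and -CONT with the Logger's process ID.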
The application has to be prepared to handle interrupted system calls properly.
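A common way to handle interrupted system calls is to retry a call that fails with EINTR. The following is a minimal sketch, assuming a POSIX system; the wrapper name read_retry is hypothetical, and which calls are actually interrupted when a process is stopped and continued depends on the platform and on the call.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper around read(): if the call is interrupted by a
 * signal before any data was transferred (errno == EINTR), try again.
 * Returns the number of bytes read, 0 at end of file, or -1 on any
 * other error, exactly as read() does. */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);

    return n;
}
```

The same pattern applies to other blocking calls such as write() or accept(). Calls that take a timeout may additionally need the remaining time recomputed before retrying.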
Depending on the transport protocol in use, tasks can be
suspended only for a limited period of time. If TCP is used (HwType
ethernet), connections may time out, and after the SIGCONT signal
is sent, the processes will receive I/O errors for these sockets.
Using the ParaStation MPI protocol p4sock avoids this problem, as this
protocol does not use any timeouts.
Suspending a task using the signal SIGTSTP
will also trigger the ParaStation MPI queuing facility (see the section called “Using the ParaStation MPI queuing facility”). Depending on the global setting
of freeOnSuspend, CPUs will be reused for
newly spawned processes. Refer to parastation.conf(5).