This section gives a brief technical overview of ParaStation MPI and explains the software modules it consists of.
To enable ParaStation MPI on a cluster, the ParaStation daemon psid(8) must be installed on each cluster node. This daemon implements the following functions:
Install and configure local communication devices and protocols, e.g. load the p4sock kernel module and set up proper routing information, if not already done at system startup.
Queue parallel and serial tasks until requested resources are available.
Distribute processes onto the available cluster nodes.
Start up and monitor processes on cluster nodes, and terminate and clean up processes upon request.
Monitor the availability of other cluster nodes and send “I'm alive” messages.
Handle input/output and signal forwarding.
Service management commands from the administration tools.
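The queueing and distribution functions listed above can be illustrated with a small sketch. It is not taken from psid: the class name TaskQueue, its methods, and the strict first-in-first-out policy are assumptions made for illustration only, and the scheduling policy the daemon actually applies is not described here.

```python
from collections import deque


class TaskQueue:
    """Hypothetical FIFO queue that starts a task once enough nodes are free."""

    def __init__(self, nodes):
        self.free = list(nodes)   # nodes currently without a task
        self.pending = deque()    # (task_name, nodes_needed) in arrival order
        self.running = {}         # task_name -> list of nodes assigned to it

    def submit(self, name, nodes_needed):
        """Queue a task and return the names of tasks started as a result."""
        self.pending.append((name, nodes_needed))
        return self._dispatch()

    def finish(self, name):
        """Release the nodes of a finished task and start waiting tasks."""
        self.free.extend(self.running.pop(name))
        return self._dispatch()

    def _dispatch(self):
        started = []
        # Assumed policy: strict FIFO, so stop at the first task that does not fit.
        while self.pending and self.pending[0][1] <= len(self.free):
            name, needed = self.pending.popleft()
            self.running[name] = [self.free.pop(0) for _ in range(needed)]
            started.append(name)
        return started
```

With three nodes, a task requesting two nodes starts immediately, while a second two-node task stays queued until the first one finishes and releases its nodes.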
The daemons periodically send status information, including their application processes and the system load, to all other nodes in the cluster. Each daemon can thus monitor every other node. If alive messages from a node stop arriving, the daemon takes appropriate action, e.g. terminates a parallel task or marks the node as "no longer available". Conversely, when a previously unavailable node responds again, it is marked as "available" and is used for upcoming parallel tasks. No intervention by the system administrator is required.
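The alive-message mechanism can be sketched as a simple timeout tracker. This is a minimal illustration, not the psid implementation: the class NodeMonitor, its method names, and the five-second timeout are all hypothetical, and the real daemon's message format and timing are not described here.

```python
import time


class NodeMonitor:
    """Hypothetical tracker of when each peer node was last heard from."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before a node counts as down (assumed)
        self.last_seen = {}      # node -> time of the most recent alive message
        self.available = {}      # node -> True if currently considered available

    def on_alive(self, node, now=None):
        """Record an alive message; return True if the node was not available before."""
        now = time.monotonic() if now is None else now
        was_available = self.available.get(node, False)
        self.last_seen[node] = now
        self.available[node] = True
        return not was_available

    def check(self, now=None):
        """Mark nodes with overdue alive messages as unavailable and return them."""
        now = time.monotonic() if now is None else now
        lost = []
        for node, seen in self.last_seen.items():
            if self.available[node] and now - seen > self.timeout:
                self.available[node] = False
                lost.append(node)
        return lost

    def usable_nodes(self):
        """Nodes that may be used for upcoming parallel tasks."""
        return sorted(n for n, up in self.available.items() if up)


mon = NodeMonitor(timeout=5.0)
mon.on_alive("node01", now=0.0)
mon.on_alive("node02", now=0.0)
mon.on_alive("node01", now=4.0)
lost = mon.check(now=6.0)          # node02 has been silent for more than 5 seconds
mon.on_alive("node02", now=7.0)    # node02 responds again and is usable without manual action
```

A caller would react to the nodes returned by check, for example by terminating parallel tasks that run on them, which corresponds to the "proper actions" mentioned above.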