Chapter 5. Building native ParaStation MPI applications

Table of Contents

Using the management facility
The root process
The master process
Client processes
Using the PSPort interface

This chapter briefly describes how to write native ParaStation MPI applications, i.e. applications that make use of the ParaStation MPI low-level communication libraries and the ParaStation MPI management facilities without utilizing the standard MPI libraries.

In general it is not necessary to write native ParaStation MPI applications. Since the gain in communication performance from bypassing the high-level MPI library is negligible, using MPI is the recommended approach in almost every scenario. In addition, MPI guarantees the portability of the programs to be developed.

Nevertheless, the source code of a simple native application comes with the standard ParaStation MPI software distribution. The file is named native.c and is located in the /opt/parastation/doc/examples directory. Furthermore, a Makefile can be found in the same directory.

A brief look at native.c shows that the startup of the application and of the necessary communication channels is quite complex. When MPI is used, all of this functionality is hidden within the single MPI_Init(3) call, so the programmer does not have to take care of the details of the startup process. This is a further argument in favor of using MPI.

As already mentioned, the usage of the ParaStation MPI API is only discussed briefly here. A detailed description of the various function calls forming the API may be found in the API reference.

Using the management facility

The ParaStation MPI management system is used by native applications in order to start up the processes of the parallel task and to ensure that the ParaStation MPI daemons clean up properly if one or more processes within the task fail.

In order to contact the local ParaStation MPI daemon and to initialize the parts of the ParaStation MPI library that talk to the management facility, PSE_initialize(3) has to be called. The rank of the current process is then determined using the PSE_getRank(3) function.
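
A minimal sketch of this first step might look as follows. The header name pse.h and the exact function signatures are assumptions here; the authoritative declarations can be found in the distribution's header files and in native.c.

    /* Minimal initialization sketch; the header name and the exact
     * signatures are assumptions, see native.c for the real ones. */
    #include <stdio.h>
    #include <pse.h>                 /* assumed ParaStation PSE header */

    int main(int argc, char *argv[])
    {
        int rank;

        PSE_initialize();            /* contact the local ParaStation MPI daemon */
        rank = PSE_getRank();        /* rank of the current process */

        fprintf(stderr, "process started with rank %d\n", rank);

        /* ... decide on the role of this process based on the rank ... */
        return 0;
    }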

Based on this rank, the role taken on by the current process is decided. Three cases have to be distinguished:

rank = -1

The process is the very first process started within the parallel task. Thus it is the root of all further processes. When the startup phase is finished, this process becomes a ParaStation MPI logger process and handles the standard I/O of the parallel task.

rank = 0

This is the so-called master process of the parallel task. It is spawned by the root process. Its function is first to spawn all further processes with rank > 0 and then to act as a normal compute process within the parallel task.

rank > 0

These are normal compute processes started by the master process.

These three kinds of processes will be discussed in the next sections; a minimal sketch of the resulting dispatch is shown below.
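
Putting the three cases together, the main routine of a native application typically dispatches on the rank as sketched here; the helpers do_root(), do_master() and do_client() are hypothetical placeholders for the code discussed in the following sections.

    /* Rank-based dispatch; do_root(), do_master() and do_client() are
     * hypothetical helpers, the header name is assumed. */
    #include <pse.h>

    void do_root(int argc, char *argv[]);    /* rank == -1: becomes the logger  */
    void do_master(int argc, char *argv[]);  /* rank ==  0: spawns the clients  */
    void do_client(void);                    /* rank  >  0: normal compute work */

    int main(int argc, char *argv[])
    {
        PSE_initialize();
        int rank = PSE_getRank();

        if (rank == -1) {
            do_root(argc, argv);
        } else if (rank == 0) {
            do_master(argc, argv);
        } else {
            do_client();
        }
        return 0;
    }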

The root process

Both tasks this process has to fulfill are handled almost completely within a single function call. Thus nearly the entire functionality is hidden from the user's point of view.

First, however, PSE_setHWType(3) has to be called in order to specify the type of communication hardware the parallel task should utilize. This is a kind of pre-configuration for the subsequent function call that spawns the further processes.

Everything else this process has to do is hidden within the PSE_spawnMaster(3) function: the master process is spawned and the calling process is converted into a ParaStation MPI logger process. This function usually never returns.
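
A minimal sketch of the root process, under the same assumptions as above, might look like this; the hardware-type value passed to PSE_setHWType(3) is merely a placeholder.

    /* Root process sketch; the header name, the signatures and the
     * hardware-type value are assumptions. */
    #include <pse.h>

    void do_root(int argc, char *argv[])
    {
        /* Pre-configure the communication hardware the task should use;
         * 0 is a placeholder, valid values are listed in the API reference. */
        PSE_setHWType(0);

        /* Spawn the master process (rank 0) and turn this process into a
         * ParaStation MPI logger; this call usually never returns. */
        PSE_spawnMaster(argc, argv);
    }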

The master process

The first action the master process, like any other compute process, has to perform is to register with its parent process via PSE_registerToParent(3). This is done in order to be notified if the parent process exits unexpectedly. The usual behavior is to receive a SIGTERM signal in that case.

After it has registered with its parent process, further processes, in this context called client processes, are spawned. The goal is to obtain the requested number of processes within the parallel task. This is done by calling the PSE_spawnTasks(3) function. It should be noted that usually information has to be passed from the master process to its clients in order to enable them to connect back to the master and establish the connections used for communication. Therefore the node and port number of the master process' communication interface have to be passed to the PSE_spawnTasks(3) function as well.
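
A sketch of the master's startup, under the same assumptions as before; how the node and port number of the master's communication endpoint are obtained is covered in the next section, so they appear here as placeholders, and the requested task size is, purely for illustration, taken from the command line.

    /* Master process sketch; the header name, the signatures and the way
     * the task size is obtained are assumptions. */
    #include <stdlib.h>
    #include <pse.h>

    void do_master(int argc, char *argv[])
    {
        /* Receive a SIGTERM if the parent (the logger) exits unexpectedly. */
        PSE_registerToParent();

        /* Hypothetical: the requested task size is passed as argv[1]. */
        int nClients = (argc > 1) ? atoi(argv[1]) - 1 : 0;

        /* Placeholders: node and port of the master's communication
         * endpoint, obtained e.g. via the PSPort interface (next section). */
        int node = 0, port = 0;

        /* Spawn the client processes and pass them the master's node and
         * port number. */
        PSE_spawnTasks(nClients, node, port, argc, argv);

        /* ... act as a normal compute process from here on ... */
    }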

How to get an interface to the low-level communication protocols provided by ParaStation MPI and how to fetch information about this interface will be discussed in the next section.

Client processes

The client processes spawned by the master process have to register with their parent process, too. Thus they call PSE_registerToParent(3) as one of the first actions undertaken after the rank is determined.

For their further operation the client processes need the information passed to them by the master process. It can be obtained using PSE_getMasterNode(3) and PSE_getMasterPort(3), respectively.
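
A corresponding sketch of the client side, under the same assumptions; turning the returned node and port number into an actual connection is discussed in the next section.

    /* Client process sketch; the header name and the signatures are
     * assumptions. */
    #include <pse.h>

    void do_client(void)
    {
        /* Receive a SIGTERM if the parent exits unexpectedly. */
        PSE_registerToParent();

        /* Retrieve the information the master passed to PSE_spawnTasks(). */
        int masterNode = PSE_getMasterNode();
        int masterPort = PSE_getMasterPort();

        /* ... connect back to the master using masterNode and masterPort ... */
        (void)masterNode;
        (void)masterPort;
    }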

All further actions undertaken during the startup phase, by the master process on the one side and the client processes on the other, concern the establishment of the connections used for communication. These will be discussed in the next section.

After the startup phase all processes of the parallel task reach a mode of normal operation, which usually covers the actual application code. Within the application two kinds of exit mechanisms may be needed. On the one hand, an error detected within one process should result in the termination of the whole parallel task. This can be achieved using the PSE_abort(3) function. On the other hand, one of the processes might have finished all its work and want to exit without disturbing the other processes within the parallel task. In order to do so, PSE_finalize(3) has to be called. Afterwards the process may exit without shutting down all other processes.
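
Both exit paths might be used as sketched below; the exit-code argument to PSE_abort(3) is an assumption.

    /* Exit path sketch; the header name and the signatures, in particular
     * the exit-code argument of PSE_abort(), are assumptions. */
    #include <stdlib.h>
    #include <pse.h>

    /* Fatal error: terminate the whole parallel task. */
    void fatal(void)
    {
        PSE_abort(1);
    }

    /* Normal end of work: leave without disturbing the other processes. */
    void done(void)
    {
        PSE_finalize();
        exit(0);
    }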