parastation.conf

Name

parastation.conf — the ParaStation MPI configuration file

Description

Upon execution, the ParaStation MPI daemon psid(8) reads its configuration information from a configuration file which, by default, is /etc/parastation.conf. There are various parameters that can be modified persistently within this configuration file.

The basic syntax of the configuration file is one parameter per line. For ease of use, some parameters, e.g. Nodes, are implemented in an environment mode. This mode allows multiple parameters to be set by a single command. Environment mode parameters may span more than one line.

Line continuation is possible. If the last character within a line before the newline character is a "\", the newline character will be ignored and the next line is appended to the current line.

Comments start with a "#". All remaining characters on the line are ignored. Keep in mind that line continuation also works within comments, i.e. if the last character of a comment line is a "\", the next line is ignored, too.
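
As an illustration of the two rules above, the following fragment is a sketch built only from keywords described on this page; the values are arbitrary examples.

```
# Continuation: the two physical lines below form one HWType statement
HWType { ethernet \
         p4sock }

# Pitfall: this comment ends with a backslash, so the next line is ignored too \
SelectTime 5
```

In this sketch the SelectTime line is swallowed by the preceding comment, so the daemon would keep its default select timeout.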

The parser used to analyze parastation.conf is not case sensitive. This means that all keywords within the configuration file may be written in any combination of upper- and lowercase characters. Within this document a mixed upper-/lowercase notation is used to make keywords more readable. The same notation is used in the configuration file template parastation.conf.tmpl contained in the distributed ParaStation MPI system. The template file can be found in /opt/parastation/config.

Parameters

The different parameters are discussed in the order they should appear within the configuration file. Dependencies between parameters - resulting in a defined order of parameters - are marked explicitly.

Some parameters may be modified using different keywords, e.g. both InstallDir and InstallationDir modify the directory where the ParaStation MPI daemon psid(8) expects the ParaStation MPI system to be installed. Where different keywords modify the same resource, all keywords are listed at the head of the parameter's discussion.

Only a few parameters must be declared in every case in order to enable ParaStation MPI to run on a cluster. These parameters are HWType and Nodes.

If a parameter is declared more than once, the last declaration takes effect. Do not rely on this behavior as a feature, since it is a common source of pitfalls.

InstallDir inst-dir , InstallationDir inst-dir

Tell the ParaStation MPI daemon to find all the ParaStation MPI related files in inst-dir. The default is /opt/parastation.

Hardware name

Tell the ParaStation MPI daemon how to handle a particular type of hardware. Usually it is not necessary to edit these entries, since the template version of the configuration file contains up-to-date entries for all supported hardware types. Furthermore, a deeper insight into the low-level functionality of ParaStation MPI is needed in order to create such an entry.

Nevertheless, a brief overview of the structure of the Hardware entries is given here.

The following five types of parameters within the Hardware environment are handled specially by the ParaStation MPI daemon psid(8). They define the script files called in order to execute various operations on the corresponding communication hardware.

All these entries have the form of the parameter's name followed by the corresponding value. The value might be enclosed by single or double quotes in order to allow a space within.

The values are interpreted as absolute or relative paths. Relative paths will be looked up relative to InstallDir. If one or more of the scripts are not defined, no corresponding action will take place for this hardware.

startscript: Define a script called in order to startup the corresponding communication hardware. This script will be executed when the daemon starts up or after a reset of the communication hardware.
stopscript: Define a script called in order to shutdown the corresponding communication hardware. This script will be executed when the daemon exits or before a reset of the communication hardware.
setupscript: Define a script called in order to set special parameters on the corresponding communication hardware.
statusscript: Define a script called in order to get a status message from the corresponding communication hardware. This is mainly used in order to generate the lines shown by the status counter directive of the ParaStation MPI administration tool psiadmin(1).
headerscript: Define a script called in order to get a header line for the status message produced by the statusscript discussed above.

All further parameters defined within a Hardware section are interpreted as environment variables when calling the scripts defined above. Again, these parameters have the form of the parameter's name, interpreted as the environment variable's name, followed by the corresponding value. The values may be single strings not containing whitespace characters, or strings enclosed by single or double quotes.

The impact of the environment variables on the scripts depends, of course, on the scripts themselves.

Various hardware types are defined within the template configuration file coming with the ParaStation MPI software distribution. These hardware types, the corresponding scripts and the environment variables the scripts understand are briefly discussed within the following lines.

Note

Shared memory will be used as hardware type for communication within an SMP node. As there are no options for this kind of hardware, no dedicated section is provided.

ethernet

Use classical TCP/IP communication over Ethernet via an optimized MPI implementation.

Since TCP/IP has to be configured before ParaStation MPI starts up, the corresponding script ps_ethernet has almost nothing to do and hence does not recognize any environment variables.

p4sock

Use optimized communication via (Gigabit) Ethernet.

The script handling this hardware type, ps_p4sock, is also located in the config subdirectory. It understands the following two environment variables:

PS_TCP: If set to an address range, e.g. 192.168.10.0-192.168.10.128, the TCP bypass feature of the p4sock protocol is enabled for the given address range.
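
A Hardware entry for p4sock might look like the following sketch. The brace-delimited layout is an assumption made by analogy with the other environment mode parameters on this page, and the use of the same script for several actions is likewise assumed; the entries shipped in parastation.conf.tmpl are authoritative.

```
# Sketch only: block layout and script assignments are assumptions
Hardware p4sock {
    startscript  "config/ps_p4sock"    # relative path, resolved against InstallDir
    stopscript   "config/ps_p4sock"
    statusscript "config/ps_p4sock"
    headerscript "config/ps_p4sock"
    # Any other name/value pair is passed to the scripts as an environment variable
    PS_TCP 192.168.10.0-192.168.10.128
}
```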

openib

Use the OpenFabrics verbs layer for communication over InfiniBand.

No script is currently implemented for this communication protocol, therefore no environment variables are recognized.

mvapi

Use the Mellanox verbs layer for communication over InfiniBand.

No script is currently implemented for this communication protocol, therefore no environment variables are recognized.

gm

Use communication over GM (Myrinet).

The script ps_gm will load the Myrinet gm driver.

PS_IPENABLED: If set to 1, the IP device myri0 is enabled after loading.

elan

Use communication over QsNet (libelan).

No script is currently implemented for this communication protocol, therefore no environment variables are recognized.

This communication layer is currently not supported by the ParaStation MPI communication library, therefore only programs linked with the QsNet MPI will work.

ipath

Use communication over InfiniPath.

No script is currently implemented for this communication protocol, therefore no environment variables are recognized.

This communication layer is currently not supported by the ParaStation MPI communication library, therefore only programs linked with the InfiniPath MPI will work.

dapl

Use communication over a generic DAPL layer.

No script is currently implemented for this communication protocol, therefore no environment variables are recognized.

accounter

This is actually a pseudo communication layer. It is only used for configuring nodes running the ParaStation MPI accounting daemon and should be used only in a particular Nodes entry.

NrOfNodes num

This configuration parameter is no longer required and will be silently ignored.

HWType { ethernet | p4sock | openib | mvapi | gm | elan | dapl | none }

HWType { { ethernet | p4sock | openib | mvapi | gm | elan | dapl | none }... }

Define the default communication hardware available on the nodes of the ParaStation MPI cluster. This may be overruled by an explicit HWType option in a Node statement.

The hardware types used within this command must be defined in preceding Hardware declarations.

Further hardware declarations may be defined by the user, but this is largely undocumented.

It is possible to enable more than one hardware type, either as default or on a per node basis.

The default value of HWType is none.
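
For example, assuming Hardware declarations for ethernet and p4sock appear earlier in the file, either of the following lines could set the cluster-wide default. Only one of them would normally be present, since a later declaration replaces an earlier one.

```
# One default hardware type
HWType p4sock

# Alternatively, several default hardware types at once
HWType { p4sock ethernet }
```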

starter { true | yes | 1 | false | no | 0 }

If the argument is one of yes, true or 1, all nodes declared within a Node statement are allowed to start parallel tasks, unless otherwise stated.

If the argument is one of no, false or 0, starting parallel tasks is not allowed.

It might be useful to prohibit the startup of parallel tasks from the frontend machine if a batch system is used. This forces all users to use the batch system in order to start their tasks. Otherwise it would be possible to circumvent the batch system by starting parallel tasks directly from the frontend machine.

The default is to allow the starting of parallel tasks from all nodes.

runJobs { true | yes | 1 | false | no | 0 }

If the argument is one of yes, true or 1, all nodes declared within a Node statement are allowed to run processes of parallel tasks, unless otherwise stated.

If the argument is one of no, false or 0, ParaStation MPI will not start processes on these nodes.

It might be useful to prohibit the start of processes on a frontend machine, since usually this machine is reserved for interactive work done by the users. Even if the execution of processes is forbidden on a particular node, parallel tasks may still be started from this node.

The default is to allow all nodes to run processes of parallel tasks.
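
The two defaults can be combined with per-node overrides in a Node statement. The following sketch, which uses hypothetical host names, describes a cluster whose frontend neither starts nor runs parallel tasks, so that users have to go through a batch system:

```
# Defaults for all nodes declared below
starter yes
runJobs yes

Nodes {
    frontend 0 starter no runJobs no   # hypothetical login node
    node01   1
    node02   2
}
```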

Node[s] hostname id [HWType-entry] [starter-entry] [runJobs-entry] [env name value] [env { name value ... }]

Node[s] { {hostname id [HWType-entry] [starter-entry] [runJobs-entry] [env name value] [env { name value ... }] }... }

Node[s] $GENERATE from-to/step nodestr idstr [HWType-entry] [starter-entry] [runJobs-entry] [env name value] [env { name value ... }]

Define one or more nodes to be part of the ParaStation MPI cluster.

This is the first example of a parameter that supports the environment mode. This means there are two different notations for this parameter. The first one may be used to define a single node; the second one allows more than one node to be registered within a single command. It is a convenient form that avoids typing the keyword again for every entry.

Each entry has to have at least two items, the hostname and the id. This will tell the ParaStation MPI system that the node called hostname will act as the physical node with ParaStation MPI ID id.

hostname is either a resolvable hostname or an IP address in dot notation (e.g. 192.168.1.17). id is an integer number in the range from 0 to the maximum number of nodes minus one.

Further optional items such as HWType-entry, starter-entry or runJobs-entry may override the default values of the hardware type on the node, the ability to start parallel jobs from this node, or the possibility to run processes on this node, respectively. These entries have the same syntax as the stand-alone commands that set the corresponding default value.

E.g. the line

Node node17 16 HWType { ethernet p4sock } starter yes runJobs no

will define the node node17 to have the ParaStation MPI ID 16. Furthermore, it is expected to have Ethernet communication using both the TCP and p4sock protocols. It is allowed to start parallel tasks from this node, but the node itself will not run any process of any parallel task (except the ParaStation MPI logger processes of the tasks started on this node).

The option environment or env allows per-node environment variables to be set. Using the first form, the variable name is set to value. Using the second, brace-enclosed form, more than one name/value pair may be given. More complex values may be given using quotation marks:

Node node17 16 environment LD_LIBRARY_PATH /mypath
Node node18 17 env { PSP_P4S "2" PSP_OPENIB "0" }

This example will set the variable LD_LIBRARY_PATH to /mypath for node node17, and the variables PSP_P4S and PSP_OPENIB to 2 and 0, respectively, for node node18.

The $GENERATE form allows a group of nodes to be defined at once using a simple syntax. The parameters from and to define a range, which is traversed in increments of step. Each value in this range may be referenced within nodestr and idstr using the syntax $[{offset[,width[,base]]}]. E.g., the entry

$GENERATE 1-96  node${0,2} ${0}

defines the nodes node01 up to node96 using the IDs 1 to 96, respectively. More node-specific attributes may be defined as described above.
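
As a further sketch, a step can be added to the range and per-node attributes appended. The line follows the Node[s] $GENERATE synopsis given above and uses hypothetical host names. Assuming the step behaves as described, it should define node01, node03, node05 and node07 with the IDs 1, 3, 5 and 7, each using both Ethernet protocols:

```
Nodes $GENERATE 1-7/2 node${0,2} ${0} HWType { ethernet p4sock }
```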

LicenseServer hostname , LicServer hostname

LicenseFile lic-file , LicFile lic-file

LicenseDeadInterval num , LicDeadInterval num

These entries are silently ignored by this version of ParaStation MPI.

SelectTime time

Set the timeout of the central select(2) of the ParaStation MPI daemon psid(8) to time seconds.

The default value is 2 seconds.

Note

This parameter can be set during runtime via the set selecttime directive within the ParaStation MPI administration and management tool psiadmin(1).

DeadInterval num

The ParaStation MPI daemon psid(8) will declare other daemons as dead after num consecutive missing multicast pings.

After declaring a node as dead, all processes residing on this node are also declared dead. This results in sending signals to all processes on the local node that have requested to get informed about the death of one of these processes.

The default value is 10.

For now, the multicast period is set to two seconds, i.e. every daemon sends a multicast ping every two seconds. This results in declaring a daemon as dead after 20 seconds for the default value.
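
The detection time is simply num multiplied by the multicast period. For instance, with the two-second period mentioned above, the following setting would tolerate 30 seconds of silence before a daemon is declared dead:

```
# 15 missed pings x 2 s per ping = 30 s
DeadInterval 15
```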

LogLevel num

Set the debugging level of the ParaStation MPI daemon psid(8) to num.

Note

For values of num larger than 10 the daemon writes a huge number of messages to the logging destination, which is usually syslog(3).

This parameter can be set during runtime via the set psiddebug directive within the ParaStation MPI administration and management tool psiadmin(1).

LogDest { LOG_DAEMON | LOG_KERN | LOG_LOCAL[0-7] }

LogDestination { LOG_DAEMON | LOG_KERN | LOG_LOCAL[0-7] }

Set the logging output's destination for the ParaStation MPI daemon psid(8). Usually the daemon prints logging output using the syslog(3) mechanism, unless an alternative logging file is requested via psid(8)'s -l option.

In order to collect all the ParaStation MPI specific log messages into a special file, the facility argument of the openlog(3) function call in cooperation with a suitable setup of the syslogd(8) may be used. This parameter will set the argument to one of the mentioned values.

The default value is LOG_DAEMON.
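
A typical use is to route all ParaStation MPI messages to a dedicated file. The sketch below assumes a classic syslogd(8) configured through /etc/syslog.conf; the log file name is a hypothetical example, and other syslog implementations use a different configuration syntax.

```
# parastation.conf: log with the facility LOG_LOCAL0
LogDest LOG_LOCAL0

# Matching /etc/syslog.conf entry (not part of parastation.conf):
#   local0.*    /var/log/parastation.log
```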

MCastGroup group-num

Tell psid(8) to use the multicast group group-num for multicast communication to other daemons.

The default group to use is 237.

MCastPort portno

Tell psid(8) to use the UDP port portno for multicast communication to other daemons.

The default port to use is 1889.

RDPPort portno

Tell psid(8) to use the UDP port portno for the RDP communication protocol to other daemons.

The default port to use is 886.

RLimit { Core size | CPUTime time | DataSize size | MemLock size | StackSize size | RSSize size | NoFile num }

RLimit { { Core size | CPUTime time | DataSize size | MemLock size | StackSize size | RSSize size | NoFile num }... }

Set various resource limits for psid(8) and thus for all processes started from it.

All limits are set using the setrlimit(2) system call. For a detailed description of the different types of limits please refer to the corresponding manual page.

If no RLimits are set within the ParaStation MPI configuration file, no changes are made to the system's default values.

The following (soft) resource limits may be set:

Core size: Set the maximum size of a core-file to size kilobytes. size is an integer number, the string “infinity” or the string “unlimited”. In the two latter cases the core-file size is set to RLIM_INFINITY.
Note
Starting with version 5.0.3, this setting also controls the writing of core-files for psid itself, in case a catastrophic failure occurs.
CPUTime time: Set the maximum CPU time that might be consumed by the daemon to time seconds. time has to be an integer number, the string “infinity” or the string “unlimited”. In the two latter cases the CPU time is set to RLIM_INFINITY.
DataSize size: Set the maximum data size to size kilobytes. size is an integer number, the string “infinity” or the string “unlimited”. In the two latter cases the data size is set to RLIM_INFINITY.
MemLock size: Set the maximum amount of memory that might be locked into RAM to size kilobytes. size is an integer number, the string “infinity” or the string “unlimited”. In the two latter cases the amount of lockable memory is set to RLIM_INFINITY.
StackSize size: Set the maximum stack size to size kilobytes. size is an integer number, the string “infinity” or the string “unlimited”. In the two latter cases the stack size is set to RLIM_INFINITY.
RSSize size: Set the maximum Resident Set Size (RSS) to size pages. size is an integer number, the string “infinity” or the string “unlimited”. In the two latter cases the RSS is set to RLIM_INFINITY.
NoFile num: Set the maximum number of open files to num. Be aware that inherited limits are confined by psid's hard limits.
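
For illustration, several limits can be set in one statement using the environment form. The values below are arbitrary examples rather than recommendations:

```
RLimit {
    Core      unlimited   # no limit on core-file size
    StackSize 65536       # stack limit given in kilobytes
    MemLock   infinity    # no limit on locked memory
    NoFile    4096        # at most 4096 open files
}
```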

Env | Environment name value

Env | Environment { {name value }... }

Set environment variables for the ParaStation MPI daemon psid(8) and any application started via this daemon.

This command again has two different modes. The first form sets exactly one variable, while the environment form sets any number of variables. The general layout of the latter is one variable per line.

The value part of each line either is a single word or an expression enclosed by single or double quotes. The expression might contain whitespace characters. If the expression is enclosed by single quotes, it is allowed to use balanced or unbalanced double quotes within this expression and vice versa.

This command might be used, for example, to set the PSP_NETWORK environment variable globally, so that users do not have to adjust this parameter in their own environments.
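
Both forms might be used as in the following sketch. LD_LIBRARY_PATH and PSP_P4S are taken from the examples earlier on this page; GREETING is a hypothetical variable that only demonstrates quoting.

```
# Single-variable form
Env LD_LIBRARY_PATH /mypath

# Environment form, one variable per line
Environment {
    PSP_P4S  "2"
    GREETING 'say "hello" to every rank'
}
```

The last value shows double quotes used inside a single-quoted expression.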

freeOnSuspend { true | yes | 1 | false | no | 0 }

If the argument is one of yes, true or 1, suspending a task by sending the signal SIGTSTP to the logger causes all resources (CPUs) currently claimed by this task to be treated as free.

If the argument is one of no, false or 0, ParaStation MPI will not treat these resources as free after SIGTSTP has been sent.

handleOldBins { true | yes | 1 | false | no | 0 }

If the argument is one of yes, true or 1, compatibility mode for applications linked with ParaStation MPI version 4.0 up to 4.0.6 will be enabled. Keep in mind that this behavior might collide with the freeOnSuspend feature.

If the argument is one of no, false or 0, ParaStation MPI will disable compatibility mode.

UseMCast { true | yes | 1 | false | no | 0 }

If the argument is one of yes, true or 1, keep-alive messages from the ParaStation MPI daemon psid(8) are sent using multicast messages.

If the argument is one of no, false or 0, ParaStation MPI will use its own RDP protocol for keep-alive messages. This is the default.
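
A sketch of a configuration that switches keep-alive messages to multicast, spelling out the documented default group and port explicitly:

```
UseMCast   yes
MCastGroup 237    # default multicast group
MCastPort  1889   # default UDP port for multicast
```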

PSINodesSort { PROC | LOAD_1 | LOAD_5 | LOAD_15 | PROC+LOAD | NONE }

Define the default sorting strategy for nodes when attaching them to a partition. The different possible values have the following meaning:

PROC: Sort by the number of processes managed by ParaStation MPI on the corresponding nodes
LOAD_1: Sort by the load average during the last minute on the corresponding nodes
LOAD_5: Sort by the load average during the last 5 minutes on the corresponding nodes
LOAD_15: Sort by the load average during the last 15 minutes on the corresponding nodes
PROC+LOAD: Sort conforming to the sum of the processes managed by ParaStation MPI and the load average during the last minute on the corresponding nodes
NONE: Do not sort at all.

This only comes into play if the user does not define a sorting strategy explicitly via PSI_NODES_SORT. Be aware that a batch system like PBS or LSF will set the strategy explicitly, namely to NONE.
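
For example, to sort nodes by their one-minute load average whenever neither the user nor a batch system specifies a strategy, the default could be set as follows:

```
PSINodesSort LOAD_1
```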

overbook { true | yes | 1 | false | no | 0 }

If the argument is one of yes, true or 1, all nodes may be overbooked by the user using the PSI_OVERBOOK environment variable.

If the argument is one of no, false or 0, ParaStation MPI will deny overbooking of the nodes, even if PSI_OVERBOOK is set.


processes maxprocs

Define the maximum number of processes per node.

This parameter can be set during runtime via the set maxproc directive within the ParaStation MPI administration and management tool psiadmin(1).

pinProcs { true | yes | 1 | false | no | 0 }

Enables or disables process pinning for compute tasks. If enabled, tasks will be pinned down to particular CPU-slots. The mapping between those CPU-slots and physical CPUs and cores is made using a mapping list. See CPUmap below.

The pinProcs parameter can be set during runtime via the set pinprocs directive within the ParaStation MPI administration and management tool psiadmin(1).

bindMem { true | yes | 1 | false | no | 0 }

This parameter must be set to true if nodes providing non-uniform memory access (NUMA) should use 'local' memory for the tasks.

This parameter can be set during runtime via the set bindmem directive within the ParaStation MPI administration and management tool psiadmin(1).

CPUmap { map }

Set the map used to assign CPU-slots to physical cores to map. map is a quoted string containing a space-separated permutation of the numbers 0 to Ncore-1. Here Ncore is the number of physical cores available on this node. The number of cores within a particular node may be determined via list hw. The first number in map is the number of the physical core the first CPU-slot will be mapped to, and so on.

This parameter can be set during runtime via the set cpumap directive within the ParaStation MPI administration and management tool psiadmin(1).

supplGrps { true | yes | 1 | false | no | 0 }

This parameter must be set to true if processes spawned by ParaStation MPI should belong to all groups defined for this user. Otherwise, they will only belong to the primary group.

This parameter can be set during runtime via the set supplementaryGroups directive within the ParaStation MPI administration and management tool psiadmin(1).

rdpMaxRetrans number

Set the maximum number of retransmissions within the RDP facility. If more than this number of retransmissions would be necessary to deliver a packet to the remote destination, the connection is declared to be down.

Errors

No known errors.
