psiadmin — the ParaStation MPI administration and management tool
psiadmin [ -denqrsv? ] [ -c command ] [ -f program-file ] [ --usage ]
The psiadmin command provides an administrator interface to the ParaStation MPI system.
The command reads directives from standard input in interactive mode. The syntax of each directive is checked and the appropriate request is sent to the local ParaStation MPI daemon psid(8).
To run psiadmin in batch mode, use either the -c or the -f option. The syntax of the directives is exactly the same as in interactive mode for both options.
Most of the directives listed below can be executed by general users. Only modifying parameters, killing foreign jobs, and shutting down single nodes or the whole system require root privileges.
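Batch mode lends itself to scripting. The following Python sketch only assembles the argument list for a psiadmin call from the -c, -f and -e options documented on this page. The helper name `psiadmin_argv` and the script file name in the usage note are illustrative and not part of ParaStation MPI.

```python
def psiadmin_argv(directive=None, script=None, echo=False):
    """Build an argument list for a batch-mode psiadmin call.

    Exactly one of `directive` (for -c) or `script` (for -f) must be given.
    """
    if (directive is None) == (script is None):
        raise ValueError("give exactly one of directive or script")
    argv = ["psiadmin"]
    if echo:
        argv.append("-e")             # echo each executed directive to stdout
    if directive is not None:
        argv += ["-c", directive]     # execute a single directive and exit
    else:
        argv += ["-f", script]        # read directives from a file until EOF
    return argv
```

On a host where psiadmin is installed, the resulting list could be handed to `subprocess.run()`, for example `subprocess.run(psiadmin_argv(script="setup.psi", echo=True))`.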
-c, --command=command
Execute the single directive command and exit.
-d
Do not automatically start up the local psid(8).
-e, --echo
Echo each executed directive to stdout.
-f, --file=program-file
Read commands from the file program-file. Exit as soon as EOF is reached.
It might be useful to enable echoing (-e) when acting on a script file.
This option silently enables the -q option, suppressing the prompt.
-n, --noinit
Ignore the initialization file .psiadminrc.
-q, --quiet
Suppress printing the prompt each time psiadmin waits for a new command. This is useful in combination with the -f option.
-s, --start-all
Try to start all daemons within the cluster. This option is equivalent to the execution of the add directive straight after the startup of the administration tool.
-r, --reset
Do a reset of the ParaStation MPI system on startup.
-v, --version
Output version information and exit.
-?, --help
Show a help message.
--usage
Display a brief usage message.
The psiadmin command reads standard input for directives until end of file is reached, or the exit or quit directive is read.
If standard output is connected to a terminal, a command prompt will be written to standard output when psiadmin is ready to read a directive.
If the -e option is specified, psiadmin will echo the directives read from standard input to standard output.
The psiadmin command will write a diagnostic message to standard error for each error that occurs.
If psiadmin is invoked without the -c or -f option and standard output is connected to a terminal, psiadmin will repeatedly write a prompt to standard output and read a directive from standard input.
Directives can be abbreviated to their minimum unambiguous form. A directive is terminated by a newline character or a semicolon. Multiple directives may be entered on a single line. A directive may extend across lines by escaping the newline character with a backslash "\".
Comments begin with the # character and continue to end of the line. Comments and blank lines are ignored by psiadmin.
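The rules above can be illustrated with a small Python sketch that splits a script into individual directives. It is not psiadmin's actual parser. The function name `split_directives` is hypothetical. The sketch assumes that line continuations are joined before comments are stripped, and it makes no attempt to handle quoted strings containing ';' or '#'.

```python
def split_directives(text):
    """Split psiadmin-style input into a list of directive strings."""
    # A backslash directly before the newline joins the next line.
    joined = text.replace("\\\n", " ")
    directives = []
    for line in joined.split("\n"):
        # A comment runs from '#' to the end of the line.
        line = line.split("#", 1)[0]
        # A semicolon separates several directives on one line.
        for piece in line.split(";"):
            piece = " ".join(piece.split())   # assumption: collapse whitespace
            if piece:                         # blank lines and empty pieces are ignored
                directives.append(piece)
    return directives
```

For example, the input `range 0-3; list node  # status` yields the two directives `range 0-3` and `list node`.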
Upon startup, psiadmin tries to find the file .psiadminrc, first in the current directory and then in the user's home directory. Only the first one found is used. Each directive found within this file is handled silently before psiadmin goes into either interactive or batch mode (using the -f flag).
Whenever psiadmin is started in interactive mode, it will prompt for directives unless the -q flag is used. The same directives are accepted in batch mode, too.
Directives may be abbreviated as long as they are unique. They can be expanded using the TAB key, analogous to the tab expansion features of some shells. A command history is stored in ~/.psiadm_history. See readline(3) for more information on command expansion and command history.
Almost all directives accept an optional parameter nodes. This is either a comma-separated list of node ranges to act on, each of the form from[-to], or the keyword all.
If the to part is missing, the range represents the single node from. In principle, nodes may contain an unlimited number of ranges.
If the value of nodes is all, all nodes of the ParaStation MPI cluster are selected by this directive.
If nodes is empty, the node range preselected via the range command is used. The default preselected node range contains all nodes of the ParaStation cluster.
The from and to parts of each range are node IDs. They may be given in decimal or hexadecimal notation and must lie between 0 and NumberOfNodes-1.
As an extension, nodes may also be a hostname that can be resolved into a valid ParaStation MPI ID.
Using hostnames containing "-" might confuse this algorithm and is therefore not recommended.
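To make the range syntax concrete, here is a minimal Python sketch that expands a nodes argument into a list of node IDs. It is an illustration, not psiadmin's implementation. The name `parse_nodes` is hypothetical. Hostnames and the empty (preselected) case are deliberately left out, and rejecting a range whose from exceeds its to is an assumption.

```python
def parse_nodes(spec, number_of_nodes):
    """Expand a nodes argument such as '0-2,5,0x7-0x9' into sorted node IDs."""
    if spec == "all":
        return list(range(number_of_nodes))
    selected = set()
    for part in spec.split(","):
        first, sep, last = part.partition("-")
        start = int(first, 0)                  # base 0 accepts decimal and 0x... hex
        end = int(last, 0) if sep else start   # a missing 'to' means a single node
        if not (0 <= start <= end < number_of_nodes):
            raise ValueError(f"range {part!r} is outside 0..{number_of_nodes - 1}")
        selected.update(range(start, end + 1))
    return sorted(selected)
```

With a 16-node cluster, `parse_nodes("0-2,5,0x7-0x9", 16)` selects nodes 0, 1, 2, 5, 7, 8 and 9.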
exit
Exit the interactive mode of psiadmin. Same as quit.
help [directive]
Print a help message concerning directive. If directive is missing, a general help message is displayed.
kill [-sig] tid
Send the signal sig to the process with the task ID tid. sig has to be a positive number representing a UNIX signal. For a list of available signals, consult the signal(7) manual page. If sig is not given, a SIGTERM signal (i.e. 15) is sent to the process.
Processes can only be signaled by their owner or by root.
list [ all | allproc [cnt count] | count [hw hw] | down | hardware | load | mcast | memory | node | proc [cnt count] | rdp | summary [max max] | up | version | startupscript | starttime | environment | rdpconnection | nodeupscript | nodedownscript ] [nodes]
list jobs [ state running | state pending | state suspended ] [slots] [tid]
Report various states of the selected node(s) or job(s).
Depending on the given argument, different information can be requested from the ParaStation MPI system. If no argument is given, the node information is retrieved.
all
Show the information given by node, count and proc on the selected node(s).
allproc [cnt count]
Show all processes managed by the ParaStation MPI system on the selected node(s).
All processes managed by ParaStation MPI are displayed, including forwarder and other special processes. If forwarder processes should not be displayed, use the list proc directive.
Up to count processes per node are displayed. If more processes are controlled by ParaStation MPI on this node, a short remark reports the number of processes not displayed. The default is to show 10 processes.
The output fields of the process list are described within the list proc directive. In addition to the process classes described there, ParaStation MPI Forwarder processes, i.e. processes spawned by the ParaStation MPI daemon psid(8) in order to control a spawned process, are marked by “(F)” after the user ID. Further helper processes needed in order to spawn non-ParaStation MPI applications are marked with “(S)”.
count [hw hw]
List the status of the communication system(s) on the selected node(s). Various counters are displayed.
If the hw option is given, only the counters concerning the hw hardware type are displayed. The default is to display the counters of all enabled hardware types on this node.
down
List all nodes which are marked as "DOWN".
hardware
Show the hardware setup on the selected node(s).
Besides the types of the communication hardware enabled within the ParaStation MPI system on each node, the number of available CPUs is also displayed. The two numbers shown in this column are the number of virtual and physical CPUs respectively. These numbers might differ due to technologies like Intel's HyperThreading or multi-core CPUs.
load
Show the load and the number of processes managed by the ParaStation MPI system on the selected node(s).
The three load values displayed are the averages for 1, 5 and 15 minutes respectively. The two numbers of processes are as follows: The total number of processes contains all processes managed by the ParaStation MPI system, including Logger, Forwarder and psiadmin(1) processes. Furthermore of course the actual working processes started by the users are included. The latter ones are the “normal” processes, additionally displayed in the last column of the output.
mcast
List the status of the MCast facility of the ParaStation MPI daemon psid(8) on the selected node(s).
memory
Show the overall and available memory on the selected node(s).
node
List the status of the selected node(s). Depending on the state of the ParaStation MPI daemons, the node(s) are marked to be "UP" or "DOWN".
proc [cnt count]
Show the processes managed by the ParaStation MPI system on the selected node(s).
Only user, logger and admin processes are displayed. If forwarder and other special processes should also be displayed, use the list allproc directive.
Up to count processes per node are displayed. If more processes are controlled by ParaStation MPI on this node, a short remark reports the number of processes not displayed. The default is to show 10 processes.
The listed fields have the following meaning:
The ParaStation MPI ID of the node the process is running on.
The ParaStation MPI task ID of the process, both as decimal and hexadecimal number. The task ID of a process is unique within the cluster and is composed out of the ParaStation MPI ID of the node the process is running on and the local process ID of the process, i.e. the result of calling getpid(2).
The ParaStation MPI task ID of the parent process. The parent process is the one which has spawned the current process. If the process was not spawned by any other process controlled by ParaStation MPI, i.e. it is the first process started within a parallel task, the parent ParaStation MPI task ID is 0.
Flag to mark if the process has reconnected to its local ParaStation MPI daemon psid(8). If a 1 is displayed, the process has connected to the daemon, otherwise 0 is reported.
The user ID under which the process runs. This is usually identical to the user ID of the parent process.
Furthermore administrative processes, i.e. psiadmin(1) processes connected to a local daemon are marked with “(A)” after the user ID.
Logger processes, i.e. root processes of parallel tasks which converted to a ParaStation MPI Logger process, are tagged with “(L)” after the user ID.
System processes, which are not counted, are marked as “(*)”. Accounting processes are indicated by “(C)”. Other helper processes are marked with “(S)”.
jobs [ state running | state pending | state suspended ] [slots] [tid]
Show all or selected jobs managed by the ParaStation MPI system.
If a state is selected, only jobs in the state running, pending or suspended are shown. If slots is provided, node and CPU count information for each job is printed, too. If tid is given, information for this particular job is shown.
For each job, information about the RootTaskId, state (= 'R', 'P' or 'S'), size (= number of CPUs), UID, GID, target slots and start time is printed.
rdp
List the status of the RDP protocol of the ParaStation MPI daemon psid(8) on the selected node(s).
This directive displays, for each connection, the state, the partner's IP address, the total number of frames sent, the number of pending ACKs, the number of pending frames, the number of retransmissions of the frame in flight, and the total number of retransmissions during the connection. If collecting RDP statistics is enabled, the mean time to ACK is displayed, too.
summary [max max]
Print a brief summary of the active and down nodes: the number of up and down nodes is printed in one line. If any node is down and the number of down nodes is less than 20 or max, the node IDs of these nodes are also printed in a second line.
up
List all nodes which are marked as "UP".
version
List the ID, psid revision and RPM version for the selected node(s).
startupscript
Show the daemon's script called during startup in order to test the local situation on the selected nodes.
starttime
List the startup time of the ParaStation daemons currently running.
environment [key env]
List the daemon's environment variable env and its value on the selected nodes. If key env is omitted, the entire environment is displayed.
rdpconnection
Show info on RDP connections on the selected nodes.
nodeupscript
Show the daemon's script called on the master node whenever a daemon connects after being down before on the selected nodes.
nodedownscript
Show the daemon's script called on the master node whenever a daemon disconnects on the selected nodes.
quit
Exit the interactive mode of psiadmin. Same as exit.
range {[n1-n10] | [n1,n2,..] | all }
Preselect or display the default set of nodes.
The range command is used to preselect a range of nodes. Any subsequent psiadmin commands issued after the range command will only be applied to the nodes defined in the range.
Ranges can be specified as contiguous sets n1-n2 or as individual comma-separated hosts n1,n2.
Mixes of contiguous sets and individual hosts are permitted, as shown in the following example:
PSIAdmin> range n1-n3,n5,n7-n10
show { maxproc | user | group | psiddebug | selecttime | statustimeout | statusbroadcasts | deadlimit | rdpdebug | rdptimeout | rdppktloss | rdpmaxretrans | rdpresendtimeout | rdpretrans | rdpclosedtimeout | rdpmaxackpend | rdpstatistics | mcastdebug | master | freeonsuspend | fos | handleoldbins | hob | starter | runjobs | overbook | exclusive | pinprocs | bindmem | cpumap | allowUserMap | nodessort | supplementaryGroups | maxStatTry | adminuser | admingroup | accounters | accountpoll | rl_addressspace | rl_as | rl_core | rl_cpu | rl_data | rl_fsize | rl_locks | rl_memlock | rl_msgqueue | rl_nofile | rl_nproc | rl_rss | rl_sigpending | rl_stack } [nodes]
Show various parameters of the ParaStation MPI system.
accounters [nodes]
Show information on which node(s) ParaStation MPI accounting processes are running.
user [nodes]
Show which users are granted exclusive access on the selected node(s).
group [nodes]
Show which groups are granted exclusive access on the selected node(s).
maxproc [nodes]
Show the maximum number of ParaStation MPI processes on the selected node(s).
selecttime [nodes]
Show the timeout of the central select(2) of the ParaStation MPI daemon psid(8) on the selected node(s).
psiddebug [nodes]
Show the debugging mask of the ParaStation MPI daemon psid(8) on the selected node(s).
rdpdebug [nodes]
Show the debugging mask of the RDP protocol within the ParaStation MPI daemon psid(8) on the selected node(s).
rdpretrans [nodes]
Show the RDP retransmit counters of the selected node(s).
mcastdebug [nodes]
Show the debugging mask of the MCast protocol within the ParaStation MPI daemon psid(8) on the selected node(s).
master [nodes]
Show the current master on the selected node(s).
The master node's task is the management and allocation of resources within the cluster. It is elected among the running nodes during runtime. Thus all nodes should usually give the same answer to this question. In rare cases, usually during startup or immediately after a node failure, the nodes might disagree on the elected master node. This command helps to identify these rare cases.
freeOnSuspend [nodes]
Show the freeOnSuspend flag on the selected nodes.
The freeOnSuspend flag steers the behavior of the resource management concerning suspended jobs. Basically there are two possible approaches: either the resources used by the suspended job are freed for other jobs (this is done if the flag is set to 1), or they are kept occupied in order to preserve them exclusively for the time the job continues to run (this is the behavior as long as the flag has the value 0).
Since the master node does all the resource management within the cluster, only the value on this node actually steers the behavior.
handleOldBins [nodes]
Show the compatibility flag for applications linked against version 4.0.x of ParaStation MPI on the selected nodes.
nodesSort [nodes]
Show the default sorting strategy used when attaching nodes to partitions.
Since the master node does all the resource management within the cluster, only the value on this node actually steers the behavior.
starter [nodes]
Show if the selected node(s) are allowed to start parallel tasks.
runjobs [nodes]
Show if the selected node(s) are allowed to run tasks.
overbook [nodes]
Show if the selected node(s) are allowed to be overbooked on user request.
rdppktloss [nodes]
Show RDP protocol's packet-loss rate.
rdpmaxretrans [nodes]
Show RDP protocol's maximum retransmission count.
exclusive [nodes]
Show the flag marking whether these nodes can be requested exclusively by users.
pinprocs [nodes]
Show the flag marking whether these nodes use process pinning.
cpumap [nodes]
Show the CPU-slot to core mapping list for the selected nodes.
bindmem [nodes]
Show the flag marking whether these nodes use binding as NUMA policy.
adminuser [nodes]
Show users allowed to start admin-tasks, i.e. unaccounted tasks.
admingroup [nodes]
Show groups allowed to start admin-tasks, i.e. unaccounted tasks.
rl_addressspace [nodes]
Show RLIMIT_AS on this node.
rl_core [nodes]
Show RLIMIT_CORE on this node.
rl_cpu [nodes]
Show RLIMIT_CPU on this node.
rl_data [nodes]
Show RLIMIT_DATA on this node.
rl_fsize [nodes]
Show RLIMIT_FSIZE on this node.
rl_locks [nodes]
Show RLIMIT_LOCKS on this node.
rl_memlock [nodes]
Show RLIMIT_MEMLOCK on this node.
rl_msgqueue [nodes]
Show RLIMIT_MSGQUEUE on this node.
rl_nofile [nodes]
Show RLIMIT_NOFILE on this node.
rl_nproc [nodes]
Show RLIMIT_NPROC on this node.
rl_rss [nodes]
Show RLIMIT_RSS on this node.
rl_sigpending [nodes]
Show RLIMIT_SIGPENDING on this node.
rl_stack [nodes]
Show RLIMIT_STACK on this node.
supplementaryGroups [nodes]
Show supplementaryGroups flag.
statusBroadcasts [nodes]
Show the maximum number of status broadcasts initiated by lost connections to other daemons.
rdpTimeout [nodes]
Show the configured RDP timeout in milliseconds.
deadLimit [nodes]
Show the dead-limit of the RDP status module. See also parastation.conf(5).
statusTimeout [nodes]
Show the timeout of the RDP status module. See also parastation.conf(5).
rdpClosedTimeout [nodes]
Show the closed timeout within the RDP facility in milliseconds. See also parastation.conf(5).
rdpResendTimeout [nodes]
Show the resend timeout within the RDP facility in milliseconds. See also parastation.conf(5).
rdpMaxACKPend [nodes]
Show the maximum ACK pending counter within the RDP facility. See also parastation.conf(5).
rdpStatistics [nodes]
Show if RDP statistics are currently collected.
allowUserMap [nodes]
Show the flag marking whether these nodes allow users to influence the mapping of processes to physical cores.
maxStatTry [nodes]
Show the maximum number of tries to stat() an executable while spawning new processes.
accountPoll [nodes]
Show the polling interval in seconds used by the accounter to retrieve more detailed information.
sleep [sec]
Sleep for sec seconds before continuing to parse the input.
version
Print various version numbers.
environment { list } [nodes]
Manage the ParaStation daemon environment.
list [key env] [nodes]
List the daemon's environment variable env and its value on the selected nodes. If the option key env is omitted, the entire environment is displayed.
echo string
Echo the given string to stdout.
This command does not support control sequences like its counterpart /bin/echo.
Some directives are only available for privileged users, i.e. only root can execute these directives.
add [nodes]
Start the ParaStation MPI daemon psid(8) on the selected node(s).
add only tries to start the ParaStation MPI daemon on the selected node(s). If it is not possible to start the daemon, no error message occurs. The current status of the nodes can be checked using the list directive.
hwstart [hw { hw | all }] [nodes]
Start the declared hardware on the selected nodes.
Starting a specific hardware will be tried on the selected nodes regardless of whether this hardware is specified for these nodes within the parastation.conf configuration file. On the other hand, if hw all is specified or the hw option is missing altogether, only the hardware types specified within the configuration file are started.
Starting or stopping a specific communication hardware only concerns the ParaStation MPI part of hardware handling, i.e. stopping ethernet hardware should not touch the normal IP traffic running over this specific device.
hwstop [hw { hw | all }] [nodes]
Stop the declared hardware on the selected nodes.
If hw all is specified or the hw option is missing altogether, all running hardware for this node is stopped.
Starting or stopping a specific communication hardware only concerns the ParaStation MPI part of hardware handling, i.e. stopping ethernet hardware should not touch the normal IP traffic running over this specific device.
resolve [nodes]
Resolve a list of IDs to node names.
nodes selects one or more ranges of nodes. It is either of the form s1[-e1]{,si[-ei]}*, where the s and e are non-negative numbers representing ParaStation MPI IDs, or 'all'. Each comma-separated part of nodes denotes a range of nodes. If a range's '-e' part is missing, it represents a single node. In principle, nodes may contain an unlimited number of ranges.
If the value of nodes is 'all', all nodes of the ParaStation cluster are selected. If nodes is empty, the node range preselected via the 'range' command is used. The default preselected node range contains all nodes of the ParaStation MPI cluster.
As an extension, nodes may also be a hostname that can be resolved into a valid ParaStation MPI ID.
reset [hw] [nodes]
Reset the ParaStation MPI daemon on all selected node(s). As a consequence all processes using the selected node(s) are killed!
If the option hw is given, the communication hardware is additionally brought into a known state. Executing reset hw is the same as using restart.
restart [nodes]
Restart the ParaStation MPI system on all selected node(s). This includes re-initialization of the communication hardware. On the selected node(s) the ParaStation MPI daemon processes are forced to reinitialize the ParaStation MPI cluster. As a consequence all processes using the selected node(s) are killed!
This is the same as using reset hw.
set { maxproc { num | any } | user [ + | - ] { name | any } | group [ + | - ] { name | any } | psiddebug mask | master id | selecttime time | statusTimeout ms | statusBroadcasts num | deadLimit num | rdpdebug mask | rdpTimeout ms | rdpmaxretrans val | rdpResendTimeout ms | rdpRetrans count | rdpClosedTimeout ms | rdpMaxACKPend num | rdpStatistics bool | mcastdebug mask | freeOnSuspend { 0 | 1 } | handleOldBins { 0 | 1 } | starter { 0 | 1 } | runjobs { 0 | 1 } | overbook { 0 | 1 } | exclusive bool | pinprocs bool | bindmem bool | supplementaryGroups bool | maxStatTry num | cpumap map | allowUserMap bool | nodesSort { PROC | LOAD_1 | LOAD_5 | LOAD_15 | PROC+LOAD | NONE } | adminuser [ + | - ] { name | any } | admingroup [ + | - ] { name | any } } [nodes]
Modify various parameters of the ParaStation MPI system.
adminuser [ + | - ] { name | any } [nodes]
Grant authorization to start admin-tasks, i.e. tasks not blocking a dedicated CPU, to a particular or any user. name may be a user name or a numerical UID. If name is preceded by a '+' or '-', this user is added to or removed from the list of adminusers respectively.
admingroup [ + | - ] { name | any } [nodes]
Grant authorization to start admin-tasks, i.e. tasks not blocking a dedicated CPU, to a particular or any group. name may be a group name or a numerical GID. If name is preceded by a '+' or '-', this group is added to or removed from the list of admingroups respectively.
user [ + | - ] { name | any } [nodes]
Grant exclusive access on the selected node(s) to the special user name or to any user. If name is preceded by a '+' or '-', this user is added to or removed from the list of users respectively.
group [ + | - ] { name | any } [nodes]
Grant exclusive access on the selected node(s) to the special group name or to any group. If name is preceded by a '+' or '-', this group is added to or removed from the list of groups respectively.
maxproc { num | any } [nodes]
Limit the number of running ParaStation MPI processes on the selected node(s) to num or remove the limit.
selecttime time [nodes]
Set the timeout of the central select(2) of the ParaStation MPI daemon psid(8) to time seconds on the selected node(s).
This parameter can be set persistently via the SelectTime option within the ParaStation MPI configuration file parastation.conf(5).
master id [nodes]
Give the ParaStation daemons a hint concerning the master node. This will actually trigger the daemon to connect to the node with ParaStation MPI ID id.
psiddebug mask [nodes]
Set the debugging mask of the ParaStation MPI daemon psid(8) to mask on the selected node(s).
mask is the bit-wise disjunction of the following bit patterns:
Table 2. psid debug flags
Pattern | Name | Description |
---|---|---|
0x0000001 | PSC_LOG_PART | Partitioning functions (i.e. PSpart_()) |
0x0000002 | PSC_LOG_TASK | Task structure handling (i.e. PStask_()) |
0x0000004 | PSC_LOG_VERB | Various, less interesting messages |
0x0000010 | PSID_LOG_SIGNAL | Signal handling |
0x0000020 | PSID_LOG_TIMER | Timer stuff |
0x0000040 | PSID_LOG_HW | Hardware stuff |
0x0000080 | PSID_LOG_RESET | Messages concerning (partial) resets |
0x0000100 | PSID_LOG_STATUS | Status determination |
0x0000200 | PSID_LOG_CLIENT | Client handling |
0x0000400 | PSID_LOG_SPAWN | Spawning clients |
0x0000800 | PSID_LOG_TASK | PStask_cleanup() call etc. |
0x0001000 | PSID_LOG_RDP | RDP messages |
0x0002000 | PSID_LOG_MCAST | Multicast messages |
0x0004000 | PSID_LOG_VERB | Higher verbosity (function call, etc.) |
0x0008000 | PSID_LOG_SIGDBG | More verbose signaling stuff |
0x0010000 | PSID_LOG_COMM | General daemon communication |
0x0020000 | PSID_LOG_OPTION | Option handling |
0x0040000 | PSID_LOG_INFO | Handling of info request messages |
0x0080000 | PSID_LOG_PART | Partition creation and management |
0x0100000 | PSID_LOG_ECHO | Echo each line to parse |
0x0200000 | PSID_LOG_FILE | Logs concerning the file to parse |
0x0400000 | PSID_LOG_CMNT | Comment handling |
0x0800000 | PSID_LOG_NODE | Info concerning each node |
0x1000000 | PSID_LOG_RES | Info on various resource to define |
0x2000000 | PSID_LOG_VERB | More verbose stuff |
This parameter can be set persistently via the LogMask option within the ParaStation MPI configuration file parastation.conf(5).
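As an illustration of the bit-wise disjunction, the following Python sketch ORs together three patterns from Table 2 and formats a set psiddebug directive. The helper names are hypothetical. The mask is emitted in decimal, since this page does not state whether hexadecimal masks are accepted.

```python
# Bit patterns copied from Table 2.
PSID_LOG_SIGNAL = 0x0000010
PSID_LOG_SPAWN = 0x0000400
PSID_LOG_RDP = 0x0001000


def combine_flags(flags):
    """Return the bit-wise disjunction (OR) of the given bit patterns."""
    mask = 0
    for flag in flags:
        mask |= flag
    return mask


def psiddebug_directive(flags, nodes=""):
    """Format a 'set psiddebug' directive; nodes is optional."""
    return f"set psiddebug {combine_flags(flags)} {nodes}".strip()
```

Combining signal handling, spawning and RDP messages gives 0x10 | 0x400 | 0x1000 = 0x1410, i.e. 5136 in decimal.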
rdpdebug mask [nodes]
Set the debugging mask of the RDP protocol within the ParaStation MPI daemon psid(8) to mask on the selected node(s).
Unless you want to debug the RDP protocol (i.e. the secure protocol used by the daemons to talk to each other) this parameter is not really useful.
mask is the bit-wise disjunction of the following bit patterns:
Table 3. RDP debug flags
Pattern | Name | Description |
---|---|---|
0x0001 | RDP_LOG_CONN | Uncritical errors on connection loss |
0x0002 | RDP_LOG_INIT | Info from initialization (IP, FE, NFTS etc.) |
0x0004 | RDP_LOG_INTR | Interrupted syscalls |
0x0008 | RDP_LOG_DROP | Message dropping and resequencing |
0x0010 | RDP_LOG_CNTR | Control messages and state changes |
0x0020 | RDP_LOG_EXTD | Extended reliable error messages (on linux) |
0x0040 | RDP_LOG_COMM | Sending and receiving of data (huge! amount) |
0x0080 | RDP_LOG_ACKS | Resending and acknowledging (huge! amount) |
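Going the other way, a mask reported by show rdpdebug can be decoded back into flag names. The Python sketch below uses the patterns from Table 3. The function name `decode_rdp_mask` is hypothetical, and bits not listed in the table are simply ignored.

```python
# Bit patterns and names copied from Table 3.
RDP_FLAGS = {
    0x0001: "RDP_LOG_CONN",
    0x0002: "RDP_LOG_INIT",
    0x0004: "RDP_LOG_INTR",
    0x0008: "RDP_LOG_DROP",
    0x0010: "RDP_LOG_CNTR",
    0x0020: "RDP_LOG_EXTD",
    0x0040: "RDP_LOG_COMM",
    0x0080: "RDP_LOG_ACKS",
}


def decode_rdp_mask(mask):
    """Return the names of all Table 3 flags set in mask, lowest bit first."""
    return [name for bit, name in sorted(RDP_FLAGS.items()) if mask & bit]
```

For instance, the mask 0x0009 decodes to RDP_LOG_CONN and RDP_LOG_DROP.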
mcastdebug mask [nodes]
Set the debugging mask of the MCast protocol within the ParaStation MPI daemon psid(8) to mask on the selected node(s).
Unless you want to debug the MCast protocol (i.e. the protocol used by the daemons to ping alive-messages to each other) this parameter is not really useful.
mask is the bit-wise disjunction of the following bit patterns:
Table 4. Multicast debug flags
Pattern | Name | Description |
---|---|---|
0x0001 | MCAST_LOG_INIT | Info from initialization (IP etc.) |
0x0002 | MCAST_LOG_INTR | Interrupted syscalls |
0x0004 | MCAST_LOG_CONN | T_CLOSE and new pings |
0x0008 | MCAST_LOG_5MIS | Every 5th missing ping |
0x0010 | MCAST_LOG_MSNG | Every missing ping |
0x0020 | MCAST_LOG_MSNG | Every received ping |
0x0040 | MCAST_LOG_SENT | Every sent ping |
freeOnSuspend [ 0 | 1 ] [nodes]
Switch the freeOnSuspend flag on or off on the selected nodes.
The freeOnSuspend flag steers the behavior of the resource management concerning suspended jobs. Basically there are two possible approaches: either the resources used by the suspended job are freed for other jobs (this is done if the flag is set to 1), or they are kept occupied in order to preserve them exclusively for the time the job continues to run (this is the behavior as long as the flag has the value 0).
Since the master node does all the resource management within the cluster, only the value on this node actually steers the behavior.
This flag can be set persistently via the freeOnSuspend option within the ParaStation MPI configuration file parastation.conf(5).
handleOldBins [ 0 | 1 ] [nodes]
Switch the compatibility flag for applications linked against version 4.0.x of ParaStation MPI on or off on the selected nodes.
nodesSort { PROC | LOAD_1 | LOAD_5 | LOAD_15 | PROC+LOAD | NONE } [nodes]
Define the default sorting strategy for nodes when attaching them to a partition. The different possible values have the following meaning:
PROC
Sort by the number of processes managed by ParaStation MPI on the corresponding nodes
LOAD_1
Sort by the load average during the last minute on the corresponding nodes
LOAD_5
Sort by the load average during the last 5 minutes on the corresponding nodes
LOAD_15
Sort by the load average during the last 15 minutes on the corresponding nodes
PROC+LOAD
Sort conforming to the sum of the processes managed by ParaStation MPI and the load average during the last minute on the corresponding nodes
NONE
Do not sort at all.
This only comes into play if the user does not define a sorting strategy explicitly via PSI_NODES_SORT. Be aware that using a batch system like PBS or LSF *will* set the strategy explicitly, namely to NONE.
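The sorting strategies can be pictured as different sort keys over per-node statistics. The Python sketch below is only a model: the `NodeInfo` record and `sort_nodes` function are hypothetical, and ascending order (least loaded node first) is an assumption that this page does not confirm.

```python
from dataclasses import dataclass


@dataclass
class NodeInfo:
    """Hypothetical per-node statistics used for sorting."""
    node_id: int
    procs: int       # processes managed by ParaStation MPI
    load_1: float    # load average over the last minute
    load_5: float    # load average over the last 5 minutes
    load_15: float   # load average over the last 15 minutes


SORT_KEYS = {
    "PROC": lambda n: n.procs,
    "LOAD_1": lambda n: n.load_1,
    "LOAD_5": lambda n: n.load_5,
    "LOAD_15": lambda n: n.load_15,
    "PROC+LOAD": lambda n: n.procs + n.load_1,
}


def sort_nodes(nodes, strategy):
    """Order nodes by the given strategy; NONE keeps the given order."""
    if strategy == "NONE":
        return list(nodes)
    # Assumption: smaller values come first.
    return sorted(nodes, key=SORT_KEYS[strategy])
```

Under this model, a node with few processes but a high one-minute load can rank differently under PROC than under PROC+LOAD.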
overbook [ 0 | 1 ] [nodes]
Define if these nodes shall be overbooked upon user request (if the flag is true) or if overbooking should be denied altogether (false).
starter [ 0 | 1 ] [nodes]
Define if starting jobs from these nodes should be allowed (flag is true) or denied (false).
runjobs [ 0 | 1 ] [nodes]
Define if running tasks on these nodes should be allowed (flag is true) or denied (false).
rdpmaxretrans val [nodes]
Set RDP protocol's maximum retransmission count.
exclusive [ 0 | 1 ] [nodes]
Set the flag marking whether these nodes can be requested exclusively by users to bool. Relevant values are 'false', 'true', 'no', 'yes', 0 or different from 0.
pinprocs [ 0 | 1 ] [nodes]
Set the flag marking whether these nodes use process pinning to bind processes to cores. Relevant values are 'false', 'true', 'no', 'yes', 0 or different from 0.
bindmem [ 0 | 1 ] [nodes]
Set the flag marking whether these nodes use memory binding as NUMA policy. Relevant values are 'false', 'true', 'no', 'yes', 0 or different from 0.
cpumap map [nodes]
Set the map used to assign CPU-slots to physical cores to map. map is a quoted string containing a space-separated permutation of the numbers 0 to Ncore-1. Here Ncore is the number of physical cores available on this node. The number of cores within a distinct node may be determined via 'list hw'. The first number in map is the number of the physical core the first CPU-slot will be mapped to, and so on.
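A map string is valid only if it is a permutation of 0 to Ncore-1. The following Python sketch checks that property before a map is sent to the daemon. The function name `parse_cpumap` is hypothetical, and the check reflects only the rule stated above, not psid's own validation.

```python
def parse_cpumap(map_string, ncore):
    """Parse a cpumap string; index = CPU-slot, value = physical core."""
    cores = [int(token) for token in map_string.split()]
    # A permutation contains every core number exactly once.
    if sorted(cores) != list(range(ncore)):
        raise ValueError(f"{map_string!r} is not a permutation of 0..{ncore - 1}")
    return cores
```

With four cores, the map "0 2 1 3" assigns CPU-slot 1 to physical core 2 and CPU-slot 2 to physical core 1, while "0 0 1 2" is rejected.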
allowUserMap [ 0 | 1 ] [nodes]
Set the flag marking whether these nodes allow users to influence the mapping of processes to physical cores. Relevant values are 'false', 'true', 'no', 'yes', 0 or different from 0.
supplementaryGroups [ 0 | 1 ] [nodes]
The supplementaryGroups flag defines whether a spawned process should belong to all groups defined for this user (true) or only to the primary group (false). Relevant values are 'false', 'true', 'no', 'yes', 0 or different from 0.
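Several of the flags above (exclusive, pinprocs, bindmem, allowUserMap, supplementaryGroups) share the same set of boolean values. The Python sketch below shows one way to interpret them. The name `parse_flag` is hypothetical, and treating the keywords case-insensitively is an assumption not stated on this page.

```python
def parse_flag(value):
    """Interpret 'true'/'yes', 'false'/'no', or a number (0 is false)."""
    text = value.strip().lower()     # assumption: keywords are case-insensitive
    if text in ("true", "yes"):
        return True
    if text in ("false", "no"):
        return False
    return int(text, 0) != 0         # any number different from 0 counts as true
```

Under this reading, "yes" and "2" both enable a flag, while "no" and "0" disable it.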
maxStatTry num [nodes]
Set the maximum number of tries to stat() an executable while spawning new processes to num. All numbers larger than 0 are allowed.
statusBroadcasts [ num ] [nodes]
Set the maximum number of status broadcasts initiated by lost connections to other daemons. See also parastation.conf(5).
rdpTimeout [ ms ] [nodes]
Set the RDP timeout in milliseconds for all selected nodes. See also parastation.conf(5).
deadLimit [ num ] [nodes]
Set the dead-limit of the RDP status module. After this number of consecutively missing RDP pings, the master declares the node to be dead. Only relevant if MCast is *not* used. See also parastation.conf(5).
statusTimeout [ ms ] [nodes]
Set the timeout of the RDP status module. After this number of milliseconds, an RDP ping is sent to the master daemon. Additionally, the master daemon checks for received ping messages. Only relevant if MCast is *not* used. See also parastation.conf(5).
rdpClosedTimeout [ ms ] [nodes]
Set the RDP closed timeout of the RDP status module. See also parastation.conf(5).
rdpResendTimeout [ ms ] [nodes]
Set the RDP resend timeout of the RDP status module. See also parastation.conf(5).
rdpRetrans [ count ] [nodes]
Set RDP protocol's total retransmission count. Most probably you want to reset this to 0.
rdpMaxACKPend [ num ] [nodes]
Set the maximum number of pending ACKs within the RDP facility. See also parastation.conf(5).
rdpStatistics [ bool ] [nodes]
Turn collecting RDP statistics on or off.
shutdown [nodes]
Shut down the ParaStation MPI daemon on all selected node(s). As a consequence all processes using the selected node(s) are killed!
test [ quiet | normal | verbose ]
All communications links in a ParaStation MPI network are tested.
quiet
Quiet execution. Only a short message is printed if the test was successful.
normal
Normal execution with some messages during runtime. This is the default.
verbose
Very verbose execution with many messages during runtime.
environment [ set | unset ] [nodes]
Manage the ParaStation daemon environment.
set key env [nodes]
Set the environment variable key to env on the selected nodes. env may be quoted in order to include whitespace characters.
unset key [nodes]
Unset the environment variable key on the selected nodes.
Upon startup, psiadmin tries to find .psiadminrc in the current directory or in the user's home directory. The first file found is parsed and the directives within are executed. Afterwards psiadmin goes into interactive mode unless the -f option is used.
This file might be used to set some default ranges whenever psiadmin is invoked.
The startup file is ignored if the option -c is used.