Warning: PSI barrier timeout

The warning

  PSIlogger: Timeout: Not all clients joined the first pmi barrier: joined=337 left=175 round=1

is output during job startup.

This is only a warning from the PMI protocol, not an error. It reports that after the first timeout round (60 seconds plus 500 usec per process; with 337 + 175 = 512 processes this is 60.256 sec) not all processes had joined the first PMI barrier: 337 had joined and 175 were still missing. Since no further warnings followed, the remaining 175 processes joined during the second timeout period. A slow startup like this might be caused by network or file system problems.
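
The arithmetic behind the timeout can be checked with a short script. This is only a sketch: it assumes the formula quoted above (60 seconds plus 500 usec per process) and the message layout of the warning shown earlier. The function names parse_warning and first_round_timeout are made up for illustration and are not part of ParaStation.

```python
import re

BASE_TIMEOUT_SEC = 60.0   # base timeout quoted in the text above
PER_PROCESS_USEC = 500    # extra time per process quoted in the text above


def parse_warning(line):
    """Extract (joined, left, round) from a PSIlogger barrier warning line.

    Hypothetical helper; assumes the 'joined=N left=N round=N' layout
    seen in the warning above.
    """
    match = re.search(r"joined=(\d+) left=(\d+) round=(\d+)", line)
    if match is None:
        raise ValueError("not a PMI barrier timeout warning")
    joined, left, rnd = (int(value) for value in match.groups())
    return joined, left, rnd


def first_round_timeout(num_procs):
    """Return the first-round timeout in seconds for num_procs processes."""
    return BASE_TIMEOUT_SEC + num_procs * PER_PROCESS_USEC / 1_000_000


line = ("PSIlogger: Timeout: Not all clients joined the first pmi barrier: "
        "joined=337 left=175 round=1")
joined, left, rnd = parse_warning(line)
total = joined + left                  # processes already in the barrier plus those still missing
timeout = first_round_timeout(total)
print(f"{total} processes, first-round timeout {timeout:.3f} sec")
```

For the warning above this gives 512 processes and a first-round timeout of 60.256 sec, matching the figure in the explanation.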

See also the environment variables PMI_BARRIER_TMOUT and PMI_BARRIER_ROUNDS in ps_environment(5).