Chapter 12. Troubleshooting

Table of Contents

Problem: GridMonitor GUI shows no values for temperatures or fan speeds
Problem: negative CPU temperatures shown
Problem: history charts report errors
Problem: GridMonitor GUI shows no batch jobs
Problem: empty Select Queue menu
Problem: no ParaStation job list shown

This chapter provides hints for resolving problems encountered while installing or using the ParaStation GridMonitor. For additional help, please contact .

Problem: GridMonitor GUI shows no values for temperatures or fan speeds

Description: when opening node-specific pages, some or all values for node temperatures or fan speeds are missing.

Solution: when using the lmsensors package to read sensor data, check the output of the sensors command on each node. The lmsensors package must be installed and configured so that it returns correct values for the fan and temperature parameters.

The output of sensors should look like this:

    VCore 1:   +1.42 V  (min =  +1.42 V, max =  +1.57 V)
    VCore 2:   +3.31 V  (min =  +1.42 V, max =  +1.57 V)
    +3.3V:     +3.26 V  (min =  +3.14 V, max =  +3.47 V)
    +5V:       +4.95 V  (min =  +4.76 V, max =  +5.24 V)
    +12V:     +12.04 V  (min = +10.82 V, max = +13.19 V)
    -12V:     -11.46 V  (min = -13.18 V, max = -10.80 V)
    -5V:       -2.13 V  (min =  -5.25 V, max =  -4.75 V)
    V5SB:      +5.38 V  (min =  +4.76 V, max =  +5.24 V)
    VBat:      +3.84 V  (min =  +2.40 V, max =  +3.60 V)
    fan1:     5314 RPM  (min = 2848 RPM, div = 2)
    fan2:     5075 RPM  (min = 2848 RPM, div = 2)
    fan3:     3409 RPM  (min = 1424 RPM, div = 4)
    temp1:       +45C  (high =   +65C, hyst =   +60C)
    temp2:     +25.0C  (high =   +85C, hyst =   +80C)
    temp3:     +24.5C  (high =   +85C, hyst =   +80C)
      

Example 12.1. Sample sensors output


If your system does not support at least 2 fans (fan1, fan2) and 3 temperatures (temp1, temp2, temp3), the missing values are reported as '0'. This may lead to unexpected events and alarms.
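The check described above can be scripted. The following is a minimal sketch, not part of GridMonitor: it assumes the sensors output has the "label: value" layout shown in Example 12.1, and the names parse_sensors, missing_sensors and REQUIRED are hypothetical helpers introduced only for illustration.

```python
import re

# Labels named in the text above; adjust the tuple for your hardware.
REQUIRED = ("fan1", "fan2", "temp1", "temp2", "temp3")

# Matches lines such as "fan1:     5314 RPM  (min = ...)" or "temp1:   +45C  (...)".
# Lines without a numeric reading after the colon are ignored.
LINE_RE = re.compile(r"^\s*([^:]+):\s+([+-]?\d+(?:\.\d+)?)")


def parse_sensors(text):
    """Return a dict mapping each sensor label to its first numeric reading."""
    readings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            readings[match.group(1).strip()] = float(match.group(2))
    return readings


def missing_sensors(text, required=REQUIRED):
    """Return the required labels that do not appear in the sensors output."""
    readings = parse_sensors(text)
    return [label for label in required if label not in readings]


# Two lines taken from Example 12.1; a real check would use the full output.
sample = (
    "fan1:     5314 RPM  (min = 2848 RPM, div = 2)\n"
    "temp1:       +45C  (high =   +65C, hyst =   +60C)\n"
)
print(missing_sensors(sample))
```

Running the sensors command on each node and passing its output to missing_sensors would list the labels that are likely to be reported as '0'.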

When using IPMI to read sensor data, check that the ipmitool command returns proper sensor data:

    master: # ipmitool -I lan -H node01-bmc -U root -P root sdr list
    Temp             | -58 degrees C     | ok
    Temp             | -56 degrees C     | ok
    Temp             | 40 degrees C      | ok
    Temp             | 40 degrees C      | ok
    Ambient Temp     | 20 degrees C      | ok
    CMOS Battery     | 0x00              | ok
    …
      

Example 12.2. Sample IPMI output
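To inspect such output on many nodes, it can help to extract the temperature readings programmatically. The sketch below is an illustration only and assumes the "name | reading | status" layout shown in Example 12.2; parse_sdr_list and temperature_readings are hypothetical helper names, not part of ipmitool or GridMonitor.

```python
import re

# Matches readings such as "-58 degrees C" or "20 degrees C".
TEMP_RE = re.compile(r"^(-?\d+(?:\.\d+)?) degrees C$")


def parse_sdr_list(text):
    """Split 'name | reading | status' lines into tuples; skip all other lines."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) == 3:
            rows.append(tuple(fields))
    return rows


def temperature_readings(rows):
    """Return (name, value) pairs for rows whose reading is in degrees C."""
    result = []
    for name, reading, _status in rows:
        match = TEMP_RE.match(reading)
        if match:
            result.append((name, float(match.group(1))))
    return result


# Lines taken from Example 12.2.
sample = (
    "Temp             | -58 degrees C     | ok\n"
    "Ambient Temp     | 20 degrees C      | ok\n"
    "CMOS Battery     | 0x00              | ok\n"
)
temps = temperature_readings(parse_sdr_list(sample))
print(temps)
# Negative values, such as the first reading here, may deserve a closer look.
print([name for name, value in temps if value < 0])
```

Because several sensors can share the same name (here "Temp"), the position of each row in the list may be needed to tell them apart.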


Also check that the real sensors are mapped properly to virtual sensors; see the section called “Configuring the collector – step 6” for details.