ParaStation GridMonitor Administrator's Guide

Release 1.9.2-2

September 2010

Reproduction in any manner whatsoever without the written permission of ParTec Cluster Competence Center GmbH is strictly forbidden.

All rights reserved. ParTec and ParaStation are registered trademarks of ParTec Cluster Competence Center GmbH. The ParTec logo, the ParaStation logo and the ParaStation GridMonitor logo are trademarks of ParTec Cluster Competence Center GmbH. Linux is a registered trademark of Linus Torvalds. All other marks and names mentioned herein may be trademarks of their respective companies.

This document provides detailed information about the ParaStation GridMonitor. It explains in depth how to install and configure the ParaStation GridMonitor and how to use its commands and graphical user interface.

Though it may seem hard to believe, this manual might contain errors. We welcome any reports on errors or problems that are found. We also would appreciate suggestions on improving this book. Please direct all comments and problems to .

The most up-to-date version of this document is available at http://docs.par-tec.com.

 

Share your knowledge with others. It's a way to achieve immortality.

 --Dalai Lama


Table of Contents

1. Preface
About this book
This book's audience
ParaStation GridMonitor overview
I. Setting up the GridMonitor
2. Introduction
What is the ParaStation GridMonitor?
Data collection process: collector
Graphical client (GridMonitor GUI)
3. Installation
Installation prerequisites
Installing the collector
Installing the GridMonitor GUI
Installing the documentation
Uninstalling the GridMonitor
4. Configuration
Default configuration
Configuring the collector – step by step
Configuring the collector – basics
Configuring the collector – step 1
Configuring the collector – step 2
Configuring the collector – step 3
Configuring the collector – step 4
Configuring the collector – step 5
Configuring the collector – step 6
Configuring the collector – step 7
Configuring the collector – step 8
Configuring the collector – step 9
Configuring the collector – step 10
Configuring the collector – additional steps
License
Configuring the collector – enable pscd auto-login
Configuring the collector – enable local IPMI access
Configuring the collector – enable disk monitoring
Configuring the graphical user interface (GridMonitor GUI)
Configuring basic GridMonitor GUI parameters
Configuring GridMonitor GUI physical view
Configuring GridMonitor GUI default values
Configuring cluster pictures within the GridMonitor GUI
5. Maintenance
Parameter database
Event database
Logfile
II. Using the GridMonitor graphical user interface
6. GridMonitor GUI: Navigation
General hints
Topbar and left hand navigation area
View configuration
Diagrams
Links to more details
7. GridMonitor GUI: Overview page
8. GridMonitor GUI: Cluster pages
GridMonitor GUI: Cluster overview page
GridMonitor GUI: Cluster physical view page
GridMonitor GUI: Cluster events page
GridMonitor GUI: ParaStation jobs page
GridMonitor GUI: Batch queuing system information
9. GridMonitor GUI: Node pages
GridMonitor GUI: Node overview page
GridMonitor GUI: Node sensors page
GridMonitor GUI: Node common page
GridMonitor GUI: Node memory page
GridMonitor GUI: Node process list page
GridMonitor GUI: Node network page
GridMonitor GUI: Node mount page
GridMonitor GUI: Node disk space page
GridMonitor GUI: Node ParaStation counters page
GridMonitor GUI: Node Infinipath counters and statistics page
GridMonitor GUI: Node kernel modules page
10. GridMonitor GUI: SNMP (switch) pages
GridMonitor GUI: Switch Overview page
GridMonitor GUI: Switch LinkLayer page
GridMonitor GUI: Switch Info page
GridMonitor GUI: Switch System page
11. GridMonitor GUI: Parameter browser page
GridMonitor GUI: Parameter browser select boxes
GridMonitor GUI: Parameter browser sorting
GridMonitor GUI: Parameter browser diagrams
III. Additional information
12. Troubleshooting
Problem: GridMonitor GUI shows no values for temperatures or fan speeds
Problem: negative CPU temperatures shown
Problem: history charts report errors
Problem: GridMonitor GUI shows no batch jobs
Problem: empty Select Queue menu
Problem: no ParaStation job list shown
I. Reference Pages
pscollect — the ParaStation GridMonitor data collecting process.
psvalue — the ParaStation GridMonitor default agent for compute nodes
psget — retrieve data from the collecting process.
Glossary

List of Figures

2.1. Example GridMonitor GUI picture
6.1. Example left hand navigation area
6.2. Example topbar
6.3. Example view configuration area
6.4. Example icon area
6.5. Example diagram
8.1. Cluster overview
8.2. Cluster physical overview
8.3. Cluster events list
8.4. Batch system jobs list
9.1. Node overview
9.2. Node sensors
10.1. Switch overview
11.1. Parameter browser
15. Loadbar