Wednesday, May 29, 2013

Install Jobmonarch 1.1 with Ganglia 3.6.0

Well, I stayed with Ganglia 3.0.7 for a long time because the Jobmonarch plugin did not work with later versions.
That has changed: the plugin was updated to version 1.0 (on 2013/4/12) and now supports Ganglia 3.4.0 and above.
The following procedure was done on a CentOS 6.4 VM with Torque 3.6.0 already installed.


0.      Obtain files
The updated Jobmonarch requires at least Ganglia 3.4.0 and GangliaWeb 3.5.0.
Ganglia can be found at http://ganglia.info/
The latest version (as of 2013/5/28) is 3.6.0:
Latest GangliaWeb is 3.5.8:
Latest Jobmonarch is version 1.1, which can be found here:
Jobmonarch needs pbs_python to link with Torque. The latest version is 4.3.5:
Ganglia installation on CentOS 6 needs libconfuse and rrdtool, which can be installed (the lazy way, lol) from the EPEL repository.

1.      Install EPEL
rpm -ivh epel-release-6-8.noarch.rpm
2.      Install Ganglia client
Install dependency packages first:
yum install gcc apr-devel libconfuse-devel memcached-devel expat-devel pcre-devel zlib-devel make
Extract the source file:
tar xvzf ganglia-3.6.0.tar.gz
cd ganglia-3.6.0
./configure --sysconfdir=/etc --prefix=/usr
make
make install
cp gmond/gmond.init /etc/init.d/gmond
chkconfig --add gmond
gmond -t > /etc/gmond.conf
gmond.conf is mostly in the same format as in earlier versions; just set the host address in it. The last step:
service gmond start
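A quick way to confirm that gmond is up is to read its XML dump from the TCP port. The sketch below assumes gmond listens on the default port 8649 (the port used later in this post) and that nc is installed; the helper name looks_like_gmond_xml is made up for illustration.

```shell
# Hypothetical helper: succeeds if the text on stdin contains the
# <GANGLIA_XML root element that gmond puts in its XML dump.
looks_like_gmond_xml() {
    grep -q '<GANGLIA_XML'
}

# Usage (assumption: nc is installed, gmond listens on TCP port 8649):
# nc localhost 8649 | looks_like_gmond_xml && echo "gmond is answering"
```

If nothing comes back, check the host address in /etc/gmond.conf and the firewall first.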
3.      Install Ganglia server
Install dependency packages, plus rrdtool-devel:
yum install gcc apr-devel libconfuse-devel memcached-devel expat-devel pcre-devel zlib-devel make rrdtool-devel
tar xvzf ganglia-3.6.0.tar.gz
cd ganglia-3.6.0
./configure --sysconfdir=/etc --prefix=/usr --with-gmetad
make
make install
cp gmetad/gmetad.init /etc/init.d/gmetad
chkconfig --add gmetad
This version copies gmetad.conf into /etc during "make install", so no manual copy is needed.
The default content is fine for a single-cluster setup.
Create the RRDTool database folder:
mkdir -p /var/lib/ganglia/rrds
chown -R nobody /var/lib/ganglia/rrds
Start the service:
service gmetad start
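If gmetad starts but no graphs show up later, the RRD folder ownership is worth a look. This is only a sketch, assuming gmetad runs as nobody as the chown in this step implies; owner_of is a made-up helper name.

```shell
# Hypothetical helper: print the owner of a path, taken from the third
# column of `ls -ld` output.
owner_of() {
    ls -ld "$1" | awk '{print $3}'
}

# Usage (assumption: gmetad runs as user "nobody"):
# [ "$(owner_of /var/lib/ganglia/rrds)" = nobody ] || echo "check ownership of the rrds folder"
```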
4.      Install Ganglia Web
The web frontend is now a separate package.
Install dependency packages:
yum install httpd php php-gd rsync
tar xvzf ganglia-web-3.5.8.tar.gz
cd ganglia-web-3.5.8
make install
chkconfig httpd on
service httpd start
Ganglia should work at this point. Test it with a browser. Disable SELinux if you run into problems.
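Before disabling SELinux outright, it helps to see which mode it is in. A minimal sketch using the standard getenforce and setenforce tools; the wrapper name selinux_mode is my own.

```shell
# Print the current SELinux mode (Enforcing, Permissive or Disabled),
# or "unknown" if the getenforce tool is not installed.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo unknown
    fi
}

# Usage:
# selinux_mode
# setenforce 0   # as root: switch to permissive mode until the next reboot
```

If the web page works in permissive mode, SELinux was the culprit.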
5.      Install pbs_python
yum install python-devel
tar xvzf pbs_python-4.3.5.tar.gz
cd pbs_python-4.3.5
./configure --prefix=/usr --with-pbsdir=/usr/local/lib
make
make install
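To check that the build above really produced a usable module, try importing it. This sketch assumes pbs_python installs a Python module named pbs and that the interpreter is called python; can_import is a made-up helper.

```shell
# Hypothetical helper: succeeds if the given Python interpreter can
# import the given module.
can_import() {
    # $1 = Python interpreter, $2 = module name
    "$1" -c "import $2" >/dev/null 2>&1
}

# Usage (assumption: the module installed by pbs_python is named "pbs"):
# can_import python pbs && echo "pbs_python is importable"
```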
6.      Install Jobmonarch
Jobarchive (which saves job info in a PostgreSQL DB for later queries) is included but disabled by default. I'm keeping it that way.
tar xvjf ganglia_jobmonarch-1.1.tar.bz2
cd ganglia_jobmonarch-1.1
cp jobmond/jobmond.conf /etc
cp jobmond/jobmond.py /usr/sbin
cp pkg/rpm/init.d/jobmond /etc/init.d
cp pkg/rpm/sysconfig/jobmond /etc/sysconfig
chkconfig --add jobmond
Edit /etc/jobmond.conf, line 20:
BATCH_SERVER            : localhost
to:
BATCH_SERVER            : (FQDN of the Torque server)
Then line 55:
GMETRIC_TARGET          : 239.2.11.71:8649
to:
GMETRIC_TARGET          : (Internal IP of the Torque server):8649
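The two jobmond.conf changes can also be scripted by option name instead of line number, which may help if the line numbers differ in another release. This is only a sketch: set_jobmond_option is a made-up helper, and the host name and IP in the usage lines are placeholders, not real values.

```shell
# Hypothetical helper: rewrite the value of a "NAME : value" line in place.
set_jobmond_option() {
    # $1 = config file, $2 = option name, $3 = new value
    # Keep everything up to and including the colon, replace the rest.
    sed "s|^\($2[[:space:]]*:\).*|\1 $3|" "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&   # write back through cat to keep file permissions
        rm -f "$1.tmp"
}

# Usage (placeholder values, substitute your own):
# set_jobmond_option /etc/jobmond.conf BATCH_SERVER torque.example.com
# set_jobmond_option /etc/jobmond.conf GMETRIC_TARGET 192.168.0.1:8649
```

Make a copy of the file first, since the helper overwrites it.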
Edit /etc/init.d/jobmond, line 18:
DAEMON=/usr/sbin/jobmond
to:
DAEMON=/usr/sbin/jobmond.py
Then line 40:
killproc $DAEMON
to:
killproc -p $PIDFILE
Edit /usr/sbin/jobmond.py, line 564:
GMOND_CONF          = '/etc/ganglia/gmond.conf'
to:
GMOND_CONF          = '/etc/gmond.conf'
Save and exit. Then:
cp -a web/* /var/www/html/ganglia
chown -R apache:apache /var/www/html/ganglia/addons/job_monarch/dwoo
Edit /var/www/html/ganglia/conf_default.php, line 29:
$conf['template_name'] = "default";
to:
$conf['template_name'] = "job_monarch";
Save and exit. Then:
cd /var/www/html/ganglia/addons/job_monarch
mv ../../conf.php.in ./conf.php
Edit conf.php, change line 24:
$GANGLIA_PATH = "__GANGLIA_ROOT__";
to:
$GANGLIA_PATH = "/var/www/html/ganglia";
Done. The plugin should work now.
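With this many manual edits it is easy to miss one. The sketch below checks that a file contains an expected string; require_line is a made-up helper, and the strings in the usage lines are the ones from the steps above.

```shell
# Hypothetical helper: report whether a file contains a fixed string.
require_line() {
    # $1 = file, $2 = fixed string expected somewhere in the file
    if grep -qF -e "$2" "$1"; then
        echo "ok: $1"
    else
        echo "MISSING in $1: $2"
        return 1
    fi
}

# Usage:
# require_line /etc/init.d/jobmond 'DAEMON=/usr/sbin/jobmond.py'
# require_line /usr/sbin/jobmond.py "'/etc/gmond.conf'"
# require_line /var/www/html/ganglia/conf_default.php '"job_monarch"'
# require_line /var/www/html/ganglia/addons/job_monarch/conf.php '"/var/www/html/ganglia"'
```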

3 comments:

Mike Chen said...

Update:
Tested successfully with CentOS 5.9.
The filter uses the filter_input function, which was added in PHP 5.2.0, though.
CentOS 5.x ships PHP 5.1.6, so the filter will not work.
Simply update to PHP 5.3 with:
yum remove php*
yum install php53 php53-gd
service httpd restart

Unknown said...

Have you tried this with SGE on CentOS 6.3, with jobmonarch 1.1.2, Ganglia 3.6 and ganglia-web-5.12?
If you have done it successfully, can you post that?

Mike Chen said...

Hi,
I don't use SGE so I can't provide experience on that.
Although the SGE support in JobMonArch is listed as experimental, a minimal config should work.

Have you tried building DRMAA-python? It looks like the interface library, similar to what pbs_python is for PBS.
http://osdir.com/ml/linux.cluster.oscar.devel/2008-01/msg00105.html