adarsh
2010-11-29 05:40:08 UTC
Dear all,
I am using Hadoop-0.20.2 on 4 nodes: 1 master (qmaster) and 3 slaves (sgeexecd hosts).
QMON shows all nodes with their loads.
After running ./sgeexecd, the execution daemon is up:
***@ws30-pank-lin:~# ps aux | grep sge
sgeadmin 3673 0.2 0.0 49264 2112 ? Sl 10:41 0:00 /opt/sge-root/bin/lx24-amd64/sge_execd
root 3688 0.0 0.0 7620 888 pts/0 S+ 10:41 0:00 grep --color=auto sge
but the messages log on every execution host shows:
11/29/2010 10:08:53| main|ws36-test-lin|I|starting up GE 6.2u5 (lx24-amd64)
11/29/2010 10:20:54| main|ws36-test-lin|W|load sensor exited with exit status = 127
11/29/2010 10:21:33| main|ws36-test-lin|W|[load_sensor 4724] fflush failed [Broken pipe]
11/29/2010 10:21:34| main|ws36-test-lin|W|load sensor exited with exit status = 127
11/29/2010 10:24:13| main|ws36-test-lin|W|[load_sensor 4770] fflush failed [Broken pipe]
11/29/2010 10:27:33| main|ws36-test-lin|W|[load_sensor 4830] fflush failed [Broken pipe]
11/29/2010 10:27:34| main|ws36-test-lin|W|load sensor exited with exit status = 127
11/29/2010 10:28:13| main|ws36-test-lin|W|[load_sensor 4845] fflush failed [Broken pipe]
11/29/2010 10:28:14| main|ws36-test-lin|W|load sensor exited with exit status = 127
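For reference, exit status 127 is what a POSIX shell returns when it cannot find the command it was asked to run, so the warnings above are consistent with the load sensor script (or something it calls) not being found on the execution hosts. A minimal sketch that reproduces the status, where no_such_load_sensor_cmd is a hypothetical, deliberately nonexistent command and not part of SGE:

```shell
#!/bin/sh
# Ask a child shell to run a command that does not exist.
# no_such_load_sensor_cmd is a made-up name used only for illustration.
sh -c 'no_such_load_sensor_cmd' 2>/dev/null
status=$?

# A POSIX shell reports 127 for "command not found".
echo "exit status: $status"
```

Running the configured load sensor path by hand on one execution host and checking $? in the same way may show whether the script itself, or a program it depends on, is the missing piece.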
I followed the Hadoop troubleshooting guide for SGE,
but my qhost -F | grep hdfs command shows nothing:
[***@ws37-mah-lin lx24-amd64]# ./qhost -F | grep hdfs
[***@ws37-mah-lin lx24-amd64]#
I think my SGE is not configured properly, but the qhost command works and simple.sh runs to completion.
[***@ws37-mah-lin lx24-amd64]# ./qhost
HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS
-------------------------------------------------------------------------------
global - - - - - - -
ws34-rak-lin lx24-amd64 4 0.18 5.7G 1.3G 4.0G 0.0
ws36-test-lin lx24-amd64 4 0.08 7.7G 736.8M 15.8G 0.0
ws37-user-lin lx24-amd64 4 0.04 7.7G 407.9M 15.8G 0.0
Thanks & Regards
Adarsh Sharma
------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=300123
To unsubscribe from this discussion, e-mail: [users-***@gridengine.sunsource.net].