Discussion:
SGE_Hadoop_Error
adarsh
2010-11-25 12:23:10 UTC
Hi all,

I am running into errors while configuring SGE with Hadoop.

I configured it successfully on 4 nodes.
But the other day, when I tried to configure it on 4 new nodes, I ran into some issues.

My sge_hadoop1.log says:

11/25/2010 17:44:55| main|ws34-rak-lin|W|[load_sensor 5997] fflush failed [Broken pipe]
11/25/2010 17:44:56| main|ws34-rak-lin|W|load sensor exited with exit status = 127
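For anyone triaging similar messages, the two lines above appear to be pipe-separated fields: timestamp, component, host, severity letter, and message. A minimal sketch for splitting them, assuming that layout holds for the rest of the file (the `LogEntry` type and `parse_line` helper are hypothetical names, not part of SGE):

```python
from datetime import datetime
from typing import NamedTuple


class LogEntry(NamedTuple):
    timestamp: datetime
    component: str
    host: str
    level: str
    message: str


def parse_line(line: str) -> LogEntry:
    """Split one log line of the form 'date time|component|host|level|message'."""
    # Split on the first four pipes only, so a "|" inside the message survives.
    stamp, component, host, level, message = line.split("|", 4)
    return LogEntry(
        timestamp=datetime.strptime(stamp.strip(), "%m/%d/%Y %H:%M:%S"),
        component=component.strip(),
        host=host.strip(),
        level=level.strip(),
        message=message.strip(),
    )


if __name__ == "__main__":
    entry = parse_line(
        "11/25/2010 17:44:56| main|ws34-rak-lin|W|"
        "load sensor exited with exit status = 127"
    )
    print(entry.host, entry.level, entry.message)
```

Filtering the parsed entries by `host` makes it easy to see which nodes report the load sensor failure and which report nothing at all.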

There are no logs on the other nodes.
Please help.

Thanks & Regards
Adarsh Sharma

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=298690

To unsubscribe from this discussion, e-mail: [users-***@gridengine.sunsource.net].
templedf
2010-11-29 01:00:19 UTC
First, what version of Hadoop are you using?

With just that little bit of information, it sounds like maybe the execd
went down, causing the load sensor to lose its output stream.

Is this a recurring problem? If the load sensor fails, the execd should
just restart it. When that happens, is it simply failing again?
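One more thing that may be worth ruling out on the new nodes: by shell convention, exit status 127 means a command was not found (126 means it was found but could not be executed). That would fit a load sensor script that is missing on those hosts, has a bad interpreter line, or calls a program that is not on the PATH there. A minimal sketch of such a check follows; `diagnose_script` is a hypothetical helper, and the path you pass in is whatever your own configuration names as the load sensor.

```python
import os


def diagnose_script(path: str) -> str:
    """Return a short verdict on whether the script at `path` can be started."""
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.X_OK):
        return "not executable"
    # Read the first line to see which interpreter the script asks for.
    with open(path, "rb") as fh:
        first = fh.readline().decode(errors="replace").strip()
    if first.startswith("#!"):
        parts = first[2:].split()
        if parts and not os.path.exists(parts[0]):
            return "interpreter not found: " + parts[0]
    return "ok"


if __name__ == "__main__":
    import sys

    # Usage: python check_sensor.py /path/to/your/load_sensor_script
    for candidate in sys.argv[1:]:
        print(candidate, "->", diagnose_script(candidate))
```

Run it on each new node against the path listed under `load_sensor` in the output of `qconf -sconf` (or the host-specific configuration). An "ok" verdict does not clear the script entirely, since a command it calls internally could still be missing from the PATH on that host.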

Daniel
------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=300051
