Hi,
Looks like the issue is fixed; I haven't seen the error for the last 2 days.
These were the findings:
We have 2 domain controllers: 1 primary and 1 backup (secondary).
The nodes were at times trying to resolve the secondary, couldn't reach it, and gave the error.
I've now added the secondary domain controller to /etc/ldap.conf on the compute nodes, and so far the issue has not recurred.
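A minimal sketch of how the server list on a node could be checked, for anyone hitting the same problem. It assumes an nss_ldap-style /etc/ldap.conf in which servers appear as space-separated values on "host" or "uri" lines, and default ports 389 (ldap) and 636 (ldaps). The function names are made up for illustration and are not part of SGE or any LDAP tool.

```python
import socket
from urllib.parse import urlparse


def ldap_servers(conf_text):
    """Return (host, port) pairs found on 'host' and 'uri' lines of ldap.conf text.

    Assumption: nss_ldap-style syntax with space-separated server values.
    IPv6 literals on 'host' lines are not handled in this sketch.
    """
    servers = []
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        key, values = fields[0].lower(), fields[1:]
        if key == "host":
            for value in values:
                name, _, port = value.partition(":")
                servers.append((name, int(port) if port else 389))
        elif key == "uri":
            for value in values:
                parsed = urlparse(value)
                default_port = 636 if parsed.scheme == "ldaps" else 389
                if parsed.hostname:
                    servers.append((parsed.hostname, parsed.port or default_port))
    return servers


def reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_conf(path="/etc/ldap.conf"):
    """Print one line per configured server saying whether it accepts connections."""
    with open(path) as handle:
        for host, port in ldap_servers(handle.read()):
            state = "ok" if reachable(host, port) else "UNREACHABLE"
            print(f"{host}:{port} {state}")
```

Calling check_conf() on a compute node would list every configured controller. A node whose file names only one controller, or names one it cannot reach, would stand out.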
Keeping my fingers crossed.
Thanks for your help Reuti.
--- On Wed, 15/12/10, llikethat <***@yahoo.com> wrote:
From: llikethat <***@yahoo.com>
Subject: Re: [GE users] can't get password entry for user "www"
To: ***@gridengine.sunsource.net
Date: Wednesday, 15 December, 2010, 9:19 PM
Hi,
Are the nodes also Linux, or Windows?
The nodes are all on CentOS Linux.
Is it only this user? At least in NIS you can configure the UID from which NIS delivers users to the clients; for lower UIDs the local users are taken. Maybe it's similar in ADS.
It is not the only user. We are using a web interface to submit jobs, which uses www as the user. But at times the same error appears for the sgeadmin user too. I have not configured the users (both www and sgeadmin) locally. I used to do this when no authentication mechanism was set up on the compute nodes. Now we have an ADS, and the same ADS users are used on the master and the compute nodes via LDAP.
I think it's not bound to some particular nodes, but depends on the load of the ADS server, or what they deliver. I saw this only once, when the NIS server in our case couldn't be contacted.
Yes, that's right, it is not bound to any particular node. The error occurs randomly. The only thing that comes to my mind is the load. But would 72 nodes with 4 slots each (288 requests) really create such a huge load?
I'm now looking at the load on the ADS and will get back if I find an answer.
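Because the failures are intermittent, a single manual check on a node can easily miss them. A rough sketch of a probe that could be left running on a compute node: it looks up the accounts repeatedly through the system passwd database (Python's pwd.getpwnam, which uses whatever sources the node is configured with) and counts failed lookups. It assumes the SGE message corresponds to a failed passwd lookup. The function name, attempt count and delay are arbitrary choices for illustration.

```python
import pwd
import time


def count_lookup_failures(users, attempts=100, delay=0.5):
    """Look up each user `attempts` times; return {user: number_of_failed_lookups}."""
    failures = {user: 0 for user in users}
    for attempt in range(attempts):
        for user in users:
            try:
                pwd.getpwnam(user)  # raises KeyError when no entry is returned
            except KeyError:
                failures[user] += 1
        if delay and attempt < attempts - 1:
            time.sleep(delay)  # spread the lookups out over time
    return failures
```

For example, count_lookup_failures(["www", "sgeadmin"]) returns a failure count per account. If a caching daemon such as nscd is running on the node, cached answers may hide some failures.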
Thanks,
--- On Tue, 14/12/10, reuti <***@staff.uni-marburg.de> wrote:
From: reuti <***@staff.uni-marburg.de>
Subject: Re: [GE users] can't get password entry for user "www"
To: ***@gridengine.sunsource.net
Date: Tuesday, 14 December, 2010, 7:14 PM
Hi,
Post by llikethat
I'm using SGE 6.2u5 on CentOS Linux. The authentication is done using ADS. There are 70 nodes in total, running with proper configuration. The configuration for the nodes is defined locally. The spool directory is also local.
Are the nodes also Linux, or Windows?
Is it only this user? At least in NIS you can configure the UID from which NIS delivers users to the clients; for lower UIDs the local users are taken. Maybe it's similar in ADS.
Post by llikethat
But some of the nodes give the following error:
can't get password entry for user "www"
and the nodes are marked in Error state. When I check manually whether the nodes can get the list of users from the ADS, they seem to be fine (I get all the users, including www, from the domain).
I think it's not bound to some particular nodes, but depends on the load of the ADS server, or what they deliver. I saw this only once, when the NIS server in our case couldn't be contacted.
Yes
Post by llikethat
It's not only this: the local configuration also gets lost. If I do a qconf -sconf <hostname>, it says the configuration is not defined.
Any suggestions will be very helpful.
-- Reuti
------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=305454
To unsubscribe from this discussion, e-mail: [users-***@gridengine.sunsource.net].
------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=307695