[gpfsug-discuss] Executing Callbacks on other Nodes

Roland Pabel dr.roland.pabel at gmail.com
Tue Apr 12 14:25:33 BST 2016


Hi Bob,

thanks for your remarks. I already understood that these deadlocks are really
timeouts rather than "tangled up balls of code". I am not (yet) planning on
changing the whole routine; I'd just like to get a notice when something
unexpected happens in the cluster. So, to start with, I just want to write
these notices into a file and email it once it reaches a certain size.
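To make this concrete, the callback script I have in mind looks roughly like
the sketch below (completely untested; the log path, the size threshold and
the recipient are placeholders, and the mail step obviously only works on
nodes that can send mail at all):

  #!/bin/bash
  # /root/bin/deadlock-callback.sh -- rough sketch, untested.
  # Registered for the local deadlockDetected event, e.g.:
  #   mmaddcallback deadlockNotice --command /root/bin/deadlock-callback.sh \
  #     --event deadlockDetected --parms "%eventName %myNode"
  # %eventName and %myNode are generic mmaddcallback variables.

  LOG=/gpfs/admin/deadlock-notices.log   # file on the shared filesystem (placeholder)
  MAXSIZE=$((64 * 1024))                 # mail once the file reaches 64 KiB (placeholder)
  RCPT=root@example.com                  # placeholder

  # Append a short notice plus the current local waiters.
  echo "$(date '+%F %T') event=$1 node=$2" >> "$LOG"
  /usr/lpp/mmfs/bin/mmdiag --waiters >> "$LOG" 2>&1

  # Mail and truncate once the file is big enough (only useful on nodes
  # that can actually send mail).
  if [ -x /usr/bin/mail ] && [ "$(stat -c %s "$LOG")" -ge "$MAXSIZE" ]; then
      /usr/bin/mail -s "GPFS deadlock notices" "$RCPT" < "$LOG" && : > "$LOG"
  fi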

From what you are saying, it sounds like it is worth upgrading to 4.1.1.x. We
are planning a maintenance window next month; I'll try to get this onto the
to-do list. Upgrading beyond that is going to require longer preparation,
unless the prerequisite of "RHEL 6.4 or later" stated in the IBM FAQ is
irrelevant. Our clients still run RHEL 6.3.

Best regards,

Roland

> Some general thoughts on “deadlocks” and automated deadlock detection.
> 
> I personally don’t like the term “deadlock” as it implies a condition that
> won’t ever resolve itself. In GPFS terms, a deadlock is really a “long RPC
> waiter” over a certain threshold. RPCs that wait on certain events can and
> do occur, and they can take some time to complete. That is not necessarily a
> problem in itself, but you should be looking into such waiters.
 
> GPFS does have automated deadlock detection and collection, but in the early
> releases it was … well … not very “robust”. With later releases (4.2)
> it’s MUCH better. I personally don’t rely on it because in larger clusters
> it can be too aggressive and depending on what’s really going on it can
> make things worse. This statement is my opinion and it doesn’t mean it’s
> not a good thing to have. :-)
 
> On the point of what commands to execute and what to collect – be careful
> about long-running callback scripts and executing commands on other nodes.
> Depending on what the issue is, you could end up causing a deadlock or
> making it worse. Some basic data collection, local to the node with the
> long RPC waiter, is a good thing. Test your scripts well before deploying
> them, and make sure that you don’t conflict with the automated collections
> (which you might consider turning off).
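What I would consider "basic, local data collection" is something like the
sketch below: everything stays on the node that saw the event, runs at most
once every few minutes, and each command is bounded in time (untested, and
the paths and limits are placeholders):

  #!/bin/bash
  # Rate-limited, node-local data collection for a long-waiter callback.
  # Rough sketch, untested; paths and limits are placeholders.

  OUT=/var/mmfs/tmp/waiter-snapshots
  mkdir -p "$OUT"

  # At most one collection at a time, and at most one per 10 minutes,
  # so a burst of events cannot pile up collections.
  exec 9>"$OUT/.lock"
  flock -n 9 || exit 0
  LAST="$OUT/.last"
  if [ -f "$LAST" ] && [ $(( $(date +%s) - $(stat -c %Y "$LAST") )) -lt 600 ]; then
      exit 0
  fi
  touch "$LAST"

  STAMP=$(date +%Y%m%d-%H%M%S)
  # Keep each command on a short leash; a hung collection only makes things worse.
  timeout 30 /usr/lpp/mmfs/bin/mmdiag --waiters > "$OUT/waiters.$STAMP" 2>&1
  timeout 30 /usr/lpp/mmfs/bin/mmdiag --iohist  > "$OUT/iohist.$STAMP"  2>&1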
 
> For my larger clusters, I dump the cluster waiters on a regular basis (once
> a minute: mmlsnode -N waiters -L), count the types and dump them into a
> database for graphing via Grafana. This doesn’t help me with true deadlock
> alerting, but it does give me insight into overall cluster behavior. If I
> see large numbers of long waiters I will (usually) go and investigate them
> on a case-by-case basis. If you have large numbers of long RPC waiters on
> an ongoing basis, it's an indication of a larger problem that should be
> investigated. A few here and there is not a cause for real alarm in my
> experience.
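As an aside, here is a rough sketch of what such a once-a-minute census could
look like; the parsing of the waiter lines and the Graphite endpoint are my
own assumptions, not necessarily how Bob does it:

  #!/bin/bash
  # Once-a-minute waiter census, run from cron on one admin node.
  # Rough sketch, untested; graphite.example.com, the metric names and the
  # 60-second threshold are placeholders, and the parsing assumes waiter
  # lines of the form "... waiting <seconds> sec ...".

  TS=$(date +%s)
  SNAP=$(/usr/lpp/mmfs/bin/mmlsnode -N waiters -L 2>/dev/null)

  TOTAL=$(printf '%s\n' "$SNAP" | grep -c ' waiting ')
  LONG=$(printf '%s\n' "$SNAP" | awk '
      { for (i = 1; i < NF; i++)
            if ($i == "waiting" && $(i+1) + 0 >= 60) { n++; break } }
      END { print n + 0 }')

  {
      echo "gpfs.waiters.total $TOTAL $TS"
      echo "gpfs.waiters.long $LONG $TS"
  } | nc -w 2 graphite.example.com 2003  # plain-text Graphite protocol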
 
> Last – if you have a chance to upgrade to 4.1.1 or 4.2, I would encourage
> you to do so as the deadlock detection has improved quite a bit.
 
> Bob Oesterlin
> Sr Storage Engineer, Nuance HPC Grid
> robert.oesterlin at nuance.com
> 
> From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of Roland Pabel
> <dr.roland.pabel at gmail.com>
> Organization: RRZK Uni Köln
> Reply-To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date: Tuesday, April 12, 2016 at 3:03 AM
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: [gpfsug-discuss] Executing Callbacks on other Nodes
> 
> Hi everyone,
> 
> we are using GPFS 4.1.0.8 with 4 servers and 850 clients. Our GPFS setup is
> fairly new, we are still in the testing phase. A few days ago, we had some
> problems in the cluster which seemed to have started with deadlocks on a
> small number of nodes. To be better prepared for this scenario, I would
> like to install a callback for the deadlockDetected event. But this is a
> local event, so the callback is executed on the client nodes, from which I
> cannot even send an email.
> 
> Is it possible using mm-commands to instead delegate the callback to the
> servers (Nodeclass nsdNodes)?
> 
> I guess it would be possible to use a callback of the form "ssh nsd0
> /root/bin/deadlock-callback.sh", but then it is contingent on server nsd0
> being available. The mm-command style "-N nsdNodes" would be more reliable
> in my opinion, because the callback would be run on all servers. On the
> servers, I could then check and only execute the script on the current
> cluster manager.
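To clarify what I mean by that last check, something along these lines (the
parsing of the mmlsmgr -c output is approximate, and it assumes the GPFS node
name matches the short hostname):

  #!/bin/bash
  # Only do the real work on the current cluster manager; exit quietly elsewhere.
  MGR=$(/usr/lpp/mmfs/bin/mmlsmgr -c 2>/dev/null |
        awk '{ gsub(/[()]/, "", $NF); print $NF }')
  [ "$MGR" = "$(hostname -s)" ] || exit 0

  # ... notification / data collection goes here ...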
> Thanks
> 
> Roland
> --
> Dr. Roland Pabel
> Regionales Rechenzentrum der Universität zu Köln (RRZK)
> Weyertal 121, Raum 3.07
> D-50931 Köln
> 
> Tel.: +49 (221) 470-89589
> E-Mail: pabel at uni-koeln.de<mailto:pabel at uni-koeln.de>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

-- 
Dr. Roland Pabel
Regionales Rechenzentrum der Universität zu Köln (RRZK)
Weyertal 121, Raum 3.07
D-50931 Köln

Tel.: +49 (221) 470-89589
E-Mail: pabel at uni-koeln.de


