Would it help to lower the grace time?<br><br>mmnfs configuration change LEASE_LIFETIME=10<br>mmnfs configuration change GRACE_PERIOD=10<br><br><br><br>  -jf<br><div class="gmail_quote"><div dir="ltr">Wed, 26 Apr 2017 at 16:20, Simon Thompson (IT Research Support) <<a href="mailto:S.J.Thompson@bham.ac.uk">S.J.Thompson@bham.ac.uk</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Nope, the clients are all L3 connected, so not an ARP issue.<br>

<br>

Two things we have observed:<br>

<br>

1. It triggers when one of the CES IPs moves and quickly moves back again.<br>

The move occurs because the NFS server goes into grace:<br>

<br>

2017-04-25 20:36:49 : epoch 00040183 : <NODENAME> :<br>

ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server Now IN<br>

GRACE, duration 60<br>

2017-04-25 20:36:49 : epoch 00040183 : <NODENAME> :<br>

ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server<br>

recovery event 2 nodeid -1 ip <CESIP><br>

2017-04-25 20:36:49 : epoch 00040183 : <NODENAME> :<br>

ganesha.nfsd-1261[dbus] nfs_release_v4_client :STATE :EVENT :NFS Server V4<br>

recovery release ip <CESIP><br>

2017-04-25 20:36:49 : epoch 00040183 : <NODENAME> :<br>

ganesha.nfsd-1261[dbus] nfs_in_grace :STATE :EVENT :NFS Server Now IN GRACE<br>

2017-04-25 20:37:42 : epoch 00040183 : <NODENAME> :<br>

ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server Now IN<br>

GRACE, duration 60<br>

2017-04-25 20:37:44 : epoch 00040183 : <NODENAME> :<br>

ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server Now IN<br>

GRACE, duration 60<br>

2017-04-25 20:37:44 : epoch 00040183 : <NODENAME> :<br>

ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server<br>

recovery event 4 nodeid 2 ip<br>

<br>

<br>

<br>

We can't see in any of the logs WHY ganesha is going into grace. Any<br>

suggestions on how to debug this further? (I.e. If we can stop the grace<br>

issues, we can solve the problem mostly).<br>
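<br>
A minimal sketch for pulling the grace and recovery events out of a ganesha log so they can be lined up against CES IP moves. It assumes each event sits on a single line in the real log file (the lines quoted above were wrapped by the mail client) and that the layout matches those quoted lines; the names grace_events and summarize are made up for illustration.<br>

```python
import re
from collections import Counter

# Layout assumed from the quoted log lines:
# "DATE TIME : epoch N : NODE : ganesha.nfsd-PID[THREAD] FUNC :STATE :EVENT :MESSAGE"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) : epoch \S+ : \S+ : "
    r"ganesha\.nfsd-\d+\[([^\]]+)\] (\w+) :STATE :EVENT :(.*)$"
)


def grace_events(lines):
    """Return (timestamp, thread, function, message) for grace/recovery lines."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp, thread, func, message = match.groups()
        if "GRACE" in message or "recovery" in message:
            events.append((timestamp, thread, func, message))
    return events


def summarize(events):
    """Count events per logging function, e.g. nfs4_start_grace."""
    return Counter(func for _, _, func, _ in events)
```

Running this over the log from each protocol node and sorting by timestamp should show whether the grace entries cluster around the moments an IP moves, which is the correlation described in point 1.<br>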

<br>

<br>

2. Our clients are using LDAP which is bound to the CES IPs. If we<br>

shutdown nslcd on the client we can get the client to recover once all the<br>

TIME_WAIT connections have gone. Maybe this was a bad choice on our side<br>

to bind to the CES IPs - we figured it would handily move the IPs for us,<br>

but I guess the mmcesfuncs isn't aware of this and so doesn't kill the<br>

connections to the IP as it goes away.<br>

<br>

<br>

So two approaches we are going to try. Reconfigure the nslcd on a couple<br>

of clients and see if they still show up the issues when fail-over occurs.<br>

Second is to work out why the NFS servers are going into grace in the<br>

first place.<br>

<br>

Simon<br>

<br>

On 26/04/2017, 00:46, "<a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a> on behalf<br>

of Greg.Lehmann@csiro.au" <<a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a> on<br>

behalf of Greg.Lehmann@csiro.au> wrote:<br>

<br>

>Are you using infiniband or Ethernet? I'm wondering if IBM have solved<br>

>the gratuitous arp issue which we see with our non-protocols NFS<br>

>implementation.<br>

><br>

>-----Original Message-----<br>

>From: <a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a><br>

>[mailto:<a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a>] On Behalf Of Simon<br>

>Thompson (IT Research Support)<br>

>Sent: Wednesday, 26 April 2017 3:31 AM<br>

>To: gpfsug main discussion list <<a href="mailto:gpfsug-discuss@spectrumscale.org" target="_blank">gpfsug-discuss@spectrumscale.org</a>><br>

>Subject: Re: [gpfsug-discuss] NFS issues<br>

><br>

>I did some digging in the mmcesfuncs to see what happens server side on<br>

>fail over.<br>

><br>

>Basically the server losing the IP is supposed to terminate all sessions<br>

>and the receiver server sends ACK tickles.<br>

><br>

>My current supposition is that for whatever reason, the losing server<br>

>isn't releasing something and the client still has hold of a connection<br>

>which is mostly dead. The tickle then fails to the client from the new<br>

>server.<br>

><br>

>This would explain why failing the IP back to the original server usually<br>

>brings the client back to life.<br>

><br>

>This is only my working theory at the moment as we can't reliably<br>

>reproduce this. Next time it happens we plan to grab some netstat from<br>

>each side.<br>

><br>

>Then we plan to issue "mmcmi tcpack $cesIpPort $clientIpPort" on the<br>

>server that received the IP and see if that fixes it (i.e. the receiver<br>

>server didn't tickle properly). (Usage extracted from mmcesfuncs which is<br>

>ksh of course). ... CesIPPort is colon separated IP:portnumber (of NFSd)<br>

>for anyone interested.<br>
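<br>
The netstat comparison described above could be sketched roughly as follows. This assumes `netstat -tn` style output captured as text on both sides (columns: Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State) and that NFS is on port 2049; the function names are hypothetical. It lists connections the client believes are ESTABLISHED that the server does not know about.<br>

```python
def parse_netstat(text):
    """Parse `netstat -tn` style output into a set of (local, remote, state)."""
    conns = set()
    for line in text.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            conns.add((fields[3], fields[4], fields[5]))
    return conns


def orphaned_on_client(client_text, server_text, nfs_port="2049"):
    """Return (client_ip_port, server_ip_port) pairs only the client still holds."""
    # On the server the local side is the CES IP, so swap to (client, server).
    server_view = {(remote, local)
                   for local, remote, _ in parse_netstat(server_text)}
    orphans = []
    for local, remote, state in parse_netstat(client_text):
        if remote.rsplit(":", 1)[-1] != nfs_port:
            continue  # not an NFS connection
        if state == "ESTABLISHED" and (local, remote) not in server_view:
            orphans.append((local, remote))
    return sorted(orphans)
```

Each pair it returns is already in IP:port form, so the server half and the client half are the kind of values the tcpack usage quoted above expects.<br>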

><br>

>Then try and kill the sessions on the losing server to check if there is<br>

>stuff still open and re-tickle the client.<br>

><br>

>If we can get steps to workaround, I'll log a PMR. I suppose I could do<br>

>that now, but given it's non-deterministic and we want to be 100% sure<br>

>it's not us doing something wrong, I'm inclined to wait until we do some<br>

>more testing.<br>

><br>

>I agree with the suggestion that it's probably IO pending nodes that are<br>

>affected, but don't have any data to back that up yet. We did try with a<br>

>read workload on a client, but maybe we need either long IO blocked reads<br>

>or writes (from the GPFS end).<br>

><br>

>We also originally had soft as the default option, but saw issues then<br>

>and the docs suggested hard, so we switched and also enabled sync (we<br>

>figured maybe it was an NFS client with uncommitted writes), but neither has<br>

>resolved the issues entirely. Difficult for me to say if they improved<br>

>the issue though, given it's sporadic.<br>

><br>

>Appreciate people's suggestions!<br>

><br>

>Thanks<br>

><br>

>Simon<br>

>________________________________________<br>

>From: <a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a><br>

>[<a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a>] on behalf of Jan-Frode<br>

>Myklebust [<a href="mailto:janfrode@tanso.net" target="_blank">janfrode@tanso.net</a>]<br>

>Sent: 25 April 2017 18:04<br>

>To: gpfsug main discussion list<br>

>Subject: Re: [gpfsug-discuss] NFS issues<br>

><br>

>I *think* I've seen this, and that we then had open TCP connections from<br>

>client to NFS server according to netstat, but these connections were not<br>

>visible from netstat on NFS-server side.<br>

><br>

>Unfortunately I don't remember what the fix was..<br>

><br>

><br>

><br>

>  -jf<br>

><br>

>Tue, 25 Apr 2017 at 16:06, Simon Thompson (IT Research Support)<br>

><<a href="mailto:S.J.Thompson@bham.ac.uk" target="_blank">S.J.Thompson@bham.ac.uk</a>> wrote:<br>

>Hi,<br>

><br>

>From what I can see, Ganesha uses the Export_Id option in the config file<br>

>(which is managed by CES) for this. I did find some reference in the<br>

>Ganesha devs list that if it's not set, then it would read the FSID from<br>

>the GPFS file-system, either way they should surely be consistent across<br>

>all the nodes. The posts I found were from someone with an IBM email<br>

>address, so I guess someone in the IBM teams.<br>

><br>

>I checked a couple of my protocol nodes and they use the same Export_Id<br>

>consistently, though I guess that might not be the same as the FSID value.<br>
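<br>
A rough way to automate that check, assuming the ganesha export config from each node has been collected as text and that every export block carries one Export_Id and one Path entry. The parser is deliberately simplified and the function names are hypothetical.<br>

```python
def export_ids(config_text):
    """Map Path to Export_Id from a ganesha-style export config (simplified)."""
    mapping = {}
    current_id = None
    current_path = None
    for raw in config_text.splitlines():
        line = raw.strip().rstrip(";")
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a "key = value" line
        key = key.strip().lower()
        value = value.strip().strip('"')
        if key == "export_id":
            current_id = int(value)
        elif key == "path":
            current_path = value
        # Once both halves of an export are seen, record and reset.
        if current_id is not None and current_path is not None:
            mapping[current_path] = current_id
            current_id = None
            current_path = None
    return mapping


def inconsistent_exports(configs_by_node):
    """Return {path: {node: export_id}} for paths whose IDs differ by node."""
    seen = {}
    for node, text in configs_by_node.items():
        for path, export_id in export_ids(text).items():
            seen.setdefault(path, {})[node] = export_id
    return {path: ids for path, ids in seen.items()
            if len(set(ids.values())) > 1}
```

An empty result only says the Export_Id values agree across nodes; as noted above, that is not necessarily the same thing as the FSID.<br>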

><br>

>Perhaps someone from IBM could comment on whether FSID is likely to be the cause<br>

>of my problems?<br>

><br>

>Thanks<br>

><br>

>Simon<br>

><br>

>On 25/04/2017, 14:51, "<a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a> on behalf of Ouwehand, JJ"<br>

><<a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a> on behalf of <a href="mailto:j.ouwehand@vumc.nl" target="_blank">j.ouwehand@vumc.nl</a>> wrote:<br>

><br>

>>Hello,<br>

>><br>

>>At first a short introduction. My name is Jaap Jan Ouwehand, I work at<br>

>>a Dutch hospital "VU Medical Center" in Amsterdam. We make daily use of<br>

>>IBM Spectrum Scale, Spectrum Archive and Spectrum Protect in our<br>

>>critical (office, research and clinical data) business process. We have<br>

>>three large GPFS filesystems for different purposes.<br>

>><br>

>>We also had such a situation with cNFS. A failover (IPtakeover) was<br>

>>technically good, only clients experienced "stale filehandles". We<br>

>>opened a PMR with IBM and, after testing, delivering logs and tcpdumps, and a<br>

>>few months of waiting, the solution turned out to be the fsid option.<br>

>><br>

>>An NFS filehandle is built by a combination of fsid and a hash function<br>

>>on the inode. After a failover, the fsid value can be different and the<br>

>>client has a "stale filehandle". To avoid this, the fsid value can be<br>

>>statically specified. See:<br>

>><br>

>><a href="https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.2/com.ibm.spectrum.scale.v4r22.doc/bl1adm_nfslin.htm" rel="noreferrer" target="_blank">https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.2/com.ibm.spectrum.scale.v4r22.doc/bl1adm_nfslin.htm</a><br>
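<br>
To illustrate why a changed fsid makes handles stale, here is a toy model of the description above (fsid combined with a hash of the inode). It is not the real NFS or GPFS filehandle format; the hash, the layout and the function names are assumptions made purely for illustration.<br>

```python
import hashlib


def toy_filehandle(fsid, inode):
    """Illustrative only: combine an fsid with a hash of the inode number."""
    inode_hash = hashlib.sha1(str(inode).encode()).hexdigest()[:8]
    return f"{fsid:08x}-{inode_hash}"


def is_stale(handle, server_fsid, inode):
    """A handle is stale if the server would now build a different one."""
    return handle != toy_filehandle(server_fsid, inode)


# Handle issued by the node that originally held the IP (fsid 745 is made up).
handle = toy_filehandle(745, 1234)
print(is_stale(handle, 745, 1234))  # same fsid after failover
print(is_stale(handle, 746, 1234))  # different fsid after failover
```

With the same fsid on both nodes the handle still matches after the IP moves; with a different fsid the same inode yields a different handle, which is what the client reports as a stale filehandle.<br>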

>><br>

>>Maybe there is also a value in Ganesha that changes after a failover.<br>

>>Certainly since most sessions will be re-established after a failback.<br>

>>Maybe you see more debug information with tcpdump.<br>

>><br>

>><br>

>>Kind regards,<br>

>><br>

>>Jaap Jan Ouwehand<br>

>>ICT Specialist (Storage & Linux)<br>

>>VUmc - ICT<br>

>>E: <a href="mailto:jj.ouwehand@vumc.nl" target="_blank">jj.ouwehand@vumc.nl</a><br>

>>W: <a href="http://www.vumc.com" rel="noreferrer" target="_blank">www.vumc.com</a><br>

>><br>

>><br>

>><br>

>>-----Original message-----<br>

>>From:<br>

>><a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a><br>

>>[mailto:<a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a>] On behalf of Simon Thompson (IT Research Support)<br>

>>Sent: Tuesday, 25 April 2017 13:21<br>

>>To:<br>

>><a href="mailto:gpfsug-discuss@spectrumscale.org" target="_blank">gpfsug-discuss@spectrumscale.org</a><br>

>>Subject: [gpfsug-discuss] NFS issues<br>

>><br>

>>Hi,<br>

>><br>

>>We have recently started deploying NFS in addition to our existing SMB<br>

>>exports on our protocol nodes.<br>

>><br>

>>We use a RR DNS name that points to 4 VIPs for SMB services and<br>

>>failover seems to work fine with SMB clients. We figured we could use<br>

>>the same name and IPs and run Ganesha on the protocol servers, however<br>

>>we are seeing issues with NFS clients when IP failover occurs.<br>

>><br>

>>In normal operation on a client, we might see several mounts from<br>

>>different IPs obviously due to the way the DNS RR is working, but it<br>

>>all works fine.<br>

>><br>

>>In a failover situation, the IP will move to another node and some<br>

>>clients will carry on, others will hang IO to the mount points referred<br>

>>to by the IP which has moved. We can *sometimes* trigger this by<br>

>>manually suspending a CES node, but not always and some clients<br>

>>mounting from the IP moving will be fine, others won't.<br>

>><br>

>>If we resume a node and it fails back, the clients that are hanging will<br>

>>usually recover fine. We can reboot a client prior to failback and it<br>

>>will be fine, stopping and starting the ganesha service on a protocol<br>

>>node will also sometimes resolve the issues.<br>

>><br>

>>So, has anyone seen this sort of issue and any suggestions for how we<br>

>>could either debug more or workaround?<br>

>><br>

>>We are currently running the packages<br>

>>nfs-ganesha-2.3.2-0.ibm32_1.el7.x86_64 (4.2.2-2 release ones).<br>

>><br>

>>At one point we were seeing it a lot, and could track it back to an<br>

>>underlying GPFS network issue that was causing protocol nodes to be<br>

>>expelled occasionally, we resolved that and the issues became less<br>

>>apparent, but maybe we just fixed one failure mode so we see it less often.<br>

>><br>

>>On the clients, we use -o sync,hard BTW as in the IBM docs.<br>

>><br>

>>On a client showing the issues, we'll see in dmesg, NFS related<br>

>>messages<br>

>>like:<br>

>>[Wed Apr 12 16:59:53 2017] nfs: server MYNFSSERVER.bham.ac.uk not responding,<br>

>>timed out<br>

>><br>

>>Which explains the client hang on certain mount points.<br>

>><br>

>>The symptoms feel very much like those logged in this Gluster/ganesha<br>

>>bug:<br>

>><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1354439" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=1354439</a><br>

>><br>

>><br>

>>Thanks<br>

>><br>

>>Simon<br>

>><br>

>>_______________________________________________<br>

>>gpfsug-discuss mailing list<br>

>>gpfsug-discuss at <a href="http://spectrumscale.org" rel="noreferrer" target="_blank">spectrumscale.org</a><<a href="http://spectrumscale.org" rel="noreferrer" target="_blank">http://spectrumscale.org</a>><br>

>><a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss" rel="noreferrer" target="_blank">http://gpfsug.org/mailman/listinfo/gpfsug-discuss</a><br>


</blockquote></div>