[gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34

Ray Coetzee coetzee.ray at gmail.com
Mon Apr 23 00:23:55 BST 2018


Hi Jan-Frode
We've been told the same regarding mounts using UDP.
Our exports are already explicitly configured for TCP, and the clients'
fstab entries are set to use TCP.
It would be infuriating if the clients are trying UDP first, irrespective of
the mount options configured.
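
One way to check this from the client side, as a rough sketch rather than a
verified diagnosis: on a Linux client, /proc/mounts lists the options each
NFS mount actually ended up with, including proto= and, for NFSv3,
mountproto= (the transport used for the MOUNT protocol). The function name,
host name and paths below are made up for illustration, and Solaris clients
would need a different check.

```python
# Hypothetical helper: flag NFS mounts whose effective options are not TCP.
# Input is text in /proc/mounts format:
#   device mountpoint fstype options dump pass

SAMPLE_MOUNTS = """\
ces1:/export/data /mnt/data nfs rw,relatime,vers=3,proto=tcp,mountproto=udp 0 0
ces1:/export/home /mnt/home nfs rw,relatime,vers=3,proto=tcp,mountproto=tcp 0 0
/dev/sda1 / xfs rw,relatime 0 0
"""


def non_tcp_nfs_mounts(mounts_text):
    """Return a list of (mount_point, [offending options]) tuples."""
    findings = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue  # not an NFS mount
        # Turn "rw,proto=tcp" into {"rw": "", "proto": "tcp"}.
        options = dict(
            opt.split("=", 1) if "=" in opt else (opt, "")
            for opt in fields[3].split(",")
        )
        # NFSv4 (fstype "nfs4") has no separate MOUNT protocol, so only
        # check mountproto for fstype "nfs".
        wanted = ["proto"] + (["mountproto"] if fields[2] == "nfs" else [])
        problems = []
        for name in wanted:
            value = options.get(name)
            if value is None:
                problems.append(name + "=(not shown)")
            elif not value.startswith("tcp"):  # accepts tcp and tcp6
                problems.append(name + "=" + value)
        if problems:
            findings.append((fields[1], problems))
    return findings
```

On a client this could be run as
non_tcp_nfs_mounts(open("/proc/mounts").read()); with the sample above it
reports only /mnt/data, because of mountproto=udp.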

Why the problem started specifically last week for both of us is
interesting.

Kind regards

Ray Coetzee
Mob: +44 759 704 7060

Skype: ray.coetzee

Email: coetzee.ray at gmail.com


On Mon, Apr 23, 2018 at 12:02 AM, Jan-Frode Myklebust <janfrode at tanso.net>
wrote:

>
> Yes, I've been struggling with something similar this week. Ganesha dying
> with SIGABRT -- nothing else logged. After catching a few coredumps, it has
> been identified as a problem with some UDP communication during mounts from
> Solaris clients. Disabling UDP as a transport on the shares server-side
> didn't help. It was suggested to use "mount -o tcp" or whatever the Solaris
> version of this is -- but we haven't tested this. So far the downgrade to
> v2.3.2 has been our workaround.
>
> PMR:  48669,080,678
>
>
>   -jf
>
>
> On Mon, Apr 23, 2018 at 12:38 AM, Ray Coetzee <coetzee.ray at gmail.com>
> wrote:
>
>> Good evening all
>>
>> I'm working with IBM on a PMR where ganesha is segfaulting or causing
>> kernel panics on one group of CES nodes.
>>
>> We have 12 identical CES nodes split into two groups of 6 nodes each &
>> have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was
>> released.
>>
>> Only one group started having issues Monday morning where ganesha would
>> segfault and the mounts would move over to the remaining nodes.
>> The remaining nodes then start to fall over like dominoes within minutes
>> or hours, to the point that all CES nodes are "failed" according to
>> "mmces node list" and the VIPs are unassigned.
>>
>> Recovering the nodes is extremely finicky and only works for a few minutes
>> or hours before Ganesha segfaults again.
>> Most times, a complete stop of Ganesha on all nodes and then starting it
>> on only two random nodes allows mounts to recover for a while.
>>
>> None of the following has helped:
>> A reboot of all nodes.
>> Refresh CCR config file with mmsdrrestore
>> Remove/add CES from nodes.
>> Reinstall GPFS & protocol rpms
>> Update to 5.0.0-2
>> Fresh reinstall of a node
>> Network checks out with no dropped packets on either data or export
>> networks.
>>
>> The only temporary fix so far has been to downrev ganesha to 2.3.2 from
>> 2.5.3 on the affected nodes.
>>
>> While waiting for IBM development, has anyone seen something similar
>> maybe?
>>
>> Kind regards
>>
>> Ray Coetzee
>>
>>
>>
>> On Sat, Apr 21, 2018 at 12:00 PM,
>> <gpfsug-discuss-request at spectrumscale.org> wrote:
>>
>>> Send gpfsug-discuss mailing list submissions to
>>>         gpfsug-discuss at spectrumscale.org
>>>
>>> To subscribe or unsubscribe via the World Wide Web, visit
>>>         http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>> or, via email, send a message with subject or body 'help' to
>>>         gpfsug-discuss-request at spectrumscale.org
>>>
>>> You can reach the person managing the list at
>>>         gpfsug-discuss-owner at spectrumscale.org
>>>
>>> When replying, please edit your Subject line so it is more specific
>>> than "Re: Contents of gpfsug-discuss digest..."
>>>
>>>
>>> Today's Topics:
>>>
>>>    1. Re: UK Meeting - tooling Spectrum Scale (Grunenberg, Renar)
>>>    2. Re: UK Meeting - tooling Spectrum Scale
>>>       (Simon Thompson (IT Research Support))
>>>
>>>
>>> ----------------------------------------------------------------------
>>>
>>> Message: 1
>>> Date: Fri, 20 Apr 2018 14:01:55 +0000
>>> From: "Grunenberg, Renar" <Renar.Grunenberg at huk-coburg.de>
>>> To: "'gpfsug-discuss at spectrumscale.org'"
>>>         <gpfsug-discuss at spectrumscale.org>
>>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale
>>> Message-ID: <fb4c0ca7ece5462d96948e562803e77e at SMXRF105.msg.hukrf.de>
>>> Content-Type: text/plain; charset="utf-8"
>>>
>>> Hello Simon,
>>> is there any reason why the presentation from Yong ZY Zheng
>>> (Cognitive, ML, Hortonworks) is not linked?
>>>
>>> Renar Grunenberg
>>> IT Department - Operations
>>>
>>> HUK-COBURG
>>> Bahnhofsplatz
>>> 96444 Coburg
>>> Phone:          09561 96-44110
>>> Fax:            09561 96-44104
>>> Email:  Renar.Grunenberg at huk-coburg.de
>>> Website:        www.huk.de
>>> ________________________________
>>> HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter
>>> Deutschlands a. G. in Coburg
>>> Registration court: Coburg, HRB 100; tax no. 9212/101/00021
>>> Registered office: Bahnhofsplatz, 96444 Coburg
>>> Chairman of the Supervisory Board: Prof. Dr. Heinrich R. Schradin.
>>> Executive Board: Klaus-Jürgen Heitmann (Spokesman), Stefan Gronbach,
>>> Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (deputy), Sarah Rössler,
>>> Daniel Thomas.
>>> ________________________________
>>> This information may contain confidential and/or privileged information.
>>> If you are not the intended recipient (or have received this information
>>> in error) please notify the
>>> sender immediately and destroy this information.
>>> Any unauthorized copying, disclosure or distribution of the material in
>>> this information is strictly forbidden.
>>> ________________________________
>>>
>>> ------------------------------
>>>
>>> Message: 2
>>> Date: Fri, 20 Apr 2018 14:12:11 +0000
>>> From: "Simon Thompson (IT Research Support)" <S.J.Thompson at bham.ac.uk>
>>> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale
>>> Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6 at bham.ac.uk>
>>> Content-Type: text/plain; charset="utf-8"
>>>
>>> Sorry, it was a typo from my side.
>>>
>>> For the talks that are missing, we are chasing copies of the slides that
>>> we can release.
>>>
>>> Simon
>>>
>>> ------------------------------
>>>
>>>
>>> End of gpfsug-discuss Digest, Vol 75, Issue 34
>>> **********************************************
>>>
>>
>