[gpfsug-discuss] FW: ESS 3500-C5 : rg has resigned permanently

Jan-Frode Myklebust janfrode at tanso.net
Thu Aug 24 13:50:36 BST 2023


It does sound like "mmvdisk rg change --restart" is the "varyon" command
you're looking for, but it's not clear why it's failing. I would start by
checking for lower-level issues in your cluster. Are your nodes healthy at
the GPFS level? Does "mmnetverify -N all" report that the network is OK?
Does "mmhealth node show -N all" indicate any issues? Is there anything
relevant in mmfs.log.latest?
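
A rough sketch of that first pass, in case it helps. The two mm commands are
the ones named above; everything else is an assumption for illustration: the
log path is the usual default location, and run_check and recent_errors are
hypothetical helper names, not part of GPFS. Commands that are not in PATH
are skipped rather than treated as failures.

```shell
#!/bin/sh
# Sketch only: basic health checks before retrying the recovery group restart.
# Assumption: mmfs.log.latest lives in the usual default location.
LOG="${LOG:-/var/adm/ras/mmfs.log.latest}"

# Hypothetical helper: run a command if it is installed, otherwise say so.
run_check() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@"
    else
        echo "== skipped (not in PATH): $*"
    fi
}

# Hypothetical helper: print the last N (default 20) error-level lines of a log.
recent_errors() {
    grep -F '[E]' "$1" | tail -n "${2:-20}"
}

run_check mmnetverify -N all
run_check mmhealth node show -N all

if [ -r "$LOG" ]; then
    recent_errors "$LOG"
fi
```

If the mm commands are reported as skipped, they are probably just not in
PATH on that node. Run this on both servers of the recovery group and compare
the output before trying the restart again.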

On Thu, Aug 24, 2023 at 1:41 PM Walter Sklenka <Walter.Sklenka at edv-design.at>
wrote:

>
>
>
>
> Kind regards
> *Walter Sklenka*
> *Technical Consultant*
>
>
>
> EDV-Design Informationstechnologie GmbH
> Giefinggasse 6/1/2, A-1210 Wien
> Tel: +43 1 29 22 165-31
> Fax: +43 1 29 22 165-90
> E-Mail: sklenka at edv-design.at
> Internet: www.edv-design.at
>
>
>
> *From:* Walter Sklenka
> *Sent:* Thursday, 24 August 2023 12:02
> *To:* 'gpfsug-discuss-request at gpfsug.org' <
> gpfsug-discuss-request at gpfsug.org>
> *Subject:* FW: ESS 3500-C5 : rg has resigned permanently
>
>
>
> Hi!
>
> Does anyone happen to have experience with the ESS 3500 (no hybrid config,
> only NL-SAS with 5 enclosures)?
>
> We have issues with a shared recovery group. After creating it, we tested
> setting only one node active (maybe not an optimal idea), and since then
> the recovery group has been down.
>
> We have opened a PMR but have not received any response so far.
>
>
>
> The rg does not contain vdisks of any file system. Trying to restart it
> fails:
>
> [gpfsadmin at hgess02-m ~]$ ^C
> [gpfsadmin at hgess02-m ~]$ sudo mmvdisk rg change --rg
> ess3500_hgess02_n1_hs_hgess02_n2_hs --restart
> mmvdisk:
> mmvdisk:
> mmvdisk: Unable to reset server list for recovery group
> 'ess3500_hgess02_n1_hs_hgess02_n2_hs'.
> mmvdisk: Command failed. Examine previous error messages to determine
> cause.
>
>
>
> We also tried
>
> 2023-08-21_16:57:26.174+0200: [I] Command: tsrecgroupserver
> ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l root hgess02-n2-hs.invalid
> 2023-08-21_16:57:26.201+0200: [I] Recovery group
> ess3500_hgess02_n1_hs_hgess02_n2_hs has resigned permanently
> 2023-08-21_16:57:26.201+0200: [E] Command: err 2: tsrecgroupserver
> ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l root hgess02-n2-hs.invalid
> 2023-08-21_16:57:26.201+0200: Specified entity, such as a disk or file
> system, does not exist.
> 2023-08-21_16:57:26.207+0200: [I] Command: tsrecgroupserver
> ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG001 hgess02-n2-hs.invalid.
> 2023-08-21_16:57:26.207+0200: [E] Command: err 212: tsrecgroupserver
> ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG001 hgess02-n2-hs.invalid
> 2023-08-21_16:57:26.207+0200: The current file system manager failed and
> no new manager will be appointed. This may cause nodes mounting the file
> system to experience mount failures.
> 2023-08-21_16:57:26.213+0200: [I] Command: tsrecgroupserver
> ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG002 hgess02-n2-hs.invalid
> 2023-08-21_16:57:26.213+0200: [E] Command: err 212: tsrecgroupserver
> ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG002 hgess02-n2-hs.invalid
> 2023-08-21_16:57:26.213+0200: The current file system manager failed and
> no new manager will be appointed. This may cause nodes mounting the file
> system to experience mount failures.
>
>
>
>
>
> For us it is crucial to know what we can do if this happens again (it
> has no vdisks yet, so it is not critical at the moment).
>
>
>
> Do you know whether there is an undocumented way to “vary on”, or
> activate, a recovery group again?
>
> The documentation at
>
> https://www.ibm.com/docs/en/ess/6.1.6_lts?topic=rgi-recovery-group-issues-shared-recovery-groups-in-ess
>
> says to run mmshutdown and mmstartup, but the RGCM reports nothing.
>
> Any vdisk command we try only says “rg down”, and we have no idea how to
> recover from that without deleting the rg (I hope it never happens once
> we have vdisks on it).
>
>
>
>
>
>
>
> Have a nice day
>
> Walter