[gpfsug-discuss] add local nsd back to cluster?

Truong Vu truongv at us.ibm.com
Sat Jul 30 01:30:57 BST 2022


Starting with GPFS 5.1.4, you can use the CCR archive to restore the local node (the node that issues the mmsdrrestore command) in addition to restoring the entire cluster.

Prior to GPFS 5.1.4, as the error message indicates, you can only use the CCR archive to restore the entire cluster, and GPFS must be down on any node that is being restored.
If there is a good node in the cluster, use the -p option:

-p NodeName
         Specifies the node from which to obtain a valid GPFS
         configuration file. The node must be either the primary
         configuration server or a node that has a valid backup
         copy of the mmsdrfs file. If this parameter is not
         specified, the command uses the configuration file on
         the node from which the command is issued.
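
As a rough sketch of both restore paths (the exact invocation may differ by release, so check the mmsdrrestore man page at your level; the archive name is just the example from the thread below and goodnode is a placeholder for any healthy node that holds a valid copy of the mmsdrfs file):

   # GPFS 5.1.4 or later: restore only the local node (the node issuing the
   # command) from a CCR backup archive
   mmsdrrestore -F CCRBackup.986.2022.07.29.23.06.19.myquorum.tar.gz

   # Any level: rebuild this node's configuration from a good node in the cluster
   mmsdrrestore -p goodnode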

Thanks,
Tru.

On 7/29/22, 12:51 PM, "gpfsug-discuss on behalf of gpfsug-discuss-request at gpfsug.org" <gpfsug-discuss-bounces at gpfsug.org on behalf of gpfsug-discuss-request at gpfsug.org> wrote:

    Send gpfsug-discuss mailing list submissions to
    	gpfsug-discuss at gpfsug.org

    To subscribe or unsubscribe via the World Wide Web, visit
    	http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org 
    or, via email, send a message with subject or body 'help' to
    	gpfsug-discuss-request at gpfsug.org

    You can reach the person managing the list at
    	gpfsug-discuss-owner at gpfsug.org

    When replying, please edit your Subject line so it is more specific
    than "Re: Contents of gpfsug-discuss digest..."


    Today's Topics:

       1. Re: add local nsd back to cluster? (shao feng)
       2. Re: add local nsd back to cluster? (Stephen Ulmer)


    ----------------------------------------------------------------------

    Message: 1
    Date: Fri, 29 Jul 2022 23:54:24 +0800
    From: shao feng <shaof777 at gmail.com>
    To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
    Subject: Re: [gpfsug-discuss] add local nsd back to cluster?
    Message-ID:
    	<CANiV0ORjKzbyKqLvHgQEkPKo9Y--ptPRxfPjXpJBvkQmukqCgA at mail.gmail.com>
    Content-Type: text/plain; charset="utf-8"

    Thanks Olaf

    I've set up the mmsdr backup as described at
    https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=exits-mmsdrbackup-user-exit.
    Since my cluster is CCR enabled, it generates a CCR backup file,
    but when trying to restore from this file, it requires the quorum nodes to be
    shut down. Is it possible to restore without touching the quorum nodes?

    [root at tofail ~]# mmsdrrestore -F CCRBackup.986.2022.07.29.23.06.19.myquorum.tar.gz
    Restoring a CCR backup archive is a cluster-wide operation.
    The -a flag is required.
    mmsdrrestore: Command failed. Examine previous error messages to determine cause.

    [root at tofail ~]# mmsdrrestore -F CCRBackup.986.2022.07.29.23.06.19.myquorum.tar.gz -a
    Restoring CCR backup
    Verifying that GPFS is inactive on quorum nodes
    mmsdrrestore: GPFS is still active on myquorum
    mmsdrrestore: Unexpected error from mmsdrrestore: CCR restore failed.  Return code: 192
    mmsdrrestore: Command failed. Examine previous error messages to determine cause.
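
    Prior to 5.1.4 the archive restore is cluster-wide, so it appears the quorum
    nodes do have to be down first. A rough sketch, with the node and archive
    names taken from the example above:

       mmshutdown -N myquorum
       mmsdrrestore -F CCRBackup.986.2022.07.29.23.06.19.myquorum.tar.gz -a
       mmstartup -a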


    On Thu, Jul 28, 2022 at 3:14 PM Olaf Weiser <olaf.weiser at de.ibm.com> wrote:

    >
    >
    > Hi -
    > assuming you'll run it without ECE, just with replication at the
    > file system level:
    > be aware that every time a node goes offline, you'll have to restart the
    > disks in your file system. This causes a complete scan of the metadata to
    > detect files with missing updates / replication.
    >
    >
    > Apart from that, to your question:
    > you may consider backing up the mmsdr data.
    > Additionally, take a look at mmsdrrestore, in case you want to restore a
    > node's SDR configuration.
    >
    > Quick and dirty: saving the content of /var/mmfs may also help you.
    >
    > While the node is "gone", the disk is of course down; after restoring
    > the SDR / node's config, it should be able to start, and
    > the rest runs as usual.
    >
    >
    >
    > ------------------------------
    > *From:* gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> on behalf of
    > shao feng <shaof777 at gmail.com>
    > *Sent:* Thursday, July 28, 2022 09:02
    > *To:* gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
    > *Subject:* [EXTERNAL] [gpfsug-discuss] add local nsd back to cluster?
    >
    > Hi all,
    >
    > I am planning to implement a cluster with a bunch of old x86 machines.
    > The disks are not connected to the nodes via a SAN; instead, each x86
    > machine has some locally attached disks.
    > The question is regarding node failure, for example when only the operating
    > system disk fails and the NSD disks are good. In that case I plan to
    > replace the failing OS disk with a new one, install the OS on it, and
    > re-attach the NSD disks to that node. My question is: will this work? How
    > can I add an NSD back to the cluster without restoring data from other
    > replicas, since the data/metadata is actually not corrupted on the NSDs?
    >
    > Best regards,
    > _______________________________________________
    > gpfsug-discuss mailing list
    > gpfsug-discuss at gpfsug.org
    > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org 
    >
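
    A rough sketch of the quick-and-dirty approach described above (the file
    system name gpfs0 and node name goodnode are placeholders, and the exact
    sequence depends on your setup):

       # Before a failure: keep a copy of the node's GPFS configuration data
       tar czf /root/var-mmfs-backup.tar.gz /var/mmfs

       # After reinstalling the OS and GPFS packages on the failed node,
       # rebuild its configuration from a healthy node in the cluster
       mmsdrrestore -p goodnode

       # Bring GPFS up and restart the down disks so that missing
       # updates / replication can be detected and re-synchronized
       mmstartup
       mmchdisk gpfs0 start -a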

    ------------------------------

    Message: 2
    Date: Fri, 29 Jul 2022 12:48:44 -0400
    From: Stephen Ulmer <ulmer at ulmer.org>
    To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
    Subject: Re: [gpfsug-discuss] add local nsd back to cluster?
    Message-ID: <1DEB036E-AA3A-4498-A5B9-B66078EC87A9 at ulmer.org>
    Content-Type: text/plain; charset="utf-8"

    If there are cluster nodes up, restore from the running nodes instead of the file. I think it's -p, but look at the manual page.

    -- 
    Stephen Ulmer

    Sent from a mobile device; please excuse auto-correct silliness.

    > On Jul 29, 2022, at 11:20 AM, shao feng <shaof777 at gmail.com> wrote:
    > 
    >
    > Thanks Olaf
    > 
    > I've set up the mmsdr backup as described at https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=exits-mmsdrbackup-user-exit. Since my cluster is CCR enabled, it generates a CCR backup file,
    > but when trying to restore from this file, it requires the quorum nodes to be shut down. Is it possible to restore without touching the quorum nodes?
    > 
    > [root at tofail ~]# mmsdrrestore -F CCRBackup.986.2022.07.29.23.06.19.myquorum.tar.gz
    > Restoring a CCR backup archive is a cluster-wide operation.
    > The -a flag is required.
    > mmsdrrestore: Command failed. Examine previous error messages to determine cause.
    > 
    > [root at tofail ~]# mmsdrrestore -F CCRBackup.986.2022.07.29.23.06.19.myquorum.tar.gz -a
    > Restoring CCR backup
    > Verifying that GPFS is inactive on quorum nodes
    > mmsdrrestore: GPFS is still active on myquorum
    > mmsdrrestore: Unexpected error from mmsdrrestore: CCR restore failed.  Return code: 192
    > mmsdrrestore: Command failed. Examine previous error messages to determine cause.
    > 
    > 
    >> On Thu, Jul 28, 2022 at 3:14 PM Olaf Weiser <olaf.weiser at de.ibm.com> wrote:
    >> 
    >> 
    >> Hi - 
    >> assuming you'll run it without ECE, just with replication at the file system level:
    >> be aware that every time a node goes offline, you'll have to restart the disks in your file system. This causes a complete scan of the metadata to detect files with missing updates / replication.
    >> 
    >> 
    >> Apart from that, to your question:
    >> you may consider backing up the mmsdr data.
    >> Additionally, take a look at mmsdrrestore, in case you want to restore a node's SDR configuration.
    >> 
    >> Quick and dirty: saving the content of /var/mmfs may also help you.
    >> 
    >> While the node is "gone", the disk is of course down; after restoring the SDR / node's config, it should be able to start, and
    >> the rest runs as usual.
    >> 
    >> 
    >> 
    >> From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> on behalf of shao feng <shaof777 at gmail.com>
    >> Sent: Thursday, July 28, 2022 09:02
    >> To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
    >> Subject: [EXTERNAL] [gpfsug-discuss] add local nsd back to cluster?
    >>  
    >> Hi all,
    >> 
    >> I am planning to implement a cluster with a bunch of old x86 machines. The disks are not connected to the nodes via a SAN; instead, each x86 machine has some locally attached disks.
    >> The question is regarding node failure, for example when only the operating system disk fails and the NSD disks are good. In that case I plan to replace the failing OS disk with a new one, install the OS on it, and re-attach the NSD disks to that node. My question is: will this work? How can I add an NSD back to the cluster without restoring data from other replicas, since the data/metadata is actually not corrupted on the NSDs?
    >> 
    >> Best regards,
    >> _______________________________________________
    >> gpfsug-discuss mailing list
    >> gpfsug-discuss at gpfsug.org
    >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org 
    > _______________________________________________
    > gpfsug-discuss mailing list
    > gpfsug-discuss at gpfsug.org
    > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org 

    ------------------------------

    Subject: Digest Footer

    _______________________________________________
    gpfsug-discuss mailing list
    gpfsug-discuss at gpfsug.org
    http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org 


    ------------------------------

    End of gpfsug-discuss Digest, Vol 126, Issue 21
    ***********************************************


