[gpfsug-discuss] storage-based replication for Spectrum Scale

Harold Morales hmorales at optimizeit.co
Sat Jan 27 09:11:44 GMT 2018


Thanks for your insights.

Alex, we did as you mentioned, but after using mmimportfs there are a lot of
errors from every command having to do with the filesystems:

GPFS: 6027-419 Failed to read a file system descriptor.
There is an input or output error

That occurs for:

mmlsfs
mmlsdisk
mmdf

Obviously, the filesystems won't mount.




2018-01-27 0:02 GMT-05:00 Alex Levin <alevin at gmail.com>:

> Steve,
> I've read that "mmfsctl suspend" or "suspend-write" should be executed,
> but in real life that is impossible in a DR scenario.
>
> We tested both cases:
> the graceful one, where the failover to the other site is planned,
> applications are stopped, and I/O is suspended,
> and the case where there was no advance notice of the disaster at the
> primary site.
>
> Both worked, and for the second case various loads were simulated,
> including heavy writes and mixed reads/writes.
> In the disaster case, as expected, some data were lost (due to incomplete
> writes, replication latency, ...),
> but mmfsck was always able to repair the filesystem, and the application
> databases located on the affected filesystem were in an acceptable state.
>
>
> It is possible that we were just lucky and the tests didn't cover all
> possible scenarios.
>
> Harold,
> In our case it is Linux, not AIX, but that shouldn't matter.
> Our DR cluster is fully configured (different IPs, hostnames and cluster
> name) and is running either without filesystems at all or with a
> differently named filesystem.
>
> Before running mmimportfs, make sure that all expected LUNs are visible
> and writable.
> You can verify that a device is the correct one by reading the first blocks
> of the device (for example: dd if=/dev/NSD_LUN_device bs=1M count=16 | strings).
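>
> A minimal sketch for sweeping all candidate devices at once (the
> /dev/hdisk* pattern is just an example; adjust it for your platform):
>
> # dump printable strings from the first 16 MB of each candidate device so
> # any NSD/descriptor text that is present can be spotted by eye
> for d in /dev/hdisk*; do
>     echo "=== $d ==="
>     dd if="$d" bs=1M count=16 2>/dev/null | strings | head -20
> done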
>
> Not sure where you "moved the mmsdrfs" to; you don't need to move or
> modify the mmsdrfs file on the target (disaster recovery) cluster.
>
> Just copy the one from the primary to /tmp or /var/tmp and try to run
> "mmimportfs fs_name -i copy_of_mmsdrfs" (e.g. with the copy at /tmp/mmsdrfs).
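>
> Put together, a minimal activation sequence on the DR cluster might look
> like the following sketch (fs1, the node name and the paths are
> placeholders, not values from this thread):
>
> scp primary-node:/var/mmfs/gen/mmsdrfs /tmp/mmsdrfs   # copy the cluster configuration file
> mmimportfs fs1 -i /tmp/mmsdrfs                        # import the filesystem definition
> mmfsck fs1                                            # check/repair after an unplanned failover
> mmmount fs1 -a                                        # mount on all nodes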
>
>
>
> Glen, as Harold has no IP connectivity between the clusters, "mmfsctl
> syncFSconfig" is not an option...
>
> Thanks
> --Alex
>
>
>
>
> On Fri, Jan 26, 2018 at 4:04 PM, Steve Xiao <sxiao at us.ibm.com> wrote:
>
>> When using this method of replication, you need to either issue the
>> "mmfsctl suspend" or "mmfsctl suspend-write" command before replication,
>> or set up a single consistency group for all LUNs. This is needed to
>> ensure the replica contains a consistent copy of the GPFS data.
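>>
>> For a planned replication point, the bracketing could look roughly like
>> this sketch (fs1 is a placeholder, and the storage-side step depends
>> entirely on your array):
>>
>> mmfsctl fs1 suspend    # flush buffers and hold new application I/O to fs1
>> # ...trigger the storage-level replication / split of the consistency group here...
>> mmfsctl fs1 resume     # resume normal I/O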
>>
>> Steve Y. Xiao
>>
>> gpfsug-discuss-bounces at spectrumscale.org wrote on 01/26/2018 03:21:23 PM:
>>
>> > From: gpfsug-discuss-request at spectrumscale.org
>> > To: gpfsug-discuss at spectrumscale.org
>> > Date: 01/26/2018 03:21 PM
>> > Subject: gpfsug-discuss Digest, Vol 72, Issue 69
>> > Sent by: gpfsug-discuss-bounces at spectrumscale.org
>> >
>> > Today's Topics:
>> >
>> >    1. Re: storage-based replication for Spectrum Scale (Harold Morales)
>> >    2. Re: storage-based replication for Spectrum Scale (Glen Corneau)
>> >
>> >
>> > ----------------------------------------------------------------------
>> >
>> > Message: 1
>> > Date: Fri, 26 Jan 2018 13:29:09 -0500
>> > From: Harold Morales <hmorales at optimizeit.co>
>> > To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>> > Subject: Re: [gpfsug-discuss] storage-based replication for Spectrum
>> >    Scale
>>
>> >
>> > Hi Alex, this setup seems close to what I am trying to achieve.
>> >
>> > With regard to this kind of replication: do any prerequisites need to be
>> > met in the target environment for this to work? For example, does the
>> > disk device naming on AIX have to be the same as in the source
>> > environment? When importing the mmsdrfs file, how does Scale know which
>> > disks to assign to the cluster? By their hdisk names alone?
>> >
>> > Thanks again,
>> >
>> >
>> >
>> > 2018-01-24 2:30 GMT-05:00 Alex Levin <alevin at gmail.com>:
>> >
>> > > Hi,
>> > >
>> > > We are using a similar type of replication.
>> > > I assume site B is the cold site prepared for DR.
>> > >
>> > > The storage layer is EMC VMAX and the LUNs are replicated with SRDF.
>> > > All LUNs (NSDs) of the GPFS filesystem are in the same VMAX
>> > > replication group to ensure consistency.
>> > >
>> > > The cluster name, IP addresses and hostnames of the cluster nodes are
>> > > different at the other site - it can be a pre-configured cluster
>> > > without GPFS filesystems or with a different filesystem.
>> > > The same names and addresses shouldn't be a problem either.
>> > >
>> > > In addition to the replicated LUNs/NSDs, you need to deliver a copy of
>> > > the /var/mmfs/gen/mmsdrfs file from site A to site B.
>> > > There is no need to replicate it in real time, only after a change to
>> > > the cluster configuration.
>> > >
>> > > To activate site B, present the replicated LUNs to the nodes in the DR
>> > > cluster and run mmimportfs as "mmimportfs fs_name -i copy_of_mmsdrfs".
>> > >
>> > > Tested with multiple LUNs and filesystems under various workloads -
>> > > seems to be working.
>> > >
>> > > --Alex
>> > >
>> > >
>> > > On Wed, Jan 24, 2018 at 1:33 AM, Harold Morales <hmorales at optimizeit.co> wrote:
>> > >
>> > >> Thanks for answering.
>> > >>
>> > >> Essentially, the idea being explored is to replicate LUNs between
>> > >> identical storage hardware (HP 3PAR volumes) on both sites. There is
>> > >> an IP connection between the storage boxes but not between the
>> > >> servers at the two sites; there is a dark fiber connecting both
>> > >> sites. Here they don't want to explore the idea of a Scale-based
>> > >> replication.
>> > >>
>> > >>
>> >
>> > ------------------------------
>> >
>> > Message: 2
>> > Date: Fri, 26 Jan 2018 14:21:15 -0600
>> > From: "Glen Corneau" <gcorneau at us.ibm.com>
>> > To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>> > Subject: Re: [gpfsug-discuss] storage-based replication for Spectrum
>> >    Scale
>> >
>> > Scale will walk across all discovered disks at start time and attempt to
>> > read the NSD identifiers from the disks. Once it finds them, it builds a
>> > local map file that correlates the NSD id with the hdiskX identifier. The
>> > names do not have to be the same as on the source cluster, or even from
>> > node to node.
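>> >
>> > For example, to check the resulting mapping on a node (a sketch; output
>> > details vary by release):
>> >
>> > mmlsnsd -m    # list each NSD together with the local device name it was found on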
>> >
>> > The main thing to keep in mind is to keep the file system definitions in
>> > sync between the source and destination clusters. The "syncFSconfig" user
>> > exit is the best way to do it because it's automatic. You generally
>> > shouldn't be shuffling the mmsdrfs file between sites; that's what
>> > "mmfsctl syncFSconfig" does for you, on a per-file-system basis.
>> >
>> > GPFS+AIX customers have been using this kind of storage replication for
>> > over 10 years; it's business as usual.
>> >
>> > ------------------
>> > Glen Corneau
>> > Power Systems
>> > Washington Systems Center
>> > gcorneau at us.ibm.com
>> >
>> > ------------------------------
>> >
>> >
>> >
>> > End of gpfsug-discuss Digest, Vol 72, Issue 69
>> > **********************************************
>> >
>>
>>
>>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>


More information about the gpfsug-discuss mailing list