[gpfsug-discuss] NSD in Two Site scenario

Vic Cornell viccornell at gmail.com
Thu Jul 21 14:02:02 BST 2016


The annoying answer is "it depends".

I ran a system with all of the NSDs visible to all of the NSD servers on both sites, and that worked well.

However there are lots of questions to answer:

	Where are the clients going to live? 
	Will you have clients in both sites or just one? 
	Is it dual site working or just DR?
	Where will the majority of the writes happen? 
	Would you rather that traffic went over the SAN or the IP link?
	Do you have a SAN link between the 2 sites?
	Which is faster, the SAN link between sites or the IP link between the sites? 
	Are they the same link? Are they both redundant? Which is the more stable?

The answers to these questions would drive the design of the GPFS filesystem.

For example, if there are clients only on site A, you might make the NSD servers on site A the primary NSD servers for all of the NSDs on both site A and site B - then the replica blocks are sent over the SAN.
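
For illustration, here is a minimal NSD stanza sketch of that layout (the node names, device paths and NSD names are made up for this example). The order of the servers list sets the preferred NSD server for each NSD, so listing the site A servers first makes them primary even for the site B LUN, and the failure groups keep the two replicas on different sites:

    %nsd: device=/dev/mapper/siteA_lun1
      nsd=nsd_siteA_1
      servers=siteA-nsd1,siteA-nsd2,siteB-nsd1,siteB-nsd2
      usage=dataAndMetadata
      failureGroup=1

    %nsd: device=/dev/mapper/siteB_lun1
      nsd=nsd_siteB_1
      servers=siteA-nsd1,siteA-nsd2,siteB-nsd1,siteB-nsd2
      usage=dataAndMetadata
      failureGroup=2

You would feed a file like that to mmcrnsd -F, then create the filesystem with default replication of 2 (mmcrfs ... -m 2 -r 2) so that every block has one copy in failure group 1 (site A) and one in failure group 2 (site B).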

You could also make a matrix of the failure scenarios you want to protect against, the consequences of each failure, the likelihood of each failure, etc. That will also inform the design.
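
A sketch of what such a matrix might look like (the scenarios, consequences and mitigations below are only examples, not a recommendation):

    Failure scenario          Consequence                                Possible mitigation
    One NSD server lost       I/O fails over to the next listed server   More than one NSD server per NSD
    Inter-site IP link lost   The side without quorum stops serving      Quorum node at a third site
    Whole site lost           Data must be served from the other site    Replication of 2 across failure groups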

Does that help?

Vic

> On 21 Jul 2016, at 1:45 pm, Mark.Bush at siriuscom.com wrote:
> 
> This is where my confusion sits.  So if I have two sites, and two NSD nodes per site with 1 NSD (to keep it simple), do I just present the physical LUN in Site1 to the Site1 NSD nodes and the physical LUN in Site2 to the Site2 NSD nodes?  Or do I present the physical LUN in Site1 to all 4 NSD nodes, and the same at Site2?  (Assuming SAN and not direct-attached in this case.)  I know I'm being persistent, but for some reason this confuses me.
>  
> Site1
> NSD Node1
>                                 ---NSD1 ---Physical LUN1 from SAN1
> NSD Node2
>  
>  
> Site2
> NSD Node3
>                                 ---NSD2 ---Physical LUN2 from SAN2
> NSD Node4
>  
>  
> Or 
>  
>  
> Site1
> NSD Node1
>                                 ---NSD1 ---Physical LUN1 from SAN1
>                                 ---NSD2 ---Physical LUN2 from SAN2
> NSD Node2
>  
> Site 2
> NSD Node3
>                                 ---NSD2 ---Physical LUN2 from SAN2
>                                 ---NSD1 ---Physical LUN1 from SAN1
> NSD Node4
>  
>  
> Site 3
> Node5 Quorum
>  
>  
>  
> From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of Ken Hill <kenh at us.ibm.com>
> Reply-To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date: Wednesday, July 20, 2016 at 7:02 PM
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: Re: [gpfsug-discuss] NSD in Two Site scenario
>  
> Yes - it is a cluster.
> 
> The sites should NOT be farther apart than a MAN or campus network. If you're looking to do this over a larger distance, it would be best to choose another GPFS solution (Multi-Cluster, AFM, etc.).
> 
> Regards,
> 
> Ken Hill
> Technical Sales Specialist | Software Defined Solution Sales
> IBM Systems
> Phone:1-540-207-7270
> E-mail: kenh at us.ibm.com
> 
> 2300 Dulles Station Blvd
> Herndon, VA 20171-6133
> United States
> 
> 
> 
> 
> 
> 
> From:        "Mark.Bush at siriuscom.com" <Mark.Bush at siriuscom.com>
> To:        gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date:        07/20/2016 07:33 PM
> Subject:        Re: [gpfsug-discuss] NSD in Two Site scenario
> Sent by:        gpfsug-discuss-bounces at spectrumscale.org
> 
> 
> 
> So in this scenario, Ken, can Server3 see any disks in Site1?
>  
> From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of Ken Hill <kenh at us.ibm.com>
> Reply-To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date: Wednesday, July 20, 2016 at 4:15 PM
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: Re: [gpfsug-discuss] NSD in Two Site scenario
>  
> 
>                                  Site1                                            Site2
>                                  Server1 (quorum 1)                      Server3 (quorum 2)
>                                  Server2                                       Server4
> 
> 
> 
> 
>                                  SiteX 
>                                  Server5 (quorum 3)
> 
> 
> 
> 
> You need to set up another site (or server) that is at least power-isolated (if not completely infrastructure-isolated) from Site1 and Site2. You would then set up a quorum node at that site or location. This ensures you can still access your data even if one of your sites goes down.
> 
> You can further protect against failures by increasing the number of quorum nodes (keeping an odd number).
> 
> The way quorum works is: a majority of the quorum nodes must be up for the cluster to survive an outage.
> 
> - With 3 quorum nodes you can have 1 quorum node failure and continue filesystem operations.
> - With 5 quorum nodes you can have 2 quorum node failures and continue filesystem operations.
> - With 7 quorum nodes you can have 3 quorum node failures and continue filesystem operations.
> - etc
> 
> Please see http://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/ibmspectrumscale42_content.html?view=kc for more information about quorum and tiebreaker disks.
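> 
> As a minimal sketch, using Server5 from the diagram above as the third-site node, the standard commands to add it to the cluster and designate it as a quorum node would be along these lines:
> 
>     # add the third-site node to the existing cluster
>     mmaddnode -N server5
> 
>     # designate it as a quorum node
>     mmchnode --quorum -N server5
> 
>     # verify the quorum designations for the cluster
>     mmlscluster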
> 
> Ken Hill
> Technical Sales Specialist | Software Defined Solution Sales
> IBM Systems
> Phone:1-540-207-7270
> E-mail: kenh at us.ibm.com
> 
> 2300 Dulles Station Blvd
> Herndon, VA 20171-6133
> United States
> 
> 
> 
> 
> 
> 
> From:        "Mark.Bush at siriuscom.com" <Mark.Bush at siriuscom.com>
> To:        gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date:        07/20/2016 04:47 PM
> Subject:        [gpfsug-discuss] NSD in Two Site scenario
> Sent by:        gpfsug-discuss-bounces at spectrumscale.org
> 
> 
> 
> 
> For some reason this concept is a round peg that doesn't fit the square hole inside my brain.  Can someone please explain the best practice for setting up two sites in the same cluster?  I get that I would likely have two NSD nodes in site 1 and two NSD nodes in site 2.  What I don't understand are the failure scenarios and what would happen if I lose one node or, worse, a whole site goes down.  Do I solve this by having Scale replication set to 2 for all my files?  I mean, a single site I think I get; it's the two-datacenter case, where I typically don't want two clusters, that confuses me.
> 
> 
> 
> Mark R. Bush| Solutions Architect
> Mobile: 210.237.8415 | mark.bush at siriuscom.com
> Sirius Computer Solutions | www.siriuscom.com
> 10100 Reunion Place, Suite 500, San Antonio, TX 78216 
>   
> This message (including any attachments) is intended only for the use of the individual or entity to which it is addressed and may contain information that is non-public, proprietary, privileged, confidential, and exempt from disclosure under applicable law. If you are not the intended recipient, you are hereby notified that any use, dissemination, distribution, or copying of this communication is strictly prohibited. This message may be viewed by parties at Sirius Computer Solutions other than those named in the message header. This message does not contain an official representation of Sirius Computer Solutions. If you have received this communication in error, notify Sirius Computer Solutions immediately and (i) destroy this message if a facsimile or (ii) delete this message immediately if this is an electronic communication. Thank you.
> 
>  
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss


