[gpfsug-discuss] Preferred NSD

Lukas Hejtmanek xhejtman at ics.muni.cz
Tue Mar 13 14:16:30 GMT 2018


On Tue, Mar 13, 2018 at 10:37:43AM +0000, John Hearns wrote:
> Lukas,
> It looks like you are proposing a setup which uses your compute servers as storage servers also?

Yes, exactly. I would like to utilise the NVMe SSDs that are in every compute
server. Using them as a shared scratch area with GPFS is one of the options.
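
For concreteness, a minimal NSD stanza sketch of what I have in mind (device
paths, node names, and failure group numbers are illustrative, not from a real
cluster):

    # stanza file for mmcrnsd: two NVMe NSDs per node, each node in its
    # own failure group so replicas land on different nodes
    %nsd: device=/dev/nvme0n1 nsd=node01_nvme0 servers=node01 usage=dataAndMetadata failureGroup=1
    %nsd: device=/dev/nvme1n1 nsd=node01_nvme1 servers=node01 usage=dataAndMetadata failureGroup=1
    %nsd: device=/dev/nvme0n1 nsd=node02_nvme0 servers=node02 usage=dataAndMetadata failureGroup=2
    %nsd: device=/dev/nvme1n1 nsd=node02_nvme1 servers=node02 usage=dataAndMetadata failureGroup=2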
 
> 
>   *   I'm thinking about the following setup:
> ~ 60 nodes, each with two enterprise NVMe SSDs, FDR IB interconnected
> 
> There is nothing wrong with this concept, for instance see
> https://www.beegfs.io/wiki/BeeOND
> 
> I have an NVMe filesystem which uses 60 drives, but there are 10 servers.
> You should look at "failure zones" also.
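> For reference, BeeOND creates a per-job BeeGFS instance on the compute
> nodes with a single command; a rough sketch along the lines of the wiki
> examples (the nodefile and paths here are placeholders):
> 
>     # create an on-demand BeeGFS across the nodes listed in the nodefile
>     beeond start -n /tmp/nodefile -d /data/beeond -c /mnt/beeond
>     # ... run the job against /mnt/beeond, then tear it down ...
>     beeond stop -n /tmp/nodefile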

Do I understand correctly that you still need dedicated storage servers, with
the local SSDs used only for caching?
 
> 
> From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]
> Sent: Monday, March 12, 2018 4:14 PM
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: Re: [gpfsug-discuss] Preferred NSD
> 
> Hi Lukas,
> 
> Check out FPO mode. It mimics Hadoop's data placement features. You can have up to 3 replicas of both data and metadata, but the downside, as you say, is that the wrong combination of node failures will take your cluster down.
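> A hedged sketch of how FPO is switched on (I'm going from memory here, so
> treat the attribute values as illustrative rather than a tuned
> recommendation):
> 
>     # pool stanza for mmcrfs: FPO-style placement for the data pool
>     %pool: pool=data blockSize=1M layoutMap=cluster allowWriteAffinity=yes writeAffinityDepth=1 blockGroupFactor=128
> 
> With writeAffinityDepth=1 the first replica is written on the node doing the
> writing, which is what makes the local NSD the strongly preferred target.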
> 
> You might want to check out something like Excelero's NVMesh (note: not an endorsement, since I can't give such things), which can create logical volumes across all your NVMe drives. The product has erasure coding on its roadmap. I'm not sure whether they've released that feature yet, but in theory it would give better fault tolerance *and* more efficient usage of your SSDs.
> 
> I'm sure there are other ways to skin this cat too.
> 
> -Aaron
> 
> 
> 
> On March 12, 2018 at 10:59:35 EDT, Lukas Hejtmanek <xhejtman at ics.muni.cz> wrote:
> Hello,
> 
> I'm thinking about the following setup:
> ~ 60 nodes, each with two enterprise NVMe SSDs, FDR IB interconnected
> 
> I would like to set up a shared scratch area using GPFS and those NVMe SSDs,
> each SSD as one NSD.
> 
> I don't think 5 or more data/metadata replicas are practical here. On the
> other hand, multiple node failures are very much to be expected.
> 
> Is there a way to arrange that the local NSD is strongly preferred for
> storing data, i.e. so that a node failure most probably does not make data
> unavailable to the other nodes?
> 
> Or is there any other recommendation/solution for building shared scratch
> with GPFS in such a setup? (Including "do not do it.")
> 
> --
> Lukáš Hejtmánek
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss



-- 
Lukáš Hejtmánek

Linux Administrator only because
  Full Time Multitasking Ninja 
  is not an official job title


