[gpfsug-discuss] Preferred NSD

Stephen Ulmer ulmer at ulmer.org
Wed Mar 14 18:59:29 GMT 2018


Depending on the size... I just quoted something both ways and DME (which is Advanced Edition equivalent) was about $400K cheaper than Standard Edition socket pricing for this particular customer and use case. It all depends.

Also, for the case where the OP wants to distribute the file system around on NVMe in *every* node, there is always the FPO license. The FPO license can share NSDs with other FPO licensed nodes and servers (just not clients).
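
For anyone sizing this, here is a rough sketch (not a tested configuration) of what "an NSD on every NVMe in every node" looks like as an NSD stanza file, plus the capacity arithmetic for the figures quoted further down the thread (~60 nodes, two NVMe drives each, about 2TB per drive). The node names (node01...), device paths (/dev/nvme0n1...), NSD names, and the one-failure-group-per-node layout are all assumptions made for illustration; check the mmcrnsd documentation for your release before using anything like this.

```python
# Sketch only: node names, device paths, and NSD names are hypothetical.
NODES = 60            # "~60 nodes" from the thread
DEVICES_PER_NODE = 2  # two NVMe SSDs per node
DEVICE_TB = 2         # "each about 2TB"


def nsd_stanzas(nodes=NODES, devices_per_node=DEVICES_PER_NODE):
    """Return stanza text with one %nsd entry per local NVMe device."""
    stanzas = []
    for n in range(1, nodes + 1):
        for d in range(devices_per_node):
            stanzas.append(
                "%nsd:\n"
                f"  device=/dev/nvme{d}n1\n"      # assumed device path
                f"  nsd=node{n:02d}_nvme{d}\n"    # assumed NSD naming scheme
                f"  servers=node{n:02d}\n"        # the node that owns the drive
                "  usage=dataAndMetadata\n"
                f"  failureGroup={n}\n"           # assumption: one group per node
            )
    return "\n".join(stanzas)


def usable_tb(replicas, nodes=NODES, devices_per_node=DEVICES_PER_NODE,
              device_tb=DEVICE_TB):
    """Raw capacity divided by the number of copies kept of each block."""
    return nodes * devices_per_node * device_tb / replicas


if __name__ == "__main__":
    print(nsd_stanzas(nodes=2))  # print a short sample instead of all 120
    for r in (1, 2, 3):
        print(f"{r} replica(s): {usable_tb(r):.0f} TB usable")
```

With those figures, 240TB raw becomes 120TB usable at two replicas and 80TB at three, which is worth weighing against the licensing numbers.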

-- 
Stephen



> On Mar 14, 2018, at 1:33 PM, Sobey, Richard A <r.sobey at imperial.ac.uk> wrote:
> 
>>> 2. Have data management edition and capacity license the amount of storage.
> There goes the budget 😉
> 
> Richard
> 
> -----Original Message-----
> From: gpfsug-discuss-bounces at spectrumscale.org <gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Simon Thompson (IT Research Support)
> Sent: 14 March 2018 16:54
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: Re: [gpfsug-discuss] Preferred NSD
> 
> Not always true.
> 
> 1. Use them with socket licenses as HAWC or LROC is OK on a client.
> 2. Have data management edition and capacity license the amount of storage.
> 
> Simon
> ________________________________________
> From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Jeffrey R. Lang [JRLang at uwyo.edu]
> Sent: 14 March 2018 14:11
> To: gpfsug main discussion list
> Subject: Re: [gpfsug-discuss] Preferred NSD
> 
> Something I haven't heard in this discussion is the licensing of GPFS.
> 
> I believe that once you export disks from a node it then becomes a server node and the license may need to be changed, from client to server.  There goes the budget.
> 
> 
> 
> -----Original Message-----
> From: gpfsug-discuss-bounces at spectrumscale.org <gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Lukas Hejtmanek
> Sent: Wednesday, March 14, 2018 4:28 AM
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: Re: [gpfsug-discuss] Preferred NSD
> 
> Hello,
> 
> thank you for the insight. Well, the point is that I will get ~60 nodes with 120 NVMe disks in them, each about 2TB in size. It means that I will have 240TB of NVMe SSD that could make a nice shared scratch. Moreover, I have no other HW or place to put these SSDs into. They have to be in the compute nodes.
> 
> On Tue, Mar 13, 2018 at 10:48:21AM -0700, Alex Chekholko wrote:
>> I would like to discourage you from building a large distributed 
>> clustered filesystem made of many unreliable components.  You will 
>> need to overprovision your interconnect and will also spend a lot of 
>> time in "healing" or "degraded" state.
>> 
>> It is typically cheaper to centralize the storage into a subset of 
>> nodes and configure those to be more highly available.  E.g. of your
>> 60 nodes, take 8 and put all the storage into those and make that a 
>> dedicated GPFS cluster with no compute jobs on those nodes.  Again, 
>> you'll still need really beefy and reliable interconnect to make this work.
>> 
>> Stepping back; what is the actual problem you're trying to solve?  I 
>> have certainly been in that situation before, where the problem is 
>> more like: "I have a fixed hardware configuration that I can't change, 
>> and I want to try to shoehorn a parallel filesystem onto that."
>> 
>> I would recommend looking closer at your actual workloads.  If this is 
>> a "scratch" filesystem and file access is mostly from one node at a 
>> time, it's not very useful to make two additional copies of that data 
>> on other nodes, and it will only slow you down.
>> 
>> Regards,
>> Alex
>> 
>> On Tue, Mar 13, 2018 at 7:16 AM, Lukas Hejtmanek <xhejtman at ics.muni.cz> wrote:
>> 
>>> On Tue, Mar 13, 2018 at 10:37:43AM +0000, John Hearns wrote:
>>>> Lukas,
>>>> It looks like you are proposing a setup which uses your compute servers
>>>> as storage servers also?
>>> 
>>> yes, exactly. I would like to utilise the NVMe SSDs that are in every 
>>> compute server. Using them as a shared scratch area with GPFS is 
>>> one of the options.
>>> 
>>>> 
>>>>  *   I'm thinking about the following setup:
>>>> ~ 60 nodes, each with two enterprise NVMe SSDs, FDR IB 
>>>> interconnected
>>>> 
>>>> There is nothing wrong with this concept, for instance see 
>>>> https://www.beegfs.io/wiki/BeeOND
>>>> 
>>>> I have an NVMe filesystem which uses 60 drives, but there are 10 servers.
>>>> You should look at "failure zones" also.
>>> 
>>> so you still need the storage servers, and the local SSDs are used only 
>>> for caching, do I understand correctly?
>>> 
>>>> 
>>>> From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]
>>>> Sent: Monday, March 12, 2018 4:14 PM
>>>> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>>>> Subject: Re: [gpfsug-discuss] Preferred NSD
>>>> 
>>>> Hi Lukas,
>>>> 
>>>> Check out FPO mode. That mimics Hadoop's data placement features. You
>>>> can have up to 3 replicas of both data and metadata, but the downside,
>>>> as you say, is that the wrong node failures will still take your
>>>> cluster down.
>>>> 
>>>> You might want to check out something like Excelero's NVMesh (note: not
>>>> an endorsement since I can't give such things), which can create
>>>> logical volumes across all your NVMe drives. The product has erasure
>>>> coding on its roadmap. I'm not sure if they've released that
>>>> feature yet, but in theory it will give better fault tolerance *and*
>>>> you'll get more efficient usage of your SSDs.
>>>> 
>>>> I'm sure there are other ways to skin this cat too.
>>>> 
>>>> -Aaron
>>>> 
>>>> 
>>>> 
>>>> On March 12, 2018 at 10:59:35 EDT, Lukas Hejtmanek <xhejtman at ics.muni.cz> wrote:
>>>> Hello,
>>>> 
>>>> I'm thinking about the following setup:
>>>> ~ 60 nodes, each with two enterprise NVMe SSDs, FDR IB 
>>>> interconnected
>>>> 
>>>> I would like to set up a shared scratch area using GPFS and those NVMe
>>>> SSDs, with each SSD as one NSD.
>>>> 
>>>> I don't think 5 or more data/metadata replicas are practical here. On
>>>> the other hand, multiple node failures are really to be expected.
>>>> 
>>>> Is there a way to specify that the local NSD is strongly preferred for
>>>> storing data, i.e. so that a node failure most probably does not result
>>>> in unavailable data for the other nodes?
>>>> 
>>>> Or is there any other recommendation/solution for building shared
>>>> scratch with GPFS in such a setup? ("Do not do it" included.)
>>>> 
>>>> --
>>>> Lukáš Hejtmánek
>>>> _______________________________________________
>>>> gpfsug-discuss mailing list
>>>> gpfsug-discuss at spectrumscale.org <http://spectrumscale.org/> 
>>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss <http://gpfsug.org/mailman/listinfo/gpfsug-discuss>
>>> 
>>> 
>>> --
>>> Lukáš Hejtmánek
>>> 
>>> Linux Administrator only because
>>>  Full Time Multitasking Ninja
>>>  is not an official job title
> 
> 
> --
> Lukáš Hejtmánek
> 
> Linux Administrator only because
>  Full Time Multitasking Ninja
>  is not an official job title
