[gpfsug-discuss] Small cluster

Sven Oehme oehmes at us.ibm.com
Fri Mar 4 18:03:16 GMT 2016


Hi,

a couple of comments on the various points in this thread.

1. The need to run CES on separate nodes is a recommendation, not a
requirement. The recommendation comes from the fact that heavily loaded
NAS traffic that brings the system to its knees can take your NSD service
down with it if both run on the same box. So as long as you have
reasonable performance expectations and size the system correctly, there
is no issue (a rough sizing check is sketched below).
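
To make "size the system correctly" a bit more concrete, here is a minimal
back-of-the-envelope check in Python. All the per-node numbers are
illustrative assumptions, not measurements from any real system:

# Can one box carry both CES (SMB/NFS) and NSD service without one
# starving the other? Substitute your own measured numbers.
node_net_bw_mb_s = 1200   # assumed usable network bandwidth per node
expected_nas_mb_s = 300   # assumed peak SMB/NFS traffic on this node
expected_nsd_mb_s = 400   # assumed peak NSD (block) traffic it serves
headroom = 0.7            # keep ~30% spare so NAS spikes can't starve NSD I/O

total = expected_nas_mb_s + expected_nsd_mb_s
budget = node_net_bw_mb_s * headroom
if total <= budget:
    print("co-locating CES and NSD looks ok: %d MB/s <= %d MB/s" % (total, budget))
else:
    print("co-locating is risky here; use separate or bigger nodes")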

2. Shared vs. FPO vs. shared nothing (just replication). The main issue
people overlook in this scenario is the absence of read/write caches in
FPO or shared-nothing configurations. Every physical disk drive can only
do ~100 IOPS, and that is largely independent of whether the I/O size is
1 byte or 1 megabyte; it's pretty much the same effort for the drive.
Particularly on metadata this bites you really badly, as every one of
these tiny I/Os eats one of the ~100 IOPS a disk can do, and you quickly
use up all the IOPS on your drives. If you have any form of RAID
controller (SW or HW), it typically implements at minimum a read cache,
and on most systems a read/write cache, which significantly increases the
number of logical I/Os you can do against a disk. My favorite example: a
workload doing 4k sequential DIO writes to a single disk gets about
400 KB/sec with no RAID controller; with a reasonably good write cache in
front of the disk you can do 50 times that. So especially if you use
snapshots, CES services, or anything that is metadata intensive, you want
some type of RAID protection with caching. BTW, replication in the file
system makes this even worse, since each write now turns into 3 I/Os for
the data plus additional I/Os for the log records, so you eat up your
IOPS very quickly (the sketch below walks through the numbers).
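
The arithmetic behind this is worth writing down once. A small Python
sketch using the rough figures from above (the ~100 IOPS per spindle and
the 50x cache factor are ballpark assumptions, not benchmarks):

# Why small I/Os and replication eat the IOPS budget.
disk_iops = 100                    # roughly what one spinning disk sustains
io_size_kb = 4                     # 4k sequential direct I/O writes

# With no RAID/write cache, throughput is capped by the IOPS limit:
raw_kb_s = disk_iops * io_size_kb  # 100 * 4 KB = ~400 KB/s per disk
print("no cache: ~%d KB/s per disk" % raw_kb_s)

# A decent write cache coalesces those 4k writes into larger ones,
# giving on the order of 50x more:
print("with write cache: ~%d MB/s per disk" % (raw_kb_s * 50 / 1024))

# Replication in the filesystem amplifies the cost further: each logical
# write becomes 3 data I/Os (3-way replication) plus extra log-record I/Os.
data_replicas = 3
log_ios = 1                        # illustrative; real log overhead varies
print("each small write costs ~%d physical I/Os" % (data_replicas + log_ios))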

3. Instead of a shared SAN, a shared SAS enclosure is significantly
cheaper but only scales to 2-4 nodes. The benefit is that you only need 2
nodes instead of 3, because you can use the shared disks as tiebreaker
disks (see the quorum sketch below). If you also add some SSDs for the
metadata and make use of HAWC and LROC, you might get away without a RAID
controller with cache, as HAWC will solve that issue for you.
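
The "2 nodes instead of 3" point comes from the tiebreaker-disk quorum
rule. Here is a tiny Python sketch of that decision logic (my simplified
model of it, not actual GPFS code):

# With tiebreaker disks configured, quorum holds as long as at least one
# quorum node is up and it can reach a majority of the tiebreaker disks.
def has_quorum(quorum_nodes_up, tiebreaker_reachable, tiebreaker_total):
    if quorum_nodes_up < 1:
        return False
    return tiebreaker_reachable > tiebreaker_total // 2

# Two nodes sharing three SAS tiebreaker disks, one node fails:
print(has_quorum(1, 3, 3))  # True -> the surviving node keeps the cluster up

# Without shared tiebreaker disks you fall back to node majority, which is
# why a purely replicated (shared-nothing) setup wants a third quorum node.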

just a few thoughts :-D

sven


------------------------------------------
Sven Oehme
Scalable Storage Research
email: oehmes at us.ibm.com
Phone: +1 (408) 824-8904
IBM Almaden Research Lab
------------------------------------------



From:	Zachary Giles <zgiles at gmail.com>
To:	gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:	03/04/2016 05:36 PM
Subject:	Re: [gpfsug-discuss] Small cluster
Sent by:	gpfsug-discuss-bounces at spectrumscale.org



SMB too, eh? See, this is where it starts to get hard to scale down. You
could do a 3-node GPFS cluster with replication at the remote sites,
pulling in from AFM over the net. If you want SMB too, you're probably
going to need another pair of servers to act as the Protocol Servers on
top of the 3 GPFS servers. I think running them all together is not
recommended, and I'd probably agree with that.
Though you could do it anyway. If it's read-only and updated daily, eh,
who cares. Again, it depends on your GPFS experience and the balance
between production, price, and performance :)

On Fri, Mar 4, 2016 at 11:30 AM, Mark.Bush at siriuscom.com <
Mark.Bush at siriuscom.com> wrote:
  Yes. Really the only other option we have (and not a bad one) is getting
  a V7000 Unified in there (if we can get the price down far enough).
  That's not a bad option, since all they really want is SMB shares at the
  remote site. I just keep thinking a set of servers would do the trick
  and be cheaper.



  From: Zachary Giles <zgiles at gmail.com>
  Reply-To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
  Date: Friday, March 4, 2016 at 10:26 AM

  To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
  Subject: Re: [gpfsug-discuss] Small cluster

  You can do FPO for non-Hadoop workloads. It just alters how the disks
  are handled below the GPFS filesystem layer and looks like a normal GPFS
  system (mostly). I do think there were some restrictions on non-FPO
  nodes mounting FPO filesystems via multi-cluster... not sure if those
  are still there... any input on that from IBM?

  If the data is small enough, it might just be wise to use internal
  storage with 3-way replication. A 36TB 2U server is ~$10K (just throwing
  out common numbers), so 3 of those per site would fit in your budget.
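
  (A quick check of those numbers -- a throwaway Python calculation that
  treats the figures above as assumptions:)

servers_per_site = 3
raw_tb_per_server = 36
cost_per_server_usd = 10000
replicas = 3                # 3-way GPFS replication
usable_tb = servers_per_site * raw_tb_per_server / replicas
total_cost_usd = servers_per_site * cost_per_server_usd
print(usable_tb, total_cost_usd)  # 36.0 TB usable, 30000 -- over 20TB, under $50K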

  Again... it depends on your requirements, the balance between 'science
  experiment' and production, GPFS knowledge level, etc., etc.

  This is actually an interesting and somewhat underserved space for small
  enterprises. If you just want 10-20TB active-active online everywhere,
  say for VMware, NFS, or something else, there aren't all that many good
  solutions today that scale down far enough at a decent price. It's easy
  with many, many PB, but small... I don't know. I think the above sounds
  as good as anything without going SAN-crazy.



  On Fri, Mar 4, 2016 at 11:21 AM, Mark.Bush at siriuscom.com <
  Mark.Bush at siriuscom.com> wrote:
   I guess this is really my question. The budget is less than $50k per
   site and they need around 20TB of storage. Two nodes with an MD3 or
   something may work. But could it work (and be successful) with just
   servers and internal drives? Should I do FPO for non-Hadoop-like
   workloads? I didn't think I could get native RAID except in the ESS
   (GSS no longer exists, if I remember correctly). Do I just make
   replicas and call it good?


   Mark

   From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of Marc A
   Kaplan <makaplan at us.ibm.com>
   Reply-To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
   Date: Friday, March 4, 2016 at 10:09 AM
   To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
   Subject: Re: [gpfsug-discuss] Small cluster

   Jon, I don't doubt your experience, but it's not quite fair or even
   sensible to make a decision today based on what was available in the
   GPFS 2.3 era.

   We are now at GPFS 4.2, with support for 3-way replication and FPO. We
   also have RAID controllers, IB, "Native Raid", the ESS and GSS
   solutions, and more.

   So there are more choices and more options, which makes finding an
   "optimal" solution more difficult.

   To begin with, as with any provisioning problem, one should try to
   state: requirements, goals, budgets, constraints, failure/tolerance
   models/assumptions, expected workloads, desired performance, etc.









  --
  Zach Giles
  zgiles at gmail.com





--
Zach Giles
zgiles at gmail.com
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

