[gpfsug-discuss] Small cluster

Chris Scott chrisjscott at gmail.com
Tue Mar 8 18:58:29 GMT 2016


My fantasy solution is two servers and a SAS disk shelf from my adopted,
cheap x86 vendor, running IBM Spectrum Scale with GNR as software only,
doing concurrent, supported GNR and CES, with maybe an advisory on the
performance requirements of such a setup and some suggestions on scale-out
approaches :)

Cheers
Chris

On 7 March 2016 at 21:10, Mark.Bush at siriuscom.com wrote:

> Thanks Yuri, this solidifies some of the conclusions I’ve drawn from this
> conversation.  Thank you all for your responses.  This is a great forum
> filled with very knowledgeable folks.
>
> Mark
>
> From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of Yuri L
> Volobuev <volobuev at us.ibm.com>
> Reply-To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date: Monday, March 7, 2016 at 2:58 PM
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: Re: [gpfsug-discuss] Small cluster
>
> This use case is a good example of how it's hard to optimize across
> multiple criteria.
>
> If you want a pre-packaged solution that's proven and easy to manage,
> Storwize V7000 Unified is the ticket. Design-wise, it's as good a fit for
> your requirements as such things get. Price may be an issue, though, as
> usual.
>
> If you're OK with rolling your own complex solution, my recommendation
> would be to use a low-end shared (twin-tailed, via SAS or FC SAN) external
> disk solution, with 2-3 GPFS nodes accessing the disks directly, i.e. via
> the local block device interface. This avoids the pitfalls of data/metadata
> replication, and offers a decent blend of performance, fault tolerance, and
> disk management. You can use disk-based quorum if going with 2 nodes, or
> traditional node majority quorum if using 3 nodes; either way would work.
> There's no need to do any separation of roles (CES, quorum, managers, etc.),
> provided the nodes are adequately provisioned with memory and aren't
> routinely overloaded; if they are, the answer is to add more nodes rather
> than partition what you have.
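>
> For the two-node variant, a minimal sketch of the tiebreaker setup might
> look something like this (node and NSD names are placeholders, not a
> tested config):
>
>     # two quorum-manager nodes sharing the twin-tailed enclosure
>     mmcrcluster -N node1:quorum-manager,node2:quorum-manager \
>         -C smallclust -r /usr/bin/ssh -R /usr/bin/scp
>     mmcrnsd -F nsd.stanza            # NSDs on the shared shelf
>     # disk-based quorum: 1-3 of the shared NSDs act as tiebreakers
>     mmchconfig tiebreakerDisks="nsd01;nsd02;nsd03"
>     mmcrfs gpfs0 -F nsd.stanza -A yes
>
> With a third node you'd simply designate all three as quorum nodes and
> skip the tiebreakerDisks step.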
>
> Using internal disks and relying on GPFS data/metadata replication, with
> or without FPO, would mean taking the hard road. You may be able to spend
> the least on hardware in such a config (although the 33% disk utilization
> rate for triplication makes this less clear, if capacity is an issue), but
> the operational challenges are going to be substantial. This would be a
> viable config, but there are unavoidable tradeoffs caused by replication:
>
> (1) Writes are very expensive, which limits the overall cluster capability
> for non-read-only workloads.
> (2) Node and disk failures require a round of re-replication, or
> "re-protection", which takes time and bandwidth, limiting the overall
> capability further.
> (3) Disk management can be a challenge, as there's no software/hardware
> component to assist with identifying failing/failed disks.
>
> As far as not going off the beaten path, this is not it... Exporting
> protocols from a small triplicated file system is not a typical mode of
> deployment for Spectrum Scale; you'd be blazing some new trails.
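>
> To put a number on the capacity point: with default and maximum
> replication set to 3 for both data and metadata, 30 TB of raw internal
> disk nets out to roughly 10 TB usable, and every application write turns
> into three physical writes spread across failure groups. Creating such a
> file system would look something like this (stanza file name is a
> placeholder):
>
>     mmcrfs gpfs0 -F internal_nsd.stanza -m 3 -M 3 -r 3 -R 3
>
> with each node's internal disks assigned to their own failure group in
> the NSD stanzas, so the three copies land on three different nodes.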
>
> As stated already in several responses, there's no hard requirement that
> CES Protocol nodes must be entirely separate from any other roles in a
> general Spectrum Scale deployment. IBM expressly disallows co-locating
> Protocol nodes with ESS servers, due to resource consumption complications,
> but for non-ESS cases it's merely a recommendation to run Protocols on
> nodes that are not otherwise encumbered by having to provide other
> services. Of course, the config that's best for performance is not the
> cheapest. CES doesn't reboot nodes to recover from NFS problems, unlike
> cNFS (which has to, given its use of the kernel NFS stack). That said, a
> complex software stack is a complex software stack, so there's greater
> potential for things to go sideways, particularly when nodes are short on
> resources.
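>
> For completeness, turning a couple of existing NSD server nodes into
> Protocol nodes is just a matter of enabling CES on them; roughly (node
> names and addresses below are placeholders):
>
>     mmchnode --ces-enable -N node1,node2
>     mmces address add --ces-ip 10.0.0.100,10.0.0.101
>     mmces service enable NFS
>     mmces service enable SMB
>
> plus whatever authentication configuration (mmuserauth) the environment
> calls for.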
>
> FPO vs. plain replication: this only matters if you have apps that can
> exploit data locality. FPO changes the way GPFS stripes data across disks.
> Without FPO, GPFS does traditional wide striping of blocks across all disks
> in a given storage pool. With FPO, data in large files is divided into
> large (e.g. 1G) chunks, and each chunk is held in its entirety on one
> node's internal disks. An application that knows how to query the data
> block layout of a given file can then schedule a job that needs to read
> from a particular chunk on the node that holds a local copy. This makes a
> lot of sense for integrated data analytics workloads, à la MapReduce with
> Hadoop, but doesn't make sense for generic apps like Samba.
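>
> For reference, FPO is enabled per storage pool at file system creation
> time via the pool stanza; something along these lines (the values shown
> are illustrative, not a recommendation):
>
>     %pool:
>       pool=datapool
>       blockSize=1M
>       layoutMap=cluster
>       allowWriteAffinity=yes
>       writeAffinityDepth=1
>       blockGroupFactor=1024
>
> where blockGroupFactor times the block size gives the chunk size (1024 x
> 1M = 1G here) and writeAffinityDepth=1 keeps the first replica of each
> chunk on the node doing the writing.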
>
> I'm not sure what language in the FAQ creates the impression that the SAN
> deployment model is somehow incompatible with running Protocol services.
> This is perfectly fine.
>
> yuri
>
>
> From: Jan-Frode Myklebust <janfrode at tanso.net>
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>,
> Date: 03/06/2016 10:12 PM
> Subject: Re: [gpfsug-discuss] Small cluster
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------
>
>
>
> I agree, but would also normally want to stay within whatever is
> recommended.
>
> What about quorum/manager functions? Is it also OK to run these on the CES
> nodes in a 2-node cluster, or is there any reason to partition them out so
> that we end up with a 4-node cluster running on 2 physical machines?
>
>
> -jf
> On Sun, 6 Mar 2016 at 21:28, Marc A Kaplan <makaplan at us.ibm.com> wrote:
>
>    As Sven wrote, the FAQ does not "prevent" anything.  It's just a
>    recommendation someone came up with, which may or may not apply to your
>    situation.
>
>    Partitioning a server into two servers might be a good idea if you
>    really need the protection/isolation.  But I expect you would be limiting
>    the potential performance of the overall system, compared to running a
>    single Unix image with multiple processes that can share resources and
>    communicate more freely.
>