[gpfsug-discuss] Services on DSS/ESS nodes

Andrew Beattie abeattie at au1.ibm.com
Sat Oct 3 11:55:05 BST 2020



Why do you need to run any kind of monitoring client on an IO server? The
GUI / performance monitor already does all of that work for you and
collects the data on the dedicated EMS server.

If you have a small storage environment then yes, the processor and memory
may feel like overkill, but tuned appropriately an IO server will use all
the memory you can give it to drive IO performance.

If you want to run a hybrid / non-standard architecture then the IBM ESS /
DSS-G platform may not be the right platform in comparison to a
build-your-own architecture. However, you then take all the support issues
onto yourself rather than it being the vendor's problem.


> On 3 Oct 2020, at 20:06, Jonathan Buzzard <jonathan.buzzard at strath.ac.uk> wrote:
>
> On 02/10/2020 23:19, Andrew Beattie wrote:
>> Jonathan,
>> I suggest you get a formal statement from Lenovo as the DSS-G Platform
>> is no longer an IBM platform.
>>
>> But for ESS based platforms the answer would be that it is not supported
>> to run anything on the IO Servers other than GNR and the relevant Scale
>> management services, because if you lose an IO Server, or if you are in
>> an extended maintenance window, the remaining Server needs to host all
>> the work that would otherwise be performed by both IO Servers.
>>
>
> In the past ~500 days the Infiniband to Ethernet gateway has shifted
> ~13GB of data, or about 25MB a day. Meanwhile in the last 470 days the
> DSS-G nodes have each shifted several PB. The proposed additional
> traffic is a drop in the ocean.
>
> On my actual routers which shift much more data (over 300TB externally)
> with an uptime of ~180 days at the moment the CPU time consumed by
> keepalived is just under 31 minutes or about 8 seconds a day. These are
> much punier CPUs too. The proposed additional CPU usage is another drop
> in the ocean.
>
> Given Lenovo sold the *same* configuration with x3650s and SR650s, the
> "need all the CPU grunt" argument is somewhat fishy. Between the bid being
> submitted and actual tender award the SR650s came out and we paid a bit
> extra to uplift to the newer server hardware with exactly the same disk
> configuration. I believe IBM have done the same with the ESS/GNR servers
> over time, so the same applies there too.
>
> IMHO given keepalived is a base RHEL package, IBM/Lenovo should be
> offering running Infiniband to Ethernet gateways on the DSS/ESS nodes as
> a supported configuration for mixed network technology clusters :-)
>
> Running a couple extra servers for this purpose is obnoxious from an
> environmental standpoint. That's IBM's green credentials out the window
> if you ask me.
>
> I would note under those rules running a Nagios, Zabbix etc. client on
> the nodes is not permitted either. I would suggest that most sites would
> be rather unhappy about that :-)
>
>
>> I don't know if Lenovo have a different point of view.
>>
>
> Problem is when I ring up for support on my DSS-G I speak to an IBM
> employee not a Lenovo one :-)
>
>
> JAB.
>
> --
> Jonathan A. Buzzard                         Tel: +44141-5483420
> HPC System Administrator, ARCHIE-WeSt.
> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG