[gpfsug-discuss] Services on DSS/ESS nodes
Jonathan Buzzard
jonathan.buzzard at strath.ac.uk
Sat Oct 3 11:06:41 BST 2020
On 02/10/2020 23:19, Andrew Beattie wrote:
> Jonathan,
> I suggest you get a formal statement from Lenovo as the DSS-G Platform
> is no longer an IBM platform.
>
> But for ESS based platforms the answer would be, it is not supported to
> run anything on the IO Servers other than GNR and the relevant Scale
> management services, due to the fact that if you lose an IO Server, or
> if you are in an extended maintenance window, the remaining server needs
> to host all the work that would be being performed by both IO servers.
>
In the past ~500 days the Infiniband to Ethernet gateway has shifted
~13GB of data, or about 25MB a day. Meanwhile in the last 470 days the
DSS-G nodes have each shifted several PB. The proposed additional
traffic is a drop in the ocean.
On my actual routers, which shift much more data (over 300TB externally)
with an uptime of ~180 days at the moment, the CPU time consumed by
keepalived is just under 31 minutes, or about 10 seconds a day. These are
much punier CPUs too. The proposed additional CPU usage is another drop
in the ocean.
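
For anyone wanting to check the per-day rates, here is a rough sketch of
the arithmetic using the approximate figures quoted above. The helper name
per_day and the use of decimal units (1GB = 1000MB) are my own assumptions
for illustration, and the inputs are only as precise as the "~" values.

```python
def per_day(total, days):
    """Average amount per day over the given number of days."""
    return total / days

# Approximate figures from the text; decimal units are an assumption.
gateway_mb_per_day = per_day(13 * 1000, 500)   # ~13GB over ~500 days, in MB
keepalived_s_per_day = per_day(31 * 60, 180)   # ~31 min CPU over ~180 days, in s

print(f"gateway traffic: {gateway_mb_per_day:.0f} MB/day")
print(f"keepalived CPU:  {keepalived_s_per_day:.1f} s/day")
```

Either way the numbers are tiny next to what the DSS-G nodes already do.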
Given Lenovo sold the *same* configuration with x3650s and SR650s, the
"need all the CPU grunt" argument is somewhat fishy. Between the bid being
submitted and the actual tender award the SR650s came out, and we paid a
bit extra to uplift to the newer server hardware with exactly the same
disk configuration. I believe IBM have done the same with the ESS/GNR
servers over time, so the same applies there too.
IMHO given keepalived is a base RHEL package, IBM/Lenovo should be
offering running Infiniband to Ethernet gateways on the DSS/ESS nodes as
a supported configuration for mixed network technology clusters :-)
Running a couple extra servers for this purpose is obnoxious from an
environmental standpoint. That's IBM's green credentials out the window
if you ask me.
I would note under those rules running a Nagios, Zabbix etc. client on
the nodes is not permitted either. I would suggest that most sites would
be rather unhappy about that :-)
> I don't know if Lenovo have a different point of view.
>
Problem is when I ring up for support on my DSS-G I speak to an IBM
employee not a Lenovo one :-)
JAB.
--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG