[gpfsug-discuss] Services on DSS/ESS nodes

Luis Bolinches luis.bolinches at fi.ibm.com
Sat Oct 3 12:19:36 BST 2020

Are you mixing ESS and DSS in the same cluster, or are you only running DSS?


Mixing DSS and ESS in the same cluster is not a supported configuration.

You really need to talk with Lenovo, as it is your vendor. Whether support in your region is delivered by an IBMer or not is not relevant. High enough in the chain, a case will always end up at IBM in any region, as GNR has been IBM tech for 17 years (yes, 17). So if the problem is weird enough, even in regions where Lenovo might do third level support, it might end up with development and/or research. But that is an agreement between Lenovo and IBM, not between you and IBM.

So please get the support statement from Lenovo about this, and please share it if you want to and can, so we all learn their position.



> On 3. Oct 2020, at 13.55, Andrew Beattie <abeattie at au1.ibm.com> wrote:
> Why do you need to run any kind of monitoring client on an IO server? The GUI / performance monitor already does all of that work for you and collects the data on the dedicated EMS server.
> If you have a small storage environment then yes, the processor and memory may feel like overkill, but tuned appropriately an IO server will use all the memory you can give it to drive IO performance.
> If you want to run a hybrid / non-standard architecture then the IBM ESS / DSS-G platform may not be the right platform in comparison to a build-your-own architecture; however, you then take all the support issues onto yourself rather than them being the vendor's problem.
> > On 3 Oct 2020, at 20:06, Jonathan Buzzard <jonathan.buzzard at strath.ac.uk> wrote:
> > 
> > On 02/10/2020 23:19, Andrew Beattie wrote:
> >> Jonathan,
> >> I suggest you get a formal statement from Lenovo as the DSS-G Platform 
> >> is no longer an IBM platform.
> >> 
> >> But for ESS based platforms the answer would be: it is not supported to 
> >> run anything on the IO Servers other than GNR and the relevant Scale 
> >> management services, because if you lose an IO Server, or if you are 
> >> in an extended maintenance window, the remaining server needs to host 
> >> all the work that would otherwise be performed by both IO servers.
> >> 
> > 
> > In the past ~500 days the Infiniband to Ethernet gateway has shifted 
> > ~13GB of data, or about 25MB a day. Meanwhile in the last 470 days the 
> > DSS-G nodes have each shifted several PB. The proposed additional 
> > traffic is a drop in the ocean.
> > 
> > On my actual routers which shift much more data (over 300TB externally) 
> > with an uptime of ~180 days at the moment the CPU time consumed by 
> > keepalived is just under 31 minutes or about 8 seconds a day. These are 
> > much punier CPUs too. The proposed additional CPU usage is another drop 
> > in the ocean.
> > 
> > Given Lenovo sold the *same* configuration with x3650s and SR650s, the 
> > "need all the CPU grunt" argument is somewhat fishy. Between the bid being 
> > submitted and the actual tender award the SR650s came out, and we paid a bit 
> > extra to uplift to the newer server hardware with exactly the same disk 
> > configuration. I believe IBM have done the same with the ESS/GNR servers 
> > over time, so the same applies there too.
> > 
> > IMHO given keepalived is a base RHEL package, IBM/Lenovo should be 
> > offering running Infiniband to Ethernet gateways on the DSS/ESS nodes as 
> > a supported configuration for mixed network technology clusters :-)
> > 
> > Running a couple extra servers for this purpose is obnoxious from an 
> > environmental standpoint. That's IBM's green credentials out the window 
> > if you ask me.
> > 
> > I would note under those rules running a Nagios, Zabbix etc. client on 
> > the nodes is not permitted either. I would suggest that most sites would 
> > be rather unhappy about that :-)
> > 
> > 
> >> I don't know if Lenovo have a different point of view.
> >> 
> > 
> > Problem is when I ring up for support on my DSS-G I speak to an IBM 
> > employee not a Lenovo one :-)
> > 
> > 
> > JAB.
> > 
> > -- 
> > Jonathan A. Buzzard Tel: +44141-5483420
> > HPC System Administrator, ARCHIE-WeSt.
> > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss 
> > 

Unless stated otherwise above:
Oy IBM Finland Ab
PL 265, 00101 Helsinki, Finland
Business ID, Y-tunnus: 0195876-3 
Registered in Finland
