<div dir="ltr">Hi,<div>This is a configuration issue. We had exactly the same problem in the past, and since we changed the following parameters the CES nodes have no longer been killed.</div><div><br></div><div><p style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;line-height:1.5em;font-family:"IBM Plex Sans","Helvetica Neue",Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;font-size:12px;vertical-align:baseline;letter-spacing:0px;color:rgb(22,22,22)"><span style="box-sizing:border-box;font-weight:600;margin:0px;padding:0px;border:0px;font-style:inherit;font-variant:inherit;font-stretch:inherit;line-height:inherit;font-family:inherit;font-kerning:inherit;font-feature-settings:inherit;font-size:0.75rem;vertical-align:baseline">maxFilesToCache=1000000</span></p><p style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;line-height:1.5em;font-family:"IBM Plex Sans","Helvetica Neue",Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;font-size:12px;vertical-align:baseline;letter-spacing:0px;color:rgb(22,22,22)"><span style="box-sizing:border-box;font-weight:600;margin:0px;padding:0px;border:0px;font-style:inherit;font-variant:inherit;font-stretch:inherit;line-height:inherit;font-family:inherit;font-kerning:inherit;font-feature-settings:inherit;font-size:0.75rem;vertical-align:baseline">maxStatCache=100000</span></p><p style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;line-height:1.5em;font-family:"IBM Plex Sans","Helvetica Neue",Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;font-size:12px;vertical-align:baseline;letter-spacing:0px;color:rgb(22,22,22)"><span 
style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-style:inherit;font-variant:inherit;font-stretch:inherit;line-height:inherit;font-family:inherit;font-kerning:inherit;font-feature-settings:inherit;font-size:0.75rem;vertical-align:baseline"><br></span></p><p style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;line-height:1.5em;font-family:"IBM Plex Sans","Helvetica Neue",Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;font-size:12px;vertical-align:baseline;letter-spacing:0px;color:rgb(22,22,22)"><span style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-style:inherit;font-variant:inherit;font-stretch:inherit;line-height:inherit;font-family:inherit;font-kerning:inherit;font-feature-settings:inherit;font-size:0.75rem;vertical-align:baseline">Thanks</span></p><p style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;line-height:1.5em;font-family:"IBM Plex Sans","Helvetica Neue",Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;font-size:12px;vertical-align:baseline;letter-spacing:0px;color:rgb(22,22,22)"><span style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-style:inherit;font-variant:inherit;font-stretch:inherit;line-height:inherit;font-family:inherit;font-kerning:inherit;font-feature-settings:inherit;font-size:0.75rem;vertical-align:baseline">Christian</span></p></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 6 Sept 2023 at 20:59, Christoph Martin <<a href="mailto:martin@uni-mainz.de">martin@uni-mainz.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi all,<br>
<br>
On a three-node GPFS cluster with CES enabled and AFM-DR mirroring to a <br>
second cluster we see frequent OOM killer events due to a constantly <br>
growing mmfsd.<br>
The machines have 256G memory. The pagepool is configured to 16G.<br>
The GPFS version is 5.1.6-1.<br>
After a restart, mmfsd rapidly grows to about 100G and then, over <br>
several days, to as much as 250G virtual and 220G physical memory.<br>
The OOM killer tries to kill processes such as pmcollector and sometimes <br>
kills mmfsd.<br>
<br>
Has anybody seen similar behavior?<br>
Any idea what could help with this problem?<br>
<br>
Regards<br>
Christoph Martin<br>
<br>
-- <br>
Christoph Martin<br>
Zentrum für Datenverarbeitung (ZDV)<br>
Head of Unix &amp; Cloud<br>
<br>
Johannes Gutenberg-Universität Mainz<br>
Anselm Franz von Bentzel-Weg 12, 55128 Mainz<br>
Tel: +49 6131 39 26337<br>
<a href="mailto:martin@uni-mainz.de" target="_blank">martin@uni-mainz.de</a><br>
<a href="http://www.zdv.uni-mainz.de" rel="noreferrer" target="_blank">www.zdv.uni-mainz.de</a><br>
<br>
_______________________________________________<br>
gpfsug-discuss mailing list<br>
gpfsug-discuss at <a href="http://gpfsug.org" rel="noreferrer" target="_blank">gpfsug.org</a><br>
<a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org" rel="noreferrer" target="_blank">http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org</a><br>
</blockquote></div><br clear="all"><div><br></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr">Kind regards<div>Christian Petersson</div><div><br></div><div>Email: <a href="mailto:Christian.Petersson@isstech.io" target="_blank">Christian.Petersson@isstech.io</a></div><div>Mobile: 070-3251577</div><div><br></div></div></div>
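<div>A minimal sketch of how the two values suggested at the top of this reply could be applied, assuming the protocol nodes are addressed through the <code>cesNodes</code> node class (an assumption; substitute your own node list). The numbers are the ones quoted in this thread, not a general recommendation, and both parameters are generally documented as taking effect only after the GPFS daemon is restarted on the affected nodes, so plan for CES failover before restarting:</div>
<pre>
# Set both cache limits on the CES nodes only (the node class is an assumption)
mmchconfig maxFilesToCache=1000000,maxStatCache=100000 -N cesNodes

# Check the configured values
mmlsconfig maxFilesToCache
mmlsconfig maxStatCache

# After GPFS has been restarted on a node, watch mmfsd memory use over time
mmdiag --memory
</pre>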