[gpfsug-discuss] Spectrum Scale pagepool size with RDMA

Prasad Surampudi prasad.surampudi at theatsgroup.com
Thu Jul 23 01:34:02 BST 2020


We have an ESS cluster with two CES nodes. The pagepool is set to 128 GB (real memory is 256 GB) on both the ESS NSD servers and the CES nodes. Occasionally we see mmfsd process memory usage reach 90% on the NSD servers and CES nodes and stay there until GPFS is recycled. I have a couple of questions about this scenario:

  1.  What are the general recommendations for pagepool size on nodes with RDMA enabled? The IBM Knowledge Center page on RDMA tuning says "If the GPFS pagepool is set to 32 GB, then the mapping of the RDMA for this pagepool must be at least 64 GB." Does this mean that the pagepool can't be more than half of real memory with RDMA enabled? Also, is this the reason why mmfsd memory usage exceeds the pagepool size and spikes to almost 90%?
  2.  If we don't want to see high mmfsd process memory usage on the NSD/CES nodes, should we decrease the pagepool size?
  3.  Can we tune the log_num_mtt parameter to limit the memory usage? Currently it's set to 0 on both the NSD servers (ppc64_le) and the CES nodes (x86_64).
  4.  We also see messages like "Verbs RDMA disabled for xx.xx.xx.xx due to no matching port found". Any idea what this message indicates? I don't see any "Verbs RDMA enabled" message after these warnings. Does it get enabled again automatically?
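
For reference, the arithmetic behind the quoted "mapping must be at least twice the pagepool" rule can be sketched as below. This is only a rough illustration, not something taken from the IBM page: it assumes an mlx4-based adapter, where the registerable memory is commonly described as 2^log_num_mtt * 2^log_mtts_per_seg * PAGE_SIZE. The log_mtts_per_seg value of 3 and the page sizes (4 KiB on x86_64, 64 KiB on ppc64le) are assumptions and should be checked on the actual nodes. The function names are made up for the example.

```python
# Rough sketch of the mlx4 registerable-memory arithmetic (assumed formula):
#   max_reg_mem = 2**log_num_mtt * 2**log_mtts_per_seg * page_size
# All default values used below are assumptions, not values read from a node.

GIB = 1 << 30


def max_registerable_bytes(log_num_mtt, log_mtts_per_seg, page_size):
    """Memory (bytes) that can be registered for RDMA under the assumed formula."""
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


def min_log_num_mtt(target_bytes, log_mtts_per_seg, page_size):
    """Smallest log_num_mtt whose registerable memory covers target_bytes."""
    bytes_per_mtt_segment = (1 << log_mtts_per_seg) * page_size
    # Ceiling division: number of MTT segments needed to cover the target.
    segments_needed = -(-target_bytes // bytes_per_mtt_segment)
    # Smallest exponent e such that 2**e >= segments_needed.
    return max(0, (segments_needed - 1).bit_length())


if __name__ == "__main__":
    pagepool = 128 * GIB
    mapping_target = 2 * pagepool  # the "at least twice the pagepool" rule

    # Assumed page sizes: 4 KiB (x86_64 CES nodes), 64 KiB (ppc64le NSD servers).
    for arch, page_size in (("x86_64", 4096), ("ppc64le", 65536)):
        exponent = min_log_num_mtt(mapping_target, 3, page_size)
        covered = max_registerable_bytes(exponent, 3, page_size) // GIB
        print(f"{arch}: log_num_mtt >= {exponent} covers {covered} GiB")
```

Note that this only describes how much memory the adapter is able to map. It does not by itself say that the pagepool has to be limited to half of real memory, and a log_num_mtt value of 0 may simply mean the driver picks its own default.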

Prasad Surampudi

