[gpfsug-discuss] nsdMinWorkerThreads and nsdThreadsPerDisk in clusters with just a few NSDs
Txema Heredia Genestar
txema.llistes at gmail.com
Fri Jun 14 19:27:04 BST 2013
Sorry, I forgot to introduce myself.
My name is Txema Heredia. I am a systems administrator at the
Evolutionary Biology Institute in Barcelona, Spain. We are a public
research institution focused on biological and genetics research. We
have a small cluster (300 cores) and we use GPFS to feed it with a 150TB
filesystem that is currently being upgraded with ~450TB more.
We have been working with GPFS for less than a year. Our initial
installation was made by on-site IBM technicians. But we are upgrading
the system on our own, and now I am beginning to understand the guts of
GPFS and all its nuances.
I am looking forward to learning a lot from this discussion list.
On 14/06/13 19:57, Txema Heredia Genestar wrote:
> Hi all,
> We are building a new GPFS cluster and I have a few questions about
> the NSD threads that I hope you can answer.
> Our old cluster consists of 2 building blocks. Each BB consists of
> 2 servers (12-core with 48GB RAM) connected by SAS to a dual
> controller DS3512 disk cabinet, with 36x 7.2krpm 3TB SATA disks. Each
> controller has 2GB of cache.
> Our "big" filesystem (130 TB) is formed by 6 NSDs, each one being an
> 8+1 RAID5 LUN coming from a cabinet. Data and metadata are mixed. We
> have 6 LUNs and 4 controllers and NSD servers. Thus, some serve 2
> "disks" and some just 1.
> As for GPFS, we are using the default GPFS 3.4 thread parameters:
> nsdMaxWorkerThreads = 64
> nsdMinWorkerThreads = 16
> nsdThreadsPerDisk = 3
> #NSD per server = 1 or 2
> In an IBM presentation (slide 4), they show that the formula for the
> number of concurrently active NSD threads is:
> MAX ( MIN ( nsdThreadsPerDisk * #NSDperServer , nsdMaxWorkerThreads
> ), nsdMinWorkerThreads )
> In our case, we have only 6 NSDs, and a server is responsible for at
> most 2 of them. We are left with MAX ( 6 , 16 ) = 16, and thus we end
> up having between 8 and 16 threads per disk, when we should have just 3.
> This is a snapshot taken just now on one of our servers with "mmfsadm
> dump nsd":
> Worker threads: running 16, started 16, desired 16, active 16, highest 16
> Requests: pending 333, highest pending 615, total processed 839099802
> Buffer use: current 16777216, highest 16777216
> Server state: suspendCount 0, killRequested 0, activeLocalIO 0
> reOpenRequested 0, reOpenInProgress 0, nsdJoinComplete 1,
> osdRequests 0x0
> Disk name   NsdId              Status  Hold  I/O  rcktry  wckerr  Addr
> ----------  -----------------  ------  ----  ---  ------  ------  ----
> home11      0A3C3D02:4FC87656  active     0    0       0       0
> scratch11   0A3C3D01:4FBE76D7  active    15   15       0       0
> scratch12   0A3C3D02:4FBE76D8  active     0    0       0       0
> scratch13   0A3C3D01:4FBE76D8  active     1    1       0       0
> On the other hand, when we run the performance monitor on each disk
> controller, we obtain the following numbers per LUN in a state of 0%
> cache hit:
> mean IO/s = 180
> Read % = 97.5%
> throughput = 105 MB/s
> All LUNs show similar results. The combined read throughput is ~630
> MB/s. This is the "live" cluster with ~300 jobs running, not a
> single process reading a big file.
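The per-LUN figures above can be cross-checked with a little arithmetic. This is only a sketch: the helper names are hypothetical, and it assumes the mean IO/s and throughput values were averaged over the same interval, which the message does not state.

```python
def combined_throughput(luns, per_lun_mb_s):
    """Aggregate throughput if every LUN delivers the same rate."""
    return luns * per_lun_mb_s


def mean_io_size_mb(per_lun_mb_s, per_lun_iops):
    """Average size of one I/O, from throughput divided by IO/s."""
    return per_lun_mb_s / per_lun_iops


# Values from the message: 6 LUNs, 105 MB/s and 180 IO/s per LUN.
print(combined_throughput(6, 105))          # total MB/s across all LUNs
print(round(mean_io_size_mb(105, 180), 2))  # MB per I/O
```

Six LUNs at 105 MB/s give the ~630 MB/s quoted, and 105 MB/s over 180 IO/s works out to roughly 0.58 MB per I/O.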
> Are all these numbers ok? Is that disk performance fine?
> What should we do with the thread parameters? Are the 16 simultaneous
> threads disrupting our disks? Should we lower nsdMinWorkerThreads?
> Or, if they are not, should we raise nsdThreadsPerDisk to 16 or more,
> since the disks have shown they can handle them?
> In our new cluster installation, we will have 4 NSD servers, each one
> being responsible for 4-5 NSDs, using 4 disk cabinets similar to the
> ones we have now. We will also move to GPFS 3.5, where the default
> nsdMaxWorkerThreads has been raised to 512 and the small/large queues
> have been introduced.
> How should we adapt to it? Is nsdThreadsPerDisk=3 an outdated
> default value that we should move on from?
> Thanks in advance,