[gpfsug-discuss] gpfs performance monitoring

Jonathan Buzzard jonathan at buzzard.me.uk
Fri Sep 5 10:29:27 BST 2014


On Thu, 2014-09-04 at 11:43 +0100, Salvatore Di Nardo wrote:

[SNIP]

> 
> Sometimes it also happens that there is very low IO (10Gb/s), almost
> no CPU usage on the servers, but huge slowness (ls can take 10
> seconds). Why does that happen? There are not many data ops, but we
> think there is a huge amount of metadata ops. So what I want to know
> is whether the metadata vdisks are busy or not. If this is our problem,
> could some SSD disks dedicated to metadata help?
> 

This is almost always because you are using an external LDAP/NIS server
for GECOS information, and the values you need are not cached for
whatever reason, so they have to be looked up again. Note that the
standard ls alias on RHEL-based distros also causes it to do a stat on
every file for the colouring etc. Also be aware that if you are filling
out the path for a cd with TAB auto-completion you will run into
similar issues. That is, had you typed the path for the cd out in full
you would get in instantly, whereas typing a couple of letters and
hitting TAB could take a while.

You can test this on a RHEL-based distro by running "/bin/ls -n". The
idea is to avoid any aliasing, skip the GECOS lookups, and just report
the raw numeric UIDs and GIDs.
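
If you want numbers rather than a feel for it, the same comparison can
be sketched in a few lines of Python. This is only a rough illustration:
the function name profile_listing is made up for this example, and it
resolves each distinct UID/GID once, which is an assumption about how a
long listing behaves rather than a description of what ls does
internally. It times the per-file stat calls separately from the name
lookups, so you can see which side the delay is on.

```python
import grp
import os
import pwd
import time


def profile_listing(path):
    """Time the stat calls and the UID/GID name lookups for one directory.

    Hypothetical helper for illustration; not part of GPFS or any tool.
    """
    uids, gids = set(), set()
    count = 0

    # Phase 1: stat every entry, which is the filesystem metadata cost.
    t0 = time.perf_counter()
    with os.scandir(path) as entries:
        for entry in entries:
            st = entry.stat(follow_symlinks=False)
            uids.add(st.st_uid)
            gids.add(st.st_gid)
            count += 1
    t1 = time.perf_counter()

    # Phase 2: resolve each distinct ID once via NSS (files, LDAP, NIS...).
    for uid in uids:
        try:
            pwd.getpwuid(uid)
        except KeyError:
            pass  # no passwd entry for this UID
    for gid in gids:
        try:
            grp.getgrgid(gid)
        except KeyError:
            pass  # no group entry for this GID
    t2 = time.perf_counter()

    return {
        "entries": count,
        "unique_uids": len(uids),
        "unique_gids": len(gids),
        "stat_seconds": t1 - t0,
        "lookup_seconds": t2 - t1,
    }


if __name__ == "__main__":
    print(profile_listing("."))
```

If lookup_seconds dominates, the name service is the place to look. If
stat_seconds dominates, the time is going on filesystem metadata.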

What I would suggest is that you set the cache time for positive
UID/GID lookups to a long time, in general as long as possible, because
the values should almost never change. Even for a positive lookup of a
group membership I would have that cached for a couple of hours. For
negative lookups something like five or ten minutes is a good starting
point.
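
As a rough sketch, if the caching happens to be done by nscd (an
assumption on my part; sssd or another cache has different settings),
the above might translate into /etc/nscd.conf lines like these. The
times are in seconds and the exact values are only illustrative.

```
# Assumed example values; tune for your site.
enable-cache            passwd  yes
positive-time-to-live   passwd  86400   # UID lookups: a day, or longer
negative-time-to-live   passwd  600     # failed lookups: ten minutes

enable-cache            group   yes
positive-time-to-live   group   7200    # group data: a couple of hours
negative-time-to-live   group   600     # failed lookups: ten minutes
```

Note that nscd has a single positive TTL per database, so group names
and group membership share the same value here.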


JAB.

-- 
Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.




