[gpfsug-discuss] gpfs performance monitoring

Salvatore Di Nardo sdinardo at ebi.ac.uk
Wed Sep 3 18:27:44 BST 2014


Hello everybody,
here I am again, this time to ask for some hints on how to monitor
GPFS.

I know about mmpmon, but the issue with its "fs_io_s" and "io_s"
requests is that they return numbers based only on the requests made on
the current host, so I would have to run them on all the clients (over
600 nodes), which is quite impractical. Instead I would like to know
from the servers what is going on. I came across the vio_s statistics,
which are less documented, and I don't know exactly what they mean.
There is also this script, "/usr/lpp/mmfs/samples/vdisk/viostat", that
runs vio_s.

My problems with the output of this command:

          echo "vio_s" | /usr/lpp/mmfs/bin/mmpmon -r 1

        mmpmon> mmpmon node 10.7.28.2 name gss01a vio_s OK VIOPS per second
        timestamp: 1409763206/477366
        recovery group:                     *
        declustered array:                  *
        vdisk:                              *
        client reads: 2584229
        client short writes: 55299693
        client medium writes: 190071
        client promoted full track writes: 465145
        client full track writes: 9249
        flushed update writes: 4187708
        flushed promoted full track writes: 123
        migrate operations: 114
        scrub operations: 450590
        log writes: 28509602


It says "VIOPS per second", but the values look to me like plain
counters: every time I re-run the command, the numbers increase a bit.
Can anyone confirm whether those numbers are counters or ops/sec?
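
If they do turn out to be cumulative counters, a per-second rate can be
derived from two samples. The following is only a minimal Python sketch
under assumptions: it parses the human-readable layout shown above, it
treats the timestamp as "seconds/microseconds" (a guess from its shape,
not something confirmed by documentation), and the function names
parse_vio_s and rates_between are hypothetical.

```python
import re


def parse_vio_s(text):
    """Parse one vio_s sample into (timestamp_in_seconds, counters).

    Assumes the human-readable layout shown above. Lines whose value is
    not an integer (for example "recovery group: *") are skipped.
    """
    timestamp = None
    counters = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: the timestamp is "<seconds>/<microseconds>".
        m = re.match(r"timestamp:\s*(\d+)/(\d+)$", line)
        if m:
            timestamp = int(m.group(1)) + int(m.group(2)) / 1_000_000
            continue
        m = re.match(r"([a-z ]+):\s*(\d+)$", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return timestamp, counters


def rates_between(first_sample, second_sample):
    """Return per-second rates for counters present in both samples."""
    t0, c0 = parse_vio_s(first_sample)
    t1, c1 = parse_vio_s(second_sample)
    if t0 is None or t1 is None or t1 <= t0:
        raise ValueError("samples need increasing timestamps")
    elapsed = t1 - t0
    return {name: (c1[name] - c0[name]) / elapsed
            for name in c1 if name in c0}
```

Feeding it the text of two consecutive runs of the command above gives
one rate per counter name. If the values were already rates rather than
counters, these differences would be meaningless.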

Looking closer, I don't understand what most of those values mean.
For example, what exactly are "flushed promoted full track writes"?
I tried to find documentation about this output, but could not find
any. Can anyone point me to a link where the output of vio_s is
explained?

Another thing I don't understand about those numbers is whether they
count operations or the number of blocks that were read, written, etc.
I'm asking because if they are just ops, I don't know how useful they
are. For example, one write operation could mean writing 1 block or
writing a file of 100GB. If those are operations, is there a way to get
the output in bytes or blocks?

Last but not least, and this is what I would really like to accomplish:
I would like to be able to monitor the latency of metadata operations.
In my environment there are users who literally overwhelm our storage
with metadata requests, so even if there is no massive throughput or
huge waiters, any "ls" can take ages. I would like to be able to monitor
metadata behaviour. Is there a way to do that from the NSD servers?
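
To make concrete what kind of number is meant here: the slow "ls"
symptom can at least be measured from a single client by timing a
directory listing and a stat of each entry. This is only a rough
client-side sketch in Python, not the server-side view asked about
above. The function name time_metadata_ops and the max_entries
parameter are hypothetical, and client-side caching can hide the real
latency.

```python
import os
import time


def time_metadata_ops(directory, max_entries=1000):
    """Time one directory listing plus one lstat per entry.

    Returns latencies in seconds as seen by this client only.
    """
    start = time.perf_counter()
    names = os.listdir(directory)[:max_entries]
    list_seconds = time.perf_counter() - start

    stat_seconds = []
    for name in names:
        begin = time.perf_counter()
        try:
            os.lstat(os.path.join(directory, name))
        except OSError:
            continue  # entry disappeared between listdir and lstat
        stat_seconds.append(time.perf_counter() - begin)

    mean = sum(stat_seconds) / len(stat_seconds) if stat_seconds else 0.0
    return {
        "entries": len(names),
        "list_seconds": list_seconds,
        "max_stat_seconds": max(stat_seconds, default=0.0),
        "mean_stat_seconds": mean,
    }
```

Run periodically against a directory on the GPFS mount, this gives a
trend of listing and stat latency for that one client.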

Thanks in advance for any tip/help.

Regards,
Salvatore