[gpfsug-discuss] HAWC (Highly available write cache)

Dean Hildebrand dhildeb at us.ibm.com
Mon Aug 1 21:55:28 BST 2016


Hi Tejas,

Yes, most likely those 4KB writes are the HAWC writes. Hopefully those 4KB log
writes have a lower latency than the 4KB writes to your data pool, so you are
realizing the benefit.
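
One rough way to check is to compare the per-I/O times that "mmdiag --iohist"
reports for the logData entries against the data-pool entries. The filters
below are just a sketch; the exact column layout differs a little between
releases, so adjust them to your output:

  # run on a client node doing the writes
  mmdiag --iohist | grep logData     # HAWC/recovery-log writes (system pool)
  mmdiag --iohist | grep ' data '    # regular data-pool writes

If the logData lines consistently show lower I/O times, HAWC is paying off.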

Dean




From:	Tejas Rao <raot at bnl.gov>
To:	gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:	08/01/2016 01:42 PM
Subject:	Re: [gpfsug-discuss] HAWC (Highly available write cache)
Sent by:	gpfsug-discuss-bounces at spectrumscale.org



I am not 100% sure what the workload of the VMs is. We have hundreds of VMs,
all used differently, so the workload is rather mixed.

I do see 4KB writes going to the "system" pool, tagged as "logData" in
'mmdiag --iohist'. But I also see 4KB writes going to the data drives, so it
looks like not everything is getting coalesced and that these are random
writes.


Could these 4k writes labelled as "logData" be the writes going to HAWC log
files?


On 8/1/2016 15:50, Dean Hildebrand wrote:


      Hi Tejas,

      Do you know the workload in the VM?

      The workload that enters HAWC may or may not look the same as the
      workload that eventually reaches the data pool; it all depends on
      whether the 4KB writes entering HAWC can be coalesced. For example,
      sequential 4KB writes can all be coalesced into a single large chunk,
      so 4KB writes into HAWC become 8MB writes to the data pool (with your
      block size). But random 4KB writes into HAWC may end up as 4KB writes
      into the data pool if there are no adjoining 4KB writes (i.e., if the
      4KB blocks are all dispersed, they can't be coalesced). The goal of
      HAWC, though, whether the 4KB blocks are coalesced or not, is to reduce
      application latency by ensuring that writing the blocks back to the
      data pool happens in the background. So while 4KB blocks may still be
      hitting the data pool, the application should only be seeing the
      latency of your presumably lower-latency system pool.
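
      If you want to see the effect for yourself, a small fio run of each
      pattern makes it quite visible in "mmdiag --iohist". The parameters
      below are only a sketch (the /gpfs01/test directory is a placeholder;
      adjust the path and sizes to your setup):

      # sequential 4KB sync writes -- these should coalesce into large data-pool I/Os
      fio --name=seq4k --directory=/gpfs01/test --rw=write --bs=4k --size=1g --ioengine=psync --fsync=1
      # random 4KB sync writes -- many of these will still hit the data pool as 4KB I/Os
      fio --name=rand4k --directory=/gpfs01/test --rw=randwrite --bs=4k --size=1g --ioengine=psync --fsync=1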

      Dean



      From: Tejas Rao <raot at bnl.gov>
      To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
      Date: 08/01/2016 12:06 PM
      Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache)
      Sent by: gpfsug-discuss-bounces at spectrumscale.org





      In my case GPFS storage is used to store VM images (KVM) and hence
      the small IO.

      I always see lots of small 4KB writes, even though the GPFS file system
      block size is 8MB. I thought the reason for the small writes is that
      the Linux kernel asks GPFS to initiate a periodic sync, which by
      default happens every 5 seconds and is controlled by
      "vm.dirty_writeback_centisecs".

      I thought HAWC would help in such cases: it would harden (and coalesce)
      the small writes in the "system" pool and then flush them to the "data"
      pool in larger blocks.

      Note - I am not doing direct i/o explicitly.
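
      (One way to double-check that the qemu processes are not opening the
      image files with O_DIRECT -- e.g. via cache=none -- is to look at the
      open flags on their file descriptors; <qemu-pid> and <fd> below are
      placeholders:)

      ls -l /proc/<qemu-pid>/fd | grep <image-file>    # find the fd of the VM image
      grep ^flags /proc/<qemu-pid>/fdinfo/<fd>         # octal flags; 040000 set = O_DIRECT on x86_64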



      On 8/1/2016 14:49, Sven Oehme wrote:
                  When you say "synchronous write", what exactly do you mean?
                  If you are talking about direct I/O (the O_DIRECT flag),
                  those writes don't take the HAWC data path; that is by
                  design.

                  sven

                  On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao <raot at bnl.gov>
                  wrote:
                        I have enabled the write cache (HAWC) by running the
                        commands below. The recovery logs should now be
                        placed in the replicated system metadata pool (SSDs).
                        I do not have a "system.log" pool, as that is only
                        needed if the recovery logs are to be stored on the
                        client nodes.

                        mmchfs gpfs01 --write-cache-threshold 64K
                        mmchfs gpfs01 -L 1024M
                        mmchconfig logPingPongSector=no
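
                        (To read those settings back, something along these
                        lines should work -- the exact attribute output
                        varies a bit by release, so treat this as a sketch:)

                        mmlsfs gpfs01 -L     # internal log file size
                        mmlsfs gpfs01        # full attribute list, incl. the write cache threshold where reported
                        mmlsconfig | grep -i logPingPongSector   # confirm the changed setting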

                        I have recycled the daemon on all nodes in the
                        cluster (including the NSD nodes).

                        I still see small synchronous writes (4K) from the
                        clients going to the data drives (data pool). I am
                        checking this by looking at "mmdiag --iohist"
                        output. Should they not be going to the system
                        pool?

                        Do I need to do something else? How can I confirm
                        that HAWC is working as advertised?

                        Thanks.








_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



