[gpfsug-discuss] AFM fun (more!)

Simon Thompson (IT Research Support) S.J.Thompson at bham.ac.uk
Tue Oct 10 17:03:35 BST 2017


So as you might expect, we've been poking at this all day.

We'd typically get to ~1000 entries in the queue having taken access to the FS away from users (yeah, it's that bad), but the remaining items would stay there forever as far as we could see. By copying each stuck file out, removing the original and then moving the copy back into place, we're able to get it back into a clean state. But then we ran a sample user job, and instantly the next job hung up the queue (we're talking like <100MB files here).
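For reference, the per-file clean-up looked roughly like this (the path is made up purely for illustration):

  cp /rds/projects/facility/stuck.file /tmp/stuck.file.copy    # copy the stuck file out of the cache fileset
  rm /rds/projects/facility/stuck.file                         # remove the original
  mv /tmp/stuck.file.copy /rds/projects/facility/stuck.file    # move the copy back into place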

Interestingly we looked at the queue to see what was going on (with saferdump, always use saferdump!!!)

  Normal Queue:  (listed by execution order) (state: Active)
      95 Write [6060026.6060026] inflight (18 @ 0) thread_id 44812
      96 Write [13808655.13808655] queued (18 @ 0)
      97 Truncate [6060026] queued
      98 Truncate [13808655] queued
     124 Write [6060000.6060000] inflight (18 @ 0) thread_id 44835
     125 Truncate [6060000] queued
     159 Write [6060013.6060013] inflight (18 @ 0) thread_id 21329
     160 Truncate [6060013] queued
     171 Write [5953611.5953611] inflight (18 @ 0) thread_id 44837
     172 Truncate [5953611] queued

Note that each inode that is inflight is followed by a queued Truncate... We are running efix2, because there is an issue with truncate not working prior to this (it doesn't get sent to home), so this smells like an AFM bug to me.
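A quick and dirty way to see how widespread that pattern is (just a sketch that counts lines in the dump; /tmp/afm.dump is an arbitrary scratch file):

  mmfsadm saferdump afm all > /tmp/afm.dump
  grep 'Write' /tmp/afm.dump | grep -c 'inflight'    # inflight writes
  grep 'Truncate' /tmp/afm.dump | grep -c 'queued'   # truncates queued behind them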

We have a PMR open...

Simon

From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of "Leo Earl (ITCS - Staff)" <Leo.Earl at uea.ac.uk>
Reply-To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
Date: Tuesday, 10 October 2017 at 16:29
To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] AFM fun (more!)

Hi Simon,

(My first ever post – cue being shot down in flames)

Whilst this doesn’t answer any of your questions (directly)

One thing we do tend to look at when we see what appears to be a static “Queue Length” value is the data which is actually in flight from an AFM perspective, so that we can ascertain whether the cause is something like a user writing huge files to the cache, which take time to sync with home and thus remain in the queue, giving a static “Queue Length”.

[root at server ~]# mmafmctl gpfs getstate | awk ' $6 >=50 '
Fileset Name       Fileset Target                  Fileset State  Gateway Node  Queue State  Queue Length  Queue numExec
afmcancergenetics  <IP here>:/leo/res_hpc/afm-leo  Dirty          csgpfs01      Active       60            126157822
[root at server ~]#

So, for instance, we dump the AFM state to see which inodes are currently “inflight”, and then use the tsfindinode command to map an inode back to the file so that we can have a look at its size:

[root at server ~]#  mmfsadm dump afm | more


  Fileset: afm-leo 12 (gpfs)
  mode: independent-writer queue: Normal myFileset  MDS: <c0n334>
  home: <IP> proto: nfs port: 2049   lastCmd: 6
  handler: Mounted Dirty     refCount: 5
  queue: delay 300 QLen 0+9 flushThds 4 maxFlushThds 4 numExec 436 qfs 0 err 0
  i/o: readBuf: 33554432 writeBuf: 2097152 sparseReadThresh: 134217728  pReadThreads 1
  i/o: pReadGWs 0 (All) pReadChunkSize 134217728 pReadThresh: -2 >> Disabled <<
  i/o: prefetchThresh 0 (Prefetch)
  iw: afmIwTakeoverTime 0
Priority Queue:  (listed by execution order) (state: Active)
    Write [601414711.601379492] inflight  (377743888481 @ 0) chunks 0 bytes 0 0 thread_id 7630
    Write [601717612.601465868] inflight  (462997479227 @ 0) chunks 0 bytes 0 1 thread_id 10200
    Write [601717612.601465870] inflight  (391663667550 @ 0) chunks 0 bytes 0 2 thread_id 10287
    Write [601717612.601465871] inflight  (377743888481 @ 0) chunks 0 bytes 0 3 thread_id 10333
    Write [601717612.601573418] queued  (387002794104 @ 0) chunks 0 bytes 0 4
    Write [601414711.601650195] queued  (342305480538 @ 0) chunks 0 bytes 0 5
    ResetDirty [538455296.-1] queued etype normal normal  19061213
    ResetDirty [601623334.-1] queued etype normal normal  19061213
    RecoveryMarker [-1.-1] queued etype normal normal  19061213
  Normal Queue:  Empty (state: Active)
Fileset: afmdata 20 (gpfs)



#Use the file inode ID to determine the actual file which is inflight between cache and home

[root at server cancergenetics]# tsfindinode  -i 601379492 /gpfs/afm/cancergenetics > inode.out

[root at server ~]# cat /root/inode.out
601379492               0      0xCCB6  /gpfs/afm/cancergenetics/Claudia/fastq/PD7446i.fastq

[root at server ~]# ls -l /gpfs/afm/cancergenetics/Claudia/fastq/PD7446i.fastq
-rw-r--r-- 1 bbn16cku MED_pg 377743888481 Mar 25 05:48 /gpfs/afm/cancergenetics/Claudia/fastq/PD7446i.fastq
[root at server

I am not sure if that helps, and you probably already know about this sort of inflight checking…
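If it is useful, the steps above can be strung together into something like the following. This is only a rough sketch: it assumes the dump output format shown above, and the script name and argument are purely illustrative.

  #!/bin/bash
  # afm-inflight.sh (illustrative name): list files currently inflight for an AFM cache path
  # usage: ./afm-inflight.sh /gpfs/afm/cancergenetics
  CACHE_PATH="$1"
  # pull the cache inode numbers out of the inflight Write entries in the AFM dump
  mmfsadm dump afm |
    awk '/Write/ && /inflight/ {
           for (i = 1; i <= NF; i++)
             if ($i ~ /^\[[0-9]+\.[0-9]+\]$/) {
               gsub(/\[|\]/, "", $i); split($i, a, "."); print a[2]
             }
         }' | sort -u |
    while read -r inode; do
      # map each inode back to a path in the cache fileset (as with tsfindinode above)
      tsfindinode -i "$inode" "$CACHE_PATH"
    done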


Kind Regards,

Leo

Leo Earl | Head of Research & Specialist Computing
Room ITCS 01.16, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ
+44 (0) 1603 593856


From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada
Sent: 10 October 2017 05:56
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] AFM fun (more!)

Simon,

>Question 1.
>Can we force the gateway node for the other file-sets to our "02" node.
>I.e. So that we can get the queue services for the other filesets.

AFM automatically maps each fileset to a gateway node, and today there is no option available for users to assign a fileset to a particular gateway node. This feature will be supported in future releases.

>Question 2.
>How can we make AFM actually work for the "facility" file-set. If we shut
>down GPFS on the node, on the secondary node, we'll see log entries like:
>2017-10-09_13:35:30.330+0100: [I] AFM: Found 1069575 local remove
>operations...
>So I'm assuming the massive queue is all file remove operations?

These are files which were created in cache and then deleted before they were replicated to home; AFM recovery will delete them locally. Yes, it is possible that most of these operations are local remove operations. Try counting those operations using the dump command:

 mmfsadm saferdump afm all | grep 'Remove\|Rmdir' | grep local | wc -l


>Alarmingly, we are also seeing entries like:
>2017-10-09_13:54:26.591+0100: [E] AFM: WriteSplit file system rds-cache
>fileset rds-projects-2017 file IDs [5389550.5389550.-1.-1,R] name  remote
>error 5

Traces are needed to verify the IO errors. Also try disabling parallel IO and see if the replication speed improves:

mmchfileset device fileset -p afmParallelWriteThreshold=disable
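
For the fileset that is throwing the WriteSplit errors, that would look something like the following (purely as an illustration, using the device and fileset names from Simon's getstate output):

mmchfileset rds-cache rds-projects-2017 -p afmParallelWriteThreshold=disable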

~Venkat (vpuvvada at in.ibm.com)



From:        "Simon Thompson (IT Research Support)" <S.J.Thompson at bham.ac.uk<mailto:S.J.Thompson at bham.ac.uk>>
To:        "gpfsug-discuss at spectrumscale.org<mailto:gpfsug-discuss at spectrumscale.org>" <gpfsug-discuss at spectrumscale.org<mailto:gpfsug-discuss at spectrumscale.org>>
Date:        10/09/2017 06:27 PM
Subject:        [gpfsug-discuss] AFM fun (more!)
Sent by:        gpfsug-discuss-bounces at spectrumscale.org<mailto:gpfsug-discuss-bounces at spectrumscale.org>
________________________________




Hi All,

We're having fun (ok not fun ...) with AFM.

We have a file-set where the queue length isn't shortening; watching it
over 5-second periods, the queue length increases by ~600-1000 items and the
numExec goes up by about 15k.
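
Roughly how we were watching it, for what it's worth (just a sketch):

  while true; do
      date
      mmafmctl rds-cache getstate | grep rds-projects-facility
      sleep 5
  done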

The queues are steadily rising and we've seen them over 1000000 ...

This is on one particular fileset e.g.:

mmafmctl rds-cache getstate
                      Mon Oct  9 08:43:58 2017

Fileset Name           Fileset Target                  Cache State  Gateway Node  Queue Length  Queue numExec
------------           --------------                  -----------  ------------  ------------  -------------
rds-projects-facility  gpfs:///rds/projects/facility   Dirty        bber-afmgw01  3068953       520504
rds-projects-2015      gpfs:///rds/projects/2015       Active       bber-afmgw01  0             3
rds-projects-2016      gpfs:///rds/projects/2016       Dirty        bber-afmgw01  1482          70
rds-projects-2017      gpfs:///rds/projects/2017       Dirty        bber-afmgw01  713           9104
bear-apps              gpfs:///rds/bear-apps           Dirty        bber-afmgw02  3             2472770871
user-homes             gpfs:///rds/homes               Active       bber-afmgw02  0             19
bear-sysapps           gpfs:///rds/bear-sysapps        Active       bber-afmgw02  0             4



This is having the effect that other filesets on the same "Gateway" are
not getting their queues processed.

Question 1.
Can we force the gateway node for the other file-sets to our "02" node.
I.e. So that we can get the queue services for the other filesets.

Question 2.
How can we make AFM actually work for the "facility" file-set. If we shut
down GPFS on the node, on the secondary node, we'll see log entries like:
2017-10-09_13:35:30.330+0100: [I] AFM: Found 1069575 local remove
operations...

So I'm assuming the massive queue is all file remove operations?

Alarmingly, we are also seeing entries like:
2017-10-09_13:54:26.591+0100: [E] AFM: WriteSplit file system rds-cache
fileset rds-projects-2017 file IDs [5389550.5389550.-1.-1,R] name  remote
error 5

Anyone any suggestions?

Thanks

Simon


_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


