[gpfsug-discuss] AFM fun (more!)

Venkateswara R Puvvada vpuvvada at in.ibm.com
Tue Oct 10 05:56:21 BST 2017


Simon,

>Question 1.
>Can we force the gateway node for the other file-sets to our "02" node,
>i.e. so that we can get the queues serviced for the other filesets?

AFM automatically maps each fileset to a gateway node, and today there is 
no option for users to assign a fileset to a particular gateway node. 
This feature will be supported in a future release.

>Question 2.
>How can we make AFM actually work for the "facility" file-set? If we shut
>down GPFS on the node, on the secondary node, we'll see log entries like:
>2017-10-09_13:35:30.330+0100: [I] AFM: Found 1069575 local remove
>operations...
>So I'm assuming the massive queue is all file remove operations?

These are files that were created in the cache and deleted before they 
were replicated to home. AFM recovery will delete them locally. Yes, it 
is possible that most of the queued operations are local remove 
operations. Try finding them with the dump command:

 mmfsadm saferdump afm all | grep 'Remove\|Rmdir' | grep local | wc -l
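
If it helps to see how the count splits between file removes and directory removes, the same filter can be written as a small tally. This is only a sketch: the dump format is not shown in this thread, so it assumes one queued operation per line containing the operation name and the word "local", which is the same assumption the grep pipeline above makes. The function name count_local_removes is made up for illustration.

```shell
#!/bin/sh
# Tally queued local Remove and Rmdir operations separately.
# Reads dump text on stdin and matches the same keywords as the
# grep pipeline above (case-sensitive).
# Assumption: one operation per line, with the operation name and
# the word "local" on that line.
count_local_removes() {
    awk '
        /local/ && /Rmdir/  { rmdir++; next }
        /local/ && /Remove/ { remove++ }
        END { printf "Remove=%d Rmdir=%d\n", remove, rmdir }
    '
}

# Usage (on the gateway node):
#   mmfsadm saferdump afm all | count_local_removes
```

The two counts together should match the single number that the wc -l pipeline reports, as long as no line mentions both operation names.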


>Alarmingly, we are also seeing entries like:
>2017-10-09_13:54:26.591+0100: [E] AFM: WriteSplit file system rds-cache
>fileset rds-projects-2017 file IDs [5389550.5389550.-1.-1,R] name  remote
>error 5

Traces are needed to verify the I/O errors. Also try disabling parallel 
I/O and see whether replication speed improves:

mmchfileset device fileset -p afmParallelWriteThreshold=disable
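
Before collecting traces, it may be useful to know how many of these errors there are and which filesets they hit. The sketch below tallies "[E] AFM:" lines by operation, fileset and error code. It assumes each log entry is on a single line in the shape quoted above (operation name right after "AFM:", fileset name right after the word "fileset", error code as the last field). The function name afm_errors_by_fileset is hypothetical, and the log path in the usage comment is the usual default location, which may differ on your nodes.

```shell
#!/bin/sh
# Count AFM error entries per operation, fileset and error code.
# Reads log text on stdin.
# Assumption: entries look like
#   <timestamp>: [E] AFM: <Op> file system <fs> fileset <name> ... error <n>
# and are not wrapped across lines.
afm_errors_by_fileset() {
    awk '
        index($0, "[E] AFM:") {
            op = ""; fs = ""
            for (i = 1; i < NF; i++) {
                if ($i == "AFM:")    op = $(i + 1)
                if ($i == "fileset") fs = $(i + 1)
            }
            if (fs != "") count[op " " fs " error=" $NF]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -rn
}

# Usage (log path is an assumption, adjust as needed):
#   afm_errors_by_fileset < /var/adm/ras/mmfs.log.latest
```

Each output line has the form "count operation fileset error=code", with the most frequent combination first.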

~Venkat (vpuvvada at in.ibm.com)



From:   "Simon Thompson (IT Research Support)" <S.J.Thompson at bham.ac.uk>
To:     "gpfsug-discuss at spectrumscale.org" 
<gpfsug-discuss at spectrumscale.org>
Date:   10/09/2017 06:27 PM
Subject:        [gpfsug-discuss] AFM fun (more!)
Sent by:        gpfsug-discuss-bounces at spectrumscale.org




Hi All,

We're having fun (ok not fun ...) with AFM.

We have a file-set where the queue length isn't shortening. Watching it
over 5-second periods, the queue length increases by ~600-1000 items and
numExec goes up by about 15k.

The queues are steadily rising and we've seen them over 1000000 ...

This is on one particular fileset e.g.:

mmafmctl rds-cache getstate
                       Mon Oct  9 08:43:58 2017

Fileset Name           Fileset Target                 Cache State  Gateway Node  Queue Length  Queue numExec
------------           --------------                 -----------  ------------  ------------  -------------
rds-projects-facility  gpfs:///rds/projects/facility  Dirty        bber-afmgw01  3068953       520504
rds-projects-2015      gpfs:///rds/projects/2015      Active       bber-afmgw01  0             3
rds-projects-2016      gpfs:///rds/projects/2016      Dirty        bber-afmgw01  1482          70
rds-projects-2017      gpfs:///rds/projects/2017      Dirty        bber-afmgw01  713           9104
bear-apps              gpfs:///rds/bear-apps          Dirty        bber-afmgw02  3             2472770871
user-homes             gpfs:///rds/homes              Active       bber-afmgw02  0             19
bear-sysapps           gpfs:///rds/bear-sysapps       Active       bber-afmgw02  0             4



This is having the effect that other filesets on the same "Gateway" are
not getting their queues processed.
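
For anyone wanting to reproduce the "watch it over 5 seconds" measurement, a rough sketch follows. It assumes the getstate output arrives unwrapped, with six whitespace-separated columns per fileset row as in the listing above; the helper names queue_lengths and queue_delta and the temporary file names are made up for illustration.

```shell
#!/bin/sh
# Print "fileset queue_length" for every data row of getstate output
# read from stdin. Header, separator and date lines are skipped
# because their fifth and sixth fields are not plain numbers.
# Assumption: six columns per row (name, target, cache state,
# gateway node, queue length, numExec), not wrapped.
queue_lengths() {
    awk 'NF == 6 && $5 ~ /^[0-9]+$/ && $6 ~ /^[0-9]+$/ { print $1, $5 }'
}

# Compare two snapshots written by queue_lengths and print the signed
# change in queue length for each fileset present in both.
queue_delta() {
    awk 'NR == FNR { before[$1] = $2; next }
         $1 in before { printf "%s %+d\n", $1, $2 - before[$1] }' "$1" "$2"
}

# Usage:
#   mmafmctl rds-cache getstate | queue_lengths > /tmp/q1
#   sleep 5
#   mmafmctl rds-cache getstate | queue_lengths > /tmp/q2
#   queue_delta /tmp/q1 /tmp/q2
```

A positive delta for a fileset means its queue grew between the two samples.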

Question 1.
Can we force the gateway node for the other file-sets to our "02" node,
i.e. so that we can get the queues serviced for the other filesets?

Question 2.
How can we make AFM actually work for the "facility" file-set? If we shut
down GPFS on the node, on the secondary node, we'll see log entries like:
2017-10-09_13:35:30.330+0100: [I] AFM: Found 1069575 local remove
operations...

So I'm assuming the massive queue is all file remove operations?

Alarmingly, we are also seeing entries like:
2017-10-09_13:54:26.591+0100: [E] AFM: WriteSplit file system rds-cache
fileset rds-projects-2017 file IDs [5389550.5389550.-1.-1,R] name  remote
error 5

Anyone any suggestions?

Thanks

Simon

