Simon,

> Question 1.
> Can we force the gateway node for the other file-sets to our "02" node?
> I.e. so that we can get the queue services for the other filesets.

AFM automatically maps each fileset to a gateway node; today there is no
option for users to assign a fileset to a particular gateway node. This
feature will be supported in a future release.
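Until then, you can at least see which gateway node each fileset is
currently mapped to from the getstate output. A rough sketch, assuming
the fileset name and gateway node are the first and fourth columns, as
in the output you pasted below:

  # list fileset -> gateway node pairs, skipping the two header lines
  mmafmctl rds-cache getstate | awk 'NR>2 {print $1, $4}'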
> Question 2.
> How can we make AFM actually work for the "facility" file-set? If we shut
> down GPFS on the node, on the secondary node, we'll see log entries like:
> 2017-10-09_13:35:30.330+0100: [I] AFM: Found 1069575 local remove
> operations...
> So I'm assuming the massive queue is all file remove operations?

These are files which were created in the cache and deleted before they
were replicated to home; AFM recovery will delete them locally. Yes, it
is possible that most of these operations are local removes. You can
count them with the dump command:

  mmfsadm saferdump afm all | grep 'Remove\|Rmdir' | grep local | wc -l
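If you want file and directory removes counted separately, the same dump
can be split into two counts (assuming the same 'Remove' and 'Rmdir'
keywords as above):

  # count local file removes, then local directory removes
  mmfsadm saferdump afm all | grep local | grep -c 'Remove'
  mmfsadm saferdump afm all | grep local | grep -c 'Rmdir'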
> Alarmingly, we are also seeing entries like:
> 2017-10-09_13:54:26.591+0100: [E] AFM: WriteSplit file system rds-cache
> fileset rds-projects-2017 file IDs [5389550.5389550.-1.-1,R] name remote
> error 5

Traces are needed to verify the IO errors (remote error 5 is EIO, an I/O
error at home). Also try disabling parallel IO and see whether the
replication speed improves:

  mmchfileset device fileset -p afmParallelWriteThreshold=disable
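To confirm the change took effect, you can list the fileset's AFM
attributes; a sketch, with "device" and "fileset" as placeholders for
your file system and fileset names:

  # show AFM attributes for the fileset, including write thresholds
  mmlsfileset device fileset --afm -L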
~Venkat (vpuvvada@in.ibm.com)


From: "Simon Thompson (IT Research Support)" <S.J.Thompson@bham.ac.uk>
To: "gpfsug-discuss@spectrumscale.org" <gpfsug-discuss@spectrumscale.org>
Date: 10/09/2017 06:27 PM
Subject: [gpfsug-discuss] AFM fun (more!)
Sent by: gpfsug-discuss-bounces@spectrumscale.org

----------------------------------------------------------------------

Hi All,

We're having fun (ok, not fun ...) with AFM.

We have a file-set where the queue length isn't shortening; watching it
over 5 sec periods, the queue length increases by ~600-1000 items, and
the numExec goes up by about 15k.

The queues are steadily rising and we've seen them over 1000000 ...

This is on one particular fileset, e.g.:

mmafmctl rds-cache getstate
Mon Oct 9 08:43:58 2017

Fileset Name           Fileset Target                 Cache State  Gateway Node  Queue Length  Queue numExec
---------------------  -----------------------------  -----------  ------------  ------------  -------------
rds-projects-facility  gpfs:///rds/projects/facility  Dirty        bber-afmgw01       3068953         520504
rds-projects-2015      gpfs:///rds/projects/2015      Active       bber-afmgw01             0              3
rds-projects-2016      gpfs:///rds/projects/2016      Dirty        bber-afmgw01          1482             70
rds-projects-2017      gpfs:///rds/projects/2017      Dirty        bber-afmgw01           713           9104
bear-apps              gpfs:///rds/bear-apps          Dirty        bber-afmgw02             3     2472770871
user-homes             gpfs:///rds/homes              Active       bber-afmgw02             0             19
bear-sysapps           gpfs:///rds/bear-sysapps       Active       bber-afmgw02             0              4

This is having the effect that other filesets on the same "Gateway" are
not getting their queues processed.

Question 1.
Can we force the gateway node for the other file-sets to our "02" node?
I.e. so that we can get the queue services for the other filesets.

Question 2.
How can we make AFM actually work for the "facility" file-set? If we shut
down GPFS on the node, on the secondary node, we'll see log entries like:
2017-10-09_13:35:30.330+0100: [I] AFM: Found 1069575 local remove
operations...

So I'm assuming the massive queue is all file remove operations?

Alarmingly, we are also seeing entries like:
2017-10-09_13:54:26.591+0100: [E] AFM: WriteSplit file system rds-cache
fileset rds-projects-2017 file IDs [5389550.5389550.-1.-1,R] name remote
error 5

Anyone any suggestions?

Thanks

Simon

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss