[gpfsug-discuss] Backend corruption

Uwe Falke UWEFALKE at de.ibm.com
Tue Aug 4 10:31:29 BST 2020


Hi Stef, 

> So the policy is not finding any files, but there is still some data in 
> the V500003 pool?
That is how it looks to me. You attempted to empty the pool before, didn't you? 
Maybe something got confused internally along the way, or the policy finds 
only readable files and the corrupted ones carry an internal flag marking them 
as unreadable ... If you know faulty (but, according to the current metadata, 
occupied) disk addresses, you could use mmfileid to find the inode which 
should have used that block.
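For example (an untested sketch; 'fsdev' stands for your file system device 
and 'v5000nsd01' for one of the V5000 NSDs, both only placeholders):

   # list the inodes/files that have blocks on the suspect NSD
   mmfileid fsdev -d :v5000nsd01 -o /tmp/v5000nsd01.files

mmfileid also accepts a node:disk:sector-range descriptor if you only want 
to check specific bad block addresses; see the command reference for the 
exact syntax.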
But that's all just guesswork. I think someone who knows exactly what 
Scale does in such situations (restriping from faulty storage) should be 
able to tell what's up in your system. If you don't find an answer here, 
I'd suggest you open a case with IBM support.

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
Global Technology Services / Project Services Delivery / High Performance 
Computing
+49 175 575 2877 Mobile
Rathausstr. 7, 09111 Chemnitz, Germany
uwefalke at de.ibm.com

IBM Services

IBM Data Privacy Statement

IBM Deutschland Business & Technology Services GmbH
Geschäftsführung: Dr. Thomas Wolter, Sven Schooss
Sitz der Gesellschaft: Ehningen
Registergericht: Amtsgericht Stuttgart, HRB 17122



From:   Stef Coene <stef.coene at docum.org>
To:     gpfsug-discuss at spectrumscale.org
Date:   04/08/2020 08:41
Subject:        [EXTERNAL] Re: [gpfsug-discuss] Backend corruption
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



Hi,

I tried to use a policy to find out which files are located on the broken 
disks.
But it is not finding any files or directories (I cleaned up some of the 
output):

[I] GPFS Current Data Pool Utilization in KB and %
Pool_Name                   KB_Occupied        KB_Total  Percent_Occupied
V500003                       173121536     69877104640      0.247751444%

[I] 29609813 of 198522880 inodes used: 14.915063%.

[I] Loaded policy rules from test.rule.
rule 'ListRule'
    list 'ListName'
       from pool 'V500003'

[I] Directories scan: 28649029 files, 960844 directories, 0 other objects, 0 'skipped' files and/or errors.

[I] Inodes scan: 28649029 files, 960844 directories, 0 other objects, 0 'skipped' files and/or errors.

[I] Summary of Rule Applicability and File Choices:
  Rule#   Hit_Cnt   KB_Hit   Chosen   KB_Chosen   KB_Ill   Rule
      0         0        0        0           0        0   RULE 'ListRule' LIST 'ListName' FROM POOL 'V500003'

[I] Filesystem objects with no applicable rules: 29609873.

[I] A total of 0 files have been migrated, deleted or processed by an EXTERNAL EXEC/script;
         0 'skipped' files and/or errors.


So the policy is not finding any files, but there is still some data in 
the V500003 pool?


Stef

On 2020-08-03 17:21, Uwe Falke wrote:
> Hi, Stef,
> 
> if just that V5000 has provided the storage for one of your pools
> entirely, and if your metadata are still uncorrupted, an inode scan with a
> suitable policy should yield the list of files in that pool.
> If I am not mistaken, the list policy could look like
> 
> RULE 'list_v5000'  LIST 'v5000_filelist'  FROM POOL <your_v5000_pool>
> 
> Put it into a (policy) file, run that with mmapplypolicy against the file
> system in question, and it should produce a file listing in
> /tmp/v5000_filelist. If it doesn't work exactly like that (I might have
> made one or more mistakes), check out the information lifecycle section in
> the Scale admin guide.
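> A minimal sketch of such a run (untested; 'fsdev' and the file names are
> only placeholders):
> 
>    # trial run first, then write the candidate file list without acting on it
>    mmapplypolicy fsdev -P /tmp/list_v5000.pol -I test
>    mmapplypolicy fsdev -P /tmp/list_v5000.pol -f /tmp/v5000 -I defer
> 
> With -I defer nothing is migrated or deleted; the matching files should be
> written to list file(s) under the /tmp/v5000 prefix.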
> 
> If the prereqs for the above are not met, you need to run more expensive
> investigations (using tsdbfs for all block addresses on v5000-provided
> NSDs).
> 
> Mit freundlichen Grüßen / Kind regards
> 
> Dr. Uwe Falke
> IT Specialist
> Global Technology Services / Project Services Delivery / High Performance
> Computing
> +49 175 575 2877 Mobile
> Rathausstr. 7, 09111 Chemnitz, Germany
> uwefalke at de.ibm.com
> 
> IBM Services
> 
> IBM Data Privacy Statement
> 
> IBM Deutschland Business & Technology Services GmbH
> Geschäftsführung: Dr. Thomas Wolter, Sven Schooss
> Sitz der Gesellschaft: Ehningen
> Registergericht: Amtsgericht Stuttgart, HRB 17122
> 
> 
> 
> From:   Stef Coene <stef.coene at docum.org>
> To:     gpfsug-discuss at spectrumscale.org
> Date:   03/08/2020 16:07
> Subject:        [EXTERNAL] [gpfsug-discuss] Backend corruption
> Sent by:        gpfsug-discuss-bounces at spectrumscale.org
> 
> 
> 
> Hi,
> 
> We have a GPFS file system which uses, among other storage, a V5000 as
> backend.
> There was an error in the fire detection alarm in the datacenter and a
> fire alarm was triggered.
> The result was that the V5000 had a lot of broken disks. Most of the
> disks recovered fine after a reseat, but some data is corrupted on the
> V5000.
> 
> This means that for 22 MB of data, the V5000 returns a read error to
> GPFS.
> 
> We migrated most of the data to other disks, but there is still 165 GB left
> in the V5000 pool.
> 
> When we try to remove the disks with mmdeldisk, it fails after a while
> and marks some of the disks as down.
> It generated a file with inodes; this is an example of 2 entries:
>    9168519      0:0        0           1                 1
>     exposed illreplicated illplaced REGULAR_FILE Error: 218 Input/output error
>    9251611      0:0        0           1                 1
>     exposed illreplicated REGULAR_FILE Error: 218 Input/output error
> 
> 
> How can I get a list of files that use data in the V5000 pool?
> The data is written by CommVault. When I have a list of files, I can
> determine the impact on the application.
> 
> 
> Stef
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> 
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss








