[gpfsug-discuss] Sub-block size wrong on GPFS 5 filesystem?

Sven Oehme oehmes at gmail.com
Wed Aug 1 19:41:05 BST 2018


The number of subblocks per full block is derived from the smallest block size
in any pool of a given filesystem. So if you pick a metadata block size of 1M,
the subblock size will be 8k in the metadata pool, but 4x that (32k) in the
data pool if your data pool block size is 4M.
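
To make the arithmetic concrete, here is a rough Python sketch of that rule
(block-size table taken from the mmcrfs man page quoted further down; this is
only an illustration, not the actual GPFS code):

KIB = 1024
MIB = 1024 * KIB

def subblock_size_for(block_size):
    # Subblock size for a given block size, per the mmcrfs man page table.
    if block_size <= 64 * KIB:
        return 2 * KIB
    if block_size <= 128 * KIB:
        return 4 * KIB
    if block_size <= 4 * MIB:
        return 8 * KIB
    return 16 * KIB

def effective_fragments(pool_block_sizes):
    # One subblocks-per-full-block value for the whole filesystem, derived
    # from the smallest pool block size; pools with larger blocks inherit
    # the same count, so their subblocks grow proportionally.
    smallest = min(pool_block_sizes.values())
    per_block = smallest // subblock_size_for(smallest)
    return per_block, {p: bs // per_block for p, bs in pool_block_sizes.items()}

count, frag = effective_fragments({"system": 1 * MIB, "raid1": 4 * MIB, "raid6": 4 * MIB})
print(count)  # 128 subblocks per full block
print(frag)   # {'system': 8192, 'raid1': 32768, 'raid6': 32768}

which matches the 8192 / 32768 fragment sizes and the 128 subblocks per full
block that mmlsfs reports below.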

sven


On Wed, Aug 1, 2018 at 11:21 AM Felipe Knop <knop at us.ibm.com> wrote:

> Marc, Kevin,
>
> We'll be looking into this issue, since at least at first glance it
> does look odd. A 4MB block size should have resulted in an 8KB subblock
> size. I suspect that, somehow, the --metadata-block-size 1M may have
> resulted in
>
>
> 32768 Minimum fragment (subblock) size in bytes (other pools)
>
> but I do not yet understand how.
>
> The subblocks-per-full-block parameter is not supported with mmcrfs.
>
> Felipe
>
> ----
> Felipe Knop knop at us.ibm.com
> GPFS Development and Security
> IBM Systems
> IBM Building 008
> 2455 South Rd, Poughkeepsie, NY 12601
> (845) 433-9314 T/L 293-9314
>
>
>
> From: "Marc A Kaplan" <makaplan at us.ibm.com>
>
>
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>
> Date: 08/01/2018 01:21 PM
> Subject: Re: [gpfsug-discuss] Sub-block size wrong on GPFS 5 filesystem?
>
>
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------
>
>
>
> I haven't looked into all the details but here's a clue -- notice there is
> only one "subblocks-per-full-block" parameter.
>
> And it is the same for both metadata blocks and data blocks.
>
> So maybe (MAYBE) that is a constraint somewhere...
>
> Certainly, in the currently supported code, that's what you get.
>
>
>
>
> From: "Buterbaugh, Kevin L" <Kevin.Buterbaugh at Vanderbilt.Edu>
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date: 08/01/2018 12:55 PM
> Subject: [gpfsug-discuss] Sub-block size wrong on GPFS 5 filesystem?
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------
>
>
>
> Hi All,
>
> Our production cluster is still on GPFS 4.2.3.x, but in preparation for
> moving to GPFS 5 I have upgraded our small (7 node) test cluster to GPFS
> 5.0.1-1. I am setting up a new filesystem there using hardware that we
> recently life-cycled out of our production environment.
>
> I “successfully” created a filesystem but I believe the sub-block size is
> wrong. I’m using a 4 MB filesystem block size, so according to the mmcrfs
> man page the sub-block size should be 8K:
>
> Table 1. Block sizes and subblock sizes
>
> +-------------------------------+---------------+
> | Block size                    | Subblock size |
> +-------------------------------+---------------+
> | 64 KiB                        | 2 KiB         |
> +-------------------------------+---------------+
> | 128 KiB                       | 4 KiB         |
> +-------------------------------+---------------+
> | 256 KiB, 512 KiB, 1 MiB,      | 8 KiB         |
> | 2 MiB, 4 MiB                  |               |
> +-------------------------------+---------------+
> | 8 MiB, 16 MiB                 | 16 KiB        |
> +-------------------------------+---------------+
>
> However, it appears that it’s 8K for the system pool but 32K for the other
> pools:
>
> flag value description
> ------------------- ------------------------ -----------------------------------
> -f 8192 Minimum fragment (subblock) size in bytes (system pool)
> 32768 Minimum fragment (subblock) size in bytes (other pools)
> -i 4096 Inode size in bytes
> -I 32768 Indirect block size in bytes
> -m 2 Default number of metadata replicas
> -M 3 Maximum number of metadata replicas
> -r 1 Default number of data replicas
> -R 3 Maximum number of data replicas
> -j scatter Block allocation type
> -D nfs4 File locking semantics in effect
> -k all ACL semantics in effect
> -n 32 Estimated number of nodes that will mount file system
> -B 1048576 Block size (system pool)
> 4194304 Block size (other pools)
> -Q user;group;fileset Quotas accounting enabled
> user;group;fileset Quotas enforced
> none Default quotas enabled
> --perfileset-quota No Per-fileset quota enforcement
> --filesetdf No Fileset df enabled?
> -V 19.01 (5.0.1.0) File system version
> --create-time Wed Aug 1 11:39:39 2018 File system creation time
> -z No Is DMAPI enabled?
> -L 33554432 Logfile size
> -E Yes Exact mtime mount option
> -S relatime Suppress atime mount option
> -K whenpossible Strict replica allocation option
> --fastea Yes Fast external attributes enabled?
> --encryption No Encryption enabled?
> --inode-limit 101095424 Maximum number of inodes
> --log-replicas 0 Number of log replicas
> --is4KAligned Yes is4KAligned?
> --rapid-repair Yes rapidRepair enabled?
> --write-cache-threshold 0 HAWC Threshold (max 65536)
> --subblocks-per-full-block 128 Number of subblocks per full block
> -P system;raid1;raid6 Disk storage pools in file system
> --file-audit-log No File Audit Logging enabled?
> --maintenance-mode No Maintenance Mode enabled?
> -d
> test21A3nsd;test21A4nsd;test21B3nsd;test21B4nsd;test23Ansd;test23Bnsd;test23Cnsd;test24Ansd;test24Bnsd;test24Cnsd;test25Ansd;test25Bnsd;test25Cnsd
> Disks in file system
> -A yes Automatic mount option
> -o none Additional mount options
> -T /gpfs5 Default mount point
> --mount-priority 0 Mount priority
>
> Output of mmcrfs:
>
> mmcrfs gpfs5 -F ~/gpfs/gpfs5.stanza -A yes -B 4M -E yes -i 4096 -j scatter
> -k all -K whenpossible -m 2 -M 3 -n 32 -Q yes -r 1 -R 3 -T /gpfs5 -v yes
> --nofilesetdf --metadata-block-size 1M
>
> The following disks of gpfs5 will be formatted on node testnsd3:
> test21A3nsd: size 953609 MB
> test21A4nsd: size 953609 MB
> test21B3nsd: size 953609 MB
> test21B4nsd: size 953609 MB
> test23Ansd: size 15259744 MB
> test23Bnsd: size 15259744 MB
> test23Cnsd: size 1907468 MB
> test24Ansd: size 15259744 MB
> test24Bnsd: size 15259744 MB
> test24Cnsd: size 1907468 MB
> test25Ansd: size 15259744 MB
> test25Bnsd: size 15259744 MB
> test25Cnsd: size 1907468 MB
> Formatting file system ...
> Disks up to size 8.29 TB can be added to storage pool system.
> Disks up to size 16.60 TB can be added to storage pool raid1.
> Disks up to size 132.62 TB can be added to storage pool raid6.
> Creating Inode File
> 8 % complete on Wed Aug 1 11:39:19 2018
> 18 % complete on Wed Aug 1 11:39:24 2018
> 27 % complete on Wed Aug 1 11:39:29 2018
> 37 % complete on Wed Aug 1 11:39:34 2018
> 48 % complete on Wed Aug 1 11:39:39 2018
> 60 % complete on Wed Aug 1 11:39:44 2018
> 72 % complete on Wed Aug 1 11:39:49 2018
> 83 % complete on Wed Aug 1 11:39:54 2018
> 95 % complete on Wed Aug 1 11:39:59 2018
> 100 % complete on Wed Aug 1 11:40:01 2018
> Creating Allocation Maps
> Creating Log Files
> 3 % complete on Wed Aug 1 11:40:07 2018
> 28 % complete on Wed Aug 1 11:40:14 2018
> 53 % complete on Wed Aug 1 11:40:19 2018
> 78 % complete on Wed Aug 1 11:40:24 2018
> 100 % complete on Wed Aug 1 11:40:25 2018
> Clearing Inode Allocation Map
> Clearing Block Allocation Map
> Formatting Allocation Map for storage pool system
> 85 % complete on Wed Aug 1 11:40:32 2018
> 100 % complete on Wed Aug 1 11:40:33 2018
> Formatting Allocation Map for storage pool raid1
> 53 % complete on Wed Aug 1 11:40:38 2018
> 100 % complete on Wed Aug 1 11:40:42 2018
> Formatting Allocation Map for storage pool raid6
> 20 % complete on Wed Aug 1 11:40:47 2018
> 39 % complete on Wed Aug 1 11:40:52 2018
> 60 % complete on Wed Aug 1 11:40:57 2018
> 79 % complete on Wed Aug 1 11:41:02 2018
> 100 % complete on Wed Aug 1 11:41:08 2018
> Completed creation of file system /dev/gpfs5.
> mmcrfs: Propagating the cluster configuration data to all
> affected nodes. This is an asynchronous process.
>
> And contents of stanza file:
>
> %nsd:
> nsd=test21A3nsd
> usage=metadataOnly
> failureGroup=210
> pool=system
> servers=testnsd3,testnsd1,testnsd2
> device=dm-15
>
> %nsd:
> nsd=test21A4nsd
> usage=metadataOnly
> failureGroup=210
> pool=system
> servers=testnsd1,testnsd2,testnsd3
> device=dm-14
>
> %nsd:
> nsd=test21B3nsd
> usage=metadataOnly
> failureGroup=211
> pool=system
> servers=testnsd1,testnsd2,testnsd3
> device=dm-17
>
> %nsd:
> nsd=test21B4nsd
> usage=metadataOnly
> failureGroup=211
> pool=system
> servers=testnsd2,testnsd3,testnsd1
> device=dm-16
>
> %nsd:
> nsd=test23Ansd
> usage=dataOnly
> failureGroup=23
> pool=raid6
> servers=testnsd2,testnsd3,testnsd1
> device=dm-10
>
> %nsd:
> nsd=test23Bnsd
> usage=dataOnly
> failureGroup=23
> pool=raid6
> servers=testnsd3,testnsd1,testnsd2
> device=dm-9
>
> %nsd:
> nsd=test23Cnsd
> usage=dataOnly
> failureGroup=23
> pool=raid1
> servers=testnsd1,testnsd2,testnsd3
> device=dm-5
>
> %nsd:
> nsd=test24Ansd
> usage=dataOnly
> failureGroup=24
> pool=raid6
> servers=testnsd3,testnsd1,testnsd2
> device=dm-6
>
> %nsd:
> nsd=test24Bnsd
> usage=dataOnly
> failureGroup=24
> pool=raid6
> servers=testnsd1,testnsd2,testnsd3
> device=dm-0
>
> %nsd:
> nsd=test24Cnsd
> usage=dataOnly
> failureGroup=24
> pool=raid1
> servers=testnsd2,testnsd3,testnsd1
> device=dm-2
>
> %nsd:
> nsd=test25Ansd
> usage=dataOnly
> failureGroup=25
> pool=raid6
> servers=testnsd1,testnsd2,testnsd3
> device=dm-6
>
> %nsd:
> nsd=test25Bnsd
> usage=dataOnly
> failureGroup=25
> pool=raid6
> servers=testnsd2,testnsd3,testnsd1
> device=dm-6
>
> %nsd:
> nsd=test25Cnsd
> usage=dataOnly
> failureGroup=25
> pool=raid1
> servers=testnsd3,testnsd1,testnsd2
> device=dm-3
>
> %pool:
> pool=system
> blockSize=1M
> usage=metadataOnly
> layoutMap=scatter
> allowWriteAffinity=no
>
> %pool:
> pool=raid6
> blockSize=4M
> usage=dataOnly
> layoutMap=scatter
> allowWriteAffinity=no
>
> %pool:
> pool=raid1
> blockSize=4M
> usage=dataOnly
> layoutMap=scatter
> allowWriteAffinity=no
>
> What am I missing or what have I done wrong? Thanks…
>
> Kevin
> Kevin Buterbaugh - Senior System Administrator
> Vanderbilt University - Advanced Computing Center for Research and
> Education
> Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>