<div dir="ltr">The number of subblocks per full block is derived from the smallest block size in any pool of a given filesystem. So if you pick a metadata block size of 1M, the subblock size will be 8K in the metadata pool, but 4x that (32K) in the data pool if your data pool block size is 4M.<div><br></div><div>Sven</div><div><br><br><div class="gmail_quote"><div dir="ltr">On Wed, Aug 1, 2018 at 11:21 AM Felipe Knop <<a href="mailto:knop@us.ibm.com">knop@us.ibm.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><p><font size="2">Marc, Kevin,</font><br><br><font size="2">We'll be looking into this issue, since at first glance it does look odd. A 4MB block size should have resulted in an 8KB subblock size. I suspect that, somehow, the </font><b><font size="2">--metadata-block-size</font></b><b><font size="2"> 1M</font></b><font size="2"> may have resulted in </font><br></p></div><div><p><br> 32768 Minimum fragment (subblock) size in bytes (other pools)<br><br></p></div><div><p><font size="2">but I do not yet understand how.</font><br><br><font size="2">The </font><b><font size="2">subblocks-per-full-block</font></b><font size="2"> parameter is not supported with </font><b><font size="2">mmcrfs</font></b><font size="2">.</font><br><br><font size="2"> Felipe</font><br><br><font size="2">----<br>Felipe Knop <a href="mailto:knop@us.ibm.com" target="_blank">knop@us.ibm.com</a><br>GPFS Development and Security<br>IBM Systems<br>IBM Building 008<br>2455 South Rd, Poughkeepsie, NY 12601<br><a href="tel:(845)%20433-9314" value="+18454339314" target="_blank">(845) 433-9314</a> T/L 293-9314<br><br></font><br><br></p><br><br><font size="2" color="#5F5F5F">From: 
</font><font size="2">"Marc A Kaplan" <<a href="mailto:makaplan@us.ibm.com" target="_blank">makaplan@us.ibm.com</a>></font><p></p></div><div><p><br><font size="2" color="#5F5F5F">To: </font><font size="2">gpfsug main discussion list <<a href="mailto:gpfsug-discuss@spectrumscale.org" target="_blank">gpfsug-discuss@spectrumscale.org</a>></font><br></p></div><div><p><font size="2" color="#5F5F5F">Date: </font><font size="2">08/01/2018 01:21 PM</font><br><font size="2" color="#5F5F5F">Subject: </font><font size="2">Re: [gpfsug-discuss] Sub-block size wrong on GPFS 5 filesystem?</font></p></div><div><p><br><font size="2" color="#5F5F5F">Sent by: </font><font size="2"><a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a></font><br></p><hr width="100%" size="2" align="left" noshade style="color:#8091a5"><br><br><br><font size="2">I haven't looked into all the details but here's a clue -- notice there is only one "subblocks-per-full-block" parameter. 
</font><br><font size="2"><br>And it is the same for both metadata blocks and data blocks.<br><br>So maybe (MAYBE) that is a constraint somewhere...</font><br><font size="2"><br>Certainly, in the currently supported code, that's what you get.</font><br><br><br><br><font size="2" color="#5F5F5F"><br>From: </font><font size="2">"Buterbaugh, Kevin L" <Kevin.Buterbaugh@Vanderbilt.Edu></font><font size="2" color="#5F5F5F"><br>To: </font><font size="2">gpfsug main discussion list <<a href="mailto:gpfsug-discuss@spectrumscale.org" target="_blank">gpfsug-discuss@spectrumscale.org</a>></font><font size="2" color="#5F5F5F"><br>Date: </font><font size="2">08/01/2018 12:55 PM</font><font size="2" color="#5F5F5F"><br>Subject: </font><font size="2">[gpfsug-discuss] Sub-block size wrong on GPFS 5 filesystem?</font><font size="2" color="#5F5F5F"><br>Sent by: </font><font size="2"><a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">gpfsug-discuss-bounces@spectrumscale.org</a></font><br><hr width="100%" size="2" align="left" noshade><br><br><br>Hi All, <br><br>Our production cluster is still on GPFS 4.2.3.x, but in preparation for moving to GPFS 5 I have upgraded our small (7 node) test cluster to GPFS 5.0.1-1. I am setting up a new filesystem there using hardware that we recently life-cycled out of our production environment.<br><br>I “successfully” created a filesystem but I believe the sub-block size is wrong. I’m using a 4 MB filesystem block size, so according to the mmcrfs man page the sub-block size should be 8K:<br><br> Table 1. 
Block sizes and subblock sizes<br><br>+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+<br>| Block size | Subblock size |<br>+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+<br>| 64 KiB | 2 KiB |<br>+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+<br>| 128 KiB | 4 KiB |<br>+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+<br>| 256 KiB, 512 KiB, 1 MiB, 2 | 8 KiB |<br>| MiB, 4 MiB | |<br>+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+<br>| 8 MiB, 16 MiB | 16 KiB |<br>+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐+<br><br>However, it appears that it’s 8K for the system pool but 32K for the other pools:<br><br>flag value description<br>------------------- ------------------------ -----------------------------------<br> -f 8192 Minimum fragment (subblock) size in bytes (system pool)<br> 32768 Minimum fragment (subblock) size in bytes (other pools)<br> -i 4096 Inode size in bytes<br> -I 32768 Indirect block size in bytes<br> -m 2 Default number of metadata replicas<br> -M 3 Maximum number of metadata replicas<br> -r 1 Default number of data replicas<br> -R 3 Maximum number of data replicas<br> -j scatter Block allocation type<br> -D nfs4 File locking semantics in effect<br> -k all ACL semantics in effect<br> -n 32 Estimated number of nodes that will mount file system<br> -B 1048576 Block size (system pool)<br> 4194304 Block size (other pools)<br> -Q user;group;fileset Quotas accounting enabled<br> user;group;fileset Quotas enforced<br> none Default quotas enabled<br> --perfileset-quota No Per-fileset quota enforcement<br> --filesetdf No Fileset df enabled?<br> -V 19.01 (5.0.1.0) File system version<br> --create-time Wed Aug 1 11:39:39 2018 File system creation time<br> -z No Is DMAPI enabled?<br> -L 33554432 Logfile size<br> -E Yes Exact mtime mount option<br> -S relatime Suppress atime mount option<br> -K whenpossible Strict replica allocation option<br> --fastea Yes 
Fast external attributes enabled?<br> --encryption No Encryption enabled?<br> --inode-limit 101095424 Maximum number of inodes<br> --log-replicas 0 Number of log replicas<br> --is4KAligned Yes is4KAligned?<br> --rapid-repair Yes rapidRepair enabled?<br> --write-cache-threshold 0 HAWC Threshold (max 65536)<br> --subblocks-per-full-block 128 Number of subblocks per full block<br> -P system;raid1;raid6 Disk storage pools in file system<br> --file-audit-log No File Audit Logging enabled?<br> --maintenance-mode No Maintenance Mode enabled?<br> -d test21A3nsd;test21A4nsd;test21B3nsd;test21B4nsd;test23Ansd;test23Bnsd;test23Cnsd;test24Ansd;test24Bnsd;test24Cnsd;test25Ansd;test25Bnsd;test25Cnsd Disks in file system<br> -A yes Automatic mount option<br> -o none Additional mount options<br> -T /gpfs5 Default mount point<br> --mount-priority 0 Mount priority<br><br>Output of mmcrfs:<br><br>mmcrfs gpfs5 -F ~/gpfs/gpfs5.stanza -A yes -B 4M -E yes -i 4096 -j scatter -k all -K whenpossible -m 2 -M 3 -n 32 -Q yes -r 1 -R 3 -T /gpfs5 -v yes --nofilesetdf --metadata-block-size 1M<br><br>The following disks of gpfs5 will be formatted on node testnsd3:<br> test21A3nsd: size 953609 MB<br> test21A4nsd: size 953609 MB<br> test21B3nsd: size 953609 MB<br> test21B4nsd: size 953609 MB<br> test23Ansd: size 15259744 MB<br> test23Bnsd: size 15259744 MB<br> test23Cnsd: size 1907468 MB<br> test24Ansd: size 15259744 MB<br> test24Bnsd: size 15259744 MB<br> test24Cnsd: size 1907468 MB<br> test25Ansd: size 15259744 MB<br> test25Bnsd: size 15259744 MB<br> test25Cnsd: size 1907468 MB<br>Formatting file system ...<br>Disks up to size 8.29 TB can be added to storage pool system.<br>Disks up to size 16.60 TB can be added to storage pool raid1.<br>Disks up to size 132.62 TB can be added to storage pool raid6.<br>Creating Inode File<br> 8 % complete on Wed Aug 1 11:39:19 2018<br> 18 % complete on Wed Aug 1 11:39:24 2018<br> 27 % complete on Wed Aug 1 11:39:29 2018<br> 37 % complete on Wed Aug 1 11:39:34 
2018<br> 48 % complete on Wed Aug 1 11:39:39 2018<br> 60 % complete on Wed Aug 1 11:39:44 2018<br> 72 % complete on Wed Aug 1 11:39:49 2018<br> 83 % complete on Wed Aug 1 11:39:54 2018<br> 95 % complete on Wed Aug 1 11:39:59 2018<br> 100 % complete on Wed Aug 1 11:40:01 2018<br>Creating Allocation Maps<br>Creating Log Files<br> 3 % complete on Wed Aug 1 11:40:07 2018<br> 28 % complete on Wed Aug 1 11:40:14 2018<br> 53 % complete on Wed Aug 1 11:40:19 2018<br> 78 % complete on Wed Aug 1 11:40:24 2018<br> 100 % complete on Wed Aug 1 11:40:25 2018<br>Clearing Inode Allocation Map<br>Clearing Block Allocation Map<br>Formatting Allocation Map for storage pool system<br> 85 % complete on Wed Aug 1 11:40:32 2018<br> 100 % complete on Wed Aug 1 11:40:33 2018<br>Formatting Allocation Map for storage pool raid1<br> 53 % complete on Wed Aug 1 11:40:38 2018<br> 100 % complete on Wed Aug 1 11:40:42 2018<br>Formatting Allocation Map for storage pool raid6<br> 20 % complete on Wed Aug 1 11:40:47 2018<br> 39 % complete on Wed Aug 1 11:40:52 2018<br> 60 % complete on Wed Aug 1 11:40:57 2018<br> 79 % complete on Wed Aug 1 11:41:02 2018<br> 100 % complete on Wed Aug 1 11:41:08 2018<br>Completed creation of file system /dev/gpfs5.<br>mmcrfs: Propagating the cluster configuration data to all<br> affected nodes. 
This is an asynchronous process.<br><br>And contents of stanza file:<br><br>%nsd:<br> nsd=test21A3nsd<br> usage=metadataOnly<br> failureGroup=210<br> pool=system<br> servers=testnsd3,testnsd1,testnsd2<br> device=dm-15<br><br>%nsd:<br> nsd=test21A4nsd<br> usage=metadataOnly<br> failureGroup=210<br> pool=system<br> servers=testnsd1,testnsd2,testnsd3<br> device=dm-14<br><br>%nsd:<br> nsd=test21B3nsd<br> usage=metadataOnly<br> failureGroup=211<br> pool=system<br> servers=testnsd1,testnsd2,testnsd3<br> device=dm-17<br><br>%nsd:<br> nsd=test21B4nsd<br> usage=metadataOnly<br> failureGroup=211<br> pool=system<br> servers=testnsd2,testnsd3,testnsd1<br> device=dm-16<br><br>%nsd:<br> nsd=test23Ansd<br> usage=dataOnly<br> failureGroup=23<br> pool=raid6<br> servers=testnsd2,testnsd3,testnsd1<br> device=dm-10<br><br>%nsd:<br> nsd=test23Bnsd<br> usage=dataOnly<br> failureGroup=23<br> pool=raid6<br> servers=testnsd3,testnsd1,testnsd2<br> device=dm-9<br><br>%nsd:<br> nsd=test23Cnsd<br> usage=dataOnly<br> failureGroup=23<br> pool=raid1<br> servers=testnsd1,testnsd2,testnsd3<br> device=dm-5<br><br>%nsd:<br> nsd=test24Ansd<br> usage=dataOnly<br> failureGroup=24<br> pool=raid6<br> servers=testnsd3,testnsd1,testnsd2<br> device=dm-6<br><br>%nsd:<br> nsd=test24Bnsd<br> usage=dataOnly<br> failureGroup=24<br> pool=raid6<br> servers=testnsd1,testnsd2,testnsd3<br> device=dm-0<br><br>%nsd:<br> nsd=test24Cnsd<br> usage=dataOnly<br> failureGroup=24<br> pool=raid1<br> servers=testnsd2,testnsd3,testnsd1<br> device=dm-2<br><br>%nsd:<br> nsd=test25Ansd<br> usage=dataOnly<br> failureGroup=25<br> pool=raid6<br> servers=testnsd1,testnsd2,testnsd3<br> device=dm-6<br><br>%nsd:<br> nsd=test25Bnsd<br> usage=dataOnly<br> failureGroup=25<br> pool=raid6<br> servers=testnsd2,testnsd3,testnsd1<br> device=dm-6<br><br>%nsd:<br> nsd=test25Cnsd<br> usage=dataOnly<br> failureGroup=25<br> pool=raid1<br> servers=testnsd3,testnsd1,testnsd2<br> device=dm-3<br><br>%pool:<br> pool=system<br> blockSize=1M<br> 
usage=metadataOnly<br> layoutMap=scatter<br> allowWriteAffinity=no<br><br>%pool:<br> pool=raid6<br> blockSize=4M<br> usage=dataOnly<br> layoutMap=scatter<br> allowWriteAffinity=no<br><br>%pool:<br> pool=raid1<br> blockSize=4M<br> usage=dataOnly<br> layoutMap=scatter<br> allowWriteAffinity=no<br><br>What am I missing or what have I done wrong? Thanks…<br><br>Kevin<br>—<br>Kevin Buterbaugh - Senior System Administrator<br>Vanderbilt University - Advanced Computing Center for Research and Education<u><font color="#0000FF"><br></font></u><a href="mailto:Kevin.Buterbaugh@vanderbilt.edu" target="_blank"><u><font color="#0000FF">Kevin.Buterbaugh@vanderbilt.edu</font></u></a> - <a href="tel:(615)%20875-9633" value="+16158759633" target="_blank">(615)875-9633</a><br><br><tt><font size="2"><br>_______________________________________________<br>gpfsug-discuss mailing list<br>gpfsug-discuss at <a href="http://spectrumscale.org" target="_blank">spectrumscale.org</a></font></tt><u><font color="#0000FF"><br></font></u><a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss" target="_blank"><tt><u><font size="2" color="#0000FF">http://gpfsug.org/mailman/listinfo/gpfsug-discuss</font></u></tt></a><br><br><br>
<p></p></div>
</blockquote></div></div></div>
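The derivation rule described at the top of the thread can be sketched numerically. This is an illustrative sketch, not GPFS source code: the subblocks-per-full-block count is chosen once per file system from the smallest block size in any pool (using the default subblock sizes from Table 1 of the mmcrfs man page quoted above), and each pool's subblock size is then its own block size divided by that count.

```python
# Sketch (assumption: simplified model, not actual GPFS code) of the
# GPFS 5 subblock-size rule discussed in this thread.

KIB = 1024
MIB = 1024 * KIB

def default_subblock_size(block_size):
    """Default subblock size for a block size, per Table 1 of the mmcrfs man page."""
    if block_size == 64 * KIB:
        return 2 * KIB
    if block_size == 128 * KIB:
        return 4 * KIB
    if block_size <= 4 * MIB:      # 256 KiB through 4 MiB
        return 8 * KIB
    return 16 * KIB                # 8 MiB, 16 MiB

def pool_subblock_sizes(pool_block_sizes):
    # One subblocks-per-full-block value for the whole file system,
    # derived from the SMALLEST block size among the pools.
    smallest = min(pool_block_sizes.values())
    per_block = smallest // default_subblock_size(smallest)
    sizes = {pool: bs // per_block for pool, bs in pool_block_sizes.items()}
    return sizes, per_block

# Kevin's layout: 1M metadata (system) pool, 4M data pools.
sizes, n = pool_subblock_sizes({"system": 1 * MIB, "raid1": 4 * MIB, "raid6": 4 * MIB})
print(n)       # 128 subblocks per full block, as mmlsfs reported
print(sizes)   # system: 8192 bytes; raid1/raid6: 32768 bytes
```

Under this model the mmlsfs output above is self-consistent: 1M/128 = 8K for the system pool and 4M/128 = 32K for the data pools. With a uniform 4M block size in every pool, the same sketch gives 512 subblocks of 8K each, which is the behavior Table 1 led Kevin to expect.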