[gpfsug-discuss] subblock sanity check in 5.0

Carl mutantllama at gmail.com
Mon Jul 2 10:57:11 BST 2018


Thanks, Olaf and Sven.

It looks like a lot of the advice on the wiki (
https://www.ibm.com/developerworks/community/wikis/home?lang=en-us#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Data%20and%20Metadata)
is no longer relevant for version 5. Any idea if it's likely to be updated
soon?

The new subblock changes appear to have removed a lot of the reasons for
using smaller block sizes. In broad terms, are there any situations where
you would recommend using less than the new default block size?

Cheers,

Carl.


On Mon, 2 Jul 2018 at 17:55, Sven Oehme <oehmes at gmail.com> wrote:

> Olaf, he is talking about indirect block size, not subblock size.
>
> Carl,
>
> here is the mmlsfs output for a 4 MB block size file system:
>
> [root@p8n15hyp ~]# mmlsfs all_local
>
> File system attributes for /dev/fs2-4m-07:
> ==========================================
> flag                value                    description
> ------------------- ------------------------ -----------------------------------
>  -f                 8192                     Minimum fragment (subblock) size in bytes
>  -i                 4096                     Inode size in bytes
>  -I                 32768                    Indirect block size in bytes
>  -m                 1                        Default number of metadata replicas
>  -M                 2                        Maximum number of metadata replicas
>  -r                 1                        Default number of data replicas
>  -R                 2                        Maximum number of data replicas
>  -j                 scatter                  Block allocation type
>  -D                 nfs4                     File locking semantics in effect
>  -k                 all                      ACL semantics in effect
>  -n                 512                      Estimated number of nodes that will mount file system
>  -B                 4194304                  Block size
>  -Q                 none                     Quotas accounting enabled
>                     none                     Quotas enforced
>                     none                     Default quotas enabled
>  --perfileset-quota No                       Per-fileset quota enforcement
>  --filesetdf        No                       Fileset df enabled?
>  -V                 19.01 (5.0.1.0)          File system version
>  --create-time      Mon Jun 18 12:30:54 2018 File system creation time
>  -z                 No                       Is DMAPI enabled?
>  -L                 33554432                 Logfile size
>  -E                 Yes                      Exact mtime mount option
>  -S                 relatime                 Suppress atime mount option
>  -K                 whenpossible             Strict replica allocation option
>  --fastea           Yes                      Fast external attributes enabled?
>  --encryption       No                       Encryption enabled?
>  --inode-limit      4000000000               Maximum number of inodes
>  --log-replicas     0                        Number of log replicas
>  --is4KAligned      Yes                      is4KAligned?
>  --rapid-repair     Yes                      rapidRepair enabled?
>  --write-cache-threshold 0                   HAWC Threshold (max 65536)
>  --subblocks-per-full-block 512              Number of subblocks per full block
>  -P                 system                   Disk storage pools in file system
>  --file-audit-log   No                       File Audit Logging enabled?
>  --maintenance-mode No                       Maintenance Mode enabled?
>  -d                 RG001VS001;RG002VS001;RG003VS002;RG004VS002  Disks in file system
>  -A                 no                       Automatic mount option
>  -o                 none                     Additional mount options
>  -T                 /gpfs/fs2-4m-07          Default mount point
>  --mount-priority   0                        Mount priority
>
> as you can see, the indirect block size is 32k.
>
> sven
>
> On Mon, Jul 2, 2018 at 9:46 AM Olaf Weiser <olaf.weiser at de.ibm.com> wrote:
>
>> Hi Carl,
>> it is 8k for a 4 MB block size.
>> Files smaller than roughly 3.x KB fit into the inode; for larger files, at
>> least one subblock is allocated.
>>
>> In releases before 5.x the subblock size was fixed at 1/32 of the block
>> size, so it was derived directly from the block size.
>> Since release 5 (i.e. for newly created file systems) the new default
>> block size is 4 MB with an 8k fragment (subblock) size, i.e. 512 subblocks
>> per full block.
>> For even larger block sizes more subblocks are available per block,
>> e.g. 8 MB gives 1024 subblocks (the fragment size is again 8k).
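>>
>> To double check on an existing file system you can query just those
>> attributes with mmlsfs (the device name here is only an example); for a
>> 4 MB block size file system created under 5.x you should see something
>> like:
>>
>> # mmlsfs fs1 -f -B -I
>>  -f                 8192                     Minimum fragment (subblock) size in bytes
>>  -B                 4194304                  Block size
>>  -I                 32768                    Indirect block size in bytes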
>>
>> @Sven, correct me if I'm wrong ...
>>
>>
>>
>>
>>
>>
>> From:        Carl <mutantllama at gmail.com>
>>
>> To:        gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>> Date:        07/02/2018 08:55 AM
>> Subject:        Re: [gpfsug-discuss] subblock sanity check in 5.0
>> Sent by:        gpfsug-discuss-bounces at spectrumscale.org
>> ------------------------------
>>
>>
>>
>> Hi Sven,
>>
>> What is the resulting indirect block size with a 4 MB metadata block size?
>>
>> Does the new sub-block magic mean that it will take up 32k, or will it
>> occupy 128k?
>>
>> Cheers,
>>
>> Carl.
>>
>>
>> On Mon, 2 Jul 2018 at 15:26, Sven Oehme <oehmes at gmail.com> wrote:
>> Hi,
>>
>> most traditional RAID controllers can't deal well with block sizes above
>> 4 MB, which is why the new default is 4 MB, and I would leave it at that
>> unless you know for sure you get better performance with 8 MB. That
>> typically requires the RAID volume's full stripe size to be 8 MB, e.g.
>> 8+2P with a 1 MB strip size (many people confuse the strip size with the
>> full stripe size).
>> If you don't have dedicated SSDs for metadata, I would recommend just
>> using a 4 MB block size with mixed data and metadata disks. If you have a
>> reasonable number of SSDs, put them in RAID 1 or RAID 10 and use them as
>> dedicated metadata disks, with the other disks as data only, but I would
>> not use the --metadata-block-size parameter, as it prevents the data pool
>> from using a large number of subblocks.
>> As long as your SSDs are in RAID 1 or 10 there is no read-modify-write
>> penalty, so using them with the 4 MB block size has no real negative
>> impact, at least on the controllers I have worked with.
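>>
>> To make that concrete, here is a rough, untested sketch (the NSD, server
>> and device names are placeholders, adjust for your hardware): SSDs in
>> RAID 1/10 as metadataOnly in the system pool, spinning disks as dataOnly
>> in a data pool, 4 MB block size and no --metadata-block-size:
>>
>> # cat nsd.stanza
>> %nsd: nsd=ssd_meta_01 device=/dev/sdb servers=nsd01,nsd02 usage=metadataOnly failureGroup=1 pool=system
>> %nsd: nsd=hdd_data_01 device=/dev/sdc servers=nsd01,nsd02 usage=dataOnly failureGroup=2 pool=data
>>
>> # mmcrnsd -F nsd.stanza
>> # mmcrfs fs1 -F nsd.stanza -B 4M -m 2 -M 2 -r 1 -R 2 -n 512 -T /gpfs/fs1
>>
>> Both pools then get the 4 MB block size and 512 subblocks per full block;
>> you still need a small placement policy (mmchpolicy) to direct file data
>> into the data pool.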
>>
>> hope this helps.
>>
>> On Tue, Jun 26, 2018 at 5:18 PM Joseph Mendoza <jam at ucar.edu> wrote:
>> Hi, it's for a traditional NSD setup.
>>
>> --Joey
>>
>>
>> On 6/26/18 12:21 AM, Sven Oehme wrote:
>> Joseph,
>>
>> the subblock size is derived from the smallest block size in the file
>> system. Since you specified a metadata block size of 512k, that is what is
>> used to calculate the number of subblocks, even though your data pool uses
>> an 8 MB block size (512k / 8k = 64 subblocks per full block, which is why
>> the data pool ends up with 128k fragments).
>> Is this setup for a traditional NSD setup or for GNR? The recommendations
>> would be different.
>>
>> sven
>>
>> On Tue, Jun 26, 2018 at 2:59 AM Joseph Mendoza <jam at ucar.edu> wrote:
>> Quick question: anyone know why GPFS wouldn't respect the default for
>> the subblocks-per-full-block parameter when creating a new file system?
>> I'd expect it to be set to 512 for an 8 MB block size, but my guess is
>> that also specifying a metadata-block-size is interfering with it (by
>> being too small). This was a parameter recommended by the vendor for a
>> 4.2 installation with metadata on dedicated SSDs in the system pool; any
>> best practices for 5.0? I'm guessing I'd have to bump it up to at least
>> 4 MB to get 512 subblocks for both pools.
>>
>> fs1 created with:
>> # mmcrfs fs1 -F fs1_ALL -A no -B 8M -i 4096 -m 2 -M 2 -r 1 -R 2 -j
>> cluster -n 9000 --metadata-block-size 512K --perfileset-quota
>> --filesetdf -S relatime -Q yes --inode-limit 20000000:10000000 -T
>> /gpfs/fs1
>>
>> # mmlsfs fs1
>> <snipped>
>>
>> flag                value                    description
>> ------------------- ------------------------ -----------------------------------
>>  -f                 8192                     Minimum fragment (subblock) size in bytes (system pool)
>>                     131072                   Minimum fragment (subblock) size in bytes (other pools)
>>  -i                 4096                     Inode size in bytes
>>  -I                 32768                    Indirect block size in bytes
>>
>>  -B                 524288                   Block size (system pool)
>>                     8388608                  Block size (other pools)
>>
>>  -V                 19.01 (5.0.1.0)          File system version
>>
>>  --subblocks-per-full-block 64               Number of subblocks per full block
>>  -P                 system;DATA              Disk storage pools in file system
>>
>>
>> Thanks!
>> --Joey Mendoza
>> NCAR