[gpfsug-discuss] Metadata with GNR code
Jan-Frode Myklebust
janfrode at tanso.net
Fri Sep 21 11:13:51 BST 2018
That reminds me of a point Sven made when I was trying to optimize mdtest
results with metadata on FlashSystem... He sent me the following:
-- started at 11/15/2015 15:20:39 --
mdtest-1.9.3 was launched with 138 total task(s) on 23 node(s)
Command line used: /ghome/oehmes/mpi/bin/mdtest-pcmpi9131-existingdir -d /ibm/fs2-4m-02/shared/mdtest-ec -i 1 -n 70000 -F -i 1 -w 0 -Z -u
Path: /ibm/fs2-4m-02/shared
FS: 32.0 TiB   Used FS: 6.7%   Inodes: 145.4 Mi   Used Inodes: 22.0%
138 tasks, 9660000 files
SUMMARY: (of 1 iterations)
Operation             Max            Min            Mean        Std Dev
---------             ---            ---            ----        -------
File creation :    650440.486     650440.486     650440.486      0.000
File stat     :  23599134.618   23599134.618   23599134.618      0.000
File read     :   2171391.097    2171391.097    2171391.097      0.000
File removal  :   1007566.981    1007566.981    1007566.981      0.000
Tree creation :         3.072          3.072          3.072      0.000
Tree removal  :         1.471          1.471          1.471      0.000
-- finished at 11/15/2015 15:21:10 --
from a GL6 -- only spinning disks -- pointing out that mdtest doesn't
really require Flash/SSD. The keys to good results are:
a) a large GPFS log (mmchfs -L 128m)
b) a high maxFilesToCache (you need to be able to cache all entries, so for
10 million files across 20 nodes -- 500k per node -- you want at least 750k
per node to leave some headroom)
c) a fast network, which is key to handling the token requests and metadata
operations that have to go over the network.
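A minimal sketch of how the first two tunings might be applied -- the file
system name gpfs0 and node class nsdClients are made up for illustration:

    # enlarge the GPFS recovery log for the file system
    mmchfs gpfs0 -L 128m
    # raise the per-node file cache; takes effect after GPFS is
    # restarted on the affected nodes
    mmchconfig maxFilesToCache=750000 -N nsdClients
    # verify both settings
    mmlsfs gpfs0 -L
    mmlsconfig maxFilesToCache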
-jf
On Fri, Sep 21, 2018 at 10:22 AM Olaf Weiser <olaf.weiser at de.ibm.com> wrote:
> Here's an mdtest run for a file system with the default block size ...
> 4 MB block size..
> metadata is on SSD
> data is on HDD ... which is not really relevant for this mdtest ;-)
>
>
> -- started at 09/07/2018 06:54:54 --
>
> mdtest-1.9.3 was launched with 40 total task(s) on 20 node(s)
> Command line used: mdtest -n 25000 -i 3 -u -d /homebrewed/gh24_4m_4m/mdtest
> Path: /homebrewed/gh24_4m_4m
> FS: 10.0 TiB Used FS: 0.0% Inodes: 12.0 Mi Used Inodes: 2.3%
>
> 40 tasks, 1000000 files/directories
>
> SUMMARY: (of 3 iterations)
> Operation                   Max           Min          Mean       Std Dev
> ---------                   ---           ---          ----       -------
> Directory creation:   449160.409    430869.822    437002.187     8597.272
> Directory stat    :  6664420.560   5785712.544   6324276.731   385192.527
> Directory removal :   398360.058    351503.369    371630.648    19690.580
> File creation     :   288985.217    270550.129    279096.800     7585.659
> File stat         :  6720685.117   6641301.499   6674123.407    33833.182
> File read         :  3055661.372   2871044.881   2945513.966    79479.638
> File removal      :   215187.602    146639.435    179898.441    28021.467
> Tree creation     :       10.215         3.165         6.603        2.881
> Tree removal      :        5.484         0.880         2.418        2.168
>
> -- finished at 09/07/2018 06:55:42 --
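>
> For anyone not fluent in mdtest options, my reading of the flags used above:
>
>     -n 25000   # files/directories per task (40 tasks x 25000 = the 1000000 above)
>     -i 3       # run 3 iterations (hence the Max/Min/Mean/Std Dev columns)
>     -u         # give each task its own unique working directory
>     -d PATH    # directory in which to run the test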
>
>
>
>
> Kind regards
>
>
> Olaf Weiser
>
> EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage
> Platform
>
> -------------------------------------------------------------------------------------------------------------------------------------------
> IBM Deutschland
> IBM Allee 1
> 71139 Ehningen
> Phone: +49-170-579-44-66
> E-Mail: olaf.weiser at de.ibm.com
>
> -------------------------------------------------------------------------------------------------------------------------------------------
> IBM Deutschland GmbH / Chairman of the Supervisory Board: Martin Jetter
> Management: Martina Koederitz (Chair), Susanne Peter, Norbert
> Janzen, Dr. Christian Keller, Ivo Koerner, Markus Koerner
> Registered office: Ehningen / Registration court: Amtsgericht Stuttgart,
> HRB 14562 / WEEE Reg. No. DE 99369940
>
>
>
> From: "Andrew Beattie" <abeattie at au1.ibm.com>
> To: gpfsug-discuss at spectrumscale.org
> Date: 09/21/2018 02:34 AM
> Subject: Re: [gpfsug-discuss] Metadata with GNR code
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------
>
>
>
> Simon,
>
> My recommendation is still very much to use SSD for metadata and NL-SAS for
> data, and the GH14 / GH24 building blocks certainly make this much easier.
>
> Unless your file system is massive (Summit-sized), you will typically still
> benefit from the random I/O performance of SSD (even read-intensive SSD)
> compared to NL-SAS.
>
> It still makes more sense to me to use 2-way or 3-way replication for
> metadata, even in ESS / GNR style environments. The read performance for
> metadata with 3-way replication is still significantly better than in any
> other scenario.
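>
> Concretely, I mean something along these lines in an mmcrvdisk stanza file
> (all names, sizes, and declustered-array details below are hypothetical):
>
>     # metadata vdisks: small, 3-way replicated, in the system pool
>     %vdisk: vdiskName=rg1meta rg=rg1 da=DA1 blocksize=1m size=500g raidCode=3WayReplication diskUsage=metadataOnly failureGroup=1 pool=system
>     # data vdisks: large, 8+2p erasure coded, in the data pool
>     %vdisk: vdiskName=rg1data rg=rg1 da=DA1 blocksize=4m size=200t raidCode=8+2p diskUsage=dataOnly failureGroup=1 pool=data
>
>     mmcrvdisk -F vdisk.stanza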
>
> As with anything there are exceptions to the rule, but my experience with
> ESS, and with ESS plus SSD, so far is that the standard thinking on
> managing metadata and small-file I/O remains the same -- even with the
> improvements around subblocks in Scale V5.
>
> mdtest is still the typical benchmark for this comparison, and it shows
> some very clear differences even on SSD when you use a large file system
> block size with many subblocks versus a smaller block size where each
> subblock is fixed at 1/32 of the block.
>
> This only gets worse if you change the storage media from SSD to NL-SAS.
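>
> To make that concrete (a hypothetical example; the 8 KiB subblock figure is
> from the Scale V5 documentation as I recall it):
>
>     # V5 file system format: a 4M block size gets 8 KiB subblocks (512 per block)
>     mmcrfs fs5 -F nsd.stanza -B 4M
>     # pre-V5 format: always 32 subblocks per block, so 4M blocks mean 128 KiB subblocks
>     mmcrfs fs4 -F nsd.stanza -B 4M --version 4.2.3.0
>     # the subblock ("fragment") size is reported by mmlsfs -f
>     mmlsfs fs5 -f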
> Andrew Beattie
> Software Defined Storage - IT Specialist
> Phone: 614-2133-7927
> E-mail: abeattie at au1.ibm.com
>
>
> ----- Original message -----
> From: Simon Thompson <S.J.Thompson at bham.ac.uk>
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
> Cc:
> Subject: [gpfsug-discuss] Metadata with GNR code
> Date: Fri, Sep 21, 2018 3:29 AM
>
> Just wondering if anyone has any strong views/recommendations on metadata
> when using GNR code?
>
>
>
> I know that in “SAN”-based GPFS there is a recommendation to split data and
> metadata, with the metadata on SSD.
>
>
>
> I’ve also heard that with GNR there isn’t much difference in splitting
> data and metadata.
>
>
>
> We’re looking at two systems and want to replicate metadata, but (mostly)
> not data, between them, so I’m not really sure how we’d do this without
> having a separate system pool (and then NSDs in different failure groups)….
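>
> To be concrete about the kind of layout I mean (stanza contents and names
> here are hypothetical):
>
>     # metadata NSDs in different failure groups, one per system
>     %nsd: nsd=metaSiteA usage=metadataOnly failureGroup=1 pool=system
>     %nsd: nsd=metaSiteB usage=metadataOnly failureGroup=2 pool=system
>     # data NSDs, left unreplicated
>     %nsd: nsd=dataSiteA usage=dataOnly failureGroup=1 pool=data
>
>     # two copies of metadata, one copy of data (max 2, so some data
>     # could still be replicated later)
>     mmcrfs gpfs1 -F nsd.stanza -m 2 -M 2 -r 1 -R 2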
>
>
>
> If we used 8+2P vdisks for metadata only, would we still see no difference
> in performance compared to mixed data and metadata? (I guess the 8+2P is
> still spread over a DA, so we’d get half the drives in the GNR system
> active…)
>
>
>
> Or should we stick SSD based storage in as well for the metadata pool?
> (Which brings an interesting question about RAID code related to the recent
> discussions on mirroring vs RAID5…)
>
>
>
> Thoughts welcome!
>
>
>
> Simon
>
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss