<font size=2 face="Arial">Hi,</font><br><br><font size=2 color=red face="Arial">>>Am I missing something?
Is this an expected behaviour and someone has an explanation for this?</font><br><br><font size=2 face="Arial">Based on your scenario, write performance can degrade
as the file system is populated if the file system was formatted
with "-j cluster".</font><br><br><font size=2 face="Arial">For consistent file-system performance, we
recommend formatting with <b>mmcrfs "-j scatter"</b> (the <b>layoutMap</b> block allocation type). Also,
we need to ensure the mmcrfs "-n" value is set properly.</font><br><br><font size=2 face="Arial">[snip from mmcrfs]</font><br><font size=2 color=blue face="Arial"><i># mmlsfs &lt;fs&gt; | egrep
'Block allocation| Estimated number'</i></font><br><font size=2 color=blue face="Arial"><i>&nbsp;-j&nbsp;&nbsp;scatter&nbsp;&nbsp;&nbsp;&nbsp;Block allocation type</i></font><br><font size=2 color=blue face="Arial"><i>&nbsp;-n&nbsp;&nbsp;128&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Estimated number of nodes that will mount file system</i></font><br><font size=2 face="Arial">[/snip]</font><br><br><br><font size=2 face="Arial">[snip from man mmcrfs]</font><br><font size=2 color=blue face="Arial"><i><b>layoutMap={scatter|cluster}</b></i></font><br><font size=2 color=blue face="Arial"><i>
Specifies the block allocation map type. When allocating blocks for a given file, GPFS first uses a round-robin algorithm to spread the data across all disks in the storage pool. After a disk is selected, the location of the data block on the disk is determined by the block allocation map type. <b>If cluster is specified, GPFS attempts to allocate blocks in clusters. Blocks that belong to a particular file are kept adjacent to each other within each cluster. If scatter is specified, the location of the block is chosen randomly.</b></i></font><br><br><font size=2 color=blue face="Arial"><i><b>The cluster allocation method may provide better disk performance for some disk subsystems in relatively small installations. The benefits of clustered block allocation diminish when the number of nodes in the cluster or the number of disks in a file system increases, or when the file system's free space becomes fragmented.</b> The <b>cluster</b> allocation method is the default for GPFS clusters with eight or fewer nodes and for file systems with eight or fewer disks.</i></font><br><br><font size=2 color=blue face="Arial"><i><b>The scatter allocation method provides more consistent file system performance by averaging out performance variations due to block location (for many disk subsystems, the location of the data relative to the disk edge has a substantial effect on performance).</b> This allocation method is appropriate in most cases and is the default for GPFS clusters with more than eight nodes or file systems with more than eight disks.</i></font><br><br><font size=2 color=blue face="Arial"><i>The block allocation map type cannot be changed after the storage pool has been created.</i></font><br><br><br><font size=2 color=blue face="Arial"><b><i>-n NumNodes</i></b></font><br><font size=2 color=blue face="Arial"><i>
The estimated number of nodes that will mount the file system in the local cluster and all remote clusters. This is used as a best guess for the initial size of some file system data structures. The default is 32. This value can be changed after the file system has been created, but it does not change the existing data structures. Only the newly created data structure is affected by the new value (for example, a new storage pool).</i></font><br><br><font size=2 color=blue face="Arial"><i>When you create a GPFS file system, you might want to overestimate the number of nodes that will mount the file system. GPFS uses this information for creating data structures that are essential for achieving maximum parallelism in file system operations (for more information, see GPFS architecture in IBM Spectrum Scale: Concepts, Planning, and Installation Guide). If you are sure there will never be more than 64 nodes, allow the default value to be applied. If you are planning to add nodes to your system, you should specify a number larger than the default.</i></font><br><br><font size=2 face="Arial">[/snip from man mmcrfs]</font><br><br><font size=2 face="Arial">Regards,</font><br><font size=2 face="Arial">-Kums</font><br><br><br><br><br><br><font size=1 color=#5f5f5f face="sans-serif">From:
</font><font size=1 face="sans-serif">Ivano Talamo <Ivano.Talamo@psi.ch></font><br><font size=1 color=#5f5f5f face="sans-serif">To:
</font><font size=1 face="sans-serif"><gpfsug-discuss@spectrumscale.org></font><br><font size=1 color=#5f5f5f face="sans-serif">Date:
</font><font size=1 face="sans-serif">11/15/2017 11:25 AM</font><br><font size=1 color=#5f5f5f face="sans-serif">Subject:
</font><font size=1 face="sans-serif">[gpfsug-discuss]
Write performances and filesystem size</font><br><font size=1 color=#5f5f5f face="sans-serif">Sent by:
</font><font size=1 face="sans-serif">gpfsug-discuss-bounces@spectrumscale.org</font><br><hr noshade><br><br><br><tt><font size=2>Hello everybody,<br><br>together with my colleagues we are actually running some tests on a new
<br>DSS G220 system and we see some unexpected behaviour.<br><br>What we actually see is that write performances (we did not test read <br>yet) decreases with the decrease of filesystem size.<br><br>I will not go into the details of the tests, but here are some numbers:<br><br>- with a filesystem using the full 1.2 PB space we get 14 GB/s as the <br>sum of the disk activity on the two IO servers;<br>- with a filesystem using half of the space we get 10 GB/s;<br>- with a filesystem using 1/4 of the space we get 5 GB/s.<br><br>We also saw that performances are not affected by the vdisks layout, ie.
<br>taking the full space with one big vdisk or 2 half-size vdisks per RG <br>gives the same performances.<br><br>To our understanding the IO should be spread evenly across all the <br>pdisks in the declustered array, and looking at iostat all disks seem to
<br>be accessed. But so there must be some other element that affects <br>performances.<br><br>Am I missing something? Is this an expected behaviour and someone has an
<br>explanation for this?<br><br>Thank you,<br>Ivano<br>_______________________________________________<br>gpfsug-discuss mailing list<br>gpfsug-discuss at spectrumscale.org<br></font></tt><a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e="><tt><font size=2>https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=McIf98wfiVqHU8ZygezLrQ&m=py_FGl3hi9yQsby94NZdpBFPwcUU0FREyMSSvuK_10U&s=Bq1J9eIXxadn5yrjXPHmKEht0CDBwfKJNH72p--T-6s&e=</font></tt></a><tt><font size=2><br><br></font></tt><br><BR>
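<br><font size=2 face="Arial">As an aside, the layoutMap behaviour quoted from the mmcrfs man page above can be sketched with a small toy model. This is purely illustrative Python, not GPFS code; the disk counts, sizes, and function names are made up for the sketch:</font><br>

```python
import random

def place_blocks(num_blocks, num_disks, blocks_per_disk, method, rng):
    """Toy model of block placement (illustrative only, not real GPFS code).

    Disks are always chosen round-robin, mirroring the man-page text;
    `method` decides the offset of each block *within* the chosen disk:
    'cluster' keeps a file's blocks adjacent to each other,
    'scatter' picks a random location on the disk.
    """
    placements = []                 # list of (disk, offset) tuples
    next_offset = [0] * num_disks   # per-disk cursor used by 'cluster'
    for i in range(num_blocks):
        disk = i % num_disks        # round-robin across all disks in the pool
        if method == "cluster":
            offset = next_offset[disk]   # adjacent to the previous block
            next_offset[disk] += 1
        else:  # "scatter"
            offset = rng.randrange(blocks_per_disk)  # random location on disk
        placements.append((disk, offset))
    return placements

rng = random.Random(42)
cluster = place_blocks(1000, 4, 100_000, "cluster", rng)
scatter = place_blocks(1000, 4, 100_000, "scatter", rng)

# With 'cluster', a freshly formatted file system packs all blocks at the
# start of each disk; with 'scatter', blocks land all over the disk from
# day one, so performance does not shift as the file system fills.
print(max(off for _, off in cluster))   # small: blocks tightly packed
print(max(off for _, off in scatter))   # large: spread over the whole disk
```

<font size=2 face="Arial">In the toy model, the "cluster" placement concentrates early writes in one region of each disk, so the observed throughput changes as allocation moves across the disk, whereas "scatter" samples the whole disk uniformly regardless of fill level. That is the intuition behind the man page's claim that scatter gives more consistent performance.</font><br>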