[gpfsug-discuss] GPFS, Pagepool and Block size -> Performance reduces with larger block size

Sven Oehme oehmes at gmail.com
Mon Oct 22 17:18:43 BST 2018


oops, somehow that slipped past my inbox, i only saw your reply just now.

it's really hard to tell from a trace snippet whether the lock is the issue,
as the lower-level locks don't show up in default traces. without access to
the source code and a detailed trace you won't make much progress here.

sven





On Thu, Sep 27, 2018 at 12:31 PM <valleru at cbio.mskcc.org> wrote:

> Thank you Sven,
>
> Turning off prefetching did not improve the performance; if anything, it
> degraded it a bit.
>
> I have set prefetching back to the default and taken a trace dump
> (tracectl with trace=io). Let me know if you want me to paste/attach it
> here.
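>
> Roughly, the capture looked like this (a sketch only; the exact mmtracectl
> options may differ by release):
>
>     mmtracectl --start --trace=io -N <nodename>
>     <run the read workload>
>     mmtracectl --stop -N <nodename>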
>
> May I know how I could confirm whether the below is true?
>
>>> 1. this could be serialization around buffer locks. the larger your
>>> blocksize gets, the larger the amount of data one of these pagepool
>>> buffers maintains. if there is a lot of concurrency on a smaller amount
>>> of data, more threads potentially compete for the same buffer lock to
>>> copy stuff in and out of a particular buffer, hence things go slower
>>> compared to the same amount of data spread across more buffers, each of
>>> smaller size.
>>>
>>>
> Will the above trace help in understanding if it is a serialization issue?
>
> I have been discussing the same with GPFS support for the past few
> months, and it seems that most of the time is being spent in cxiUXfer.
> They could not understand why it is spending so much time in cxiUXfer. I
> was seeing the same from perf top, along with the page faults.
>
> Below is a snippet of what support said:
>
> ————————————————————————————
>
> I searched all of the gpfsRead entries in the trace and sorted them by
> time spent. Except for 2 reads which needed to fetch data from the NSD
> server, the slowest read is in thread 72170. It took 112470.362 us.
>
>
> trcrpt.2018-08-06_12.27.39.55538.lt15.trsum:   72165       6.860911319
> rdwr                   141857.076 us + NSDIO
>
> trcrpt.2018-08-06_12.26.28.39794.lt15.trsum:   72170       1.483947593
> rdwr                   112470.362 us + cxiUXfer
>
> trcrpt.2018-08-06_12.27.39.55538.lt15.trsum:   72165       6.949042593
> rdwr                    88126.278 us + NSDIO
>
> trcrpt.2018-08-06_12.27.03.47706.lt15.trsum:   72156       2.919334474
> rdwr                    81057.657 us + cxiUXfer
>
> trcrpt.2018-08-06_12.23.30.72745.lt15.trsum:   72154       1.167484466
> rdwr                    76033.488 us + cxiUXfer
>
> trcrpt.2018-08-06_12.24.06.7508.lt15.trsum:   72187       0.685237501
> rdwr                    70772.326 us + cxiUXfer
>
> trcrpt.2018-08-06_12.25.17.23989.lt15.trsum:   72193       4.757996530
> rdwr                    70447.838 us + cxiUXfer
>
>
> I checked each of the slow IOs as above, and found that they all spend
> much time in the function cxiUXfer. This function is used to copy data
> from the kernel buffer to the user buffer. I am not sure why it took so
> much time. This should be related to the page faults and pgfree you
> observed. Below is the trace data for thread 72170.
>
>
>                    1.371477231  72170 TRACE_VNODE: gpfs_f_rdwr enter: fP
> 0xFFFF882541649400 f_flags 0x8000 flags 0x8001 op 0 iovec
> 0xFFFF881F2AFB3E70 count 1 offset 0x168F30D dentry 0xFFFF887C0CC298C0
> private 0xFFFF883F607175C0 iP 0xFFFF8823AA3CBFC0 name '410513.svs'
>
>               ....
>
>                    1.371483547  72170 TRACE_KSVFS: cachedReadFast exit:
> uio_resid 16777216 code 1 err 11
>
>               ....
>
>                    1.371498780  72170 TRACE_KSVFS: kSFSReadFast: oiP
> 0xFFFFC90060B46740 offset 0x168F30D dataBufP FFFFC9003645A5A8 nDesc 64 buf
> 200043C0000 valid words 64 dirty words 0 blkOff 0
>
>                    1.371499035  72170 TRACE_LOG:
> UpdateLogger::beginDataUpdate begin ul 0xFFFFC900333F1A40 holdCount 0
> ioType 0x2 inProg 0x15
>
>                    1.371500157  72170 TRACE_LOG:
> UpdateLogger::beginDataUpdate ul 0xFFFFC900333F1A40 holdCount 0 ioType 0x2
> inProg 0x16 err 0
>
>                    1.371500606  72170 TRACE_KSVFS: cxiUXfer: nDesc 64 1st
> dataPtr 0x200043C0000 plP 0xFFFF887F7B90D600 toIOBuf 0 offset 6877965 len
> 9899251
>
>                    1.371500793  72170 TRACE_KSVFS: cxiUXfer: ndesc 0 skip
> dataAddrP 0x200043C0000 currOffset 0 currLen 262144 bufOffset 6877965
>
>               ....
>
>                    1.371505949  72170 TRACE_KSVFS: cxiUXfer: ndesc 25 skip
> dataAddrP 0x2001AF80000 currOffset 6553600 currLen 262144 bufOffset 6877965
>
>                    1.371506236  72170 TRACE_KSVFS: cxiUXfer: nDesc 26
> currOffset 6815744 tmpLen 262144 dataAddrP 0x2001AFCF30D currLen 199923
> pageOffset 781 pageLen 3315 plP 0xFFFF887F7B90D600
>
>                    1.373649823  72170 TRACE_KSVFS: cxiUXfer: nDesc 27
> currOffset 7077888 tmpLen 262144 dataAddrP 0x20027400000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B90D600
>
>                    1.375158799  72170 TRACE_KSVFS: cxiUXfer: nDesc 28
> currOffset 7340032 tmpLen 262144 dataAddrP 0x20027440000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B90D600
>
>                    1.376661566  72170 TRACE_KSVFS: cxiUXfer: nDesc 29
> currOffset 7602176 tmpLen 262144 dataAddrP 0x2002C180000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B90D600
>
>                    1.377892653  72170 TRACE_KSVFS: cxiUXfer: nDesc 30
> currOffset 7864320 tmpLen 262144 dataAddrP 0x2002C1C0000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B90D600
>
>               ....
>
>                    1.471389843  72170 TRACE_KSVFS: cxiUXfer: nDesc 62
> currOffset 16252928 tmpLen 262144 dataAddrP 0x2001D2C0000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B90D600
>
>                    1.471845629  72170 TRACE_KSVFS: cxiUXfer: nDesc 63
> currOffset 16515072 tmpLen 262144 dataAddrP 0x2003EC80000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B90D600
>
>                    1.472417149  72170 TRACE_KSVFS: cxiDetachIOBuffer:
> dataPtr 0x200043C0000 plP 0xFFFF887F7B90D600
>
>                    1.472417775  72170 TRACE_LOCK: unlock_vfs: type Data,
> key 0000000000000004:000000001B1F24BF:0000000000000001 lock_mode have ro
> token xw lock_state old [ ro:27 ] new [ ro:26 ] holdCount now 27
>
>                    1.472418427  72170 TRACE_LOCK: hash tab lookup vfs:
> found cP 0xFFFFC9005FC0CDE0 holdCount now 14
>
>                    1.472418592  72170 TRACE_LOCK: lock_vfs: type Data key
> 0000000000000004:000000001B1F24BF:0000000000000002 lock_mode want ro status
> valid token xw/xw lock_state [ ro:12 ] flags 0x0 holdCount 14
>
>                    1.472419842  72170 TRACE_KSVFS: kSFSReadFast: oiP
> 0xFFFFC90060B46740 offset 0x2000000 dataBufP FFFFC9003643C908 nDesc 64 buf
> 38033480000 valid words 64 dirty words 0 blkOff 0
>
>                    1.472420029  72170 TRACE_LOG:
> UpdateLogger::beginDataUpdate begin ul 0xFFFFC9005FC0CF98 holdCount 0
> ioType 0x2 inProg 0xC
>
>                    1.472420187  72170 TRACE_LOG:
> UpdateLogger::beginDataUpdate ul 0xFFFFC9005FC0CF98 holdCount 0 ioType 0x2
> inProg 0xD err 0
>
>                    1.472420652  72170 TRACE_KSVFS: cxiUXfer: nDesc 64 1st
> dataPtr 0x38033480000 plP 0xFFFF887F7B934320 toIOBuf 0 offset 0 len 6877965
>
>                    1.472420936  72170 TRACE_KSVFS: cxiUXfer: nDesc 0
> currOffset 0 tmpLen 262144 dataAddrP 0x38033480000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B934320
>
>                    1.472824790  72170 TRACE_KSVFS: cxiUXfer: nDesc 1
> currOffset 262144 tmpLen 262144 dataAddrP 0x380334C0000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B934320
>
>                    1.473243905  72170 TRACE_KSVFS: cxiUXfer: nDesc 2
> currOffset 524288 tmpLen 262144 dataAddrP 0x38024280000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B934320
>
>               ....
>
>                    1.482949347  72170 TRACE_KSVFS: cxiUXfer: nDesc 24
> currOffset 6291456 tmpLen 262144 dataAddrP 0x38025E80000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B934320
>
>                    1.483354265  72170 TRACE_KSVFS: cxiUXfer: nDesc 25
> currOffset 6553600 tmpLen 262144 dataAddrP 0x38025EC0000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B934320
>
>                    1.483766631  72170 TRACE_KSVFS: cxiUXfer: nDesc 26
> currOffset 6815744 tmpLen 262144 dataAddrP 0x38003B00000 currLen 262144
> pageOffset 0 pageLen 4096 plP 0xFFFF887F7B934320
>
>                    1.483943894  72170 TRACE_KSVFS: cxiDetachIOBuffer:
> dataPtr 0x38033480000 plP 0xFFFF887F7B934320
>
>                    1.483944339  72170 TRACE_LOCK: unlock_vfs: type Data,
> key 0000000000000004:000000001B1F24BF:0000000000000002 lock_mode have ro
> token xw lock_state old [ ro:14 ] new [ ro:13 ] holdCount now 14
>
>                    1.483944683  72170 TRACE_BRL: brUnlockM: ofP
> 0xFFFFC90069346B68 inode 455025855 snap 0 handle 0xFFFFC9003637D020 range
> 0x168F30D-0x268F30C mode ro
>
>                    1.483944985  72170 TRACE_KSVFS: kSFSReadFast exit:
> uio_resid 0 err 0
>
>                    1.483945264  72170 TRACE_LOCK: unlock_vfs_m: type
> Inode, key 305F105B9701E60A:000000001B1F24BF:0000000000000000 lock_mode
> have ro status valid token rs lock_state old [ ro:25 ] new [ ro:24 ]
>
>                    1.483945423  72170 TRACE_LOCK: unlock_vfs_m: cP
> 0xFFFFC90069346B68 holdCount 25
>
>                    1.483945624  72170 TRACE_VNODE: gpfsRead exit: fast err
> 0
>
>                    1.483946831  72170 TRACE_KSVFS: ReleSG: sli 38 sgP
> 0xFFFFC90035E52F78 NotQuiesced vfsOp 2
>
>                    1.483946975  72170 TRACE_KSVFS: ReleSG: sli 38 sgP
> 0xFFFFC90035E52F78 vfsOp 2 users 1-1
>
>                    1.483947116  72170 TRACE_KSVFS: ReleaseDaemonSegAndSG:
> sli 38 count 2 needCleanup 0
>
>                    1.483947593  72170 TRACE_VNODE: gpfs_f_rdwr exit: fP
> 0xFFFF882541649400 total_len 16777216 uio_resid 0 offset 0x268F30D rc 0
>
>
> ———————————————————————————————————————————
>
>
>
> Regards,
> Lohit
>
> On Sep 19, 2018, 3:11 PM -0400, Sven Oehme <oehmes at gmail.com>, wrote:
>
> the document primarily explains all performance-specific knobs. general
> advice would be to no longer set anything besides workerThreads, pagepool
> and the file cache settings on 5.X systems, as most other settings are no
> longer relevant (that's a client-side statement). that is true until you
> hit strange workloads, which is why all the knobs are still there :-)
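>
> as a sketch only (the values below are placeholders, not recommendations,
> and need to be sized per node; "file cache" here means maxFilesToCache):
>
>    mmchconfig workerThreads=512 -N <nodeclass>
>    mmchconfig pagepool=16G -N <nodeclass>
>    mmchconfig maxFilesToCache=100000 -N <nodeclass>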
>
> sven
>
>
> On Wed, Sep 19, 2018 at 11:17 AM <valleru at cbio.mskcc.org> wrote:
>
>> Thanks Sven.
>> I will disable it completely and see how it behaves.
>>
>> Is this the presentation?
>>
>> http://files.gpfsug.org/presentations/2014/UG10_GPFS_Performance_Session_v10.pdf
>>
>> I guess I read it, but it did not strike me as relevant to this
>> situation. I will try to read it again and see if I can make use of it.
>>
>> Regards,
>> Lohit
>>
>> On Sep 19, 2018, 2:12 PM -0400, Sven Oehme <oehmes at gmail.com>, wrote:
>>
>> seems like you never read my performance presentation from a few years
>> ago ;-)
>>
>> you can control this on a per-node basis, either for all i/o:
>>
>>    prefetchAggressiveness = X
>>
>> or individually for reads or writes:
>>
>>    prefetchAggressivenessRead = X
>>    prefetchAggressivenessWrite = X
>>
>> for a start i would turn it off completely via:
>>
>> mmchconfig prefetchAggressiveness=0 -I -N nodename
>>
>> that will turn it off only for that node, and only until you restart
>> the node. then see what happens.
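>>
>> to verify it took effect, and to put it back later, something along these
>> lines should work (double-check the exact syntax on your release):
>>
>>    mmdiag --config | grep -i prefetchaggressiveness
>>    mmchconfig prefetchAggressiveness=DEFAULT -I -N nodename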
>>
>> sven
>>
>>
>> On Wed, Sep 19, 2018 at 11:07 AM <valleru at cbio.mskcc.org> wrote:
>>
>>> Thank you Sven.
>>>
>>> I mostly think it could be 1., or some other issue.
>>> I don't think it could be 2., because I can replicate this issue no
>>> matter what the size of the dataset is. It happens for a few files that
>>> could easily fit in the pagepool too.
>>>
>>> I do see a lot more page faults for 16M compared to 1M, so it could be
>>> related to many threads competing for the same buffer space.
>>>
>>> I will try to take the trace with the trace=io option and see if I can
>>> find something.
>>>
>>> How do I turn off prefetching? Can I turn it off for a single
>>> node/client?
>>>
>>> Regards,
>>> Lohit
>>>
>>> On Sep 18, 2018, 5:23 PM -0400, Sven Oehme <oehmes at gmail.com>, wrote:
>>>
>>> Hi,
>>>
>>> taking a trace would tell for sure, but i suspect you might be hitting
>>> one or even multiple issues which have similar negative performance
>>> impacts but different root causes.
>>>
>>> 1. this could be serialization around buffer locks. the larger your
>>> blocksize gets, the larger the amount of data one of these pagepool
>>> buffers maintains. if there is a lot of concurrency on a smaller amount
>>> of data, more threads potentially compete for the same buffer lock to
>>> copy stuff in and out of a particular buffer, hence things go slower
>>> compared to the same amount of data spread across more buffers, each of
>>> smaller size.
>>>
>>> 2. your data set is small'ish, let's say a couple of times bigger than
>>> the pagepool, and you random-access it with multiple threads. what will
>>> happen is that because it doesn't fit into the cache it will be read
>>> from the backend. if multiple threads hit the same 16 MB block at once
>>> with multiple 4k random reads, it will read the whole 16 MB block
>>> because it thinks it will benefit from it later on out of cache, but
>>> because the access is fully random the same happens with the next block
>>> and the next and so on, and before you get back to this block it has
>>> been pushed out of the cache for lack of enough pagepool.
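>>>
>>> if you want to reproduce scenario 2 synthetically, something like fio
>>> can do it; the path, sizes and job count below are placeholders only:
>>>
>>>    fio --name=randread --directory=/gpfs/fs16m/test --rw=randread \
>>>        --bs=4k --size=4g --numjobs=8 --runtime=60 --time_based \
>>>        --group_reporting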
>>>
>>> i could think of multiple other scenarios, which is why it's so hard to
>>> accurately benchmark an application: you design a benchmark to test an
>>> application, but it almost always behaves differently than you think it
>>> does :-)
>>>
>>> so the best approach is to run the real application and see under which
>>> configuration it works best.
>>>
>>> you could also take a trace with trace=io and then look at
>>>
>>> TRACE_VNOP: READ:
>>> TRACE_VNOP: WRITE:
>>>
>>> and compare them to
>>>
>>> TRACE_IO: QIO: read
>>> TRACE_IO: QIO: write
>>>
>>> and see if the numbers summed up for both are somewhat equal. if
>>> TRACE_VNOP is significantly smaller than TRACE_IO you are most likely
>>> doing more i/o than you should, and turning prefetching off might
>>> actually make things faster.
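>>>
>>> as a rough sketch, assuming the formatted trace report contains the tag
>>> strings exactly as written above (adjust the patterns to whatever your
>>> trcrpt output actually shows):
>>>
>>>    grep 'TRACE_VNOP: READ:' <trcrpt file> | wc -l
>>>    grep 'TRACE_IO: QIO: read' <trcrpt file> | wc -l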
>>>
>>> keep in mind i am no longer working for IBM so all i say might be
>>> obsolete by now, i no longer have access to the one and only truth aka the
>>> source code ... but if i am wrong i am sure somebody will point this out
>>> soon ;-)
>>>
>>> sven
>>>
>>>
>>>
>>>
>>> On Tue, Sep 18, 2018 at 10:31 AM <valleru at cbio.mskcc.org> wrote:
>>>
>>>> Hello All,
>>>>
>>>> This is a continuation of the previous discussion that I had with
>>>> Sven. However, contrary to what I had mentioned previously, I realize
>>>> that this is “not” related to mmap, and I see it when doing random
>>>> freads.
>>>>
>>>> I see that the block size of the filesystem matters when reading from
>>>> the pagepool.
>>>> I see a major difference in performance when comparing 1M to 16M, when
>>>> doing a lot of random small freads with all of the data in the pagepool.
>>>>
>>>> Performance for 1M is an order of magnitude “better” than the
>>>> performance that I see for 16M.
>>>>
>>>> The GPFS that we have currently is :
>>>> Version : 5.0.1-0.5
>>>> Filesystem version: 19.01 (5.0.1.0)
>>>> Block-size : 16M
>>>>
>>>> I had made the filesystem block size 16M, thinking that I would get
>>>> the most performance for both random and sequential reads with 16M
>>>> rather than with smaller block sizes.
>>>> With GPFS 5.0, I made use of the 1024 sub-blocks instead of 32, and
>>>> thus do not lose a lot of storage space even with 16M.
>>>> I had run a few benchmarks and I did see that 16M was performing
>>>> better “when hitting storage/disks” with respect to bandwidth for
>>>> random/sequential on small/large reads.
>>>>
>>>> However, with this particular workload, where it freads a chunk of
>>>> data randomly from hundreds of files, I see that the number of page
>>>> faults increases with block size and actually reduces the performance.
>>>> 1M performs a lot better than 16M, and maybe I will get better
>>>> performance with less than 1M.
>>>> It gives the best performance when reading from local disk with a 4K
>>>> block size filesystem.
>>>>
>>>> What I mean by performance when it comes to this workload is not the
>>>> bandwidth, but the amount of time that it takes to do each
>>>> iteration/read batch of data.
>>>>
>>>> I figure what is happening is:
>>>> fread is trying to read a full block size of 16M, which is good in a
>>>> way when it hits the hard disk.
>>>> But the application could be using just a small part of that 16M. Thus
>>>> when randomly reading (freads) a lot of data in 16M chunks, it is page
>>>> faulting a lot more and causing the performance to drop.
>>>> I could try to make the application do read() instead of freads, but I
>>>> fear that could be bad too, since it might then hit the disk with a
>>>> very small block size and that is not good.
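>>>>
>>>> (For what it is worth, I have been comparing the page-fault counts of
>>>> the two cases roughly like this; a sketch only, and the binary name is
>>>> a placeholder for the real application:)
>>>>
>>>>     perf stat -e page-faults,minor-faults,major-faults ./read_workload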
>>>>
>>>> The way I see things now, I believe it would be best if the
>>>> application did random reads of 4k/1M from the pagepool but somehow
>>>> did 16M reads from the rotating disks.
>>>>
>>>> I don’t see any way of doing the above other than following a
>>>> different approach where I create a filesystem with a smaller block
>>>> size (1M or less), on SSDs, as a tier.
>>>>
>>>> May I please ask for advice on whether what I am understanding/seeing
>>>> is right, and on the best possible solution for the above scenario.
>>>>
>>>> Regards,
>>>> Lohit
>>>>
>>>> On Apr 11, 2018, 10:36 AM -0400, Lohit Valleru <valleru at cbio.mskcc.org>,
>>>> wrote:
>>>>
>>>> Hey Sven,
>>>>
>>>> This is regarding the mmap issues and GPFS.
>>>> We had previously discussed experimenting with GPFS 5.
>>>>
>>>> I have now upgraded all of the compute nodes and NSD nodes to GPFS
>>>> 5.0.0.2.
>>>>
>>>> I am yet to experiment with mmap performance, but before that, I am
>>>> seeing weird hangs with GPFS 5 and I think they could be related to
>>>> mmap.
>>>>
>>>> Have you seen GPFS ever hang on this syscall?
>>>> [Tue Apr 10 04:20:13 2018] [<ffffffffa0a92155>]
>>>> _ZN10gpfsNode_t8mmapLockEiiPKj+0xb5/0x140 [mmfs26]
>>>>
>>>> I see the above when the kernel hangs and throws out a series of trace
>>>> calls.
>>>>
>>>> I somehow think the above trace is related to processes hanging on
>>>> GPFS forever. There are no errors in GPFS, however.
>>>>
>>>> Also, I think the above happens only when the number of mmap threads
>>>> goes above a particular value.
>>>>
>>>> We had faced a similar issue in 4.2.3, and it was resolved in a patch
>>>> in 4.2.3.2. At that time, the issue happened when the number of mmap
>>>> threads exceeded worker1Threads. According to the ticket, it was an
>>>> mmap race condition that GPFS was not handling well.
>>>>
>>>> I am not sure whether this issue is a repeat, and I am yet to isolate
>>>> the incident and test with an increasing number of mmap threads.
>>>>
>>>> I am not 100 percent sure whether this is related to mmap yet, but I
>>>> just wanted to ask you if you have seen anything like the above.
>>>>
>>>> Thanks,
>>>>
>>>> Lohit
>>>>
>>>> On Feb 22, 2018, 3:59 PM -0500, Sven Oehme <oehmes at gmail.com>, wrote:
>>>>
>>>> Hi Lohit,
>>>>
>>>> i am working with ray on an mmap performance improvement right now,
>>>> which most likely has the same root cause as yours, see -->
>>>> http://gpfsug.org/pipermail/gpfsug-discuss/2018-January/004411.html
>>>> the thread above went silent after a couple of rounds of back and
>>>> forth, but ray and i have active communication in the background and
>>>> will repost as soon as there is something new to share.
>>>> i am happy to look at this issue after we finish with ray's workload
>>>> if something is still missing, but first let's finish his, get you to
>>>> try the same fix, and see whether anything is still missing.
>>>>
>>>> btw. if people would share their use of mmap, i.e. what applications
>>>> they use (home grown, something that just uses lmdb which uses mmap
>>>> under the covers, etc.), please let me know so i get a better picture
>>>> of how wide the usage is with GPFS. i know a lot of the ML/DL workloads
>>>> are using it, but i would like to know what else is out there that i
>>>> might not think about. feel free to drop me a personal note, i might
>>>> not reply to it right away, but eventually i will.
>>>>
>>>> thx. sven
>>>>
>>>>
>>>> On Thu, Feb 22, 2018 at 12:33 PM <valleru at cbio.mskcc.org> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I wanted to know how mmap interacts with the GPFS pagepool with
>>>>> respect to filesystem block size.
>>>>> Does the efficiency depend on the mmap read size and the block size
>>>>> of the filesystem, even if all the data is cached in the pagepool?
>>>>>
>>>>> GPFS 4.2.3.2 and CentOS7.
>>>>>
>>>>> Here is what i observed:
>>>>>
>>>>> I was testing a user script that uses mmap to read files of 100MB to
>>>>> 500MB.
>>>>>
>>>>> The above files are stored on 3 different filesystems.
>>>>>
>>>>> Compute nodes - 10G pagepool and 5G seqdiscardthreshold.
>>>>>
>>>>> 1. 4M block size GPFS filesystem, with separate metadata and data.
>>>>> Data on nearline and metadata on SSDs.
>>>>> 2. 1M block size GPFS filesystem as an AFM cache cluster, "with all
>>>>> the required files fully cached" from the above GPFS cluster as home.
>>>>> Data and metadata together on SSDs.
>>>>> 3. 16M block size GPFS filesystem, with separate metadata and data.
>>>>> Data on nearline and metadata on SSDs.
>>>>>
>>>>> When I run the script the first time for “each” filesystem:
>>>>> I see that GPFS reads from the files and caches into the pagepool as
>>>>> it reads, as shown by mmdiag --iohist.
>>>>>
>>>>> When I run it the second time, I see that there are no IO requests
>>>>> from the compute node to the GPFS NSD servers, which is expected since
>>>>> all the data from the 3 filesystems is cached.
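>>>>>
>>>>> (To confirm this, I watch the recent I/O history on the compute node
>>>>> and on the NSD servers while the script runs; simply:)
>>>>>
>>>>>     mmdiag --iohist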
>>>>>
>>>>> However, the time taken for the script to run for the files in the 3
>>>>> different filesystems is different, although I know that they are just
>>>>> "mmapping"/reading from the pagepool/cache and not from disk.
>>>>>
>>>>> Here is the difference in time, for IO just from pagepool:
>>>>>
>>>>> 20s 4M block size
>>>>> 15s 1M block size
>>>>> 40s 16M block size.
>>>>>
>>>>> Why do I see a difference when doing mmap reads from filesystems with
>>>>> different block sizes, although I can see that the IO requests are not
>>>>> hitting the disks, just the pagepool?
>>>>>
>>>>> I am willing to share the strace output and mmdiag outputs if needed.
>>>>>
>>>>> Thanks,
>>>>> Lohit
>>>>>