[gpfsug-discuss] Same file opened by many nodes / processes

José Filipe Higino jose.filipe.higino at gmail.com
Sun Jul 22 13:51:03 BST 2018


Hi there,

Have you been able to create a test case (replicate the problem)? Can you
tell us a bit more about the setup?

Are you using the GPFS API, or just the administrative commands? Any
problems with the network (whether Ethernet or IB)?
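If it helps to rule the network out quickly, and assuming you are on a
release new enough to have mmnetverify, a cheap first pass could be
something like the following (node names are placeholders):

    # RPC/connection health as seen from the local node
    mmdiag --network

    # active connectivity checks run from a few of the compute nodes
    mmnetverify connectivity -N node1,node2,node3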

Sorry for showing up unannounced here for the first time, but I would like
to help if I can.

Jose Higino,
from NIWA
New Zealand

Cheers

On Sun, 22 Jul 2018 at 23:26, Peter Childs <p.childs at qmul.ac.uk> wrote:

> Yes, we run mmbackup, using a snapshot.
>
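For reference, I would expect that to look roughly like the following
(file system and snapshot names invented), in case anything differs on
your side:

    # create a consistent, read-only view for the backup to scan
    mmcrsnapshot gpfs01 mmbackupSnap

    # run the incremental backup against that snapshot
    mmbackup gpfs01 -S mmbackupSnap -t incremental

    # remove the snapshot afterwards
    mmdelsnapshot gpfs01 mmbackupSnap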
> The scan usually takes an hour, but for the last week it has been taking
> many hours (I saw it take 12 last Tuesday).
>
> It has sped up again now, back to its normal hour, but the high-I/O jobs
> accessing the same file from many nodes also look to have come to an end
> for the time being.
>
> I was trying to figure out how to control the bad I/O using mmchqos, to
> prioritise certain nodes over others, but I have not worked out whether
> that is possible yet.
>
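On the QoS side: I have not tried per-node prioritisation myself, so
treat this only as a sketch of the documented pool/class throttling
(file system, pool name and IOPS values are made up). mmbackup and the
other maintenance commands fall into the maintenance class, and normal
user I/O falls into the other class, so you can at least stop one from
starving the other:

    # cap ordinary user I/O on the data pool so the scan keeps making progress
    mmchqos gpfs01 --enable pool=data,other=10000IOPS,maintenance=unlimited

    # watch the effect for a minute
    mmlsqos gpfs01 --seconds 60

    # back it out once the backup window is over
    mmchqos gpfs01 --disable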
> We've only previously seen this problem when we had some bad disks in our
> storage, which we replaced. I've checked, and I can't see that issue
> currently.
>
> Thanks for the help.
>
>
>
> Peter Childs
> Research Storage
> ITS Research and Teaching Support
> Queen Mary, University of London
>
> ---- Yaron Daniel wrote ----
>
> Hi
>
> Do you run mmbackup on a snapshot, which is read-only?
>
>
> Regards
>
> ------------------------------
>
>
>
> Yaron Daniel
> Storage Architect – IL Lab Services (Storage)
> IBM Global Markets, Systems HW Sales
> 94 Em Ha'Moshavot Rd, Petach Tiqva, 49527, Israel
>
> Phone: +972-3-916-5672
> Fax: +972-3-916-5672
> Mobile: +972-52-8395593
> e-mail: yard at il.ibm.com
> *IBM Israel* <http://www.ibm.com/il/he/>
>
>
> From:        Peter Childs <p.childs at qmul.ac.uk>
> To:        "gpfsug-discuss at spectrumscale.org" <
> gpfsug-discuss at spectrumscale.org>
> Date:        07/10/2018 05:51 PM
> Subject:        [gpfsug-discuss] Same file opened by many nodes /
> processes
> Sent by:        gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------
>
>
>
> We have a situation where the same file is being read by around 5000
> "jobs". This is an array job in UGE with a -tc (maximum concurrent tasks)
> limit set, so the file in question is being opened by about 100
> processes/jobs at the same time.
>
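For anyone not familiar with the scheduler side, I read that as something
along these lines on the UGE submit host (script name invented):

    # 5000-task array job, with at most 100 tasks running at once
    qsub -t 1-5000 -tc 100 read_bigfile.sh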
> It's a ~200GB file, so copying the file locally first is not an easy
> answer, and these jobs are causing issues with mmbackup scanning the
> file system, in that the scan is taking 3 hours instead of the normal
> 40-60 minutes.
>
> This is read-only access to the file; I don't know the specifics of
> the job.
>
> It looks like the metanode is moving around a fair amount (given what I
> can see from "mmfsadm saferdump file").
>
> I'm wondering whether there is anything we can do to improve things, or
> anything that can be tuned within GPFS. I don't think we have an issue
> with token management, but would increasing maxFilesToCache on our token
> manager node help, say?
>
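If it does turn out to be cache/token churn, the knobs would be
maxFilesToCache (and maxStatCache) via mmchconfig; note that a
maxFilesToCache change only takes effect after GPFS is restarted on the
affected nodes. A sketch, with an invented value and node class:

    # raise the file cache on the token manager / manager nodes only
    mmchconfig maxFilesToCache=262144 -N managerNodes

    # confirm what each node is actually running with
    mmdiag --config | grep -i maxfilestocache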
> Is there anything else I should look at to try to get GPFS to share
> this file better?
>
> Thanks in advance
>
> Peter Childs
>
> --
> Peter Childs
> ITS Research Storage
> Queen Mary, University of London
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss