From knop at us.ibm.com Mon Apr 2 04:16:25 2018 From: knop at us.ibm.com (Felipe Knop) Date: Sun, 1 Apr 2018 23:16:25 -0400 Subject: [gpfsug-discuss] sublocks per block in GPFS 5.0 In-Reply-To: References: <68905b2c-8b1a-4a3d-8ded-c5aa56b765aa@Spark><18518530-0d1f-4937-b2ec-9c16c6c80995@Spark> Message-ID: Folks, Also quoting a previous post: Thanks Mark, I did not know, we could explicitly mention sub-block size when creating File system. It is no-where mentioned in the ?man mmcrfs?. Is this a new GPFS 5.0 feature? Also, i see from the ?man mmcrfs? that the default sub-block size for 8M and 16M is 16K. Specifying the number of subblocks per block or the subblock size in mmcrfs is not currently supported. The subblock size is automatically chosen based on the block size, as described in 'Table 1. Block sizes and subblock sizes' in 'man mmcrfs'. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 (845) 433-9314 T/L 293-9314 From: "Marc A Kaplan" To: gpfsug main discussion list Date: 03/30/2018 02:48 PM Subject: Re: [gpfsug-discuss] sublocks per block in GPFS 5.0 Sent by: gpfsug-discuss-bounces at spectrumscale.org Look at my example, again, closely. I chose the blocksize as 16M and subblock size as 4K and the inodesize as 1K.... Developer works is a good resource, but articles you read there may be incomplete or contain mistakes. The official IBM Spectrum Scale cmd and admin guide documents, are "trustworthy" but may not be perfect in all respects. "Trust but Verify" and YMMV. ;-) As for why/how to choose "good sizes", that depends what objectives you want to achieve, and "optimal" may depend on what hardware you are running. Run your own trials and/or ask performance experts. There are usually "tradeoffs" and OTOH when you get down to it, some choices may not be all-that-important in actual deployment and usage. That's why we have defaults values - try those first and leave the details and tweaking aside until you have good reason ;-) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=oNT2koCZX0xmWlSlLblR9Q&m=6_33HG_HPw9JkUKuyY_SrveiPQ_bnA4JHZ0F7l01ohc&s=HLsts8ySRm-SVYLUNhCt2SxsoP3Ph02ehKmGnqpXbPc&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From scale at us.ibm.com Mon Apr 2 08:11:54 2018 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Mon, 2 Apr 2018 12:41:54 +0530 Subject: [gpfsug-discuss] REST API function for 'mmsmb exportacl list' In-Reply-To: References: Message-ID: Hi Alexander, Markus, Can you please try to answer the below query. Or else forward this to the right folks. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . 
If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Altenburger Ingo (ID SD)" To: "gpfsug-discuss at spectrumscale.org" Date: 03/29/2018 05:57 PM Subject: [gpfsug-discuss] REST API function for 'mmsmb exportacl list' Sent by: gpfsug-discuss-bounces at spectrumscale.org We were very hopeful to replace our storage provisioning automation based on cli commands with the new functions provided in REST API. Since it seems that almost all protocol related commands are already implemented with 5.0.0.1 REST interface, we have still not found an equivalent for mmsmb exportacl list to get the share permissions of a share. Does anybody know that this is already in but not yet documented or is it for sure still not under consideration? Thanks Ingo _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=djjNl-TRGJKujImpbqTzsuNhnILtchzGBzZBdLJbyY0&s=4e6Azge_v1-AApWi_xNPI6V8qSW58ZOxIwFma-A6nss&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From secretary at gpfsug.org Tue Apr 3 11:41:41 2018 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Tue, 03 Apr 2018 11:41:41 +0100 Subject: [gpfsug-discuss] Transforming Workflows at Scale Message-ID: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> Dear all, There's a Spectrum Scale for media breakfast briefing event being organised by IBM at IBM South Bank, London on 17th April (the day before the next UK meeting). The event has been designed for broadcasters, post production houses and visual effects organisations, where managing workflows between different islands of technology is a major challenge. If you're interested, you can read more and register at the IBM Registration Page [1]. Thanks, -- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org Links: ------ [1] https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP -------------- next part -------------- An HTML attachment was scrubbed... URL: From richardb+gpfsUG at ellexus.com Tue Apr 3 12:28:19 2018 From: richardb+gpfsUG at ellexus.com (Richard Booth) Date: Tue, 3 Apr 2018 12:28:19 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 2 In-Reply-To: References: Message-ID: Hi Claire The link at the bottom of your email, doesn't appear to be working. 
Richard On 3 April 2018 at 12:00, wrote: > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Transforming Workflows at Scale (Secretary GPFS UG) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 03 Apr 2018 11:41:41 +0100 > From: Secretary GPFS UG > To: gpfsug main discussion list > Subject: [gpfsug-discuss] Transforming Workflows at Scale > Message-ID: <037f89ab466334f83f235f357111a9d6 at webmail.gpfsug.org> > Content-Type: text/plain; charset="us-ascii" > > > > Dear all, > > There's a Spectrum Scale for media breakfast briefing event being > organised by IBM at IBM South Bank, London on 17th April (the day before > the next UK meeting). > > The event has been designed for broadcasters, post production houses and > visual effects organisations, where managing workflows between different > islands of technology is a major challenge. > > If you're interested, you can read more and register at the IBM > Registration Page [1]. > > Thanks, > -- > > Claire O'Toole > Spectrum Scale/GPFS User Group Secretary > +44 (0)7508 033896 > www.spectrumscaleug.org > > > Links: > ------ > [1] > https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp? > openform&seminar=B223GVES&locale=en_ZZ&cm_mmc= > Email_External-_-Systems_Systems+-+Hybrid+Cloud+ > Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_ > mmca1=000030YP&cm_mmca2=10001939&cvosrc=email. > External.NA&cvo_campaign=000030YP > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: 20180403/302ad054/attachment-0001.html> > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 75, Issue 2 > ********************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From secretary at gpfsug.org Tue Apr 3 12:56:33 2018 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Tue, 03 Apr 2018 12:56:33 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 2 In-Reply-To: References: Message-ID: <026b2aa97247b551b28ea13678484a4b@webmail.gpfsug.org> Hi Richard, My apologies, that is strange. This is the link and I have checked it works: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP [7] If you're still having problems or require further information, please send an e-mail to justine_ive at uk.ibm.com Many thanks, --- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org On , Richard Booth wrote: > Hi Claire > > The link at the bottom of your email, doesn't appear to be working. 
> > Richard > > On 3 April 2018 at 12:00, wrote: > >> Send gpfsug-discuss mailing list submissions to >> gpfsug-discuss at spectrumscale.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss [1] >> or, via email, send a message with subject or body 'help' to >> gpfsug-discuss-request at spectrumscale.org >> >> You can reach the person managing the list at >> gpfsug-discuss-owner at spectrumscale.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of gpfsug-discuss digest..." >> >> Today's Topics: >> >> 1. Transforming Workflows at Scale (Secretary GPFS UG) >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Tue, 03 Apr 2018 11:41:41 +0100 >> From: Secretary GPFS UG >> To: gpfsug main discussion list >> Subject: [gpfsug-discuss] Transforming Workflows at Scale >> Message-ID: <037f89ab466334f83f235f357111a9d6 at webmail.gpfsug.org> >> Content-Type: text/plain; charset="us-ascii" >> >> Dear all, >> >> There's a Spectrum Scale for media breakfast briefing event being >> organised by IBM at IBM South Bank, London on 17th April (the day before >> the next UK meeting). >> >> The event has been designed for broadcasters, post production houses and >> visual effects organisations, where managing workflows between different >> islands of technology is a major challenge. >> >> If you're interested, you can read more and register at the IBM >> Registration Page [1]. >> >> Thanks, >> -- >> >> Claire O'Toole >> Spectrum Scale/GPFS User Group Secretary >> +44 (0)7508 033896 [2] >> www.spectrumscaleug.org [3] >> >> Links: >> ------ >> [1] >> https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP [4] >> -------------- next part -------------- >> An HTML attachment was scrubbed... >> URL: >> >> ------------------------------ >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org [6] >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss [1] >> >> End of gpfsug-discuss Digest, Vol 75, Issue 2 >> ********************************************* > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss [1] Links: ------ [1] http://gpfsug.org/mailman/listinfo/gpfsug-discuss [2] tel:%2B44%20%280%297508%20033896 [3] http://www.spectrumscaleug.org [4] https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&amp;seminar=B223GVES&amp;locale=en_ZZ&amp;cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&amp;cm_mmca1=000030YP&amp;cm_mmca2=10001939&amp;cvosrc=email.External.NA&amp;cvo_campaign=000030YP [5] http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20180403/302ad054/attachment-0001.html [6] http://spectrumscale.org [7] https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From A.Wolf-Reber at de.ibm.com Tue Apr 3 16:26:45 2018 From: A.Wolf-Reber at de.ibm.com (Alexander Wolf) Date: Tue, 3 Apr 2018 15:26:45 +0000 Subject: [gpfsug-discuss] REST API function for 'mmsmb exportacl list' In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152274503780210.png Type: image/png Size: 1134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152274503780211.png Type: image/png Size: 6645 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152274503780212.png Type: image/png Size: 1134 bytes Desc: not available URL: From john.hearns at asml.com Wed Apr 4 10:11:48 2018 From: john.hearns at asml.com (John Hearns) Date: Wed, 4 Apr 2018 09:11:48 +0000 Subject: [gpfsug-discuss] Dual server NSDs Message-ID: I should say I already have a support ticket open for advice on this issue. We have a filesystem which has NSDs which have two servers defined, for instance: nsd: device=/dev/sdb servers=sn007,sn008 nsd=nsd1 usage=dataOnly Can I remove one of these servers? The object is to upgrade this server and change its hostname, the physical server will stay in place. Has anyone carried out an operation similar to this? I guess the documentation here is quite clear: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20server%20balance "If you want to change configuration for a NSD which is already belongs to a file system, you need to unmount the file system before running mmchnsd command." -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Apr 4 19:56:56 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 4 Apr 2018 18:56:56 +0000 Subject: [gpfsug-discuss] Dual server NSDs Message-ID: <59DE0638-07DC-4C10-A981-F4EEE6A60D89@nuance.com> Short answer is that if you want to change/remove the NSD server config on an NSD and its part of a file systems, you need to remove it from the file system or unmount the file system. *Thankfully* this is changed in Scale 5.0. In your case (host name change) ? if the IP address of the NSD server stays the same you *may* be OK. Can you put a DNS alias in for the old host name? Well, now that I think about it the old host name will stick around in the config ? so maybe not such a great idea. 
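For what it's worth, the NSD server-list change itself is just a small stanza update once you can take the file system down (or once you are on 5.0). A rough sketch only - the file system, stanza and host names here are invented, so check mmchnsd in the docs for your code level:

# newservers.stanza - nsd= and servers= are the fields mmchnsd cares about
%nsd: nsd=nsd1 servers=sn008      # e.g. drop sn007 while it is being rebuilt

mmumount gpfs1 -a                 # needed on 4.x; Scale 5.0 removes the unmount requirement
mmchnsd -F newservers.stanza
mmmount gpfs1 -a
mmlsnsd -d nsd1                   # confirm the new server list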
Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of John Hearns Reply-To: gpfsug main discussion list Date: Wednesday, April 4, 2018 at 1:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Dual server NSDs Can I remove one of these servers? The object is to upgrade this server and change its hostname, the physical server will stay in place. Has anyone carried out an operation similar to this? -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Wed Apr 4 11:02:09 2018 From: john.hearns at asml.com (John Hearns) Date: Wed, 4 Apr 2018 10:02:09 +0000 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname Message-ID: Following up from my previous email (I should reply to that email I know) What we really want to achieve is changing the FQDN of an existing server. The server will be reinstalled with an updated OS (RHEL 6---> RHEL 7) During the move we wish to change the domain name of the server. So we will be taking the server offline and bringing the same physical server back up with a new domain name. Has anyone done a procedure like this? Thankyou -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Wed Apr 4 20:59:56 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Wed, 04 Apr 2018 15:59:56 -0400 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: Message-ID: <49633.1522871996@turing-police.cc.vt.edu> On Wed, 04 Apr 2018 10:02:09 -0000, John Hearns said: > Has anyone done a procedure like this? We recently got to rename all 10 nodes in a GPFS cluster to make the unqualified name unique (turned out that having 2 nodes called 'arnsd1.isb.mgt' and 'arnsd1.vtc.mgt' causes all sorts of confusion). So they got renamed to arnsd1-isb.yadda.yadda and arnsd1-vtc.yadda.yadda. Unmount, did the mmchnsd server list thing, start going through the servers, rename and reboot each one. We did hit a whoopsie because I forgot to fix the list of quorum/manager nodes as we did each node - so don't forget to run mmchnode for each system if/when appropriate... 
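A hedged sketch of that last mmchnode step, reusing the example node name from above and assuming the node had quorum and manager roles before the rename:

mmchnode --quorum -N arnsd1-isb.yadda.yadda
mmchnode --manager -N arnsd1-isb.yadda.yadda
mmlscluster                       # check the designations are back as expected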
From Kevin.Buterbaugh at Vanderbilt.Edu Wed Apr 4 21:50:13 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 4 Apr 2018 20:50:13 +0000 Subject: [gpfsug-discuss] Local event Message-ID: <3C21055E-9268-4679-AB34-6917CAF24087@vanderbilt.edu> Hi All, According to the man page for mmaddcallback: A local event triggers a callback only on the node on which the event occurred, such as mounting a file system on one of the nodes. We have two GPFS clusters here (well, three if you count our small test cluster). Cluster one has 8 NSD servers and one client, which is used only for tape backup ? i.e. no one logs on to any of the nodes in the cluster. Files on it are accessed one of three ways: 1) CNFS mount to local computer, 2) SAMBA mount to local computer, 3) GPFS multi-cluster remote mount to cluster two. On cluster one there is a user callback for softQuotaExceeded that e-mails the user ? and that we know works. Cluster two has two local GPFS filesystems and over 600 clients natively mounting those filesystems (it?s our HPC cluster). I?m trying to implement a similar callback for softQuotaExceeded events on cluster two as well. I?ve tested the callback by manually running the (Python) script and passing it in the parameters I want and it works - I get the e-mail. Then I added it via mmcallback, but only on the GPFS servers. I did that because I thought that since callbacks work on cluster one with no local access to the GPFS servers that ?local? must mean ?when an NSD server does a write that puts the user over quota?. However, on cluster two the callback is not being triggered. Does this mean that I actually need to install the callback on every node in cluster two? If so, then how / why are callbacks working on cluster one? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Apr 4 19:52:33 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 4 Apr 2018 18:52:33 +0000 Subject: [gpfsug-discuss] Dual server NSDs In-Reply-To: References: Message-ID: <411E40D4-99AA-4032-B1D7-E16C89FAB0BD@vanderbilt.edu> Hi John, Yes, you can remove one of the servers and yes, we?ve done it and yes, the documentation is clear and correct. ;-) Last time I did this we were in a full cluster downtime, so unmounting wasn?t an issue. We were changing our network architecture and so the IP addresses of all NSD servers save one were changing. It was a bit ? uncomfortable ? for the brief period of time I had to make the one NSD server the one and only NSD server for ~1 PB of storage! But it worked just fine? HTHAL? Kevin On Apr 4, 2018, at 4:11 AM, John Hearns > wrote: I should say I already have a support ticket open for advice on this issue. We have a filesystem which has NSDs which have two servers defined, for instance: nsd: device=/dev/sdb servers=sn007,sn008 nsd=nsd1 usage=dataOnly Can I remove one of these servers? The object is to upgrade this server and change its hostname, the physical server will stay in place. Has anyone carried out an operation similar to this? 
I guess the documentation here is quite clear: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20server%20balance ?If you want to change configuration for a NSD which is already belongs to a file system, you need to unmount the file system before running mmchnsd command.? -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cf2ffa137afda4368e32708d59a5c513c%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636584643653030858&sdata=Wqpqck%2FuCuzJnolVxElWG6Eky5R%2Bsc4tyvEp6we85Sw%3D&reserved=0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From alevin at gmail.com Wed Apr 4 22:39:08 2018 From: alevin at gmail.com (Alex Levin) Date: Wed, 04 Apr 2018 21:39:08 +0000 Subject: [gpfsug-discuss] Dual server NSDs In-Reply-To: <411E40D4-99AA-4032-B1D7-E16C89FAB0BD@vanderbilt.edu> References: <411E40D4-99AA-4032-B1D7-E16C89FAB0BD@vanderbilt.edu> Message-ID: We are doing the similar procedure right now. Migrating from one group of nsd servers to another. Unfortunately, as I understand, if you can't afford the cluster/filesystem downtime and not ready for 5.0 upgrade yet ( personally I'm not comfortable with ".0" versions of software in production :) ) - the only way to do it is remove disk/nsd from filesystem and add it back with the new servers list. Taking a while , a lot of i/o ... John, in case the single nsd filesystem, I'm afraid, you'll have to unmount it to change .... --Alex On Wed, Apr 4, 2018, 2:25 PM Buterbaugh, Kevin L < Kevin.Buterbaugh at vanderbilt.edu> wrote: > Hi John, > > Yes, you can remove one of the servers and yes, we?ve done it and yes, the > documentation is clear and correct. ;-) > > Last time I did this we were in a full cluster downtime, so unmounting > wasn?t an issue. We were changing our network architecture and so the IP > addresses of all NSD servers save one were changing. It was a bit ? > uncomfortable ? for the brief period of time I had to make the one NSD > server the one and only NSD server for ~1 PB of storage! But it worked > just fine? > > HTHAL? > > Kevin > > On Apr 4, 2018, at 4:11 AM, John Hearns wrote: > > I should say I already have a support ticket open for advice on this issue. 
> We have a filesystem which has NSDs which have two servers defined, for > instance: > nsd: > device=/dev/sdb > servers=sn007,sn008 > nsd=nsd1 > usage=dataOnly > > Can I remove one of these servers? The object is to upgrade this server > and change its hostname, the physical server will stay in place. > Has anyone carried out an operation similar to this? > > I guess the documentation here is quite clear: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20server%20balance > > ?If you want to change configuration for a NSD which is already belongs > to a file system, you need to unmount the file system before running > mmchnsd command.? > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. Neither the sender nor the > company/group of companies he or she represents shall be liable for the > proper and complete transmission of the information contained in this > communication, or for any delay in its receipt. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cf2ffa137afda4368e32708d59a5c513c%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636584643653030858&sdata=Wqpqck%2FuCuzJnolVxElWG6Eky5R%2Bsc4tyvEp6we85Sw%3D&reserved=0 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Thu Apr 5 02:57:15 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Thu, 5 Apr 2018 03:57:15 +0200 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: Message-ID: Hm, you can change the host name of a Scale node. I've done that a while ago on one or two clusters. >From what I remember I'd follow these steps: 1. Upgrade the OS configuring/using the old IP addr/hostname (2. Reinstall Scale) (3. Replay the cluster data on the node) 4. Create an interface with the new IP address on the node (not necessarily connected) 5. Ensure the node is not required for quorum and has currently no mgr role. You might want to stop Scale on the node. 5. mmchnode -N --daemon-interface ; mmchnode -N --admin-interface . Now the node has kind of disappeared, if the new IF is not yet functional, until you bring that IF up (6.) (6. Activate connection to other cluster nodes via new IF) 2. and 3. are required if scale was removed / the system was re-set up from scratch 6. is required if the new IP connection config.ed in 4 is not operational at first (e.g. not yet linked, or routing not yet active, ...) 
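As an illustration of step 5 only (the old and new names are made up, and it is worth double-checking the exact option spelling against the mmchnode man page for your release):

mmchnode -N sn007.olddomain.example.com --daemon-interface=sn007.newdomain.example.com
mmchnode -N sn007.olddomain.example.com --admin-interface=sn007.newdomain.example.com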
Et voila, the server should be happy again, if stopped before, start up Scale and check. No warranties, But that's how I'd try. As usual: if messing with IP config, be sure to have a back door to the system in case you ground the OS network config . Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: John Hearns To: gpfsug main discussion list Date: 04/04/2018 21:33 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname Sent by: gpfsug-discuss-bounces at spectrumscale.org Following up from my previous email (I should reply to that email I know) What we really want to achieve is changing the FQDN of an existing server. The server will be reinstalled with an updated OS (RHEL 6-? RHEL 7) During the move we wish to change the domain name of the server. So we will be taking the server offline and bringing the same physical server back up with a new domain name. Has anyone done a procedure like this? Thankyou -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From UWEFALKE at de.ibm.com Thu Apr 5 03:25:18 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Thu, 5 Apr 2018 04:25:18 +0200 Subject: [gpfsug-discuss] Local event In-Reply-To: <3C21055E-9268-4679-AB34-6917CAF24087@vanderbilt.edu> References: <3C21055E-9268-4679-AB34-6917CAF24087@vanderbilt.edu> Message-ID: Hi Kevin , I suppose the quota check is done when the writing node allocates blocks to write to. mind: the detour via NSD servers is transparent for that layer, GPFS may switch between SCSI/SAN paths to a (direct-.attached) block device and the NSD service via a separate NSD server, both ways are logically similar for the writing node (or should be for your matter). In short: yes, I think you need to roll out your "quota exceeded" call-back to all nodes in the HPC cluster. Mit freundlichen Gr??en / Kind regards Dr. 
Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Buterbaugh, Kevin L" To: gpfsug main discussion list Date: 04/04/2018 22:51 Subject: [gpfsug-discuss] Local event Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi All, According to the man page for mmaddcallback: A local event triggers a callback only on the node on which the event occurred, such as mounting a file system on one of the nodes. We have two GPFS clusters here (well, three if you count our small test cluster). Cluster one has 8 NSD servers and one client, which is used only for tape backup ? i.e. no one logs on to any of the nodes in the cluster. Files on it are accessed one of three ways: 1) CNFS mount to local computer, 2) SAMBA mount to local computer, 3) GPFS multi-cluster remote mount to cluster two. On cluster one there is a user callback for softQuotaExceeded that e-mails the user ? and that we know works. Cluster two has two local GPFS filesystems and over 600 clients natively mounting those filesystems (it?s our HPC cluster). I?m trying to implement a similar callback for softQuotaExceeded events on cluster two as well. I?ve tested the callback by manually running the (Python) script and passing it in the parameters I want and it works - I get the e-mail. Then I added it via mmcallback, but only on the GPFS servers. I did that because I thought that since callbacks work on cluster one with no local access to the GPFS servers that ?local? must mean ?when an NSD server does a write that puts the user over quota?. However, on cluster two the callback is not being triggered. Does this mean that I actually need to install the callback on every node in cluster two? If so, then how / why are callbacks working on cluster one? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chair at spectrumscale.org Thu Apr 5 10:30:22 2018 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Thu, 05 Apr 2018 10:30:22 +0100 Subject: [gpfsug-discuss] RFE Process ... Burning Issues Message-ID: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> Just a reminder that if you want to submit for the pilot RFE process, submissions must be in by end of next week. Judging by the responses so far, apparently the product is perfect ? Simon From: on behalf of "chair at spectrumscale.org" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Monday, 26 March 2018 at 12:52 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] RFE Process ... 
Burning Issues Hi All, We've been talking with product management about the RFE process and have agreed that we'll try out a community-voting process. First up, we are piloting this idea, hopefully it will work out, but it may also need tweaks as we move forward. One of the things we've been asking for is for a better way for the Spectrum Scale user group community to vote on RFEs. Sure we get people posting to the list, but we're looking at if we can make it a better/more formal process to support this. Talking with IBM, we also recognise that with a large number of RFEs, it can be difficult for them to track work tasks being completed, but with the community RFEs, there is a commitment to try and track them closely and report back on progress later in the year. To submit an RFE using this process, you must complete the form available at: https://ibm.box.com/v/EnhBlitz (Enhancement Blitz template v1.pptx) The form provides some guidance on a good and bad RFE. Sure a lot of us are techie/engineers, so please try to explain what problem you are solving rather than trying to provide a solution. (i.e. leave the technical implementation details to those with the source code). Each site is limited to 2 submissions and they will be looked over by the Spectrum Scale community leaders, we may ask people to merge requests, send back for more info etc, or there may be some that we know will just never be progressed for various reasons. At the April user group in the UK, we have an RFE (Burning issues) session planned. Submitters of the RFE will be expected to provide a 1-3 minute pitch for their RFE. We've placed the session at the end of the day (UK time) to try and ensure USA people can participate. Remote presentation of your RFE is fine and we plan to live-stream the session. Each person will have 3 votes to choose what they think are their highest priority requests. Again remote voting is perfectly fine but only 3 votes per person. The requests with the highest number of votes will then be given a higher chance of being implemented. There's a possibility that some may even make the winter release cycle. Either way, we plan to track the 'chosen' RFEs more closely and provide an update at the November USA meeting (likely the SC18 one). The submission and voting process is also planned to be run again in time for the November meeting. Anyone wanting to submit an RFE for consideration should submit the form by email to rfe at spectrumscaleug.org *before* 13th April. We'll be posting the submitted RFEs up at the box site as well, you are encouraged to visit the site regularly and check the submissions as you may want to contact the author of an RFE to provide more information/support the RFE. Anything received after this date will be held over to the November cycle. The earlier you submit, the better chance it has of being included (we plan to limit the number to be considered) and will give us time to review the RFE and come back for more information/clarification if needed. You must also be prepared to provide a 1-3 minute pitch for your RFE (in person or remote) for the UK user group meeting. You are welcome to submit any RFE you have already put into the RFE portal for this process to garner community votes for it. There is space on the form to provide the existing RFE number. If you have any comments on the process, you can also email them to rfe at spectrumscaleug.org as well. Thanks to Carl Zeite for supporting this plan... Get submitting!
Simon (UK Group Chair) -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Thu Apr 5 11:09:07 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 5 Apr 2018 10:09:07 +0000 Subject: [gpfsug-discuss] UK April meeting Message-ID: It?s now just two weeks until the UK meeting and we are down to our last few places available. If you were planning on attending, please register now! Simon From: on behalf of "chair at spectrumscale.org" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Thursday, 1 March 2018 at 11:26 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] UK April meeting Hi All, We?ve just posted the draft agenda for the UK meeting in April at: http://www.spectrumscaleug.org/event/uk-2018-user-group-event/ So far, we?ve issued over 50% of the available places, so if you are planning to attend, please do register now! Please register at: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-2018-registration-41489952565?aff=MailingList We?ve also confirmed our evening networking/social event between days 1 and 2 with thanks to our sponsors for supporting this. Please remember that we are currently limiting to two registrations per organisation. We?d like to thank our sponsors from DDN, E8, Ellexus, IBM, Lenovo, NEC and OCF for supporting the event. Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Thu Apr 5 14:37:35 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 5 Apr 2018 09:37:35 -0400 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: Message-ID: To my mind this is simpler: IF you can mmdelnode without too much suffering, do that. Then reconfigure the host name and whatever else you'd like to do. Then mmaddnode... -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Thu Apr 5 15:27:38 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 5 Apr 2018 14:27:38 +0000 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: Message-ID: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk> Yeah that was my thoughts too given Bob said you can update the server list for an NSD device in 5.0. I also thought that bringing up a second nic and changing the name etc could bring a whole world or danger from having split routing and rp_filter (been there, had the weirdness, RDMA traffic continues but admin traffic randomly fails, but hey, if you like the world crashing down around you?.) Simon From: on behalf of "makaplan at us.ibm.com" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Thursday, 5 April 2018 at 14:37 To: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] Dual server NSDs - change of hostname To my mind this is simpler: IF you can mmdelnode without too much suffering, do that. Then reconfigure the host name and whatever else you'd like to do. Then mmaddnode... -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From john.hearns at asml.com Thu Apr 5 16:27:52 2018 From: john.hearns at asml.com (John Hearns) Date: Thu, 5 Apr 2018 15:27:52 +0000 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk> References: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk> Message-ID: Thankyou everyone for replies on this issue. Very helpful. We have a test setup with three nodes, although no multi-pathed disks. So I can try out removing and replacing disks servers. I agree with Simon that bringing up a second NIC is probably inviting Murphy in to play merry hell? The option we are envisioning is re-installing the server(s) but leaving them with the existing FQDNs if we can. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Thursday, April 05, 2018 4:28 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Dual server NSDs - change of hostname Yeah that was my thoughts too given Bob said you can update the server list for an NSD device in 5.0. I also thought that bringing up a second nic and changing the name etc could bring a whole world or danger from having split routing and rp_filter (been there, had the weirdness, RDMA traffic continues but admin traffic randomly fails, but hey, if you like the world crashing down around you?.) Simon From: > on behalf of "makaplan at us.ibm.com" > Reply-To: "gpfsug-discuss at spectrumscale.org" > Date: Thursday, 5 April 2018 at 14:37 To: "gpfsug-discuss at spectrumscale.org" > Subject: Re: [gpfsug-discuss] Dual server NSDs - change of hostname To my mind this is simpler: IF you can mmdelnode without too much suffering, do that. Then reconfigure the host name and whatever else you'd like to do. Then mmaddnode... -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Fri Apr 6 00:12:52 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Fri, 6 Apr 2018 01:12:52 +0200 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk> Message-ID: Hi John, some last thoughts mmdelnode/mmaddnode is an easy way to move non-NSD servers, but doing so for NSD servers requires to run mmchnsd, and that again requires a downtime for the file system the NSDs are part of (in Scale 4 at least, what we are talking right here). That could only be circumvented by mmdeldisk/mmadddisk the NSDs of the NSD server to be moved (with all the restriping). If that's ok for you go ahead. 
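Sketched with invented file system and stanza names only (the mmdeldisk is where the data migration happens, so expect it to take a while on anything big; the mmadddisk stanza would need usage= and failureGroup= added to preserve the original settings):

mmdeldisk gpfs1 nsd1                        # migrates data off and frees the NSD from the file system
mmchnsd -F nsd1_newservers.stanza           # the free NSD's server list can now change without an unmount
mmadddisk gpfs1 -F nsd1_newservers.stanza   # put it back
mmrestripefs gpfs1 -b                       # optional rebalance afterwards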
Else I think you might give the mmchnode way a second thought. I'd stop GPFS on the server to be moved (although that should also be hot-swappable) which should prevent any havoc for Scale and offers you plenty of opportunity to check your final new network set-up, before starting Scale on that renewed node. YMMV, and you might try different methods on your test system of course. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From sjhoward at iu.edu Fri Apr 6 16:20:59 2018 From: sjhoward at iu.edu (Howard, Stewart Jameson) Date: Fri, 6 Apr 2018 15:20:59 +0000 Subject: [gpfsug-discuss] Experiences with Export Node Transition from 3.5 -> 4.x Message-ID: <1523028060.8115.12.camel@iu.edu> Hi All, We were wondering what the group's experiences have been with upgrading export nodes from 3.5, especially those upgrades that involved a transition from home-grown ADS domain integration to the new CES integration piece. Specifcially, we're interested in: 1) ?What changes were necessary to make in your domain to get it to interoperate with CES? 2) ?Any good tips for CES workarounds in the case of domain configuration that cannot be changed? 3) ?Experience with CES user-defined auth mode in particular? ?Has anyone got this mode to work successfullly? Let us know. ?Thanks! Stewart Howard Indiana University From sjhoward at iu.edu Fri Apr 6 16:14:48 2018 From: sjhoward at iu.edu (Howard, Stewart Jameson) Date: Fri, 6 Apr 2018 15:14:48 +0000 Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2 Message-ID: <1523027688.8115.6.camel@iu.edu> Hi All, I was wondering what experiences the user group has had with stretch 4.x clusters. ?Specifically, we're interested in: 1) ?What SS version are you running? 2) ?What hardware are you running it on? 3) ?What has been your experience with testing of site-failover scenarios (e.g., full power loss at one site, interruption of inter- site link). Thanks so much for your help! Stewart From r.sobey at imperial.ac.uk Fri Apr 6 17:00:09 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 6 Apr 2018 16:00:09 +0000 Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2 In-Reply-To: <1523027688.8115.6.camel@iu.edu> References: <1523027688.8115.6.camel@iu.edu> Message-ID: Hi Stewart We're running a synchronous replication cluster between our DCs in London and Slough, at a distance of ~63km. The latency is in the order of 700 microseconds over dark fibre. Honestly... it's been a fine experience. We've never had a full connectivity loss mind you, but we have had to shut down one site fully whilst the other one carried on as normal. Mmrestripe afterwards of course. We are running Scale version 4.2.3 and looking at v5. 
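In case it helps anyone reading the thread, the "synchronous replication" part boils down to one failure group per site plus two copies of data and metadata; a minimal sketch with invented names:

# NSD stanzas - one failure group per data centre
%nsd: nsd=lon_nsd1 servers=lon-nsd01,lon-nsd02 usage=dataAndMetadata failureGroup=1
%nsd: nsd=slo_nsd1 servers=slo-nsd01,slo-nsd02 usage=dataAndMetadata failureGroup=2

mmcrfs gpfs1 -F stretch.stanza -m 2 -M 2 -r 2 -R 2   # two replicas keeps a copy in each site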
Hardware is IBM v3700 storage, IBM rackmount NSD/CES nodes. The storage is connected via FC. Cheers Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Howard, Stewart Jameson Sent: 06 April 2018 16:15 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2 Hi All, I was wondering what experiences the user group has had with stretch 4.x clusters. ?Specifically, we're interested in: 1) ?What SS version are you running? 2) ?What hardware are you running it on? 3) ?What has been your experience with testing of site-failover scenarios (e.g., full power loss at one site, interruption of inter- site link). Thanks so much for your help! Stewart _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From christof.schmitt at us.ibm.com Fri Apr 6 18:42:53 2018 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 6 Apr 2018 17:42:53 +0000 Subject: [gpfsug-discuss] Experiences with Export Node Transition from 3.5-> 4.x In-Reply-To: <1523028060.8115.12.camel@iu.edu> References: <1523028060.8115.12.camel@iu.edu> Message-ID: An HTML attachment was scrubbed... URL: From YARD at il.ibm.com Sat Apr 7 18:27:49 2018 From: YARD at il.ibm.com (Yaron Daniel) Date: Sat, 7 Apr 2018 20:27:49 +0300 Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2 In-Reply-To: <1523027688.8115.6.camel@iu.edu> References: <1523027688.8115.6.camel@iu.edu> Message-ID: HI We have few customers than have 2 Sites (Active/Active using SS replication) + 3rd site as Quorum Tie Breaker node. 1) Spectrum Scale 4.2.3.x 2) Lenovo x3650 -M4 connect via FC to SVC (Flash900 as external storage) 3) We run all tests before deliver the system to customer Production. Main items to take into account : 1) What is the latecny you have between the 2 main sites ? 2) What network bandwidth between the 2 sites ? 3) What is the latency to the 3rd site from each site ? 4) Which protocols plan to be used ? Do you have layer2 between the 2 sites , or layer 3 ? 5) Do you plan to use dedicated network for GPFS daemon ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Storage Architect Petach Tiqva, 49527 IBM Global Markets, Systems HW Sales Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: "Howard, Stewart Jameson" To: "gpfsug-discuss at spectrumscale.org" Date: 04/06/2018 06:24 PM Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi All, I was wondering what experiences the user group has had with stretch 4.x clusters. Specifically, we're interested in: 1) What SS version are you running? 2) What hardware are you running it on? 3) What has been your experience with testing of site-failover scenarios (e.g., full power loss at one site, interruption of inter- site link). Thanks so much for your help! 
Stewart _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=Bn1XE9uK2a9CZQ8qKnJE3Q&m=yYIveWTR3gNyhJ9KsrodpWApBlpQ29Oi858MuE0Nzsw&s=V42UYnHtEYVK3LvH6i930tzte1qp0sWmiY6Pp1Ep3kg&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4376 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5093 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4746 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4557 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 11294 bytes Desc: not available URL: From valdis.kletnieks at vt.edu Sun Apr 8 17:21:34 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Sun, 08 Apr 2018 12:21:34 -0400 Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2 In-Reply-To: References: <1523027688.8115.6.camel@iu.edu> Message-ID: <230460.1523204494@turing-police.cc.vt.edu> On Sat, 07 Apr 2018 20:27:49 +0300, "Yaron Daniel" said: > Main items to take into account : > 1) What is the latecny you have between the 2 main sites ? > 2) What network bandwidth between the 2 sites ? > 3) What is the latency to the 3rd site from each site ? > 4) Which protocols plan to be used ? Do you have layer2 between the 2 sites , or layer 3 ? > 5) Do you plan to use dedicated network for GPFS daemon ? The answers to most of these questions are a huge "it depends". For instance, the bandwidth needed is dictated by the amount of data being replicated. The cluster I mentioned the other day was filling most of a 10Gbit link while we were importing 5 petabytes of data from our old archive solution, but now often fits its replication needs inside a few hundred mbits/sec. Similarly, the answers to (4) and (5) will depend on what long-haul network infrastructure the customer already has or can purchase. If they have layer 2 capability between the sites, that's an option. If they've just got commodity layer-3, you're designing with layer 3 in mind. If their network has VLAN capability between the sites, or a dedicated link, that will affect the answer for (5). And so on... -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From vpuvvada at in.ibm.com Mon Apr 9 05:52:56 2018 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Mon, 9 Apr 2018 10:22:56 +0530 Subject: [gpfsug-discuss] AFM-DR Questions In-Reply-To: References: Message-ID: Hi, > - Any reason why we changed the Recovery point objective (RPO) snapshots by 15 minutes to 720 minutes in the version 5.0.0 of IBM Spectrum Scale AFM-DR? 
AFM DR doesn't require RPO snapshots for replication; the replication itself is continuous. Unless there is a need for crash-consistent snapshots (applications like databases need write ordering), a 15-minute RPO interval simply puts load on the system, as the snapshots have to be created and deleted every 15 minutes.

>- Can we use additional Independent Peer-snapshots to reduce the RPO interval (720 minutes) of IBM Spectrum Scale AFM-DR?

Yes, the command "mmpsnap --rpo" can be used to create RPO snapshots. Some users disable RPO on the filesets and use a cron job to create RPO snapshots as required.

>- In addition to the above question, can we use these snapshots to update the new primary site after a failover occurs, using the most up to date snapshot?

If applications can fail over to the live filesystem, it is not required to restore from the snapshot. Applications which need crash consistency will restore from the latest snapshot during failover. AFM DR maintains at most 2 RPO snapshots.

>- According to the documentation, we are not able to replicate Dependent filesets, but what if these dependent filesets are under an existing Independent fileset? Do you see any issues/concerns with this?

AFM DR doesn't support dependent filesets. Users won't be allowed to create them or convert them to an AFM DR fileset if they already exist.

~Venkat (vpuvvada at in.ibm.com)

From: "Delmar Demarchi" To: gpfsug-discuss at spectrumscale.org Date: 03/29/2018 07:12 PM Subject: [gpfsug-discuss] AFM-DR Questions Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hello experts. We have a Scale project with AFM-DR to be implemented and, after reading the KC documentation, we have some questions:
- Do you know any reason why the Recovery point objective (RPO) snapshot interval was changed from 15 to 720 minutes in version 5.0.0 of IBM Spectrum Scale AFM-DR?
- Can we use additional Independent Peer-snapshots to reduce the RPO interval (720 minutes) of IBM Spectrum Scale AFM-DR?
- In addition to the above question, can we use these snapshots to update the new primary site after a failover occurs, using the most up to date snapshot?
- According to the documentation, we are not able to replicate Dependent filesets, but what if these dependent filesets are part of an existing Independent fileset? Do you see any issues/concerns with this?

Thank you in advance.

Delmar Demarchi .'. (delmard at br.ibm.com)
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=ERiLT5aa1e1r1QyLkokJhA1Q5frqqgQ-g90JT0MGQvQ&s=KVjGaS1dG0luvtm0yh4rBpKNbUquTGuf2FSmaNBIOIM&e=
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From r.sobey at imperial.ac.uk Mon Apr 9 10:00:26 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 9 Apr 2018 09:00:26 +0000 Subject: [gpfsug-discuss] GUI not displaying node info correctly Message-ID:

Hi all,

We have a fairly significant number of filesets for which the GUI reports nothing at all in the "Max Inodes" column. I've verified with mmlsfileset -I that the inode limit is set. Has anyone seen this already and had a PMR for it? SS 4.2.3-7.

Thanks
Richard
-------------- next part -------------- An HTML attachment was scrubbed...
URL:

From stefan.roth at de.ibm.com Mon Apr 9 10:38:48 2018 From: stefan.roth at de.ibm.com (Stefan Roth) Date: Mon, 9 Apr 2018 11:38:48 +0200 Subject: [gpfsug-discuss] GUI not displaying node info correctly In-Reply-To: References: Message-ID:

Hello Richard,

this is a known GUI bug that will be fixed in 4.2.3-8. Once it is available, just upgrade the GUI rpm. The 4.2.3-8 PTF is not yet available, but it should be out in the next few days.

The problem affects all customers with more than 127 filesets: a max inodes value is shown for the first 127 filesets, but not for any newer filesets.
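Until the PTF arrives, the per-fileset limits are still visible from the CLI, so a quick workaround is to pull them from mmlsfileset instead of the GUI. A small sketch ("gpfs0" is just a placeholder file system name, and the exact column layout varies a little between releases, see the mmlsfileset man page):

  # List every fileset with its inode space details, including the maximum inodes
  # ("gpfs0" is a placeholder device name)
  /usr/lpp/mmfs/bin/mmlsfileset gpfs0 -L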
Mit freundlichen Grüßen / Kind regards

Stefan Roth
Spectrum Scale GUI Development
Phone: +49-7034-643-1362
E-Mail: stefan.roth at de.ibm.com
IBM Deutschland Research & Development GmbH, Am Weiher 24, 65451 Kelsterbach, Germany
Vorsitzender des Aufsichtsrats: Martina Koederitz / Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 09.04.2018 11:01 Subject: [gpfsug-discuss] GUI not displaying node info correctly Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi all,

We have a fairly significant number of filesets for which the GUI reports nothing at all in the "Max Inodes" column. I've verified with mmlsfileset -I that the inode limit is set. Has anyone seen this already and had a PMR for it? SS 4.2.3-7.

Thanks
Richard
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=FZXEKEUIKKfyO2oYq6outktXzRFzl0eKP2opGp7UNks&s=f3eT53kYib3aoHB5addQ_EyZRmCZM2gtiGsZj6aq2ZM&e=
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From r.sobey at imperial.ac.uk Mon Apr 9 11:19:50 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 9 Apr 2018 10:19:50 +0000 Subject: [gpfsug-discuss] GUI not displaying node info correctly In-Reply-To: References: Message-ID:

Thanks Stefan, very interesting.

Richard

From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Stefan Roth Sent: 09 April 2018 10:39 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GUI not displaying node info correctly

Hello Richard, this is a known GUI bug that will be fixed in 4.2.3-8. Once this is available, just upgrade the GUI rpm. [...]
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From john.hearns at asml.com Mon Apr 9 15:43:21 2018 From: john.hearns at asml.com (John Hearns) Date: Mon, 9 Apr 2018 14:43:21 +0000 Subject: [gpfsug-discuss] Installer cannot find libdbgwrapper70.so In-Reply-To: References: Message-ID:

And I have fixed my own issue... In the chroot environment:

  mount -t proc /proc /proc

Rookie mistake. Head hung in shame. But I beg forgiveness. My first Comp Sci lecturer, Jennifer Haselgrove at Glasgow, taught us an essential programming technique on day one: always discuss your program with your cat. Sit down with him or her, and talk them through the algorithm, and any bugs which you have. It is a very effective technique. I thank you all for being stand-in cats.

As an aside, I will not be at the London meeting next week. Would be good to put some faces to names, and to seek out beer. I am sure IBMers can point you all in the correct direction for that.
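For anyone else running the installer inside a chrooted image, a minimal sketch of the mounts worth having in place first (paths are placeholders for wherever the RHEL image lives; /sys and /dev are only needed if the installer or the Java runtime complains about them):

  # bind the pseudo filesystems into the chroot before starting the Scale installer
  # ("/mnt/image" is a placeholder for the chrooted RHEL 7.3 image)
  mount -t proc proc /mnt/image/proc
  mount --rbind /sys /mnt/image/sys
  mount --rbind /dev /mnt/image/dev
  chroot /mnt/image /bin/bash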
From john.hearns at asml.com Mon Apr 9 15:37:21 2018 From: john.hearns at asml.com (John Hearns) Date: Mon, 9 Apr 2018 14:37:21 +0000 Subject: [gpfsug-discuss] Installer cannot find libdbgwrapper70.so Message-ID:

I am running the Spectrum Scale install package on a chrooted image which is a RHEL 7.3 install (in text-only mode). It fails with:

  /usr/lpp/mmfs/4.2.3.7/ibm-java-x86_64-71/jre/bin/java: error while loading shared libraries: libdbgwrapper70.so: cannot open shared object file:

In the past I have fixed Java issues with the installer by using the 'alternatives' mechanism to switch to another Java. This time this does not work. Ideas please... and thank you in advance.
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From Kevin.Buterbaugh at Vanderbilt.Edu Mon Apr 9 18:17:52 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Mon, 9 Apr 2018 17:17:52 +0000 Subject: [gpfsug-discuss] GPFS GUI - DataPool_capUtil error Message-ID:

Hi All,

I'm pretty new to using the GPFS GUI for health and performance monitoring, but am finding it very useful. I've got an issue that I can't figure out. In my events I see:

Event name: pool-data_high_error
Component: File System    Entity type: Pool    Entity name:
Event time: 3/26/18 4:44:10 PM
Message: The pool of file system reached a nearly exhausted data level. DataPool_capUtil
Description: The pool reached a nearly exhausted level.
Cause: The pool reached a nearly exhausted level.
User action: Add more capacity to pool or move data to different pool or delete data and/or snapshots.
Reporting node:
Event type: Active health state of an entity which is monitored by the system.

Now this is for a "capacity" pool, i.e. one that mmapplypolicy is going to fill up to 97% full. Therefore, I've modified the thresholds:

### Threshold Rules ###
rule_name             metric                error  warn    direction  filterBy  groupBy                                             sensitivity
------------------------------------------------------------------------------------------------------------------------------------------------
InodeCapUtil_Rule     Fileset_inode         90.0   80.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_fset_name       300
MemFree_Rule          mem_memfree           50000  100000  low                  node                                                300
MetaDataCapUtil_Rule  MetaDataPool_capUtil  90.0   80.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name   300
DataCapUtil_Rule      DataPool_capUtil      99.0   90.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name   300

But it's still in an "Error" state. I see that the time of the event is March 26th at 4:44 PM, so I'm thinking this is something that's just stale, but I can't figure out how to clear it.
The mmhealth command shows the error, too, and from that message it appears as if the event was triggered prior to my adjusting the thresholds:

Event                  Parameter  Severity  Active Since         Event Message
------------------------------------------------------------------------------------------------------------------------------------------------
pool-data_high_error   redacted   ERROR     2018-03-26 16:44:10  The pool redacted of file system redacted reached a nearly exhausted data level. 90.0

What do I need to do to get the GUI / mmhealth to recognize the new thresholds and clear this error? I've searched and searched in the GUI for a way to clear it. I've read the "Monitoring and Managing IBM Spectrum Scale Using the GUI" redbook pretty much cover to cover and haven't found anything there about how to clear this.

Thanks...

Kevin

Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From Robert.Oesterlin at nuance.com Mon Apr 9 18:20:38 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 9 Apr 2018 17:20:38 +0000 Subject: [gpfsug-discuss] Reminder: SSUG-US Spring meeting - May 16-17th, Cambridge, Ma Message-ID:

Only a little over a month away! The registration for the Spring meeting of the SSUG-USA is now open. This is a free two-day event and will include a large number of Spectrum Scale updates and breakout tracks. We have limited meeting space, so please register early if you plan on attending.

Registration and agenda details:
https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2018-meeting-tickets-43662759489

DATE AND TIME
Wed, May 16, 2018, 9:00 AM - Thu, May 17, 2018, 5:00 PM EDT

LOCATION
IBM Cambridge Innovation Center
One Rogers Street
Cambridge, MA 02142-1203

Bob Oesterlin
Sr Principal Storage Engineer, Nuance
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From nick.savva at adventone.com Mon Apr 9 23:51:05 2018 From: nick.savva at adventone.com (Nick Savva) Date: Mon, 9 Apr 2018 22:51:05 +0000 Subject: [gpfsug-discuss] Device mapper Message-ID:

Hi all,

Apologies in advance if this has been covered already in discussions.

I'm building a new Spectrum Scale cluster and I am trying to get consistent device names across all nodes. I am attempting to use aliases in the multipath.conf, which actually works and creates the /dev/mapper/ link. I understand you can also copy the bindings file, but I think aliases are probably easier to maintain.

However Spectrum Scale will not accept the /dev/mapper device; it only looks for dm-X devices that are in the /proc/partitions file. I know SONAS is pointing to device mapper, so there must be a way?

I'm looking at /var/mmfs/etc/nsddevices - is it a case of editing this file to find the /dev/mapper device?

Appreciate the help in advance,
Nick
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From bbanister at jumptrading.com Tue Apr 10 01:04:12 2018 From: bbanister at jumptrading.com (Bryan Banister) Date: Tue, 10 Apr 2018 00:04:12 +0000 Subject: [gpfsug-discuss] Device mapper In-Reply-To: References: Message-ID: <6c952e81c58940a19114ee1c976501e0 at jumptrading.com>

Hi Nick,

You are correct. You need to update the nsddevices file to look in /dev/mapper.
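A rough sketch of what that can look like (this assumes the multipath aliases in multipath.conf are named nsd<something>; adjust the pattern to your own naming):

  # /var/mmfs/etc/nsddevices -- sketch only
  # This file is pulled in by /usr/lpp/mmfs/bin/mmdevdiscover, which is why it
  # ends with "return" rather than "exit".
  cd /dev
  for dev in mapper/nsd*
  do
    [ -e "$dev" ] && echo "$dev dmm"   # dmm = device-mapper multipath device type
  done
  # return 0 means "use only the devices echoed above"; return 1 would add them
  # to the normal GPFS device discovery instead
  return 0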
Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Nick Savva Sent: Monday, April 09, 2018 5:51 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Device mapper Note: External Email ________________________________ Hi all, Apologies in advance if this has been covered already in discussions. I'm building a new spectrum scale cluster and I am trying to get consistent device names across all nodes. I am attempting to use aliases in the multipath.conf which actually works and creates the /dev/mapper/ link. I understand you can also copy the bindings file but I think aliases is probably easier to maintain. However Spectrum scale will not accept the /dev/mapper device it only looks for dm-X devices that are in the /proc/partitions file. I know SONAS is pointing to Device mapper so there must be a way? Im looking at the /var/mmfs/etc/nsddevices is it a case of editing this file to find the /dev/mapper device? Appreciate the help in advance, Nick ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Tue Apr 10 03:27:13 2018 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Tue, 10 Apr 2018 02:27:13 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 2 In-Reply-To: <026b2aa97247b551b28ea13678484a4b@webmail.gpfsug.org> Message-ID: Claire/ Richard et al. The link works for me also, but I agree that the URL is complex and ugly. I am sure there must be a simpler URL with less embedded metadata that could be used? eg. Cutting it down to this appears to still work: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales +44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 3 Apr 2018, at 04:56, Secretary GPFS UG wrote: > > Hi Richard, > > My apologies, that is strange. This is the link and I have checked it works: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP > > If you're still having problems or require further information, please send an e-mail to justine_ive at uk.ibm.com > > Many thanks, > > --- > Claire O'Toole > Spectrum Scale/GPFS User Group Secretary > +44 (0)7508 033896 > www.spectrumscaleug.org >> On , Richard Booth wrote: >> >> Hi Claire >> >> The link at the bottom of your email, doesn't appear to be working. 
>> >> Richard >> >>> On 3 April 2018 at 12:00, wrote: >>> Send gpfsug-discuss mailing list submissions to >>> gpfsug-discuss at spectrumscale.org >>> >>> To subscribe or unsubscribe via the World Wide Web, visit >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> or, via email, send a message with subject or body 'help' to >>> gpfsug-discuss-request at spectrumscale.org >>> >>> You can reach the person managing the list at >>> gpfsug-discuss-owner at spectrumscale.org >>> >>> When replying, please edit your Subject line so it is more specific >>> than "Re: Contents of gpfsug-discuss digest..." >>> >>> >>> Today's Topics: >>> >>> 1. Transforming Workflows at Scale (Secretary GPFS UG) >>> >>> >>> ---------------------------------------------------------------------- >>> >>> Message: 1 >>> Date: Tue, 03 Apr 2018 11:41:41 +0100 >>> From: Secretary GPFS UG >>> To: gpfsug main discussion list >>> Subject: [gpfsug-discuss] Transforming Workflows at Scale >>> Message-ID: <037f89ab466334f83f235f357111a9d6 at webmail.gpfsug.org> >>> Content-Type: text/plain; charset="us-ascii" >>> >>> >>> >>> Dear all, >>> >>> There's a Spectrum Scale for media breakfast briefing event being >>> organised by IBM at IBM South Bank, London on 17th April (the day before >>> the next UK meeting). >>> >>> The event has been designed for broadcasters, post production houses and >>> visual effects organisations, where managing workflows between different >>> islands of technology is a major challenge. >>> >>> If you're interested, you can read more and register at the IBM >>> Registration Page [1]. >>> >>> Thanks, >>> -- >>> >>> Claire O'Toole >>> Spectrum Scale/GPFS User Group Secretary >>> +44 (0)7508 033896 >>> www.spectrumscaleug.org >>> >>> >>> Links: >>> ------ >>> [1] >>> https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP >>> -------------- next part -------------- >>> An HTML attachment was scrubbed... >>> URL: >>> >>> ------------------------------ >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> End of gpfsug-discuss Digest, Vol 75, Issue 2 >>> ********************************************* >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=3aZjjrv3ym45au9B33YgmVP51qvaHXYad4WRjccMOdk&s=rnsXK8Eibl0HLAElxCQexfrV8ReoB8hOYlkk3PmhqN4&e= Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From tortay at cc.in2p3.fr Tue Apr 10 06:51:25 2018 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Tue, 10 Apr 2018 07:51:25 +0200 Subject: [gpfsug-discuss] Device mapper In-Reply-To: References: Message-ID: <0b9f3629-146f-3720-fda8-3d51c0c37614 at cc.in2p3.fr>

On 10/04/2018 00:51, Nick Savva wrote:
> Hi all,
> [...]
> However Spectrum scale will not accept the /dev/mapper device it only looks for dm-X devices that are in the /proc/partitions file. I know SONAS is pointing to Device mapper so there must be a way?
>
> Im looking at the /var/mmfs/etc/nsddevices is it a case of editing this file to find the /dev/mapper device?
>
Hello,
We're doing this, indeed, using the "nsddevices" script. The names printed by the script must be relative to "/dev".

Our script contains the following (our multipath aliases are "nsdXY"):

  cd /dev && for nsd in mapper/nsd* ; do
    [ -e $nsd ] && echo "$nsd dmm"
  done
  return 0

The meaning of "dmm" is described in "/usr/lpp/mmfs/bin/mmdevdiscover".

Loïc.
--
| Loïc Tortay - IN2P3 Computing Centre |

From rohwedder at de.ibm.com Tue Apr 10 08:57:44 2018 From: rohwedder at de.ibm.com (Markus Rohwedder) Date: Tue, 10 Apr 2018 09:57:44 +0200 Subject: [gpfsug-discuss] GPFS GUI - DataPool_capUtil error In-Reply-To: References: Message-ID:

Hello Kevin,

it could be that the "hysteresis" parameter is still set to a non-zero value. You can check by using the mmhealth thresholds list --verbose command, or of course by using the Monitor>Thresholds page.
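If hysteresis does turn out to be the problem, re-creating the rule with hysteresis set to 0 should let the event clear as soon as the next sample is below the new limits. A rough sketch from memory, so please double-check the exact options against the mmhealth man page on your code level before running anything:

  # show the full rule definitions, including the hysteresis column
  mmhealth thresholds list --verbose

  # drop the old data pool rule and re-add it with the new limits and no hysteresis
  mmhealth thresholds delete DataCapUtil_Rule
  mmhealth thresholds add DataPool_capUtil --errorlevel 99.0 --warnlevel 90.0 \
      --direction high --hysteresis 0 --sensitivity 300 \
      --groupby gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name --name DataCapUtil_Rule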
Mit freundlichen Grüßen / Kind regards

Dr. Markus Rohwedder
Spectrum Scale GUI Development
Phone: +49 7034 6430190   IBM Deutschland Research & Development
E-Mail: rohwedder at de.ibm.com   Am Weiher 24, 65451 Kelsterbach, Germany

From: "Buterbaugh, Kevin L" To: gpfsug main discussion list Date: 09.04.2018 19:18 Subject: [gpfsug-discuss] GPFS GUI - DataPool_capUtil error Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi All,

I'm pretty new to using the GPFS GUI for health and performance monitoring, but am finding it very useful. I've got an issue that I can't figure out. [...] What do I need to do to get the GUI / mmhealth to recognize the new thresholds and clear this error? I've searched and searched in the GUI for a way to clear it. [...]

Thanks...

Kevin
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=l6AoS-QQpHgDtZkWluGw6Lln0PEOyUeS1ujJR2o1Hjg&s=X6bQXF1YmSSq1QyOkQXHYF1NMhczdJSPtWL4fpjbZ24&e=
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From r.sobey at imperial.ac.uk Tue Apr 10 09:55:30 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 10 Apr 2018 08:55:30 +0000 Subject: [gpfsug-discuss] CES SMB export limit Message-ID:

Is there a limit to the number of SMB exports we can create in CES? Figures being thrown around here suggest 256 but we'd like to know for sure.

Thanks
Richard
-------------- next part -------------- An HTML attachment was scrubbed...
URL: From jroche at lenovo.com Tue Apr 10 11:13:49 2018 From: jroche at lenovo.com (Jim Roche) Date: Tue, 10 Apr 2018 10:13:49 +0000 Subject: [gpfsug-discuss] Transforming Workflows at Scale In-Reply-To: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> References: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> Message-ID: Hi Claire, Can I add a registration to the Lenovo listing please? One of our technical architects from Israel would like to attend the event. Can we add: Gilad Berman HPC Architect Lenovo EMEA [Phone]+972-52-2554262 [Email]gberman at lenovo.com To the Attendee list? Thanks, Jim [http://lenovocentral.lenovo.com/marketing/branding/email_signature/images/gradient.gif] Jim Roche UK HPC Technical Sales Leader Discovery House 18 Bartley Wood Business Park Hook, RG27 9XA Lenovo United Kingdom [Phone]+44 (0)7702 678579 [Email]jroche at lenovo.com Lenovo.com /uk Twitter | Facebook | Instagram | Blogs | Forums [DifferentBetter-Laser] From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Secretary GPFS UG Sent: Tuesday, April 3, 2018 11:42 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Transforming Workflows at Scale Dear all, There's a Spectrum Scale for media breakfast briefing event being organised by IBM at IBM South Bank, London on 17th April (the day before the next UK meeting). The event has been designed for broadcasters, post production houses and visual effects organisations, where managing workflows between different islands of technology is a major challenge. If you're interested, you can read more and register at the IBM Registration Page. Thanks, -- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 92 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 128 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 1899 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.gif Type: image/gif Size: 7770 bytes Desc: image004.gif URL: From jroche at lenovo.com Tue Apr 10 11:30:37 2018 From: jroche at lenovo.com (Jim Roche) Date: Tue, 10 Apr 2018 10:30:37 +0000 Subject: [gpfsug-discuss] Transforming Workflows at Scale In-Reply-To: References: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> Message-ID: Hi All, sorry for the spam?. Finger troubles. ? Jim [http://lenovocentral.lenovo.com/marketing/branding/email_signature/images/gradient.gif] Jim Roche UK HPC Technical Sales Leader Discovery House 18 Bartley Wood Business Park Hook, RG27 9XA Lenovo United Kingdom [Phone]+44 (0)7702 678579 [Email]jroche at lenovo.com Lenovo.com /uk Twitter | Facebook | Instagram | Blogs | Forums [DifferentBetter-Laser] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 1899 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image005.gif Type: image/gif Size: 92 bytes Desc: image005.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image006.gif Type: image/gif Size: 128 bytes Desc: image006.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image007.gif Type: image/gif Size: 7770 bytes Desc: image007.gif URL: From carlz at us.ibm.com Tue Apr 10 16:33:54 2018 From: carlz at us.ibm.com (Carl Zetie) Date: Tue, 10 Apr 2018 15:33:54 +0000 Subject: [gpfsug-discuss] CES SMB export limit In-Reply-To: References: Message-ID: Hi Richard, KC says "IBM Spectrum Scale? can host a maximum of 1,000 SMB shares. There must be less than 3,000 SMB connections per protocol node and less than 20,000 SMB connections across all protocol nodes." Are those the numbers you are looking for? Carl Zetie Offering Manager for Spectrum Scale, IBM (540) 882 9353 ][ Research Triangle Park carlz at us.ibm.com From aaron.s.knister at nasa.gov Tue Apr 10 17:00:09 2018 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Tue, 10 Apr 2018 16:00:09 +0000 Subject: [gpfsug-discuss] Confusing I/O Behavior Message-ID: I hate admitting this but I?ve found something that?s got me stumped. We have a user running an MPI job on the system. Each rank opens up several output files to which it writes ASCII debug information. The net result across several hundred ranks is an absolute smattering of teeny tiny I/o requests to te underlying disks which they don?t appreciate. Performance plummets. The I/o requests are 30 to 80 bytes in size. What I don?t understand is why these write requests aren?t getting batched up into larger write requests to the underlying disks. If I do something like ?df if=/dev/zero of=foo bs=8k? on a node I see that the nasty unaligned 8k io requests are batched up into nice 1M I/o requests before they hit the NSD. As best I can tell the application isn?t doing any fsync?s and isn?t doing direct io to these files. Can anyone explain why seemingly very similar io workloads appear to result in well formed NSD I/O in one case and awful I/o in another? Thanks! -Stumped -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Apr 10 17:22:46 2018 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 10 Apr 2018 12:22:46 -0400 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: I wonder if this is an artifact of pagepool exhaustion which makes me ask the question-- how do I see how much of the pagepool is in use and by what? I've looked at mmfsadm dump and mmdiag --memory and neither has provided me the information I'm looking for (or at least not in a format I understand). -Aaron On 4/10/18 12:00 PM, Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP] wrote: > I hate admitting this but I?ve found something that?s got me stumped. > > We have a user running an MPI job on the system. Each rank opens up > several output files to which it writes ASCII debug information. The net > result across several hundred ranks is an absolute smattering of teeny > tiny I/o requests to te underlying disks which they don?t appreciate. > Performance plummets. The I/o requests are 30 to 80 bytes in size. What > I don?t understand is why these write requests aren?t getting batched up > into larger write requests to the underlying disks. > > If I do something like ?df if=/dev/zero of=foo bs=8k? 
on a node I see > that the nasty unaligned 8k io requests are batched up into nice 1M I/o > requests before they hit the NSD. > > As best I can tell the application isn?t doing any fsync?s and isn?t > doing direct io to these files. > > Can anyone explain why seemingly very similar io workloads appear to > result in well formed NSD I/O in one case and awful I/o in another? > > Thanks! > > -Stumped > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From makaplan at us.ibm.com Tue Apr 10 17:28:29 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 10 Apr 2018 12:28:29 -0400 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cphoffma at uoregon.edu Tue Apr 10 17:18:49 2018 From: cphoffma at uoregon.edu (Chris Hoffman) Date: Tue, 10 Apr 2018 16:18:49 +0000 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: <1523377129792.79060@uoregon.edu> ?Hi Stumped, Is this MPI job on one machine? Multiple nodes? Are the tiny 8K writes to the same file or different ones? Chris ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP] Sent: Tuesday, April 10, 2018 9:00 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Confusing I/O Behavior I hate admitting this but I've found something that's got me stumped. We have a user running an MPI job on the system. Each rank opens up several output files to which it writes ASCII debug information. The net result across several hundred ranks is an absolute smattering of teeny tiny I/o requests to te underlying disks which they don't appreciate. Performance plummets. The I/o requests are 30 to 80 bytes in size. What I don't understand is why these write requests aren't getting batched up into larger write requests to the underlying disks. If I do something like "df if=/dev/zero of=foo bs=8k" on a node I see that the nasty unaligned 8k io requests are batched up into nice 1M I/o requests before they hit the NSD. As best I can tell the application isn't doing any fsync's and isn't doing direct io to these files. Can anyone explain why seemingly very similar io workloads appear to result in well formed NSD I/O in one case and awful I/o in another? Thanks! -Stumped -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Apr 10 17:52:30 2018 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 10 Apr 2018 12:52:30 -0400 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: <1523377129792.79060@uoregon.edu> References: <1523377129792.79060@uoregon.edu> Message-ID: Chris, The job runs across multiple nodes and the tinky 8K writes *should* be to different files that are unique per-rank. -Aaron On 4/10/18 12:18 PM, Chris Hoffman wrote: > ?Hi Stumped, > > > Is this MPI job on one machine? Multiple nodes? Are the tiny 8K writes > to the same file or different ones? 
> > > Chris > > ------------------------------------------------------------------------ > *From:* gpfsug-discuss-bounces at spectrumscale.org > on behalf of Knister, Aaron > S. (GSFC-606.2)[COMPUTER SCIENCE CORP] > *Sent:* Tuesday, April 10, 2018 9:00 AM > *To:* gpfsug main discussion list > *Subject:* [gpfsug-discuss] Confusing I/O Behavior > I hate admitting this but I?ve found something that?s got me stumped. > > We have a user running an MPI job on the system. Each rank opens up > several output files to which it writes ASCII debug information. The net > result across several hundred ranks is an absolute smattering of teeny > tiny I/o requests to te underlying disks which they don?t appreciate. > Performance plummets. The I/o requests are 30 to 80 bytes in size. What > I don?t understand is why these write requests aren?t getting batched up > into larger write requests to the underlying disks. > > If I do something like ?df if=/dev/zero of=foo bs=8k? on a node I see > that the nasty unaligned 8k io requests are batched up into nice 1M I/o > requests before they hit the NSD. > > As best I can tell the application isn?t doing any fsync?s and isn?t > doing direct io to these files. > > Can anyone explain why seemingly very similar io workloads appear to > result in well formed NSD I/O in one case and awful I/o in another? > > Thanks! > > -Stumped > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From UWEFALKE at de.ibm.com Tue Apr 10 22:43:30 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Tue, 10 Apr 2018 23:43:30 +0200 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Hi Aaron, to how many different files do these tiny I/O requests go? Mind that the write aggregates the I/O over a limited time (5 secs or so) and ***per file***. It is for that matter a large difference to write small chunks all to one file or to a large number of individual files . to fill a 1 MiB buffer you need about 13100 chunks of 80Bytes ***per file*** within those 5 secs. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" To: gpfsug main discussion list Date: 10/04/2018 18:09 Subject: [gpfsug-discuss] Confusing I/O Behavior Sent by: gpfsug-discuss-bounces at spectrumscale.org I hate admitting this but I?ve found something that?s got me stumped. We have a user running an MPI job on the system. Each rank opens up several output files to which it writes ASCII debug information. 
The net result across several hundred ranks is an absolute smattering of teeny tiny I/o requests to te underlying disks which they don?t appreciate. Performance plummets. The I/o requests are 30 to 80 bytes in size. What I don?t understand is why these write requests aren?t getting batched up into larger write requests to the underlying disks. If I do something like ?df if=/dev/zero of=foo bs=8k? on a node I see that the nasty unaligned 8k io requests are batched up into nice 1M I/o requests before they hit the NSD. As best I can tell the application isn?t doing any fsync?s and isn?t doing direct io to these files. Can anyone explain why seemingly very similar io workloads appear to result in well formed NSD I/O in one case and awful I/o in another? Thanks! -Stumped _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From r.sobey at imperial.ac.uk Wed Apr 11 09:22:12 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 11 Apr 2018 08:22:12 +0000 Subject: [gpfsug-discuss] CES SMB export limit In-Reply-To: References: Message-ID: Just the 1000 SMB shares limit was what I wanted but the other info was useful, thanks Carl. Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Carl Zetie Sent: 10 April 2018 16:34 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] CES SMB export limit Hi Richard, KC says "IBM Spectrum Scale? can host a maximum of 1,000 SMB shares. There must be less than 3,000 SMB connections per protocol node and less than 20,000 SMB connections across all protocol nodes." Are those the numbers you are looking for? Carl Zetie Offering Manager for Spectrum Scale, IBM (540) 882 9353 ][ Research Triangle Park carlz at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathan.buzzard at strath.ac.uk Wed Apr 11 11:14:21 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 11 Apr 2018 11:14:21 +0100 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: <1523441661.19449.153.camel@strath.ac.uk> On Tue, 2018-04-10 at 23:43 +0200, Uwe Falke wrote: > Hi Aaron,? > to how many different files do these tiny I/O requests go? > > Mind that the write aggregates the I/O over a limited time (5 secs or > so)?and ***per file***.? > It is for that matter a large difference to write small chunks all to > one? > file or to a large number of individual files . > to fill a??1 MiB buffer you need about 13100 chunks of??80Bytes > ***per? > file*** within those 5 secs.? > Something else to bear in mind is that you might be using a library that converts everything into putchar's. I have seen this in the past with Office on a Mac platform and made performance saving a file over SMB/NFS appalling. I mean really really bad, a?"save as" which didn't do that would take a second or two, a save would take like 15 minutes. To the local disk it was just fine. The GPFS angle is this was all on a self rolled clustered Samba GPFS setup back in the day. Took a long time to track down, and performance turned out to be just as appalling with a real Windows file server. JAB. -- Jonathan A. Buzzard?????????????????????????Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG From UWEFALKE at de.ibm.com Wed Apr 11 11:53:36 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Wed, 11 Apr 2018 12:53:36 +0200 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: <1523441661.19449.153.camel@strath.ac.uk> References: <1523441661.19449.153.camel@strath.ac.uk> Message-ID: It would be interesting in which chunks data arrive at the NSDs -- if those chunks are bigger than the individual I/Os (i.e. multiples of the record sizes), there is some data coalescing going on and it just needs to have its path well paved ... If not, there might be indeed something odd in the configuration. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 gpfsug-discuss-bounces at spectrumscale.org wrote on 11/04/2018 12:14:21: > From: Jonathan Buzzard > To: gpfsug main discussion list > Date: 11/04/2018 12:14 > Subject: Re: [gpfsug-discuss] Confusing I/O Behavior > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > On Tue, 2018-04-10 at 23:43 +0200, Uwe Falke wrote: > > Hi Aaron, > > to how many different files do these tiny I/O requests go? > > > > Mind that the write aggregates the I/O over a limited time (5 secs or > > so) and ***per file***. > > It is for that matter a large difference to write small chunks all to > > one > > file or to a large number of individual files . > > to fill a 1 MiB buffer you need about 13100 chunks of 80Bytes > > ***per > > file*** within those 5 secs. > > > > Something else to bear in mind is that you might be using a library > that converts everything into putchar's. I have seen this in the past > with Office on a Mac platform and made performance saving a file over > SMB/NFS appalling. I mean really really bad, a "save as" which didn't > do that would take a second or two, a save would take like 15 minutes. > To the local disk it was just fine. > > The GPFS angle is this was all on a self rolled clustered Samba GPFS > setup back in the day. Took a long time to track down, and performance > turned out to be just as appalling with a real Windows file server. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From peserocka at gmail.com Wed Apr 11 12:06:40 2018 From: peserocka at gmail.com (Peter Serocka) Date: Wed, 11 Apr 2018 13:06:40 +0200 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Let?s keep in mind that line buffering is a concept within the standard C library; if every log line triggers one write(2) system call, and it?s not direct io, then multiple write still get coalesced into few larger disk writes (as with the dd example). A logging application might choose to close(2) a log file after each write(2) ? that produces a different scenario, where the file system might guarantee that the data has been written to disk when close(2) return a success. (Local Linux file systems do not do this with default mounts, but networked filesystems usually do.) Aaron, can you trace your application to see what is going on in terms of system calls? ? Peter > On 2018 Apr 10 Tue, at 18:28, Marc A Kaplan wrote: > > Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chair at spectrumscale.org Wed Apr 11 12:21:04 2018 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Wed, 11 Apr 2018 12:21:04 +0100 Subject: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale Message-ID: Hi All, At the UK meeting next week, we?ve had a speaker slot become available, we?re planning to put in a BoF type session on tooling Spectrum Scale so we have space for a few 3-5 minute quick talks on what people are doing to automate. If you are coming along and interested, please drop me an email. Max of 3 slides! Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Wed Apr 11 15:36:29 2018 From: valleru at cbio.mskcc.org (Lohit Valleru) Date: Wed, 11 Apr 2018 10:36:29 -0400 Subject: [gpfsug-discuss] GPFS, MMAP and Pagepool In-Reply-To: References: Message-ID: Hey Sven, This is regarding mmap issues and GPFS. We had discussed previously of experimenting with GPFS 5. I now have upgraded all of compute nodes and NSD nodes to GPFS 5.0.0.2 I am yet to experiment with mmap performance, but before that - I am seeing weird hangs with GPFS 5 and I think it could be related to mmap. Have you seen GPFS ever hang on this syscall? [Tue Apr 10 04:20:13 2018] [] _ZN10gpfsNode_t8mmapLockEiiPKj+0xb5/0x140 [mmfs26] I see the above ,when kernel hangs and throws out a series of trace calls. I somehow think the above trace is related to processes hanging on GPFS forever. There are no errors in GPFS however. Also, I think the above happens only when the mmap threads go above a particular number. We had faced a similar issue in 4.2.3 and it was resolved in a patch to 4.2.3.2 . At that time , the issue happened when mmap threads go more than worker1threads. According to the ticket - it was a mmap race condition that GPFS was not handling well. I am not sure if this issue is a repeat and I am yet to isolate the incident and test with increasing number of mmap threads. 
I am not 100 percent sure if this is related to mmap yet but just wanted to ask you if you have seen anything like above. Thanks, Lohit On Feb 22, 2018, 3:59 PM -0500, Sven Oehme , wrote: > Hi Lohit, > > i am working with ray on a mmap performance improvement right now, which most likely has the same root cause as yours , see -->??http://gpfsug.org/pipermail/gpfsug-discuss/2018-January/004411.html > the thread above is silent after a couple of back and rorth, but ray and i have active communication in the background and will repost as soon as there is something new to share. > i am happy to look at this issue after we finish with ray's workload if there is something missing, but first let's finish his, get you try the same fix and see if there is something missing. > > btw. if people would share their use of MMAP , what applications they use (home grown, just use lmdb which uses mmap under the cover, etc) please let me know so i get a better picture on how wide the usage is with GPFS. i know a lot of the ML/DL workloads are using it, but i would like to know what else is out there i might not think about. feel free to drop me a personal note, i might not reply to it right away, but eventually. > > thx. sven > > > > On Thu, Feb 22, 2018 at 12:33 PM wrote: > > > Hi all, > > > > > > I wanted to know, how does mmap interact with GPFS pagepool with respect to filesystem block-size? > > > Does the efficiency depend on the mmap read size and the block-size of the filesystem even if all the data is cached in pagepool? > > > > > > GPFS 4.2.3.2 and CentOS7. > > > > > > Here is what i observed: > > > > > > I was testing a user script that uses mmap to read from 100M to 500MB files. > > > > > > The above files are stored on 3 different filesystems. > > > > > > Compute nodes - 10G pagepool and 5G seqdiscardthreshold. > > > > > > 1. 4M block size GPFS filesystem, with separate metadata and data. Data on Near line and metadata on SSDs > > > 2. 1M block size GPFS filesystem as a AFM cache cluster, "with all the required files fully cached" from the above GPFS cluster as home. Data and Metadata together on SSDs > > > 3. 16M block size GPFS filesystem, with separate metadata and data. Data on Near line and metadata on SSDs > > > > > > When i run the script first time for ?each" filesystem: > > > I see that GPFS reads from the files, and caches into the pagepool as it reads, from mmdiag -- iohist > > > > > > When i run the second time, i see that there are no IO requests from the compute node to GPFS NSD servers, which is expected since all the data from the 3 filesystems is cached. > > > > > > However - the time taken for the script to run for the files in the 3 different filesystems is different - although i know that they are just "mmapping"/reading from pagepool/cache and not from disk. > > > > > > Here is the difference in time, for IO just from pagepool: > > > > > > 20s 4M block size > > > 15s 1M block size > > > 40S 16M block size. > > > > > > Why do i see a difference when trying to mmap reads from different block-size filesystems, although i see that the IO requests are not hitting disks and just the pagepool? > > > > > > I am willing to share the strace output and mmdiag outputs if needed. 
> > > > > > Thanks, > > > Lohit > > > > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Apr 11 17:51:33 2018 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 11 Apr 2018 16:51:33 +0000 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Just another thought here. If the debug output files fit in an inode, then these would be handled as metadata updates to the inode, which is typically much smaller than the file system blocksize. Looking at my storage that handles GPFS metadata shows avg KiB/IO at a horrendous 5-12 KiB! HTH, -B -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Peter Serocka Sent: Wednesday, April 11, 2018 6:07 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Confusing I/O Behavior Note: External Email ------------------------------------------------- Let?s keep in mind that line buffering is a concept within the standard C library; if every log line triggers one write(2) system call, and it?s not direct io, then multiple write still get coalesced into few larger disk writes (as with the dd example). A logging application might choose to close(2) a log file after each write(2) ? that produces a different scenario, where the file system might guarantee that the data has been written to disk when close(2) return a success. (Local Linux file systems do not do this with default mounts, but networked filesystems usually do.) Aaron, can you trace your application to see what is going on in terms of system calls? ? Peter > On 2018 Apr 10 Tue, at 18:28, Marc A Kaplan wrote: > > Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. 
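A quick way to check the point above about small debug files living in the inode; a sketch only, with an illustrative file system name (gpfs01) and path:

    # inode size configured for the file system; files somewhat smaller than this
    # (minus inode overhead) can be stored entirely in the inode
    mmlsfs gpfs01 -i

    # a tiny file that fits in the inode typically shows 0 allocated blocks
    echo "tiny debug record" > /gpfs/gpfs01/tiny.log
    stat -c '%s bytes, %b blocks' /gpfs/gpfs01/tiny.log

Writes to such files are served from the metadata/system pool, which is why they show up as small metadata I/Os rather than full data blocks.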
From makaplan at us.ibm.com Wed Apr 11 18:23:02 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 11 Apr 2018 13:23:02 -0400 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Good point about "tiny" files going into the inode and system pool. Which reminds one: Generally a bad idea to store metadata in wide striping disk base RAID (Type 5 with spinning media) Do use SSD or similar for metadata. Consider smaller block size for metadata / system pool than regular file data. From: Bryan Banister To: gpfsug main discussion list Date: 04/11/2018 12:51 PM Subject: Re: [gpfsug-discuss] Confusing I/O Behavior Sent by: gpfsug-discuss-bounces at spectrumscale.org Just another thought here. If the debug output files fit in an inode, then these would be handled as metadata updates to the inode, which is typically much smaller than the file system blocksize. Looking at my storage that handles GPFS metadata shows avg KiB/IO at a horrendous 5-12 KiB! HTH, -B -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Peter Serocka Sent: Wednesday, April 11, 2018 6:07 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Confusing I/O Behavior Note: External Email ------------------------------------------------- Let?s keep in mind that line buffering is a concept within the standard C library; if every log line triggers one write(2) system call, and it?s not direct io, then multiple write still get coalesced into few larger disk writes (as with the dd example). A logging application might choose to close(2) a log file after each write(2) ? that produces a different scenario, where the file system might guarantee that the data has been written to disk when close(2) return a success. (Local Linux file systems do not do this with default mounts, but networked filesystems usually do.) Aaron, can you trace your application to see what is going on in terms of system calls? ? Peter > On 2018 Apr 10 Tue, at 18:28, Marc A Kaplan wrote: > > Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=xSLLpVdHbkGieYfTGPIJRMkA1AbwsYteS2lHR4_49ik&s=9BOhyKNgkkbcOv316JZXnRB4HpPK_x2hyLd0d_uLGos&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=xSLLpVdHbkGieYfTGPIJRMkA1AbwsYteS2lHR4_49ik&s=9BOhyKNgkkbcOv316JZXnRB4HpPK_x2hyLd0d_uLGos&e= ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. 
Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=xSLLpVdHbkGieYfTGPIJRMkA1AbwsYteS2lHR4_49ik&s=9BOhyKNgkkbcOv316JZXnRB4HpPK_x2hyLd0d_uLGos&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Apr 13 21:05:53 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Fri, 13 Apr 2018 20:05:53 +0000 Subject: [gpfsug-discuss] Replicated and non replicated data Message-ID: <98F781F7-7063-4293-A5BC-1E8F5A0C98EC@bham.ac.uk> I have a question about file-systems with replicated an non replicated data. We have a file-system where metadata is set to copies=2 and data copies=2, we then use a placement policy to selectively replicate some data only once based on file-set. We also place the non-replicated data into a specific pool (6tnlsas) to ensure we know where it is placed. My understanding was that in doing this, if we took the disks with the non replicated data offline, we?d still have the FS available for users as the metadata is replicated. Sure accessing a non-replicated data file would give an IO error, but the rest of the FS should be up. We had a situation today where we wanted to take stg01 offline today, so tried using mmchdisk stop -d ?. Once we got to about disk stg01-01_12_12, GPFS would refuse to stop any more disks and complain about too many disks, similarly if we shutdown the NSD servers hosting the disks, the filesystem would have an SGPanic and force unmount. First, am I correct in thinking that a FS with non-replicated data, but replicated metadata should still be accessible (not the non-replicated data) when the LUNS hosting it are down? If so, any suggestions why my FS is panic-ing when we take down the one set of disks? I thought at first we had some non-replicated metadata, tried a mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but this didn?t help. Running 5.0.0.2 on the NSD server nodes. 
(First time we went round this we didn?t have a FS descriptor disk, but you can see below that we added this) Thanks Simon [root at nsd01 ~]# mmlsdisk castles -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- CASTLES_GPFS_DESCONLY01 nsd 512 310 no no ready up 1 system desc stg01-01_3_3 nsd 4096 210 no yes ready down 4 6tnlsas stg01-01_4_4 nsd 4096 210 no yes ready down 5 6tnlsas stg01-01_5_5 nsd 4096 210 no yes ready down 6 6tnlsas stg01-01_6_6 nsd 4096 210 no yes ready down 7 6tnlsas stg01-01_7_7 nsd 4096 210 no yes ready down 8 6tnlsas stg01-01_8_8 nsd 4096 210 no yes ready down 9 6tnlsas stg01-01_9_9 nsd 4096 210 no yes ready down 10 6tnlsas stg01-01_10_10 nsd 4096 210 no yes ready down 11 6tnlsas stg01-01_11_11 nsd 4096 210 no yes ready down 12 6tnlsas stg01-01_12_12 nsd 4096 210 no yes ready down 13 6tnlsas stg01-01_13_13 nsd 4096 210 no yes ready down 14 6tnlsas stg01-01_14_14 nsd 4096 210 no yes ready down 15 6tnlsas stg01-01_15_15 nsd 4096 210 no yes ready down 16 6tnlsas stg01-01_16_16 nsd 4096 210 no yes ready down 17 6tnlsas stg01-01_17_17 nsd 4096 210 no yes ready down 18 6tnlsas stg01-01_18_18 nsd 4096 210 no yes ready down 19 6tnlsas stg01-01_19_19 nsd 4096 210 no yes ready down 20 6tnlsas stg01-01_20_20 nsd 4096 210 no yes ready down 21 6tnlsas stg01-01_21_21 nsd 4096 210 no yes ready down 22 6tnlsas stg01-01_ssd_54_54 nsd 4096 210 yes no ready down 23 system stg01-01_ssd_56_56 nsd 4096 210 yes no ready down 24 system stg02-01_0_0 nsd 4096 110 no yes ready up 25 6tnlsas stg02-01_1_1 nsd 4096 110 no yes ready up 26 6tnlsas stg02-01_2_2 nsd 4096 110 no yes ready up 27 6tnlsas stg02-01_3_3 nsd 4096 110 no yes ready up 28 6tnlsas stg02-01_4_4 nsd 4096 110 no yes ready up 29 6tnlsas stg02-01_5_5 nsd 4096 110 no yes ready up 30 6tnlsas stg02-01_6_6 nsd 4096 110 no yes ready up 31 6tnlsas stg02-01_7_7 nsd 4096 110 no yes ready up 32 6tnlsas stg02-01_8_8 nsd 4096 110 no yes ready up 33 6tnlsas stg02-01_9_9 nsd 4096 110 no yes ready up 34 6tnlsas stg02-01_10_10 nsd 4096 110 no yes ready up 35 6tnlsas stg02-01_11_11 nsd 4096 110 no yes ready up 36 6tnlsas stg02-01_12_12 nsd 4096 110 no yes ready up 37 6tnlsas stg02-01_13_13 nsd 4096 110 no yes ready up 38 6tnlsas stg02-01_14_14 nsd 4096 110 no yes ready up 39 6tnlsas stg02-01_15_15 nsd 4096 110 no yes ready up 40 6tnlsas stg02-01_16_16 nsd 4096 110 no yes ready up 41 6tnlsas stg02-01_17_17 nsd 4096 110 no yes ready up 42 6tnlsas stg02-01_18_18 nsd 4096 110 no yes ready up 43 6tnlsas stg02-01_19_19 nsd 4096 110 no yes ready up 44 6tnlsas stg02-01_20_20 nsd 4096 110 no yes ready up 45 6tnlsas stg02-01_21_21 nsd 4096 110 no yes ready up 46 6tnlsas stg02-01_ssd_22_22 nsd 4096 110 yes no ready up 47 system desc stg02-01_ssd_23_23 nsd 4096 110 yes no ready up 48 system stg02-01_ssd_24_24 nsd 4096 110 yes no ready up 49 system stg02-01_ssd_25_25 nsd 4096 110 yes no ready up 50 system stg01-01_22_22 nsd 4096 210 no yes ready up 51 6tnlsasnonrepl desc stg01-01_23_23 nsd 4096 210 no yes ready up 52 6tnlsasnonrepl stg01-01_24_24 nsd 4096 210 no yes ready up 53 6tnlsasnonrepl stg01-01_25_25 nsd 4096 210 no yes ready up 54 6tnlsasnonrepl stg01-01_26_26 nsd 4096 210 no yes ready up 55 6tnlsasnonrepl stg01-01_27_27 nsd 4096 210 no yes ready up 56 6tnlsasnonrepl stg01-01_31_31 nsd 4096 210 no yes ready up 58 6tnlsasnonrepl stg01-01_32_32 nsd 4096 210 no yes ready 
up 59 6tnlsasnonrepl stg01-01_33_33 nsd 4096 210 no yes ready up 60 6tnlsasnonrepl stg01-01_34_34 nsd 4096 210 no yes ready up 61 6tnlsasnonrepl stg01-01_35_35 nsd 4096 210 no yes ready up 62 6tnlsasnonrepl stg01-01_36_36 nsd 4096 210 no yes ready up 63 6tnlsasnonrepl stg01-01_37_37 nsd 4096 210 no yes ready up 64 6tnlsasnonrepl stg01-01_38_38 nsd 4096 210 no yes ready up 65 6tnlsasnonrepl stg01-01_39_39 nsd 4096 210 no yes ready up 66 6tnlsasnonrepl stg01-01_40_40 nsd 4096 210 no yes ready up 67 6tnlsasnonrepl stg01-01_41_41 nsd 4096 210 no yes ready up 68 6tnlsasnonrepl stg01-01_42_42 nsd 4096 210 no yes ready up 69 6tnlsasnonrepl stg01-01_43_43 nsd 4096 210 no yes ready up 70 6tnlsasnonrepl stg01-01_44_44 nsd 4096 210 no yes ready up 71 6tnlsasnonrepl stg01-01_45_45 nsd 4096 210 no yes ready up 72 6tnlsasnonrepl stg01-01_46_46 nsd 4096 210 no yes ready up 73 6tnlsasnonrepl stg01-01_47_47 nsd 4096 210 no yes ready up 74 6tnlsasnonrepl stg01-01_48_48 nsd 4096 210 no yes ready up 75 6tnlsasnonrepl stg01-01_49_49 nsd 4096 210 no yes ready up 76 6tnlsasnonrepl stg01-01_50_50 nsd 4096 210 no yes ready up 77 6tnlsasnonrepl stg01-01_51_51 nsd 4096 210 no yes ready up 78 6tnlsasnonrepl Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Fri Apr 13 21:17:11 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Fri, 13 Apr 2018 20:17:11 +0000 Subject: [gpfsug-discuss] [Replicated and non replicated data Message-ID: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> Add: unmountOnDiskFail=meta To your config. You can add it with ?-I? to have it take effect w/o reboot. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Simon Thompson (IT Research Support)" Reply-To: gpfsug main discussion list Date: Friday, April 13, 2018 at 3:06 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] [gpfsug-discuss] Replicated and non replicated data I have a question about file-systems with replicated an non replicated data. We have a file-system where metadata is set to copies=2 and data copies=2, we then use a placement policy to selectively replicate some data only once based on file-set. We also place the non-replicated data into a specific pool (6tnlsas) to ensure we know where it is placed. My understanding was that in doing this, if we took the disks with the non replicated data offline, we?d still have the FS available for users as the metadata is replicated. Sure accessing a non-replicated data file would give an IO error, but the rest of the FS should be up. We had a situation today where we wanted to take stg01 offline today, so tried using mmchdisk stop -d ?. Once we got to about disk stg01-01_12_12, GPFS would refuse to stop any more disks and complain about too many disks, similarly if we shutdown the NSD servers hosting the disks, the filesystem would have an SGPanic and force unmount. First, am I correct in thinking that a FS with non-replicated data, but replicated metadata should still be accessible (not the non-replicated data) when the LUNS hosting it are down? If so, any suggestions why my FS is panic-ing when we take down the one set of disks? I thought at first we had some non-replicated metadata, tried a mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but this didn?t help. Running 5.0.0.2 on the NSD server nodes. 
(First time we went round this we didn?t have a FS descriptor disk, but you can see below that we added this) Thanks Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From sxiao at us.ibm.com Sat Apr 14 02:42:28 2018 From: sxiao at us.ibm.com (Steve Xiao) Date: Fri, 13 Apr 2018 21:42:28 -0400 Subject: [gpfsug-discuss] Replicated and non replicated data In-Reply-To: References: Message-ID: What is your unmountOnDiskFail configuration setting on the cluster? You need to set unmountOnDiskFail to meta if you only have metadata replication. Steve Y. Xiao > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 13 Apr 2018 20:05:53 +0000 > From: "Simon Thompson (IT Research Support)" > To: "gpfsug-discuss at spectrumscale.org" > > Subject: [gpfsug-discuss] Replicated and non replicated data > Message-ID: <98F781F7-7063-4293-A5BC-1E8F5A0C98EC at bham.ac.uk> > Content-Type: text/plain; charset="utf-8" > > I have a question about file-systems with replicated an non replicated data. > > We have a file-system where metadata is set to copies=2 and data > copies=2, we then use a placement policy to selectively replicate > some data only once based on file-set. We also place the non- > replicated data into a specific pool (6tnlsas) to ensure we know > where it is placed. > > My understanding was that in doing this, if we took the disks with > the non replicated data offline, we?d still have the FS available > for users as the metadata is replicated. Sure accessing a non- > replicated data file would give an IO error, but the rest of the FS > should be up. > > We had a situation today where we wanted to take stg01 offline > today, so tried using mmchdisk stop -d ?. Once we got to about disk > stg01-01_12_12, GPFS would refuse to stop any more disks and > complain about too many disks, similarly if we shutdown the NSD > servers hosting the disks, the filesystem would have an SGPanic and > force unmount. > > First, am I correct in thinking that a FS with non-replicated data, > but replicated metadata should still be accessible (not the non- > replicated data) when the LUNS hosting it are down? > > If so, any suggestions why my FS is panic-ing when we take down the > one set of disks? > > I thought at first we had some non-replicated metadata, tried a > mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but > this didn?t help. > > Running 5.0.0.2 on the NSD server nodes. 
> > (First time we went round this we didn?t have a FS descriptor disk, > but you can see below that we added this) > > Thanks > > Simon > > [root at nsd01 ~]# mmlsdisk castles -L > disk driver sector failure holds holds storage > name type size group metadata data status > availability disk id pool remarks > ------------ -------- ------ ----------- -------- ----- > ------------- ------------ ------- ------------ --------- > CASTLES_GPFS_DESCONLY01 nsd 512 310 no no > ready up 1 system desc > stg01-01_3_3 nsd 4096 210 no yes ready > down 4 6tnlsas > stg01-01_4_4 nsd 4096 210 no yes ready > down 5 6tnlsas > stg01-01_5_5 nsd 4096 210 no yes ready > down 6 6tnlsas > stg01-01_6_6 nsd 4096 210 no yes ready > down 7 6tnlsas > stg01-01_7_7 nsd 4096 210 no yes ready > down 8 6tnlsas > stg01-01_8_8 nsd 4096 210 no yes ready > down 9 6tnlsas > stg01-01_9_9 nsd 4096 210 no yes ready > down 10 6tnlsas > stg01-01_10_10 nsd 4096 210 no yes ready > down 11 6tnlsas > stg01-01_11_11 nsd 4096 210 no yes ready > down 12 6tnlsas > stg01-01_12_12 nsd 4096 210 no yes ready > down 13 6tnlsas > stg01-01_13_13 nsd 4096 210 no yes ready > down 14 6tnlsas > stg01-01_14_14 nsd 4096 210 no yes ready > down 15 6tnlsas > stg01-01_15_15 nsd 4096 210 no yes ready > down 16 6tnlsas > stg01-01_16_16 nsd 4096 210 no yes ready > down 17 6tnlsas > stg01-01_17_17 nsd 4096 210 no yes ready > down 18 6tnlsas > stg01-01_18_18 nsd 4096 210 no yes ready > down 19 6tnlsas > stg01-01_19_19 nsd 4096 210 no yes ready > down 20 6tnlsas > stg01-01_20_20 nsd 4096 210 no yes ready > down 21 6tnlsas > stg01-01_21_21 nsd 4096 210 no yes ready > down 22 6tnlsas > stg01-01_ssd_54_54 nsd 4096 210 yes no ready > down 23 system > stg01-01_ssd_56_56 nsd 4096 210 yes no ready > down 24 system > stg02-01_0_0 nsd 4096 110 no yes ready > up 25 6tnlsas > stg02-01_1_1 nsd 4096 110 no yes ready > up 26 6tnlsas > stg02-01_2_2 nsd 4096 110 no yes ready > up 27 6tnlsas > stg02-01_3_3 nsd 4096 110 no yes ready > up 28 6tnlsas > stg02-01_4_4 nsd 4096 110 no yes ready > up 29 6tnlsas > stg02-01_5_5 nsd 4096 110 no yes ready > up 30 6tnlsas > stg02-01_6_6 nsd 4096 110 no yes ready > up 31 6tnlsas > stg02-01_7_7 nsd 4096 110 no yes ready > up 32 6tnlsas > stg02-01_8_8 nsd 4096 110 no yes ready > up 33 6tnlsas > stg02-01_9_9 nsd 4096 110 no yes ready > up 34 6tnlsas > stg02-01_10_10 nsd 4096 110 no yes ready > up 35 6tnlsas > stg02-01_11_11 nsd 4096 110 no yes ready > up 36 6tnlsas > stg02-01_12_12 nsd 4096 110 no yes ready > up 37 6tnlsas > stg02-01_13_13 nsd 4096 110 no yes ready > up 38 6tnlsas > stg02-01_14_14 nsd 4096 110 no yes ready > up 39 6tnlsas > stg02-01_15_15 nsd 4096 110 no yes ready > up 40 6tnlsas > stg02-01_16_16 nsd 4096 110 no yes ready > up 41 6tnlsas > stg02-01_17_17 nsd 4096 110 no yes ready > up 42 6tnlsas > stg02-01_18_18 nsd 4096 110 no yes ready > up 43 6tnlsas > stg02-01_19_19 nsd 4096 110 no yes ready > up 44 6tnlsas > stg02-01_20_20 nsd 4096 110 no yes ready > up 45 6tnlsas > stg02-01_21_21 nsd 4096 110 no yes ready > up 46 6tnlsas > stg02-01_ssd_22_22 nsd 4096 110 yes no ready > up 47 system desc > stg02-01_ssd_23_23 nsd 4096 110 yes no ready > up 48 system > stg02-01_ssd_24_24 nsd 4096 110 yes no ready > up 49 system > stg02-01_ssd_25_25 nsd 4096 110 yes no ready > up 50 system > stg01-01_22_22 nsd 4096 210 no yes ready > up 51 6tnlsasnonrepl desc > stg01-01_23_23 nsd 4096 210 no yes ready > up 52 6tnlsasnonrepl > stg01-01_24_24 nsd 4096 210 no yes ready > up 53 6tnlsasnonrepl > stg01-01_25_25 nsd 4096 210 no yes ready > up 54 
6tnlsasnonrepl > stg01-01_26_26 nsd 4096 210 no yes ready > up 55 6tnlsasnonrepl > stg01-01_27_27 nsd 4096 210 no yes ready > up 56 6tnlsasnonrepl > stg01-01_31_31 nsd 4096 210 no yes ready > up 58 6tnlsasnonrepl > stg01-01_32_32 nsd 4096 210 no yes ready > up 59 6tnlsasnonrepl > stg01-01_33_33 nsd 4096 210 no yes ready > up 60 6tnlsasnonrepl > stg01-01_34_34 nsd 4096 210 no yes ready > up 61 6tnlsasnonrepl > stg01-01_35_35 nsd 4096 210 no yes ready > up 62 6tnlsasnonrepl > stg01-01_36_36 nsd 4096 210 no yes ready > up 63 6tnlsasnonrepl > stg01-01_37_37 nsd 4096 210 no yes ready > up 64 6tnlsasnonrepl > stg01-01_38_38 nsd 4096 210 no yes ready > up 65 6tnlsasnonrepl > stg01-01_39_39 nsd 4096 210 no yes ready > up 66 6tnlsasnonrepl > stg01-01_40_40 nsd 4096 210 no yes ready > up 67 6tnlsasnonrepl > stg01-01_41_41 nsd 4096 210 no yes ready > up 68 6tnlsasnonrepl > stg01-01_42_42 nsd 4096 210 no yes ready > up 69 6tnlsasnonrepl > stg01-01_43_43 nsd 4096 210 no yes ready > up 70 6tnlsasnonrepl > stg01-01_44_44 nsd 4096 210 no yes ready > up 71 6tnlsasnonrepl > stg01-01_45_45 nsd 4096 210 no yes ready > up 72 6tnlsasnonrepl > stg01-01_46_46 nsd 4096 210 no yes ready > up 73 6tnlsasnonrepl > stg01-01_47_47 nsd 4096 210 no yes ready > up 74 6tnlsasnonrepl > stg01-01_48_48 nsd 4096 210 no yes ready > up 75 6tnlsasnonrepl > stg01-01_49_49 nsd 4096 210 no yes ready > up 76 6tnlsasnonrepl > stg01-01_50_50 nsd 4096 210 no yes ready > up 77 6tnlsasnonrepl > stg01-01_51_51 nsd 4096 210 no yes ready > up 78 6tnlsasnonrepl > Number of quorum disks: 3 > Read quorum value: 2 > Write quorum value: 2 > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20180413_c22c8133_attachment.html&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=ck4PYlaRFvCcNKlfHMPhoA&m=BX4uqSaNFY5Jl4ZNPLYjML8nanjAa57Nuz_7J2jSqMs&s=2P7GHehsFTuGZ39pBTBsUzcdwo9jkidie2etD8_llas&e= > > > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url? > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=ck4PYlaRFvCcNKlfHMPhoA&m=BX4uqSaNFY5Jl4ZNPLYjML8nanjAa57Nuz_7J2jSqMs&s=Q5EVJvSbunfieiHUrDHMpC3WAhP1fX2sQFwLLgLFb8Y&e= > > > End of gpfsug-discuss Digest, Vol 75, Issue 23 > ********************************************** > -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Apr 16 09:42:04 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 16 Apr 2018 08:42:04 +0000 Subject: [gpfsug-discuss] [Replicated and non replicated data In-Reply-To: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> References: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> Message-ID: Yeah that did it, it was set to the default value of ?no?. What exactly does ?no? mean as opposed to ?yes?? The docs https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adm_tuningguide.htm Aren?t very forthcoming on this ? (note it looks like we also have to set this in multi-cluster environments in client clusters as well) Simon From: "Robert.Oesterlin at nuance.com" Date: Friday, 13 April 2018 at 21:17 To: "gpfsug-discuss at spectrumscale.org" Cc: "Simon Thompson (IT Research Support)" Subject: Re: [Replicated and non replicated data Add: unmountOnDiskFail=meta To your config. 
You can add it with ?-I? to have it take effect w/o reboot. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Simon Thompson (IT Research Support)" Reply-To: gpfsug main discussion list Date: Friday, April 13, 2018 at 3:06 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] [gpfsug-discuss] Replicated and non replicated data I have a question about file-systems with replicated an non replicated data. We have a file-system where metadata is set to copies=2 and data copies=2, we then use a placement policy to selectively replicate some data only once based on file-set. We also place the non-replicated data into a specific pool (6tnlsas) to ensure we know where it is placed. My understanding was that in doing this, if we took the disks with the non replicated data offline, we?d still have the FS available for users as the metadata is replicated. Sure accessing a non-replicated data file would give an IO error, but the rest of the FS should be up. We had a situation today where we wanted to take stg01 offline today, so tried using mmchdisk stop -d ?. Once we got to about disk stg01-01_12_12, GPFS would refuse to stop any more disks and complain about too many disks, similarly if we shutdown the NSD servers hosting the disks, the filesystem would have an SGPanic and force unmount. First, am I correct in thinking that a FS with non-replicated data, but replicated metadata should still be accessible (not the non-replicated data) when the LUNS hosting it are down? If so, any suggestions why my FS is panic-ing when we take down the one set of disks? I thought at first we had some non-replicated metadata, tried a mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but this didn?t help. Running 5.0.0.2 on the NSD server nodes. (First time we went round this we didn?t have a FS descriptor disk, but you can see below that we added this) Thanks Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Apr 16 10:01:41 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 16 Apr 2018 09:01:41 +0000 Subject: [gpfsug-discuss] GUI not displaying node info correctly In-Reply-To: References: Message-ID: Just upgraded the GUI to 4.2.3.8, the bug is now fixed, thanks! Richard From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Stefan Roth Sent: 09 April 2018 10:39 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GUI not displaying node info correctly Hello Richard, this is a known GUI bug that will be fixed in 4.2.3-8. Once this is available, just upgrade the GUI rpm. The 4.2.3-8 PTF is not yet available, but it should be in next days. This problem happens to all customers with more than 127 filesets, means you see a max inodes value for the first 127 filesets, but not for newer filesets. 
Mit freundlichen Grüßen / Kind regards

Stefan Roth
Spectrum Scale GUI Development

[cid:image002.gif at 01D3D569.E9989650]
Phone: +49-7034-643-1362 IBM Deutschland
[cid:image003.gif at 01D3D569.E9989650]
E-Mail: stefan.roth at de.ibm.com
Am Weiher 24 65451 Kelsterbach Germany
[cid:image002.gif at 01D3D569.E9989650]
IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

From: "Sobey, Richard A" >
To: "'gpfsug-discuss at spectrumscale.org'" >
Date: 09.04.2018 11:01
Subject: [gpfsug-discuss] GUI not displaying node info correctly
Sent by: gpfsug-discuss-bounces at spectrumscale.org
________________________________

Hi all,

We have a fairly significant number of filesets for which the GUI reports nothing at all in the "Max Inodes" column. I've verified with mmlsfileset -I that the inode limit is set. Has anyone seen this already and had a PMR for it?

SS 4.2.3-7.

Thanks
Richard
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=FZXEKEUIKKfyO2oYq6outktXzRFzl0eKP2opGp7UNks&s=f3eT53kYib3aoHB5addQ_EyZRmCZM2gtiGsZj6aq2ZM&e=

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png Type: image/png Size: 166 bytes Desc: image001.png URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.gif Type: image/gif Size: 156 bytes Desc: image002.gif URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003.gif Type: image/gif Size: 1851 bytes Desc: image003.gif URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image004.gif Type: image/gif Size: 63 bytes Desc: image004.gif URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image005.gif Type: image/gif Size: 105 bytes Desc: image005.gif URL:

From Robert.Oesterlin at nuance.com Mon Apr 16 12:34:36 2018
From: Robert.Oesterlin at nuance.com (Oesterlin, Robert)
Date: Mon, 16 Apr 2018 11:34:36 +0000
Subject: Re: [gpfsug-discuss] [Replicated and non replicated data
In-Reply-To: References: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> Message-ID:

A DW post from Yuri a few years back talks about it:

https://www.ibm.com/developerworks/community/forums/html/topic?id=4cebdb97-3052-4cf2-abb1-462660a1489c

Bob Oesterlin
Sr Principal Storage Engineer, Nuance
507-269-0413

From: "Simon Thompson (IT Research Support)"
Date: Monday, April 16, 2018 at 3:43 AM
To: "Oesterlin, Robert" , gpfsug main discussion list
Subject: [EXTERNAL] Re: [Replicated and non replicated data

Yeah that did it, it was set to the default value of "no". What exactly does "no" mean as opposed to "yes"?

The docs
https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adm_tuningguide.htm
Aren't very forthcoming on this ...
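For reference, a minimal sketch of applying the setting discussed above; the attribute name and the -I behaviour are as described earlier in the thread, but check the commands against your own release documentation:

    # take effect immediately on the running daemons (-I does not persist across a restart)
    mmchconfig unmountOnDiskFail=meta -I

    # repeat without -I (or use -i) so the change is kept in the cluster configuration
    mmchconfig unmountOnDiskFail=meta

    # confirm the value; per Simon's note, remember remote client clusters as well
    mmlsconfig unmountOnDiskFail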
-------------- next part -------------- An HTML attachment was scrubbed... URL: From secretary at gpfsug.org Mon Apr 16 13:27:30 2018 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Mon, 16 Apr 2018 13:27:30 +0100 Subject: [gpfsug-discuss] Transforming Workflows at Scale In-Reply-To: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> References: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> Message-ID: <0d93fcd2f80d91ba958825c2bdd3d09d@webmail.gpfsug.org> Dear All, This event has been postponed and will now take place on 13TH JUNE. Details are on the link below: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP [1] Many thanks, --- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org On , Secretary GPFS UG wrote: > Dear all, > > There's a Spectrum Scale for media breakfast briefing event being organised by IBM at IBM South Bank, London on 17th April (the day before the next UK meeting). > > The event has been designed for broadcasters, post production houses and visual effects organisations, where managing workflows between different islands of technology is a major challenge. > > If you're interested, you can read more and register at the IBM Registration Page [1]. > > Thanks, > -- > > Claire O'Toole > Spectrum Scale/GPFS User Group Secretary > +44 (0)7508 033896 > www.spectrumscaleug.org Links: ------ [1] https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.khiredine at meteo.dz Tue Apr 17 09:31:35 2018 From: a.khiredine at meteo.dz (atmane khiredine) Date: Tue, 17 Apr 2018 08:31:35 +0000 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location Message-ID: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> dear all, I want to understand how GNR/GSS/ESS stores information about the pdisk location I looked in the configuration file /var/mmfs/gen/mmfsNodeData /var/mmfs/gen/mmsdrfs /var/mmfs/gen/mmfs.cfg but no location of pdisk this is real scenario of unknown location this is the output from GNR/GSS/ESS ----------------------------------- [root at ess1 ~]# mmlspdisk BB1RGR --not-ok pdisk: replacementPriority = 2.00 name = "e2d3s11" device = "" recoveryGroup = "BB1RGR" declusteredArray = "DA2" state = "missing/noPath/systemDrain/noRGD/noVCD/noData" capacity = 3000034656256 freeSpace = 2997887172608 location = "" WWN = "naa.5000C50056717727" server = "ess1-ib0" reads = 106800946 writes = 10414075 IOErrors = 1216 IOTimeouts = 18 mediaErrors = 0 checksumErrors = 0 pathErrors = 0 relativePerformance = 1.000 userLocation = "" userCondition = "replaceable" hardware = " " hardwareType = Rotating 7200 nPaths = 0 active 0 total nsdFormatVersion = Unknown paxosAreaOffset = Unknown paxosAreaSize = Unknown logicalBlockSize = 512 ----------------------------------- I begin change the Hard disk mmchcarrier BB1RGR --release --pdisk "e2d3s11" I have this error Location of pdisk e2d3s11 of recovery group BB1RGR is not known. 
I know the location of the Hard disk.
I know the location of the Hard disk from an old mmlspdisk file.

mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11"

Location of pdisk e2d3s11 of recovery group BB1RGR is not known.

I read in the official documentation:

6027-3001 [E] Location of pdisk pdiskName of recovery group recoveryGroupName is not known.
Explanation: IBM Spectrum Scale is unable to find the location of the given pdisk.
User response: Check the disk enclosure hardware.

Atmane Khiredine
HPC System Administrator | Office National de la Météorologie
Tél : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz

From valdis.kletnieks at vt.edu Tue Apr 17 16:27:51 2018
From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu)
Date: Tue, 17 Apr 2018 11:27:51 -0400
Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location
In-Reply-To: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> Message-ID: <16306.1523978871@turing-police.cc.vt.edu>

On Tue, 17 Apr 2018 08:31:35 -0000, atmane khiredine said:
> but no location of pdisk
> state = "missing/noPath/systemDrain/noRGD/noVCD/noData"

That can't be good. That's just screaming "dead, uncabled, or removed".

> WWN = "naa.5000C50056717727"

Useful hint where to start if all else fails (see below)

> i know the location of the Hard disk from old mmlspdisk file
> mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11"

So you know where it was previously, and you can't find it now because it's either dead, missing, or there's no fiberchannel path to it.

> User response: Check the disk enclosure hardware.

Exactly as it says: Check the cabling, check the enclosure for a failed disk, and check if there's now an empty spot where a co-worker "helpfully" removed a bad disk.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL:

From matthew.robinson02 at gmail.com Tue Apr 17 19:03:57 2018
From: matthew.robinson02 at gmail.com (Matthew Robinson)
Date: Tue, 17 Apr 2018 14:03:57 -0400
Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location
In-Reply-To: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> Message-ID:

Hi Valdis,

Normally the name will indicate the physical location of the disk. So the name of the disk you have listed is "e2d3s11": this is Enclosure 2, Disk shelf 3, Disk slot 11. However, based on the location = "" this is the reason for the tscommand failure. Normally a recovery group rebuild fixes the issue from what I have seen in the past.
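Before attempting mmchcarrier again, a short sketch of how the missing drive can be cross-checked from the recovery group server; the WWN and pdisk names are taken from the mmlspdisk output earlier in the thread, and mmlsenclosure availability depends on the ESS/GSS software level:

    # does the operating system still see a block device with that WWN?
    lsscsi --wwn | grep -i 5000c50056717727

    # the GPFS view of the pdisk, including its path count and location fields
    mmlspdisk BB1RGR --pdisk "e2d3s11"

    # the enclosure-side view of faults, where mmlsenclosure is available
    mmlsenclosure all -L --not-ok

If the operating system sees the device but GPFS still reports nPaths = 0 and an empty location, the cabling or expander path between the server and that drawer is the usual suspect, as Valdis notes above.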
On Tue, Apr 17, 2018 at 4:31 AM, atmane khiredine wrote: > dear all, > > I want to understand how GNR/GSS/ESS stores information about the pdisk > location > I looked in the configuration file > > /var/mmfs/gen/mmfsNodeData > /var/mmfs/gen/mmsdrfs > /var/mmfs/gen/mmfs.cfg > > but no location of pdisk > > this is real scenario of unknown location > this is the output from GNR/GSS/ESS > ----------------------------------- > [root at ess1 ~]# mmlspdisk BB1RGR --not-ok > pdisk: > replacementPriority = 2.00 > name = "e2d3s11" > device = "" > recoveryGroup = "BB1RGR" > declusteredArray = "DA2" > state = "missing/noPath/systemDrain/noRGD/noVCD/noData" > capacity = 3000034656256 > freeSpace = 2997887172608 > location = "" > WWN = "naa.5000C50056717727" > server = "ess1-ib0" > reads = 106800946 > writes = 10414075 > IOErrors = 1216 > IOTimeouts = 18 > mediaErrors = 0 > checksumErrors = 0 > pathErrors = 0 > relativePerformance = 1.000 > userLocation = "" > userCondition = "replaceable" > hardware = " " > hardwareType = Rotating 7200 > nPaths = 0 active 0 total > nsdFormatVersion = Unknown > paxosAreaOffset = Unknown > paxosAreaSize = Unknown > logicalBlockSize = 512 > ----------------------------------- > I begin change the Hard disk > mmchcarrier BB1RGR --release --pdisk "e2d3s11" > I have this error > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. > i know the location of the Hard disk > i know the location of the Hard disk from old mmlspdisk file > mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" > > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. > I read in the official documentation > 6027-3001 [E] Location of pdisk pdiskName of recovery > group recoveryGroupName is not known. > Explanation: IBM Spectrum Scale is unable to find the > location of the given pdisk. > User response: Check the disk enclosure hardware. > > > Atmane Khiredine > HPC System Administrator | Office National de la M?t?orologie > T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : > a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Tue Apr 17 19:24:04 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 17 Apr 2018 18:24:04 +0000 Subject: [gpfsug-discuss] Only 20 spots left! - SSUG-US Spring meeting - May 16-17th, Cambridge, Ma Message-ID: The registration for the Spring meeting of the SSUG-USA is now open. This is a Free two-day and will include a large number of Spectrum Scale updates and breakout tracks. We have limited meeting space so please register early if you plan on attending. Registration and agenda details: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2018-meeting-tickets-43662759489 DATE AND TIME Wed, May 16, 2018, 9:00 AM ? Thu, May 17, 2018, 5:00 PM EDT LOCATION IBM Cambridge Innovation Center One Rogers Street Cambridge, MA 02142-1203 Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From valdis.kletnieks at vt.edu Tue Apr 17 19:26:18 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Tue, 17 Apr 2018 14:26:18 -0400 Subject: [gpfsug-discuss] RedHat kernel support for GPFS 4.2.3.8? Message-ID: <41184.1523989578@turing-police.cc.vt.edu> So of course, the day after after I upgrade our GPFS/LTFS cluster to the latest releases of everything, RedHat drops about 300 new updates, include a kernel update, and I find out that GPFS 4.2.3.8 has also escaped. :) Any word if 4.2.3.7 or 4.2.3.8 play nice with the 3.10.0-862.el7 kernel? (Official support matrix still says 3.10.0-693 is "latest tested") -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From a.khiredine at meteo.dz Tue Apr 17 21:48:54 2018 From: a.khiredine at meteo.dz (atmane khiredine) Date: Tue, 17 Apr 2018 20:48:54 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 27 In-Reply-To: References: Message-ID: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0C1A@SDEB-EXC02.meteo.dz> thank you for the answer I use lsscsi the disk is in place "Linux sees the disk" i check the enclosure the disk is in use "I sees the disk :)" and I check the disk the indicator of disk flashes green and I connect to the NetApp DE6600 disk enclosure over telnet and the disk is in place "NetApp sees the disk" if i use this CMD mmchpdisk BB1RGR --pdisk e2d3s11 --identify on Location of pdisk e2d3s11 is not known the only cmd that works is mmchpdisk --suspend OR --diagnose e2d3s11 0, 0 DA2 2560 GiB normal missing/noPath/systemDrain/noRGD/noVCD/noData is change from missing to diagnosing e2d3s11 0, 0 DA2 2560 GiB normal diagnosing/noPath/noVCD and after one or 2 min is change from diagnosing to missing e2d3s11 0, 0 DA2 2582 GiB replaceable missing/noPath/systemDrain/noRGD/noVCD the disk is in place the GNR/GSS/ESS can not see the disk if I can find the file or GNR/GSS/ESS stores the disk location I can add the path that is missing Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz ________________________________________ De : gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] de la part de gpfsug-discuss-request at spectrumscale.org [gpfsug-discuss-request at spectrumscale.org] Envoy? : mardi 17 avril 2018 19:24 ? : gpfsug-discuss at spectrumscale.org Objet : gpfsug-discuss Digest, Vol 75, Issue 27 Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: GNR/GSS/ESS pdisk location (valdis.kletnieks at vt.edu) 2. Re: GNR/GSS/ESS pdisk location (Matthew Robinson) 3. Only 20 spots left! 
- SSUG-US Spring meeting - May 16-17th, Cambridge, Ma (Oesterlin, Robert) ---------------------------------------------------------------------- Message: 1 Date: Tue, 17 Apr 2018 11:27:51 -0400 From: valdis.kletnieks at vt.edu To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GNR/GSS/ESS pdisk location Message-ID: <16306.1523978871 at turing-police.cc.vt.edu> Content-Type: text/plain; charset="iso-8859-1" On Tue, 17 Apr 2018 08:31:35 -0000, atmane khiredine said: > but no location of pdisk > state = "missing/noPath/systemDrain/noRGD/noVCD/noData" That can't be good. That's just screaming "dead, uncabled, or removed". > WWN = "naa.5000C50056717727" Useful hint where to start if all else fails (see below) > i know the location of the Hard disk from old mmlspdisk file > mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" So you know where it was previously, and you can't find it now because it's either dead, missing, or there's no fiberchannel path to it. > User response: Check the disk enclosure hardware. Exactly as it says: Check the cabling, check the enclosure for a failed disk, and check if there's now an empty spot where a co-worker "helpfully" removed a bad disk. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: ------------------------------ Message: 2 Date: Tue, 17 Apr 2018 14:03:57 -0400 From: Matthew Robinson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GNR/GSS/ESS pdisk location Message-ID: Content-Type: text/plain; charset="utf-8" Hi Valdis, Normally the name will indicate the physical location of the disk. So the name of the disk you have listed is "e2d3s11" this is Enclosure 2 Disk shelf 3 Disk slot 11. However, based on the location = "" this is the reason for the failure of the tscommand failure. Normally a recovery group rebuild fixes the issue from what I have seen in the past. On Tue, Apr 17, 2018 at 4:31 AM, atmane khiredine wrote: > dear all, > > I want to understand how GNR/GSS/ESS stores information about the pdisk > location > I looked in the configuration file > > /var/mmfs/gen/mmfsNodeData > /var/mmfs/gen/mmsdrfs > /var/mmfs/gen/mmfs.cfg > > but no location of pdisk > > this is real scenario of unknown location > this is the output from GNR/GSS/ESS > ----------------------------------- > [root at ess1 ~]# mmlspdisk BB1RGR --not-ok > pdisk: > replacementPriority = 2.00 > name = "e2d3s11" > device = "" > recoveryGroup = "BB1RGR" > declusteredArray = "DA2" > state = "missing/noPath/systemDrain/noRGD/noVCD/noData" > capacity = 3000034656256 > freeSpace = 2997887172608 > location = "" > WWN = "naa.5000C50056717727" > server = "ess1-ib0" > reads = 106800946 > writes = 10414075 > IOErrors = 1216 > IOTimeouts = 18 > mediaErrors = 0 > checksumErrors = 0 > pathErrors = 0 > relativePerformance = 1.000 > userLocation = "" > userCondition = "replaceable" > hardware = " " > hardwareType = Rotating 7200 > nPaths = 0 active 0 total > nsdFormatVersion = Unknown > paxosAreaOffset = Unknown > paxosAreaSize = Unknown > logicalBlockSize = 512 > ----------------------------------- > I begin change the Hard disk > mmchcarrier BB1RGR --release --pdisk "e2d3s11" > I have this error > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. 
> i know the location of the Hard disk > i know the location of the Hard disk from old mmlspdisk file > mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" > > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. > I read in the official documentation > 6027-3001 [E] Location of pdisk pdiskName of recovery > group recoveryGroupName is not known. > Explanation: IBM Spectrum Scale is unable to find the > location of the given pdisk. > User response: Check the disk enclosure hardware. > > > Atmane Khiredine > HPC System Administrator | Office National de la M?t?orologie > T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : > a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 3 Date: Tue, 17 Apr 2018 18:24:04 +0000 From: "Oesterlin, Robert" To: gpfsug main discussion list Subject: [gpfsug-discuss] Only 20 spots left! - SSUG-US Spring meeting - May 16-17th, Cambridge, Ma Message-ID: Content-Type: text/plain; charset="utf-8" The registration for the Spring meeting of the SSUG-USA is now open. This is a Free two-day and will include a large number of Spectrum Scale updates and breakout tracks. We have limited meeting space so please register early if you plan on attending. Registration and agenda details: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2018-meeting-tickets-43662759489 DATE AND TIME Wed, May 16, 2018, 9:00 AM ? Thu, May 17, 2018, 5:00 PM EDT LOCATION IBM Cambridge Innovation Center One Rogers Street Cambridge, MA 02142-1203 Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 75, Issue 27 ********************************************** From scale at us.ibm.com Tue Apr 17 22:17:29 2018 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 17 Apr 2018 14:17:29 -0700 Subject: [gpfsug-discuss] RedHat kernel support for GPFS 4.2.3.8? In-Reply-To: <41184.1523989578@turing-police.cc.vt.edu> References: <41184.1523989578@turing-police.cc.vt.edu> Message-ID: Here is the link to our GPFS FAQ which list details on supported versions. https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html#linux Search for "Table 30. IBM Spectrum Scale for Linux RedHat kernel support" and it lists the details that you are looking for. Thanks, Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. 
If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: valdis.kletnieks at vt.edu To: gpfsug-discuss at spectrumscale.org Date: 04/17/2018 12:11 PM Subject: [gpfsug-discuss] RedHat kernel support for GPFS 4.2.3.8? Sent by: gpfsug-discuss-bounces at spectrumscale.org So of course, the day after after I upgrade our GPFS/LTFS cluster to the latest releases of everything, RedHat drops about 300 new updates, include a kernel update, and I find out that GPFS 4.2.3.8 has also escaped. :) Any word if 4.2.3.7 or 4.2.3.8 play nice with the 3.10.0-862.el7 kernel? (Official support matrix still says 3.10.0-693 is "latest tested") (See attached file: att7gkev.dat) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: att7gkev.dat Type: application/octet-stream Size: 497 bytes Desc: not available URL: From chair at spectrumscale.org Wed Apr 18 07:51:58 2018 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Wed, 18 Apr 2018 07:51:58 +0100 Subject: [gpfsug-discuss] UK Group Live Streams Message-ID: <1228FBE5-7050-443F-9514-446C28683711@spectrumscale.org> Hi All, We?re hoping to have live streaming of today and some of tomorrow?s sessions from London, I?ll post links to the streams on the Spectrum Scale User Group web-site as we go through the day. Note this is the first year we?ll have tried this, so we?ll have to see how it goes! Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Wed Apr 18 13:34:22 2018 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Wed, 18 Apr 2018 12:34:22 +0000 Subject: [gpfsug-discuss] RFE Process ... Burning Issues In-Reply-To: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> References: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> Message-ID: <709659D9-9244-4A81-9282-0FB7FB459D1A@nasa.gov> While I don?t own a DeLorean I work with someone who once fixed one up, which I *think* effectively means I can jump back in time to before the deadline to submit. (And let?s be honest, with the way HPC is going it feels like we have the requisite 1.21GW of power...) However, since I can?t actually time travel back to last week, is there any possibility of an extension? On April 5, 2018 at 05:30:42 EDT, Simon Thompson (Spectrum Scale User Group Chair) wrote: Just a reminder that if you want to submit for the pilot RFE process, submissions must be in by end of next week. Judging by the responses so far, apparently the product is perfect ? Simon From: on behalf of "chair at spectrumscale.org" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Monday, 26 March 2018 at 12:52 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] RFE Process ... 
Burning Issues Hi All, We?ve been talking with product management about the RFE process and have agreed that we?ll try out a community-voting process. First up, we are piloting this idea, hopefully it will work out, but it may also need tweaks as we move forward. One of the things we?ve been asking for is for a better way for the Spectrum Scale user group community to vote on RFEs. Sure we get people posting to the list, but we?re looking at if we can make it a better/more formal process to support this. Talking with IBM, we also recognise that with a large number of RFEs, it can be difficult for them to track work tasks being completed, but with the community RFEs, there is a commitment to try and track them closely and report back on progress later in the year. To submit an RFE using this process, you must complete the form available at: https://ibm.box.com/v/EnhBlitz (Enhancement Blitz template v1.pptx) The form provides some guidance on a good and bad RFE. Sure a lot of us are techie/engineers, so please try to explain what problem you are solving rather than trying to provide a solution. (i.e. leave the technical implementation details to those with the source code). Each site is limited to 2 submissions and they will be looked over by the Spectrum Scale community leaders, we may ask people to merge requests, send back for more info etc, or there may be some that we know will just never be progressed for various reasons. At the April user group in the UK, we have an RFE (Burning issues) session planned. Submitters of the RFE will be expected to provide a 1-3 minute pitch for their RFE. We?ve placed the session at the end of the day (UK time) to try and ensure USA people can participate. Remote presentation of your RFE is fine and we plan to live-stream the session. Each person will have 3 votes to choose what they think are their highest priority requests. Again remote voting is perfectly fine but only 3 votes per person. The requests with the highest number of votes will then be given a higher chance of being implemented. There?s a possibility that some may even make the winter release cycle. Either way, we plan to track the ?chosen? RFEs more closely and provide an update at the November USA meeting (likely the SC18 one). The submission and voting process is also planned to be run again in time for the November meeting. Anyone wanting to submit an RFE for consideration should submit the form by email to rfe at spectrumscaleug.org *before* 13th April. We?ll be posting the submitted RFEs up at the box site as well, you are encouraged to visit the site regularly and check the submissions as you may want to contact the author of an RFE to provide more information/support the RFE. Anything received after this date will be held over to the November cycle. The earlier you submit, the better chance it has of being included (we plan to limit the number to be considered) and will give us time to review the RFE and come back for more information/clarification if needed. You must also be prepared to provide a 1-3 minute pitch for your RFE (in person or remote) for the UK user group meeting. You are welcome to submit any RFE you have already put into the RFE portal for this process to garner community votes for it. There is space on the form to provide the existing RFE number. If you have any comments on the process, you can also email them to rfe at spectrumscaleug.org as well. Thanks to Carl Zeite for supporting this plan? Get submitting! 
Simon (UK Group Chair) -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Apr 18 16:03:17 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 18 Apr 2018 11:03:17 -0400 Subject: [gpfsug-discuss] RFE Process ... Burning Issues In-Reply-To: <709659D9-9244-4A81-9282-0FB7FB459D1A@nasa.gov> References: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> <709659D9-9244-4A81-9282-0FB7FB459D1A@nasa.gov> Message-ID: No, I think you'll have to find a working DeLorean, get in it and while traveling at 88 mph (141.622 kph) submit your email over an amateur packet radio network .... -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Apr 18 16:54:45 2018 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 18 Apr 2018 11:54:45 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... Message-ID: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> A few months ago I asked about limits and dynamics of traversing depended .vs independent filesets on this forum. I used the information provided to make decisions and setup our new DSS based gpfs storage system. Now I have a problem I couldn't' yet figure out how to make it work: 'project' and 'scratch' are top *independent* filesets of the same file system. 'proj1', 'proj2' are dependent filesets nested under 'project' 'scra1', 'scra2' are dependent filesets nested under 'scratch' I would like to run a purging policy on all contents under 'scratch' (which includes 'scra1', 'scra2'), and TSM backup policies on all contents under 'project' (which includes 'proj1', 'proj2'). HOWEVER: When I run the purging policy on the whole gpfs device (with both 'project' and 'scratch' filesets) * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and 'scra2' filesets under scratch are excluded (totally unexpected) * if I use FOR FILESET('scra1') I get error that scra1 is dependent fileset (Ok, that is expected) * if I use /*FOR FILESET('scratch')*/, all contents under 'project', 'proj1', 'proj2' are traversed as well, and I don't want that (it takes too much time) * if I use /*FOR FILESET('scratch')*/, and instead of the whole device I apply the policy to the /scratch mount point only, the policy still traverses all the content of 'project', 'proj1', 'proj2', which I don't want. (again, totally unexpected) QUESTION: How can I craft the syntax of the mmapplypolicy in combination with the RULE filters, so that I can traverse all the contents under the 'scratch' independent fileset, including the nested dependent filesets 'scra1','scra2', and NOT traverse the other independent filesets at all (since this takes too much time)? Thanks Jaime PS: FOR FILESET('scra*') does not work. ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From stockf at us.ibm.com Wed Apr 18 18:38:36 2018 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 18 Apr 2018 13:38:36 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... 
In-Reply-To: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> Message-ID: Would the PATH_NAME LIKE option work? Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: "Jaime Pinto" To: "gpfsug main discussion list" Date: 04/18/2018 12:55 PM Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... Sent by: gpfsug-discuss-bounces at spectrumscale.org A few months ago I asked about limits and dynamics of traversing depended .vs independent filesets on this forum. I used the information provided to make decisions and setup our new DSS based gpfs storage system. Now I have a problem I couldn't' yet figure out how to make it work: 'project' and 'scratch' are top *independent* filesets of the same file system. 'proj1', 'proj2' are dependent filesets nested under 'project' 'scra1', 'scra2' are dependent filesets nested under 'scratch' I would like to run a purging policy on all contents under 'scratch' (which includes 'scra1', 'scra2'), and TSM backup policies on all contents under 'project' (which includes 'proj1', 'proj2'). HOWEVER: When I run the purging policy on the whole gpfs device (with both 'project' and 'scratch' filesets) * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and 'scra2' filesets under scratch are excluded (totally unexpected) * if I use FOR FILESET('scra1') I get error that scra1 is dependent fileset (Ok, that is expected) * if I use /*FOR FILESET('scratch')*/, all contents under 'project', 'proj1', 'proj2' are traversed as well, and I don't want that (it takes too much time) * if I use /*FOR FILESET('scratch')*/, and instead of the whole device I apply the policy to the /scratch mount point only, the policy still traverses all the content of 'project', 'proj1', 'proj2', which I don't want. (again, totally unexpected) QUESTION: How can I craft the syntax of the mmapplypolicy in combination with the RULE filters, so that I can traverse all the contents under the 'scratch' independent fileset, including the nested dependent filesets 'scra1','scra2', and NOT traverse the other independent filesets at all (since this takes too much time)? Thanks Jaime PS: FOR FILESET('scra*') does not work. ************************************ TELL US ABOUT YOUR SUCCESS STORIES https://urldefense.proofpoint.com/v2/url?u=http-3A__www.scinethpc.ca_testimonials&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=csxqKhhBsww-1H4lJlra9UtcoY0yG6PcOeV5jYf5pYo&s=tM9JZXsRNu6EEhoFlUuWvTLwMsqbDjfDj3NDZ6elACA&e= ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=csxqKhhBsww-1H4lJlra9UtcoY0yG6PcOeV5jYf5pYo&s=V6u0XsNxHj4Mp-mu7hCZKv1AD3_GYqU-4KZzvMSQ_MQ&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From makaplan at us.ibm.com Wed Apr 18 19:00:13 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 18 Apr 2018 14:00:13 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... In-Reply-To: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> Message-ID: I suggest you remove any FOR FILESET(...) specifications from your rules and then run mmapplypolicy /path/to/the/root/directory/of/the/independent-fileset-you-wish-to-scan ... --scope inodespace -P your-policy-rules-file ... See also the (RTFineM) for the --scope option and the Directory argument of the mmapplypolicy command. That is the best, most efficient way to scan all the files that are in a particular inode-space. Also, you must have all filesets of interest "linked" and the file system must be mounted. Notice that "independent" means that the fileset name is used to denote both a fileset and an inode-space, where said inode-space contains the fileset of that name and possibly other "dependent" filesets... IF one wished to search the entire file system for files within several different filesets, one could use rules with FOR FILESET('fileset1','fileset2','and-so-on') Or even more flexibly WHERE FILESET_NAME LIKE 'sql-like-pattern-with-%s-and-maybe-_s' Or even more powerfully WHERE regex(FILESET_NAME, 'extended-regular-.*-expression') From: "Jaime Pinto" To: "gpfsug main discussion list" Date: 04/18/2018 01:00 PM Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... Sent by: gpfsug-discuss-bounces at spectrumscale.org A few months ago I asked about limits and dynamics of traversing depended .vs independent filesets on this forum. I used the information provided to make decisions and setup our new DSS based gpfs storage system. Now I have a problem I couldn't' yet figure out how to make it work: 'project' and 'scratch' are top *independent* filesets of the same file system. 'proj1', 'proj2' are dependent filesets nested under 'project' 'scra1', 'scra2' are dependent filesets nested under 'scratch' I would like to run a purging policy on all contents under 'scratch' (which includes 'scra1', 'scra2'), and TSM backup policies on all contents under 'project' (which includes 'proj1', 'proj2'). HOWEVER: When I run the purging policy on the whole gpfs device (with both 'project' and 'scratch' filesets) * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and 'scra2' filesets under scratch are excluded (totally unexpected) * if I use FOR FILESET('scra1') I get error that scra1 is dependent fileset (Ok, that is expected) * if I use /*FOR FILESET('scratch')*/, all contents under 'project', 'proj1', 'proj2' are traversed as well, and I don't want that (it takes too much time) * if I use /*FOR FILESET('scratch')*/, and instead of the whole device I apply the policy to the /scratch mount point only, the policy still traverses all the content of 'project', 'proj1', 'proj2', which I don't want. (again, totally unexpected) QUESTION: How can I craft the syntax of the mmapplypolicy in combination with the RULE filters, so that I can traverse all the contents under the 'scratch' independent fileset, including the nested dependent filesets 'scra1','scra2', and NOT traverse the other independent filesets at all (since this takes too much time)? Thanks Jaime PS: FOR FILESET('scra*') does not work. 
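A minimal sketch of that approach follows; the junction path /gpfs/fs1/scratch, the 90-day threshold and the rule/list names are assumptions to be adjusted. Using a LIST rule with EXEC '' plus -I defer keeps the run non-destructive: it only writes candidate file lists, which an external purge script can consume afterwards.

    cat > /tmp/purge.pol <<'EOF'
    /* write candidates to a file list instead of calling an external script */
    RULE 'ext' EXTERNAL LIST 'purge' EXEC ''

    /* no FOR FILESET(...) needed: the directory argument plus --scope inodespace
       below already restrict the scan to the 'scratch' inode space, which
       includes the dependent filesets scra1 and scra2 */
    RULE 'old-scratch' LIST 'purge'
      WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 90

    /* alternative when scanning the whole file system instead: match the
       dependent filesets by name pattern rather than listing them one by one
    RULE 'old-scratch-byname' LIST 'purge'
      WHERE FILESET_NAME LIKE 'scra%'
        AND (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 90
    */
    EOF

    # scan only the 'scratch' inode space; candidate lists land under /tmp/scanout.*
    mmapplypolicy /gpfs/fs1/scratch --scope inodespace \
        -P /tmp/purge.pol -f /tmp/scanout -I defer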
************************************ TELL US ABOUT YOUR SUCCESS STORIES https://urldefense.proofpoint.com/v2/url?u=http-3A__www.scinethpc.ca_testimonials&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=IpwHlr0YNr7rgV7gI8Y2sxIELLIwA15KK4nBnv9BYWk&e= ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=aff0vMJkKd-Z3pw3-jckmI3ejqXh8aSr8rxkKf3OGdk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Apr 18 19:51:29 2018 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 18 Apr 2018 14:51:29 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... In-Reply-To: References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> Message-ID: <20180418145129.63803tvsotr1960h@support.scinet.utoronto.ca> Ok Marc and Frederick, there is hope. I'll conduct more experiments and report back Thanks for the suggestions. Jaime Quoting "Marc A Kaplan" : > I suggest you remove any FOR FILESET(...) specifications from your rules > and then run > > mmapplypolicy > /path/to/the/root/directory/of/the/independent-fileset-you-wish-to-scan > ... --scope inodespace -P your-policy-rules-file ... > > See also the (RTFineM) for the --scope option and the Directory argument > of the mmapplypolicy command. > > That is the best, most efficient way to scan all the files that are in a > particular inode-space. Also, you must have all filesets of interest > "linked" and the file system must be mounted. > > Notice that "independent" means that the fileset name is used to denote > both a fileset and an inode-space, where said inode-space contains the > fileset of that name and possibly other "dependent" filesets... > > IF one wished to search the entire file system for files within several > different filesets, one could use rules with > > FOR FILESET('fileset1','fileset2','and-so-on') > > Or even more flexibly > > WHERE FILESET_NAME LIKE 'sql-like-pattern-with-%s-and-maybe-_s' > > Or even more powerfully > > WHERE regex(FILESET_NAME, 'extended-regular-.*-expression') > > > > > > From: "Jaime Pinto" > To: "gpfsug main discussion list" > Date: 04/18/2018 01:00 PM > Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > A few months ago I asked about limits and dynamics of traversing > depended .vs independent filesets on this forum. I used the > information provided to make decisions and setup our new DSS based > gpfs storage system. Now I have a problem I couldn't' yet figure out > how to make it work: > > 'project' and 'scratch' are top *independent* filesets of the same > file system. 
> > 'proj1', 'proj2' are dependent filesets nested under 'project' > 'scra1', 'scra2' are dependent filesets nested under 'scratch' > > I would like to run a purging policy on all contents under 'scratch' > (which includes 'scra1', 'scra2'), and TSM backup policies on all > contents under 'project' (which includes 'proj1', 'proj2'). > > HOWEVER: > When I run the purging policy on the whole gpfs device (with both > 'project' and 'scratch' filesets) > > * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and > 'scra2' filesets under scratch are excluded (totally unexpected) > > * if I use FOR FILESET('scra1') I get error that scra1 is dependent > fileset (Ok, that is expected) > > * if I use /*FOR FILESET('scratch')*/, all contents under 'project', > 'proj1', 'proj2' are traversed as well, and I don't want that (it > takes too much time) > > * if I use /*FOR FILESET('scratch')*/, and instead of the whole device > I apply the policy to the /scratch mount point only, the policy still > traverses all the content of 'project', 'proj1', 'proj2', which I > don't want. (again, totally unexpected) > > QUESTION: > > How can I craft the syntax of the mmapplypolicy in combination with > the RULE filters, so that I can traverse all the contents under the > 'scratch' independent fileset, including the nested dependent filesets > 'scra1','scra2', and NOT traverse the other independent filesets at > all (since this takes too much time)? > > Thanks > Jaime > > > PS: FOR FILESET('scra*') does not work. > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > > https://urldefense.proofpoint.com/v2/url?u=http-3A__www.scinethpc.ca_testimonials&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=IpwHlr0YNr7rgV7gI8Y2sxIELLIwA15KK4nBnv9BYWk&e= > > ************************************ > --- > Jaime Pinto - Storage Analyst > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - www.computecanada.ca > University of Toronto > 661 University Ave. (MaRS), Suite 1140 > Toronto, ON, M5G1M1 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of > Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=aff0vMJkKd-Z3pw3-jckmI3ejqXh8aSr8rxkKf3OGdk&e= > > > > > > > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From makaplan at us.ibm.com Wed Apr 18 22:22:22 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 18 Apr 2018 17:22:22 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... 
In-Reply-To: <20180418145129.63803tvsotr1960h@support.scinet.utoronto.ca> References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> <20180418145129.63803tvsotr1960h@support.scinet.utoronto.ca> Message-ID: It's more than hope. It works just as I wrote and documented and tested. Personally, I find the nomenclature for filesets and inodespaces as "independent filesets" unfortunate and leading to misunderstandings and confusion. But that train left the station a few years ago, so we just live with it... -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dirk.Thometzek at rohde-schwarz.com Thu Apr 19 08:44:00 2018 From: Dirk.Thometzek at rohde-schwarz.com (Dirk Thometzek) Date: Thu, 19 Apr 2018 07:44:00 +0000 Subject: [gpfsug-discuss] Career Opportunity Message-ID: <79f9ca3347214203b37003ac5dc288c7@rohde-schwarz.com> Dear all, I am working with a development team located in Hanover, Germany. Currently we are looking for a Spectrum Scale professional with long term experience to support our team in a senior development position. If you are interested, please send me a private message to: dirk.thometzek at rohde-schwarz.com Best regards, Dirk Thometzek Product Management File Based Media Solutions [RS_Logo_cyan_rgb - Klein] Rohde & Schwarz GmbH & Co. KG Pf. 80 14 69, D-81614 Muenchen Abt. MU Phone: +49 511 67807-0 Gesch?ftsf?hrung / Executive Board: Christian Leicher (Vorsitzender / Chairman), Peter Riedel Sitz der Gesellschaft / Company's Place of Business: M?nchen | Registereintrag / Commercial Register No.: HRA 16 270 Pers?nlich haftender Gesellschafter / Personally Liable Partner: RUSEG Verwaltungs-GmbH | Sitz der Gesellschaft / Company's Place of Business: M?nchen | Registereintrag / Commercial Register No.: HRB 7 534 | Umsatzsteuer-Identifikationsnummer (USt-IdNr.) / VAT Identification No.: DE 130 256 683 | Elektro-Altger?te Register (EAR) / WEEE Register No.: DE 240 437 86 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 6729 bytes Desc: image001.jpg URL: From delmard at br.ibm.com Thu Apr 19 14:37:14 2018 From: delmard at br.ibm.com (Delmar Demarchi) Date: Thu, 19 Apr 2018 11:37:14 -0200 Subject: [gpfsug-discuss] API - listing quotas Message-ID: Hello Experts. I'm trying to collect information from Fileset Quotas, using the API. I'm using this link as reference: https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adm_apiv2version2.htm Do you know about any issue with Scale 5.0.x and API? Or what I have change in my command to collect this infos? Following the instruction on knowledge Center, we tried to list, using GET, the FILESET Quota but only USR and GRP were reported. Listing all quotas (using GET also), I found my quota there. 
See my sample: curl -k -u admin:passw0rd -XGET -H content-type:application/json " https://xx.xx.xx.xx:443/scalemgmt/v2/filesystems/fs1/filesets/sicredi/quotas " { "quotas" : [ { "blockGrace" : "none", "blockLimit" : 0, "blockQuota" : 0, "filesGrace" : "none", "filesLimit" : 0, "filesQuota" : 0, "filesetName" : "sicredi", "filesystemName" : "fs1", "isDefaultQuota" : true, "objectId" : 0, "quotaId" : 454, "quotaType" : "GRP" }, { "blockGrace" : "none", "blockLimit" : 0, "blockQuota" : 0, "filesGrace" : "none", "filesLimit" : 0, "filesQuota" : 0, "filesetName" : "sicredi", "filesystemName" : "fs1", "isDefaultQuota" : true, "objectId" : 0, "quotaId" : 501, "quotaType" : "USR" } ], "status" : { "code" : 200, "message" : "The request finished successfully." } }[root at lbsgpfs05 ~]# curl -k -u admin:passw0rd -XGET -H content-type:application/json " https://xx.xx.xx.xx:443/scalemgmt/v2/filesystems/fs1/quotas" { "quotas" : [ { "blockGrace" : "none", "blockInDoubt" : 0, "blockLimit" : 0, "blockQuota" : 0, "blockUsage" : 512, "filesGrace" : "none", "filesInDoubt" : 0, "filesLimit" : 0, "filesQuota" : 0, "filesUsage" : 1, "filesystemName" : "fs1", "isDefaultQuota" : false, "objectId" : 0, "objectName" : "root", "quotaId" : 366, "quotaType" : "FILESET" }, { "blockGrace" : "none", "blockInDoubt" : 0, "blockLimit" : 6598656, "blockQuota" : 6598656, "blockUsage" : 5670208, "filesGrace" : "none", "filesInDoubt" : 0, "filesLimit" : 0, "filesQuota" : 0, "filesUsage" : 5, "filesystemName" : "fs1", "isDefaultQuota" : false, "objectId" : 1, "objectName" : "sicredi", "quotaId" : 367, "quotaType" : "FILESET" } "status" : { "code" : 200, "message" : "The request finished successfully." } } mmlsquota -j sicredi fs1 --block-size auto Block Limits | File Limits Filesystem type blocks quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs1 FILESET 5.408G 6.293G 6.293G 0 none | 5 0 0 0 none mmrepquota -a *** Report for USR GRP FILESET quotas on fs1 Block Limits | File Limits Name fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace entryType sicredi root FILESET 5670208 6598656 6598656 0 none | 5 0 0 0 none e Regards, | Abrazos, | Atenciosamente, Delmar Demarchi .'. Power and Storage Services Specialist Phone: 55-19-2132-9469 | Mobile: 55-19-9 9792-1323 E-mail: delmard at br.ibm.com www.ibm.com/systems/services/labservices -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 6614 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 2022 bytes Desc: not available URL: From A.Wolf-Reber at de.ibm.com Thu Apr 19 14:56:24 2018 From: A.Wolf-Reber at de.ibm.com (Alexander Wolf) Date: Thu, 19 Apr 2018 13:56:24 +0000 Subject: [gpfsug-discuss] API - listing quotas In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152411877729038.png Type: image/png Size: 1134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152411877729039.png Type: image/png Size: 6645 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
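As the samples above show, the filesystem-wide /quotas listing does include the FILESET entries even when the fileset-scoped call only returns the USR/GRP defaults, so one client-side workaround is simply to filter the wider listing. This is a sketch only: the host, credentials and the use of jq are assumptions, and any JSON-capable tool can do the same filtering.

    curl -k -s -u admin:passw0rd -H 'content-type: application/json' \
      "https://xx.xx.xx.xx:443/scalemgmt/v2/filesystems/fs1/quotas" \
      | jq '.quotas[] | select(.quotaType == "FILESET" and .objectName == "sicredi")'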
Name: Image.152411877729040.png Type: image/png Size: 1134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image._2_37157E80371579900049F11E83258274.jpg Type: image/jpeg Size: 6614 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image._1_371596C0371591D00049F11E83258274.gif Type: image/gif Size: 2022 bytes Desc: not available URL: From Renar.Grunenberg at huk-coburg.de Fri Apr 20 15:01:55 2018 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Fri, 20 Apr 2018 14:01:55 +0000 Subject: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale Message-ID: Hallo Simon, are there any reason why the link of the presentation from Yong ZY Zheng(Cognitive, ML, Hortonworks) is not linked. Renar Grunenberg Abteilung Informatik ? Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Apr 20 15:12:11 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Fri, 20 Apr 2018 14:12:11 +0000 Subject: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale In-Reply-To: References: Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6@bham.ac.uk> Sorry, it was a typo from my side. The talks that are missing we are chasing for copies of the slides that we can release. Simon From: on behalf of "Renar.Grunenberg at huk-coburg.de" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Friday, 20 April 2018 at 15:02 To: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale Hallo Simon, are there any reason why the link of the presentation from Yong ZY Zheng(Cognitive, ML, Hortonworks) is not linked. Renar Grunenberg Abteilung Informatik ? Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 
9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nick.savva at adventone.com Sun Apr 22 13:18:23 2018 From: nick.savva at adventone.com (Nick Savva) Date: Sun, 22 Apr 2018 12:18:23 +0000 Subject: [gpfsug-discuss] AFM cache re-link Message-ID: Hi all, I was always preface my questions with an apology first up if this has been covered before. I am curious if anyone has tested relinking an AFM cache to a new home where the new home, old home and cache have the exact same data. What is the behaviour? The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case, I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production. Where prod/dr are gpfs filesystems with a replica set of data. Again its to avoid moving TB's across the link that are already there. Appreciate the help in advance, Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From sannaik2 at in.ibm.com Sun Apr 22 17:01:14 2018 From: sannaik2 at in.ibm.com (Sandeep Naik1) Date: Sun, 22 Apr 2018 21:31:14 +0530 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location In-Reply-To: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> Message-ID: Hi Atmane, Can you include the o/p of command tslsenclslot -a from both the nodes ? Any thing in the logs related to this pdisk ? 
Thanks, Sandeep Naik Elastic Storage server / GPFS Test ETZ-B, Hinjewadi Pune India (+91) 8600994314 From: atmane khiredine To: "gpfsug-discuss at spectrumscale.org" Date: 17/04/2018 02:09 PM Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location Sent by: gpfsug-discuss-bounces at spectrumscale.org dear all, I want to understand how GNR/GSS/ESS stores information about the pdisk location I looked in the configuration file /var/mmfs/gen/mmfsNodeData /var/mmfs/gen/mmsdrfs /var/mmfs/gen/mmfs.cfg but no location of pdisk this is real scenario of unknown location this is the output from GNR/GSS/ESS ----------------------------------- [root at ess1 ~]# mmlspdisk BB1RGR --not-ok pdisk: replacementPriority = 2.00 name = "e2d3s11" device = "" recoveryGroup = "BB1RGR" declusteredArray = "DA2" state = "missing/noPath/systemDrain/noRGD/noVCD/noData" capacity = 3000034656256 freeSpace = 2997887172608 location = "" WWN = "naa.5000C50056717727" server = "ess1-ib0" reads = 106800946 writes = 10414075 IOErrors = 1216 IOTimeouts = 18 mediaErrors = 0 checksumErrors = 0 pathErrors = 0 relativePerformance = 1.000 userLocation = "" userCondition = "replaceable" hardware = " " hardwareType = Rotating 7200 nPaths = 0 active 0 total nsdFormatVersion = Unknown paxosAreaOffset = Unknown paxosAreaSize = Unknown logicalBlockSize = 512 ----------------------------------- I begin change the Hard disk mmchcarrier BB1RGR --release --pdisk "e2d3s11" I have this error Location of pdisk e2d3s11 of recovery group BB1RGR is not known. i know the location of the Hard disk i know the location of the Hard disk from old mmlspdisk file mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" Location of pdisk e2d3s11 of recovery group BB1RGR is not known. I read in the official documentation 6027-3001 [E] Location of pdisk pdiskName of recovery group recoveryGroupName is not known. Explanation: IBM Spectrum Scale is unable to find the location of the given pdisk. User response: Check the disk enclosure hardware. Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=DXkezTwrVXsEOfvoqY7_DLS86P5FtQszjm9zok6upRU&m=dtQxM0x58-X-aWHl-3gNSQq_YWWdIMi_GcStOMr9Tt0&s=SJIGLOxE4hu-R8p5at9i6BvxDkyPQn4J6LiJjaQE180&e= -------------- next part -------------- An HTML attachment was scrubbed... 
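One thing that makes this situation easier to recover from is keeping a periodic snapshot of the pdisk-to-location mapping, which is exactly what the "old mmlspdisk file" provided above. A minimal sketch, assuming the stanza-style mmlspdisk output shown in this thread; the recovery group names are the ones mentioned here and the output directory is arbitrary.

    #!/bin/bash
    # Record name / location / WWN for every pdisk, so the physical slot is
    # still known if a pdisk later goes missing/noPath with an empty location.
    outdir=/var/mmfs/tmp/pdisk-locations
    mkdir -p "$outdir"
    stamp=$(date +%Y%m%d-%H%M)

    for rg in BB1RGL BB1RGR; do     # adjust to your recovery group names
        mmlspdisk "$rg" | awk -F' = ' '
            $1 ~ /name$/     {name=$2}
            $1 ~ /location$/ {loc=$2}
            $1 ~ /WWN$/      {print name, loc, $2}
        ' > "$outdir/${rg}.${stamp}.txt"
    done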
URL: From a.khiredine at meteo.dz Sun Apr 22 18:07:01 2018 From: a.khiredine at meteo.dz (atmane khiredine) Date: Sun, 22 Apr 2018 17:07:01 +0000 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location In-Reply-To: References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz>, Message-ID: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0D0D@SDEB-EXC02.meteo.dz> Hi sannaik after doing some research in the Cluster I find the pdisk e3d3s06 is must be in Enclosure 3 Drawer 3 slot 6 but now is in the Enclosure 2 Drawer 3 Slot 11 is supose to be for e2d3s11 I use the old drive of pdisk e2d3s11 to pdisk e3d3s06 now the pdisk e3d3s06 is in the wrong location SV30708502-3-11 ---- mmlspdisk e3d3s06 name = "e3d3s06" device = "/dev/sdfa,/dev/sdir" location = "SV30708502-3-11" userLocation = "Rack ess1 U11-14, Enclosure 1818-80E-SV30708502 Drawer 3 Slot 11" ---- and the pdisk e2d3s11 is without location mmlspdisk e2d3s11 name = "e2d3s11" device = " " location = "" userLocation = "" --- if i use the script replace-at-location for e3d3s06 SV25304899-3-6 replace-at-location BB1RGL e3d3s06 SV25304899-3-6 replace-at-location: error: pdisk e3d3s06 of RG BB1RGL is in location SV30708502-3-11, not SV25304899-3-6. Check the pdisk name and location code before continuing. if i use the script replace-at-location for e3d3s06 SV30708502-3-11 replace-at-location BB1RGL e3d3s06 SV30708502-3-11 location SV30708502-3-11 has a location if i use replace-at-location BB1RGR e2d3s11 SV30708502-3-11 Disk descriptor for /dev/sdfc,/dev/sdiq refers to an existing pdisk. the pdisk e3d3s06 is must be in Enclosure 3 Drawer 3 slot 6 but now is in the Enclosure 2 Drawer 3 Slot 11 is supose to be for e2d3s11 the disk found in location SV30708502-3-11 is not a blank disk because is a lready used by e3d3s06 why e3d3s06 is take the place of e2d3s11 and is stil working Thanks Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz ________________________________________ De : Sandeep Naik1 [sannaik2 at in.ibm.com] Envoy? : dimanche 22 avril 2018 17:01 ? : atmane khiredine Cc : gpfsug main discussion list Objet : Re: [gpfsug-discuss] GNR/GSS/ESS pdisk location Hi Atmane, Can you include the o/p of command tslsenclslot -a from both the nodes ? Any thing in the logs related to this pdisk ? 
Thanks, Sandeep Naik Elastic Storage server / GPFS Test ETZ-B, Hinjewadi Pune India (+91) 8600994314 From: atmane khiredine To: "gpfsug-discuss at spectrumscale.org" Date: 17/04/2018 02:09 PM Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ dear all, I want to understand how GNR/GSS/ESS stores information about the pdisk location I looked in the configuration file /var/mmfs/gen/mmfsNodeData /var/mmfs/gen/mmsdrfs /var/mmfs/gen/mmfs.cfg but no location of pdisk this is real scenario of unknown location this is the output from GNR/GSS/ESS ----------------------------------- [root at ess1 ~]# mmlspdisk BB1RGR --not-ok pdisk: replacementPriority = 2.00 name = "e2d3s11" device = "" recoveryGroup = "BB1RGR" declusteredArray = "DA2" state = "missing/noPath/systemDrain/noRGD/noVCD/noData" capacity = 3000034656256 freeSpace = 2997887172608 location = "" WWN = "naa.5000C50056717727" server = "ess1-ib0" reads = 106800946 writes = 10414075 IOErrors = 1216 IOTimeouts = 18 mediaErrors = 0 checksumErrors = 0 pathErrors = 0 relativePerformance = 1.000 userLocation = "" userCondition = "replaceable" hardware = " " hardwareType = Rotating 7200 nPaths = 0 active 0 total nsdFormatVersion = Unknown paxosAreaOffset = Unknown paxosAreaSize = Unknown logicalBlockSize = 512 ----------------------------------- I begin change the Hard disk mmchcarrier BB1RGR --release --pdisk "e2d3s11" I have this error Location of pdisk e2d3s11 of recovery group BB1RGR is not known. i know the location of the Hard disk i know the location of the Hard disk from old mmlspdisk file mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" Location of pdisk e2d3s11 of recovery group BB1RGR is not known. I read in the official documentation 6027-3001 [E] Location of pdisk pdiskName of recovery group recoveryGroupName is not known. Explanation: IBM Spectrum Scale is unable to find the location of the given pdisk. User response: Check the disk enclosure hardware. Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=DXkezTwrVXsEOfvoqY7_DLS86P5FtQszjm9zok6upRU&m=dtQxM0x58-X-aWHl-3gNSQq_YWWdIMi_GcStOMr9Tt0&s=SJIGLOxE4hu-R8p5at9i6BvxDkyPQn4J6LiJjaQE180&e= From coetzee.ray at gmail.com Sun Apr 22 23:38:41 2018 From: coetzee.ray at gmail.com (Ray Coetzee) Date: Sun, 22 Apr 2018 23:38:41 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34 In-Reply-To: References: Message-ID: Good evening all I'm working with IBM on a PMR where ganesha is segfaulting or causing kernel panics on one group of CES nodes. We have 12 identical CES nodes split into two groups of 6 nodes each & have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was released. Only one group started having issues Monday morning where ganesha would segfault and the mounts would move over to the remaining nodes. The remaining nodes then start to fall over like dominos within minutes or hours to the point that all CES nodes are "failed" according to "mmces node list" and the VIP's are unassigned. Recovering the nodes are extremely finicky and works for a few minutes or hours before segfaulting again. 
Most times a complete stop of Ganesha on all nodes & then only starting it on two random nodes allow mounts to recover for a while. None of the following has helped: A reboot of all nodes. Refresh CCR config file with mmsdrrestore Remove/add CES from nodes. Reinstall GPFS & protocol rpms Update to 5.0.0-2 Fresh reinstall of a node Network checks out with no dropped packets on either data or export networks. The only temporary fix so far has been to downrev ganesha to 2.3.2 from 2.5.3 on the affected nodes. While waiting for IBM development, has anyone seen something similar maybe? Kind regards Ray Coetzee On Sat, Apr 21, 2018 at 12:00 PM, wrote: > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: UK Meeting - tooling Spectrum Scale (Grunenberg, Renar) > 2. Re: UK Meeting - tooling Spectrum Scale > (Simon Thompson (IT Research Support)) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 20 Apr 2018 14:01:55 +0000 > From: "Grunenberg, Renar" > To: "'gpfsug-discuss at spectrumscale.org'" > > Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale > Message-ID: > Content-Type: text/plain; charset="utf-8" > > Hallo Simon, > are there any reason why the link of the presentation from Yong ZY > Zheng(Cognitive, ML, Hortonworks) is not linked. > > Renar Grunenberg > Abteilung Informatik ? Betrieb > > HUK-COBURG > Bahnhofsplatz > 96444 Coburg > Telefon: 09561 96-44110 > Telefax: 09561 96-44104 > E-Mail: Renar.Grunenberg at huk-coburg.de > Internet: www.huk.de > ________________________________ > HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter > Deutschlands a. G. in Coburg > Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 > Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg > Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. > Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav > Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. > ________________________________ > Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte > Informationen. > Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich > erhalten haben, > informieren Sie bitte sofort den Absender und vernichten Sie diese > Nachricht. > Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht > ist nicht gestattet. > > This information may contain confidential and/or privileged information. > If you are not the intended recipient (or have received this information > in error) please notify the > sender immediately and destroy this information. > Any unauthorized copying, disclosure or distribution of the material in > this information is strictly forbidden. > ________________________________ > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: 20180420/91e3d84d/attachment-0001.html> > > ------------------------------ > > Message: 2 > Date: Fri, 20 Apr 2018 14:12:11 +0000 > From: "Simon Thompson (IT Research Support)" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale > Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6 at bham.ac.uk> > Content-Type: text/plain; charset="utf-8" > > Sorry, it was a typo from my side. > > The talks that are missing we are chasing for copies of the slides that we > can release. > > Simon > > From: on behalf of " > Renar.Grunenberg at huk-coburg.de" > Reply-To: "gpfsug-discuss at spectrumscale.org" < > gpfsug-discuss at spectrumscale.org> > Date: Friday, 20 April 2018 at 15:02 > To: "gpfsug-discuss at spectrumscale.org" > Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale > > Hallo Simon, > are there any reason why the link of the presentation from Yong ZY > Zheng(Cognitive, ML, Hortonworks) is not linked. > > Renar Grunenberg > Abteilung Informatik ? Betrieb > > HUK-COBURG > Bahnhofsplatz > 96444 Coburg > Telefon: > > 09561 96-44110 > > Telefax: > > 09561 96-44104 > > E-Mail: > > Renar.Grunenberg at huk-coburg.de > > Internet: > > www.huk.de > > ________________________________ > HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter > Deutschlands a. G. in Coburg > Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 > Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg > Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. > Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav > Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. > ________________________________ > Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte > Informationen. > Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich > erhalten haben, > informieren Sie bitte sofort den Absender und vernichten Sie diese > Nachricht. > Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht > ist nicht gestattet. > > This information may contain confidential and/or privileged information. > If you are not the intended recipient (or have received this information > in error) please notify the > sender immediately and destroy this information. > Any unauthorized copying, disclosure or distribution of the material in > this information is strictly forbidden. > ________________________________ > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: 20180420/0b8e9ffa/attachment-0001.html> > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 75, Issue 34 > ********************************************** > -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Mon Apr 23 00:02:09 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Mon, 23 Apr 2018 01:02:09 +0200 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34 In-Reply-To: References: Message-ID: Yes, I've been struggelig with something similiar this week. Ganesha dying with SIGABRT -- nothing else logged. After catching a few coredumps, it has been identified as a problem with some udp-communication during mounts from solaris clients. Disabling udp as transport on the shares serverside didn't help. 
It was suggested to use "mount -o tcp" or whatever the solaris version of this is -- but we haven't tested this. So far the downgrade to v2.3.2 has been our workaround. PMR: 48669,080,678 -jf On Mon, Apr 23, 2018 at 12:38 AM, Ray Coetzee wrote: > Good evening all > > I'm working with IBM on a PMR where ganesha is segfaulting or causing > kernel panics on one group of CES nodes. > > We have 12 identical CES nodes split into two groups of 6 nodes each & > have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was released. > > Only one group started having issues Monday morning where ganesha would > segfault and the mounts would move over to the remaining nodes. > The remaining nodes then start to fall over like dominos within minutes or > hours to the point that all CES nodes are "failed" according to "mmces node > list" and the VIP's are unassigned. > > Recovering the nodes are extremely finicky and works for a few minutes or > hours before segfaulting again. > Most times a complete stop of Ganesha on all nodes & then only starting it > on two random nodes allow mounts to recover for a while. > > None of the following has helped: > A reboot of all nodes. > Refresh CCR config file with mmsdrrestore > Remove/add CES from nodes. > Reinstall GPFS & protocol rpms > Update to 5.0.0-2 > Fresh reinstall of a node > Network checks out with no dropped packets on either data or export > networks. > > The only temporary fix so far has been to downrev ganesha to 2.3.2 from > 2.5.3 on the affected nodes. > > While waiting for IBM development, has anyone seen something similar maybe? > > Kind regards > > Ray Coetzee > > > > On Sat, Apr 21, 2018 at 12:00 PM, spectrumscale.org> wrote: > >> Send gpfsug-discuss mailing list submissions to >> gpfsug-discuss at spectrumscale.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> or, via email, send a message with subject or body 'help' to >> gpfsug-discuss-request at spectrumscale.org >> >> You can reach the person managing the list at >> gpfsug-discuss-owner at spectrumscale.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of gpfsug-discuss digest..." >> >> >> Today's Topics: >> >> 1. Re: UK Meeting - tooling Spectrum Scale (Grunenberg, Renar) >> 2. Re: UK Meeting - tooling Spectrum Scale >> (Simon Thompson (IT Research Support)) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Fri, 20 Apr 2018 14:01:55 +0000 >> From: "Grunenberg, Renar" >> To: "'gpfsug-discuss at spectrumscale.org'" >> >> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >> Message-ID: >> Content-Type: text/plain; charset="utf-8" >> >> Hallo Simon, >> are there any reason why the link of the presentation from Yong ZY >> Zheng(Cognitive, ML, Hortonworks) is not linked. >> >> Renar Grunenberg >> Abteilung Informatik ? Betrieb >> >> HUK-COBURG >> Bahnhofsplatz >> 96444 Coburg >> Telefon: 09561 96-44110 >> Telefax: 09561 96-44104 >> E-Mail: Renar.Grunenberg at huk-coburg.de >> Internet: www.huk.de >> ________________________________ >> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >> Deutschlands a. G. in Coburg >> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >> Olav Her?y, Dr. 
J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >> ________________________________ >> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >> Informationen. >> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich >> erhalten haben, >> informieren Sie bitte sofort den Absender und vernichten Sie diese >> Nachricht. >> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >> ist nicht gestattet. >> >> This information may contain confidential and/or privileged information. >> If you are not the intended recipient (or have received this information >> in error) please notify the >> sender immediately and destroy this information. >> Any unauthorized copying, disclosure or distribution of the material in >> this information is strictly forbidden. >> ________________________________ >> -------------- next part -------------- >> An HTML attachment was scrubbed... >> URL: > 0420/91e3d84d/attachment-0001.html> >> >> ------------------------------ >> >> Message: 2 >> Date: Fri, 20 Apr 2018 14:12:11 +0000 >> From: "Simon Thompson (IT Research Support)" >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >> Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6 at bham.ac.uk> >> Content-Type: text/plain; charset="utf-8" >> >> Sorry, it was a typo from my side. >> >> The talks that are missing we are chasing for copies of the slides that >> we can release. >> >> Simon >> >> From: on behalf of " >> Renar.Grunenberg at huk-coburg.de" >> Reply-To: "gpfsug-discuss at spectrumscale.org" < >> gpfsug-discuss at spectrumscale.org> >> Date: Friday, 20 April 2018 at 15:02 >> To: "gpfsug-discuss at spectrumscale.org" >> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >> >> Hallo Simon, >> are there any reason why the link of the presentation from Yong ZY >> Zheng(Cognitive, ML, Hortonworks) is not linked. >> >> Renar Grunenberg >> Abteilung Informatik ? Betrieb >> >> HUK-COBURG >> Bahnhofsplatz >> 96444 Coburg >> Telefon: >> >> 09561 96-44110 >> >> Telefax: >> >> 09561 96-44104 >> >> E-Mail: >> >> Renar.Grunenberg at huk-coburg.de >> >> Internet: >> >> www.huk.de >> >> ________________________________ >> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >> Deutschlands a. G. in Coburg >> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >> Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >> ________________________________ >> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >> Informationen. >> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich >> erhalten haben, >> informieren Sie bitte sofort den Absender und vernichten Sie diese >> Nachricht. >> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >> ist nicht gestattet. >> >> This information may contain confidential and/or privileged information. >> If you are not the intended recipient (or have received this information >> in error) please notify the >> sender immediately and destroy this information. >> Any unauthorized copying, disclosure or distribution of the material in >> this information is strictly forbidden. >> ________________________________ >> -------------- next part -------------- >> An HTML attachment was scrubbed... 
>> URL: > 0420/0b8e9ffa/attachment-0001.html> >> >> ------------------------------ >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> End of gpfsug-discuss Digest, Vol 75, Issue 34 >> ********************************************** >> > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From coetzee.ray at gmail.com Mon Apr 23 00:23:55 2018 From: coetzee.ray at gmail.com (Ray Coetzee) Date: Mon, 23 Apr 2018 00:23:55 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34 In-Reply-To: References: Message-ID: Hi Jan-Frode We've been told the same regarding mounts using UDP. Our exports are already explicitly configured for TCP and the client's fstab's set to use TCP. It would be infuriating if the clients are trying UDP first irrespective of the mount options configured. Why the problem started specifically last week for both of us is interesting. Kind regards Ray Coetzee Mob: +44 759 704 7060 Skype: ray.coetzee Email: coetzee.ray at gmail.com On Mon, Apr 23, 2018 at 12:02 AM, Jan-Frode Myklebust wrote: > > Yes, I've been struggelig with something similiar this week. Ganesha dying > with SIGABRT -- nothing else logged. After catching a few coredumps, it has > been identified as a problem with some udp-communication during mounts from > solaris clients. Disabling udp as transport on the shares serverside didn't > help. It was suggested to use "mount -o tcp" or whatever the solaris > version of this is -- but we haven't tested this. So far the downgrade to > v2.3.2 has been our workaround. > > PMR: 48669,080,678 > > > -jf > > > On Mon, Apr 23, 2018 at 12:38 AM, Ray Coetzee > wrote: > >> Good evening all >> >> I'm working with IBM on a PMR where ganesha is segfaulting or causing >> kernel panics on one group of CES nodes. >> >> We have 12 identical CES nodes split into two groups of 6 nodes each & >> have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was >> released. >> >> Only one group started having issues Monday morning where ganesha would >> segfault and the mounts would move over to the remaining nodes. >> The remaining nodes then start to fall over like dominos within minutes >> or hours to the point that all CES nodes are "failed" according to >> "mmces node list" and the VIP's are unassigned. >> >> Recovering the nodes are extremely finicky and works for a few minutes or >> hours before segfaulting again. >> Most times a complete stop of Ganesha on all nodes & then only starting >> it on two random nodes allow mounts to recover for a while. >> >> None of the following has helped: >> A reboot of all nodes. >> Refresh CCR config file with mmsdrrestore >> Remove/add CES from nodes. >> Reinstall GPFS & protocol rpms >> Update to 5.0.0-2 >> Fresh reinstall of a node >> Network checks out with no dropped packets on either data or export >> networks. >> >> The only temporary fix so far has been to downrev ganesha to 2.3.2 from >> 2.5.3 on the affected nodes. >> >> While waiting for IBM development, has anyone seen something similar >> maybe? 
>> >> Kind regards >> >> Ray Coetzee >> >> >> >> On Sat, Apr 21, 2018 at 12:00 PM, > umscale.org> wrote: >> >>> Send gpfsug-discuss mailing list submissions to >>> gpfsug-discuss at spectrumscale.org >>> >>> To subscribe or unsubscribe via the World Wide Web, visit >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> or, via email, send a message with subject or body 'help' to >>> gpfsug-discuss-request at spectrumscale.org >>> >>> You can reach the person managing the list at >>> gpfsug-discuss-owner at spectrumscale.org >>> >>> When replying, please edit your Subject line so it is more specific >>> than "Re: Contents of gpfsug-discuss digest..." >>> >>> >>> Today's Topics: >>> >>> 1. Re: UK Meeting - tooling Spectrum Scale (Grunenberg, Renar) >>> 2. Re: UK Meeting - tooling Spectrum Scale >>> (Simon Thompson (IT Research Support)) >>> >>> >>> ---------------------------------------------------------------------- >>> >>> Message: 1 >>> Date: Fri, 20 Apr 2018 14:01:55 +0000 >>> From: "Grunenberg, Renar" >>> To: "'gpfsug-discuss at spectrumscale.org'" >>> >>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>> Message-ID: >>> Content-Type: text/plain; charset="utf-8" >>> >>> Hallo Simon, >>> are there any reason why the link of the presentation from Yong ZY >>> Zheng(Cognitive, ML, Hortonworks) is not linked. >>> >>> Renar Grunenberg >>> Abteilung Informatik ? Betrieb >>> >>> HUK-COBURG >>> Bahnhofsplatz >>> 96444 Coburg >>> Telefon: 09561 96-44110 >>> Telefax: 09561 96-44104 >>> E-Mail: Renar.Grunenberg at huk-coburg.de >>> Internet: www.huk.de >>> ________________________________ >>> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >>> Deutschlands a. G. in Coburg >>> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >>> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >>> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >>> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >>> Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >>> ________________________________ >>> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >>> Informationen. >>> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht >>> irrt?mlich erhalten haben, >>> informieren Sie bitte sofort den Absender und vernichten Sie diese >>> Nachricht. >>> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >>> ist nicht gestattet. >>> >>> This information may contain confidential and/or privileged information. >>> If you are not the intended recipient (or have received this information >>> in error) please notify the >>> sender immediately and destroy this information. >>> Any unauthorized copying, disclosure or distribution of the material in >>> this information is strictly forbidden. >>> ________________________________ >>> -------------- next part -------------- >>> An HTML attachment was scrubbed... >>> URL: >> 0420/91e3d84d/attachment-0001.html> >>> >>> ------------------------------ >>> >>> Message: 2 >>> Date: Fri, 20 Apr 2018 14:12:11 +0000 >>> From: "Simon Thompson (IT Research Support)" >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>> Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6 at bham.ac.uk> >>> Content-Type: text/plain; charset="utf-8" >>> >>> Sorry, it was a typo from my side. >>> >>> The talks that are missing we are chasing for copies of the slides that >>> we can release. 
>>> >>> Simon >>> >>> From: on behalf of " >>> Renar.Grunenberg at huk-coburg.de" >>> Reply-To: "gpfsug-discuss at spectrumscale.org" < >>> gpfsug-discuss at spectrumscale.org> >>> Date: Friday, 20 April 2018 at 15:02 >>> To: "gpfsug-discuss at spectrumscale.org" >> > >>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>> >>> Hallo Simon, >>> are there any reason why the link of the presentation from Yong ZY >>> Zheng(Cognitive, ML, Hortonworks) is not linked. >>> >>> Renar Grunenberg >>> Abteilung Informatik ? Betrieb >>> >>> HUK-COBURG >>> Bahnhofsplatz >>> 96444 Coburg >>> Telefon: >>> >>> 09561 96-44110 >>> >>> Telefax: >>> >>> 09561 96-44104 >>> >>> E-Mail: >>> >>> Renar.Grunenberg at huk-coburg.de >>> >>> Internet: >>> >>> www.huk.de >>> >>> ________________________________ >>> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >>> Deutschlands a. G. in Coburg >>> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >>> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >>> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >>> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >>> Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >>> ________________________________ >>> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >>> Informationen. >>> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht >>> irrt?mlich erhalten haben, >>> informieren Sie bitte sofort den Absender und vernichten Sie diese >>> Nachricht. >>> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >>> ist nicht gestattet. >>> >>> This information may contain confidential and/or privileged information. >>> If you are not the intended recipient (or have received this information >>> in error) please notify the >>> sender immediately and destroy this information. >>> Any unauthorized copying, disclosure or distribution of the material in >>> this information is strictly forbidden. >>> ________________________________ >>> -------------- next part -------------- >>> An HTML attachment was scrubbed... >>> URL: >> 0420/0b8e9ffa/attachment-0001.html> >>> >>> ------------------------------ >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> End of gpfsug-discuss Digest, Vol 75, Issue 34 >>> ********************************************** >>> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Mon Apr 23 06:00:26 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Mon, 23 Apr 2018 05:00:26 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34 In-Reply-To: References: Message-ID: It started for me after upgrade from v4.2.x.x to 5.0.0.1 with RHEL7.4. Strangely not immediately, but 2 days after the upgrade (wednesday evening CET). Also I have some doubts that mount -o tcp will help, since TCP should already be the default transport. Have asked for if we can rather block this serverside using iptables. But, I expect we should get a fix soon, and we?ll stick with v2.3.2 until that. -jf man. 23. apr. 2018 kl. 
01:23 skrev Ray Coetzee : > Hi Jan-Frode > We've been told the same regarding mounts using UDP. > Our exports are already explicitly configured for TCP and the client's > fstab's set to use TCP. > It would be infuriating if the clients are trying UDP first irrespective > of the mount options configured. > > Why the problem started specifically last week for both of us is > interesting. > > Kind regards > > Ray Coetzee > Mob: +44 759 704 7060 > > Skype: ray.coetzee > > Email: coetzee.ray at gmail.com > > > On Mon, Apr 23, 2018 at 12:02 AM, Jan-Frode Myklebust > wrote: > >> >> Yes, I've been struggelig with something similiar this week. Ganesha >> dying with SIGABRT -- nothing else logged. After catching a few coredumps, >> it has been identified as a problem with some udp-communication during >> mounts from solaris clients. Disabling udp as transport on the shares >> serverside didn't help. It was suggested to use "mount -o tcp" or whatever >> the solaris version of this is -- but we haven't tested this. So far the >> downgrade to v2.3.2 has been our workaround. >> >> PMR: 48669,080,678 >> >> >> -jf >> >> >> On Mon, Apr 23, 2018 at 12:38 AM, Ray Coetzee >> wrote: >> >>> Good evening all >>> >>> I'm working with IBM on a PMR where ganesha is segfaulting or causing >>> kernel panics on one group of CES nodes. >>> >>> We have 12 identical CES nodes split into two groups of 6 nodes each & >>> have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was >>> released. >>> >>> Only one group started having issues Monday morning where ganesha would >>> segfault and the mounts would move over to the remaining nodes. >>> The remaining nodes then start to fall over like dominos within minutes >>> or hours to the point that all CES nodes are "failed" according to >>> "mmces node list" and the VIP's are unassigned. >>> >>> Recovering the nodes are extremely finicky and works for a few minutes >>> or hours before segfaulting again. >>> Most times a complete stop of Ganesha on all nodes & then only starting >>> it on two random nodes allow mounts to recover for a while. >>> >>> None of the following has helped: >>> A reboot of all nodes. >>> Refresh CCR config file with mmsdrrestore >>> Remove/add CES from nodes. >>> Reinstall GPFS & protocol rpms >>> Update to 5.0.0-2 >>> Fresh reinstall of a node >>> Network checks out with no dropped packets on either data or export >>> networks. >>> >>> The only temporary fix so far has been to downrev ganesha to 2.3.2 from >>> 2.5.3 on the affected nodes. >>> >>> While waiting for IBM development, has anyone seen something similar >>> maybe? >>> >>> Kind regards >>> >>> Ray Coetzee >>> >>> >>> >>> On Sat, Apr 21, 2018 at 12:00 PM, < >>> gpfsug-discuss-request at spectrumscale.org> wrote: >>> >>>> Send gpfsug-discuss mailing list submissions to >>>> gpfsug-discuss at spectrumscale.org >>>> >>>> To subscribe or unsubscribe via the World Wide Web, visit >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> or, via email, send a message with subject or body 'help' to >>>> gpfsug-discuss-request at spectrumscale.org >>>> >>>> You can reach the person managing the list at >>>> gpfsug-discuss-owner at spectrumscale.org >>>> >>>> When replying, please edit your Subject line so it is more specific >>>> than "Re: Contents of gpfsug-discuss digest..." >>>> >>>> >>>> Today's Topics: >>>> >>>> 1. Re: UK Meeting - tooling Spectrum Scale (Grunenberg, Renar) >>>> 2. 
Re: UK Meeting - tooling Spectrum Scale >>>> (Simon Thompson (IT Research Support)) >>>> >>>> >>>> ---------------------------------------------------------------------- >>>> >>>> Message: 1 >>>> Date: Fri, 20 Apr 2018 14:01:55 +0000 >>>> From: "Grunenberg, Renar" >>>> To: "'gpfsug-discuss at spectrumscale.org'" >>>> >>>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>>> Message-ID: >>>> Content-Type: text/plain; charset="utf-8" >>>> >>>> Hallo Simon, >>>> are there any reason why the link of the presentation from Yong ZY >>>> Zheng(Cognitive, ML, Hortonworks) is not linked. >>>> >>>> Renar Grunenberg >>>> Abteilung Informatik ? Betrieb >>>> >>>> HUK-COBURG >>>> Bahnhofsplatz >>>> 96444 Coburg >>>> Telefon: 09561 96-44110 >>>> Telefax: 09561 96-44104 >>>> E-Mail: Renar.Grunenberg at huk-coburg.de >>>> Internet: www.huk.de >>>> ________________________________ >>>> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >>>> Deutschlands a. G. in Coburg >>>> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >>>> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >>>> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >>>> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >>>> Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >>>> ________________________________ >>>> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >>>> Informationen. >>>> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht >>>> irrt?mlich erhalten haben, >>>> informieren Sie bitte sofort den Absender und vernichten Sie diese >>>> Nachricht. >>>> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >>>> ist nicht gestattet. >>>> >>>> This information may contain confidential and/or privileged information. >>>> If you are not the intended recipient (or have received this >>>> information in error) please notify the >>>> sender immediately and destroy this information. >>>> Any unauthorized copying, disclosure or distribution of the material in >>>> this information is strictly forbidden. >>>> ________________________________ >>>> -------------- next part -------------- >>>> An HTML attachment was scrubbed... >>>> URL: < >>>> http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20180420/91e3d84d/attachment-0001.html >>>> > >>>> >>>> ------------------------------ >>>> >>>> Message: 2 >>>> Date: Fri, 20 Apr 2018 14:12:11 +0000 >>>> From: "Simon Thompson (IT Research Support)" >>>> To: gpfsug main discussion list >>>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>>> Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6 at bham.ac.uk> >>>> Content-Type: text/plain; charset="utf-8" >>>> >>>> Sorry, it was a typo from my side. >>>> >>>> The talks that are missing we are chasing for copies of the slides that >>>> we can release. >>>> >>>> Simon >>>> >>>> From: on behalf of " >>>> Renar.Grunenberg at huk-coburg.de" >>>> Reply-To: "gpfsug-discuss at spectrumscale.org" < >>>> gpfsug-discuss at spectrumscale.org> >>>> Date: Friday, 20 April 2018 at 15:02 >>>> To: "gpfsug-discuss at spectrumscale.org" < >>>> gpfsug-discuss at spectrumscale.org> >>>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>>> >>>> Hallo Simon, >>>> are there any reason why the link of the presentation from Yong ZY >>>> Zheng(Cognitive, ML, Hortonworks) is not linked. >>>> >>>> Renar Grunenberg >>>> Abteilung Informatik ? 
Betrieb >>>> >>>> HUK-COBURG >>>> Bahnhofsplatz >>>> 96444 Coburg >>>> Telefon: >>>> >>>> 09561 96-44110 >>>> >>>> Telefax: >>>> >>>> 09561 96-44104 >>>> >>>> E-Mail: >>>> >>>> Renar.Grunenberg at huk-coburg.de >>>> >>>> Internet: >>>> >>>> www.huk.de >>>> >>>> ________________________________ >>>> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >>>> Deutschlands a. G. in Coburg >>>> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >>>> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >>>> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >>>> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >>>> Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >>>> ________________________________ >>>> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >>>> Informationen. >>>> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht >>>> irrt?mlich erhalten haben, >>>> informieren Sie bitte sofort den Absender und vernichten Sie diese >>>> Nachricht. >>>> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >>>> ist nicht gestattet. >>>> >>>> This information may contain confidential and/or privileged information. >>>> If you are not the intended recipient (or have received this >>>> information in error) please notify the >>>> sender immediately and destroy this information. >>>> Any unauthorized copying, disclosure or distribution of the material in >>>> this information is strictly forbidden. >>>> ________________________________ >>>> -------------- next part -------------- >>>> An HTML attachment was scrubbed... >>>> URL: < >>>> http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20180420/0b8e9ffa/attachment-0001.html >>>> > >>>> >>>> ------------------------------ >>>> >>>> _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>>> >>>> End of gpfsug-discuss Digest, Vol 75, Issue 34 >>>> ********************************************** >>>> >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Mon Apr 23 11:56:19 2018 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Mon, 23 Apr 2018 16:26:19 +0530 Subject: [gpfsug-discuss] AFM cache re-link In-Reply-To: References: Message-ID: What is the fileset mode ? AFM won't attempt to copy the data back to home if file data already exists (checks if file size, mtime with nano seconds granularity and number of data blocks allocated are same). For example rsync version >= 3.1.0 keeps file mtime in sync with nano seconds granularity. Copy the data from old home to new home and run failover command from cache to avoid resynching the entire data. ~Venkat (vpuvvada at in.ibm.com) From: Nick Savva To: "'gpfsug-discuss at spectrumscale.org'" Date: 04/22/2018 05:48 PM Subject: [gpfsug-discuss] AFM cache re-link Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, I was always preface my questions with an apology first up if this has been covered before. I am curious if anyone has tested relinking an AFM cache to a new home where the new home, old home and cache have the exact same data. What is the behaviour? 
The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case, I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production. Where prod/dr are gpfs filesystems with a replica set of data. Again its to avoid moving TB?s across the link that are already there. Appreciate the help in advance, Nick _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=nXbwwQdO-Ul1CumnSmAKP5UCePJCaBVsley8z-eLJgw&s=Rho3eJsFXeOseZuGqDzP33yLYKUUpyIA1DUGGtmx_LU&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Mon Apr 23 15:10:41 2018 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Mon, 23 Apr 2018 14:10:41 +0000 Subject: [gpfsug-discuss] afmPrepopEnd Callback Message-ID: Good Afternoon AFM Experts, I looked in the manual for afmPreopopEnd event variables I can extract to log something useful after a prefetch event completes. Here is the manual entry: %prepopAlreadyCachedFiles Specifies the number of files that are cached. These number of files are not read into cache because data is same between cache and home. However, when I try to install a callback like this, I get the associated error: # mmaddcallback afmCompletionReport --command /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName %filesetName %prepopCompletedReads %prepopFailedReads %prepopAlreadyCachedFiles %prepopData" mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was specified. mmaddcallback: Command failed. Examine previous error messages to determine cause. I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three %prepop variables listed: %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; %prepopdata ) validCallbackVariable="%prepopData";; Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? Unusual to see the manual ahead of the code ;) Cheers, Luke -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Apr 23 16:08:14 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 23 Apr 2018 15:08:14 +0000 Subject: [gpfsug-discuss] afmPrepopEnd Callback In-Reply-To: References: Message-ID: My very unconsidered and unsupported suggestion would be to edit mmfsfuncs on your test cluster and see if it?s actually implemented further in the code ? Simon From: on behalf of "luke.raimbach at googlemail.com" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Monday, 23 April 2018 at 15:11 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] afmPrepopEnd Callback Good Afternoon AFM Experts, I looked in the manual for afmPreopopEnd event variables I can extract to log something useful after a prefetch event completes. Here is the manual entry: %prepopAlreadyCachedFiles Specifies the number of files that are cached. These number of files are not read into cache because data is same between cache and home. 
However, when I try to install a callback like this, I get the associated error: # mmaddcallback afmCompletionReport --command /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName %filesetName %prepopCompletedReads %prepopFailedReads %prepopAlreadyCachedFiles %prepopData" mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was specified. mmaddcallback: Command failed. Examine previous error messages to determine cause. I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three %prepop variables listed: %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; %prepopdata ) validCallbackVariable="%prepopData";; Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? Unusual to see the manual ahead of the code ;) Cheers, Luke -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Mon Apr 23 20:54:41 2018 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Mon, 23 Apr 2018 19:54:41 +0000 Subject: [gpfsug-discuss] afmPrepopEnd Callback In-Reply-To: References: Message-ID: Hi Simon, Thanks for the consideration. It's a little difficult, though, to give such a flannel answer to a customer, when the manual says one thing and then the supporting code doesn't exist. I had walked through how the callback might might be constructed with the customer and then put together a simple demo script to help them program things in the future. Slightly red faced when I got rejected by the terminal! Can someone from IBM say which callback parameters are actually valid and supported? I'm programming against 4.2.3.8 in this instance. Cheers, Luke. On Mon, 23 Apr 2018, 17:08 Simon Thompson (IT Research Support), < S.J.Thompson at bham.ac.uk> wrote: > My very unconsidered and unsupported suggestion would be to edit mmfsfuncs > on your test cluster and see if it?s actually implemented further in the > code ? > > > > Simon > > > > *From: * on behalf of " > luke.raimbach at googlemail.com" > *Reply-To: *"gpfsug-discuss at spectrumscale.org" < > gpfsug-discuss at spectrumscale.org> > *Date: *Monday, 23 April 2018 at 15:11 > *To: *"gpfsug-discuss at spectrumscale.org" > > *Subject: *[gpfsug-discuss] afmPrepopEnd Callback > > > > Good Afternoon AFM Experts, > > > > I looked in the manual for afmPreopopEnd event variables I can extract to > log something useful after a prefetch event completes. Here is the manual > entry: > > > > %prepopAlreadyCachedFiles > > Specifies the number of files that are cached. > > These number of files are not read into cache > > because data is same between cache and home. > > > > However, when I try to install a callback like this, I get the associated > error: > > > > # mmaddcallback afmCompletionReport --command > /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName > %filesetName %prepopCompletedReads %prepopFailedReads > %prepopAlreadyCachedFiles %prepopData" > > mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was > specified. > > mmaddcallback: Command failed. Examine previous error messages to > determine cause. 
> > > > I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three > %prepop variables listed: > > > > %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; > > %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; > > %prepopdata ) validCallbackVariable="%prepopData";; > > > > Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? > > > > Unusual to see the manual ahead of the code ;) > > > > Cheers, > > Luke > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nick.savva at adventone.com Tue Apr 24 07:47:54 2018 From: nick.savva at adventone.com (Nick Savva) Date: Tue, 24 Apr 2018 06:47:54 +0000 Subject: [gpfsug-discuss] AFM cache re-link In-Reply-To: References: Message-ID: The caches are RO. Thanks that?s exactly what I tested, its just the infocenter threw me when it said it expects the home to be empty?.. This was the command I used mmafmctl cachefs1 failover -j NICKTESTFSET --new-target nfs://10.0.0.142/ibm/scalefs2/fsettest Appreciate the confirmation Nick From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Venkateswara R Puvvada Sent: Monday, 23 April 2018 8:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM cache re-link What is the fileset mode ? AFM won't attempt to copy the data back to home if file data already exists (checks if file size, mtime with nano seconds granularity and number of data blocks allocated are same). For example rsync version >= 3.1.0 keeps file mtime in sync with nano seconds granularity. Copy the data from old home to new home and run failover command from cache to avoid resynching the entire data. ~Venkat (vpuvvada at in.ibm.com) From: Nick Savva > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 04/22/2018 05:48 PM Subject: [gpfsug-discuss] AFM cache re-link Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi all, I was always preface my questions with an apology first up if this has been covered before. I am curious if anyone has tested relinking an AFM cache to a new home where the new home, old home and cache have the exact same data. What is the behaviour? The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case, I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production. Where prod/dr are gpfs filesystems with a replica set of data. Again its to avoid moving TB?s across the link that are already there. Appreciate the help in advance, Nick _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=nXbwwQdO-Ul1CumnSmAKP5UCePJCaBVsley8z-eLJgw&s=Rho3eJsFXeOseZuGqDzP33yLYKUUpyIA1DUGGtmx_LU&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vpuvvada at in.ibm.com Tue Apr 24 08:38:17 2018 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 24 Apr 2018 13:08:17 +0530 Subject: [gpfsug-discuss] afmPrepopEnd Callback In-Reply-To: References: Message-ID: Hi Luke, This issue has been fixed now. You could either request efix or try workaround as suggested by Simon. The following parameters are supported. prepopCompletedReads prepopFailedReads prepopData This one is missing from the mmfsfuncs and is fixed now. prepopAlreadyCachedFiles ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 04/24/2018 01:25 AM Subject: Re: [gpfsug-discuss] afmPrepopEnd Callback Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Simon, Thanks for the consideration. It's a little difficult, though, to give such a flannel answer to a customer, when the manual says one thing and then the supporting code doesn't exist. I had walked through how the callback might might be constructed with the customer and then put together a simple demo script to help them program things in the future. Slightly red faced when I got rejected by the terminal! Can someone from IBM say which callback parameters are actually valid and supported? I'm programming against 4.2.3.8 in this instance. Cheers, Luke. On Mon, 23 Apr 2018, 17:08 Simon Thompson (IT Research Support), < S.J.Thompson at bham.ac.uk> wrote: My very unconsidered and unsupported suggestion would be to edit mmfsfuncs on your test cluster and see if it?s actually implemented further in the code ? Simon From: on behalf of " luke.raimbach at googlemail.com" Reply-To: "gpfsug-discuss at spectrumscale.org" < gpfsug-discuss at spectrumscale.org> Date: Monday, 23 April 2018 at 15:11 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] afmPrepopEnd Callback Good Afternoon AFM Experts, I looked in the manual for afmPreopopEnd event variables I can extract to log something useful after a prefetch event completes. Here is the manual entry: %prepopAlreadyCachedFiles Specifies the number of files that are cached. These number of files are not read into cache because data is same between cache and home. However, when I try to install a callback like this, I get the associated error: # mmaddcallback afmCompletionReport --command /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName %filesetName %prepopCompletedReads %prepopFailedReads %prepopAlreadyCachedFiles %prepopData" mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was specified. mmaddcallback: Command failed. Examine previous error messages to determine cause. I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three %prepop variables listed: %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; %prepopdata ) validCallbackVariable="%prepopData";; Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? 
Unusual to see the manual ahead of the code ;) Cheers, Luke _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=CKY14hxZ-5Ur87lPVdFwwcpuP1lfw-0_vyYhZCcf1pk&s=C058esOcmGSwBjnUblCLIJEpF4CKsXAos0Ap57R6A4Q&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Tue Apr 24 08:42:25 2018 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 24 Apr 2018 13:12:25 +0530 Subject: [gpfsug-discuss] AFM cache re-link In-Reply-To: References: Message-ID: RO cache filesets doesn't support failover command. Is NICKTESTFSET RO mode fileset ? >The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. mmafmctl failover/resync commands does not remove extra files at home, if home is empty this won't be an issue. ~Venkat (vpuvvada at in.ibm.com) From: Nick Savva To: gpfsug main discussion list Date: 04/24/2018 12:18 PM Subject: Re: [gpfsug-discuss] AFM cache re-link Sent by: gpfsug-discuss-bounces at spectrumscale.org The caches are RO. Thanks that?s exactly what I tested, its just the infocenter threw me when it said it expects the home to be empty?.. This was the command I used mmafmctl cachefs1 failover -j NICKTESTFSET --new-target nfs://10.0.0.142/ibm/scalefs2/fsettest Appreciate the confirmation Nick From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Venkateswara R Puvvada Sent: Monday, 23 April 2018 8:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM cache re-link What is the fileset mode ? AFM won't attempt to copy the data back to home if file data already exists (checks if file size, mtime with nano seconds granularity and number of data blocks allocated are same). For example rsync version >= 3.1.0 keeps file mtime in sync with nano seconds granularity. Copy the data from old home to new home and run failover command from cache to avoid resynching the entire data. ~Venkat (vpuvvada at in.ibm.com) From: Nick Savva To: "'gpfsug-discuss at spectrumscale.org'" < gpfsug-discuss at spectrumscale.org> Date: 04/22/2018 05:48 PM Subject: [gpfsug-discuss] AFM cache re-link Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, I was always preface my questions with an apology first up if this has been covered before. I am curious if anyone has tested relinking an AFM cache to a new home where the new home, old home and cache have the exact same data. What is the behaviour? The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case, I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production. Where prod/dr are gpfs filesystems with a replica set of data. Again its to avoid moving TB?s across the link that are already there. 
Appreciate the help in advance, Nick _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Apr 24 10:20:41 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 24 Apr 2018 09:20:41 +0000 Subject: [gpfsug-discuss] Converting a dependent fileset to independent Message-ID: Hi all, Is there any way, without starting over, to convert a dependent fileset to an independent one? My gut says no, but in the spirit of not making unnecessary work I wanted to ask. Also, the documentation states that I should see "dpnd" next to a dependent fileset when I run mmlsfileset with -L; this is not the case, even though the parent fileset is root here.

[root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L
Filesets in file system 'gpfs':
Name                            Id  RootInode  ParentId  Created                   InodeSpace  MaxInodes  AllocInodes  Comment
studentrecruitmentandoutreach  241    8700824         0  Wed Feb 14 14:25:49 2018           0          0            0

Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Tue Apr 24 11:42:47 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Tue, 24 Apr 2018 12:42:47 +0200 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Message-ID: Hello, World, I'd asked this on dW last week but got no reactions: I am unsure about the viability of this approach: In a given file system, there are two internal storage pools and one external (GHI/HPSS). Certain files are created in one internal pool and are to be migrated to the other internal pool about one day after file closure. In addition, all files are to be (pre-)migrated to the external pool within one hour after closure. Hence: typically, blocks of a file that has already been pre-migrated to external (but not purged from the initial internal pool) are to be moved from one internal pool to the other. I found a slight indication that there might be some issue with such an approach in a dW post from some years back: https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014797943&ps=25 However, time has passed, and this might work now or it might not. Is there some knowledge about this in the community? If it is not a reasonable way, we would do the internal migration before the external one, but that imposes a temporal dependency we'd not have otherwise. Mit freundlichen Grüßen / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr.
7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From makaplan at us.ibm.com Tue Apr 24 13:38:16 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 24 Apr 2018 08:38:16 -0400 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: To address this question or problem, please be more specific about: 1) commands and policy rules used to perform or control the migrations and pre-migrations. 2) how the commands are scheduled, how often, and/or by events and mmXXcallbacks. 3) how many files are "okay" and how many are ill-placed. 4) how many blocks or KB of storage are ill-placed. (One could determine this by looking at pool free/used blocks stats, then running restripe to correct all ill-placements and then looking at pool free/used stats again.) 5) As commands and migrations are executed, what is happening that you do not understand or that you believe is incorrect? Marc K of GPFS (aka "Mister Mmapplypolicy") From: "Uwe Falke" To: "gpfsug main discussion list" Date: 04/24/2018 06:43 AM Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, World, I'd asked this on dW last week but got no reactions: I am unsure on the viability of this approach: In a given file system, there are two internal storage pools and one external (GHI/HPSS). Certain files are created in one internal pool and are to be migrated to the other internal pool about one day after file closure. In addition, all files are to be (pre-)migrated to the external pool within one hour after closure. Hence: typically, blocks of a file being already pre-migrated to external (but not purged on the initial internal pool) are to be moved from one internal pool to the other. A slight indication that there might be some issue with such an approach I've found in a dW post some years back: https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014797943&ps=25 However, time has passed and this might work now or might not. Is there some knowledge about that in the community? If it would not be a reasonable way, we would do the internal migration before the external one, but that imposes a timely dependance we'd not have otherwise. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? 
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=j-JJhNxx8XhgizXxQDuJolplgYxVRVTS6zalCPshD-0&s=oHJJ8ZT4qE3GTSPKyNRVdybeeCQHTeyoDiLg5CvC5JM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Tue Apr 24 13:49:05 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 24 Apr 2018 08:49:05 -0400 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: To help make sense of this, one has to understand that "independent" means a different range of inode numbers. If you have a set of files within one range of inode numbers, say 3000-5000 and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have. During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories. From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 04/24/2018 05:20 AM Subject: [gpfsug-discuss] Converting a dependent fileset to independent Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, Is there any way, without starting over, to convert a dependent fileset to independent? My gut says no but in the spirit of not making unnecessary work I wanted to ask. Also, the documentation states that I should see a ?dpnd? next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. [root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L Filesets in file system 'gpfs': Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Tue Apr 24 16:08:03 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Tue, 24 Apr 2018 17:08:03 +0200 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: Hi, the system is not in action yet. I am just to plan the migrations right now. premigration would be (not listing excludes here, actual cos and FS name replaced by nn, xxx): /* Migrate all files that are smaller than 1 GB to hpss as aggregates. 
*/ RULE 'toHsm_aggr_cosnn' MIGRATE FROM POOL 'pool1' WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME) TO POOL 'hsm_aggr_cosnn' SHOW ('-s' FILE_SIZE) WHERE path_name like '/xxx/%' AND FILE_SIZE <= 1073741824 AND (CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL '120' MINUTES) Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Marc A Kaplan" To: gpfsug main discussion list Date: 24/04/2018 14:38 Subject: Re: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org To address this question or problem, please be more specific about: 1) commands and policy rules used to perform or control the migrations and pre-migrations. 2) how the commands are scheduled, how often, and/or by events and mmXXcallbacks. 3) how many files are "okay" and how many are ill-placed. 4) how many blocks or KB of storage are ill-placed. (One could determine this by looking at pool free/used blocks stats, then running restripe to correct all ill-placements and then looking at pool free/used stats again.) 5) As commands and migrations are executed, what is happening that you do not understand or that you believe is incorrect? Marc K of GPFS (aka "Mister Mmapplypolicy") From: "Uwe Falke" To: "gpfsug main discussion list" Date: 04/24/2018 06:43 AM Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, World, I'd asked this on dW last week but got no reactions: I am unsure on the viability of this approach: In a given file system, there are two internal storage pools and one external (GHI/HPSS). Certain files are created in one internal pool and are to be migrated to the other internal pool about one day after file closure. In addition, all files are to be (pre-)migrated to the external pool within one hour after closure. Hence: typically, blocks of a file being already pre-migrated to external (but not purged on the initial internal pool) are to be moved from one internal pool to the other. A slight indication that there might be some issue with such an approach I've found in a dW post some years back: https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014797943&ps=25 However, time has passed and this might work now or might not. Is there some knowledge about that in the community? If it would not be a reasonable way, we would do the internal migration before the external one, but that imposes a timely dependance we'd not have otherwise. Mit freundlichen Gr??en / Kind regards Dr. 
Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: Thomas Wolter, Sven Schooß Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From UWEFALKE at de.ibm.com Tue Apr 24 16:20:14 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Tue, 24 Apr 2018 17:20:14 +0200 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: (Sorry, vi commands and my mailing client do not co-operate well... Please forgive the premature posting just sent.) Hi, the system is not in action yet; I am just planning the migrations right now. 1) + 2) Premigration would be (not listing excludes here; the actual cos and FS name are replaced by nn, xxx), executed once per hour:

RULE EXTERNAL POOL 'hsm_aggr_cosnn'
     EXEC '/opt/hpss/bin/ghi_migrate'
     OPTS '-a -c nn'

/* Migrate all files that are smaller than 1 GB to hpss as aggregates. */
RULE 'toHsm_aggr_cosnn' MIGRATE
     FROM POOL 'pool0'
     WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME)
     TO POOL 'hsm_aggr_cosnn'
     SHOW ('-s' FILE_SIZE)
     WHERE path_name like '/xxx/%'
       AND FILE_SIZE <= 1073741824
       AND (CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL '120' MINUTES)

Internal migration was originally intended like the rule below, to be run once per day (plus a threshold policy to prevent the relatively small pool0 from filling up should something go wrong):

RULE "pool0_to_pool1" MIGRATE
     FROM POOL 'pool0'
     TO POOL 'pool1'
     WHERE (CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL '720' MINUTES)

3) + 4) As that is not yet set up, there is no answer; I would like to prevent such things from happening in the first place, therefore I am asking. 5) I do not understand your question. I am planning the migration setup, and I suppose there might be a risk of trouble when doing internal migrations while having data pre-migrated to external. Indeed I am not a developer of the ILM stuff in Scale, so I cannot fully foresee what'll happen. Therefore I am asking. Mit freundlichen Grüßen / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr.
7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Marc A Kaplan" To: gpfsug main discussion list Date: 24/04/2018 14:38 Subject: Re: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org To address this question or problem, please be more specific about: 1) commands and policy rules used to perform or control the migrations and pre-migrations. 2) how the commands are scheduled, how often, and/or by events and mmXXcallbacks. 3) how many files are "okay" and how many are ill-placed. 4) how many blocks or KB of storage are ill-placed. (One could determine this by looking at pool free/used blocks stats, then running restripe to correct all ill-placements and then looking at pool free/used stats again.) 5) As commands and migrations are executed, what is happening that you do not understand or that you believe is incorrect? Marc K of GPFS (aka "Mister Mmapplypolicy") From: "Uwe Falke" To: "gpfsug main discussion list" Date: 04/24/2018 06:43 AM Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, World, I'd asked this on dW last week but got no reactions: I am unsure on the viability of this approach: In a given file system, there are two internal storage pools and one external (GHI/HPSS). Certain files are created in one internal pool and are to be migrated to the other internal pool about one day after file closure. In addition, all files are to be (pre-)migrated to the external pool within one hour after closure. Hence: typically, blocks of a file being already pre-migrated to external (but not purged on the initial internal pool) are to be moved from one internal pool to the other. A slight indication that there might be some issue with such an approach I've found in a dW post some years back: https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014797943&ps=25 However, time has passed and this might work now or might not. Is there some knowledge about that in the community? If it would not be a reasonable way, we would do the internal migration before the external one, but that imposes a timely dependance we'd not have otherwise. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? 
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=j-JJhNxx8XhgizXxQDuJolplgYxVRVTS6zalCPshD-0&s=oHJJ8ZT4qE3GTSPKyNRVdybeeCQHTeyoDiLg5CvC5JM&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From makaplan at us.ibm.com Tue Apr 24 17:00:39 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 24 Apr 2018 12:00:39 -0400 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: I added support to mmapplypolicy & co for HPSS (and TSM/HSM) years ago. AFAIK, it works pretty well, and pretty much works in a not-too-surprising way. I believe that besides the Spectrum Scale Admin and Command docs there are some "red book"s to help, and some IBM support people know how to use these features. When using mmapplypolicy: a) Migration and pre-migration to an external pool requires at least two rules: An EXTERNAL POOL rule to define the external pool, say 'xpool' and a MIGRATE ... TO POOL 'xpool' rule. b) Migration between GPFS native pools requires at least one MIGRATE ... FROM POOL 'apool' TO POOL 'bpool' rule. c) During any one execution of mmapplypolicy any one particular file will be subject of at most one MIGRATE rule. In other words file x will be either (pre)MIGRATEd to an external pool. OR MIGRATED between gpfs pools. BUT not both. (Hmm... well you could write a custom migration script and name that script in your EXTERNAL POOL rule and do anything you like to each file x that is chosen for "external" MIGRATion... But still just one MIGRATE rule governs file x.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Tue Apr 24 22:10:31 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 24 Apr 2018 17:10:31 -0400 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: Uwe also asked: whether it is unwise to have the external and the internal migrations in an uncoordinated fashion so that it might happen some files have been migrated to external before they undergo migration from one internal pool (pool0) to the other (pool1) That's up to the admin. IOW coordinate it as you like or not at all, depending on what you're trying to acomplish.. But the admin should understand... Whether you use mmchattr -P newpool or mmapplypolicy/Migrate TO POOL 'newpool' to do an internal, GPFS pool to GPFS pool migration there are two steps: A) Mark the newly chosen, preferred newpool in the file's inode. Then, as long as any data blocks are on GPFS disks that are NOT in newpool, the file is considered "ill-placed". B) Migrate every datablock of the file to 'newpool', by allocating a block in newpool, copy a block of data, updating the file's data pointers, etc, etc. If you say "-I defer" then only (A) is done. You can force (B) later with a restripeXX command. 
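(As a rough sketch of that deferred variant, with placeholder file system, path and file names; 'pool1' is the target pool:

    mmchattr -P pool1 -I defer /fs0/xxx/somefile   # step (A) only: record the new pool, file becomes "ill-placed"
    mmrestripefs fs0 -p                            # step (B) later: move the data blocks of all ill-placed files

)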
If you default or say "-I yes" then (A) is done and (B) is done as part of the work of the same command (mmchattr or mmapplypolicy) - (If the command is interrupted, (B) may happen for some subset of the data blocks, leaving the file "ill-placed") Putting "external" storage into the mix -- you can save time and go faster - if you migrate completely and directly from the original pool - skip the "internal" migrate! Maybe if you're migrating but leaving a first block "stub" - you'll want to migrate to external first, and then migrate just the one block "internally"... On the other hand, if you're going to keep the whole file on GPFS storage for a while, but want to free up space in the original pool, you'll want to migrate the data to a newpool at some point... In that case you might want to pre-migrate (make a copy on HSM but not free the GPFS copy) also. Should you pre-migrate from the original pool or the newpool? Your choice! Maybe you arrange things so you pre-migrate while the data is on the faster pool. Maybe it doesn't make much difference, so you don't even think about it anymore, now that you understand that GPFS doesn't care either! ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Wed Apr 25 00:10:52 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Wed, 25 Apr 2018 01:10:52 +0200 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: Hi, Marc, thx. I understand that being premigrated to an external pool should not affect the internal migration of a file. FYI: This is not the typical "gold - silver - bronze" setup with a one-dimensional migration path. Instead, one of the internal pools (pool0) is used to receive files written in very small records, the other (pool1) is the "normal" pool and receives all other files. Files written to pool0 should move to pool1 once they are closed (i.e. complete), but pool 0 has enough capacity to live without off-migration to pool1 for a few days, thus I'd thought to keep the frequency of that migration to not more than once per day. The external pool serves as a remote async mirror to achieve some resiliency against FS failures and also unintentional file deletion (metadata / SOBAR backups and file listings to keep the HPSS coordinates of GPFS files are done regularly), only in the long run data will be purged from pool1. Thus, migration to external should be done in shorter intervals. Sounds like I can go ahead without hesitation. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? 
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Marc A Kaplan" To: gpfsug main discussion list Date: 24/04/2018 23:10 Subject: Re: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org Uwe also asked: whether it is unwise to have the external and the internal migrations in an uncoordinated fashion so that it might happen some files have been migrated to external before they undergo migration from one internal pool (pool0) to the other (pool1) That's up to the admin. IOW coordinate it as you like or not at all, depending on what you're trying to acomplish.. But the admin should understand... Whether you use mmchattr -P newpool or mmapplypolicy/Migrate TO POOL 'newpool' to do an internal, GPFS pool to GPFS pool migration there are two steps: A) Mark the newly chosen, preferred newpool in the file's inode. Then, as long as any data blocks are on GPFS disks that are NOT in newpool, the file is considered "ill-placed". B) Migrate every datablock of the file to 'newpool', by allocating a block in newpool, copy a block of data, updating the file's data pointers, etc, etc. If you say "-I defer" then only (A) is done. You can force (B) later with a restripeXX command. If you default or say "-I yes" then (A) is done and (B) is done as part of the work of the same command (mmchattr or mmapplypolicy) - (If the command is interrupted, (B) may happen for some subset of the data blocks, leaving the file "ill-placed") Putting "external" storage into the mix -- you can save time and go faster - if you migrate completely and directly from the original pool - skip the "internal" migrate! Maybe if you're migrating but leaving a first block "stub" - you'll want to migrate to external first, and then migrate just the one block "internally"... On the other hand, if you're going to keep the whole file on GPFS storage for a while, but want to free up space in the original pool, you'll want to migrate the data to a newpool at some point... In that case you might want to pre-migrate (make a copy on HSM but not free the GPFS copy) also. Should you pre-migrate from the original pool or the newpool? Your choice! Maybe you arrange things so you pre-migrate while the data is on the faster pool. Maybe it doesn't make much difference, so you don't even think about it anymore, now that you understand that GPFS doesn't care either! ;-) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From valdis.kletnieks at vt.edu Wed Apr 25 01:09:52 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Tue, 24 Apr 2018 20:09:52 -0400 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: <108483.1524614992@turing-police.cc.vt.edu> On Wed, 25 Apr 2018 01:10:52 +0200, "Uwe Falke" said: > Instead, one of the internal pools (pool0) is used to receive files > written in very small records, the other (pool1) is the "normal" pool and > receives all other files. How do you arrange that to happen? As we found out on one of our GPFS clusters, you can't use filesize as a criterion in a file placement policy because it has to pick a pool before it knows what the final filesize will be. 
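(A placement rule can only look at attributes that are known at create time, such as the fileset or the file name. A rough sketch, with made-up fileset name and name pattern, and the pool names from this thread:

    RULE 'place_small' SET POOL 'pool0' WHERE FILESET_NAME = 'smallrec_data' OR NAME LIKE '%.smallrec'
    RULE 'default' SET POOL 'pool1'

)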
(The obvious-to-me method is to set filesets pointed at pools, and then attach fileset to pathnames, and then tell the users "This path is for small files, this one is for other files" and thwap any who get it wrong with a clue-by-four. ;) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From UWEFALKE at de.ibm.com Wed Apr 25 08:48:18 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Wed, 25 Apr 2018 09:48:18 +0200 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: <108483.1524614992@turing-police.cc.vt.edu> References: <108483.1524614992@turing-police.cc.vt.edu> Message-ID: Hi, we rely on some scheme of file names. Splitting by path / fileset does not work here as small- and large-record data have to be co-located. Small-record files will only be recognised if carrying some magic strings in the file name. This is not a normal user system, but ingests data generated automatically, and thus a systematic naming of files is possible to a large extent. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: valdis.kletnieks at vt.edu To: gpfsug main discussion list Date: 25/04/2018 02:10 Subject: Re: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org On Wed, 25 Apr 2018 01:10:52 +0200, "Uwe Falke" said: > Instead, one of the internal pools (pool0) is used to receive files > written in very small records, the other (pool1) is the "normal" pool and > receives all other files. How do you arrange that to happen? As we found out on one of our GPFS clusters, you can't use filesize as a criterion in a file placement policy because it has to pick a pool before it knows what the final filesize will be. (The obvious-to-me method is to set filesets pointed at pools, and then attach fileset to pathnames, and then tell the users "This path is for small files, this one is for other files" and thwap any who get it wrong with a clue-by-four. ;) [attachment "attnayq3.dat" deleted by Uwe Falke/Germany/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Ivano.Talamo at psi.ch Wed Apr 25 09:46:40 2018 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Wed, 25 Apr 2018 10:46:40 +0200 Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 Message-ID: Hi all, I am actually testing the collector shipped with Spectrum Scale 5.0.0-1 together with the latest grafana bridge (version 3). At the UK UG meeting I learned that this is the multi-threaded setup, so hopefully we can get better performances. But we are having a problem. 
Our existing grafana dashboard have metrics like eg. "hostname|CPU|cpu_user". It was working and it also had a very helpful completion when creating new graphs. After the upgrade these metrics are not recognized anymore, and we are getting the following errors in the grafana bridge log file: 2018-04-25 09:35:24,999 - zimonGrafanaIntf - ERROR - Metric hostnameNetwork|team0|netdev_drops_s cannot be found. Please check if the corresponding sensor is configured The only way I found to make them work is using only the real metric name, eg "cpu_user" and then use filter to restrict to a host ('node'='hostname'). The problem is that in many cases the metric is complex, eg. you want to restrict to a filesystem, to a fileset, to a network interface. And is not easy to get the field names to be used in the filters. So my questions are: - is this supposed to be like that or the old metrics name can be enabled somehow? - if it has to be like that, how can I get the available field names to use in the filters? And then I saw in the new collector config file this: queryport = "9084" query2port = "9094" Which one should be used by the bridge? Thank you, Ivano From r.sobey at imperial.ac.uk Wed Apr 25 10:01:33 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 25 Apr 2018 09:01:33 +0000 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: Hi Marc Yes, copying the data to a freshly created fileset and unlinking/renaming/relinking is our solution; it was always a long shot expecting to be able to convert it ? Richard From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Marc A Kaplan Sent: 24 April 2018 13:49 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent To help make sense of this, one has to understand that "independent" means a different range of inode numbers. If you have a set of files within one range of inode numbers, say 3000-5000 and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have. During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories. From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 04/24/2018 05:20 AM Subject: [gpfsug-discuss] Converting a dependent fileset to independent Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi all, Is there any way, without starting over, to convert a dependent fileset to independent? My gut says no but in the spirit of not making unnecessary work I wanted to ask. Also, the documentation states that I should see a ?dpnd? next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. 
[root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L Filesets in file system 'gpfs': Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Wed Apr 25 10:42:08 2018 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Wed, 25 Apr 2018 09:42:08 +0000 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From Renar.Grunenberg at huk-coburg.de Wed Apr 25 11:44:46 2018 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Wed, 25 Apr 2018 10:44:46 +0000 Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 In-Reply-To: References: Message-ID: <0249ecfd68714722a1d90ec196b711b5@SMXRF105.msg.hukrf.de> Hallo Ivano, we change the bridge port to query2port this is the multithreaded Query port. The Bridge in Version3 select these port automatically if the pmcollector config is updated(/opt/IBM/zimon/ZIMonCollector.cfg). # The query port number defaults to 9084. queryport = "9084" query2port = "9094" We use 5.0.0.2 here. What we also change was in the Datasource Panel for the bridge in Grafana the openTSDB to ==2.3. Hope this help. Regards Renar Renar Grunenberg Abteilung Informatik ? Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ======================================================================= HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. ======================================================================= Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ======================================================================= -----Urspr?ngliche Nachricht----- Von: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] Im Auftrag von Ivano Talamo Gesendet: Mittwoch, 25. 
April 2018 10:47 An: gpfsug main discussion list Betreff: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 Hi all, I am actually testing the collector shipped with Spectrum Scale 5.0.0-1 together with the latest grafana bridge (version 3). At the UK UG meeting I learned that this is the multi-threaded setup, so hopefully we can get better performances. But we are having a problem. Our existing grafana dashboard have metrics like eg. "hostname|CPU|cpu_user". It was working and it also had a very helpful completion when creating new graphs. After the upgrade these metrics are not recognized anymore, and we are getting the following errors in the grafana bridge log file: 2018-04-25 09:35:24,999 - zimonGrafanaIntf - ERROR - Metric hostnameNetwork|team0|netdev_drops_s cannot be found. Please check if the corresponding sensor is configured The only way I found to make them work is using only the real metric name, eg "cpu_user" and then use filter to restrict to a host ('node'='hostname'). The problem is that in many cases the metric is complex, eg. you want to restrict to a filesystem, to a fileset, to a network interface. And is not easy to get the field names to be used in the filters. So my questions are: - is this supposed to be like that or the old metrics name can be enabled somehow? - if it has to be like that, how can I get the available field names to use in the filters? And then I saw in the new collector config file this: queryport = "9084" query2port = "9094" Which one should be used by the bridge? Thank you, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Ivano.Talamo at psi.ch Wed Apr 25 12:37:02 2018 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Wed, 25 Apr 2018 13:37:02 +0200 Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 In-Reply-To: <0249ecfd68714722a1d90ec196b711b5@SMXRF105.msg.hukrf.de> References: <0249ecfd68714722a1d90ec196b711b5@SMXRF105.msg.hukrf.de> Message-ID: Hello Renar, I also changed the bridge to openTSDB 2.3 and set it use query2port. I was only not sure that this was the multi-threaded one. But are you using the pipe-based metrics (like "hostname|CPU|cpu_user") or you use filters? Thanks, Ivano Il 25/04/18 12:44, Grunenberg, Renar ha scritto: > Hallo Ivano, > > we change the bridge port to query2port this is the multithreaded Query port. The Bridge in Version3 select these port automatically if the pmcollector config is updated(/opt/IBM/zimon/ZIMonCollector.cfg). > # The query port number defaults to 9084. > queryport = "9084" > query2port = "9094" > We use 5.0.0.2 here. What we also change was in the Datasource Panel for the bridge in Grafana the openTSDB to ==2.3. Hope this help. > > Regards Renar > > > > Renar Grunenberg > Abteilung Informatik ? Betrieb > > HUK-COBURG > Bahnhofsplatz > 96444 Coburg > Telefon: 09561 96-44110 > Telefax: 09561 96-44104 > E-Mail: Renar.Grunenberg at huk-coburg.de > Internet: www.huk.de > ======================================================================= > HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg > Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 > Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg > Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. > Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. 
> ======================================================================= > Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. > Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, > informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. > Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. > > This information may contain confidential and/or privileged information. > If you are not the intended recipient (or have received this information in error) please notify the > sender immediately and destroy this information. > Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. > ======================================================================= > > -----Urspr?ngliche Nachricht----- > Von: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] Im Auftrag von Ivano Talamo > Gesendet: Mittwoch, 25. April 2018 10:47 > An: gpfsug main discussion list > Betreff: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 > > Hi all, > > I am actually testing the collector shipped with Spectrum Scale 5.0.0-1 > together with the latest grafana bridge (version 3). At the UK UG > meeting I learned that this is the multi-threaded setup, so hopefully we > can get better performances. > > But we are having a problem. Our existing grafana dashboard have metrics > like eg. "hostname|CPU|cpu_user". It was working and it also had a very > helpful completion when creating new graphs. > After the upgrade these metrics are not recognized anymore, and we are > getting the following errors in the grafana bridge log file: > > 2018-04-25 09:35:24,999 - zimonGrafanaIntf - ERROR - Metric > hostnameNetwork|team0|netdev_drops_s cannot be found. Please check if > the corresponding sensor is configured > > The only way I found to make them work is using only the real metric > name, eg "cpu_user" and then use filter to restrict to a host > ('node'='hostname'). The problem is that in many cases the metric is > complex, eg. you want to restrict to a filesystem, to a fileset, to a > network interface. And is not easy to get the field names to be used in > the filters. > > So my questions are: > - is this supposed to be like that or the old metrics name can be > enabled somehow? > - if it has to be like that, how can I get the available field names to > use in the filters? > > > And then I saw in the new collector config file this: > > queryport = "9084" > query2port = "9094" > > Which one should be used by the bridge? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From Renar.Grunenberg at huk-coburg.de Wed Apr 25 13:11:53 2018 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Wed, 25 Apr 2018 12:11:53 +0000 Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 In-Reply-To: References: <0249ecfd68714722a1d90ec196b711b5@SMXRF105.msg.hukrf.de> Message-ID: <0815ca423fad48f9b3f149cd2eb1b143@SMXRF105.msg.hukrf.de> Hallo Ivano We use filter only. For cpu_user ->node = pm_filter($byNode) Regards Renar Renar Grunenberg Abteilung Informatik ? 
Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ======================================================================= HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. ======================================================================= Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ======================================================================= -----Urspr?ngliche Nachricht----- Von: Ivano Talamo [mailto:Ivano.Talamo at psi.ch] Gesendet: Mittwoch, 25. April 2018 13:37 An: gpfsug main discussion list ; Grunenberg, Renar Betreff: Re: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 Hello Renar, I also changed the bridge to openTSDB 2.3 and set it use query2port. I was only not sure that this was the multi-threaded one. But are you using the pipe-based metrics (like "hostname|CPU|cpu_user") or you use filters? Thanks, Ivano Il 25/04/18 12:44, Grunenberg, Renar ha scritto: > Hallo Ivano, > > we change the bridge port to query2port this is the multithreaded Query port. The Bridge in Version3 select these port automatically if the pmcollector config is updated(/opt/IBM/zimon/ZIMonCollector.cfg). > # The query port number defaults to 9084. > queryport = "9084" > query2port = "9094" > We use 5.0.0.2 here. What we also change was in the Datasource Panel for the bridge in Grafana the openTSDB to ==2.3. Hope this help. > > Regards Renar > > > > Renar Grunenberg > Abteilung Informatik ? Betrieb > > HUK-COBURG > Bahnhofsplatz > 96444 Coburg > Telefon: 09561 96-44110 > Telefax: 09561 96-44104 > E-Mail: Renar.Grunenberg at huk-coburg.de > Internet: www.huk.de > ======================================================================= > HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg > Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 > Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg > Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. > Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. > ======================================================================= > Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. > Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, > informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. 
> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. > > This information may contain confidential and/or privileged information. > If you are not the intended recipient (or have received this information in error) please notify the > sender immediately and destroy this information. > Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. > ======================================================================= > > -----Urspr?ngliche Nachricht----- > Von: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] Im Auftrag von Ivano Talamo > Gesendet: Mittwoch, 25. April 2018 10:47 > An: gpfsug main discussion list > Betreff: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 > > Hi all, > > I am actually testing the collector shipped with Spectrum Scale 5.0.0-1 > together with the latest grafana bridge (version 3). At the UK UG > meeting I learned that this is the multi-threaded setup, so hopefully we > can get better performances. > > But we are having a problem. Our existing grafana dashboard have metrics > like eg. "hostname|CPU|cpu_user". It was working and it also had a very > helpful completion when creating new graphs. > After the upgrade these metrics are not recognized anymore, and we are > getting the following errors in the grafana bridge log file: > > 2018-04-25 09:35:24,999 - zimonGrafanaIntf - ERROR - Metric > hostnameNetwork|team0|netdev_drops_s cannot be found. Please check if > the corresponding sensor is configured > > The only way I found to make them work is using only the real metric > name, eg "cpu_user" and then use filter to restrict to a host > ('node'='hostname'). The problem is that in many cases the metric is > complex, eg. you want to restrict to a filesystem, to a fileset, to a > network interface. And is not easy to get the field names to be used in > the filters. > > So my questions are: > - is this supposed to be like that or the old metrics name can be > enabled somehow? > - if it has to be like that, how can I get the available field names to > use in the filters? > > > And then I saw in the new collector config file this: > > queryport = "9084" > query2port = "9094" > > Which one should be used by the bridge? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From david_johnson at brown.edu Wed Apr 25 13:14:27 2018 From: david_johnson at brown.edu (david_johnson at brown.edu) Date: Wed, 25 Apr 2018 08:14:27 -0400 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: We use a dependent fileset for each research group / investigator. We do this mainly so we can apply fileset quotas. We tried independent filesets but they were quite inconvenient: 1) limited number independent filesets could be created compared to dependent 2) requirement to manage number of inodes allocated to each and every independent fileset There may have been other issues but we create a new dependent fileset whenever a new researcher joins our cluster. 
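(Roughly, with placeholder names, for a file system 'gpfs01' and a new group 'newlab' that would be something like:

    mmcrfileset gpfs01 newlab                        # dependent fileset: no --inode-space given
    mmlinkfileset gpfs01 newlab -J /gpfs01/newlab    # junction path the group will use
    mmsetquota gpfs01:newlab --block 10T:11T         # per-fileset block quota, soft:hard

)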
-- ddj Dave Johnson > On Apr 25, 2018, at 5:42 AM, Daniel Kidger wrote: > > It would though be a nice to have feature: to leave the file data where it was and just move the metadata into its own inode space? > > A related question though: > In what case do people create new *dependant* filesets ? I can't see many cases where an independent fileset would be just as valid. > Daniel > > > > > Dr Daniel Kidger > IBM Technical Sales Specialist > Software Defined Solution Sales > > +44-(0)7818 522 266 > daniel.kidger at uk.ibm.com > > > > ----- Original message ----- > From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > Date: Wed, Apr 25, 2018 10:01 AM > > Hi Marc > > > > Yes, copying the data to a freshly created fileset and unlinking/renaming/relinking is our solution; it was always a long shot expecting to be able to convert it ? > > > > Richard > > > > From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Marc A Kaplan > Sent: 24 April 2018 13:49 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > > > > To help make sense of this, one has to understand that "independent" means a different range of inode numbers. > > If you have a set of files within one range of inode numbers, say 3000-5000 and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have. > > During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories. > > > > From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 04/24/2018 05:20 AM > Subject: [gpfsug-discuss] Converting a dependent fileset to independent > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > > Hi all, > > Is there any way, without starting over, to convert a dependent fileset to independent? > > My gut says no but in the spirit of not making unnecessary work I wanted to ask. > > Also, the documentation states that I should see a ?dpnd? next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. 
> > [root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L > Filesets in file system 'gpfs': > Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment > studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 > > Thanks > Richard_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=q3YHYji1-ohlnu_yLuOGmxLnmk9znaxUJdg04aGLU_U&s=Axp36hWpcuKZbTMbdVde_ifZreMvOlR5sHUYX3A79hQ&e= > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Wed Apr 25 13:35:00 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 25 Apr 2018 12:35:00 +0000 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: You can apply fileset quotas to independent filesets to you know. Sorry if that sounds passive aggressive, not meant to be! Main reason for us NOT to use dependent filesets is lack of snapshotting. Richard From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of david_johnson at brown.edu Sent: 25 April 2018 13:14 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent We use a dependent fileset for each research group / investigator. We do this mainly so we can apply fileset quotas. We tried independent filesets but they were quite inconvenient: 1) limited number independent filesets could be created compared to dependent 2) requirement to manage number of inodes allocated to each and every independent fileset There may have been other issues but we create a new dependent fileset whenever a new researcher joins our cluster. -- ddj Dave Johnson On Apr 25, 2018, at 5:42 AM, Daniel Kidger > wrote: It would though be a nice to have feature: to leave the file data where it was and just move the metadata into its own inode space? A related question though: In what case do people create new *dependant* filesets ? I can't see many cases where an independent fileset would be just as valid. Daniel [IBM Storage Professional Badge] [Image removed by sender.] 
Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales +44-(0)7818 522 266 daniel.kidger at uk.ibm.com ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list > Cc: Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent Date: Wed, Apr 25, 2018 10:01 AM Hi Marc Yes, copying the data to a freshly created fileset and unlinking/renaming/relinking is our solution; it was always a long shot expecting to be able to convert it ? Richard From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Marc A Kaplan Sent: 24 April 2018 13:49 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent To help make sense of this, one has to understand that "independent" means a different range of inode numbers. If you have a set of files within one range of inode numbers, say 3000-5000 and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have. During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories. From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 04/24/2018 05:20 AM Subject: [gpfsug-discuss] Converting a dependent fileset to independent Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi all, Is there any way, without starting over, to convert a dependent fileset to independent? My gut says no but in the spirit of not making unnecessary work I wanted to ask. Also, the documentation states that I should see a ?dpnd? next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. [root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L Filesets in file system 'gpfs': Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=q3YHYji1-ohlnu_yLuOGmxLnmk9znaxUJdg04aGLU_U&s=Axp36hWpcuKZbTMbdVde_ifZreMvOlR5sHUYX3A79hQ&e= Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ~WRD000.jpg Type: image/jpeg Size: 823 bytes Desc: ~WRD000.jpg URL: From david_johnson at brown.edu Wed Apr 25 14:07:19 2018 From: david_johnson at brown.edu (David Johnson) Date: Wed, 25 Apr 2018 09:07:19 -0400 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: Yes, independent snapshotting would be an issue. However at the moment we have 570 dependent filesets in our main filesystem, which is not all that far from the limit of 1000 independent filesets per filesystem. There was a thread concerning fileset issues back in February, wondering if the limits had changed between 4.x to 5.0.0 releases, but it went unanswered. We would love to be able to use the features independent filesets (quicker traversal by policy engine, snapshots as you mention, etc), but the thought that we could run out of them as our user base grows killed that idea. > On Apr 25, 2018, at 8:35 AM, Sobey, Richard A wrote: > > You can apply fileset quotas to independent filesets to you know. Sorry if that sounds passive aggressive, not meant to be! > > Main reason for us NOT to use dependent filesets is lack of snapshotting. > > Richard > > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of david_johnson at brown.edu > Sent: 25 April 2018 13:14 > To: gpfsug main discussion list > > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > > We use a dependent fileset for each research group / investigator. We do this mainly so we can apply fileset quotas. We tried independent filesets but they were quite inconvenient: > 1) limited number independent filesets could be created compared to dependent > 2) requirement to manage number of inodes allocated to each and every independent fileset > > There may have been other issues but we create a new dependent fileset whenever a new researcher joins our cluster. > -- ddj > Dave Johnson > > On Apr 25, 2018, at 5:42 AM, Daniel Kidger > wrote: > > It would though be a nice to have feature: to leave the file data where it was and just move the metadata into its own inode space? > > A related question though: > In what case do people create new *dependant* filesets ? I can't see many cases where an independent fileset would be just as valid. > Daniel > > > > <~WRD000.jpg> > > > > Dr Daniel Kidger > IBM Technical Sales Specialist > Software Defined Solution Sales > > +44-(0)7818 522 266 > daniel.kidger at uk.ibm.com > > > > ----- Original message ----- > From: "Sobey, Richard A" > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > > Cc: > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > Date: Wed, Apr 25, 2018 10:01 AM > > > Hi Marc > > > > Yes, copying the data to a freshly created fileset and unlinking/renaming/relinking is our solution; it was always a long shot expecting to be able to convert it ? > > > > Richard > > > > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Marc A Kaplan > Sent: 24 April 2018 13:49 > To: gpfsug main discussion list > > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > > > > To help make sense of this, one has to understand that "independent" means a different range of inode numbers. 
> > If you have a set of files within one range of inode numbers, say 3000-5000 and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have. > > During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories. > > > > From: "Sobey, Richard A" > > To: "'gpfsug-discuss at spectrumscale.org '" > > Date: 04/24/2018 05:20 AM > Subject: [gpfsug-discuss] Converting a dependent fileset to independent > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Hi all, > > Is there any way, without starting over, to convert a dependent fileset to independent? > > My gut says no but in the spirit of not making unnecessary work I wanted to ask. > > Also, the documentation states that I should see a ?dpnd? next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. > > [root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L > Filesets in file system 'gpfs': > Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment > studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 > > Thanks > Richard_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=q3YHYji1-ohlnu_yLuOGmxLnmk9znaxUJdg04aGLU_U&s=Axp36hWpcuKZbTMbdVde_ifZreMvOlR5sHUYX3A79hQ&e= > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5458 bytes Desc: not available URL: From valdis.kletnieks at vt.edu Wed Apr 25 15:26:08 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Wed, 25 Apr 2018 10:26:08 -0400 Subject: [gpfsug-discuss] Encryption and ISKLM Message-ID: <22132.1524666368@turing-police.cc.vt.edu> We're running GPFS 4.2.3.7 with encryption on disk, LTFS/EE 1.2.6.2 with encryption on tape, and ISKLM 2.6.0.2 to manage the keys. I'm in the middle of researching RHEL patches on the key servers. 
Do I want to stay at 2.6.0.2, or go to a later 2.6, or jump to 2.7 or 3.0? Not seeing a lot of guidance on that topic....
From truongv at us.ibm.com Wed Apr 25 18:44:16 2018 From: truongv at us.ibm.com (Truong Vu) Date: Wed, 25 Apr 2018 13:44:16 -0400 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: >>> There was a thread concerning fileset issues back in February, wondering if the limits had changed between 4.x to 5.0.0 releases, but it went unanswered Regarding the above query, it was answered on 5 Feb, 2018 > Subject: Re: [gpfsug-discuss] Maximum Number of filesets on GPFS v5? > Date: Mon, Feb 5, 2018 2:56 PM > Quoting "Truong Vu" : > >> >> Hi Jamie, >> >> The limits are the same in 5.0.0. We'll look into the FAQ. >> >> Thanks, >> Tru. BTW, the FAQ has been tweaked a bit in this area. Thanks, Tru.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org < https://urldefense.proofpoint.com/v2/url?u=http-3A__spectrumscale.org_&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=N4Mu7JvqpvzZ7EN4r2Qj2Zafsn1e4kbbgjmWJWwKVgc&e= > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=FtLzwY_unGN6f6WlAmJFeDFnMICYLyNlAAIxft_8wgM&e= < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=FtLzwY_unGN6f6WlAmJFeDFnMICYLyNlAAIxft_8wgM&e= >_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org < https://urldefense.proofpoint.com/v2/url?u=http-3A__spectrumscale.org_&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=N4Mu7JvqpvzZ7EN4r2Qj2Zafsn1e4kbbgjmWJWwKVgc&e= > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=FtLzwY_unGN6f6WlAmJFeDFnMICYLyNlAAIxft_8wgM&e= < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=FtLzwY_unGN6f6WlAmJFeDFnMICYLyNlAAIxft_8wgM&e= > -------------- next part -------------- An HTML attachment was scrubbed... URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20180425_908762a1_attachment.html&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=s-Q5O2mSZkgTv9Vw1sikGpoIyxbhCqQ0mpMD-M_8f_E&e= > -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5458 bytes Desc: not available URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20180425_908762a1_attachment.p7s&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=TM2ZRp5gYFZ5sI3I0Obdf5h-aNBKNzm9tuWX3rqepaM&e= > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=FtLzwY_unGN6f6WlAmJFeDFnMICYLyNlAAIxft_8wgM&e= End of gpfsug-discuss Digest, Vol 75, Issue 48 ********************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From ulmer at ulmer.org Thu Apr 26 03:48:23 2018 From: ulmer at ulmer.org (Stephen Ulmer) Date: Wed, 25 Apr 2018 22:48:23 -0400 Subject: [gpfsug-discuss] Will GPFS recognize re-sized LUNs? Message-ID: <7B09F35D-9172-4A07-8480-5829562629DD@ulmer.org> I?m 80% sure that the answer to this is "no", but I promised a client that I?d get a fresh answer anyway. If one extends a LUN that is under an NSD, and then does the OS-level magic to make that known to everyone that could write to it, can the NSD be extended to use the additional space? I can think of lots of reasons why this would be madness, and the implementation would have very little return, but maybe a large customer or grant demanded it at some point? Liberty, -- Stephen -------------- next part -------------- An HTML attachment was scrubbed... URL: From luis.bolinches at fi.ibm.com Thu Apr 26 07:36:13 2018 From: luis.bolinches at fi.ibm.com (Luis Bolinches) Date: Thu, 26 Apr 2018 09:36:13 +0300 Subject: [gpfsug-discuss] Will GPFS recognize re-sized LUNs? In-Reply-To: <7B09F35D-9172-4A07-8480-5829562629DD@ulmer.org> References: <7B09F35D-9172-4A07-8480-5829562629DD@ulmer.org> Message-ID: Hi You knew the answer, still is no. https://www.mail-archive.com/gpfsug-discuss at spectrumscale.org/msg02249.html -- Yst?v?llisin terveisin / Kind regards / Saludos cordiales / Salutations Luis Bolinches Consultant IT Specialist Mobile Phone: +358503112585 https://www.youracclaim.com/user/luis-bolinches "If you always give you will always have" -- Anonymous From: Stephen Ulmer To: gpfsug main discussion list Date: 26/04/2018 05:58 Subject: [gpfsug-discuss] Will GPFS recognize re-sized LUNs? Sent by: gpfsug-discuss-bounces at spectrumscale.org I?m 80% sure that the answer to this is "no", but I promised a client that I?d get a fresh answer anyway. If one extends a LUN that is under an NSD, and then does the OS-level magic to make that known to everyone that could write to it, can the NSD be extended to use the additional space? I can think of lots of reasons why this would be madness, and the implementation would have very little return, but maybe a large customer or grant demanded it at some point? Liberty, -- Stephen _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=1mZ896psa5caYzBeaugTlc7TtRejJp3uvKYxas3S7Xc&m=8_UIPpKhNk91nrRD8-6YIFZZXAX8-cxWiEUSTFLM_rY&s=AMlIQVIzjj6hG0agQvN2AAev3cj2MXe1AvqEpxMvnNU&e= Ellei edell? ole toisin mainittu: / Unless stated otherwise above: Oy IBM Finland Ab PL 265, 00101 Helsinki, Finland Business ID, Y-tunnus: 0195876-3 Registered in Finland -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Apr 26 15:20:22 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 26 Apr 2018 14:20:22 +0000 Subject: [gpfsug-discuss] Singularity + GPFS Message-ID: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nathan.harper at cfms.org.uk Thu Apr 26 15:35:29 2018 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Thu, 26 Apr 2018 15:35:29 +0100 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> References: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> Message-ID: We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. I'm interested to hear about experience with MPI-IO within Singularity. On 26 April 2018 at 15:20, Oesterlin, Robert wrote: > Anyone (including IBM) doing any work in this area? I would appreciate > hearing from you. > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Thu Apr 26 15:40:52 2018 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Thu, 26 Apr 2018 10:40:52 -0400 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> Message-ID: We do run Singularity + GPFS, on our production HPC clusters. Most of the time things are fine without any issues. However, i do see a significant performance loss when running some applications on singularity containers with GPFS. As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. I am yet to raise a PMR about this with IBM. I have not seen performance degradation for any other kind of IO, but i am not sure. Regards, Lohit On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , wrote: > We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. > > I'm interested to hear about experience with MPI-IO within Singularity. > > > On 26 April 2018 at 15:20, Oesterlin, Robert wrote: > > > Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. > > > > > > Bob Oesterlin > > > Sr Principal Storage Engineer, Nuance > > > > > > > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > -- > Nathan?Harper?//?IT Systems Lead > > > e:?nathan.harper at cfms.org.uk???t:?0117 906 1104??m:? 
0787 551 0891??w:?www.cfms.org.uk > CFMS Services Ltd?//?Bristol & Bath Science Park?//?Dirac Crescent?//?Emersons Green?//?Bristol?//?BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd > CFMS Services Ltd registered office //?43 Queens Square // Bristol // BS1 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Apr 26 15:51:30 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 26 Apr 2018 14:51:30 +0000 Subject: [gpfsug-discuss] Singularity + GPFS Message-ID: Hi Lohit, Nathan Would you be willing to share some more details about your setup? We are just getting started here and I would like to hear about what your configuration looks like. Direct email to me is fine, thanks. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "valleru at cbio.mskcc.org" Reply-To: gpfsug main discussion list Date: Thursday, April 26, 2018 at 9:45 AM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS We do run Singularity + GPFS, on our production HPC clusters. Most of the time things are fine without any issues. However, i do see a significant performance loss when running some applications on singularity containers with GPFS. As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. I am yet to raise a PMR about this with IBM. I have not seen performance degradation for any other kind of IO, but i am not sure. Regards, Lohit On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , wrote: We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. I'm interested to hear about experience with MPI-IO within Singularity. On 26 April 2018 at 15:20, Oesterlin, Robert > wrote: Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. Bob Oesterlin Sr Principal Storage Engineer, Nuance _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Nathan Harper // IT Systems Lead [Image removed by sender.] e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR [Image removed by sender.] CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From daniel.kidger at uk.ibm.com Thu Apr 26 15:51:19 2018 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Thu, 26 Apr 2018 14:51:19 +0000 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: , <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> Message-ID: An HTML attachment was scrubbed... URL: From yguvvala at cambridgecomputer.com Thu Apr 26 15:53:58 2018 From: yguvvala at cambridgecomputer.com (Yugendra Guvvala) Date: Thu, 26 Apr 2018 10:53:58 -0400 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: Message-ID: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> I am interested to learn this too. So please add me sending a direct mail. Thanks, Yugi > On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert wrote: > > Hi Lohit, Nathan > > Would you be willing to share some more details about your setup? We are just getting started here and I would like to hear about what your configuration looks like. Direct email to me is fine, thanks. > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > From: on behalf of "valleru at cbio.mskcc.org" > Reply-To: gpfsug main discussion list > Date: Thursday, April 26, 2018 at 9:45 AM > To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS > > We do run Singularity + GPFS, on our production HPC clusters. > Most of the time things are fine without any issues. > > However, i do see a significant performance loss when running some applications on singularity containers with GPFS. > > As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) > When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. > I am yet to raise a PMR about this with IBM. > I have not seen performance degradation for any other kind of IO, but i am not sure. > > Regards, > Lohit > > On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , wrote: > > We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. > > I'm interested to hear about experience with MPI-IO within Singularity. > > On 26 April 2018 at 15:20, Oesterlin, Robert wrote: > Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > -- > Nathan Harper // IT Systems Lead > > > > e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nathan.harper at cfms.org.uk Thu Apr 26 16:25:54 2018 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Thu, 26 Apr 2018 16:25:54 +0100 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> References: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> Message-ID: Happy to share on the list in case anyone else finds it useful: We use GPFS for home/scratch on our HPC clusters, supporting engineering applications, so 95+% of our jobs are multi-node MPI. We have had some questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with GPFS+MPI-IO in the past that was solved by building the applications against GPFS. If users start using Singularity containers, we then can't guarantee how the contained applications have been built. I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we can break it, before we deploy onto our production systems. Everything seems to be ok under synthetic benchmarks, but I've handed over to one of my chaos monkey users to let him do his worst. On 26 April 2018 at 15:53, Yugendra Guvvala wrote: > I am interested to learn this too. So please add me sending a direct mail. > > Thanks, > Yugi > > On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > > Hi Lohit, Nathan > > > > Would you be willing to share some more details about your setup? We are > just getting started here and I would like to hear about what your > configuration looks like. Direct email to me is fine, thanks. > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > > > *From: * on behalf of " > valleru at cbio.mskcc.org" > *Reply-To: *gpfsug main discussion list > *Date: *Thursday, April 26, 2018 at 9:45 AM > *To: *gpfsug main discussion list > *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS > > > > We do run Singularity + GPFS, on our production HPC clusters. > > Most of the time things are fine without any issues. > > > > However, i do see a significant performance loss when running some > applications on singularity containers with GPFS. > > > > As of now, the applications that have severe performance issues with > singularity on GPFS - seem to be because of ?mmap io?. (Deep learning > applications) > > When i run the same application on bare metal, they seem to have a huge > difference in GPFS IO when compared to running on singularity containers. > > I am yet to raise a PMR about this with IBM. > > I have not seen performance degradation for any other kind of IO, but i am > not sure. > > > Regards, > Lohit > > > On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , > wrote: > > We are running on a test system at the moment, and haven't run into any > issues yet, but so far it's only been 'hello world' and running FIO. > > > > I'm interested to hear about experience with MPI-IO within Singularity. > > > > On 26 April 2018 at 15:20, Oesterlin, Robert > wrote: > > Anyone (including IBM) doing any work in this area? I would appreciate > hearing from you. > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > -- > > *Nathan* *Harper* // IT Systems Lead > > > > [image: Image removed by sender.] 
> > > > *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 > *w: *www.cfms.org.uk > > > > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons > Green // Bristol // BS16 7FR > > [image: Image removed by sender.] > > CFMS Services Ltd is registered in England and Wales No 05742022 - a > subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 > 4QP > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_johnson at brown.edu Thu Apr 26 16:31:32 2018 From: david_johnson at brown.edu (David Johnson) Date: Thu, 26 Apr 2018 11:31:32 -0400 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> Message-ID: <0C11F444-7196-4B07-9646-5062CC8715B1@brown.edu> Regarding MPI-IO, how do you mean ?building the applications against GPFS?? We try to advise our users about things to avoid, but we have some poster-ready ?chaos monkeys? as well, who resist guidance. What apps do your users favor? Molpro is one of our heaviest apps right now. Thanks, ? ddj > On Apr 26, 2018, at 11:25 AM, Nathan Harper wrote: > > Happy to share on the list in case anyone else finds it useful: > > We use GPFS for home/scratch on our HPC clusters, supporting engineering applications, so 95+% of our jobs are multi-node MPI. We have had some questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with GPFS+MPI-IO in the past that was solved by building the applications against GPFS. If users start using Singularity containers, we then can't guarantee how the contained applications have been built. > > I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we can break it, before we deploy onto our production systems. Everything seems to be ok under synthetic benchmarks, but I've handed over to one of my chaos monkey users to let him do his worst. > > On 26 April 2018 at 15:53, Yugendra Guvvala > wrote: > I am interested to learn this too. So please add me sending a direct mail. > > Thanks, > Yugi > > On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert > wrote: > >> Hi Lohit, Nathan >> >> >> >> Would you be willing to share some more details about your setup? We are just getting started here and I would like to hear about what your configuration looks like. Direct email to me is fine, thanks. 
>> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> >> From: > on behalf of "valleru at cbio.mskcc.org " > >> Reply-To: gpfsug main discussion list > >> Date: Thursday, April 26, 2018 at 9:45 AM >> To: gpfsug main discussion list > >> Subject: [EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS >> >> >> >> We do run Singularity + GPFS, on our production HPC clusters. >> >> Most of the time things are fine without any issues. >> >> >> >> However, i do see a significant performance loss when running some applications on singularity containers with GPFS. >> >> >> >> As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) >> >> When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. >> >> I am yet to raise a PMR about this with IBM. >> >> I have not seen performance degradation for any other kind of IO, but i am not sure. >> >> >> Regards, >> Lohit >> >> >> On Apr 26, 2018, 10:35 AM -0400, Nathan Harper >, wrote: >> >> >> We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. >> >> >> >> I'm interested to hear about experience with MPI-IO within Singularity. >> >> >> >> On 26 April 2018 at 15:20, Oesterlin, Robert > wrote: >> >> Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> -- >> >> Nathan Harper // IT Systems Lead >> >> >> >> >> >> e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk >> >> CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR >> >> >> >> CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd >> CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > -- > Nathan Harper // IT Systems Lead > > > > e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 5458 bytes Desc: not available URL: From nathan.harper at cfms.org.uk Thu Apr 26 17:00:56 2018 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Thu, 26 Apr 2018 17:00:56 +0100 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: <0C11F444-7196-4B07-9646-5062CC8715B1@brown.edu> References: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> <0C11F444-7196-4B07-9646-5062CC8715B1@brown.edu> Message-ID: We had an issue with a particular application writing out output in parallel - (I think) including gpfs.h seemed to fix the problem, but we might also have had a clockskew issue on the compute nodes at the same time, so we aren't sure exactly which fixed it. My chaos monkeys aren't those that resist guidance, but instead are the ones that will employ all the tools at their disposal to improve performance. A lot of our applications aren't doing MPI-IO, so my very capable parallel filesystem is idling while a single rank is reading/writing. However, some will hit the filesystem much harder or exercise less used functionality, and I'm keen to make sure that works through Singularity as well. On 26 April 2018 at 16:31, David Johnson wrote: > Regarding MPI-IO, how do you mean ?building the applications against > GPFS?? > We try to advise our users about things to avoid, but we have some > poster-ready > ?chaos monkeys? as well, who resist guidance. What apps do your users > favor? > Molpro is one of our heaviest apps right now. > Thanks, > ? ddj > > > On Apr 26, 2018, at 11:25 AM, Nathan Harper > wrote: > > Happy to share on the list in case anyone else finds it useful: > > We use GPFS for home/scratch on our HPC clusters, supporting engineering > applications, so 95+% of our jobs are multi-node MPI. We have had some > questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with > GPFS+MPI-IO in the past that was solved by building the applications > against GPFS. If users start using Singularity containers, we then can't > guarantee how the contained applications have been built. > > I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we > can break it, before we deploy onto our production systems. Everything > seems to be ok under synthetic benchmarks, but I've handed over to one of > my chaos monkey users to let him do his worst. > > On 26 April 2018 at 15:53, Yugendra Guvvala com> wrote: > >> I am interested to learn this too. So please add me sending a direct >> mail. >> >> Thanks, >> Yugi >> >> On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert < >> Robert.Oesterlin at nuance.com> wrote: >> >> Hi Lohit, Nathan >> >> >> >> Would you be willing to share some more details about your setup? We are >> just getting started here and I would like to hear about what your >> configuration looks like. Direct email to me is fine, thanks. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> >> *From: * on behalf of " >> valleru at cbio.mskcc.org" >> *Reply-To: *gpfsug main discussion list > > >> *Date: *Thursday, April 26, 2018 at 9:45 AM >> *To: *gpfsug main discussion list >> *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS >> >> >> >> We do run Singularity + GPFS, on our production HPC clusters. >> >> Most of the time things are fine without any issues. >> >> >> >> However, i do see a significant performance loss when running some >> applications on singularity containers with GPFS. 
>> >> >> >> As of now, the applications that have severe performance issues with >> singularity on GPFS - seem to be because of ?mmap io?. (Deep learning >> applications) >> >> When i run the same application on bare metal, they seem to have a huge >> difference in GPFS IO when compared to running on singularity containers. >> >> I am yet to raise a PMR about this with IBM. >> >> I have not seen performance degradation for any other kind of IO, but i >> am not sure. >> >> >> Regards, >> Lohit >> >> >> On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , >> wrote: >> >> We are running on a test system at the moment, and haven't run into any >> issues yet, but so far it's only been 'hello world' and running FIO. >> >> >> >> I'm interested to hear about experience with MPI-IO within Singularity. >> >> >> >> On 26 April 2018 at 15:20, Oesterlin, Robert >> wrote: >> >> Anyone (including IBM) doing any work in this area? I would appreciate >> hearing from you. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> >> >> -- >> >> *Nathan* *Harper* // IT Systems Lead >> >> >> >> [image: Image removed by sender.] >> >> >> >> *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 >> *w: *www.cfms.org.uk >> >> >> >> CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons >> Green // Bristol // BS16 7FR >> >> [image: Image removed by sender.] >> >> CFMS Services Ltd is registered in England and Wales No 05742022 - a >> subsidiary of CFMS Ltd >> CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 >> 4QP >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > > > -- > *Nathan Harper* // IT Systems Lead > > > > > *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 > *w: *www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons > Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a > subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 > 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next 
part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Thu Apr 26 19:08:48 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 26 Apr 2018 18:08:48 +0000 Subject: [gpfsug-discuss] Pool migration and replicate Message-ID: Hi all, We'd like to move some data from a non replicated pool to another pool, but keep replication at 1 (the fs default is 2). When using an ILM policy, is the default to keep the current replication or use the fs default? I.e.just wondering if I need to include a "REPLICATE(1)" clause. Also if the data is already migrated to the pool, is it still considered by the policy engine, or should I include FROM POOL...? I.e. just wondering what is the most efficient way to target the files. Thanks Simon From olaf.weiser at de.ibm.com Thu Apr 26 19:53:42 2018 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 26 Apr 2018 11:53:42 -0700 Subject: [gpfsug-discuss] Pool migration and replicate In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From vborcher at linkedin.com Thu Apr 26 19:59:38 2018 From: vborcher at linkedin.com (Vanessa Borcherding) Date: Thu, 26 Apr 2018 18:59:38 +0000 Subject: [gpfsug-discuss] Singularity + GPFS Message-ID: <690DC273-833D-419F-84A0-7EE2EC7700C1@linkedin.biz> Hi All, In my previous life at Weill Cornell, I benchmarked Singularity pretty extensively for bioinformatics applications on a GPFS 4.2 cluster, and saw virtually no overhead whatsoever. However, I did not allow MPI jobs for those workloads, so that may be the key differentiator here. You may wish to reach out to Greg Kurtzer and his team too - they're super responsive on github and have a slack channel that you can join. His email address is gmkurtzer at gmail.com. Vanessa ?On 4/26/18, 9:01 AM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of gpfsug-discuss-request at spectrumscale.org" wrote: Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: Singularity + GPFS (Nathan Harper) ---------------------------------------------------------------------- Message: 1 Date: Thu, 26 Apr 2018 17:00:56 +0100 From: Nathan Harper To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Singularity + GPFS Message-ID: Content-Type: text/plain; charset="utf-8" We had an issue with a particular application writing out output in parallel - (I think) including gpfs.h seemed to fix the problem, but we might also have had a clockskew issue on the compute nodes at the same time, so we aren't sure exactly which fixed it. My chaos monkeys aren't those that resist guidance, but instead are the ones that will employ all the tools at their disposal to improve performance. A lot of our applications aren't doing MPI-IO, so my very capable parallel filesystem is idling while a single rank is reading/writing. However, some will hit the filesystem much harder or exercise less used functionality, and I'm keen to make sure that works through Singularity as well. 
On 26 April 2018 at 16:31, David Johnson wrote: > Regarding MPI-IO, how do you mean ?building the applications against > GPFS?? > We try to advise our users about things to avoid, but we have some > poster-ready > ?chaos monkeys? as well, who resist guidance. What apps do your users > favor? > Molpro is one of our heaviest apps right now. > Thanks, > ? ddj > > > On Apr 26, 2018, at 11:25 AM, Nathan Harper > wrote: > > Happy to share on the list in case anyone else finds it useful: > > We use GPFS for home/scratch on our HPC clusters, supporting engineering > applications, so 95+% of our jobs are multi-node MPI. We have had some > questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with > GPFS+MPI-IO in the past that was solved by building the applications > against GPFS. If users start using Singularity containers, we then can't > guarantee how the contained applications have been built. > > I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we > can break it, before we deploy onto our production systems. Everything > seems to be ok under synthetic benchmarks, but I've handed over to one of > my chaos monkey users to let him do his worst. > > On 26 April 2018 at 15:53, Yugendra Guvvala com> wrote: > >> I am interested to learn this too. So please add me sending a direct >> mail. >> >> Thanks, >> Yugi >> >> On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert < >> Robert.Oesterlin at nuance.com> wrote: >> >> Hi Lohit, Nathan >> >> >> >> Would you be willing to share some more details about your setup? We are >> just getting started here and I would like to hear about what your >> configuration looks like. Direct email to me is fine, thanks. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> >> *From: * on behalf of " >> valleru at cbio.mskcc.org" >> *Reply-To: *gpfsug main discussion list > > >> *Date: *Thursday, April 26, 2018 at 9:45 AM >> *To: *gpfsug main discussion list >> *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS >> >> >> >> We do run Singularity + GPFS, on our production HPC clusters. >> >> Most of the time things are fine without any issues. >> >> >> >> However, i do see a significant performance loss when running some >> applications on singularity containers with GPFS. >> >> >> >> As of now, the applications that have severe performance issues with >> singularity on GPFS - seem to be because of ?mmap io?. (Deep learning >> applications) >> >> When i run the same application on bare metal, they seem to have a huge >> difference in GPFS IO when compared to running on singularity containers. >> >> I am yet to raise a PMR about this with IBM. >> >> I have not seen performance degradation for any other kind of IO, but i >> am not sure. >> >> >> Regards, >> Lohit >> >> >> On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , >> wrote: >> >> We are running on a test system at the moment, and haven't run into any >> issues yet, but so far it's only been 'hello world' and running FIO. >> >> >> >> I'm interested to hear about experience with MPI-IO within Singularity. >> >> >> >> On 26 April 2018 at 15:20, Oesterlin, Robert >> wrote: >> >> Anyone (including IBM) doing any work in this area? I would appreciate >> hearing from you. 
>> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> >> >> -- >> >> *Nathan* *Harper* // IT Systems Lead >> >> >> >> [image: Image removed by sender.] >> >> >> >> *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 >> *w: *www.cfms.org.uk >> >> >> >> CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons >> Green // Bristol // BS16 7FR >> >> [image: Image removed by sender.] >> >> CFMS Services Ltd is registered in England and Wales No 05742022 - a >> subsidiary of CFMS Ltd >> CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 >> 4QP >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > > > -- > *Nathan Harper* // IT Systems Lead > > > > > *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 > *w: *www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons > Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a > subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 > 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 75, Issue 56 ********************************************** From makaplan at us.ibm.com Thu Apr 26 21:30:14 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 26 Apr 2018 16:30:14 -0400 Subject: [gpfsug-discuss] Pool migration and replicate In-Reply-To: References: Message-ID: No need to specify REPLICATE(1), but no harm either. No need to specify a FROM POOL, unless you want to restrict the set of files considered. (consider a system with more than two pools...) If a file is already in the target (TO) POOL, then no harm, we just skip over that file. 
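For illustration, a minimal policy file for this kind of move could look like the following (the pool names 'fast' and 'capacity', the file system name 'gpfs' and the policy file path are placeholders, not taken from Simon's setup):

/* example only - adjust pool and rule names to your configuration */
RULE 'moveToCapacity' MIGRATE FROM POOL 'fast' TO POOL 'capacity'

The FROM POOL clause is optional, as noted above; it just restricts which files are considered. You would then run it with something like:

mmapplypolicy gpfs -P /var/mmfs/etc/move.policy -I test    # dry run, report what would move
mmapplypolicy gpfs -P /var/mmfs/etc/move.policy -I yes     # actually migrate the data

Because no REPLICATE clause is given, files keep their current replication rather than picking up the file system default.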
From: "Simon Thompson (IT Research Support)" To: gpfsug main discussion list Date: 04/26/2018 02:09 PM Subject: [gpfsug-discuss] Pool migration and replicate Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, We'd like to move some data from a non replicated pool to another pool, but keep replication at 1 (the fs default is 2). When using an ILM policy, is the default to keep the current replication or use the fs default? I.e.just wondering if I need to include a "REPLICATE(1)" clause. Also if the data is already migrated to the pool, is it still considered by the policy engine, or should I include FROM POOL...? I.e. just wondering what is the most efficient way to target the files. Thanks Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=9Ko588DKk_71GheOwRqmDO1vVI24OTUvUBdYv8YHIbU&s=04zxf_-EsPu_LN--gsPx7GEPRsqUW7jIZ1Biov8R3mY&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Fri Apr 27 09:40:44 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Fri, 27 Apr 2018 10:40:44 +0200 Subject: [gpfsug-discuss] GPFS autoload - wait for IB portstobecomeactive In-Reply-To: References: <0081EB235765E14395278B9AE1DF341846510A@MBX214.d.ethz.ch> <4AD44D34-5275-4ADB-8CC7-8E80170DDA7F@brown.edu> Message-ID: Alternative solution we're trying... Create the file /etc/systemd/system/gpfs.service.d/delay.conf containing: [Service] ExecStartPre=/bin/sleep 60 Then I expect we should have long enough delay for infiniband to start before starting gpfs.. -jf On Fri, Mar 16, 2018 at 1:05 PM, Frederick Stock wrote: > I have my doubts that mmdiag can be used in this script. In general the > guidance is to avoid or be very careful with mm* commands in a callback due > to the potential for deadlock. > > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | 720-430-8821 > stockf at us.ibm.com > > > > From: Jan-Frode Myklebust > To: gpfsug main discussion list > Date: 03/16/2018 04:30 AM > > Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports > tobecomeactive > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Thanks Olaf, but we don't use NetworkManager on this cluster.. > > I now created this simple script: > > > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------- > #! /bin/bash - > # > # Fail mmstartup if not all configured IB ports are active. > # > # Install with: > # > # mmaddcallback fail-if-ibfail --command /var/mmfs/etc/fail-if-ibfail > --event preStartup --sync --onerror shutdown > # > > for port in $(/usr/lpp/mmfs/bin/mmdiag --config|grep verbsPorts | cut -f > 4- -d " ") > do > grep -q ACTIVE /sys/class/infiniband/${port%/*}/ports/${port##*/}/state > || exit 1 > done > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------- > > which I haven't tested, but assume should work. Suggestions for > improvements would be much appreciated! 
> > > > -jf > > > On Thu, Mar 15, 2018 at 6:30 PM, Olaf Weiser <*olaf.weiser at de.ibm.com* > > wrote: > > you can try : > systemctl enable NetworkManager-wait-online > ln -s '/usr/lib/systemd/system/NetworkManager-wait-online.service' > '/etc/systemd/system/multi-user.target.wants/NetworkManager-wait-online. > service' > > in many cases .. it helps .. > > > > > > From: Jan-Frode Myklebust <*janfrode at tanso.net* > > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 03/15/2018 06:18 PM > Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to > becomeactive > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > I found some discussion on this at > *https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=77777777-0000-0000-0000-000014471957&ps=25* > and > there it's claimed that none of the callback events are early enough to > resolve this. That we need a pre-preStartup trigger. Any idea if this has > changed -- or is the callback option then only to do a "--onerror > shutdown" if it has failed to connect IB ? > > > On Thu, Mar 8, 2018 at 1:42 PM, Frederick Stock <*stockf at us.ibm.com* > > wrote: > You could also use the GPFS prestartup callback (mmaddcallback) to execute > a script synchronously that waits for the IB ports to become available > before returning and allowing GPFS to continue. Not systemd integrated but > it should work. > > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | *720-430-8821* <(720)%20430-8821> > *stockf at us.ibm.com* > > > > From: *david_johnson at brown.edu* > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 03/08/2018 07:34 AM > Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to > become active > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > > Until IBM provides a solution, here is my workaround. Add it so it runs > before the gpfs script, I call it from our custom xcat diskless boot > scripts. Based on rhel7, not fully systemd integrated. YMMV! > > Regards, > ? ddj > ??- > [ddj at storage041 ~]$ cat /etc/init.d/ibready > #! /bin/bash > # > # chkconfig: 2345 06 94 > # /etc/rc.d/init.d/ibready > # written in 2016 David D Johnson (ddj *brown.edu* > > ) > # > ### BEGIN INIT INFO > # Provides: ibready > # Required-Start: > # Required-Stop: > # Default-Stop: > # Description: Block until infiniband is ready > # Short-Description: Block until infiniband is ready > ### END INIT INFO > > RETVAL=0 > if [[ -d /sys/class/infiniband ]] > then > IBDEVICE=$(dirname $(grep -il infiniband > /sys/class/infiniband/*/ports/1/link* | head -n 1)) > fi > # See how we were called. > case "$1" in > start) > if [[ -n $IBDEVICE && -f $IBDEVICE/state ]] > then > echo -n "Polling for InfiniBand link up: " > for (( count = 60; count > 0; count-- )) > do > if grep -q ACTIVE $IBDEVICE/state > then > echo ACTIVE > break > fi > echo -n "." > sleep 5 > done > if (( count <= 0 )) > then > echo DOWN - $0 timed out > fi > fi > ;; > stop|restart|reload|force-reload|condrestart|try-restart) > ;; > status) > if [[ -n $IBDEVICE && -f $IBDEVICE/state ]] > then > echo "$IBDEVICE is $(< $IBDEVICE/state) $(< > $IBDEVICE/rate)" > else > echo "No IBDEVICE found" > fi > ;; > *) > echo "Usage: ibready {start|stop|status|restart| > reload|force-reload|condrestart|try-restart}" > exit 2 > esac > exit ${RETVAL} > ???? 
> > -- ddj > Dave Johnson > > On Mar 8, 2018, at 6:10 AM, Caubet Serrabou Marc (PSI) < > *marc.caubet at psi.ch* > wrote: > > Hi all, > > with autoload = yes we do not ensure that GPFS will be started after the > IB link becomes up. Is there a way to force GPFS waiting to start until IB > ports are up? This can be probably done by adding something like > After=network-online.target and Wants=network-online.target in the systemd > file but I would like to know if this is natively possible from the GPFS > configuration. > > Thanks a lot, > Marc > _________________________________________ > Paul Scherrer Institut > High Performance Computing > Marc Caubet Serrabou > WHGA/036 > 5232 Villigen PSI > Switzerland > > Telephone: *+41 56 310 46 67* <+41%2056%20310%2046%2067> > E-Mail: *marc.caubet at psi.ch* > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=u-EMob09-dkE6jZbD3dTjBi3vWhmDXtxiOK3nqFyIgY&s=JCfJgq6pZnKUI6d-rIgJXVcdZh7vmA5ypB1_goP_FFA&e=* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_ > iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m= > xImYTxt4pm1o5znVn5Vdoka2uxgsTRpmlCGdEWhB9vw&s= > veOZZz80aBzoCTKusx6WOpVlYs64eNkp5pM9kbHgvic&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Mon Apr 30 22:11:35 2018 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Mon, 30 Apr 2018 17:11:35 -0400 Subject: [gpfsug-discuss] Spectrum Scale CES and remote file system mounts Message-ID: <1516de0f-ba2a-40e7-9aa4-d7ea7bae3edf@Spark> Hello All, I read from the below link, that it is now possible to export remote mounts over NFS/SMB. https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_protocoloverremoteclu.htm I am thinking of using a single CES protocol cluster, with remote mounts from 3 storage clusters. May i know, if i will be able to export the 3 remote mounts(from 3 storage clusters) over NFS/SMB from a single CES protocol cluster? 
Because according to the limitations as mentioned in the below link:
https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_limitationofprotocolonRMT.htm
It says "You can configure one storage cluster and up to five protocol clusters (current limit)."
Regards,
Lohit
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From S.J.Thompson at bham.ac.uk Mon Apr 30 22:57:17 2018
From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support))
Date: Mon, 30 Apr 2018 21:57:17 +0000
Subject: [gpfsug-discuss] Spectrum Scale CES and remote file system mounts
In-Reply-To: <1516de0f-ba2a-40e7-9aa4-d7ea7bae3edf@Spark>
References: <1516de0f-ba2a-40e7-9aa4-d7ea7bae3edf@Spark>
Message-ID:

You have been able to do this for some time, though I think it's only just supported. We've been exporting remote mounts since CES was added. At some point we've had two storage clusters supplying data and at least 3 remote file-systems exported over NFS and SMB. One thing to watch, be careful if your CES root is on a remote fs, as if that goes away, so do all CES exports. We do have CES root on a remote fs and it works, just be aware...
Simon
________________________________________
From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of valleru at cbio.mskcc.org [valleru at cbio.mskcc.org]
Sent: 30 April 2018 22:11
To: gpfsug main discussion list
Subject: [gpfsug-discuss] Spectrum Scale CES and remote file system mounts

Hello All,
I read from the below link, that it is now possible to export remote mounts over NFS/SMB.
https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_protocoloverremoteclu.htm
I am thinking of using a single CES protocol cluster, with remote mounts from 3 storage clusters.
May i know, if i will be able to export the 3 remote mounts(from 3 storage clusters) over NFS/SMB from a single CES protocol cluster?
Because according to the limitations as mentioned in the below link:
https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_limitationofprotocolonRMT.htm
It says "You can configure one storage cluster and up to five protocol clusters (current limit)."
Regards,
Lohit
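For anyone wanting to try the same layout, the rough command flow on the protocol cluster looks like the sketch below. This is only an outline, not a tested recipe: the cluster names, key files, file system name (gpfs01) and paths are placeholders, and the exact mmnfs/mmsmb option syntax should be checked against the documentation for your release.

# on the owning (storage) cluster: authorise the CES cluster to mount gpfs01
# (public keys previously exchanged after mmauth genkey on each cluster)
mmauth add ces-cluster.example -k /tmp/ces-cluster.pub
mmauth grant ces-cluster.example -f gpfs01

# on the CES protocol cluster: define the remote cluster and file system, then mount it
mmremotecluster add storage1.example -n storagenode1,storagenode2 -k /tmp/storage1.pub
mmremotefs add remote1 -f gpfs01 -C storage1.example -T /gpfs/remote1 -A yes
mmmount remote1 -a

# export a directory of the remote mount over NFS and SMB
mmnfs export add /gpfs/remote1/projects --client "10.0.0.0/24(Access_Type=RW,Squash=no_root_squash)"
mmsmb export add projects /gpfs/remote1/projects

Repeating the mmremotecluster/mmremotefs steps for the other two storage clusters gives the three remote mounts described above, all exported from the one protocol cluster.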
From richardb+gpfsUG at ellexus.com Tue Apr 3 12:28:19 2018
From: richardb+gpfsUG at ellexus.com (Richard Booth)
Date: Tue, 3 Apr 2018 12:28:19 +0100
Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 2
In-Reply-To: References: Message-ID:
Hi Claire
The link at the bottom of your email, doesn't appear to be working.
Richard
On 3 April 2018 at 12:00, wrote:
> Send gpfsug-discuss mailing list submissions to
> gpfsug-discuss at spectrumscale.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> or, via email, send a message with subject or body 'help' to
> gpfsug-discuss-request at spectrumscale.org
>
> You can reach the person managing the list at
> gpfsug-discuss-owner at spectrumscale.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of gpfsug-discuss digest..."
>
> Today's Topics:
>
> 1. Transforming Workflows at Scale (Secretary GPFS UG)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 03 Apr 2018 11:41:41 +0100
> From: Secretary GPFS UG
> To: gpfsug main discussion list
> Subject: [gpfsug-discuss] Transforming Workflows at Scale
> Message-ID: <037f89ab466334f83f235f357111a9d6 at webmail.gpfsug.org>
> Content-Type: text/plain; charset="us-ascii"
>
> Dear all,
>
> There's a Spectrum Scale for media breakfast briefing event being
> organised by IBM at IBM South Bank, London on 17th April (the day before
> the next UK meeting).
>
> The event has been designed for broadcasters, post production houses and
> visual effects organisations, where managing workflows between different
> islands of technology is a major challenge.
> > If you're interested, you can read more and register at the IBM > Registration Page [1]. > > Thanks, > -- > > Claire O'Toole > Spectrum Scale/GPFS User Group Secretary > +44 (0)7508 033896 > www.spectrumscaleug.org > > > Links: > ------ > [1] > https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp? > openform&seminar=B223GVES&locale=en_ZZ&cm_mmc= > Email_External-_-Systems_Systems+-+Hybrid+Cloud+ > Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_ > mmca1=000030YP&cm_mmca2=10001939&cvosrc=email. > External.NA&cvo_campaign=000030YP > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: 20180403/302ad054/attachment-0001.html> > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 75, Issue 2 > ********************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From secretary at gpfsug.org Tue Apr 3 12:56:33 2018 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Tue, 03 Apr 2018 12:56:33 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 2 In-Reply-To: References: Message-ID: <026b2aa97247b551b28ea13678484a4b@webmail.gpfsug.org> Hi Richard, My apologies, that is strange. This is the link and I have checked it works: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP [7] If you're still having problems or require further information, please send an e-mail to justine_ive at uk.ibm.com Many thanks, --- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org On , Richard Booth wrote: > Hi Claire > > The link at the bottom of your email, doesn't appear to be working. > > Richard > > On 3 April 2018 at 12:00, wrote: > >> Send gpfsug-discuss mailing list submissions to >> gpfsug-discuss at spectrumscale.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss [1] >> or, via email, send a message with subject or body 'help' to >> gpfsug-discuss-request at spectrumscale.org >> >> You can reach the person managing the list at >> gpfsug-discuss-owner at spectrumscale.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of gpfsug-discuss digest..." >> >> Today's Topics: >> >> 1. Transforming Workflows at Scale (Secretary GPFS UG) >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Tue, 03 Apr 2018 11:41:41 +0100 >> From: Secretary GPFS UG >> To: gpfsug main discussion list >> Subject: [gpfsug-discuss] Transforming Workflows at Scale >> Message-ID: <037f89ab466334f83f235f357111a9d6 at webmail.gpfsug.org> >> Content-Type: text/plain; charset="us-ascii" >> >> Dear all, >> >> There's a Spectrum Scale for media breakfast briefing event being >> organised by IBM at IBM South Bank, London on 17th April (the day before >> the next UK meeting). >> >> The event has been designed for broadcasters, post production houses and >> visual effects organisations, where managing workflows between different >> islands of technology is a major challenge. 
>> >> If you're interested, you can read more and register at the IBM >> Registration Page [1]. >> >> Thanks, >> -- >> >> Claire O'Toole >> Spectrum Scale/GPFS User Group Secretary >> +44 (0)7508 033896 [2] >> www.spectrumscaleug.org [3] >> >> Links: >> ------ >> [1] >> https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP [4] >> -------------- next part -------------- >> An HTML attachment was scrubbed... >> URL: >> >> ------------------------------ >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org [6] >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss [1] >> >> End of gpfsug-discuss Digest, Vol 75, Issue 2 >> ********************************************* > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss [1] Links: ------ [1] http://gpfsug.org/mailman/listinfo/gpfsug-discuss [2] tel:%2B44%20%280%297508%20033896 [3] http://www.spectrumscaleug.org [4] https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&amp;seminar=B223GVES&amp;locale=en_ZZ&amp;cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&amp;cm_mmca1=000030YP&amp;cm_mmca2=10001939&amp;cvosrc=email.External.NA&amp;cvo_campaign=000030YP [5] http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20180403/302ad054/attachment-0001.html [6] http://spectrumscale.org [7] https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP -------------- next part -------------- An HTML attachment was scrubbed... URL: From A.Wolf-Reber at de.ibm.com Tue Apr 3 16:26:45 2018 From: A.Wolf-Reber at de.ibm.com (Alexander Wolf) Date: Tue, 3 Apr 2018 15:26:45 +0000 Subject: [gpfsug-discuss] REST API function for 'mmsmb exportacl list' In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152274503780210.png Type: image/png Size: 1134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152274503780211.png Type: image/png Size: 6645 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152274503780212.png Type: image/png Size: 1134 bytes Desc: not available URL: From john.hearns at asml.com Wed Apr 4 10:11:48 2018 From: john.hearns at asml.com (John Hearns) Date: Wed, 4 Apr 2018 09:11:48 +0000 Subject: [gpfsug-discuss] Dual server NSDs Message-ID: I should say I already have a support ticket open for advice on this issue. We have a filesystem which has NSDs which have two servers defined, for instance: nsd: device=/dev/sdb servers=sn007,sn008 nsd=nsd1 usage=dataOnly Can I remove one of these servers? The object is to upgrade this server and change its hostname, the physical server will stay in place. Has anyone carried out an operation similar to this? 
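(For reference, the server list of an existing NSD is changed with mmchnsd, and on 4.x releases this still requires the file system that uses the NSD to be unmounted first. A minimal sketch, assuming nsd1 should keep only sn008 as its server; the file system name gpfs01 is a placeholder, and a stanza file with -F can be used instead of the inline descriptor:)

mmumount gpfs01 -a        # required before mmchnsd on 4.x releases
mmchnsd "nsd1:sn008"      # keep only sn008 in the server list for nsd1
mmlsnsd -d nsd1           # verify the new server list
mmmount gpfs01 -a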
I guess the documentation here is quite clear: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20server%20balance "If you want to change configuration for a NSD which is already belongs to a file system, you need to unmount the file system before running mmchnsd command." -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Apr 4 19:56:56 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 4 Apr 2018 18:56:56 +0000 Subject: [gpfsug-discuss] Dual server NSDs Message-ID: <59DE0638-07DC-4C10-A981-F4EEE6A60D89@nuance.com> Short answer is that if you want to change/remove the NSD server config on an NSD and its part of a file systems, you need to remove it from the file system or unmount the file system. *Thankfully* this is changed in Scale 5.0. In your case (host name change) ? if the IP address of the NSD server stays the same you *may* be OK. Can you put a DNS alias in for the old host name? Well, now that I think about it the old host name will stick around in the config ? so maybe not such a great idea. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of John Hearns Reply-To: gpfsug main discussion list Date: Wednesday, April 4, 2018 at 1:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Dual server NSDs Can I remove one of these servers? The object is to upgrade this server and change its hostname, the physical server will stay in place. Has anyone carried out an operation similar to this? -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Wed Apr 4 11:02:09 2018 From: john.hearns at asml.com (John Hearns) Date: Wed, 4 Apr 2018 10:02:09 +0000 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname Message-ID: Following up from my previous email (I should reply to that email I know) What we really want to achieve is changing the FQDN of an existing server. The server will be reinstalled with an updated OS (RHEL 6---> RHEL 7) During the move we wish to change the domain name of the server. So we will be taking the server offline and bringing the same physical server back up with a new domain name. Has anyone done a procedure like this? Thankyou -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. 
Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Wed Apr 4 20:59:56 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Wed, 04 Apr 2018 15:59:56 -0400 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: Message-ID: <49633.1522871996@turing-police.cc.vt.edu> On Wed, 04 Apr 2018 10:02:09 -0000, John Hearns said: > Has anyone done a procedure like this? We recently got to rename all 10 nodes in a GPFS cluster to make the unqualified name unique (turned out that having 2 nodes called 'arnsd1.isb.mgt' and 'arnsd1.vtc.mgt' causes all sorts of confusion). So they got renamed to arnsd1-isb.yadda.yadda and arnsd1-vtc.yadda.yadda. Unmount, did the mmchnsd server list thing, start going through the servers, rename and reboot each one. We did hit a whoopsie because I forgot to fix the list of quorum/manager nodes as we did each node - so don't forget to run mmchnode for each system if/when appropriate... From Kevin.Buterbaugh at Vanderbilt.Edu Wed Apr 4 21:50:13 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 4 Apr 2018 20:50:13 +0000 Subject: [gpfsug-discuss] Local event Message-ID: <3C21055E-9268-4679-AB34-6917CAF24087@vanderbilt.edu> Hi All, According to the man page for mmaddcallback: A local event triggers a callback only on the node on which the event occurred, such as mounting a file system on one of the nodes. We have two GPFS clusters here (well, three if you count our small test cluster). Cluster one has 8 NSD servers and one client, which is used only for tape backup ? i.e. no one logs on to any of the nodes in the cluster. Files on it are accessed one of three ways: 1) CNFS mount to local computer, 2) SAMBA mount to local computer, 3) GPFS multi-cluster remote mount to cluster two. On cluster one there is a user callback for softQuotaExceeded that e-mails the user ? and that we know works. Cluster two has two local GPFS filesystems and over 600 clients natively mounting those filesystems (it?s our HPC cluster). I?m trying to implement a similar callback for softQuotaExceeded events on cluster two as well. I?ve tested the callback by manually running the (Python) script and passing it in the parameters I want and it works - I get the e-mail. Then I added it via mmcallback, but only on the GPFS servers. I did that because I thought that since callbacks work on cluster one with no local access to the GPFS servers that ?local? must mean ?when an NSD server does a write that puts the user over quota?. However, on cluster two the callback is not being triggered. Does this mean that I actually need to install the callback on every node in cluster two? If so, then how / why are callbacks working on cluster one? Thanks? Kevin ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Apr 4 19:52:33 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 4 Apr 2018 18:52:33 +0000 Subject: [gpfsug-discuss] Dual server NSDs In-Reply-To: References: Message-ID: <411E40D4-99AA-4032-B1D7-E16C89FAB0BD@vanderbilt.edu> Hi John, Yes, you can remove one of the servers and yes, we?ve done it and yes, the documentation is clear and correct. ;-) Last time I did this we were in a full cluster downtime, so unmounting wasn?t an issue. We were changing our network architecture and so the IP addresses of all NSD servers save one were changing. It was a bit ? uncomfortable ? for the brief period of time I had to make the one NSD server the one and only NSD server for ~1 PB of storage! But it worked just fine? HTHAL? Kevin On Apr 4, 2018, at 4:11 AM, John Hearns > wrote: I should say I already have a support ticket open for advice on this issue. We have a filesystem which has NSDs which have two servers defined, for instance: nsd: device=/dev/sdb servers=sn007,sn008 nsd=nsd1 usage=dataOnly Can I remove one of these servers? The object is to upgrade this server and change its hostname, the physical server will stay in place. Has anyone carried out an operation similar to this? I guess the documentation here is quite clear: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20server%20balance ?If you want to change configuration for a NSD which is already belongs to a file system, you need to unmount the file system before running mmchnsd command.? -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cf2ffa137afda4368e32708d59a5c513c%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636584643653030858&sdata=Wqpqck%2FuCuzJnolVxElWG6Eky5R%2Bsc4tyvEp6we85Sw%3D&reserved=0 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alevin at gmail.com Wed Apr 4 22:39:08 2018 From: alevin at gmail.com (Alex Levin) Date: Wed, 04 Apr 2018 21:39:08 +0000 Subject: [gpfsug-discuss] Dual server NSDs In-Reply-To: <411E40D4-99AA-4032-B1D7-E16C89FAB0BD@vanderbilt.edu> References: <411E40D4-99AA-4032-B1D7-E16C89FAB0BD@vanderbilt.edu> Message-ID: We are doing the similar procedure right now. Migrating from one group of nsd servers to another. Unfortunately, as I understand, if you can't afford the cluster/filesystem downtime and not ready for 5.0 upgrade yet ( personally I'm not comfortable with ".0" versions of software in production :) ) - the only way to do it is remove disk/nsd from filesystem and add it back with the new servers list. Taking a while , a lot of i/o ... John, in case the single nsd filesystem, I'm afraid, you'll have to unmount it to change .... --Alex On Wed, Apr 4, 2018, 2:25 PM Buterbaugh, Kevin L < Kevin.Buterbaugh at vanderbilt.edu> wrote: > Hi John, > > Yes, you can remove one of the servers and yes, we?ve done it and yes, the > documentation is clear and correct. ;-) > > Last time I did this we were in a full cluster downtime, so unmounting > wasn?t an issue. We were changing our network architecture and so the IP > addresses of all NSD servers save one were changing. It was a bit ? > uncomfortable ? for the brief period of time I had to make the one NSD > server the one and only NSD server for ~1 PB of storage! But it worked > just fine? > > HTHAL? > > Kevin > > On Apr 4, 2018, at 4:11 AM, John Hearns wrote: > > I should say I already have a support ticket open for advice on this issue. > We have a filesystem which has NSDs which have two servers defined, for > instance: > nsd: > device=/dev/sdb > servers=sn007,sn008 > nsd=nsd1 > usage=dataOnly > > Can I remove one of these servers? The object is to upgrade this server > and change its hostname, the physical server will stay in place. > Has anyone carried out an operation similar to this? > > I guess the documentation here is quite clear: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20server%20balance > > ?If you want to change configuration for a NSD which is already belongs > to a file system, you need to unmount the file system before running > mmchnsd command.? > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. Neither the sender nor the > company/group of companies he or she represents shall be liable for the > proper and complete transmission of the information contained in this > communication, or for any delay in its receipt. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cf2ffa137afda4368e32708d59a5c513c%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636584643653030858&sdata=Wqpqck%2FuCuzJnolVxElWG6Eky5R%2Bsc4tyvEp6we85Sw%3D&reserved=0 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Thu Apr 5 02:57:15 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Thu, 5 Apr 2018 03:57:15 +0200 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: Message-ID: Hm, you can change the host name of a Scale node. I've done that a while ago on one or two clusters. >From what I remember I'd follow these steps: 1. Upgrade the OS configuring/using the old IP addr/hostname (2. Reinstall Scale) (3. Replay the cluster data on the node) 4. Create an interface with the new IP address on the node (not necessarily connected) 5. Ensure the node is not required for quorum and has currently no mgr role. You might want to stop Scale on the node. 5. mmchnode -N --daemon-interface ; mmchnode -N --admin-interface . Now the node has kind of disappeared, if the new IF is not yet functional, until you bring that IF up (6.) (6. Activate connection to other cluster nodes via new IF) 2. and 3. are required if scale was removed / the system was re-set up from scratch 6. is required if the new IP connection config.ed in 4 is not operational at first (e.g. not yet linked, or routing not yet active, ...) Et voila, the server should be happy again, if stopped before, start up Scale and check. No warranties, But that's how I'd try. As usual: if messing with IP config, be sure to have a back door to the system in case you ground the OS network config . Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: John Hearns To: gpfsug main discussion list Date: 04/04/2018 21:33 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname Sent by: gpfsug-discuss-bounces at spectrumscale.org Following up from my previous email (I should reply to that email I know) What we really want to achieve is changing the FQDN of an existing server. The server will be reinstalled with an updated OS (RHEL 6-? RHEL 7) During the move we wish to change the domain name of the server. So we will be taking the server offline and bringing the same physical server back up with a new domain name. Has anyone done a procedure like this? 
Thankyou -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From UWEFALKE at de.ibm.com Thu Apr 5 03:25:18 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Thu, 5 Apr 2018 04:25:18 +0200 Subject: [gpfsug-discuss] Local event In-Reply-To: <3C21055E-9268-4679-AB34-6917CAF24087@vanderbilt.edu> References: <3C21055E-9268-4679-AB34-6917CAF24087@vanderbilt.edu> Message-ID: Hi Kevin , I suppose the quota check is done when the writing node allocates blocks to write to. mind: the detour via NSD servers is transparent for that layer, GPFS may switch between SCSI/SAN paths to a (direct-.attached) block device and the NSD service via a separate NSD server, both ways are logically similar for the writing node (or should be for your matter). In short: yes, I think you need to roll out your "quota exceeded" call-back to all nodes in the HPC cluster. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Buterbaugh, Kevin L" To: gpfsug main discussion list Date: 04/04/2018 22:51 Subject: [gpfsug-discuss] Local event Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi All, According to the man page for mmaddcallback: A local event triggers a callback only on the node on which the event occurred, such as mounting a file system on one of the nodes. We have two GPFS clusters here (well, three if you count our small test cluster). Cluster one has 8 NSD servers and one client, which is used only for tape backup ? i.e. no one logs on to any of the nodes in the cluster. Files on it are accessed one of three ways: 1) CNFS mount to local computer, 2) SAMBA mount to local computer, 3) GPFS multi-cluster remote mount to cluster two. On cluster one there is a user callback for softQuotaExceeded that e-mails the user ? and that we know works. 
Cluster two has two local GPFS filesystems and over 600 clients natively mounting those filesystems (it?s our HPC cluster). I?m trying to implement a similar callback for softQuotaExceeded events on cluster two as well. I?ve tested the callback by manually running the (Python) script and passing it in the parameters I want and it works - I get the e-mail. Then I added it via mmcallback, but only on the GPFS servers. I did that because I thought that since callbacks work on cluster one with no local access to the GPFS servers that ?local? must mean ?when an NSD server does a write that puts the user over quota?. However, on cluster two the callback is not being triggered. Does this mean that I actually need to install the callback on every node in cluster two? If so, then how / why are callbacks working on cluster one? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chair at spectrumscale.org Thu Apr 5 10:30:22 2018 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Thu, 05 Apr 2018 10:30:22 +0100 Subject: [gpfsug-discuss] RFE Process ... Burning Issues Message-ID: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> Just a reminder that if you want to submit for the pilot RFE process, submissions must be in by end of next week. Judging by the responses so far, apparently the product is perfect ? Simon From: on behalf of "chair at spectrumscale.org" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Monday, 26 March 2018 at 12:52 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] RFE Process ... Burning Issues Hi All, We?ve been talking with product management about the RFE process and have agreed that we?ll try out a community-voting process. First up, we are piloting this idea, hopefully it will work out, but it may also need tweaks as we move forward. One of the things we?ve been asking for is for a better way for the Spectrum Scale user group community to vote on RFEs. Sure we get people posting to the list, but we?re looking at if we can make it a better/more formal process to support this. Talking with IBM, we also recognise that with a large number of RFEs, it can be difficult for them to track work tasks being completed, but with the community RFEs, there is a commitment to try and track them closely and report back on progress later in the year. To submit an RFE using this process, you must complete the form available at: https://ibm.box.com/v/EnhBlitz (Enhancement Blitz template v1.pptx) The form provides some guidance on a good and bad RFE. Sure a lot of us are techie/engineers, so please try to explain what problem you are solving rather than trying to provide a solution. (i.e. leave the technical implementation details to those with the source code). Each site is limited to 2 submissions and they will be looked over by the Spectrum Scale community leaders, we may ask people to merge requests, send back for more info etc, or there may be some that we know will just never be progressed for various reasons. At the April user group in the UK, we have an RFE (Burning issues) session planned. Submitters of the RFE will be expected to provide a 1-3 minute pitch for their RFE. 
We?ve placed the session at the end of the day (UK time) to try and ensure USA people can participate. Remote presentation of your RFE is fine and we plan to live-stream the session. Each person will have 3 votes to choose what they think are their highest priority requests. Again remote voting is perfectly fine but only 3 votes per person. The requests with the highest number of votes will then be given a higher chance of being implemented. There?s a possibility that some may even make the winter release cycle. Either way, we plan to track the ?chosen? RFEs more closely and provide an update at the November USA meeting (likely the SC18 one). The submission and voting process is also planned to be run again in time for the November meeting. Anyone wanting to submit an RFE for consideration should submit the form by email to rfe at spectrumscaleug.org *before* 13th April. We?ll be posting the submitted RFEs up at the box site as well, you are encouraged to visit the site regularly and check the submissions as you may want to contact the author of an RFE to provide more information/support the RFE. Anything received after this date will be held over to the November cycle. The earlier you submit, the better chance it has of being included (we plan to limit the number to be considered) and will give us time to review the RFE and come back for more information/clarification if needed. You must also be prepared to provide a 1-3 minute pitch for your RFE (in person or remote) for the UK user group meeting. You are welcome to submit any RFE you have already put into the RFE portal for this process to garner community votes for it. There is space on the form to provide the existing RFE number. If you have any comments on the process, you can also email them to rfe at spectrumscaleug.org as well. Thanks to Carl Zeite for supporting this plan? Get submitting! Simon (UK Group Chair) -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Thu Apr 5 11:09:07 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 5 Apr 2018 10:09:07 +0000 Subject: [gpfsug-discuss] UK April meeting Message-ID: It?s now just two weeks until the UK meeting and we are down to our last few places available. If you were planning on attending, please register now! Simon From: on behalf of "chair at spectrumscale.org" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Thursday, 1 March 2018 at 11:26 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] UK April meeting Hi All, We?ve just posted the draft agenda for the UK meeting in April at: http://www.spectrumscaleug.org/event/uk-2018-user-group-event/ So far, we?ve issued over 50% of the available places, so if you are planning to attend, please do register now! Please register at: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-2018-registration-41489952565?aff=MailingList We?ve also confirmed our evening networking/social event between days 1 and 2 with thanks to our sponsors for supporting this. Please remember that we are currently limiting to two registrations per organisation. We?d like to thank our sponsors from DDN, E8, Ellexus, IBM, Lenovo, NEC and OCF for supporting the event. Simon -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From makaplan at us.ibm.com Thu Apr 5 14:37:35 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 5 Apr 2018 09:37:35 -0400 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: Message-ID: To my mind this is simpler: IF you can mmdelnode without too much suffering, do that. Then reconfigure the host name and whatever else you'd like to do. Then mmaddnode... -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Thu Apr 5 15:27:38 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 5 Apr 2018 14:27:38 +0000 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: Message-ID: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk> Yeah that was my thoughts too given Bob said you can update the server list for an NSD device in 5.0. I also thought that bringing up a second nic and changing the name etc could bring a whole world or danger from having split routing and rp_filter (been there, had the weirdness, RDMA traffic continues but admin traffic randomly fails, but hey, if you like the world crashing down around you?.) Simon From: on behalf of "makaplan at us.ibm.com" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Thursday, 5 April 2018 at 14:37 To: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] Dual server NSDs - change of hostname To my mind this is simpler: IF you can mmdelnode without too much suffering, do that. Then reconfigure the host name and whatever else you'd like to do. Then mmaddnode... -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Thu Apr 5 16:27:52 2018 From: john.hearns at asml.com (John Hearns) Date: Thu, 5 Apr 2018 15:27:52 +0000 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk> References: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk> Message-ID: Thankyou everyone for replies on this issue. Very helpful. We have a test setup with three nodes, although no multi-pathed disks. So I can try out removing and replacing disks servers. I agree with Simon that bringing up a second NIC is probably inviting Murphy in to play merry hell? The option we are envisioning is re-installing the server(s) but leaving them with the existing FQDNs if we can. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Thursday, April 05, 2018 4:28 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Dual server NSDs - change of hostname Yeah that was my thoughts too given Bob said you can update the server list for an NSD device in 5.0. I also thought that bringing up a second nic and changing the name etc could bring a whole world or danger from having split routing and rp_filter (been there, had the weirdness, RDMA traffic continues but admin traffic randomly fails, but hey, if you like the world crashing down around you?.) Simon From: > on behalf of "makaplan at us.ibm.com" > Reply-To: "gpfsug-discuss at spectrumscale.org" > Date: Thursday, 5 April 2018 at 14:37 To: "gpfsug-discuss at spectrumscale.org" > Subject: Re: [gpfsug-discuss] Dual server NSDs - change of hostname To my mind this is simpler: IF you can mmdelnode without too much suffering, do that. Then reconfigure the host name and whatever else you'd like to do. Then mmaddnode... 
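For the test cluster, the two approaches discussed above reduce to something like the sketch below. The node names are placeholders, and if the node is a quorum node or still appears in any NSD server list, those roles have to be moved first (mmchnode for quorum/manager, mmchnsd with the file system unmounted on 4.x) before mmdelnode will succeed.

# approach 1: delete the node and re-add it under its new name
mmshutdown -N sn007.old.example.com
mmdelnode -N sn007.old.example.com
# ... reinstall the OS, change the FQDN, reinstall Scale ...
mmaddnode -N sn007.new.example.com
mmchlicense server --accept -N sn007.new.example.com

# approach 2: keep the node in the cluster and change its interfaces in place,
# as suggested earlier in the thread
mmchnode -N sn007.old.example.com --daemon-interface=sn007.new.example.com --admin-interface=sn007.new.example.com

Which of the two is less disruptive depends on whether the node carries quorum/manager roles and serves NSDs, as the earlier replies point out.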
-- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From UWEFALKE at de.ibm.com Fri Apr 6 00:12:52 2018
From: UWEFALKE at de.ibm.com (Uwe Falke)
Date: Fri, 6 Apr 2018 01:12:52 +0200
Subject: [gpfsug-discuss] Dual server NSDs - change of hostname
In-Reply-To: References: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk>
Message-ID:
Hi John, some last thoughts: mmdelnode/mmaddnode is an easy way to move non-NSD servers, but doing so for NSD servers requires running mmchnsd, and that again requires a downtime for the file system the NSDs are part of (in Scale 4 at least, which is what we are talking about here). That could only be circumvented by doing mmdeldisk/mmadddisk for the NSDs of the NSD server to be moved (with all the restriping). If that's OK for you, go ahead. Else I think you might give the mmchnode way a second thought. I'd stop GPFS on the server to be moved (although that should also be hot-swappable), which should prevent any havoc for Scale and offer you plenty of opportunity to check your final new network set-up before starting Scale on that renewed node. YMMV, and you might try different methods on your test system of course.
Mit freundlichen Grüßen / Kind regards
Dr. Uwe Falke
IT Specialist
High Performance Computing Services / Integrated Technology Services / Data Center Services
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: Thomas Wolter, Sven Schooß Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122
From sjhoward at iu.edu Fri Apr 6 16:20:59 2018
From: sjhoward at iu.edu (Howard, Stewart Jameson)
Date: Fri, 6 Apr 2018 15:20:59 +0000
Subject: [gpfsug-discuss] Experiences with Export Node Transition from 3.5 -> 4.x
Message-ID: <1523028060.8115.12.camel@iu.edu>
Hi All,
We were wondering what the group's experiences have been with upgrading export nodes from 3.5, especially those upgrades that involved a transition from home-grown ADS domain integration to the new CES integration piece. Specifically, we're interested in:
1) What changes were necessary to make in your domain to get it to interoperate with CES?
2) Any good tips for CES workarounds in the case of domain configuration that cannot be changed?
3) Experience with CES user-defined auth mode in particular? Has anyone got this mode to work successfully?
Let us know. Thanks!
Stewart Howard
Indiana University
From sjhoward at iu.edu Fri Apr 6 16:14:48 2018
From: sjhoward at iu.edu (Howard, Stewart Jameson)
Date: Fri, 6 Apr 2018 15:14:48 +0000
Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2
Message-ID: <1523027688.8115.6.camel@iu.edu>
Hi All,
I was wondering what experiences the user group has had with stretch 4.x clusters. Specifically, we're interested in:
1) What SS version are you running?
2) What hardware are you running it on?
3) What has been your experience with testing of site-failover scenarios (e.g., full power loss at one site, interruption of inter-site link)?
Thanks so much for your help!
Stewart
From r.sobey at imperial.ac.uk Fri Apr 6 17:00:09 2018
From: r.sobey at imperial.ac.uk (Sobey, Richard A)
Date: Fri, 6 Apr 2018 16:00:09 +0000
Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2
In-Reply-To: <1523027688.8115.6.camel@iu.edu>
References: <1523027688.8115.6.camel@iu.edu>
Message-ID:
Hi Stewart
We're running a synchronous replication cluster between our DCs in London and Slough, at a distance of ~63km. The latency is in the order of 700 microseconds over dark fibre. Honestly... it's been a fine experience. We've never had a full connectivity loss mind you, but we have had to shut down one site fully whilst the other one carried on as normal. Mmrestripe afterwards of course. We are running Scale version 4.2.3 and looking at v5. Hardware is IBM v3700 storage, IBM rackmount NSD/CES nodes. The storage is connected via FC.
Cheers
Richard
-----Original Message-----
From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Howard, Stewart Jameson
Sent: 06 April 2018 16:15
To: gpfsug-discuss at spectrumscale.org
Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2
Hi All,
I was wondering what experiences the user group has had with stretch 4.x clusters. Specifically, we're interested in:
1) What SS version are you running?
2) What hardware are you running it on?
3) What has been your experience with testing of site-failover scenarios (e.g., full power loss at one site, interruption of inter-site link)?
Thanks so much for your help!
Stewart
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From christof.schmitt at us.ibm.com Fri Apr 6 18:42:53 2018
From: christof.schmitt at us.ibm.com (Christof Schmitt)
Date: Fri, 6 Apr 2018 17:42:53 +0000
Subject: [gpfsug-discuss] Experiences with Export Node Transition from 3.5 -> 4.x
In-Reply-To: <1523028060.8115.12.camel@iu.edu>
References: <1523028060.8115.12.camel@iu.edu>
Message-ID:
An HTML attachment was scrubbed...
URL:
From YARD at il.ibm.com Sat Apr 7 18:27:49 2018
From: YARD at il.ibm.com (Yaron Daniel)
Date: Sat, 7 Apr 2018 20:27:49 +0300
Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2
In-Reply-To: <1523027688.8115.6.camel@iu.edu>
References: <1523027688.8115.6.camel@iu.edu>
Message-ID:
Hi
We have a few customers that have 2 sites (Active/Active using SS replication) + a 3rd site as Quorum Tie Breaker node.
1) Spectrum Scale 4.2.3.x 2) Lenovo x3650 -M4 connect via FC to SVC (Flash900 as external storage) 3) We run all tests before deliver the system to customer Production. Main items to take into account : 1) What is the latecny you have between the 2 main sites ? 2) What network bandwidth between the 2 sites ? 3) What is the latency to the 3rd site from each site ? 4) Which protocols plan to be used ? Do you have layer2 between the 2 sites , or layer 3 ? 5) Do you plan to use dedicated network for GPFS daemon ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Storage Architect Petach Tiqva, 49527 IBM Global Markets, Systems HW Sales Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: "Howard, Stewart Jameson" To: "gpfsug-discuss at spectrumscale.org" Date: 04/06/2018 06:24 PM Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi All, I was wondering what experiences the user group has had with stretch 4.x clusters. Specifically, we're interested in: 1) What SS version are you running? 2) What hardware are you running it on? 3) What has been your experience with testing of site-failover scenarios (e.g., full power loss at one site, interruption of inter- site link). Thanks so much for your help! Stewart _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=Bn1XE9uK2a9CZQ8qKnJE3Q&m=yYIveWTR3gNyhJ9KsrodpWApBlpQ29Oi858MuE0Nzsw&s=V42UYnHtEYVK3LvH6i930tzte1qp0sWmiY6Pp1Ep3kg&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4376 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5093 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4746 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4557 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 11294 bytes Desc: not available URL: From valdis.kletnieks at vt.edu Sun Apr 8 17:21:34 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Sun, 08 Apr 2018 12:21:34 -0400 Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2 In-Reply-To: References: <1523027688.8115.6.camel@iu.edu> Message-ID: <230460.1523204494@turing-police.cc.vt.edu> On Sat, 07 Apr 2018 20:27:49 +0300, "Yaron Daniel" said: > Main items to take into account : > 1) What is the latecny you have between the 2 main sites ? > 2) What network bandwidth between the 2 sites ? > 3) What is the latency to the 3rd site from each site ? > 4) Which protocols plan to be used ? Do you have layer2 between the 2 sites , or layer 3 ? 
> 5) Do you plan to use dedicated network for GPFS daemon ? The answers to most of these questions are a huge "it depends". For instance, the bandwidth needed is dictated by the amount of data being replicated. The cluster I mentioned the other day was filling most of a 10Gbit link while we were importing 5 petabytes of data from our old archive solution, but now often fits its replication needs inside a few hundred mbits/sec. Similarly, the answers to (4) and (5) will depend on what long-haul network infrastructure the customer already has or can purchase. If they have layer 2 capability between the sites, that's an option. If they've just got commodity layer-3, you're designing with layer 3 in mind. If their network has VLAN capability between the sites, or a dedicated link, that will affect the answer for (5). And so on... -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From vpuvvada at in.ibm.com Mon Apr 9 05:52:56 2018 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Mon, 9 Apr 2018 10:22:56 +0530 Subject: [gpfsug-discuss] AFM-DR Questions In-Reply-To: References: Message-ID: Hi, > - Any reason why we changed the Recovery point objective (RPO) snapshots by 15 minutes to 720 minutes in the version 5.0.0 of IBM Spectrum Scale AFM-DR? AFM DR doesn't require RPO snapshots for replication, it is continuous replication. Unless there is a need for crash consistency snapshots (applications like databases need write ordering), RPO interval 15 minutes simply puts load on the system as they have to created and deleted for every 15 minutes. >- Can we use additional Independent Peer-snapshots to reduce the RPO interval (720 minutes) of IBM Spectrum Scale AFM-DR? Yes, command "mmpsnap --rpo" can be used to create RPO snapshots. Some users disable RPO on filesets and cron job is used to create RPO snapshots based on requirement. >- In addition to the above question, can we use these snapshots to update the new primary site after a failover occur for the most up to date snapshot? If applications can failover to live filesystem, it is not required to restore from the snapshot. Applications which needs crash consistency will restore from the latest snapshot during failover. AFM DR maintains at most 2 RPO snapshots. >- According to the documentation, we are not able to replicate Dependent filesets, but if these dependents filesets are under an existing Independent fileset. Do you see any issues/concerns with this? AFM DR doesn't support dependent filesets. Users won't be allowed to create them or convert to AFM DR fileset if they already exists. ~Venkat (vpuvvada at in.ibm.com) From: "Delmar Demarchi" To: gpfsug-discuss at spectrumscale.org Date: 03/29/2018 07:12 PM Subject: [gpfsug-discuss] AFM-DR Questions Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello experts. We have a Scale project with AFM-DR to be implemented and after read the KC documentation, we have some questions about. - Do you know any reason why we changed the Recovery point objective (RPO) snapshots by 15 to 720 minutes in the version 5.0.0 of IBM Spectrum Scale AFM-DR? - Can we use additional Independent Peer-snapshots to reduce the RPO interval (720 minutes) of IBM Spectrum Scale AFM-DR? - In addition to the above question, can we use these snapshots to update the new primary site after a failover occur for the most up to date snapshot? 
- According to the documentation, we are not able to replicate Dependent filesets, but if these dependents filesets are part of an existing Independent fileset. Do you see any issues/concerns with this? Thank you in advance. Delmar Demarchi .'. (delmard at br.ibm.com)_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=ERiLT5aa1e1r1QyLkokJhA1Q5frqqgQ-g90JT0MGQvQ&s=KVjGaS1dG0luvtm0yh4rBpKNbUquTGuf2FSmaNBIOIM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Apr 9 10:00:26 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 9 Apr 2018 09:00:26 +0000 Subject: [gpfsug-discuss] GUI not displaying node info correctly Message-ID: Hi all, We have a fairly significant number of filesets for which the GUI reports nothing at all in the "Max Inodes" column. I've verified with mmlsfileset -I that the inode limit is set. Has anyone seen this already and had a PMR for it? SS 4.2.3-7. Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.roth at de.ibm.com Mon Apr 9 10:38:48 2018 From: stefan.roth at de.ibm.com (Stefan Roth) Date: Mon, 9 Apr 2018 11:38:48 +0200 Subject: [gpfsug-discuss] GUI not displaying node info correctly In-Reply-To: References: Message-ID: Hello Richard, this is a known GUI bug that will be fixed in 4.2.3-8. Once this is available, just upgrade the GUI rpm. The 4.2.3-8 PTF is not yet available, but it should be in next days. This problem happens to all customers with more than 127 filesets, means you see a max inodes value for the first 127 filesets, but not for newer filesets. 
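A quick way to cross-check the configured limits from the command line while waiting for the PTF is mmlsfileset in its detailed form; this is only a sketch, and the file system name gpfs0 below is a placeholder:

   # Lists every fileset with its inode space, maximum inodes and allocated inodes,
   # independent of what the GUI currently displays.
   mmlsfileset gpfs0 -L

   # Count the filesets, to confirm whether you are past the 127-fileset boundary
   # mentioned above (the first two lines of output are headers).
   mmlsfileset gpfs0 | tail -n +3 | wc -l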
Mit freundlichen Grüßen / Kind regards

Stefan Roth
Spectrum Scale GUI Development

Phone: +49-7034-643-1362
E-Mail: stefan.roth at de.ibm.com

IBM Deutschland
Am Weiher 24
65451 Kelsterbach
Germany

IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

From: "Sobey, Richard A" 
To: "'gpfsug-discuss at spectrumscale.org'" 
Date: 09.04.2018 11:01
Subject: [gpfsug-discuss] GUI not displaying node info correctly
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi all,

We have a fairly significant number of filesets for which the GUI reports nothing at all in the "Max Inodes" column. I've verified with mmlsfileset -I that the inode limit is set.

Has anyone seen this already and had a PMR for it? SS 4.2.3-7.

Thanks
Richard
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=FZXEKEUIKKfyO2oYq6outktXzRFzl0eKP2opGp7UNks&s=f3eT53kYib3aoHB5addQ_EyZRmCZM2gtiGsZj6aq2ZM&e=
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ecblank.gif
Type: image/gif
Size: 45 bytes
Desc: not available
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 19702371.gif
Type: image/gif
Size: 156 bytes
Desc: not available
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 19171259.gif
Type: image/gif
Size: 1851 bytes
Desc: not available
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 19868035.gif Type: image/gif Size: 63 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Mon Apr 9 11:19:50 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 9 Apr 2018 10:19:50 +0000 Subject: [gpfsug-discuss] GUI not displaying node info correctly In-Reply-To: References: Message-ID: Thanks Stefan, very interesting. Richard From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Stefan Roth Sent: 09 April 2018 10:39 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GUI not displaying node info correctly Hello Richard, this is a known GUI bug that will be fixed in 4.2.3-8. Once this is available, just upgrade the GUI rpm. The 4.2.3-8 PTF is not yet available, but it should be in next days. This problem happens to all customers with more than 127 filesets, means you see a max inodes value for the first 127 filesets, but not for newer filesets. Mit freundlichen Gr??en / Kind regards Stefan Roth Spectrum Scale GUI Development [cid:image002.gif at 01D3CFF4.ABF16450] Phone: +49-7034-643-1362 IBM Deutschland [cid:image003.gif at 01D3CFF4.ABF16450] E-Mail: stefan.roth at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany [cid:image002.gif at 01D3CFF4.ABF16450] IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 [Inactive hide details for "Sobey, Richard A" ---09.04.2018 11:01:01---Hi all, We have a fairly significant number of filesets f]"Sobey, Richard A" ---09.04.2018 11:01:01---Hi all, We have a fairly significant number of filesets for which the GUI reports nothing at all in From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 09.04.2018 11:01 Subject: [gpfsug-discuss] GUI not displaying node info correctly Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi all, We have a fairly significant number of filesets for which the GUI reports nothing at all in the ?Max Inodes? column. I?ve verified with mmlsfileset -I that the inode limit is set. Has anyone seen this already and had a PMR for it? SS 4.2.3-7. Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=FZXEKEUIKKfyO2oYq6outktXzRFzl0eKP2opGp7UNks&s=f3eT53kYib3aoHB5addQ_EyZRmCZM2gtiGsZj6aq2ZM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 166 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 156 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 1851 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.gif Type: image/gif Size: 63 bytes Desc: image004.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image005.gif Type: image/gif Size: 105 bytes Desc: image005.gif URL: From john.hearns at asml.com Mon Apr 9 15:43:21 2018 From: john.hearns at asml.com (John Hearns) Date: Mon, 9 Apr 2018 14:43:21 +0000 Subject: [gpfsug-discuss] Installer cannot find libdbgwrapper70.so In-Reply-To: References: Message-ID: And I have fixed my own issue... In the chroot environment: mount -t proc /proc /proc Rookie mistake. Head hung in shame. But I beg forgiveness. My first comps Sci lecturer, Jennifer Haselgrove at Glasgow, taught us an essential programming technique on day one. Always discuss your program with your cat. Sit down with him or her, and talk them through the algorithm, and any bugs which you have. It is a very effective technique. I thank you all for being stand-in cats. As an aside, I will not be at the London meeting next week. Would be good to put some faces to names, and to seek out beer. I am sure IBMers can point you all in the correct direction for that. From: John Hearns Sent: Monday, April 09, 2018 4:37 PM To: gpfsug main discussion list Subject: Installer cannot find libdbgwrapper70.so I am running the SpectrumScale install package on an chrooted image which is a RHEL 7.3 install (in -text-only mode) It fails with: /usr/lpp/mmfs/4.2.3.7/ibm-java-x86_64-71/jre/bin/java: error while loading shared libraries: libdbgwrapper70.so: cannot open shared object file: In the past I have fixed java issuew with the installer by using the 'alternatives' mechanism to switch to another java. This time this does not work. Ideas please... and thankyou in advance. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Mon Apr 9 15:37:21 2018 From: john.hearns at asml.com (John Hearns) Date: Mon, 9 Apr 2018 14:37:21 +0000 Subject: [gpfsug-discuss] Installer cannot find libdbgwrapper70.so Message-ID: I am running the SpectrumScale install package on an chrooted image which is a RHEL 7.3 install (in -text-only mode) It fails with: /usr/lpp/mmfs/4.2.3.7/ibm-java-x86_64-71/jre/bin/java: error while loading shared libraries: libdbgwrapper70.so: cannot open shared object file: In the past I have fixed java issuew with the installer by using the 'alternatives' mechanism to switch to another java. This time this does not work. Ideas please... and thankyou in advance. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. 
Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Mon Apr 9 18:17:52 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Mon, 9 Apr 2018 17:17:52 +0000 Subject: [gpfsug-discuss] GPFS GUI - DataPool_capUtil error Message-ID: Hi All, I?m pretty new to using the GPFS GUI for health and performance monitoring, but am finding it very useful. I?ve got an issue that I can?t figure out. In my events I see: Event name:pool-data_high_error Component:File SystemEntity type:PoolEntity name: Event time:3/26/18 4:44:10 PM Message:The pool of file system reached a nearly exhausted data level. DataPool_capUtilDescription:The pool reached a nearly exhausted level. Cause:The pool reached a nearly exhausted level. User action:Add more capacity to pool or move data to different pool or delete data and/or snapshots. Reporting node: Event type:Active health state of an entity which is monitored by the system. Now this is for a ?capacity? pool ? i.e. one that mmapplypolicy is going to fill up to 97% full. Therefore, I?ve modified the thresholds: ### Threshold Rules ### rule_name metric error warn direction filterBy groupBy sensitivity -------------------------------------------------------------------------------------------------------------------------------------------------- InodeCapUtil_Rule Fileset_inode 90.0 80.0 high gpfs_cluster_name,gpfs_fs_name,gpfs_fset_name 300 MemFree_Rule mem_memfree 50000 100000 low node 300 MetaDataCapUtil_Rule MetaDataPool_capUtil 90.0 80.0 high gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name 300 DataCapUtil_Rule DataPool_capUtil 99.0 90.0 high gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name 300 But it?s still in an ?Error? state. I see that the time of the event is March 26th at 4:44 PM, so I?m thinking this is something that?s just stale, but I can?t figure out how to clear it. The mmhealth command shows the error, too, and from that message it appears as if the event was triggered prior to my adjusting the thresholds: Event Parameter Severity Active Since Event Message ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- pool-data_high_error redacted ERROR 2018-03-26 16:44:10 The pool redacted of file system redacted reached a nearly exhausted data level. 90.0 What do I need to do to get the GUI / mmhealth to recognize the new thresholds and clear this error? I?ve searched and searched in the GUI for a way to clear it. I?ve read the ?Monitoring and Managing IBM Spectrum Scale Using the GUI? rebook pretty much cover to cover and haven?t found anything there about how to clear this. Thanks... Kevin ? 
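For reference, the same threshold state can be inspected from the command line; this is only a sketch of the sort of checks that apply here, not a full procedure:

   # Show the active threshold rules in detail; the verbose listing also includes
   # the hysteresis value, which controls when an already-raised event is re-evaluated.
   mmhealth thresholds list --verbose

   # Show the current health state and recent events as the monitor on this node sees them.
   mmhealth node show
   mmhealth node eventlog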
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Apr 9 18:20:38 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 9 Apr 2018 17:20:38 +0000 Subject: [gpfsug-discuss] Reminder: SSUG-US Spring meeting - May 16-17th, Cambridge, Ma Message-ID: Only a little over a month away! The registration for the Spring meeting of the SSUG-USA is now open. This is a Free two-day and will include a large number of Spectrum Scale updates and breakout tracks. W have limited meeting space so please register early if you plan on attending. Registration and agenda details: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2018-meeting-tickets-43662759489 DATE AND TIME Wed, May 16, 2018, 9:00 AM ? Thu, May 17, 2018, 5:00 PM EDT LOCATION IBM Cambridge Innovation Center One Rogers Street Cambridge, MA 02142-1203 Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... URL: From nick.savva at adventone.com Mon Apr 9 23:51:05 2018 From: nick.savva at adventone.com (Nick Savva) Date: Mon, 9 Apr 2018 22:51:05 +0000 Subject: [gpfsug-discuss] Device mapper Message-ID: Hi all, Apologies in advance if this has been covered already in discussions. I'm building a new spectrum scale cluster and I am trying to get consistent device names across all nodes. I am attempting to use aliases in the multipath.conf which actually works and creates the /dev/mapper/ link. I understand you can also copy the bindings file but I think aliases is probably easier to maintain. However Spectrum scale will not accept the /dev/mapper device it only looks for dm-X devices that are in the /proc/partitions file. I know SONAS is pointing to Device mapper so there must be a way? Im looking at the /var/mmfs/etc/nsddevices is it a case of editing this file to find the /dev/mapper device? Appreciate the help in advance, Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Tue Apr 10 01:04:12 2018 From: bbanister at jumptrading.com (Bryan Banister) Date: Tue, 10 Apr 2018 00:04:12 +0000 Subject: [gpfsug-discuss] Device mapper In-Reply-To: References: Message-ID: <6c952e81c58940a19114ee1c976501e0@jumptrading.com> Hi Nick, You are correct. You need to update the nsddevices file to look in /dev/mapper. Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Nick Savva Sent: Monday, April 09, 2018 5:51 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Device mapper Note: External Email ________________________________ Hi all, Apologies in advance if this has been covered already in discussions. I'm building a new spectrum scale cluster and I am trying to get consistent device names across all nodes. I am attempting to use aliases in the multipath.conf which actually works and creates the /dev/mapper/ link. I understand you can also copy the bindings file but I think aliases is probably easier to maintain. However Spectrum scale will not accept the /dev/mapper device it only looks for dm-X devices that are in the /proc/partitions file. I know SONAS is pointing to Device mapper so there must be a way? 
Im looking at the /var/mmfs/etc/nsddevices is it a case of editing this file to find the /dev/mapper device? Appreciate the help in advance, Nick ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Tue Apr 10 03:27:13 2018 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Tue, 10 Apr 2018 02:27:13 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 2 In-Reply-To: <026b2aa97247b551b28ea13678484a4b@webmail.gpfsug.org> Message-ID: Claire/ Richard et al. The link works for me also, but I agree that the URL is complex and ugly. I am sure there must be a simpler URL with less embedded metadata that could be used? eg. Cutting it down to this appears to still work: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales +44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 3 Apr 2018, at 04:56, Secretary GPFS UG wrote: > > Hi Richard, > > My apologies, that is strange. This is the link and I have checked it works: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP > > If you're still having problems or require further information, please send an e-mail to justine_ive at uk.ibm.com > > Many thanks, > > --- > Claire O'Toole > Spectrum Scale/GPFS User Group Secretary > +44 (0)7508 033896 > www.spectrumscaleug.org >> On , Richard Booth wrote: >> >> Hi Claire >> >> The link at the bottom of your email, doesn't appear to be working. >> >> Richard >> >>> On 3 April 2018 at 12:00, wrote: >>> Send gpfsug-discuss mailing list submissions to >>> gpfsug-discuss at spectrumscale.org >>> >>> To subscribe or unsubscribe via the World Wide Web, visit >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> or, via email, send a message with subject or body 'help' to >>> gpfsug-discuss-request at spectrumscale.org >>> >>> You can reach the person managing the list at >>> gpfsug-discuss-owner at spectrumscale.org >>> >>> When replying, please edit your Subject line so it is more specific >>> than "Re: Contents of gpfsug-discuss digest..." >>> >>> >>> Today's Topics: >>> >>> 1. 
Transforming Workflows at Scale (Secretary GPFS UG) >>> >>> >>> ---------------------------------------------------------------------- >>> >>> Message: 1 >>> Date: Tue, 03 Apr 2018 11:41:41 +0100 >>> From: Secretary GPFS UG >>> To: gpfsug main discussion list >>> Subject: [gpfsug-discuss] Transforming Workflows at Scale >>> Message-ID: <037f89ab466334f83f235f357111a9d6 at webmail.gpfsug.org> >>> Content-Type: text/plain; charset="us-ascii" >>> >>> >>> >>> Dear all, >>> >>> There's a Spectrum Scale for media breakfast briefing event being >>> organised by IBM at IBM South Bank, London on 17th April (the day before >>> the next UK meeting). >>> >>> The event has been designed for broadcasters, post production houses and >>> visual effects organisations, where managing workflows between different >>> islands of technology is a major challenge. >>> >>> If you're interested, you can read more and register at the IBM >>> Registration Page [1]. >>> >>> Thanks, >>> -- >>> >>> Claire O'Toole >>> Spectrum Scale/GPFS User Group Secretary >>> +44 (0)7508 033896 >>> www.spectrumscaleug.org >>> >>> >>> Links: >>> ------ >>> [1] >>> https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP >>> -------------- next part -------------- >>> An HTML attachment was scrubbed... >>> URL: >>> >>> ------------------------------ >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> End of gpfsug-discuss Digest, Vol 75, Issue 2 >>> ********************************************* >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=3aZjjrv3ym45au9B33YgmVP51qvaHXYad4WRjccMOdk&s=rnsXK8Eibl0HLAElxCQexfrV8ReoB8hOYlkk3PmhqN4&e= Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From tortay at cc.in2p3.fr Tue Apr 10 06:51:25 2018 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Tue, 10 Apr 2018 07:51:25 +0200 Subject: [gpfsug-discuss] Device mapper In-Reply-To: References: Message-ID: <0b9f3629-146f-3720-fda8-3d51c0c37614@cc.in2p3.fr> On 10/04/2018 00:51, Nick Savva wrote: > Hi all, > > Apologies in advance if this has been covered already in discussions. > > I'm building a new spectrum scale cluster and I am trying to get consistent device names across all nodes. I am attempting to use aliases in the multipath.conf which actually works and creates the /dev/mapper/ link. > > I understand you can also copy the bindings file but I think aliases is probably easier to maintain. > > However Spectrum scale will not accept the /dev/mapper device it only looks for dm-X devices that are in the /proc/partitions file. 
I know SONAS is pointing to Device mapper so there must be a way?
>
> Im looking at the /var/mmfs/etc/nsddevices is it a case of editing this file to find the /dev/mapper device?
>
Hello,

We're doing this, indeed, using the "nsddevices" script. The names printed by the script must be relative to "/dev".

Our script contains the following (our multipath aliases are "nsdXY"):

cd /dev && for nsd in mapper/nsd* ; do
    [ -e $nsd ] && echo "$nsd dmm"
done
return 0

The meaning of "dmm" is described in "/usr/lpp/mmfs/bin/mmdevdiscover".

Loïc.
--
| Loïc Tortay - IN2P3 Computing Centre |

From rohwedder at de.ibm.com Tue Apr 10 08:57:44 2018
From: rohwedder at de.ibm.com (Markus Rohwedder)
Date: Tue, 10 Apr 2018 09:57:44 +0200
Subject: [gpfsug-discuss] GPFS GUI - DataPool_capUtil error
In-Reply-To: 
References: 
Message-ID: 

Hello Kevin,

it could be that the "hysteresis" parameter is still set to a non zero value. You can check by using the mmhealth thresholds list --verbose command, or of course by using the Monitor>Thresholds page.

Mit freundlichen Grüßen / Kind regards

Dr. Markus Rohwedder
Spectrum Scale GUI Development
Phone: +49 7034 6430190
IBM Deutschland Research & Development
E-Mail: rohwedder at de.ibm.com
Am Weiher 24
65451 Kelsterbach
Germany

From: "Buterbaugh, Kevin L" 
To: gpfsug main discussion list 
Date: 09.04.2018 19:18
Subject: [gpfsug-discuss] GPFS GUI - DataPool_capUtil error
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi All,

I'm pretty new to using the GPFS GUI for health and performance monitoring, but am finding it very useful. I've got an issue that I can't figure out. In my events I see:

Event name: pool-data_high_error
Component: File System
Entity type: Pool
Entity name: 
Event time: 3/26/18 4:44:10 PM
Message: The pool of file system reached a nearly exhausted data level. DataPool_capUtil
Description: The pool reached a nearly exhausted level.
Cause: The pool reached a nearly exhausted level.
User action: Add more capacity to pool or move data to different pool or delete data and/or snapshots.
Reporting node: 
Event type: Active health state of an entity which is monitored by the system.

Now this is for a "capacity" pool - i.e. one that mmapplypolicy is going to fill up to 97% full. Therefore, I've modified the thresholds:

### Threshold Rules ###
rule_name              metric                 error   warn     direction  filterBy  groupBy                                             sensitivity
--------------------------------------------------------------------------------------------------------------------------------------------------
InodeCapUtil_Rule      Fileset_inode          90.0    80.0     high                 gpfs_cluster_name,gpfs_fs_name,gpfs_fset_name       300
MemFree_Rule           mem_memfree            50000   100000   low                  node                                                300
MetaDataCapUtil_Rule   MetaDataPool_capUtil   90.0    80.0     high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name   300
DataCapUtil_Rule       DataPool_capUtil       99.0    90.0     high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name   300

But it's still in an "Error" state. I see that the time of the event is March 26th at 4:44 PM, so I'm thinking this is something that's just stale, but I can't figure out how to clear it.
The mmhealth command shows the error, too, and from that message it appears as if the event was triggered prior to my adjusting the thresholds: Event Parameter Severity Active Since Event Message ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- pool-data_high_error redacted ERROR 2018-03-26 16:44:10 The pool redacted of file system redacted reached a nearly exhausted data level. 90.0 What do I need to do to get the GUI / mmhealth to recognize the new thresholds and clear this error? I?ve searched and searched in the GUI for a way to clear it. I?ve read the ?Monitoring and Managing IBM Spectrum Scale Using the GUI? rebook pretty much cover to cover and haven?t found anything there about how to clear this. Thanks... Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=l6AoS-QQpHgDtZkWluGw6Lln0PEOyUeS1ujJR2o1Hjg&s=X6bQXF1YmSSq1QyOkQXHYF1NMhczdJSPtWL4fpjbZ24&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 1A990285.gif Type: image/gif Size: 4659 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Tue Apr 10 09:55:30 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 10 Apr 2018 08:55:30 +0000 Subject: [gpfsug-discuss] CES SMB export limit Message-ID: Is there a limit to the number of SMB exports we can create in CES? Figures being thrown around here suggest 256 but we'd like to know for sure. Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroche at lenovo.com Tue Apr 10 11:13:49 2018 From: jroche at lenovo.com (Jim Roche) Date: Tue, 10 Apr 2018 10:13:49 +0000 Subject: [gpfsug-discuss] Transforming Workflows at Scale In-Reply-To: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> References: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> Message-ID: Hi Claire, Can I add a registration to the Lenovo listing please? One of our technical architects from Israel would like to attend the event. Can we add: Gilad Berman HPC Architect Lenovo EMEA [Phone]+972-52-2554262 [Email]gberman at lenovo.com To the Attendee list? 
Thanks, Jim [http://lenovocentral.lenovo.com/marketing/branding/email_signature/images/gradient.gif] Jim Roche UK HPC Technical Sales Leader Discovery House 18 Bartley Wood Business Park Hook, RG27 9XA Lenovo United Kingdom [Phone]+44 (0)7702 678579 [Email]jroche at lenovo.com Lenovo.com /uk Twitter | Facebook | Instagram | Blogs | Forums [DifferentBetter-Laser] From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Secretary GPFS UG Sent: Tuesday, April 3, 2018 11:42 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Transforming Workflows at Scale Dear all, There's a Spectrum Scale for media breakfast briefing event being organised by IBM at IBM South Bank, London on 17th April (the day before the next UK meeting). The event has been designed for broadcasters, post production houses and visual effects organisations, where managing workflows between different islands of technology is a major challenge. If you're interested, you can read more and register at the IBM Registration Page. Thanks, -- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 92 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 128 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 1899 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.gif Type: image/gif Size: 7770 bytes Desc: image004.gif URL: From jroche at lenovo.com Tue Apr 10 11:30:37 2018 From: jroche at lenovo.com (Jim Roche) Date: Tue, 10 Apr 2018 10:30:37 +0000 Subject: [gpfsug-discuss] Transforming Workflows at Scale In-Reply-To: References: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> Message-ID: Hi All, sorry for the spam?. Finger troubles. ? Jim [http://lenovocentral.lenovo.com/marketing/branding/email_signature/images/gradient.gif] Jim Roche UK HPC Technical Sales Leader Discovery House 18 Bartley Wood Business Park Hook, RG27 9XA Lenovo United Kingdom [Phone]+44 (0)7702 678579 [Email]jroche at lenovo.com Lenovo.com /uk Twitter | Facebook | Instagram | Blogs | Forums [DifferentBetter-Laser] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 1899 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image005.gif Type: image/gif Size: 92 bytes Desc: image005.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image006.gif Type: image/gif Size: 128 bytes Desc: image006.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image007.gif Type: image/gif Size: 7770 bytes Desc: image007.gif URL: From carlz at us.ibm.com Tue Apr 10 16:33:54 2018 From: carlz at us.ibm.com (Carl Zetie) Date: Tue, 10 Apr 2018 15:33:54 +0000 Subject: [gpfsug-discuss] CES SMB export limit In-Reply-To: References: Message-ID: Hi Richard, KC says "IBM Spectrum Scale? can host a maximum of 1,000 SMB shares. 
There must be less than 3,000 SMB connections per protocol node and less than 20,000 SMB connections across all protocol nodes."

Are those the numbers you are looking for?

Carl Zetie
Offering Manager for Spectrum Scale, IBM
(540) 882 9353 ][ Research Triangle Park
carlz at us.ibm.com

From aaron.s.knister at nasa.gov Tue Apr 10 17:00:09 2018
From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP])
Date: Tue, 10 Apr 2018 16:00:09 +0000
Subject: [gpfsug-discuss] Confusing I/O Behavior
Message-ID: 

I hate admitting this but I've found something that's got me stumped.

We have a user running an MPI job on the system. Each rank opens up several output files to which it writes ASCII debug information. The net result across several hundred ranks is an absolute smattering of teeny tiny I/o requests to the underlying disks which they don't appreciate. Performance plummets. The I/o requests are 30 to 80 bytes in size.
> > -Stumped > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From makaplan at us.ibm.com Tue Apr 10 17:28:29 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 10 Apr 2018 12:28:29 -0400 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cphoffma at uoregon.edu Tue Apr 10 17:18:49 2018 From: cphoffma at uoregon.edu (Chris Hoffman) Date: Tue, 10 Apr 2018 16:18:49 +0000 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: <1523377129792.79060@uoregon.edu> ?Hi Stumped, Is this MPI job on one machine? Multiple nodes? Are the tiny 8K writes to the same file or different ones? Chris ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP] Sent: Tuesday, April 10, 2018 9:00 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Confusing I/O Behavior I hate admitting this but I've found something that's got me stumped. We have a user running an MPI job on the system. Each rank opens up several output files to which it writes ASCII debug information. The net result across several hundred ranks is an absolute smattering of teeny tiny I/o requests to te underlying disks which they don't appreciate. Performance plummets. The I/o requests are 30 to 80 bytes in size. What I don't understand is why these write requests aren't getting batched up into larger write requests to the underlying disks. If I do something like "df if=/dev/zero of=foo bs=8k" on a node I see that the nasty unaligned 8k io requests are batched up into nice 1M I/o requests before they hit the NSD. As best I can tell the application isn't doing any fsync's and isn't doing direct io to these files. Can anyone explain why seemingly very similar io workloads appear to result in well formed NSD I/O in one case and awful I/o in another? Thanks! -Stumped -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Apr 10 17:52:30 2018 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 10 Apr 2018 12:52:30 -0400 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: <1523377129792.79060@uoregon.edu> References: <1523377129792.79060@uoregon.edu> Message-ID: Chris, The job runs across multiple nodes and the tinky 8K writes *should* be to different files that are unique per-rank. -Aaron On 4/10/18 12:18 PM, Chris Hoffman wrote: > ?Hi Stumped, > > > Is this MPI job on one machine? Multiple nodes? Are the tiny 8K writes > to the same file or different ones? > > > Chris > > ------------------------------------------------------------------------ > *From:* gpfsug-discuss-bounces at spectrumscale.org > on behalf of Knister, Aaron > S. 
(GSFC-606.2)[COMPUTER SCIENCE CORP] > *Sent:* Tuesday, April 10, 2018 9:00 AM > *To:* gpfsug main discussion list > *Subject:* [gpfsug-discuss] Confusing I/O Behavior > I hate admitting this but I?ve found something that?s got me stumped. > > We have a user running an MPI job on the system. Each rank opens up > several output files to which it writes ASCII debug information. The net > result across several hundred ranks is an absolute smattering of teeny > tiny I/o requests to te underlying disks which they don?t appreciate. > Performance plummets. The I/o requests are 30 to 80 bytes in size. What > I don?t understand is why these write requests aren?t getting batched up > into larger write requests to the underlying disks. > > If I do something like ?df if=/dev/zero of=foo bs=8k? on a node I see > that the nasty unaligned 8k io requests are batched up into nice 1M I/o > requests before they hit the NSD. > > As best I can tell the application isn?t doing any fsync?s and isn?t > doing direct io to these files. > > Can anyone explain why seemingly very similar io workloads appear to > result in well formed NSD I/O in one case and awful I/o in another? > > Thanks! > > -Stumped > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From UWEFALKE at de.ibm.com Tue Apr 10 22:43:30 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Tue, 10 Apr 2018 23:43:30 +0200 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Hi Aaron, to how many different files do these tiny I/O requests go? Mind that the write aggregates the I/O over a limited time (5 secs or so) and ***per file***. It is for that matter a large difference to write small chunks all to one file or to a large number of individual files . to fill a 1 MiB buffer you need about 13100 chunks of 80Bytes ***per file*** within those 5 secs. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" To: gpfsug main discussion list Date: 10/04/2018 18:09 Subject: [gpfsug-discuss] Confusing I/O Behavior Sent by: gpfsug-discuss-bounces at spectrumscale.org I hate admitting this but I?ve found something that?s got me stumped. We have a user running an MPI job on the system. Each rank opens up several output files to which it writes ASCII debug information. The net result across several hundred ranks is an absolute smattering of teeny tiny I/o requests to te underlying disks which they don?t appreciate. Performance plummets. The I/o requests are 30 to 80 bytes in size. 
What I don?t understand is why these write requests aren?t getting batched up into larger write requests to the underlying disks. If I do something like ?df if=/dev/zero of=foo bs=8k? on a node I see that the nasty unaligned 8k io requests are batched up into nice 1M I/o requests before they hit the NSD. As best I can tell the application isn?t doing any fsync?s and isn?t doing direct io to these files. Can anyone explain why seemingly very similar io workloads appear to result in well formed NSD I/O in one case and awful I/o in another? Thanks! -Stumped _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From r.sobey at imperial.ac.uk Wed Apr 11 09:22:12 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 11 Apr 2018 08:22:12 +0000 Subject: [gpfsug-discuss] CES SMB export limit In-Reply-To: References: Message-ID: Just the 1000 SMB shares limit was what I wanted but the other info was useful, thanks Carl. Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Carl Zetie Sent: 10 April 2018 16:34 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] CES SMB export limit Hi Richard, KC says "IBM Spectrum Scale? can host a maximum of 1,000 SMB shares. There must be less than 3,000 SMB connections per protocol node and less than 20,000 SMB connections across all protocol nodes." Are those the numbers you are looking for? Carl Zetie Offering Manager for Spectrum Scale, IBM (540) 882 9353 ][ Research Triangle Park carlz at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathan.buzzard at strath.ac.uk Wed Apr 11 11:14:21 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 11 Apr 2018 11:14:21 +0100 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: <1523441661.19449.153.camel@strath.ac.uk> On Tue, 2018-04-10 at 23:43 +0200, Uwe Falke wrote: > Hi Aaron,? > to how many different files do these tiny I/O requests go? > > Mind that the write aggregates the I/O over a limited time (5 secs or > so)?and ***per file***.? > It is for that matter a large difference to write small chunks all to > one? > file or to a large number of individual files . > to fill a??1 MiB buffer you need about 13100 chunks of??80Bytes > ***per? > file*** within those 5 secs.? > Something else to bear in mind is that you might be using a library that converts everything into putchar's. I have seen this in the past with Office on a Mac platform and made performance saving a file over SMB/NFS appalling. I mean really really bad, a?"save as" which didn't do that would take a second or two, a save would take like 15 minutes. To the local disk it was just fine. The GPFS angle is this was all on a self rolled clustered Samba GPFS setup back in the day. Took a long time to track down, and performance turned out to be just as appalling with a real Windows file server. JAB. -- Jonathan A. Buzzard?????????????????????????Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG From UWEFALKE at de.ibm.com Wed Apr 11 11:53:36 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Wed, 11 Apr 2018 12:53:36 +0200 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: <1523441661.19449.153.camel@strath.ac.uk> References: <1523441661.19449.153.camel@strath.ac.uk> Message-ID: It would be interesting in which chunks data arrive at the NSDs -- if those chunks are bigger than the individual I/Os (i.e. multiples of the record sizes), there is some data coalescing going on and it just needs to have its path well paved ... If not, there might be indeed something odd in the configuration. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 gpfsug-discuss-bounces at spectrumscale.org wrote on 11/04/2018 12:14:21: > From: Jonathan Buzzard > To: gpfsug main discussion list > Date: 11/04/2018 12:14 > Subject: Re: [gpfsug-discuss] Confusing I/O Behavior > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > On Tue, 2018-04-10 at 23:43 +0200, Uwe Falke wrote: > > Hi Aaron, > > to how many different files do these tiny I/O requests go? > > > > Mind that the write aggregates the I/O over a limited time (5 secs or > > so) and ***per file***. > > It is for that matter a large difference to write small chunks all to > > one > > file or to a large number of individual files . > > to fill a 1 MiB buffer you need about 13100 chunks of 80Bytes > > ***per > > file*** within those 5 secs. > > > > Something else to bear in mind is that you might be using a library > that converts everything into putchar's. I have seen this in the past > with Office on a Mac platform and made performance saving a file over > SMB/NFS appalling. I mean really really bad, a "save as" which didn't > do that would take a second or two, a save would take like 15 minutes. > To the local disk it was just fine. > > The GPFS angle is this was all on a self rolled clustered Samba GPFS > setup back in the day. Took a long time to track down, and performance > turned out to be just as appalling with a real Windows file server. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From peserocka at gmail.com Wed Apr 11 12:06:40 2018 From: peserocka at gmail.com (Peter Serocka) Date: Wed, 11 Apr 2018 13:06:40 +0200 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Let?s keep in mind that line buffering is a concept within the standard C library; if every log line triggers one write(2) system call, and it?s not direct io, then multiple write still get coalesced into few larger disk writes (as with the dd example). A logging application might choose to close(2) a log file after each write(2) ? that produces a different scenario, where the file system might guarantee that the data has been written to disk when close(2) return a success. (Local Linux file systems do not do this with default mounts, but networked filesystems usually do.) Aaron, can you trace your application to see what is going on in terms of system calls? ? Peter > On 2018 Apr 10 Tue, at 18:28, Marc A Kaplan wrote: > > Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chair at spectrumscale.org Wed Apr 11 12:21:04 2018 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Wed, 11 Apr 2018 12:21:04 +0100 Subject: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale Message-ID: Hi All, At the UK meeting next week, we?ve had a speaker slot become available, we?re planning to put in a BoF type session on tooling Spectrum Scale so we have space for a few 3-5 minute quick talks on what people are doing to automate. If you are coming along and interested, please drop me an email. Max of 3 slides! Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Wed Apr 11 15:36:29 2018 From: valleru at cbio.mskcc.org (Lohit Valleru) Date: Wed, 11 Apr 2018 10:36:29 -0400 Subject: [gpfsug-discuss] GPFS, MMAP and Pagepool In-Reply-To: References: Message-ID: Hey Sven, This is regarding mmap issues and GPFS. We had discussed previously of experimenting with GPFS 5. I now have upgraded all of compute nodes and NSD nodes to GPFS 5.0.0.2 I am yet to experiment with mmap performance, but before that - I am seeing weird hangs with GPFS 5 and I think it could be related to mmap. Have you seen GPFS ever hang on this syscall? [Tue Apr 10 04:20:13 2018] [] _ZN10gpfsNode_t8mmapLockEiiPKj+0xb5/0x140 [mmfs26] I see the above ,when kernel hangs and throws out a series of trace calls. I somehow think the above trace is related to processes hanging on GPFS forever. There are no errors in GPFS however. Also, I think the above happens only when the mmap threads go above a particular number. We had faced a similar issue in 4.2.3 and it was resolved in a patch to 4.2.3.2 . At that time , the issue happened when mmap threads go more than worker1threads. According to the ticket - it was a mmap race condition that GPFS was not handling well. I am not sure if this issue is a repeat and I am yet to isolate the incident and test with increasing number of mmap threads. 
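(For reference, a minimal sketch of the data that usually helps pin down a hang like this; the PID is a placeholder, and comparing the number of mmap threads against worker1Threads is only an assumption based on the 4.2.3 incident described above, not an IBM-documented check:
mmdiag --waiters            # long-running waiters on the client while the processes are stuck
mmlsconfig worker1Threads   # current limit, to compare against the number of mmap threads
cat /proc/<PID>/stack       # kernel stack of one hung process
)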
I am not 100 percent sure if this is related to mmap yet but just wanted to ask you if you have seen anything like above. Thanks, Lohit On Feb 22, 2018, 3:59 PM -0500, Sven Oehme , wrote: > Hi Lohit, > > i am working with ray on a mmap performance improvement right now, which most likely has the same root cause as yours , see -->??http://gpfsug.org/pipermail/gpfsug-discuss/2018-January/004411.html > the thread above is silent after a couple of back and rorth, but ray and i have active communication in the background and will repost as soon as there is something new to share. > i am happy to look at this issue after we finish with ray's workload if there is something missing, but first let's finish his, get you try the same fix and see if there is something missing. > > btw. if people would share their use of MMAP , what applications they use (home grown, just use lmdb which uses mmap under the cover, etc) please let me know so i get a better picture on how wide the usage is with GPFS. i know a lot of the ML/DL workloads are using it, but i would like to know what else is out there i might not think about. feel free to drop me a personal note, i might not reply to it right away, but eventually. > > thx. sven > > > > On Thu, Feb 22, 2018 at 12:33 PM wrote: > > > Hi all, > > > > > > I wanted to know, how does mmap interact with GPFS pagepool with respect to filesystem block-size? > > > Does the efficiency depend on the mmap read size and the block-size of the filesystem even if all the data is cached in pagepool? > > > > > > GPFS 4.2.3.2 and CentOS7. > > > > > > Here is what i observed: > > > > > > I was testing a user script that uses mmap to read from 100M to 500MB files. > > > > > > The above files are stored on 3 different filesystems. > > > > > > Compute nodes - 10G pagepool and 5G seqdiscardthreshold. > > > > > > 1. 4M block size GPFS filesystem, with separate metadata and data. Data on Near line and metadata on SSDs > > > 2. 1M block size GPFS filesystem as a AFM cache cluster, "with all the required files fully cached" from the above GPFS cluster as home. Data and Metadata together on SSDs > > > 3. 16M block size GPFS filesystem, with separate metadata and data. Data on Near line and metadata on SSDs > > > > > > When i run the script first time for ?each" filesystem: > > > I see that GPFS reads from the files, and caches into the pagepool as it reads, from mmdiag -- iohist > > > > > > When i run the second time, i see that there are no IO requests from the compute node to GPFS NSD servers, which is expected since all the data from the 3 filesystems is cached. > > > > > > However - the time taken for the script to run for the files in the 3 different filesystems is different - although i know that they are just "mmapping"/reading from pagepool/cache and not from disk. > > > > > > Here is the difference in time, for IO just from pagepool: > > > > > > 20s 4M block size > > > 15s 1M block size > > > 40S 16M block size. > > > > > > Why do i see a difference when trying to mmap reads from different block-size filesystems, although i see that the IO requests are not hitting disks and just the pagepool? > > > > > > I am willing to share the strace output and mmdiag outputs if needed. 
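(A rough sketch of how that strace/mmdiag evidence can be captured, for anyone repeating the comparison; the PID is a placeholder and the syscall filter is only a guess at what matters for an mmap reader:
mmlsconfig pagepool                               # confirm the client cache size being tested against
strace -f -T -e trace=mmap,read,openat -p <PID>   # how the reader actually touches the files
mmdiag --iohist                                   # recent I/O on the client; shows whether any requests reach the NSDs
)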
> > > > > > Thanks, > > > Lohit > > > > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Apr 11 17:51:33 2018 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 11 Apr 2018 16:51:33 +0000 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Just another thought here. If the debug output files fit in an inode, then these would be handled as metadata updates to the inode, which is typically much smaller than the file system blocksize. Looking at my storage that handles GPFS metadata shows avg KiB/IO at a horrendous 5-12 KiB! HTH, -B -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Peter Serocka Sent: Wednesday, April 11, 2018 6:07 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Confusing I/O Behavior Note: External Email ------------------------------------------------- Let?s keep in mind that line buffering is a concept within the standard C library; if every log line triggers one write(2) system call, and it?s not direct io, then multiple write still get coalesced into few larger disk writes (as with the dd example). A logging application might choose to close(2) a log file after each write(2) ? that produces a different scenario, where the file system might guarantee that the data has been written to disk when close(2) return a success. (Local Linux file systems do not do this with default mounts, but networked filesystems usually do.) Aaron, can you trace your application to see what is going on in terms of system calls? ? Peter > On 2018 Apr 10 Tue, at 18:28, Marc A Kaplan wrote: > > Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. 
From makaplan at us.ibm.com Wed Apr 11 18:23:02 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 11 Apr 2018 13:23:02 -0400 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Good point about "tiny" files going into the inode and system pool. Which reminds one: Generally a bad idea to store metadata in wide striping disk base RAID (Type 5 with spinning media) Do use SSD or similar for metadata. Consider smaller block size for metadata / system pool than regular file data. From: Bryan Banister To: gpfsug main discussion list Date: 04/11/2018 12:51 PM Subject: Re: [gpfsug-discuss] Confusing I/O Behavior Sent by: gpfsug-discuss-bounces at spectrumscale.org Just another thought here. If the debug output files fit in an inode, then these would be handled as metadata updates to the inode, which is typically much smaller than the file system blocksize. Looking at my storage that handles GPFS metadata shows avg KiB/IO at a horrendous 5-12 KiB! HTH, -B -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Peter Serocka Sent: Wednesday, April 11, 2018 6:07 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Confusing I/O Behavior Note: External Email ------------------------------------------------- Let?s keep in mind that line buffering is a concept within the standard C library; if every log line triggers one write(2) system call, and it?s not direct io, then multiple write still get coalesced into few larger disk writes (as with the dd example). A logging application might choose to close(2) a log file after each write(2) ? that produces a different scenario, where the file system might guarantee that the data has been written to disk when close(2) return a success. (Local Linux file systems do not do this with default mounts, but networked filesystems usually do.) Aaron, can you trace your application to see what is going on in terms of system calls? ? Peter > On 2018 Apr 10 Tue, at 18:28, Marc A Kaplan wrote: > > Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=xSLLpVdHbkGieYfTGPIJRMkA1AbwsYteS2lHR4_49ik&s=9BOhyKNgkkbcOv316JZXnRB4HpPK_x2hyLd0d_uLGos&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=xSLLpVdHbkGieYfTGPIJRMkA1AbwsYteS2lHR4_49ik&s=9BOhyKNgkkbcOv316JZXnRB4HpPK_x2hyLd0d_uLGos&e= ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. 
Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=xSLLpVdHbkGieYfTGPIJRMkA1AbwsYteS2lHR4_49ik&s=9BOhyKNgkkbcOv316JZXnRB4HpPK_x2hyLd0d_uLGos&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Apr 13 21:05:53 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Fri, 13 Apr 2018 20:05:53 +0000 Subject: [gpfsug-discuss] Replicated and non replicated data Message-ID: <98F781F7-7063-4293-A5BC-1E8F5A0C98EC@bham.ac.uk> I have a question about file-systems with replicated an non replicated data. We have a file-system where metadata is set to copies=2 and data copies=2, we then use a placement policy to selectively replicate some data only once based on file-set. We also place the non-replicated data into a specific pool (6tnlsas) to ensure we know where it is placed. My understanding was that in doing this, if we took the disks with the non replicated data offline, we?d still have the FS available for users as the metadata is replicated. Sure accessing a non-replicated data file would give an IO error, but the rest of the FS should be up. We had a situation today where we wanted to take stg01 offline today, so tried using mmchdisk stop -d ?. Once we got to about disk stg01-01_12_12, GPFS would refuse to stop any more disks and complain about too many disks, similarly if we shutdown the NSD servers hosting the disks, the filesystem would have an SGPanic and force unmount. First, am I correct in thinking that a FS with non-replicated data, but replicated metadata should still be accessible (not the non-replicated data) when the LUNS hosting it are down? If so, any suggestions why my FS is panic-ing when we take down the one set of disks? I thought at first we had some non-replicated metadata, tried a mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but this didn?t help. Running 5.0.0.2 on the NSD server nodes. 
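As a reference point for the placement-policy approach described above, a minimal sketch of what such rules can look like; the pool names are taken from the listing below, while the fileset name 'scratch' and the file path are placeholders rather than the actual policy in use here:
cat > /tmp/placement.pol <<'EOF'
RULE 'nonrepl' SET POOL '6tnlsasnonrepl' REPLICATE (1) FOR FILESET ('scratch')
RULE 'default' SET POOL '6tnlsas'
EOF
mmchpolicy castles /tmp/placement.pol -I test   # validate the rules only
mmchpolicy castles /tmp/placement.pol           # install them as the placement policy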
(First time we went round this we didn?t have a FS descriptor disk, but you can see below that we added this) Thanks Simon [root at nsd01 ~]# mmlsdisk castles -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- CASTLES_GPFS_DESCONLY01 nsd 512 310 no no ready up 1 system desc stg01-01_3_3 nsd 4096 210 no yes ready down 4 6tnlsas stg01-01_4_4 nsd 4096 210 no yes ready down 5 6tnlsas stg01-01_5_5 nsd 4096 210 no yes ready down 6 6tnlsas stg01-01_6_6 nsd 4096 210 no yes ready down 7 6tnlsas stg01-01_7_7 nsd 4096 210 no yes ready down 8 6tnlsas stg01-01_8_8 nsd 4096 210 no yes ready down 9 6tnlsas stg01-01_9_9 nsd 4096 210 no yes ready down 10 6tnlsas stg01-01_10_10 nsd 4096 210 no yes ready down 11 6tnlsas stg01-01_11_11 nsd 4096 210 no yes ready down 12 6tnlsas stg01-01_12_12 nsd 4096 210 no yes ready down 13 6tnlsas stg01-01_13_13 nsd 4096 210 no yes ready down 14 6tnlsas stg01-01_14_14 nsd 4096 210 no yes ready down 15 6tnlsas stg01-01_15_15 nsd 4096 210 no yes ready down 16 6tnlsas stg01-01_16_16 nsd 4096 210 no yes ready down 17 6tnlsas stg01-01_17_17 nsd 4096 210 no yes ready down 18 6tnlsas stg01-01_18_18 nsd 4096 210 no yes ready down 19 6tnlsas stg01-01_19_19 nsd 4096 210 no yes ready down 20 6tnlsas stg01-01_20_20 nsd 4096 210 no yes ready down 21 6tnlsas stg01-01_21_21 nsd 4096 210 no yes ready down 22 6tnlsas stg01-01_ssd_54_54 nsd 4096 210 yes no ready down 23 system stg01-01_ssd_56_56 nsd 4096 210 yes no ready down 24 system stg02-01_0_0 nsd 4096 110 no yes ready up 25 6tnlsas stg02-01_1_1 nsd 4096 110 no yes ready up 26 6tnlsas stg02-01_2_2 nsd 4096 110 no yes ready up 27 6tnlsas stg02-01_3_3 nsd 4096 110 no yes ready up 28 6tnlsas stg02-01_4_4 nsd 4096 110 no yes ready up 29 6tnlsas stg02-01_5_5 nsd 4096 110 no yes ready up 30 6tnlsas stg02-01_6_6 nsd 4096 110 no yes ready up 31 6tnlsas stg02-01_7_7 nsd 4096 110 no yes ready up 32 6tnlsas stg02-01_8_8 nsd 4096 110 no yes ready up 33 6tnlsas stg02-01_9_9 nsd 4096 110 no yes ready up 34 6tnlsas stg02-01_10_10 nsd 4096 110 no yes ready up 35 6tnlsas stg02-01_11_11 nsd 4096 110 no yes ready up 36 6tnlsas stg02-01_12_12 nsd 4096 110 no yes ready up 37 6tnlsas stg02-01_13_13 nsd 4096 110 no yes ready up 38 6tnlsas stg02-01_14_14 nsd 4096 110 no yes ready up 39 6tnlsas stg02-01_15_15 nsd 4096 110 no yes ready up 40 6tnlsas stg02-01_16_16 nsd 4096 110 no yes ready up 41 6tnlsas stg02-01_17_17 nsd 4096 110 no yes ready up 42 6tnlsas stg02-01_18_18 nsd 4096 110 no yes ready up 43 6tnlsas stg02-01_19_19 nsd 4096 110 no yes ready up 44 6tnlsas stg02-01_20_20 nsd 4096 110 no yes ready up 45 6tnlsas stg02-01_21_21 nsd 4096 110 no yes ready up 46 6tnlsas stg02-01_ssd_22_22 nsd 4096 110 yes no ready up 47 system desc stg02-01_ssd_23_23 nsd 4096 110 yes no ready up 48 system stg02-01_ssd_24_24 nsd 4096 110 yes no ready up 49 system stg02-01_ssd_25_25 nsd 4096 110 yes no ready up 50 system stg01-01_22_22 nsd 4096 210 no yes ready up 51 6tnlsasnonrepl desc stg01-01_23_23 nsd 4096 210 no yes ready up 52 6tnlsasnonrepl stg01-01_24_24 nsd 4096 210 no yes ready up 53 6tnlsasnonrepl stg01-01_25_25 nsd 4096 210 no yes ready up 54 6tnlsasnonrepl stg01-01_26_26 nsd 4096 210 no yes ready up 55 6tnlsasnonrepl stg01-01_27_27 nsd 4096 210 no yes ready up 56 6tnlsasnonrepl stg01-01_31_31 nsd 4096 210 no yes ready up 58 6tnlsasnonrepl stg01-01_32_32 nsd 4096 210 no yes ready 
up 59 6tnlsasnonrepl stg01-01_33_33 nsd 4096 210 no yes ready up 60 6tnlsasnonrepl stg01-01_34_34 nsd 4096 210 no yes ready up 61 6tnlsasnonrepl stg01-01_35_35 nsd 4096 210 no yes ready up 62 6tnlsasnonrepl stg01-01_36_36 nsd 4096 210 no yes ready up 63 6tnlsasnonrepl stg01-01_37_37 nsd 4096 210 no yes ready up 64 6tnlsasnonrepl stg01-01_38_38 nsd 4096 210 no yes ready up 65 6tnlsasnonrepl stg01-01_39_39 nsd 4096 210 no yes ready up 66 6tnlsasnonrepl stg01-01_40_40 nsd 4096 210 no yes ready up 67 6tnlsasnonrepl stg01-01_41_41 nsd 4096 210 no yes ready up 68 6tnlsasnonrepl stg01-01_42_42 nsd 4096 210 no yes ready up 69 6tnlsasnonrepl stg01-01_43_43 nsd 4096 210 no yes ready up 70 6tnlsasnonrepl stg01-01_44_44 nsd 4096 210 no yes ready up 71 6tnlsasnonrepl stg01-01_45_45 nsd 4096 210 no yes ready up 72 6tnlsasnonrepl stg01-01_46_46 nsd 4096 210 no yes ready up 73 6tnlsasnonrepl stg01-01_47_47 nsd 4096 210 no yes ready up 74 6tnlsasnonrepl stg01-01_48_48 nsd 4096 210 no yes ready up 75 6tnlsasnonrepl stg01-01_49_49 nsd 4096 210 no yes ready up 76 6tnlsasnonrepl stg01-01_50_50 nsd 4096 210 no yes ready up 77 6tnlsasnonrepl stg01-01_51_51 nsd 4096 210 no yes ready up 78 6tnlsasnonrepl Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Fri Apr 13 21:17:11 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Fri, 13 Apr 2018 20:17:11 +0000 Subject: [gpfsug-discuss] [Replicated and non replicated data Message-ID: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> Add: unmountOnDiskFail=meta To your config. You can add it with ?-I? to have it take effect w/o reboot. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Simon Thompson (IT Research Support)" Reply-To: gpfsug main discussion list Date: Friday, April 13, 2018 at 3:06 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] [gpfsug-discuss] Replicated and non replicated data I have a question about file-systems with replicated an non replicated data. We have a file-system where metadata is set to copies=2 and data copies=2, we then use a placement policy to selectively replicate some data only once based on file-set. We also place the non-replicated data into a specific pool (6tnlsas) to ensure we know where it is placed. My understanding was that in doing this, if we took the disks with the non replicated data offline, we?d still have the FS available for users as the metadata is replicated. Sure accessing a non-replicated data file would give an IO error, but the rest of the FS should be up. We had a situation today where we wanted to take stg01 offline today, so tried using mmchdisk stop -d ?. Once we got to about disk stg01-01_12_12, GPFS would refuse to stop any more disks and complain about too many disks, similarly if we shutdown the NSD servers hosting the disks, the filesystem would have an SGPanic and force unmount. First, am I correct in thinking that a FS with non-replicated data, but replicated metadata should still be accessible (not the non-replicated data) when the LUNS hosting it are down? If so, any suggestions why my FS is panic-ing when we take down the one set of disks? I thought at first we had some non-replicated metadata, tried a mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but this didn?t help. Running 5.0.0.2 on the NSD server nodes. 
(First time we went round this we didn?t have a FS descriptor disk, but you can see below that we added this) Thanks Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From sxiao at us.ibm.com Sat Apr 14 02:42:28 2018 From: sxiao at us.ibm.com (Steve Xiao) Date: Fri, 13 Apr 2018 21:42:28 -0400 Subject: [gpfsug-discuss] Replicated and non replicated data In-Reply-To: References: Message-ID: What is your unmountOnDiskFail configuration setting on the cluster? You need to set unmountOnDiskFail to meta if you only have metadata replication. Steve Y. Xiao > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 13 Apr 2018 20:05:53 +0000 > From: "Simon Thompson (IT Research Support)" > To: "gpfsug-discuss at spectrumscale.org" > > Subject: [gpfsug-discuss] Replicated and non replicated data > Message-ID: <98F781F7-7063-4293-A5BC-1E8F5A0C98EC at bham.ac.uk> > Content-Type: text/plain; charset="utf-8" > > I have a question about file-systems with replicated an non replicated data. > > We have a file-system where metadata is set to copies=2 and data > copies=2, we then use a placement policy to selectively replicate > some data only once based on file-set. We also place the non- > replicated data into a specific pool (6tnlsas) to ensure we know > where it is placed. > > My understanding was that in doing this, if we took the disks with > the non replicated data offline, we?d still have the FS available > for users as the metadata is replicated. Sure accessing a non- > replicated data file would give an IO error, but the rest of the FS > should be up. > > We had a situation today where we wanted to take stg01 offline > today, so tried using mmchdisk stop -d ?. Once we got to about disk > stg01-01_12_12, GPFS would refuse to stop any more disks and > complain about too many disks, similarly if we shutdown the NSD > servers hosting the disks, the filesystem would have an SGPanic and > force unmount. > > First, am I correct in thinking that a FS with non-replicated data, > but replicated metadata should still be accessible (not the non- > replicated data) when the LUNS hosting it are down? > > If so, any suggestions why my FS is panic-ing when we take down the > one set of disks? > > I thought at first we had some non-replicated metadata, tried a > mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but > this didn?t help. > > Running 5.0.0.2 on the NSD server nodes. 
> > (First time we went round this we didn?t have a FS descriptor disk, > but you can see below that we added this) > > Thanks > > Simon > > [root at nsd01 ~]# mmlsdisk castles -L > disk driver sector failure holds holds storage > name type size group metadata data status > availability disk id pool remarks > ------------ -------- ------ ----------- -------- ----- > ------------- ------------ ------- ------------ --------- > CASTLES_GPFS_DESCONLY01 nsd 512 310 no no > ready up 1 system desc > stg01-01_3_3 nsd 4096 210 no yes ready > down 4 6tnlsas > stg01-01_4_4 nsd 4096 210 no yes ready > down 5 6tnlsas > stg01-01_5_5 nsd 4096 210 no yes ready > down 6 6tnlsas > stg01-01_6_6 nsd 4096 210 no yes ready > down 7 6tnlsas > stg01-01_7_7 nsd 4096 210 no yes ready > down 8 6tnlsas > stg01-01_8_8 nsd 4096 210 no yes ready > down 9 6tnlsas > stg01-01_9_9 nsd 4096 210 no yes ready > down 10 6tnlsas > stg01-01_10_10 nsd 4096 210 no yes ready > down 11 6tnlsas > stg01-01_11_11 nsd 4096 210 no yes ready > down 12 6tnlsas > stg01-01_12_12 nsd 4096 210 no yes ready > down 13 6tnlsas > stg01-01_13_13 nsd 4096 210 no yes ready > down 14 6tnlsas > stg01-01_14_14 nsd 4096 210 no yes ready > down 15 6tnlsas > stg01-01_15_15 nsd 4096 210 no yes ready > down 16 6tnlsas > stg01-01_16_16 nsd 4096 210 no yes ready > down 17 6tnlsas > stg01-01_17_17 nsd 4096 210 no yes ready > down 18 6tnlsas > stg01-01_18_18 nsd 4096 210 no yes ready > down 19 6tnlsas > stg01-01_19_19 nsd 4096 210 no yes ready > down 20 6tnlsas > stg01-01_20_20 nsd 4096 210 no yes ready > down 21 6tnlsas > stg01-01_21_21 nsd 4096 210 no yes ready > down 22 6tnlsas > stg01-01_ssd_54_54 nsd 4096 210 yes no ready > down 23 system > stg01-01_ssd_56_56 nsd 4096 210 yes no ready > down 24 system > stg02-01_0_0 nsd 4096 110 no yes ready > up 25 6tnlsas > stg02-01_1_1 nsd 4096 110 no yes ready > up 26 6tnlsas > stg02-01_2_2 nsd 4096 110 no yes ready > up 27 6tnlsas > stg02-01_3_3 nsd 4096 110 no yes ready > up 28 6tnlsas > stg02-01_4_4 nsd 4096 110 no yes ready > up 29 6tnlsas > stg02-01_5_5 nsd 4096 110 no yes ready > up 30 6tnlsas > stg02-01_6_6 nsd 4096 110 no yes ready > up 31 6tnlsas > stg02-01_7_7 nsd 4096 110 no yes ready > up 32 6tnlsas > stg02-01_8_8 nsd 4096 110 no yes ready > up 33 6tnlsas > stg02-01_9_9 nsd 4096 110 no yes ready > up 34 6tnlsas > stg02-01_10_10 nsd 4096 110 no yes ready > up 35 6tnlsas > stg02-01_11_11 nsd 4096 110 no yes ready > up 36 6tnlsas > stg02-01_12_12 nsd 4096 110 no yes ready > up 37 6tnlsas > stg02-01_13_13 nsd 4096 110 no yes ready > up 38 6tnlsas > stg02-01_14_14 nsd 4096 110 no yes ready > up 39 6tnlsas > stg02-01_15_15 nsd 4096 110 no yes ready > up 40 6tnlsas > stg02-01_16_16 nsd 4096 110 no yes ready > up 41 6tnlsas > stg02-01_17_17 nsd 4096 110 no yes ready > up 42 6tnlsas > stg02-01_18_18 nsd 4096 110 no yes ready > up 43 6tnlsas > stg02-01_19_19 nsd 4096 110 no yes ready > up 44 6tnlsas > stg02-01_20_20 nsd 4096 110 no yes ready > up 45 6tnlsas > stg02-01_21_21 nsd 4096 110 no yes ready > up 46 6tnlsas > stg02-01_ssd_22_22 nsd 4096 110 yes no ready > up 47 system desc > stg02-01_ssd_23_23 nsd 4096 110 yes no ready > up 48 system > stg02-01_ssd_24_24 nsd 4096 110 yes no ready > up 49 system > stg02-01_ssd_25_25 nsd 4096 110 yes no ready > up 50 system > stg01-01_22_22 nsd 4096 210 no yes ready > up 51 6tnlsasnonrepl desc > stg01-01_23_23 nsd 4096 210 no yes ready > up 52 6tnlsasnonrepl > stg01-01_24_24 nsd 4096 210 no yes ready > up 53 6tnlsasnonrepl > stg01-01_25_25 nsd 4096 210 no yes ready > up 54 
6tnlsasnonrepl > stg01-01_26_26 nsd 4096 210 no yes ready > up 55 6tnlsasnonrepl > stg01-01_27_27 nsd 4096 210 no yes ready > up 56 6tnlsasnonrepl > stg01-01_31_31 nsd 4096 210 no yes ready > up 58 6tnlsasnonrepl > stg01-01_32_32 nsd 4096 210 no yes ready > up 59 6tnlsasnonrepl > stg01-01_33_33 nsd 4096 210 no yes ready > up 60 6tnlsasnonrepl > stg01-01_34_34 nsd 4096 210 no yes ready > up 61 6tnlsasnonrepl > stg01-01_35_35 nsd 4096 210 no yes ready > up 62 6tnlsasnonrepl > stg01-01_36_36 nsd 4096 210 no yes ready > up 63 6tnlsasnonrepl > stg01-01_37_37 nsd 4096 210 no yes ready > up 64 6tnlsasnonrepl > stg01-01_38_38 nsd 4096 210 no yes ready > up 65 6tnlsasnonrepl > stg01-01_39_39 nsd 4096 210 no yes ready > up 66 6tnlsasnonrepl > stg01-01_40_40 nsd 4096 210 no yes ready > up 67 6tnlsasnonrepl > stg01-01_41_41 nsd 4096 210 no yes ready > up 68 6tnlsasnonrepl > stg01-01_42_42 nsd 4096 210 no yes ready > up 69 6tnlsasnonrepl > stg01-01_43_43 nsd 4096 210 no yes ready > up 70 6tnlsasnonrepl > stg01-01_44_44 nsd 4096 210 no yes ready > up 71 6tnlsasnonrepl > stg01-01_45_45 nsd 4096 210 no yes ready > up 72 6tnlsasnonrepl > stg01-01_46_46 nsd 4096 210 no yes ready > up 73 6tnlsasnonrepl > stg01-01_47_47 nsd 4096 210 no yes ready > up 74 6tnlsasnonrepl > stg01-01_48_48 nsd 4096 210 no yes ready > up 75 6tnlsasnonrepl > stg01-01_49_49 nsd 4096 210 no yes ready > up 76 6tnlsasnonrepl > stg01-01_50_50 nsd 4096 210 no yes ready > up 77 6tnlsasnonrepl > stg01-01_51_51 nsd 4096 210 no yes ready > up 78 6tnlsasnonrepl > Number of quorum disks: 3 > Read quorum value: 2 > Write quorum value: 2 > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20180413_c22c8133_attachment.html&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=ck4PYlaRFvCcNKlfHMPhoA&m=BX4uqSaNFY5Jl4ZNPLYjML8nanjAa57Nuz_7J2jSqMs&s=2P7GHehsFTuGZ39pBTBsUzcdwo9jkidie2etD8_llas&e= > > > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url? > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=ck4PYlaRFvCcNKlfHMPhoA&m=BX4uqSaNFY5Jl4ZNPLYjML8nanjAa57Nuz_7J2jSqMs&s=Q5EVJvSbunfieiHUrDHMpC3WAhP1fX2sQFwLLgLFb8Y&e= > > > End of gpfsug-discuss Digest, Vol 75, Issue 23 > ********************************************** > -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Apr 16 09:42:04 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 16 Apr 2018 08:42:04 +0000 Subject: [gpfsug-discuss] [Replicated and non replicated data In-Reply-To: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> References: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> Message-ID: Yeah that did it, it was set to the default value of ?no?. What exactly does ?no? mean as opposed to ?yes?? The docs https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adm_tuningguide.htm Aren?t very forthcoming on this ? (note it looks like we also have to set this in multi-cluster environments in client clusters as well) Simon From: "Robert.Oesterlin at nuance.com" Date: Friday, 13 April 2018 at 21:17 To: "gpfsug-discuss at spectrumscale.org" Cc: "Simon Thompson (IT Research Support)" Subject: Re: [Replicated and non replicated data Add: unmountOnDiskFail=meta To your config. 
You can add it with ?-I? to have it take effect w/o reboot. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Simon Thompson (IT Research Support)" Reply-To: gpfsug main discussion list Date: Friday, April 13, 2018 at 3:06 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] [gpfsug-discuss] Replicated and non replicated data I have a question about file-systems with replicated an non replicated data. We have a file-system where metadata is set to copies=2 and data copies=2, we then use a placement policy to selectively replicate some data only once based on file-set. We also place the non-replicated data into a specific pool (6tnlsas) to ensure we know where it is placed. My understanding was that in doing this, if we took the disks with the non replicated data offline, we?d still have the FS available for users as the metadata is replicated. Sure accessing a non-replicated data file would give an IO error, but the rest of the FS should be up. We had a situation today where we wanted to take stg01 offline today, so tried using mmchdisk stop -d ?. Once we got to about disk stg01-01_12_12, GPFS would refuse to stop any more disks and complain about too many disks, similarly if we shutdown the NSD servers hosting the disks, the filesystem would have an SGPanic and force unmount. First, am I correct in thinking that a FS with non-replicated data, but replicated metadata should still be accessible (not the non-replicated data) when the LUNS hosting it are down? If so, any suggestions why my FS is panic-ing when we take down the one set of disks? I thought at first we had some non-replicated metadata, tried a mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but this didn?t help. Running 5.0.0.2 on the NSD server nodes. (First time we went round this we didn?t have a FS descriptor disk, but you can see below that we added this) Thanks Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Apr 16 10:01:41 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 16 Apr 2018 09:01:41 +0000 Subject: [gpfsug-discuss] GUI not displaying node info correctly In-Reply-To: References: Message-ID: Just upgraded the GUI to 4.2.3.8, the bug is now fixed, thanks! Richard From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Stefan Roth Sent: 09 April 2018 10:39 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GUI not displaying node info correctly Hello Richard, this is a known GUI bug that will be fixed in 4.2.3-8. Once this is available, just upgrade the GUI rpm. The 4.2.3-8 PTF is not yet available, but it should be in next days. This problem happens to all customers with more than 127 filesets, means you see a max inodes value for the first 127 filesets, but not for newer filesets. 
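(For anyone hitting the same display problem, the fix really is only the GUI package; a rough sketch, where the exact rpm file name depends on your edition and download and is only a placeholder:
rpm -Uvh gpfs.gui-4.2.3-8.noarch.rpm   # update just the management GUI package
systemctl restart gpfsgui              # restart the GUI service so it picks up the fix
)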
Mit freundlichen Gr??en / Kind regards Stefan Roth Spectrum Scale GUI Development [cid:image002.gif at 01D3D569.E9989650] Phone: +49-7034-643-1362 IBM Deutschland [cid:image003.gif at 01D3D569.E9989650] E-Mail: stefan.roth at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany [cid:image002.gif at 01D3D569.E9989650] IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 [Inactive hide details for "Sobey, Richard A" ---09.04.2018 11:01:01---Hi all, We have a fairly significant number of filesets f]"Sobey, Richard A" ---09.04.2018 11:01:01---Hi all, We have a fairly significant number of filesets for which the GUI reports nothing at all in From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 09.04.2018 11:01 Subject: [gpfsug-discuss] GUI not displaying node info correctly Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi all, We have a fairly significant number of filesets for which the GUI reports nothing at all in the ?Max Inodes? column. I?ve verified with mmlsfileset -I that the inode limit is set. Has anyone seen this already and had a PMR for it? SS 4.2.3-7. Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=FZXEKEUIKKfyO2oYq6outktXzRFzl0eKP2opGp7UNks&s=f3eT53kYib3aoHB5addQ_EyZRmCZM2gtiGsZj6aq2ZM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 166 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 156 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 1851 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.gif Type: image/gif Size: 63 bytes Desc: image004.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image005.gif Type: image/gif Size: 105 bytes Desc: image005.gif URL: From Robert.Oesterlin at nuance.com Mon Apr 16 12:34:36 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 16 Apr 2018 11:34:36 +0000 Subject: [gpfsug-discuss] [Replicated and non replicated data In-Reply-To: References: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> Message-ID: A DW post from Yuri a few years back talks about it: https://www.ibm.com/developerworks/community/forums/html/topic?id=4cebdb97-3052-4cf2-abb1-462660a1489c Bob Oesterlin Sr Principal Storage Engineer, Nuance 507-269-0413 From: "Simon Thompson (IT Research Support)" Date: Monday, April 16, 2018 at 3:43 AM To: "Oesterlin, Robert" , gpfsug main discussion list Subject: [EXTERNAL] Re: [Replicated and non replicated data Yeah that did it, it was set to the default value of ?no?. What exactly does ?no? mean as opposed to ?yes?? The docs https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adm_tuningguide.htm Aren?t very forthcoming on this ? 
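For anyone finding this thread later, a minimal sketch of the change Bob and Steve describe, run on the cluster that mounts the file system and, as Simon notes above, repeated on remote client clusters; the summary of what 'meta' does is taken from this thread rather than from the command documentation:
mmchconfig unmountOnDiskFail=meta -i   # -i applies immediately and persists across restarts
mmlsconfig unmountOnDiskFail           # confirm the value; 'meta' keeps the FS mounted while only data (not metadata) disks are down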
-------------- next part -------------- An HTML attachment was scrubbed... URL: From secretary at gpfsug.org Mon Apr 16 13:27:30 2018 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Mon, 16 Apr 2018 13:27:30 +0100 Subject: [gpfsug-discuss] Transforming Workflows at Scale In-Reply-To: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> References: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> Message-ID: <0d93fcd2f80d91ba958825c2bdd3d09d@webmail.gpfsug.org> Dear All, This event has been postponed and will now take place on 13TH JUNE. Details are on the link below: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP [1] Many thanks, --- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org On , Secretary GPFS UG wrote: > Dear all, > > There's a Spectrum Scale for media breakfast briefing event being organised by IBM at IBM South Bank, London on 17th April (the day before the next UK meeting). > > The event has been designed for broadcasters, post production houses and visual effects organisations, where managing workflows between different islands of technology is a major challenge. > > If you're interested, you can read more and register at the IBM Registration Page [1]. > > Thanks, > -- > > Claire O'Toole > Spectrum Scale/GPFS User Group Secretary > +44 (0)7508 033896 > www.spectrumscaleug.org Links: ------ [1] https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.khiredine at meteo.dz Tue Apr 17 09:31:35 2018 From: a.khiredine at meteo.dz (atmane khiredine) Date: Tue, 17 Apr 2018 08:31:35 +0000 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location Message-ID: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> dear all, I want to understand how GNR/GSS/ESS stores information about the pdisk location I looked in the configuration file /var/mmfs/gen/mmfsNodeData /var/mmfs/gen/mmsdrfs /var/mmfs/gen/mmfs.cfg but no location of pdisk this is real scenario of unknown location this is the output from GNR/GSS/ESS ----------------------------------- [root at ess1 ~]# mmlspdisk BB1RGR --not-ok pdisk: replacementPriority = 2.00 name = "e2d3s11" device = "" recoveryGroup = "BB1RGR" declusteredArray = "DA2" state = "missing/noPath/systemDrain/noRGD/noVCD/noData" capacity = 3000034656256 freeSpace = 2997887172608 location = "" WWN = "naa.5000C50056717727" server = "ess1-ib0" reads = 106800946 writes = 10414075 IOErrors = 1216 IOTimeouts = 18 mediaErrors = 0 checksumErrors = 0 pathErrors = 0 relativePerformance = 1.000 userLocation = "" userCondition = "replaceable" hardware = " " hardwareType = Rotating 7200 nPaths = 0 active 0 total nsdFormatVersion = Unknown paxosAreaOffset = Unknown paxosAreaSize = Unknown logicalBlockSize = 512 ----------------------------------- I begin change the Hard disk mmchcarrier BB1RGR --release --pdisk "e2d3s11" I have this error Location of pdisk e2d3s11 of recovery group BB1RGR is not known. 
i know the location of the Hard disk i know the location of the Hard disk from old mmlspdisk file mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" Location of pdisk e2d3s11 of recovery group BB1RGR is not known. I read in the official documentation 6027-3001 [E] Location of pdisk pdiskName of recovery group recoveryGroupName is not known. Explanation: IBM Spectrum Scale is unable to find the location of the given pdisk. User response: Check the disk enclosure hardware. Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz From valdis.kletnieks at vt.edu Tue Apr 17 16:27:51 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Tue, 17 Apr 2018 11:27:51 -0400 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location In-Reply-To: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> Message-ID: <16306.1523978871@turing-police.cc.vt.edu> On Tue, 17 Apr 2018 08:31:35 -0000, atmane khiredine said: > but no location of pdisk > state = "missing/noPath/systemDrain/noRGD/noVCD/noData" That can't be good. That's just screaming "dead, uncabled, or removed". > WWN = "naa.5000C50056717727" Useful hint where to start if all else fails (see below) > i know the location of the Hard disk from old mmlspdisk file > mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" So you know where it was previously, and you can't find it now because it's either dead, missing, or there's no fiberchannel path to it. > User response: Check the disk enclosure hardware. Exactly as it says: Check the cabling, check the enclosure for a failed disk, and check if there's now an empty spot where a co-worker "helpfully" removed a bad disk. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From matthew.robinson02 at gmail.com Tue Apr 17 19:03:57 2018 From: matthew.robinson02 at gmail.com (Matthew Robinson) Date: Tue, 17 Apr 2018 14:03:57 -0400 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location In-Reply-To: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> Message-ID: Hi Valdis, Normally the name will indicate the physical location of the disk. So the name of the disk you have listed is "e2d3s11" this is Enclosure 2 Disk shelf 3 Disk slot 11. However, based on the location = "" this is the reason for the failure of the tscommand failure. Normally a recovery group rebuild fixes the issue from what I have seen in the past. 
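(A short sketch of commands often used to cross-check what GNR and the OS still know about such a drive; the names are the ones quoted above, and whether mmlsenclosure supports a given enclosure model is an assumption to verify:
mmlspdisk BB1RGR --pdisk e2d3s11                   # full pdisk record, including nPaths and location
mmlsenclosure all -L                               # enclosure/slot view as GNR sees it
ls -l /dev/disk/by-id | grep -i 5000c50056717727   # does the OS still have a device path for that WWN?
)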
On Tue, Apr 17, 2018 at 4:31 AM, atmane khiredine wrote: > dear all, > > I want to understand how GNR/GSS/ESS stores information about the pdisk > location > I looked in the configuration file > > /var/mmfs/gen/mmfsNodeData > /var/mmfs/gen/mmsdrfs > /var/mmfs/gen/mmfs.cfg > > but no location of pdisk > > this is real scenario of unknown location > this is the output from GNR/GSS/ESS > ----------------------------------- > [root at ess1 ~]# mmlspdisk BB1RGR --not-ok > pdisk: > replacementPriority = 2.00 > name = "e2d3s11" > device = "" > recoveryGroup = "BB1RGR" > declusteredArray = "DA2" > state = "missing/noPath/systemDrain/noRGD/noVCD/noData" > capacity = 3000034656256 > freeSpace = 2997887172608 > location = "" > WWN = "naa.5000C50056717727" > server = "ess1-ib0" > reads = 106800946 > writes = 10414075 > IOErrors = 1216 > IOTimeouts = 18 > mediaErrors = 0 > checksumErrors = 0 > pathErrors = 0 > relativePerformance = 1.000 > userLocation = "" > userCondition = "replaceable" > hardware = " " > hardwareType = Rotating 7200 > nPaths = 0 active 0 total > nsdFormatVersion = Unknown > paxosAreaOffset = Unknown > paxosAreaSize = Unknown > logicalBlockSize = 512 > ----------------------------------- > I begin change the Hard disk > mmchcarrier BB1RGR --release --pdisk "e2d3s11" > I have this error > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. > i know the location of the Hard disk > i know the location of the Hard disk from old mmlspdisk file > mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" > > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. > I read in the official documentation > 6027-3001 [E] Location of pdisk pdiskName of recovery > group recoveryGroupName is not known. > Explanation: IBM Spectrum Scale is unable to find the > location of the given pdisk. > User response: Check the disk enclosure hardware. > > > Atmane Khiredine > HPC System Administrator | Office National de la M?t?orologie > T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : > a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Tue Apr 17 19:24:04 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 17 Apr 2018 18:24:04 +0000 Subject: [gpfsug-discuss] Only 20 spots left! - SSUG-US Spring meeting - May 16-17th, Cambridge, Ma Message-ID: The registration for the Spring meeting of the SSUG-USA is now open. This is a Free two-day and will include a large number of Spectrum Scale updates and breakout tracks. We have limited meeting space so please register early if you plan on attending. Registration and agenda details: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2018-meeting-tickets-43662759489 DATE AND TIME Wed, May 16, 2018, 9:00 AM ? Thu, May 17, 2018, 5:00 PM EDT LOCATION IBM Cambridge Innovation Center One Rogers Street Cambridge, MA 02142-1203 Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From valdis.kletnieks at vt.edu Tue Apr 17 19:26:18 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Tue, 17 Apr 2018 14:26:18 -0400 Subject: [gpfsug-discuss] RedHat kernel support for GPFS 4.2.3.8? Message-ID: <41184.1523989578@turing-police.cc.vt.edu> So of course, the day after after I upgrade our GPFS/LTFS cluster to the latest releases of everything, RedHat drops about 300 new updates, include a kernel update, and I find out that GPFS 4.2.3.8 has also escaped. :) Any word if 4.2.3.7 or 4.2.3.8 play nice with the 3.10.0-862.el7 kernel? (Official support matrix still says 3.10.0-693 is "latest tested") -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From a.khiredine at meteo.dz Tue Apr 17 21:48:54 2018 From: a.khiredine at meteo.dz (atmane khiredine) Date: Tue, 17 Apr 2018 20:48:54 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 27 In-Reply-To: References: Message-ID: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0C1A@SDEB-EXC02.meteo.dz> thank you for the answer I use lsscsi the disk is in place "Linux sees the disk" i check the enclosure the disk is in use "I sees the disk :)" and I check the disk the indicator of disk flashes green and I connect to the NetApp DE6600 disk enclosure over telnet and the disk is in place "NetApp sees the disk" if i use this CMD mmchpdisk BB1RGR --pdisk e2d3s11 --identify on Location of pdisk e2d3s11 is not known the only cmd that works is mmchpdisk --suspend OR --diagnose e2d3s11 0, 0 DA2 2560 GiB normal missing/noPath/systemDrain/noRGD/noVCD/noData is change from missing to diagnosing e2d3s11 0, 0 DA2 2560 GiB normal diagnosing/noPath/noVCD and after one or 2 min is change from diagnosing to missing e2d3s11 0, 0 DA2 2582 GiB replaceable missing/noPath/systemDrain/noRGD/noVCD the disk is in place the GNR/GSS/ESS can not see the disk if I can find the file or GNR/GSS/ESS stores the disk location I can add the path that is missing Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz ________________________________________ De : gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] de la part de gpfsug-discuss-request at spectrumscale.org [gpfsug-discuss-request at spectrumscale.org] Envoy? : mardi 17 avril 2018 19:24 ? : gpfsug-discuss at spectrumscale.org Objet : gpfsug-discuss Digest, Vol 75, Issue 27 Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: GNR/GSS/ESS pdisk location (valdis.kletnieks at vt.edu) 2. Re: GNR/GSS/ESS pdisk location (Matthew Robinson) 3. Only 20 spots left! 
- SSUG-US Spring meeting - May 16-17th, Cambridge, Ma (Oesterlin, Robert) ---------------------------------------------------------------------- Message: 1 Date: Tue, 17 Apr 2018 11:27:51 -0400 From: valdis.kletnieks at vt.edu To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GNR/GSS/ESS pdisk location Message-ID: <16306.1523978871 at turing-police.cc.vt.edu> Content-Type: text/plain; charset="iso-8859-1" On Tue, 17 Apr 2018 08:31:35 -0000, atmane khiredine said: > but no location of pdisk > state = "missing/noPath/systemDrain/noRGD/noVCD/noData" That can't be good. That's just screaming "dead, uncabled, or removed". > WWN = "naa.5000C50056717727" Useful hint where to start if all else fails (see below) > i know the location of the Hard disk from old mmlspdisk file > mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" So you know where it was previously, and you can't find it now because it's either dead, missing, or there's no fiberchannel path to it. > User response: Check the disk enclosure hardware. Exactly as it says: Check the cabling, check the enclosure for a failed disk, and check if there's now an empty spot where a co-worker "helpfully" removed a bad disk. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: ------------------------------ Message: 2 Date: Tue, 17 Apr 2018 14:03:57 -0400 From: Matthew Robinson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GNR/GSS/ESS pdisk location Message-ID: Content-Type: text/plain; charset="utf-8" Hi Valdis, Normally the name will indicate the physical location of the disk. So the name of the disk you have listed is "e2d3s11" this is Enclosure 2 Disk shelf 3 Disk slot 11. However, based on the location = "" this is the reason for the failure of the tscommand failure. Normally a recovery group rebuild fixes the issue from what I have seen in the past. On Tue, Apr 17, 2018 at 4:31 AM, atmane khiredine wrote: > dear all, > > I want to understand how GNR/GSS/ESS stores information about the pdisk > location > I looked in the configuration file > > /var/mmfs/gen/mmfsNodeData > /var/mmfs/gen/mmsdrfs > /var/mmfs/gen/mmfs.cfg > > but no location of pdisk > > this is real scenario of unknown location > this is the output from GNR/GSS/ESS > ----------------------------------- > [root at ess1 ~]# mmlspdisk BB1RGR --not-ok > pdisk: > replacementPriority = 2.00 > name = "e2d3s11" > device = "" > recoveryGroup = "BB1RGR" > declusteredArray = "DA2" > state = "missing/noPath/systemDrain/noRGD/noVCD/noData" > capacity = 3000034656256 > freeSpace = 2997887172608 > location = "" > WWN = "naa.5000C50056717727" > server = "ess1-ib0" > reads = 106800946 > writes = 10414075 > IOErrors = 1216 > IOTimeouts = 18 > mediaErrors = 0 > checksumErrors = 0 > pathErrors = 0 > relativePerformance = 1.000 > userLocation = "" > userCondition = "replaceable" > hardware = " " > hardwareType = Rotating 7200 > nPaths = 0 active 0 total > nsdFormatVersion = Unknown > paxosAreaOffset = Unknown > paxosAreaSize = Unknown > logicalBlockSize = 512 > ----------------------------------- > I begin change the Hard disk > mmchcarrier BB1RGR --release --pdisk "e2d3s11" > I have this error > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. 
> i know the location of the Hard disk > i know the location of the Hard disk from old mmlspdisk file > mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" > > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. > I read in the official documentation > 6027-3001 [E] Location of pdisk pdiskName of recovery > group recoveryGroupName is not known. > Explanation: IBM Spectrum Scale is unable to find the > location of the given pdisk. > User response: Check the disk enclosure hardware. > > > Atmane Khiredine > HPC System Administrator | Office National de la M?t?orologie > T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : > a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 3 Date: Tue, 17 Apr 2018 18:24:04 +0000 From: "Oesterlin, Robert" To: gpfsug main discussion list Subject: [gpfsug-discuss] Only 20 spots left! - SSUG-US Spring meeting - May 16-17th, Cambridge, Ma Message-ID: Content-Type: text/plain; charset="utf-8" The registration for the Spring meeting of the SSUG-USA is now open. This is a Free two-day and will include a large number of Spectrum Scale updates and breakout tracks. We have limited meeting space so please register early if you plan on attending. Registration and agenda details: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2018-meeting-tickets-43662759489 DATE AND TIME Wed, May 16, 2018, 9:00 AM ? Thu, May 17, 2018, 5:00 PM EDT LOCATION IBM Cambridge Innovation Center One Rogers Street Cambridge, MA 02142-1203 Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 75, Issue 27 ********************************************** From scale at us.ibm.com Tue Apr 17 22:17:29 2018 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 17 Apr 2018 14:17:29 -0700 Subject: [gpfsug-discuss] RedHat kernel support for GPFS 4.2.3.8? In-Reply-To: <41184.1523989578@turing-police.cc.vt.edu> References: <41184.1523989578@turing-police.cc.vt.edu> Message-ID: Here is the link to our GPFS FAQ which list details on supported versions. https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html#linux Search for "Table 30. IBM Spectrum Scale for Linux RedHat kernel support" and it lists the details that you are looking for. Thanks, Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. 
If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: valdis.kletnieks at vt.edu To: gpfsug-discuss at spectrumscale.org Date: 04/17/2018 12:11 PM Subject: [gpfsug-discuss] RedHat kernel support for GPFS 4.2.3.8? Sent by: gpfsug-discuss-bounces at spectrumscale.org So of course, the day after after I upgrade our GPFS/LTFS cluster to the latest releases of everything, RedHat drops about 300 new updates, include a kernel update, and I find out that GPFS 4.2.3.8 has also escaped. :) Any word if 4.2.3.7 or 4.2.3.8 play nice with the 3.10.0-862.el7 kernel? (Official support matrix still says 3.10.0-693 is "latest tested") (See attached file: att7gkev.dat) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: att7gkev.dat Type: application/octet-stream Size: 497 bytes Desc: not available URL: From chair at spectrumscale.org Wed Apr 18 07:51:58 2018 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Wed, 18 Apr 2018 07:51:58 +0100 Subject: [gpfsug-discuss] UK Group Live Streams Message-ID: <1228FBE5-7050-443F-9514-446C28683711@spectrumscale.org> Hi All, We?re hoping to have live streaming of today and some of tomorrow?s sessions from London, I?ll post links to the streams on the Spectrum Scale User Group web-site as we go through the day. Note this is the first year we?ll have tried this, so we?ll have to see how it goes! Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Wed Apr 18 13:34:22 2018 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Wed, 18 Apr 2018 12:34:22 +0000 Subject: [gpfsug-discuss] RFE Process ... Burning Issues In-Reply-To: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> References: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> Message-ID: <709659D9-9244-4A81-9282-0FB7FB459D1A@nasa.gov> While I don?t own a DeLorean I work with someone who once fixed one up, which I *think* effectively means I can jump back in time to before the deadline to submit. (And let?s be honest, with the way HPC is going it feels like we have the requisite 1.21GW of power...) However, since I can?t actually time travel back to last week, is there any possibility of an extension? On April 5, 2018 at 05:30:42 EDT, Simon Thompson (Spectrum Scale User Group Chair) wrote: Just a reminder that if you want to submit for the pilot RFE process, submissions must be in by end of next week. Judging by the responses so far, apparently the product is perfect ? Simon From: on behalf of "chair at spectrumscale.org" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Monday, 26 March 2018 at 12:52 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] RFE Process ... 
Burning Issues Hi All, We?ve been talking with product management about the RFE process and have agreed that we?ll try out a community-voting process. First up, we are piloting this idea, hopefully it will work out, but it may also need tweaks as we move forward. One of the things we?ve been asking for is for a better way for the Spectrum Scale user group community to vote on RFEs. Sure we get people posting to the list, but we?re looking at if we can make it a better/more formal process to support this. Talking with IBM, we also recognise that with a large number of RFEs, it can be difficult for them to track work tasks being completed, but with the community RFEs, there is a commitment to try and track them closely and report back on progress later in the year. To submit an RFE using this process, you must complete the form available at: https://ibm.box.com/v/EnhBlitz (Enhancement Blitz template v1.pptx) The form provides some guidance on a good and bad RFE. Sure a lot of us are techie/engineers, so please try to explain what problem you are solving rather than trying to provide a solution. (i.e. leave the technical implementation details to those with the source code). Each site is limited to 2 submissions and they will be looked over by the Spectrum Scale community leaders, we may ask people to merge requests, send back for more info etc, or there may be some that we know will just never be progressed for various reasons. At the April user group in the UK, we have an RFE (Burning issues) session planned. Submitters of the RFE will be expected to provide a 1-3 minute pitch for their RFE. We?ve placed the session at the end of the day (UK time) to try and ensure USA people can participate. Remote presentation of your RFE is fine and we plan to live-stream the session. Each person will have 3 votes to choose what they think are their highest priority requests. Again remote voting is perfectly fine but only 3 votes per person. The requests with the highest number of votes will then be given a higher chance of being implemented. There?s a possibility that some may even make the winter release cycle. Either way, we plan to track the ?chosen? RFEs more closely and provide an update at the November USA meeting (likely the SC18 one). The submission and voting process is also planned to be run again in time for the November meeting. Anyone wanting to submit an RFE for consideration should submit the form by email to rfe at spectrumscaleug.org *before* 13th April. We?ll be posting the submitted RFEs up at the box site as well, you are encouraged to visit the site regularly and check the submissions as you may want to contact the author of an RFE to provide more information/support the RFE. Anything received after this date will be held over to the November cycle. The earlier you submit, the better chance it has of being included (we plan to limit the number to be considered) and will give us time to review the RFE and come back for more information/clarification if needed. You must also be prepared to provide a 1-3 minute pitch for your RFE (in person or remote) for the UK user group meeting. You are welcome to submit any RFE you have already put into the RFE portal for this process to garner community votes for it. There is space on the form to provide the existing RFE number. If you have any comments on the process, you can also email them to rfe at spectrumscaleug.org as well. Thanks to Carl Zeite for supporting this plan? Get submitting! 
Simon (UK Group Chair) -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Apr 18 16:03:17 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 18 Apr 2018 11:03:17 -0400 Subject: [gpfsug-discuss] RFE Process ... Burning Issues In-Reply-To: <709659D9-9244-4A81-9282-0FB7FB459D1A@nasa.gov> References: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> <709659D9-9244-4A81-9282-0FB7FB459D1A@nasa.gov> Message-ID: No, I think you'll have to find a working DeLorean, get in it and while traveling at 88 mph (141.622 kph) submit your email over an amateur packet radio network .... -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Apr 18 16:54:45 2018 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 18 Apr 2018 11:54:45 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... Message-ID: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> A few months ago I asked about limits and dynamics of traversing depended .vs independent filesets on this forum. I used the information provided to make decisions and setup our new DSS based gpfs storage system. Now I have a problem I couldn't' yet figure out how to make it work: 'project' and 'scratch' are top *independent* filesets of the same file system. 'proj1', 'proj2' are dependent filesets nested under 'project' 'scra1', 'scra2' are dependent filesets nested under 'scratch' I would like to run a purging policy on all contents under 'scratch' (which includes 'scra1', 'scra2'), and TSM backup policies on all contents under 'project' (which includes 'proj1', 'proj2'). HOWEVER: When I run the purging policy on the whole gpfs device (with both 'project' and 'scratch' filesets) * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and 'scra2' filesets under scratch are excluded (totally unexpected) * if I use FOR FILESET('scra1') I get error that scra1 is dependent fileset (Ok, that is expected) * if I use /*FOR FILESET('scratch')*/, all contents under 'project', 'proj1', 'proj2' are traversed as well, and I don't want that (it takes too much time) * if I use /*FOR FILESET('scratch')*/, and instead of the whole device I apply the policy to the /scratch mount point only, the policy still traverses all the content of 'project', 'proj1', 'proj2', which I don't want. (again, totally unexpected) QUESTION: How can I craft the syntax of the mmapplypolicy in combination with the RULE filters, so that I can traverse all the contents under the 'scratch' independent fileset, including the nested dependent filesets 'scra1','scra2', and NOT traverse the other independent filesets at all (since this takes too much time)? Thanks Jaime PS: FOR FILESET('scra*') does not work. ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From stockf at us.ibm.com Wed Apr 18 18:38:36 2018 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 18 Apr 2018 13:38:36 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... 
In-Reply-To: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> Message-ID: Would the PATH_NAME LIKE option work? Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: "Jaime Pinto" To: "gpfsug main discussion list" Date: 04/18/2018 12:55 PM Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... Sent by: gpfsug-discuss-bounces at spectrumscale.org A few months ago I asked about limits and dynamics of traversing depended .vs independent filesets on this forum. I used the information provided to make decisions and setup our new DSS based gpfs storage system. Now I have a problem I couldn't' yet figure out how to make it work: 'project' and 'scratch' are top *independent* filesets of the same file system. 'proj1', 'proj2' are dependent filesets nested under 'project' 'scra1', 'scra2' are dependent filesets nested under 'scratch' I would like to run a purging policy on all contents under 'scratch' (which includes 'scra1', 'scra2'), and TSM backup policies on all contents under 'project' (which includes 'proj1', 'proj2'). HOWEVER: When I run the purging policy on the whole gpfs device (with both 'project' and 'scratch' filesets) * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and 'scra2' filesets under scratch are excluded (totally unexpected) * if I use FOR FILESET('scra1') I get error that scra1 is dependent fileset (Ok, that is expected) * if I use /*FOR FILESET('scratch')*/, all contents under 'project', 'proj1', 'proj2' are traversed as well, and I don't want that (it takes too much time) * if I use /*FOR FILESET('scratch')*/, and instead of the whole device I apply the policy to the /scratch mount point only, the policy still traverses all the content of 'project', 'proj1', 'proj2', which I don't want. (again, totally unexpected) QUESTION: How can I craft the syntax of the mmapplypolicy in combination with the RULE filters, so that I can traverse all the contents under the 'scratch' independent fileset, including the nested dependent filesets 'scra1','scra2', and NOT traverse the other independent filesets at all (since this takes too much time)? Thanks Jaime PS: FOR FILESET('scra*') does not work. ************************************ TELL US ABOUT YOUR SUCCESS STORIES https://urldefense.proofpoint.com/v2/url?u=http-3A__www.scinethpc.ca_testimonials&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=csxqKhhBsww-1H4lJlra9UtcoY0yG6PcOeV5jYf5pYo&s=tM9JZXsRNu6EEhoFlUuWvTLwMsqbDjfDj3NDZ6elACA&e= ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=csxqKhhBsww-1H4lJlra9UtcoY0yG6PcOeV5jYf5pYo&s=V6u0XsNxHj4Mp-mu7hCZKv1AD3_GYqU-4KZzvMSQ_MQ&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From makaplan at us.ibm.com Wed Apr 18 19:00:13 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 18 Apr 2018 14:00:13 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... In-Reply-To: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> Message-ID: I suggest you remove any FOR FILESET(...) specifications from your rules and then run mmapplypolicy /path/to/the/root/directory/of/the/independent-fileset-you-wish-to-scan ... --scope inodespace -P your-policy-rules-file ... See also the (RTFineM) for the --scope option and the Directory argument of the mmapplypolicy command. That is the best, most efficient way to scan all the files that are in a particular inode-space. Also, you must have all filesets of interest "linked" and the file system must be mounted. Notice that "independent" means that the fileset name is used to denote both a fileset and an inode-space, where said inode-space contains the fileset of that name and possibly other "dependent" filesets... IF one wished to search the entire file system for files within several different filesets, one could use rules with FOR FILESET('fileset1','fileset2','and-so-on') Or even more flexibly WHERE FILESET_NAME LIKE 'sql-like-pattern-with-%s-and-maybe-_s' Or even more powerfully WHERE regex(FILESET_NAME, 'extended-regular-.*-expression') From: "Jaime Pinto" To: "gpfsug main discussion list" Date: 04/18/2018 01:00 PM Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... Sent by: gpfsug-discuss-bounces at spectrumscale.org A few months ago I asked about limits and dynamics of traversing depended .vs independent filesets on this forum. I used the information provided to make decisions and setup our new DSS based gpfs storage system. Now I have a problem I couldn't' yet figure out how to make it work: 'project' and 'scratch' are top *independent* filesets of the same file system. 'proj1', 'proj2' are dependent filesets nested under 'project' 'scra1', 'scra2' are dependent filesets nested under 'scratch' I would like to run a purging policy on all contents under 'scratch' (which includes 'scra1', 'scra2'), and TSM backup policies on all contents under 'project' (which includes 'proj1', 'proj2'). HOWEVER: When I run the purging policy on the whole gpfs device (with both 'project' and 'scratch' filesets) * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and 'scra2' filesets under scratch are excluded (totally unexpected) * if I use FOR FILESET('scra1') I get error that scra1 is dependent fileset (Ok, that is expected) * if I use /*FOR FILESET('scratch')*/, all contents under 'project', 'proj1', 'proj2' are traversed as well, and I don't want that (it takes too much time) * if I use /*FOR FILESET('scratch')*/, and instead of the whole device I apply the policy to the /scratch mount point only, the policy still traverses all the content of 'project', 'proj1', 'proj2', which I don't want. (again, totally unexpected) QUESTION: How can I craft the syntax of the mmapplypolicy in combination with the RULE filters, so that I can traverse all the contents under the 'scratch' independent fileset, including the nested dependent filesets 'scra1','scra2', and NOT traverse the other independent filesets at all (since this takes too much time)? Thanks Jaime PS: FOR FILESET('scra*') does not work. 
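For concreteness, a minimal sketch of the --scope approach suggested above. The fileset name 'scratch' comes from the question; the mount path, rule names, list name and the 60-day age test are illustrative assumptions, not taken from the original post.

/* /tmp/scratch-purge.pol -- note: no FOR FILESET(...) clause; the scan is
   restricted to a single inode space by mmapplypolicy itself */
RULE 'ext' EXTERNAL LIST 'purge-candidates' EXEC ''
RULE 'old' LIST 'purge-candidates'
     WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 60

# Run against the junction of the independent fileset and scan only its
# inode space, which also covers the nested dependent filesets scra1/scra2:
mmapplypolicy /gpfs/fs1/scratch --scope inodespace -P /tmp/scratch-purge.pol \
     -I defer -f /tmp/scratch-purge

With -I defer and the empty EXEC, the candidates are only written to /tmp/scratch-purge.list.purge-candidates for inspection; an actual purge would use a DELETE rule (or an external script) in place of the LIST rule, with the same --scope inodespace restriction.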
************************************ TELL US ABOUT YOUR SUCCESS STORIES https://urldefense.proofpoint.com/v2/url?u=http-3A__www.scinethpc.ca_testimonials&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=IpwHlr0YNr7rgV7gI8Y2sxIELLIwA15KK4nBnv9BYWk&e= ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=aff0vMJkKd-Z3pw3-jckmI3ejqXh8aSr8rxkKf3OGdk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Apr 18 19:51:29 2018 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 18 Apr 2018 14:51:29 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... In-Reply-To: References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> Message-ID: <20180418145129.63803tvsotr1960h@support.scinet.utoronto.ca> Ok Marc and Frederick, there is hope. I'll conduct more experiments and report back Thanks for the suggestions. Jaime Quoting "Marc A Kaplan" : > I suggest you remove any FOR FILESET(...) specifications from your rules > and then run > > mmapplypolicy > /path/to/the/root/directory/of/the/independent-fileset-you-wish-to-scan > ... --scope inodespace -P your-policy-rules-file ... > > See also the (RTFineM) for the --scope option and the Directory argument > of the mmapplypolicy command. > > That is the best, most efficient way to scan all the files that are in a > particular inode-space. Also, you must have all filesets of interest > "linked" and the file system must be mounted. > > Notice that "independent" means that the fileset name is used to denote > both a fileset and an inode-space, where said inode-space contains the > fileset of that name and possibly other "dependent" filesets... > > IF one wished to search the entire file system for files within several > different filesets, one could use rules with > > FOR FILESET('fileset1','fileset2','and-so-on') > > Or even more flexibly > > WHERE FILESET_NAME LIKE 'sql-like-pattern-with-%s-and-maybe-_s' > > Or even more powerfully > > WHERE regex(FILESET_NAME, 'extended-regular-.*-expression') > > > > > > From: "Jaime Pinto" > To: "gpfsug main discussion list" > Date: 04/18/2018 01:00 PM > Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > A few months ago I asked about limits and dynamics of traversing > depended .vs independent filesets on this forum. I used the > information provided to make decisions and setup our new DSS based > gpfs storage system. Now I have a problem I couldn't' yet figure out > how to make it work: > > 'project' and 'scratch' are top *independent* filesets of the same > file system. 
> > 'proj1', 'proj2' are dependent filesets nested under 'project' > 'scra1', 'scra2' are dependent filesets nested under 'scratch' > > I would like to run a purging policy on all contents under 'scratch' > (which includes 'scra1', 'scra2'), and TSM backup policies on all > contents under 'project' (which includes 'proj1', 'proj2'). > > HOWEVER: > When I run the purging policy on the whole gpfs device (with both > 'project' and 'scratch' filesets) > > * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and > 'scra2' filesets under scratch are excluded (totally unexpected) > > * if I use FOR FILESET('scra1') I get error that scra1 is dependent > fileset (Ok, that is expected) > > * if I use /*FOR FILESET('scratch')*/, all contents under 'project', > 'proj1', 'proj2' are traversed as well, and I don't want that (it > takes too much time) > > * if I use /*FOR FILESET('scratch')*/, and instead of the whole device > I apply the policy to the /scratch mount point only, the policy still > traverses all the content of 'project', 'proj1', 'proj2', which I > don't want. (again, totally unexpected) > > QUESTION: > > How can I craft the syntax of the mmapplypolicy in combination with > the RULE filters, so that I can traverse all the contents under the > 'scratch' independent fileset, including the nested dependent filesets > 'scra1','scra2', and NOT traverse the other independent filesets at > all (since this takes too much time)? > > Thanks > Jaime > > > PS: FOR FILESET('scra*') does not work. > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > > https://urldefense.proofpoint.com/v2/url?u=http-3A__www.scinethpc.ca_testimonials&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=IpwHlr0YNr7rgV7gI8Y2sxIELLIwA15KK4nBnv9BYWk&e= > > ************************************ > --- > Jaime Pinto - Storage Analyst > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - www.computecanada.ca > University of Toronto > 661 University Ave. (MaRS), Suite 1140 > Toronto, ON, M5G1M1 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of > Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=aff0vMJkKd-Z3pw3-jckmI3ejqXh8aSr8rxkKf3OGdk&e= > > > > > > > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From makaplan at us.ibm.com Wed Apr 18 22:22:22 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 18 Apr 2018 17:22:22 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... 
In-Reply-To: <20180418145129.63803tvsotr1960h@support.scinet.utoronto.ca> References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> <20180418145129.63803tvsotr1960h@support.scinet.utoronto.ca> Message-ID: It's more than hope. It works just as I wrote and documented and tested. Personally, I find the nomenclature for filesets and inodespaces as "independent filesets" unfortunate and leading to misunderstandings and confusion. But that train left the station a few years ago, so we just live with it... -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dirk.Thometzek at rohde-schwarz.com Thu Apr 19 08:44:00 2018 From: Dirk.Thometzek at rohde-schwarz.com (Dirk Thometzek) Date: Thu, 19 Apr 2018 07:44:00 +0000 Subject: [gpfsug-discuss] Career Opportunity Message-ID: <79f9ca3347214203b37003ac5dc288c7@rohde-schwarz.com> Dear all, I am working with a development team located in Hanover, Germany. Currently we are looking for a Spectrum Scale professional with long term experience to support our team in a senior development position. If you are interested, please send me a private message to: dirk.thometzek at rohde-schwarz.com Best regards, Dirk Thometzek Product Management File Based Media Solutions [RS_Logo_cyan_rgb - Klein] Rohde & Schwarz GmbH & Co. KG Pf. 80 14 69, D-81614 Muenchen Abt. MU Phone: +49 511 67807-0 Gesch?ftsf?hrung / Executive Board: Christian Leicher (Vorsitzender / Chairman), Peter Riedel Sitz der Gesellschaft / Company's Place of Business: M?nchen | Registereintrag / Commercial Register No.: HRA 16 270 Pers?nlich haftender Gesellschafter / Personally Liable Partner: RUSEG Verwaltungs-GmbH | Sitz der Gesellschaft / Company's Place of Business: M?nchen | Registereintrag / Commercial Register No.: HRB 7 534 | Umsatzsteuer-Identifikationsnummer (USt-IdNr.) / VAT Identification No.: DE 130 256 683 | Elektro-Altger?te Register (EAR) / WEEE Register No.: DE 240 437 86 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 6729 bytes Desc: image001.jpg URL: From delmard at br.ibm.com Thu Apr 19 14:37:14 2018 From: delmard at br.ibm.com (Delmar Demarchi) Date: Thu, 19 Apr 2018 11:37:14 -0200 Subject: [gpfsug-discuss] API - listing quotas Message-ID: Hello Experts. I'm trying to collect information from Fileset Quotas, using the API. I'm using this link as reference: https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adm_apiv2version2.htm Do you know about any issue with Scale 5.0.x and API? Or what I have change in my command to collect this infos? Following the instruction on knowledge Center, we tried to list, using GET, the FILESET Quota but only USR and GRP were reported. Listing all quotas (using GET also), I found my quota there. 
See my sample: curl -k -u admin:passw0rd -XGET -H content-type:application/json " https://xx.xx.xx.xx:443/scalemgmt/v2/filesystems/fs1/filesets/sicredi/quotas " { "quotas" : [ { "blockGrace" : "none", "blockLimit" : 0, "blockQuota" : 0, "filesGrace" : "none", "filesLimit" : 0, "filesQuota" : 0, "filesetName" : "sicredi", "filesystemName" : "fs1", "isDefaultQuota" : true, "objectId" : 0, "quotaId" : 454, "quotaType" : "GRP" }, { "blockGrace" : "none", "blockLimit" : 0, "blockQuota" : 0, "filesGrace" : "none", "filesLimit" : 0, "filesQuota" : 0, "filesetName" : "sicredi", "filesystemName" : "fs1", "isDefaultQuota" : true, "objectId" : 0, "quotaId" : 501, "quotaType" : "USR" } ], "status" : { "code" : 200, "message" : "The request finished successfully." } }[root at lbsgpfs05 ~]# curl -k -u admin:passw0rd -XGET -H content-type:application/json " https://xx.xx.xx.xx:443/scalemgmt/v2/filesystems/fs1/quotas" { "quotas" : [ { "blockGrace" : "none", "blockInDoubt" : 0, "blockLimit" : 0, "blockQuota" : 0, "blockUsage" : 512, "filesGrace" : "none", "filesInDoubt" : 0, "filesLimit" : 0, "filesQuota" : 0, "filesUsage" : 1, "filesystemName" : "fs1", "isDefaultQuota" : false, "objectId" : 0, "objectName" : "root", "quotaId" : 366, "quotaType" : "FILESET" }, { "blockGrace" : "none", "blockInDoubt" : 0, "blockLimit" : 6598656, "blockQuota" : 6598656, "blockUsage" : 5670208, "filesGrace" : "none", "filesInDoubt" : 0, "filesLimit" : 0, "filesQuota" : 0, "filesUsage" : 5, "filesystemName" : "fs1", "isDefaultQuota" : false, "objectId" : 1, "objectName" : "sicredi", "quotaId" : 367, "quotaType" : "FILESET" } "status" : { "code" : 200, "message" : "The request finished successfully." } } mmlsquota -j sicredi fs1 --block-size auto Block Limits | File Limits Filesystem type blocks quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs1 FILESET 5.408G 6.293G 6.293G 0 none | 5 0 0 0 none mmrepquota -a *** Report for USR GRP FILESET quotas on fs1 Block Limits | File Limits Name fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace entryType sicredi root FILESET 5670208 6598656 6598656 0 none | 5 0 0 0 none e Regards, | Abrazos, | Atenciosamente, Delmar Demarchi .'. Power and Storage Services Specialist Phone: 55-19-2132-9469 | Mobile: 55-19-9 9792-1323 E-mail: delmard at br.ibm.com www.ibm.com/systems/services/labservices -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 6614 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 2022 bytes Desc: not available URL: From A.Wolf-Reber at de.ibm.com Thu Apr 19 14:56:24 2018 From: A.Wolf-Reber at de.ibm.com (Alexander Wolf) Date: Thu, 19 Apr 2018 13:56:24 +0000 Subject: [gpfsug-discuss] API - listing quotas In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152411877729038.png Type: image/png Size: 1134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152411877729039.png Type: image/png Size: 6645 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Image.152411877729040.png Type: image/png Size: 1134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image._2_37157E80371579900049F11E83258274.jpg Type: image/jpeg Size: 6614 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image._1_371596C0371591D00049F11E83258274.gif Type: image/gif Size: 2022 bytes Desc: not available URL: From Renar.Grunenberg at huk-coburg.de Fri Apr 20 15:01:55 2018 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Fri, 20 Apr 2018 14:01:55 +0000 Subject: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale Message-ID: Hallo Simon, are there any reason why the link of the presentation from Yong ZY Zheng(Cognitive, ML, Hortonworks) is not linked. Renar Grunenberg Abteilung Informatik ? Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Apr 20 15:12:11 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Fri, 20 Apr 2018 14:12:11 +0000 Subject: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale In-Reply-To: References: Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6@bham.ac.uk> Sorry, it was a typo from my side. The talks that are missing we are chasing for copies of the slides that we can release. Simon From: on behalf of "Renar.Grunenberg at huk-coburg.de" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Friday, 20 April 2018 at 15:02 To: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale Hallo Simon, are there any reason why the link of the presentation from Yong ZY Zheng(Cognitive, ML, Hortonworks) is not linked. Renar Grunenberg Abteilung Informatik ? Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 
9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nick.savva at adventone.com Sun Apr 22 13:18:23 2018 From: nick.savva at adventone.com (Nick Savva) Date: Sun, 22 Apr 2018 12:18:23 +0000 Subject: [gpfsug-discuss] AFM cache re-link Message-ID: Hi all, I was always preface my questions with an apology first up if this has been covered before. I am curious if anyone has tested relinking an AFM cache to a new home where the new home, old home and cache have the exact same data. What is the behaviour? The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case, I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production. Where prod/dr are gpfs filesystems with a replica set of data. Again its to avoid moving TB's across the link that are already there. Appreciate the help in advance, Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From sannaik2 at in.ibm.com Sun Apr 22 17:01:14 2018 From: sannaik2 at in.ibm.com (Sandeep Naik1) Date: Sun, 22 Apr 2018 21:31:14 +0530 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location In-Reply-To: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> Message-ID: Hi Atmane, Can you include the o/p of command tslsenclslot -a from both the nodes ? Any thing in the logs related to this pdisk ? 
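For example, something along these lines captured on each of the two I/O servers would help (recovery group, pdisk name and WWN are taken from the earlier post; the output file name is only an illustration):

# slot view from the enclosure services layer
tslsenclslot -a > /tmp/$(hostname -s).tslsenclslot.out
# enclosure component status as GPFS sees it
mmlsenclosure all -L
# current state of the failing pdisk
mmlspdisk BB1RGR --not-ok
# recent log entries mentioning the pdisk or its WWN
grep -e e2d3s11 -e 5000C50056717727 /var/adm/ras/mmfs.log.latest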
Thanks, Sandeep Naik Elastic Storage server / GPFS Test ETZ-B, Hinjewadi Pune India (+91) 8600994314 From: atmane khiredine To: "gpfsug-discuss at spectrumscale.org" Date: 17/04/2018 02:09 PM Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location Sent by: gpfsug-discuss-bounces at spectrumscale.org dear all, I want to understand how GNR/GSS/ESS stores information about the pdisk location I looked in the configuration file /var/mmfs/gen/mmfsNodeData /var/mmfs/gen/mmsdrfs /var/mmfs/gen/mmfs.cfg but no location of pdisk this is real scenario of unknown location this is the output from GNR/GSS/ESS ----------------------------------- [root at ess1 ~]# mmlspdisk BB1RGR --not-ok pdisk: replacementPriority = 2.00 name = "e2d3s11" device = "" recoveryGroup = "BB1RGR" declusteredArray = "DA2" state = "missing/noPath/systemDrain/noRGD/noVCD/noData" capacity = 3000034656256 freeSpace = 2997887172608 location = "" WWN = "naa.5000C50056717727" server = "ess1-ib0" reads = 106800946 writes = 10414075 IOErrors = 1216 IOTimeouts = 18 mediaErrors = 0 checksumErrors = 0 pathErrors = 0 relativePerformance = 1.000 userLocation = "" userCondition = "replaceable" hardware = " " hardwareType = Rotating 7200 nPaths = 0 active 0 total nsdFormatVersion = Unknown paxosAreaOffset = Unknown paxosAreaSize = Unknown logicalBlockSize = 512 ----------------------------------- I begin change the Hard disk mmchcarrier BB1RGR --release --pdisk "e2d3s11" I have this error Location of pdisk e2d3s11 of recovery group BB1RGR is not known. i know the location of the Hard disk i know the location of the Hard disk from old mmlspdisk file mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" Location of pdisk e2d3s11 of recovery group BB1RGR is not known. I read in the official documentation 6027-3001 [E] Location of pdisk pdiskName of recovery group recoveryGroupName is not known. Explanation: IBM Spectrum Scale is unable to find the location of the given pdisk. User response: Check the disk enclosure hardware. Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=DXkezTwrVXsEOfvoqY7_DLS86P5FtQszjm9zok6upRU&m=dtQxM0x58-X-aWHl-3gNSQq_YWWdIMi_GcStOMr9Tt0&s=SJIGLOxE4hu-R8p5at9i6BvxDkyPQn4J6LiJjaQE180&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From a.khiredine at meteo.dz Sun Apr 22 18:07:01 2018 From: a.khiredine at meteo.dz (atmane khiredine) Date: Sun, 22 Apr 2018 17:07:01 +0000 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location In-Reply-To: References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz>, Message-ID: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0D0D@SDEB-EXC02.meteo.dz> Hi sannaik after doing some research in the Cluster I find the pdisk e3d3s06 is must be in Enclosure 3 Drawer 3 slot 6 but now is in the Enclosure 2 Drawer 3 Slot 11 is supose to be for e2d3s11 I use the old drive of pdisk e2d3s11 to pdisk e3d3s06 now the pdisk e3d3s06 is in the wrong location SV30708502-3-11 ---- mmlspdisk e3d3s06 name = "e3d3s06" device = "/dev/sdfa,/dev/sdir" location = "SV30708502-3-11" userLocation = "Rack ess1 U11-14, Enclosure 1818-80E-SV30708502 Drawer 3 Slot 11" ---- and the pdisk e2d3s11 is without location mmlspdisk e2d3s11 name = "e2d3s11" device = " " location = "" userLocation = "" --- if i use the script replace-at-location for e3d3s06 SV25304899-3-6 replace-at-location BB1RGL e3d3s06 SV25304899-3-6 replace-at-location: error: pdisk e3d3s06 of RG BB1RGL is in location SV30708502-3-11, not SV25304899-3-6. Check the pdisk name and location code before continuing. if i use the script replace-at-location for e3d3s06 SV30708502-3-11 replace-at-location BB1RGL e3d3s06 SV30708502-3-11 location SV30708502-3-11 has a location if i use replace-at-location BB1RGR e2d3s11 SV30708502-3-11 Disk descriptor for /dev/sdfc,/dev/sdiq refers to an existing pdisk. the pdisk e3d3s06 is must be in Enclosure 3 Drawer 3 slot 6 but now is in the Enclosure 2 Drawer 3 Slot 11 is supose to be for e2d3s11 the disk found in location SV30708502-3-11 is not a blank disk because is a lready used by e3d3s06 why e3d3s06 is take the place of e2d3s11 and is stil working Thanks Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz ________________________________________ De : Sandeep Naik1 [sannaik2 at in.ibm.com] Envoy? : dimanche 22 avril 2018 17:01 ? : atmane khiredine Cc : gpfsug main discussion list Objet : Re: [gpfsug-discuss] GNR/GSS/ESS pdisk location Hi Atmane, Can you include the o/p of command tslsenclslot -a from both the nodes ? Any thing in the logs related to this pdisk ? 
Thanks, Sandeep Naik Elastic Storage server / GPFS Test ETZ-B, Hinjewadi Pune India (+91) 8600994314 From: atmane khiredine To: "gpfsug-discuss at spectrumscale.org" Date: 17/04/2018 02:09 PM Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ dear all, I want to understand how GNR/GSS/ESS stores information about the pdisk location I looked in the configuration file /var/mmfs/gen/mmfsNodeData /var/mmfs/gen/mmsdrfs /var/mmfs/gen/mmfs.cfg but no location of pdisk this is real scenario of unknown location this is the output from GNR/GSS/ESS ----------------------------------- [root at ess1 ~]# mmlspdisk BB1RGR --not-ok pdisk: replacementPriority = 2.00 name = "e2d3s11" device = "" recoveryGroup = "BB1RGR" declusteredArray = "DA2" state = "missing/noPath/systemDrain/noRGD/noVCD/noData" capacity = 3000034656256 freeSpace = 2997887172608 location = "" WWN = "naa.5000C50056717727" server = "ess1-ib0" reads = 106800946 writes = 10414075 IOErrors = 1216 IOTimeouts = 18 mediaErrors = 0 checksumErrors = 0 pathErrors = 0 relativePerformance = 1.000 userLocation = "" userCondition = "replaceable" hardware = " " hardwareType = Rotating 7200 nPaths = 0 active 0 total nsdFormatVersion = Unknown paxosAreaOffset = Unknown paxosAreaSize = Unknown logicalBlockSize = 512 ----------------------------------- I begin change the Hard disk mmchcarrier BB1RGR --release --pdisk "e2d3s11" I have this error Location of pdisk e2d3s11 of recovery group BB1RGR is not known. i know the location of the Hard disk i know the location of the Hard disk from old mmlspdisk file mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" Location of pdisk e2d3s11 of recovery group BB1RGR is not known. I read in the official documentation 6027-3001 [E] Location of pdisk pdiskName of recovery group recoveryGroupName is not known. Explanation: IBM Spectrum Scale is unable to find the location of the given pdisk. User response: Check the disk enclosure hardware. Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=DXkezTwrVXsEOfvoqY7_DLS86P5FtQszjm9zok6upRU&m=dtQxM0x58-X-aWHl-3gNSQq_YWWdIMi_GcStOMr9Tt0&s=SJIGLOxE4hu-R8p5at9i6BvxDkyPQn4J6LiJjaQE180&e= From coetzee.ray at gmail.com Sun Apr 22 23:38:41 2018 From: coetzee.ray at gmail.com (Ray Coetzee) Date: Sun, 22 Apr 2018 23:38:41 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34 In-Reply-To: References: Message-ID: Good evening all I'm working with IBM on a PMR where ganesha is segfaulting or causing kernel panics on one group of CES nodes. We have 12 identical CES nodes split into two groups of 6 nodes each & have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was released. Only one group started having issues Monday morning where ganesha would segfault and the mounts would move over to the remaining nodes. The remaining nodes then start to fall over like dominos within minutes or hours to the point that all CES nodes are "failed" according to "mmces node list" and the VIP's are unassigned. Recovering the nodes are extremely finicky and works for a few minutes or hours before segfaulting again. 
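For anyone wanting to check for the same condition on their own cluster, a quick sketch of the standard CES status commands (nothing here is site-specific):

mmces node list        # CES nodes flagged as Failed in this state
mmces state show -a    # per-node state of the protocol components (NFS, SMB, ...)
mmces address list     # shows whether the CES IPs are currently assigned to nodes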
Most times a complete stop of Ganesha on all nodes & then only starting it on two random nodes allow mounts to recover for a while. None of the following has helped: A reboot of all nodes. Refresh CCR config file with mmsdrrestore Remove/add CES from nodes. Reinstall GPFS & protocol rpms Update to 5.0.0-2 Fresh reinstall of a node Network checks out with no dropped packets on either data or export networks. The only temporary fix so far has been to downrev ganesha to 2.3.2 from 2.5.3 on the affected nodes. While waiting for IBM development, has anyone seen something similar maybe? Kind regards Ray Coetzee On Sat, Apr 21, 2018 at 12:00 PM, wrote: > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: UK Meeting - tooling Spectrum Scale (Grunenberg, Renar) > 2. Re: UK Meeting - tooling Spectrum Scale > (Simon Thompson (IT Research Support)) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 20 Apr 2018 14:01:55 +0000 > From: "Grunenberg, Renar" > To: "'gpfsug-discuss at spectrumscale.org'" > > Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale > Message-ID: > Content-Type: text/plain; charset="utf-8" > > Hallo Simon, > are there any reason why the link of the presentation from Yong ZY > Zheng(Cognitive, ML, Hortonworks) is not linked. > > Renar Grunenberg > Abteilung Informatik ? Betrieb > > HUK-COBURG > Bahnhofsplatz > 96444 Coburg > Telefon: 09561 96-44110 > Telefax: 09561 96-44104 > E-Mail: Renar.Grunenberg at huk-coburg.de > Internet: www.huk.de > ________________________________ > HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter > Deutschlands a. G. in Coburg > Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 > Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg > Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. > Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav > Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. > ________________________________ > Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte > Informationen. > Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich > erhalten haben, > informieren Sie bitte sofort den Absender und vernichten Sie diese > Nachricht. > Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht > ist nicht gestattet. > > This information may contain confidential and/or privileged information. > If you are not the intended recipient (or have received this information > in error) please notify the > sender immediately and destroy this information. > Any unauthorized copying, disclosure or distribution of the material in > this information is strictly forbidden. > ________________________________ > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: 20180420/91e3d84d/attachment-0001.html> > > ------------------------------ > > Message: 2 > Date: Fri, 20 Apr 2018 14:12:11 +0000 > From: "Simon Thompson (IT Research Support)" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale > Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6 at bham.ac.uk> > Content-Type: text/plain; charset="utf-8" > > Sorry, it was a typo from my side. > > The talks that are missing we are chasing for copies of the slides that we > can release. > > Simon > > From: on behalf of " > Renar.Grunenberg at huk-coburg.de" > Reply-To: "gpfsug-discuss at spectrumscale.org" < > gpfsug-discuss at spectrumscale.org> > Date: Friday, 20 April 2018 at 15:02 > To: "gpfsug-discuss at spectrumscale.org" > Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale > > Hallo Simon, > are there any reason why the link of the presentation from Yong ZY > Zheng(Cognitive, ML, Hortonworks) is not linked. > > Renar Grunenberg > Abteilung Informatik ? Betrieb > > HUK-COBURG > Bahnhofsplatz > 96444 Coburg > Telefon: > > 09561 96-44110 > > Telefax: > > 09561 96-44104 > > E-Mail: > > Renar.Grunenberg at huk-coburg.de > > Internet: > > www.huk.de > > ________________________________ > HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter > Deutschlands a. G. in Coburg > Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 > Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg > Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. > Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav > Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. > ________________________________ > Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte > Informationen. > Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich > erhalten haben, > informieren Sie bitte sofort den Absender und vernichten Sie diese > Nachricht. > Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht > ist nicht gestattet. > > This information may contain confidential and/or privileged information. > If you are not the intended recipient (or have received this information > in error) please notify the > sender immediately and destroy this information. > Any unauthorized copying, disclosure or distribution of the material in > this information is strictly forbidden. > ________________________________ > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: 20180420/0b8e9ffa/attachment-0001.html> > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 75, Issue 34 > ********************************************** > -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Mon Apr 23 00:02:09 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Mon, 23 Apr 2018 01:02:09 +0200 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34 In-Reply-To: References: Message-ID: Yes, I've been struggelig with something similiar this week. Ganesha dying with SIGABRT -- nothing else logged. After catching a few coredumps, it has been identified as a problem with some udp-communication during mounts from solaris clients. Disabling udp as transport on the shares serverside didn't help. 
It was suggested to use "mount -o tcp" or whatever the solaris version of this is -- but we haven't tested this. So far the downgrade to v2.3.2 has been our workaround. PMR: 48669,080,678 -jf On Mon, Apr 23, 2018 at 12:38 AM, Ray Coetzee wrote: > Good evening all > > I'm working with IBM on a PMR where ganesha is segfaulting or causing > kernel panics on one group of CES nodes. > > We have 12 identical CES nodes split into two groups of 6 nodes each & > have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was released. > > Only one group started having issues Monday morning where ganesha would > segfault and the mounts would move over to the remaining nodes. > The remaining nodes then start to fall over like dominos within minutes or > hours to the point that all CES nodes are "failed" according to "mmces node > list" and the VIP's are unassigned. > > Recovering the nodes are extremely finicky and works for a few minutes or > hours before segfaulting again. > Most times a complete stop of Ganesha on all nodes & then only starting it > on two random nodes allow mounts to recover for a while. > > None of the following has helped: > A reboot of all nodes. > Refresh CCR config file with mmsdrrestore > Remove/add CES from nodes. > Reinstall GPFS & protocol rpms > Update to 5.0.0-2 > Fresh reinstall of a node > Network checks out with no dropped packets on either data or export > networks. > > The only temporary fix so far has been to downrev ganesha to 2.3.2 from > 2.5.3 on the affected nodes. > > While waiting for IBM development, has anyone seen something similar maybe? > > Kind regards > > Ray Coetzee > > > > On Sat, Apr 21, 2018 at 12:00 PM, spectrumscale.org> wrote: > >> Send gpfsug-discuss mailing list submissions to >> gpfsug-discuss at spectrumscale.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> or, via email, send a message with subject or body 'help' to >> gpfsug-discuss-request at spectrumscale.org >> >> You can reach the person managing the list at >> gpfsug-discuss-owner at spectrumscale.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of gpfsug-discuss digest..." >> >> >> Today's Topics: >> >> 1. Re: UK Meeting - tooling Spectrum Scale (Grunenberg, Renar) >> 2. Re: UK Meeting - tooling Spectrum Scale >> (Simon Thompson (IT Research Support)) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Fri, 20 Apr 2018 14:01:55 +0000 >> From: "Grunenberg, Renar" >> To: "'gpfsug-discuss at spectrumscale.org'" >> >> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >> Message-ID: >> Content-Type: text/plain; charset="utf-8" >> >> Hallo Simon, >> are there any reason why the link of the presentation from Yong ZY >> Zheng(Cognitive, ML, Hortonworks) is not linked. >> >> Renar Grunenberg >> Abteilung Informatik ? Betrieb >> >> HUK-COBURG >> Bahnhofsplatz >> 96444 Coburg >> Telefon: 09561 96-44110 >> Telefax: 09561 96-44104 >> E-Mail: Renar.Grunenberg at huk-coburg.de >> Internet: www.huk.de >> ________________________________ >> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >> Deutschlands a. G. in Coburg >> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >> Olav Her?y, Dr. 
J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >> ________________________________ >> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >> Informationen. >> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich >> erhalten haben, >> informieren Sie bitte sofort den Absender und vernichten Sie diese >> Nachricht. >> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >> ist nicht gestattet. >> >> This information may contain confidential and/or privileged information. >> If you are not the intended recipient (or have received this information >> in error) please notify the >> sender immediately and destroy this information. >> Any unauthorized copying, disclosure or distribution of the material in >> this information is strictly forbidden. >> ________________________________ >> -------------- next part -------------- >> An HTML attachment was scrubbed... >> URL: > 0420/91e3d84d/attachment-0001.html> >> >> ------------------------------ >> >> Message: 2 >> Date: Fri, 20 Apr 2018 14:12:11 +0000 >> From: "Simon Thompson (IT Research Support)" >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >> Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6 at bham.ac.uk> >> Content-Type: text/plain; charset="utf-8" >> >> Sorry, it was a typo from my side. >> >> The talks that are missing we are chasing for copies of the slides that >> we can release. >> >> Simon >> >> From: on behalf of " >> Renar.Grunenberg at huk-coburg.de" >> Reply-To: "gpfsug-discuss at spectrumscale.org" < >> gpfsug-discuss at spectrumscale.org> >> Date: Friday, 20 April 2018 at 15:02 >> To: "gpfsug-discuss at spectrumscale.org" >> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >> >> Hallo Simon, >> are there any reason why the link of the presentation from Yong ZY >> Zheng(Cognitive, ML, Hortonworks) is not linked. >> >> Renar Grunenberg >> Abteilung Informatik ? Betrieb >> >> HUK-COBURG >> Bahnhofsplatz >> 96444 Coburg >> Telefon: >> >> 09561 96-44110 >> >> Telefax: >> >> 09561 96-44104 >> >> E-Mail: >> >> Renar.Grunenberg at huk-coburg.de >> >> Internet: >> >> www.huk.de >> >> ________________________________ >> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >> Deutschlands a. G. in Coburg >> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >> Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >> ________________________________ >> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >> Informationen. >> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich >> erhalten haben, >> informieren Sie bitte sofort den Absender und vernichten Sie diese >> Nachricht. >> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >> ist nicht gestattet. >> >> This information may contain confidential and/or privileged information. >> If you are not the intended recipient (or have received this information >> in error) please notify the >> sender immediately and destroy this information. >> Any unauthorized copying, disclosure or distribution of the material in >> this information is strictly forbidden. >> ________________________________ >> -------------- next part -------------- >> An HTML attachment was scrubbed... 
>> URL: > 0420/0b8e9ffa/attachment-0001.html> >> >> ------------------------------ >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> End of gpfsug-discuss Digest, Vol 75, Issue 34 >> ********************************************** >> > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From coetzee.ray at gmail.com Mon Apr 23 00:23:55 2018 From: coetzee.ray at gmail.com (Ray Coetzee) Date: Mon, 23 Apr 2018 00:23:55 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34 In-Reply-To: References: Message-ID: Hi Jan-Frode We've been told the same regarding mounts using UDP. Our exports are already explicitly configured for TCP and the client's fstab's set to use TCP. It would be infuriating if the clients are trying UDP first irrespective of the mount options configured. Why the problem started specifically last week for both of us is interesting. Kind regards Ray Coetzee Mob: +44 759 704 7060 Skype: ray.coetzee Email: coetzee.ray at gmail.com On Mon, Apr 23, 2018 at 12:02 AM, Jan-Frode Myklebust wrote: > > Yes, I've been struggelig with something similiar this week. Ganesha dying > with SIGABRT -- nothing else logged. After catching a few coredumps, it has > been identified as a problem with some udp-communication during mounts from > solaris clients. Disabling udp as transport on the shares serverside didn't > help. It was suggested to use "mount -o tcp" or whatever the solaris > version of this is -- but we haven't tested this. So far the downgrade to > v2.3.2 has been our workaround. > > PMR: 48669,080,678 > > > -jf > > > On Mon, Apr 23, 2018 at 12:38 AM, Ray Coetzee > wrote: > >> Good evening all >> >> I'm working with IBM on a PMR where ganesha is segfaulting or causing >> kernel panics on one group of CES nodes. >> >> We have 12 identical CES nodes split into two groups of 6 nodes each & >> have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was >> released. >> >> Only one group started having issues Monday morning where ganesha would >> segfault and the mounts would move over to the remaining nodes. >> The remaining nodes then start to fall over like dominos within minutes >> or hours to the point that all CES nodes are "failed" according to >> "mmces node list" and the VIP's are unassigned. >> >> Recovering the nodes are extremely finicky and works for a few minutes or >> hours before segfaulting again. >> Most times a complete stop of Ganesha on all nodes & then only starting >> it on two random nodes allow mounts to recover for a while. >> >> None of the following has helped: >> A reboot of all nodes. >> Refresh CCR config file with mmsdrrestore >> Remove/add CES from nodes. >> Reinstall GPFS & protocol rpms >> Update to 5.0.0-2 >> Fresh reinstall of a node >> Network checks out with no dropped packets on either data or export >> networks. >> >> The only temporary fix so far has been to downrev ganesha to 2.3.2 from >> 2.5.3 on the affected nodes. >> >> While waiting for IBM development, has anyone seen something similar >> maybe? 
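For reference, forcing the transport on the client side looks roughly like the following on a Linux NFSv3 client (the server name, export path and mount point here are made up for illustration; the Solaris equivalent is approximately "mount -F nfs -o proto=tcp,vers=3 ..."):

  # one-off test mount pinned to TCP
  mount -t nfs -o vers=3,proto=tcp ces-vip.example.com:/ibm/fs1/export /mnt/export

  # corresponding /etc/fstab entry
  ces-vip.example.com:/ibm/fs1/export  /mnt/export  nfs  vers=3,proto=tcp,hard  0 0

  # to see whether a client still tries UDP regardless, one could temporarily
  # reject UDP to the NFS port on a test CES node (the MOUNT/portmapper side
  # services may use additional ports, so this is only a partial check)
  iptables -A INPUT -p udp --dport 2049 -j REJECT
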
>> >> Kind regards >> >> Ray Coetzee >> >> >> >> On Sat, Apr 21, 2018 at 12:00 PM, > umscale.org> wrote: >> >>> Send gpfsug-discuss mailing list submissions to >>> gpfsug-discuss at spectrumscale.org >>> >>> To subscribe or unsubscribe via the World Wide Web, visit >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> or, via email, send a message with subject or body 'help' to >>> gpfsug-discuss-request at spectrumscale.org >>> >>> You can reach the person managing the list at >>> gpfsug-discuss-owner at spectrumscale.org >>> >>> When replying, please edit your Subject line so it is more specific >>> than "Re: Contents of gpfsug-discuss digest..." >>> >>> >>> Today's Topics: >>> >>> 1. Re: UK Meeting - tooling Spectrum Scale (Grunenberg, Renar) >>> 2. Re: UK Meeting - tooling Spectrum Scale >>> (Simon Thompson (IT Research Support)) >>> >>> >>> ---------------------------------------------------------------------- >>> >>> Message: 1 >>> Date: Fri, 20 Apr 2018 14:01:55 +0000 >>> From: "Grunenberg, Renar" >>> To: "'gpfsug-discuss at spectrumscale.org'" >>> >>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>> Message-ID: >>> Content-Type: text/plain; charset="utf-8" >>> >>> Hallo Simon, >>> are there any reason why the link of the presentation from Yong ZY >>> Zheng(Cognitive, ML, Hortonworks) is not linked. >>> >>> Renar Grunenberg >>> Abteilung Informatik ? Betrieb >>> >>> HUK-COBURG >>> Bahnhofsplatz >>> 96444 Coburg >>> Telefon: 09561 96-44110 >>> Telefax: 09561 96-44104 >>> E-Mail: Renar.Grunenberg at huk-coburg.de >>> Internet: www.huk.de >>> ________________________________ >>> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >>> Deutschlands a. G. in Coburg >>> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >>> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >>> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >>> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >>> Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >>> ________________________________ >>> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >>> Informationen. >>> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht >>> irrt?mlich erhalten haben, >>> informieren Sie bitte sofort den Absender und vernichten Sie diese >>> Nachricht. >>> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >>> ist nicht gestattet. >>> >>> This information may contain confidential and/or privileged information. >>> If you are not the intended recipient (or have received this information >>> in error) please notify the >>> sender immediately and destroy this information. >>> Any unauthorized copying, disclosure or distribution of the material in >>> this information is strictly forbidden. >>> ________________________________ >>> -------------- next part -------------- >>> An HTML attachment was scrubbed... >>> URL: >> 0420/91e3d84d/attachment-0001.html> >>> >>> ------------------------------ >>> >>> Message: 2 >>> Date: Fri, 20 Apr 2018 14:12:11 +0000 >>> From: "Simon Thompson (IT Research Support)" >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>> Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6 at bham.ac.uk> >>> Content-Type: text/plain; charset="utf-8" >>> >>> Sorry, it was a typo from my side. >>> >>> The talks that are missing we are chasing for copies of the slides that >>> we can release. 
>>> >>> Simon >>> >>> From: on behalf of " >>> Renar.Grunenberg at huk-coburg.de" >>> Reply-To: "gpfsug-discuss at spectrumscale.org" < >>> gpfsug-discuss at spectrumscale.org> >>> Date: Friday, 20 April 2018 at 15:02 >>> To: "gpfsug-discuss at spectrumscale.org" >> > >>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>> >>> Hallo Simon, >>> are there any reason why the link of the presentation from Yong ZY >>> Zheng(Cognitive, ML, Hortonworks) is not linked. >>> >>> Renar Grunenberg >>> Abteilung Informatik ? Betrieb >>> >>> HUK-COBURG >>> Bahnhofsplatz >>> 96444 Coburg >>> Telefon: >>> >>> 09561 96-44110 >>> >>> Telefax: >>> >>> 09561 96-44104 >>> >>> E-Mail: >>> >>> Renar.Grunenberg at huk-coburg.de >>> >>> Internet: >>> >>> www.huk.de >>> >>> ________________________________ >>> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >>> Deutschlands a. G. in Coburg >>> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >>> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >>> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >>> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >>> Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >>> ________________________________ >>> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >>> Informationen. >>> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht >>> irrt?mlich erhalten haben, >>> informieren Sie bitte sofort den Absender und vernichten Sie diese >>> Nachricht. >>> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >>> ist nicht gestattet. >>> >>> This information may contain confidential and/or privileged information. >>> If you are not the intended recipient (or have received this information >>> in error) please notify the >>> sender immediately and destroy this information. >>> Any unauthorized copying, disclosure or distribution of the material in >>> this information is strictly forbidden. >>> ________________________________ >>> -------------- next part -------------- >>> An HTML attachment was scrubbed... >>> URL: >> 0420/0b8e9ffa/attachment-0001.html> >>> >>> ------------------------------ >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> End of gpfsug-discuss Digest, Vol 75, Issue 34 >>> ********************************************** >>> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Mon Apr 23 06:00:26 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Mon, 23 Apr 2018 05:00:26 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34 In-Reply-To: References: Message-ID: It started for me after upgrade from v4.2.x.x to 5.0.0.1 with RHEL7.4. Strangely not immediately, but 2 days after the upgrade (wednesday evening CET). Also I have some doubts that mount -o tcp will help, since TCP should already be the default transport. Have asked for if we can rather block this serverside using iptables. But, I expect we should get a fix soon, and we?ll stick with v2.3.2 until that. -jf man. 23. apr. 2018 kl. 
01:23 skrev Ray Coetzee : > Hi Jan-Frode > We've been told the same regarding mounts using UDP. > Our exports are already explicitly configured for TCP and the client's > fstab's set to use TCP. > It would be infuriating if the clients are trying UDP first irrespective > of the mount options configured. > > Why the problem started specifically last week for both of us is > interesting. > > Kind regards > > Ray Coetzee > Mob: +44 759 704 7060 > > Skype: ray.coetzee > > Email: coetzee.ray at gmail.com > > > On Mon, Apr 23, 2018 at 12:02 AM, Jan-Frode Myklebust > wrote: > >> >> Yes, I've been struggelig with something similiar this week. Ganesha >> dying with SIGABRT -- nothing else logged. After catching a few coredumps, >> it has been identified as a problem with some udp-communication during >> mounts from solaris clients. Disabling udp as transport on the shares >> serverside didn't help. It was suggested to use "mount -o tcp" or whatever >> the solaris version of this is -- but we haven't tested this. So far the >> downgrade to v2.3.2 has been our workaround. >> >> PMR: 48669,080,678 >> >> >> -jf >> >> >> On Mon, Apr 23, 2018 at 12:38 AM, Ray Coetzee >> wrote: >> >>> Good evening all >>> >>> I'm working with IBM on a PMR where ganesha is segfaulting or causing >>> kernel panics on one group of CES nodes. >>> >>> We have 12 identical CES nodes split into two groups of 6 nodes each & >>> have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was >>> released. >>> >>> Only one group started having issues Monday morning where ganesha would >>> segfault and the mounts would move over to the remaining nodes. >>> The remaining nodes then start to fall over like dominos within minutes >>> or hours to the point that all CES nodes are "failed" according to >>> "mmces node list" and the VIP's are unassigned. >>> >>> Recovering the nodes are extremely finicky and works for a few minutes >>> or hours before segfaulting again. >>> Most times a complete stop of Ganesha on all nodes & then only starting >>> it on two random nodes allow mounts to recover for a while. >>> >>> None of the following has helped: >>> A reboot of all nodes. >>> Refresh CCR config file with mmsdrrestore >>> Remove/add CES from nodes. >>> Reinstall GPFS & protocol rpms >>> Update to 5.0.0-2 >>> Fresh reinstall of a node >>> Network checks out with no dropped packets on either data or export >>> networks. >>> >>> The only temporary fix so far has been to downrev ganesha to 2.3.2 from >>> 2.5.3 on the affected nodes. >>> >>> While waiting for IBM development, has anyone seen something similar >>> maybe? >>> >>> Kind regards >>> >>> Ray Coetzee >>> >>> >>> >>> On Sat, Apr 21, 2018 at 12:00 PM, < >>> gpfsug-discuss-request at spectrumscale.org> wrote: >>> >>>> Send gpfsug-discuss mailing list submissions to >>>> gpfsug-discuss at spectrumscale.org >>>> >>>> To subscribe or unsubscribe via the World Wide Web, visit >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> or, via email, send a message with subject or body 'help' to >>>> gpfsug-discuss-request at spectrumscale.org >>>> >>>> You can reach the person managing the list at >>>> gpfsug-discuss-owner at spectrumscale.org >>>> >>>> When replying, please edit your Subject line so it is more specific >>>> than "Re: Contents of gpfsug-discuss digest..." >>>> >>>> >>>> Today's Topics: >>>> >>>> 1. Re: UK Meeting - tooling Spectrum Scale (Grunenberg, Renar) >>>> 2. 
Re: UK Meeting - tooling Spectrum Scale >>>> (Simon Thompson (IT Research Support)) >>>> >>>> >>>> ---------------------------------------------------------------------- >>>> >>>> Message: 1 >>>> Date: Fri, 20 Apr 2018 14:01:55 +0000 >>>> From: "Grunenberg, Renar" >>>> To: "'gpfsug-discuss at spectrumscale.org'" >>>> >>>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>>> Message-ID: >>>> Content-Type: text/plain; charset="utf-8" >>>> >>>> Hallo Simon, >>>> are there any reason why the link of the presentation from Yong ZY >>>> Zheng(Cognitive, ML, Hortonworks) is not linked. >>>> >>>> Renar Grunenberg >>>> Abteilung Informatik ? Betrieb >>>> >>>> HUK-COBURG >>>> Bahnhofsplatz >>>> 96444 Coburg >>>> Telefon: 09561 96-44110 >>>> Telefax: 09561 96-44104 >>>> E-Mail: Renar.Grunenberg at huk-coburg.de >>>> Internet: www.huk.de >>>> ________________________________ >>>> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >>>> Deutschlands a. G. in Coburg >>>> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >>>> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >>>> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >>>> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >>>> Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >>>> ________________________________ >>>> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >>>> Informationen. >>>> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht >>>> irrt?mlich erhalten haben, >>>> informieren Sie bitte sofort den Absender und vernichten Sie diese >>>> Nachricht. >>>> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >>>> ist nicht gestattet. >>>> >>>> This information may contain confidential and/or privileged information. >>>> If you are not the intended recipient (or have received this >>>> information in error) please notify the >>>> sender immediately and destroy this information. >>>> Any unauthorized copying, disclosure or distribution of the material in >>>> this information is strictly forbidden. >>>> ________________________________ >>>> -------------- next part -------------- >>>> An HTML attachment was scrubbed... >>>> URL: < >>>> http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20180420/91e3d84d/attachment-0001.html >>>> > >>>> >>>> ------------------------------ >>>> >>>> Message: 2 >>>> Date: Fri, 20 Apr 2018 14:12:11 +0000 >>>> From: "Simon Thompson (IT Research Support)" >>>> To: gpfsug main discussion list >>>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>>> Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6 at bham.ac.uk> >>>> Content-Type: text/plain; charset="utf-8" >>>> >>>> Sorry, it was a typo from my side. >>>> >>>> The talks that are missing we are chasing for copies of the slides that >>>> we can release. >>>> >>>> Simon >>>> >>>> From: on behalf of " >>>> Renar.Grunenberg at huk-coburg.de" >>>> Reply-To: "gpfsug-discuss at spectrumscale.org" < >>>> gpfsug-discuss at spectrumscale.org> >>>> Date: Friday, 20 April 2018 at 15:02 >>>> To: "gpfsug-discuss at spectrumscale.org" < >>>> gpfsug-discuss at spectrumscale.org> >>>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >>>> >>>> Hallo Simon, >>>> are there any reason why the link of the presentation from Yong ZY >>>> Zheng(Cognitive, ML, Hortonworks) is not linked. >>>> >>>> Renar Grunenberg >>>> Abteilung Informatik ? 
Betrieb >>>> >>>> HUK-COBURG >>>> Bahnhofsplatz >>>> 96444 Coburg >>>> Telefon: >>>> >>>> 09561 96-44110 >>>> >>>> Telefax: >>>> >>>> 09561 96-44104 >>>> >>>> E-Mail: >>>> >>>> Renar.Grunenberg at huk-coburg.de >>>> >>>> Internet: >>>> >>>> www.huk.de >>>> >>>> ________________________________ >>>> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >>>> Deutschlands a. G. in Coburg >>>> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >>>> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >>>> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >>>> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >>>> Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >>>> ________________________________ >>>> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >>>> Informationen. >>>> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht >>>> irrt?mlich erhalten haben, >>>> informieren Sie bitte sofort den Absender und vernichten Sie diese >>>> Nachricht. >>>> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >>>> ist nicht gestattet. >>>> >>>> This information may contain confidential and/or privileged information. >>>> If you are not the intended recipient (or have received this >>>> information in error) please notify the >>>> sender immediately and destroy this information. >>>> Any unauthorized copying, disclosure or distribution of the material in >>>> this information is strictly forbidden. >>>> ________________________________ >>>> -------------- next part -------------- >>>> An HTML attachment was scrubbed... >>>> URL: < >>>> http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20180420/0b8e9ffa/attachment-0001.html >>>> > >>>> >>>> ------------------------------ >>>> >>>> _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>>> >>>> End of gpfsug-discuss Digest, Vol 75, Issue 34 >>>> ********************************************** >>>> >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Mon Apr 23 11:56:19 2018 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Mon, 23 Apr 2018 16:26:19 +0530 Subject: [gpfsug-discuss] AFM cache re-link In-Reply-To: References: Message-ID: What is the fileset mode ? AFM won't attempt to copy the data back to home if file data already exists (checks if file size, mtime with nano seconds granularity and number of data blocks allocated are same). For example rsync version >= 3.1.0 keeps file mtime in sync with nano seconds granularity. Copy the data from old home to new home and run failover command from cache to avoid resynching the entire data. ~Venkat (vpuvvada at in.ibm.com) From: Nick Savva To: "'gpfsug-discuss at spectrumscale.org'" Date: 04/22/2018 05:48 PM Subject: [gpfsug-discuss] AFM cache re-link Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, I was always preface my questions with an apology first up if this has been covered before. I am curious if anyone has tested relinking an AFM cache to a new home where the new home, old home and cache have the exact same data. What is the behaviour? 
The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case, I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production. Where prod/dr are gpfs filesystems with a replica set of data. Again its to avoid moving TB?s across the link that are already there. Appreciate the help in advance, Nick _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=nXbwwQdO-Ul1CumnSmAKP5UCePJCaBVsley8z-eLJgw&s=Rho3eJsFXeOseZuGqDzP33yLYKUUpyIA1DUGGtmx_LU&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Mon Apr 23 15:10:41 2018 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Mon, 23 Apr 2018 14:10:41 +0000 Subject: [gpfsug-discuss] afmPrepopEnd Callback Message-ID: Good Afternoon AFM Experts, I looked in the manual for afmPreopopEnd event variables I can extract to log something useful after a prefetch event completes. Here is the manual entry: %prepopAlreadyCachedFiles Specifies the number of files that are cached. These number of files are not read into cache because data is same between cache and home. However, when I try to install a callback like this, I get the associated error: # mmaddcallback afmCompletionReport --command /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName %filesetName %prepopCompletedReads %prepopFailedReads %prepopAlreadyCachedFiles %prepopData" mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was specified. mmaddcallback: Command failed. Examine previous error messages to determine cause. I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three %prepop variables listed: %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; %prepopdata ) validCallbackVariable="%prepopData";; Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? Unusual to see the manual ahead of the code ;) Cheers, Luke -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Apr 23 16:08:14 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 23 Apr 2018 15:08:14 +0000 Subject: [gpfsug-discuss] afmPrepopEnd Callback In-Reply-To: References: Message-ID: My very unconsidered and unsupported suggestion would be to edit mmfsfuncs on your test cluster and see if it?s actually implemented further in the code ? Simon From: on behalf of "luke.raimbach at googlemail.com" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Monday, 23 April 2018 at 15:11 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] afmPrepopEnd Callback Good Afternoon AFM Experts, I looked in the manual for afmPreopopEnd event variables I can extract to log something useful after a prefetch event completes. Here is the manual entry: %prepopAlreadyCachedFiles Specifies the number of files that are cached. These number of files are not read into cache because data is same between cache and home. 
However, when I try to install a callback like this, I get the associated error: # mmaddcallback afmCompletionReport --command /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName %filesetName %prepopCompletedReads %prepopFailedReads %prepopAlreadyCachedFiles %prepopData" mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was specified. mmaddcallback: Command failed. Examine previous error messages to determine cause. I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three %prepop variables listed: %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; %prepopdata ) validCallbackVariable="%prepopData";; Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? Unusual to see the manual ahead of the code ;) Cheers, Luke -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Mon Apr 23 20:54:41 2018 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Mon, 23 Apr 2018 19:54:41 +0000 Subject: [gpfsug-discuss] afmPrepopEnd Callback In-Reply-To: References: Message-ID: Hi Simon, Thanks for the consideration. It's a little difficult, though, to give such a flannel answer to a customer, when the manual says one thing and then the supporting code doesn't exist. I had walked through how the callback might might be constructed with the customer and then put together a simple demo script to help them program things in the future. Slightly red faced when I got rejected by the terminal! Can someone from IBM say which callback parameters are actually valid and supported? I'm programming against 4.2.3.8 in this instance. Cheers, Luke. On Mon, 23 Apr 2018, 17:08 Simon Thompson (IT Research Support), < S.J.Thompson at bham.ac.uk> wrote: > My very unconsidered and unsupported suggestion would be to edit mmfsfuncs > on your test cluster and see if it?s actually implemented further in the > code ? > > > > Simon > > > > *From: * on behalf of " > luke.raimbach at googlemail.com" > *Reply-To: *"gpfsug-discuss at spectrumscale.org" < > gpfsug-discuss at spectrumscale.org> > *Date: *Monday, 23 April 2018 at 15:11 > *To: *"gpfsug-discuss at spectrumscale.org" > > *Subject: *[gpfsug-discuss] afmPrepopEnd Callback > > > > Good Afternoon AFM Experts, > > > > I looked in the manual for afmPreopopEnd event variables I can extract to > log something useful after a prefetch event completes. Here is the manual > entry: > > > > %prepopAlreadyCachedFiles > > Specifies the number of files that are cached. > > These number of files are not read into cache > > because data is same between cache and home. > > > > However, when I try to install a callback like this, I get the associated > error: > > > > # mmaddcallback afmCompletionReport --command > /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName > %filesetName %prepopCompletedReads %prepopFailedReads > %prepopAlreadyCachedFiles %prepopData" > > mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was > specified. > > mmaddcallback: Command failed. Examine previous error messages to > determine cause. 
> > > > I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three > %prepop variables listed: > > > > %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; > > %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; > > %prepopdata ) validCallbackVariable="%prepopData";; > > > > Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? > > > > Unusual to see the manual ahead of the code ;) > > > > Cheers, > > Luke > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nick.savva at adventone.com Tue Apr 24 07:47:54 2018 From: nick.savva at adventone.com (Nick Savva) Date: Tue, 24 Apr 2018 06:47:54 +0000 Subject: [gpfsug-discuss] AFM cache re-link In-Reply-To: References: Message-ID: The caches are RO. Thanks that?s exactly what I tested, its just the infocenter threw me when it said it expects the home to be empty?.. This was the command I used mmafmctl cachefs1 failover -j NICKTESTFSET --new-target nfs://10.0.0.142/ibm/scalefs2/fsettest Appreciate the confirmation Nick From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Venkateswara R Puvvada Sent: Monday, 23 April 2018 8:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM cache re-link What is the fileset mode ? AFM won't attempt to copy the data back to home if file data already exists (checks if file size, mtime with nano seconds granularity and number of data blocks allocated are same). For example rsync version >= 3.1.0 keeps file mtime in sync with nano seconds granularity. Copy the data from old home to new home and run failover command from cache to avoid resynching the entire data. ~Venkat (vpuvvada at in.ibm.com) From: Nick Savva > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 04/22/2018 05:48 PM Subject: [gpfsug-discuss] AFM cache re-link Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi all, I was always preface my questions with an apology first up if this has been covered before. I am curious if anyone has tested relinking an AFM cache to a new home where the new home, old home and cache have the exact same data. What is the behaviour? The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case, I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production. Where prod/dr are gpfs filesystems with a replica set of data. Again its to avoid moving TB?s across the link that are already there. Appreciate the help in advance, Nick _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=nXbwwQdO-Ul1CumnSmAKP5UCePJCaBVsley8z-eLJgw&s=Rho3eJsFXeOseZuGqDzP33yLYKUUpyIA1DUGGtmx_LU&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vpuvvada at in.ibm.com Tue Apr 24 08:38:17 2018 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 24 Apr 2018 13:08:17 +0530 Subject: [gpfsug-discuss] afmPrepopEnd Callback In-Reply-To: References: Message-ID: Hi Luke, This issue has been fixed now. You could either request efix or try workaround as suggested by Simon. The following parameters are supported. prepopCompletedReads prepopFailedReads prepopData This one is missing from the mmfsfuncs and is fixed now. prepopAlreadyCachedFiles ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 04/24/2018 01:25 AM Subject: Re: [gpfsug-discuss] afmPrepopEnd Callback Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Simon, Thanks for the consideration. It's a little difficult, though, to give such a flannel answer to a customer, when the manual says one thing and then the supporting code doesn't exist. I had walked through how the callback might might be constructed with the customer and then put together a simple demo script to help them program things in the future. Slightly red faced when I got rejected by the terminal! Can someone from IBM say which callback parameters are actually valid and supported? I'm programming against 4.2.3.8 in this instance. Cheers, Luke. On Mon, 23 Apr 2018, 17:08 Simon Thompson (IT Research Support), < S.J.Thompson at bham.ac.uk> wrote: My very unconsidered and unsupported suggestion would be to edit mmfsfuncs on your test cluster and see if it?s actually implemented further in the code ? Simon From: on behalf of " luke.raimbach at googlemail.com" Reply-To: "gpfsug-discuss at spectrumscale.org" < gpfsug-discuss at spectrumscale.org> Date: Monday, 23 April 2018 at 15:11 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] afmPrepopEnd Callback Good Afternoon AFM Experts, I looked in the manual for afmPreopopEnd event variables I can extract to log something useful after a prefetch event completes. Here is the manual entry: %prepopAlreadyCachedFiles Specifies the number of files that are cached. These number of files are not read into cache because data is same between cache and home. However, when I try to install a callback like this, I get the associated error: # mmaddcallback afmCompletionReport --command /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName %filesetName %prepopCompletedReads %prepopFailedReads %prepopAlreadyCachedFiles %prepopData" mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was specified. mmaddcallback: Command failed. Examine previous error messages to determine cause. I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three %prepop variables listed: %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; %prepopdata ) validCallbackVariable="%prepopData";; Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? 
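As a stop-gap before the efix, a logging callback restricted to the variables mmaddcallback already accepts might look like the sketch below (the script body is illustrative, not IBM-supplied; the --parms values are passed to the script as positional arguments in the order listed):

  #!/bin/bash
  # /var/mmfs/etc/afmPrepopEnd.sh -- minimal prefetch-completion logger (sketch)
  fs="$1"; fileset="$2"; completed="$3"; failed="$4"; data="$5"
  logger -t afmPrepopEnd "fs=$fs fileset=$fileset completedReads=$completed failedReads=$failed prepopData=$data"

The callback would then be registered without the not-yet-accepted %prepopAlreadyCachedFiles variable:

  mmaddcallback afmCompletionReport --command /var/mmfs/etc/afmPrepopEnd.sh \
      --event afmPrepopEnd -N afm \
      --parms "%fsName %filesetName %prepopCompletedReads %prepopFailedReads %prepopData"
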
Unusual to see the manual ahead of the code ;) Cheers, Luke _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=CKY14hxZ-5Ur87lPVdFwwcpuP1lfw-0_vyYhZCcf1pk&s=C058esOcmGSwBjnUblCLIJEpF4CKsXAos0Ap57R6A4Q&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Tue Apr 24 08:42:25 2018 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 24 Apr 2018 13:12:25 +0530 Subject: [gpfsug-discuss] AFM cache re-link In-Reply-To: References: Message-ID: RO cache filesets doesn't support failover command. Is NICKTESTFSET RO mode fileset ? >The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. mmafmctl failover/resync commands does not remove extra files at home, if home is empty this won't be an issue. ~Venkat (vpuvvada at in.ibm.com) From: Nick Savva To: gpfsug main discussion list Date: 04/24/2018 12:18 PM Subject: Re: [gpfsug-discuss] AFM cache re-link Sent by: gpfsug-discuss-bounces at spectrumscale.org The caches are RO. Thanks that?s exactly what I tested, its just the infocenter threw me when it said it expects the home to be empty?.. This was the command I used mmafmctl cachefs1 failover -j NICKTESTFSET --new-target nfs://10.0.0.142/ibm/scalefs2/fsettest Appreciate the confirmation Nick From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Venkateswara R Puvvada Sent: Monday, 23 April 2018 8:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM cache re-link What is the fileset mode ? AFM won't attempt to copy the data back to home if file data already exists (checks if file size, mtime with nano seconds granularity and number of data blocks allocated are same). For example rsync version >= 3.1.0 keeps file mtime in sync with nano seconds granularity. Copy the data from old home to new home and run failover command from cache to avoid resynching the entire data. ~Venkat (vpuvvada at in.ibm.com) From: Nick Savva To: "'gpfsug-discuss at spectrumscale.org'" < gpfsug-discuss at spectrumscale.org> Date: 04/22/2018 05:48 PM Subject: [gpfsug-discuss] AFM cache re-link Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, I was always preface my questions with an apology first up if this has been covered before. I am curious if anyone has tested relinking an AFM cache to a new home where the new home, old home and cache have the exact same data. What is the behaviour? The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case, I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production. Where prod/dr are gpfs filesystems with a replica set of data. Again its to avoid moving TB?s across the link that are already there. 
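Putting the pieces together, the pre-seed-then-failover sequence sketched in this thread would look roughly like the following (the DR source path and target host below are hypothetical; rsync 3.1.0 or later on both ends so that mtimes are carried with nanosecond granularity, which is what the AFM comparison relies on; and note Venkat's caveat above that RO-mode filesets do not support the failover command):

  # seed the new home locally from the DR replica, preserving times, ACLs,
  # xattrs and hard links
  rsync -aHAX --numeric-ids /dr/replica/fsettest/ newhome:/ibm/scalefs2/fsettest/

  # then re-point the cache at the new home (same form as the command above)
  mmafmctl cachefs1 failover -j NICKTESTFSET --new-target nfs://10.0.0.142/ibm/scalefs2/fsettest
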
Appreciate the help in advance, Nick _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=nXbwwQdO-Ul1CumnSmAKP5UCePJCaBVsley8z-eLJgw&s=Rho3eJsFXeOseZuGqDzP33yLYKUUpyIA1DUGGtmx_LU&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=ZSOnMkeNsw6v92UHjeMBC3XPHfpzZlHBMAOJcNpXuNE&s=dZGOYMPF40W5oLiOu-cyilyYzFr4tWalJWKjo1D7PsQ&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Apr 24 10:20:41 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 24 Apr 2018 09:20:41 +0000 Subject: [gpfsug-discuss] Converting a dependent fileset to independent Message-ID: Hi all, Is there any way, without starting over, to convert a dependent fileset to independent? My gut says no but in the spirit of not making unnecessary work I wanted to ask. Also, the documentation states that I should see a "dpnd" next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. [root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L Filesets in file system 'gpfs': Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Tue Apr 24 11:42:47 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Tue, 24 Apr 2018 12:42:47 +0200 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Message-ID: Hello, World, I'd asked this on dW last week but got no reactions: I am unsure on the viability of this approach: In a given file system, there are two internal storage pools and one external (GHI/HPSS). Certain files are created in one internal pool and are to be migrated to the other internal pool about one day after file closure. In addition, all files are to be (pre-)migrated to the external pool within one hour after closure. Hence: typically, blocks of a file being already pre-migrated to external (but not purged on the initial internal pool) are to be moved from one internal pool to the other. A slight indication that there might be some issue with such an approach I've found in a dW post some years back: https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014797943&ps=25 However, time has passed and this might work now or might not. Is there some knowledge about that in the community? If it would not be a reasonable way, we would do the internal migration before the external one, but that imposes a timely dependance we'd not have otherwise. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 
7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From makaplan at us.ibm.com Tue Apr 24 13:38:16 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 24 Apr 2018 08:38:16 -0400 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: To address this question or problem, please be more specific about: 1) commands and policy rules used to perform or control the migrations and pre-migrations. 2) how the commands are scheduled, how often, and/or by events and mmXXcallbacks. 3) how many files are "okay" and how many are ill-placed. 4) how many blocks or KB of storage are ill-placed. (One could determine this by looking at pool free/used blocks stats, then running restripe to correct all ill-placements and then looking at pool free/used stats again.) 5) As commands and migrations are executed, what is happening that you do not understand or that you believe is incorrect? Marc K of GPFS (aka "Mister Mmapplypolicy") From: "Uwe Falke" To: "gpfsug main discussion list" Date: 04/24/2018 06:43 AM Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, World, I'd asked this on dW last week but got no reactions: I am unsure on the viability of this approach: In a given file system, there are two internal storage pools and one external (GHI/HPSS). Certain files are created in one internal pool and are to be migrated to the other internal pool about one day after file closure. In addition, all files are to be (pre-)migrated to the external pool within one hour after closure. Hence: typically, blocks of a file being already pre-migrated to external (but not purged on the initial internal pool) are to be moved from one internal pool to the other. A slight indication that there might be some issue with such an approach I've found in a dW post some years back: https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014797943&ps=25 However, time has passed and this might work now or might not. Is there some knowledge about that in the community? If it would not be a reasonable way, we would do the internal migration before the external one, but that imposes a timely dependance we'd not have otherwise. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? 
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=j-JJhNxx8XhgizXxQDuJolplgYxVRVTS6zalCPshD-0&s=oHJJ8ZT4qE3GTSPKyNRVdybeeCQHTeyoDiLg5CvC5JM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Tue Apr 24 13:49:05 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 24 Apr 2018 08:49:05 -0400 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: To help make sense of this, one has to understand that "independent" means a different range of inode numbers. If you have a set of files within one range of inode numbers, say 3000-5000 and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have. During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories. From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 04/24/2018 05:20 AM Subject: [gpfsug-discuss] Converting a dependent fileset to independent Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, Is there any way, without starting over, to convert a dependent fileset to independent? My gut says no but in the spirit of not making unnecessary work I wanted to ask. Also, the documentation states that I should see a ?dpnd? next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. [root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L Filesets in file system 'gpfs': Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Tue Apr 24 16:08:03 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Tue, 24 Apr 2018 17:08:03 +0200 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: Hi, the system is not in action yet. I am just to plan the migrations right now. premigration would be (not listing excludes here, actual cos and FS name replaced by nn, xxx): /* Migrate all files that are smaller than 1 GB to hpss as aggregates. 
*/ RULE 'toHsm_aggr_cosnn' MIGRATE FROM POOL 'pool1' WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME) TO POOL 'hsm_aggr_cosnn' SHOW ('-s' FILE_SIZE) WHERE path_name like '/xxx/%' AND FILE_SIZE <= 1073741824 AND (CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL '120' MINUTES) Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Marc A Kaplan" To: gpfsug main discussion list Date: 24/04/2018 14:38 Subject: Re: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org To address this question or problem, please be more specific about: 1) commands and policy rules used to perform or control the migrations and pre-migrations. 2) how the commands are scheduled, how often, and/or by events and mmXXcallbacks. 3) how many files are "okay" and how many are ill-placed. 4) how many blocks or KB of storage are ill-placed. (One could determine this by looking at pool free/used blocks stats, then running restripe to correct all ill-placements and then looking at pool free/used stats again.) 5) As commands and migrations are executed, what is happening that you do not understand or that you believe is incorrect? Marc K of GPFS (aka "Mister Mmapplypolicy") From: "Uwe Falke" To: "gpfsug main discussion list" Date: 04/24/2018 06:43 AM Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, World, I'd asked this on dW last week but got no reactions: I am unsure on the viability of this approach: In a given file system, there are two internal storage pools and one external (GHI/HPSS). Certain files are created in one internal pool and are to be migrated to the other internal pool about one day after file closure. In addition, all files are to be (pre-)migrated to the external pool within one hour after closure. Hence: typically, blocks of a file being already pre-migrated to external (but not purged on the initial internal pool) are to be moved from one internal pool to the other. A slight indication that there might be some issue with such an approach I've found in a dW post some years back: https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014797943&ps=25 However, time has passed and this might work now or might not. Is there some knowledge about that in the community? If it would not be a reasonable way, we would do the internal migration before the external one, but that imposes a timely dependance we'd not have otherwise. Mit freundlichen Gr??en / Kind regards Dr. 
Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=j-JJhNxx8XhgizXxQDuJolplgYxVRVTS6zalCPshD-0&s=oHJJ8ZT4qE3GTSPKyNRVdybeeCQHTeyoDiLg5CvC5JM&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From UWEFALKE at de.ibm.com Tue Apr 24 16:20:14 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Tue, 24 Apr 2018 17:20:14 +0200 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: (Sorry, vi commands and my mailing client do not co-operate well ...Pls forgive the pre-mature posting just sent. ) Hi, the system is not in action yet. I am just to plan the migrations right now. 1) + 2) premigration would be (not listing excludes here, actual cos and FS name replaced by nn, xxx), executed once per hour: RULE EXTERNAL POOL 'hsm_aggr_cosnn' EXEC '/opt/hpss/bin/ghi_migrate' OPTS '-a -c nn' /* Migrate all files that are smaller than 1 GB to hpss as aggregates. */ RULE 'toHsm_aggr_cosnn' MIGRATE FROM POOL 'pool1' WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME) TO POOL 'hsm_aggr_cosnn' SHOW ('-s' FILE_SIZE) WHERE path_name like '/xxx/%' AND FILE_SIZE <= 1073741824 AND (CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL '120' MINUTES) Internal Migration was originally intended like RULE "pool0_to_pool1" MIGRATE FROM POOL 'pool0' TO POOL 'pool1' WHERE (CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL '720' MINUTES) to be ran once per day (plus a thrshold-policy to prevent filling up of the relative small pool0 should something go wrong) 3) + 4) as that is not yet set up - no answer. I would like to prevent such things to happen in the first place, therefore asking. 5) I do not understand your question. I am planning the migration set up, and a suppose there might be the risk for trouble when doing internal migrations while haveing data pre-migrated to external. Indeed I am not a developer of the ILM stuff in Scale, so I cannot fully foresee what'll happen. Therefore asking. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 
7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Marc A Kaplan" To: gpfsug main discussion list Date: 24/04/2018 14:38 Subject: Re: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org To address this question or problem, please be more specific about: 1) commands and policy rules used to perform or control the migrations and pre-migrations. 2) how the commands are scheduled, how often, and/or by events and mmXXcallbacks. 3) how many files are "okay" and how many are ill-placed. 4) how many blocks or KB of storage are ill-placed. (One could determine this by looking at pool free/used blocks stats, then running restripe to correct all ill-placements and then looking at pool free/used stats again.) 5) As commands and migrations are executed, what is happening that you do not understand or that you believe is incorrect? Marc K of GPFS (aka "Mister Mmapplypolicy") From: "Uwe Falke" To: "gpfsug main discussion list" Date: 04/24/2018 06:43 AM Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, World, I'd asked this on dW last week but got no reactions: I am unsure on the viability of this approach: In a given file system, there are two internal storage pools and one external (GHI/HPSS). Certain files are created in one internal pool and are to be migrated to the other internal pool about one day after file closure. In addition, all files are to be (pre-)migrated to the external pool within one hour after closure. Hence: typically, blocks of a file being already pre-migrated to external (but not purged on the initial internal pool) are to be moved from one internal pool to the other. A slight indication that there might be some issue with such an approach I've found in a dW post some years back: https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014797943&ps=25 However, time has passed and this might work now or might not. Is there some knowledge about that in the community? If it would not be a reasonable way, we would do the internal migration before the external one, but that imposes a timely dependance we'd not have otherwise. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? 
From makaplan at us.ibm.com  Tue Apr 24 17:00:39 2018
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Tue, 24 Apr 2018 12:00:39 -0400
Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external
In-Reply-To:
References:
Message-ID:

I added support to mmapplypolicy & co. for HPSS (and TSM/HSM) years ago. AFAIK, it works pretty well, and pretty much works in a not-too-surprising way. I believe that besides the Spectrum Scale Admin and Command docs there are some "red book"s to help, and some IBM support people know how to use these features.

When using mmapplypolicy:

a) Migration and pre-migration to an external pool requires at least two rules: an EXTERNAL POOL rule to define the external pool, say 'xpool', and a MIGRATE ... TO POOL 'xpool' rule.

b) Migration between GPFS native pools requires at least one MIGRATE ... FROM POOL 'apool' TO POOL 'bpool' rule.

c) During any one execution of mmapplypolicy, any one particular file will be the subject of at most one MIGRATE rule. In other words, file x will either be (pre)MIGRATEd to an external pool OR MIGRATEd between GPFS pools, BUT not both. (Hmm... well, you could write a custom migration script and name that script in your EXTERNAL POOL rule and do anything you like to each file x that is chosen for "external" MIGRATion... But still just one MIGRATE rule governs file x.)


From makaplan at us.ibm.com  Tue Apr 24 22:10:31 2018
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Tue, 24 Apr 2018 17:10:31 -0400
Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external
In-Reply-To:
References:
Message-ID:

Uwe also asked: whether it is unwise to have the external and the internal migrations in an uncoordinated fashion, so that it might happen that some files have been migrated to external before they undergo migration from one internal pool (pool0) to the other (pool1).

That's up to the admin. IOW, coordinate it as you like or not at all, depending on what you're trying to accomplish. But the admin should understand...

Whether you use mmchattr -P newpool or mmapplypolicy/MIGRATE TO POOL 'newpool' to do an internal, GPFS pool to GPFS pool migration, there are two steps:

A) Mark the newly chosen, preferred newpool in the file's inode. Then, as long as any data blocks are on GPFS disks that are NOT in newpool, the file is considered "ill-placed".

B) Migrate every data block of the file to 'newpool', by allocating a block in newpool, copying a block of data, updating the file's data pointers, etc.

If you say "-I defer", then only (A) is done. You can force (B) later with a restripeXX command.
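A minimal sketch of the deferred variant of (A) and (B), with assumed file system and path names:

# (A) only: record the new target pool in the inode, leave the data blocks where they are
mmchattr -P pool1 -I defer /xxx/somedir/somefile

# ... later, (B) in bulk: move all ill-placed data to its assigned pool
mmrestripefs xxx -p

For a handful of files, mmrestripefile can do the second step on a per-file basis.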
If you default or say "-I yes", then (A) is done and (B) is done as part of the work of the same command (mmchattr or mmapplypolicy). (If the command is interrupted, (B) may happen for some subset of the data blocks, leaving the file "ill-placed".)

Putting "external" storage into the mix -- you can save time and go faster if you migrate completely and directly from the original pool - skip the "internal" migrate! Maybe if you're migrating but leaving a first-block "stub" you'll want to migrate to external first, and then migrate just the one block "internally"...

On the other hand, if you're going to keep the whole file on GPFS storage for a while, but want to free up space in the original pool, you'll want to migrate the data to a newpool at some point... In that case you might want to pre-migrate (make a copy on HSM but not free the GPFS copy) also. Should you pre-migrate from the original pool or the newpool? Your choice! Maybe you arrange things so you pre-migrate while the data is on the faster pool. Maybe it doesn't make much difference, so you don't even think about it anymore, now that you understand that GPFS doesn't care either! ;-)


From UWEFALKE at de.ibm.com  Wed Apr 25 00:10:52 2018
From: UWEFALKE at de.ibm.com (Uwe Falke)
Date: Wed, 25 Apr 2018 01:10:52 +0200
Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external
In-Reply-To:
References:
Message-ID:

Hi, Marc,

thanks. I understand that being premigrated to an external pool should not affect the internal migration of a file.

FYI: This is not the typical "gold - silver - bronze" setup with a one-dimensional migration path. Instead, one of the internal pools (pool0) is used to receive files written in very small records; the other (pool1) is the "normal" pool and receives all other files. Files written to pool0 should move to pool1 once they are closed (i.e. complete), but pool0 has enough capacity to live without off-migration to pool1 for a few days, thus I thought to keep the frequency of that migration to not more than once per day.

The external pool serves as a remote async mirror to achieve some resiliency against FS failures and also unintentional file deletion (metadata / SOBAR backups and file listings to keep the HPSS coordinates of GPFS files are done regularly); only in the long run will data be purged from pool1. Thus, migration to external should be done at shorter intervals.

Sounds like I can go ahead without hesitation.

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist, High Performance Computing Services / Integrated Technology Services / Data Center Services
IBM Deutschland, Rathausstr. 7, 09111 Chemnitz
Phone: +49 371 6978 2165, Mobile: +49 175 575 2877, E-Mail: uwefalke at de.ibm.com
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: Thomas Wolter, Sven Schooß
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
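On the "threshold policy to prevent filling up of pool0" mentioned earlier in the thread, a sketch of what such a safety net might look like; the pool names, percentages and file name below are assumptions.

cat > /var/mmfs/etc/pool0_guard.pol <<'EOF'
/* Emergency drain (illustrative): when pool0 passes 90% occupancy, migrate its
   least recently accessed files to pool1 until occupancy drops to 75%. */
RULE 'pool0_guard' MIGRATE FROM POOL 'pool0'
  THRESHOLD(90,75)
  WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME)
  TO POOL 'pool1'
EOF

# Run on demand, from a lowDiskSpace callback (mmaddcallback), or as a plain scheduled run:
mmapplypolicy xxx -P /var/mmfs/etc/pool0_guard.pol -I yes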
From valdis.kletnieks at vt.edu  Wed Apr 25 01:09:52 2018
From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu)
Date: Tue, 24 Apr 2018 20:09:52 -0400
Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external
In-Reply-To:
References:
Message-ID: <108483.1524614992@turing-police.cc.vt.edu>

On Wed, 25 Apr 2018 01:10:52 +0200, "Uwe Falke" said:

> Instead, one of the internal pools (pool0) is used to receive files
> written in very small records, the other (pool1) is the "normal" pool and
> receives all other files.

How do you arrange that to happen? As we found out on one of our GPFS clusters, you can't use file size as a criterion in a file placement policy, because it has to pick a pool before it knows what the final file size will be.
(The obvious-to-me method is to set filesets pointed at pools, then attach filesets to pathnames, and then tell the users "This path is for small files, this one is for other files" and thwap any who get it wrong with a clue-by-four. ;)


From UWEFALKE at de.ibm.com  Wed Apr 25 08:48:18 2018
From: UWEFALKE at de.ibm.com (Uwe Falke)
Date: Wed, 25 Apr 2018 09:48:18 +0200
Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external
In-Reply-To: <108483.1524614992@turing-police.cc.vt.edu>
References: <108483.1524614992@turing-police.cc.vt.edu>
Message-ID:

Hi,

we rely on some scheme of file names. Splitting by path / fileset does not work here, as small- and large-record data have to be co-located. Small-record files will only be recognised if they carry some magic strings in the file name. This is not a normal user system, but one that ingests data generated automatically, and thus a systematic naming of files is possible to a large extent.

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist, High Performance Computing Services / Integrated Technology Services / Data Center Services
IBM Deutschland, Rathausstr. 7, 09111 Chemnitz
Phone: +49 371 6978 2165, Mobile: +49 175 575 2877, E-Mail: uwefalke at de.ibm.com
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: Thomas Wolter, Sven Schooß
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
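A sketch of a name-based placement policy along those lines; the magic string and device name are assumptions, and placement (SET POOL) rules are installed with mmchpolicy rather than run through mmapplypolicy.

cat > /tmp/placement.pol <<'EOF'
/* Illustrative only: route files carrying a "magic" name pattern to pool0,
   everything else to pool1. Placement rules can test names, filesets and
   owners, but not the final file size. */
RULE 'smallrec' SET POOL 'pool0' WHERE NAME LIKE '%smallrec%'
RULE 'default'  SET POOL 'pool1'
EOF

mmchpolicy xxx /tmp/placement.pol -I test   # validate the rules first
mmchpolicy xxx /tmp/placement.pol           # install as the active policy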
From Ivano.Talamo at psi.ch  Wed Apr 25 09:46:40 2018
From: Ivano.Talamo at psi.ch (Ivano Talamo)
Date: Wed, 25 Apr 2018 10:46:40 +0200
Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3
Message-ID:

Hi all,

I am currently testing the collector shipped with Spectrum Scale 5.0.0-1 together with the latest grafana bridge (version 3). At the UK UG meeting I learned that this is the multi-threaded setup, so hopefully we can get better performance.

But we are having a problem. Our existing grafana dashboards have metrics like e.g. "hostname|CPU|cpu_user". This was working, and it also had a very helpful completion when creating new graphs. After the upgrade these metric names are not recognized anymore, and we are getting the following errors in the grafana bridge log file:

2018-04-25 09:35:24,999 - zimonGrafanaIntf - ERROR - Metric hostnameNetwork|team0|netdev_drops_s cannot be found. Please check if the corresponding sensor is configured

The only way I found to make them work is to use only the real metric name, e.g. "cpu_user", and then use a filter to restrict to a host ('node'='hostname'). The problem is that in many cases the metric is complex, e.g. you want to restrict to a filesystem, to a fileset, to a network interface, and it is not easy to get the field names to be used in the filters.

So my questions are:
- is this supposed to be like that, or can the old metric names be enabled somehow?
- if it has to be like that, how can I get the available field names to use in the filters?

And then I saw this in the new collector config file:

queryport = "9084"
query2port = "9094"

Which one should be used by the bridge?

Thank you,
Ivano


From r.sobey at imperial.ac.uk  Wed Apr 25 10:01:33 2018
From: r.sobey at imperial.ac.uk (Sobey, Richard A)
Date: Wed, 25 Apr 2018 09:01:33 +0000
Subject: [gpfsug-discuss] Converting a dependent fileset to independent
In-Reply-To:
References:
Message-ID:

Hi Marc

Yes, copying the data to a freshly created fileset and unlinking/renaming/relinking is our solution; it was always a long shot expecting to be able to convert it.

Richard

From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Marc A Kaplan
Sent: 24 April 2018 13:49
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent

To help make sense of this, one has to understand that "independent" means a different range of inode numbers.

If you have a set of files within one range of inode numbers, say 3000-5000, and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have.

During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories.

From: "Sobey, Richard A"
To: "'gpfsug-discuss at spectrumscale.org'"
Date: 04/24/2018 05:20 AM
Subject: [gpfsug-discuss] Converting a dependent fileset to independent
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi all,

Is there any way, without starting over, to convert a dependent fileset to independent?

My gut says no but in the spirit of not making unnecessary work I wanted to ask.

Also, the documentation states that I should see a 'dpnd' next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case.
[root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L
Filesets in file system 'gpfs':
Name                           Id   RootInode  ParentId  Created                   InodeSpace  MaxInodes  AllocInodes  Comment
studentrecruitmentandoutreach  241  8700824    0         Wed Feb 14 14:25:49 2018  0           0          0

Thanks
Richard

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


From daniel.kidger at uk.ibm.com  Wed Apr 25 10:42:08 2018
From: daniel.kidger at uk.ibm.com (Daniel Kidger)
Date: Wed, 25 Apr 2018 09:42:08 +0000
Subject: [gpfsug-discuss] Converting a dependent fileset to independent
In-Reply-To:
References:
Message-ID:

An HTML attachment was scrubbed...


From Renar.Grunenberg at huk-coburg.de  Wed Apr 25 11:44:46 2018
From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar)
Date: Wed, 25 Apr 2018 10:44:46 +0000
Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3
In-Reply-To:
References:
Message-ID: <0249ecfd68714722a1d90ec196b711b5@SMXRF105.msg.hukrf.de>

Hallo Ivano,

we changed the bridge port to query2port; this is the multithreaded query port. The bridge in version 3 selects this port automatically if the pmcollector config (/opt/IBM/zimon/ZIMonCollector.cfg) is updated:

# The query port number defaults to 9084.
queryport = "9084"
query2port = "9094"

We use 5.0.0.2 here. What we also changed was, in the datasource panel for the bridge in Grafana, the openTSDB version to ==2.3. Hope this helps.

Regards Renar

Renar Grunenberg
Abteilung Informatik - Betrieb
HUK-COBURG, Bahnhofsplatz, 96444 Coburg
Telefon: 09561 96-44110, Telefax: 09561 96-44104
E-Mail: Renar.Grunenberg at huk-coburg.de
Internet: www.huk.de

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
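Since the bridge emulates the openTSDB HTTP API, the filter form described here can also be tried outside Grafana; a rough sketch follows, where the bridge host, the port 4242 and the tag value "hostname" are assumptions.

curl -s -X POST http://bridgehost:4242/api/query \
     -H 'Content-Type: application/json' \
     -d '{
           "start": "1h-ago",
           "queries": [ {
             "metric": "cpu_user",
             "aggregator": "avg",
             "filters": [
               { "type": "literal_or", "tagk": "node", "filter": "hostname", "groupBy": true }
             ]
           } ]
         }'

Whatever tag keys such a query accepts (for example node) should be the same field names usable in the Grafana filter boxes.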
From Ivano.Talamo at psi.ch  Wed Apr 25 12:37:02 2018
From: Ivano.Talamo at psi.ch (Ivano Talamo)
Date: Wed, 25 Apr 2018 13:37:02 +0200
Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3
In-Reply-To: <0249ecfd68714722a1d90ec196b711b5@SMXRF105.msg.hukrf.de>
References: <0249ecfd68714722a1d90ec196b711b5@SMXRF105.msg.hukrf.de>
Message-ID:

Hello Renar,

I also changed the bridge to openTSDB 2.3 and set it to use query2port. I was only not sure that this was the multi-threaded one.

But are you using the pipe-based metrics (like "hostname|CPU|cpu_user"), or do you use filters?

Thanks,
Ivano
From Renar.Grunenberg at huk-coburg.de  Wed Apr 25 13:11:53 2018
From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar)
Date: Wed, 25 Apr 2018 12:11:53 +0000
Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3
In-Reply-To:
References:
Message-ID: <0815ca423fad48f9b3f149cd2eb1b143@SMXRF105.msg.hukrf.de>

Hallo Ivano,

we use filters only. For cpu_user -> node = pm_filter($byNode)

Regards Renar

Renar Grunenberg
Abteilung Informatik - Betrieb
HUK-COBURG, Bahnhofsplatz, 96444 Coburg
Telefon: 09561 96-44110, E-Mail: Renar.Grunenberg at huk-coburg.de
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


From david_johnson at brown.edu  Wed Apr 25 13:14:27 2018
From: david_johnson at brown.edu (david_johnson at brown.edu)
Date: Wed, 25 Apr 2018 08:14:27 -0400
Subject: [gpfsug-discuss] Converting a dependent fileset to independent
In-Reply-To:
References:
Message-ID:

We use a dependent fileset for each research group / investigator. We do this mainly so we can apply fileset quotas. We tried independent filesets but they were quite inconvenient:
1) a limited number of independent filesets can be created compared to dependent ones
2) the requirement to manage the number of inodes allocated to each and every independent fileset

There may have been other issues, but we create a new dependent fileset whenever a new researcher joins our cluster.
-- ddj
Dave Johnson
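A sketch of that per-group routine; the file system, fileset name and quota figures below are made up.

# Create a dependent fileset (the default when no --inode-space is given),
# link it into the namespace and give it a fileset quota.
mmcrfileset gpfs newpi_lab
mmlinkfileset gpfs newpi_lab -J /gpfs/projects/newpi_lab
mmsetquota gpfs:newpi_lab --block 10T:11T --files 5000000:6000000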
> On Apr 25, 2018, at 5:42 AM, Daniel Kidger wrote:
>
> It would though be a nice-to-have feature: to leave the file data where it was and just move the metadata into its own inode space?
>
> A related question though:
> In what cases do people create new *dependent* filesets? I can't see many cases where an independent fileset would not be just as valid.
> Daniel
From r.sobey at imperial.ac.uk  Wed Apr 25 13:35:00 2018
From: r.sobey at imperial.ac.uk (Sobey, Richard A)
Date: Wed, 25 Apr 2018 12:35:00 +0000
Subject: [gpfsug-discuss] Converting a dependent fileset to independent
In-Reply-To:
References:
Message-ID:

You can apply fileset quotas to independent filesets too, you know. Sorry if that sounds passive-aggressive, not meant to be!

The main reason for us NOT to use dependent filesets is the lack of snapshotting.

Richard
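For comparison, a sketch of the independent variant with a per-fileset snapshot; the names, inode numbers and quota values are assumptions.

mmcrfileset gpfs lab_ind --inode-space new --inode-limit 1000000:500000   # own inode space
mmlinkfileset gpfs lab_ind -J /gpfs/projects/lab_ind
mmsetquota gpfs:lab_ind --block 10T:11T
mmcrsnapshot gpfs daily-20180425 -j lab_ind       # fileset-level snapshot (see mmcrsnapshot for exact syntax on your release)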
From david_johnson at brown.edu  Wed Apr 25 14:07:19 2018
From: david_johnson at brown.edu (David Johnson)
Date: Wed, 25 Apr 2018 09:07:19 -0400
Subject: [gpfsug-discuss] Converting a dependent fileset to independent
In-Reply-To:
References:
Message-ID:

Yes, independent snapshotting would be an issue. However, at the moment we have 570 dependent filesets in our main filesystem, which is not all that far from the limit of 1000 independent filesets per filesystem. There was a thread concerning fileset issues back in February, wondering if the limits had changed between the 4.x and 5.0.0 releases, but it went unanswered. We would love to be able to use the features of independent filesets (quicker traversal by the policy engine, snapshots as you mention, etc.), but the thought that we could run out of them as our user base grows killed that idea.
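On the inode-management chore mentioned earlier in the thread, the allocation of an independent fileset can be inspected and grown later; a sketch with assumed names and numbers:

mmlsfileset gpfs lab_ind -L                              # shows MaxInodes / AllocInodes for the fileset
mmchfileset gpfs lab_ind --inode-limit 2000000:1000000   # raise the ceiling, preallocate more inodes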
From valdis.kletnieks at vt.edu  Wed Apr 25 15:26:08 2018
From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu)
Date: Wed, 25 Apr 2018 10:26:08 -0400
Subject: [gpfsug-discuss] Encryption and ISKLM
Message-ID: <22132.1524666368@turing-police.cc.vt.edu>

We're running GPFS 4.2.3.7 with encryption on disk, LTFS/EE 1.2.6.2 with encryption on tape, and ISKLM 2.6.0.2 to manage the keys. I'm in the middle of researching RHEL patches on the key servers.
Do I want to stay at 2.6.0.2, or go to a later 2.6, or jump to 2.7 or 3.0? I'm not seeing a lot of guidance on that topic....


From truongv at us.ibm.com  Wed Apr 25 18:44:16 2018
From: truongv at us.ibm.com (Truong Vu)
Date: Wed, 25 Apr 2018 13:44:16 -0400
Subject: [gpfsug-discuss] Converting a dependent fileset to independent
In-Reply-To:
References:
Message-ID:

>>> There was a thread concerning fileset issues back in February, wondering
>>> if the limits had changed between 4.x to 5.0.0 releases, but it went unanswered

Regarding the above query, it was answered on 5 Feb, 2018:

> Subject: Re: [gpfsug-discuss] Maximum Number of filesets on GPFS v5?
> Date: Mon, Feb 5, 2018 2:56 PM
> Quoting "Truong Vu":
>>
>> Hi Jamie,
>>
>> The limits are the same in 5.0.0. We'll look into the FAQ.
>>
>> Thanks,
>> Tru.

BTW, the FAQ has been tweaked a bit in this area.

Thanks,
Tru.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From ulmer at ulmer.org Thu Apr 26 03:48:23 2018 From: ulmer at ulmer.org (Stephen Ulmer) Date: Wed, 25 Apr 2018 22:48:23 -0400 Subject: [gpfsug-discuss] Will GPFS recognize re-sized LUNs? Message-ID: <7B09F35D-9172-4A07-8480-5829562629DD@ulmer.org> I?m 80% sure that the answer to this is "no", but I promised a client that I?d get a fresh answer anyway. If one extends a LUN that is under an NSD, and then does the OS-level magic to make that known to everyone that could write to it, can the NSD be extended to use the additional space? I can think of lots of reasons why this would be madness, and the implementation would have very little return, but maybe a large customer or grant demanded it at some point? Liberty, -- Stephen -------------- next part -------------- An HTML attachment was scrubbed... URL: From luis.bolinches at fi.ibm.com Thu Apr 26 07:36:13 2018 From: luis.bolinches at fi.ibm.com (Luis Bolinches) Date: Thu, 26 Apr 2018 09:36:13 +0300 Subject: [gpfsug-discuss] Will GPFS recognize re-sized LUNs? In-Reply-To: <7B09F35D-9172-4A07-8480-5829562629DD@ulmer.org> References: <7B09F35D-9172-4A07-8480-5829562629DD@ulmer.org> Message-ID: Hi You knew the answer, still is no. https://www.mail-archive.com/gpfsug-discuss at spectrumscale.org/msg02249.html -- Yst?v?llisin terveisin / Kind regards / Saludos cordiales / Salutations Luis Bolinches Consultant IT Specialist Mobile Phone: +358503112585 https://www.youracclaim.com/user/luis-bolinches "If you always give you will always have" -- Anonymous From: Stephen Ulmer To: gpfsug main discussion list Date: 26/04/2018 05:58 Subject: [gpfsug-discuss] Will GPFS recognize re-sized LUNs? Sent by: gpfsug-discuss-bounces at spectrumscale.org I?m 80% sure that the answer to this is "no", but I promised a client that I?d get a fresh answer anyway. If one extends a LUN that is under an NSD, and then does the OS-level magic to make that known to everyone that could write to it, can the NSD be extended to use the additional space? I can think of lots of reasons why this would be madness, and the implementation would have very little return, but maybe a large customer or grant demanded it at some point? Liberty, -- Stephen _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=1mZ896psa5caYzBeaugTlc7TtRejJp3uvKYxas3S7Xc&m=8_UIPpKhNk91nrRD8-6YIFZZXAX8-cxWiEUSTFLM_rY&s=AMlIQVIzjj6hG0agQvN2AAev3cj2MXe1AvqEpxMvnNU&e= Ellei edell? ole toisin mainittu: / Unless stated otherwise above: Oy IBM Finland Ab PL 265, 00101 Helsinki, Finland Business ID, Y-tunnus: 0195876-3 Registered in Finland -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Apr 26 15:20:22 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 26 Apr 2018 14:20:22 +0000 Subject: [gpfsug-discuss] Singularity + GPFS Message-ID: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nathan.harper at cfms.org.uk Thu Apr 26 15:35:29 2018 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Thu, 26 Apr 2018 15:35:29 +0100 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> References: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> Message-ID: We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. I'm interested to hear about experience with MPI-IO within Singularity. On 26 April 2018 at 15:20, Oesterlin, Robert wrote: > Anyone (including IBM) doing any work in this area? I would appreciate > hearing from you. > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Thu Apr 26 15:40:52 2018 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Thu, 26 Apr 2018 10:40:52 -0400 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> Message-ID: We do run Singularity + GPFS, on our production HPC clusters. Most of the time things are fine without any issues. However, i do see a significant performance loss when running some applications on singularity containers with GPFS. As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. I am yet to raise a PMR about this with IBM. I have not seen performance degradation for any other kind of IO, but i am not sure. Regards, Lohit On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , wrote: > We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. > > I'm interested to hear about experience with MPI-IO within Singularity. > > > On 26 April 2018 at 15:20, Oesterlin, Robert wrote: > > > Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. > > > > > > Bob Oesterlin > > > Sr Principal Storage Engineer, Nuance > > > > > > > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > -- > Nathan?Harper?//?IT Systems Lead > > > e:?nathan.harper at cfms.org.uk???t:?0117 906 1104??m:? 
0787 551 0891??w:?www.cfms.org.uk > CFMS Services Ltd?//?Bristol & Bath Science Park?//?Dirac Crescent?//?Emersons Green?//?Bristol?//?BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd > CFMS Services Ltd registered office //?43 Queens Square // Bristol // BS1 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Apr 26 15:51:30 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 26 Apr 2018 14:51:30 +0000 Subject: [gpfsug-discuss] Singularity + GPFS Message-ID: Hi Lohit, Nathan Would you be willing to share some more details about your setup? We are just getting started here and I would like to hear about what your configuration looks like. Direct email to me is fine, thanks. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "valleru at cbio.mskcc.org" Reply-To: gpfsug main discussion list Date: Thursday, April 26, 2018 at 9:45 AM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS We do run Singularity + GPFS, on our production HPC clusters. Most of the time things are fine without any issues. However, i do see a significant performance loss when running some applications on singularity containers with GPFS. As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. I am yet to raise a PMR about this with IBM. I have not seen performance degradation for any other kind of IO, but i am not sure. Regards, Lohit On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , wrote: We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. I'm interested to hear about experience with MPI-IO within Singularity. On 26 April 2018 at 15:20, Oesterlin, Robert > wrote: Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. Bob Oesterlin Sr Principal Storage Engineer, Nuance _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Nathan Harper // IT Systems Lead [Image removed by sender.] e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR [Image removed by sender.] CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From daniel.kidger at uk.ibm.com Thu Apr 26 15:51:19 2018 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Thu, 26 Apr 2018 14:51:19 +0000 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: , <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> Message-ID: An HTML attachment was scrubbed... URL: From yguvvala at cambridgecomputer.com Thu Apr 26 15:53:58 2018 From: yguvvala at cambridgecomputer.com (Yugendra Guvvala) Date: Thu, 26 Apr 2018 10:53:58 -0400 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: Message-ID: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> I am interested to learn this too. So please add me sending a direct mail. Thanks, Yugi > On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert wrote: > > Hi Lohit, Nathan > > Would you be willing to share some more details about your setup? We are just getting started here and I would like to hear about what your configuration looks like. Direct email to me is fine, thanks. > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > From: on behalf of "valleru at cbio.mskcc.org" > Reply-To: gpfsug main discussion list > Date: Thursday, April 26, 2018 at 9:45 AM > To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS > > We do run Singularity + GPFS, on our production HPC clusters. > Most of the time things are fine without any issues. > > However, i do see a significant performance loss when running some applications on singularity containers with GPFS. > > As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) > When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. > I am yet to raise a PMR about this with IBM. > I have not seen performance degradation for any other kind of IO, but i am not sure. > > Regards, > Lohit > > On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , wrote: > > We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. > > I'm interested to hear about experience with MPI-IO within Singularity. > > On 26 April 2018 at 15:20, Oesterlin, Robert wrote: > Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > -- > Nathan Harper // IT Systems Lead > > > > e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nathan.harper at cfms.org.uk Thu Apr 26 16:25:54 2018 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Thu, 26 Apr 2018 16:25:54 +0100 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> References: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> Message-ID: Happy to share on the list in case anyone else finds it useful: We use GPFS for home/scratch on our HPC clusters, supporting engineering applications, so 95+% of our jobs are multi-node MPI. We have had some questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with GPFS+MPI-IO in the past that was solved by building the applications against GPFS. If users start using Singularity containers, we then can't guarantee how the contained applications have been built. I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we can break it, before we deploy onto our production systems. Everything seems to be ok under synthetic benchmarks, but I've handed over to one of my chaos monkey users to let him do his worst. On 26 April 2018 at 15:53, Yugendra Guvvala wrote: > I am interested to learn this too. So please add me sending a direct mail. > > Thanks, > Yugi > > On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > > Hi Lohit, Nathan > > > > Would you be willing to share some more details about your setup? We are > just getting started here and I would like to hear about what your > configuration looks like. Direct email to me is fine, thanks. > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > > > *From: * on behalf of " > valleru at cbio.mskcc.org" > *Reply-To: *gpfsug main discussion list > *Date: *Thursday, April 26, 2018 at 9:45 AM > *To: *gpfsug main discussion list > *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS > > > > We do run Singularity + GPFS, on our production HPC clusters. > > Most of the time things are fine without any issues. > > > > However, i do see a significant performance loss when running some > applications on singularity containers with GPFS. > > > > As of now, the applications that have severe performance issues with > singularity on GPFS - seem to be because of ?mmap io?. (Deep learning > applications) > > When i run the same application on bare metal, they seem to have a huge > difference in GPFS IO when compared to running on singularity containers. > > I am yet to raise a PMR about this with IBM. > > I have not seen performance degradation for any other kind of IO, but i am > not sure. > > > Regards, > Lohit > > > On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , > wrote: > > We are running on a test system at the moment, and haven't run into any > issues yet, but so far it's only been 'hello world' and running FIO. > > > > I'm interested to hear about experience with MPI-IO within Singularity. > > > > On 26 April 2018 at 15:20, Oesterlin, Robert > wrote: > > Anyone (including IBM) doing any work in this area? I would appreciate > hearing from you. > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > -- > > *Nathan* *Harper* // IT Systems Lead > > > > [image: Image removed by sender.] 
> > > > *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 > *w: *www.cfms.org.uk > > > > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons > Green // Bristol // BS16 7FR > > [image: Image removed by sender.] > > CFMS Services Ltd is registered in England and Wales No 05742022 - a > subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 > 4QP > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_johnson at brown.edu Thu Apr 26 16:31:32 2018 From: david_johnson at brown.edu (David Johnson) Date: Thu, 26 Apr 2018 11:31:32 -0400 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> Message-ID: <0C11F444-7196-4B07-9646-5062CC8715B1@brown.edu> Regarding MPI-IO, how do you mean ?building the applications against GPFS?? We try to advise our users about things to avoid, but we have some poster-ready ?chaos monkeys? as well, who resist guidance. What apps do your users favor? Molpro is one of our heaviest apps right now. Thanks, ? ddj > On Apr 26, 2018, at 11:25 AM, Nathan Harper wrote: > > Happy to share on the list in case anyone else finds it useful: > > We use GPFS for home/scratch on our HPC clusters, supporting engineering applications, so 95+% of our jobs are multi-node MPI. We have had some questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with GPFS+MPI-IO in the past that was solved by building the applications against GPFS. If users start using Singularity containers, we then can't guarantee how the contained applications have been built. > > I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we can break it, before we deploy onto our production systems. Everything seems to be ok under synthetic benchmarks, but I've handed over to one of my chaos monkey users to let him do his worst. > > On 26 April 2018 at 15:53, Yugendra Guvvala > wrote: > I am interested to learn this too. So please add me sending a direct mail. > > Thanks, > Yugi > > On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert > wrote: > >> Hi Lohit, Nathan >> >> >> >> Would you be willing to share some more details about your setup? We are just getting started here and I would like to hear about what your configuration looks like. Direct email to me is fine, thanks. 
>> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> >> From: > on behalf of "valleru at cbio.mskcc.org " > >> Reply-To: gpfsug main discussion list > >> Date: Thursday, April 26, 2018 at 9:45 AM >> To: gpfsug main discussion list > >> Subject: [EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS >> >> >> >> We do run Singularity + GPFS, on our production HPC clusters. >> >> Most of the time things are fine without any issues. >> >> >> >> However, i do see a significant performance loss when running some applications on singularity containers with GPFS. >> >> >> >> As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) >> >> When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. >> >> I am yet to raise a PMR about this with IBM. >> >> I have not seen performance degradation for any other kind of IO, but i am not sure. >> >> >> Regards, >> Lohit >> >> >> On Apr 26, 2018, 10:35 AM -0400, Nathan Harper >, wrote: >> >> >> We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. >> >> >> >> I'm interested to hear about experience with MPI-IO within Singularity. >> >> >> >> On 26 April 2018 at 15:20, Oesterlin, Robert > wrote: >> >> Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> -- >> >> Nathan Harper // IT Systems Lead >> >> >> >> >> >> e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk >> >> CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR >> >> >> >> CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd >> CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > -- > Nathan Harper // IT Systems Lead > > > > e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 5458 bytes Desc: not available URL: From nathan.harper at cfms.org.uk Thu Apr 26 17:00:56 2018 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Thu, 26 Apr 2018 17:00:56 +0100 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: <0C11F444-7196-4B07-9646-5062CC8715B1@brown.edu> References: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> <0C11F444-7196-4B07-9646-5062CC8715B1@brown.edu> Message-ID: We had an issue with a particular application writing out output in parallel - (I think) including gpfs.h seemed to fix the problem, but we might also have had a clockskew issue on the compute nodes at the same time, so we aren't sure exactly which fixed it. My chaos monkeys aren't those that resist guidance, but instead are the ones that will employ all the tools at their disposal to improve performance. A lot of our applications aren't doing MPI-IO, so my very capable parallel filesystem is idling while a single rank is reading/writing. However, some will hit the filesystem much harder or exercise less used functionality, and I'm keen to make sure that works through Singularity as well. On 26 April 2018 at 16:31, David Johnson wrote: > Regarding MPI-IO, how do you mean ?building the applications against > GPFS?? > We try to advise our users about things to avoid, but we have some > poster-ready > ?chaos monkeys? as well, who resist guidance. What apps do your users > favor? > Molpro is one of our heaviest apps right now. > Thanks, > ? ddj > > > On Apr 26, 2018, at 11:25 AM, Nathan Harper > wrote: > > Happy to share on the list in case anyone else finds it useful: > > We use GPFS for home/scratch on our HPC clusters, supporting engineering > applications, so 95+% of our jobs are multi-node MPI. We have had some > questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with > GPFS+MPI-IO in the past that was solved by building the applications > against GPFS. If users start using Singularity containers, we then can't > guarantee how the contained applications have been built. > > I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we > can break it, before we deploy onto our production systems. Everything > seems to be ok under synthetic benchmarks, but I've handed over to one of > my chaos monkey users to let him do his worst. > > On 26 April 2018 at 15:53, Yugendra Guvvala com> wrote: > >> I am interested to learn this too. So please add me sending a direct >> mail. >> >> Thanks, >> Yugi >> >> On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert < >> Robert.Oesterlin at nuance.com> wrote: >> >> Hi Lohit, Nathan >> >> >> >> Would you be willing to share some more details about your setup? We are >> just getting started here and I would like to hear about what your >> configuration looks like. Direct email to me is fine, thanks. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> >> *From: * on behalf of " >> valleru at cbio.mskcc.org" >> *Reply-To: *gpfsug main discussion list > > >> *Date: *Thursday, April 26, 2018 at 9:45 AM >> *To: *gpfsug main discussion list >> *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS >> >> >> >> We do run Singularity + GPFS, on our production HPC clusters. >> >> Most of the time things are fine without any issues. >> >> >> >> However, i do see a significant performance loss when running some >> applications on singularity containers with GPFS. 
>> >> >> >> As of now, the applications that have severe performance issues with >> singularity on GPFS - seem to be because of ?mmap io?. (Deep learning >> applications) >> >> When i run the same application on bare metal, they seem to have a huge >> difference in GPFS IO when compared to running on singularity containers. >> >> I am yet to raise a PMR about this with IBM. >> >> I have not seen performance degradation for any other kind of IO, but i >> am not sure. >> >> >> Regards, >> Lohit >> >> >> On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , >> wrote: >> >> We are running on a test system at the moment, and haven't run into any >> issues yet, but so far it's only been 'hello world' and running FIO. >> >> >> >> I'm interested to hear about experience with MPI-IO within Singularity. >> >> >> >> On 26 April 2018 at 15:20, Oesterlin, Robert >> wrote: >> >> Anyone (including IBM) doing any work in this area? I would appreciate >> hearing from you. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> >> >> -- >> >> *Nathan* *Harper* // IT Systems Lead >> >> >> >> [image: Image removed by sender.] >> >> >> >> *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 >> *w: *www.cfms.org.uk >> >> >> >> CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons >> Green // Bristol // BS16 7FR >> >> [image: Image removed by sender.] >> >> CFMS Services Ltd is registered in England and Wales No 05742022 - a >> subsidiary of CFMS Ltd >> CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 >> 4QP >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > > > -- > *Nathan Harper* // IT Systems Lead > > > > > *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 > *w: *www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons > Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a > subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 > 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next 
part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Thu Apr 26 19:08:48 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 26 Apr 2018 18:08:48 +0000 Subject: [gpfsug-discuss] Pool migration and replicate Message-ID: Hi all, We'd like to move some data from a non replicated pool to another pool, but keep replication at 1 (the fs default is 2). When using an ILM policy, is the default to keep the current replication or use the fs default? I.e.just wondering if I need to include a "REPLICATE(1)" clause. Also if the data is already migrated to the pool, is it still considered by the policy engine, or should I include FROM POOL...? I.e. just wondering what is the most efficient way to target the files. Thanks Simon From olaf.weiser at de.ibm.com Thu Apr 26 19:53:42 2018 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 26 Apr 2018 11:53:42 -0700 Subject: [gpfsug-discuss] Pool migration and replicate In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From vborcher at linkedin.com Thu Apr 26 19:59:38 2018 From: vborcher at linkedin.com (Vanessa Borcherding) Date: Thu, 26 Apr 2018 18:59:38 +0000 Subject: [gpfsug-discuss] Singularity + GPFS Message-ID: <690DC273-833D-419F-84A0-7EE2EC7700C1@linkedin.biz> Hi All, In my previous life at Weill Cornell, I benchmarked Singularity pretty extensively for bioinformatics applications on a GPFS 4.2 cluster, and saw virtually no overhead whatsoever. However, I did not allow MPI jobs for those workloads, so that may be the key differentiator here. You may wish to reach out to Greg Kurtzer and his team too - they're super responsive on github and have a slack channel that you can join. His email address is gmkurtzer at gmail.com. Vanessa ?On 4/26/18, 9:01 AM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of gpfsug-discuss-request at spectrumscale.org" wrote: Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: Singularity + GPFS (Nathan Harper) ---------------------------------------------------------------------- Message: 1 Date: Thu, 26 Apr 2018 17:00:56 +0100 From: Nathan Harper To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Singularity + GPFS Message-ID: Content-Type: text/plain; charset="utf-8" We had an issue with a particular application writing out output in parallel - (I think) including gpfs.h seemed to fix the problem, but we might also have had a clockskew issue on the compute nodes at the same time, so we aren't sure exactly which fixed it. My chaos monkeys aren't those that resist guidance, but instead are the ones that will employ all the tools at their disposal to improve performance. A lot of our applications aren't doing MPI-IO, so my very capable parallel filesystem is idling while a single rank is reading/writing. However, some will hit the filesystem much harder or exercise less used functionality, and I'm keen to make sure that works through Singularity as well. 
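For anyone reproducing this kind of test, the usual pattern (paths, image names and rank counts below are placeholders) is to bind the GPFS mount into the container and keep mpirun on the host, so the containerised application still does its I/O through the node's GPFS client. The mmap-heavy cases mentioned earlier in the thread are the ones most worth benchmarking both inside and outside the container.

# Bind the GPFS file system into the container; -B/--bind is the standard option
singularity exec --bind /gpfs/gpfs01 myapp.simg ./io_benchmark /gpfs/gpfs01/scratch/run1

# Hybrid MPI model: the host MPI launches one container instance per rank
mpirun -np 64 singularity exec --bind /gpfs/gpfs01 myapp.simg ./mpi_io_app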
On 26 April 2018 at 16:31, David Johnson wrote: > Regarding MPI-IO, how do you mean ?building the applications against > GPFS?? > We try to advise our users about things to avoid, but we have some > poster-ready > ?chaos monkeys? as well, who resist guidance. What apps do your users > favor? > Molpro is one of our heaviest apps right now. > Thanks, > ? ddj > > > On Apr 26, 2018, at 11:25 AM, Nathan Harper > wrote: > > Happy to share on the list in case anyone else finds it useful: > > We use GPFS for home/scratch on our HPC clusters, supporting engineering > applications, so 95+% of our jobs are multi-node MPI. We have had some > questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with > GPFS+MPI-IO in the past that was solved by building the applications > against GPFS. If users start using Singularity containers, we then can't > guarantee how the contained applications have been built. > > I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we > can break it, before we deploy onto our production systems. Everything > seems to be ok under synthetic benchmarks, but I've handed over to one of > my chaos monkey users to let him do his worst. > > On 26 April 2018 at 15:53, Yugendra Guvvala com> wrote: > >> I am interested to learn this too. So please add me sending a direct >> mail. >> >> Thanks, >> Yugi >> >> On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert < >> Robert.Oesterlin at nuance.com> wrote: >> >> Hi Lohit, Nathan >> >> >> >> Would you be willing to share some more details about your setup? We are >> just getting started here and I would like to hear about what your >> configuration looks like. Direct email to me is fine, thanks. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> >> *From: * on behalf of " >> valleru at cbio.mskcc.org" >> *Reply-To: *gpfsug main discussion list > > >> *Date: *Thursday, April 26, 2018 at 9:45 AM >> *To: *gpfsug main discussion list >> *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS >> >> >> >> We do run Singularity + GPFS, on our production HPC clusters. >> >> Most of the time things are fine without any issues. >> >> >> >> However, i do see a significant performance loss when running some >> applications on singularity containers with GPFS. >> >> >> >> As of now, the applications that have severe performance issues with >> singularity on GPFS - seem to be because of ?mmap io?. (Deep learning >> applications) >> >> When i run the same application on bare metal, they seem to have a huge >> difference in GPFS IO when compared to running on singularity containers. >> >> I am yet to raise a PMR about this with IBM. >> >> I have not seen performance degradation for any other kind of IO, but i >> am not sure. >> >> >> Regards, >> Lohit >> >> >> On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , >> wrote: >> >> We are running on a test system at the moment, and haven't run into any >> issues yet, but so far it's only been 'hello world' and running FIO. >> >> >> >> I'm interested to hear about experience with MPI-IO within Singularity. >> >> >> >> On 26 April 2018 at 15:20, Oesterlin, Robert >> wrote: >> >> Anyone (including IBM) doing any work in this area? I would appreciate >> hearing from you. 
>> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> >> >> -- >> >> *Nathan* *Harper* // IT Systems Lead >> >> >> >> [image: Image removed by sender.] >> >> >> >> *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 >> *w: *www.cfms.org.uk >> >> >> >> CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons >> Green // Bristol // BS16 7FR >> >> [image: Image removed by sender.] >> >> CFMS Services Ltd is registered in England and Wales No 05742022 - a >> subsidiary of CFMS Ltd >> CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 >> 4QP >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > > > -- > *Nathan Harper* // IT Systems Lead > > > > > *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 > *w: *www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons > Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a > subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 > 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 75, Issue 56 ********************************************** From makaplan at us.ibm.com Thu Apr 26 21:30:14 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 26 Apr 2018 16:30:14 -0400 Subject: [gpfsug-discuss] Pool migration and replicate In-Reply-To: References: Message-ID: No need to specify REPLICATE(1), but no harm either. No need to specify a FROM POOL, unless you want to restrict the set of files considered. (consider a system with more than two pools...) If a file is already in the target (TO) POOL, then no harm, we just skip over that file. 
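Putting that together, a minimal policy for the case Simon describes might look like the following (the pool names and the age condition are invented for the example; as noted above, REPLICATE(1) and FROM POOL are optional, but they make the intent explicit):

cat > /tmp/migrate.pol <<'EOF'
/* move cold files out of the non-replicated pool, keeping one data copy */
RULE 'toCapacity'
  MIGRATE FROM POOL 'fastdata'
  TO POOL 'capacity'
  REPLICATE(1)
  WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30
EOF

# -I defer would only mark the files; -I yes moves the data now.
# Add -N <nodes or node class> to spread the scan and data movement over more nodes.
mmapplypolicy gpfs01 -P /tmp/migrate.pol -I yes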
From: "Simon Thompson (IT Research Support)" To: gpfsug main discussion list Date: 04/26/2018 02:09 PM Subject: [gpfsug-discuss] Pool migration and replicate Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, We'd like to move some data from a non replicated pool to another pool, but keep replication at 1 (the fs default is 2). When using an ILM policy, is the default to keep the current replication or use the fs default? I.e.just wondering if I need to include a "REPLICATE(1)" clause. Also if the data is already migrated to the pool, is it still considered by the policy engine, or should I include FROM POOL...? I.e. just wondering what is the most efficient way to target the files. Thanks Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=9Ko588DKk_71GheOwRqmDO1vVI24OTUvUBdYv8YHIbU&s=04zxf_-EsPu_LN--gsPx7GEPRsqUW7jIZ1Biov8R3mY&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Fri Apr 27 09:40:44 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Fri, 27 Apr 2018 10:40:44 +0200 Subject: [gpfsug-discuss] GPFS autoload - wait for IB portstobecomeactive In-Reply-To: References: <0081EB235765E14395278B9AE1DF341846510A@MBX214.d.ethz.ch> <4AD44D34-5275-4ADB-8CC7-8E80170DDA7F@brown.edu> Message-ID: Alternative solution we're trying... Create the file /etc/systemd/system/gpfs.service.d/delay.conf containing: [Service] ExecStartPre=/bin/sleep 60 Then I expect we should have long enough delay for infiniband to start before starting gpfs.. -jf On Fri, Mar 16, 2018 at 1:05 PM, Frederick Stock wrote: > I have my doubts that mmdiag can be used in this script. In general the > guidance is to avoid or be very careful with mm* commands in a callback due > to the potential for deadlock. > > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | 720-430-8821 > stockf at us.ibm.com > > > > From: Jan-Frode Myklebust > To: gpfsug main discussion list > Date: 03/16/2018 04:30 AM > > Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports > tobecomeactive > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Thanks Olaf, but we don't use NetworkManager on this cluster.. > > I now created this simple script: > > > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------- > #! /bin/bash - > # > # Fail mmstartup if not all configured IB ports are active. > # > # Install with: > # > # mmaddcallback fail-if-ibfail --command /var/mmfs/etc/fail-if-ibfail > --event preStartup --sync --onerror shutdown > # > > for port in $(/usr/lpp/mmfs/bin/mmdiag --config|grep verbsPorts | cut -f > 4- -d " ") > do > grep -q ACTIVE /sys/class/infiniband/${port%/*}/ports/${port##*/}/state > || exit 1 > done > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------- > > which I haven't tested, but assume should work. Suggestions for > improvements would be much appreciated! 
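One possible refinement (an untested sketch: the HCA and port below are hard-coded placeholders, which also avoids calling mm* commands from the unit given the deadlock concern raised above) is to replace the fixed sleep with a poll of the same sysfs state file:

# /etc/systemd/system/gpfs.service.d/ibwait.conf
[Service]
# wait up to 5 minutes for mlx5_0 port 1 to report ACTIVE before mmstartup runs
ExecStartPre=/bin/bash -c 'for i in $(seq 60); do grep -q ACTIVE /sys/class/infiniband/mlx5_0/ports/1/state && exit 0; sleep 5; done; exit 1'

# then reload systemd so the drop-in is picked up:
# systemctl daemon-reload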
> > > > -jf > > > On Thu, Mar 15, 2018 at 6:30 PM, Olaf Weiser <*olaf.weiser at de.ibm.com* > > wrote: > > you can try : > systemctl enable NetworkManager-wait-online > ln -s '/usr/lib/systemd/system/NetworkManager-wait-online.service' > '/etc/systemd/system/multi-user.target.wants/NetworkManager-wait-online. > service' > > in many cases .. it helps .. > > > > > > From: Jan-Frode Myklebust <*janfrode at tanso.net* > > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 03/15/2018 06:18 PM > Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to > becomeactive > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > I found some discussion on this at > *https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=77777777-0000-0000-0000-000014471957&ps=25* > and > there it's claimed that none of the callback events are early enough to > resolve this. That we need a pre-preStartup trigger. Any idea if this has > changed -- or is the callback option then only to do a "--onerror > shutdown" if it has failed to connect IB ? > > > On Thu, Mar 8, 2018 at 1:42 PM, Frederick Stock <*stockf at us.ibm.com* > > wrote: > You could also use the GPFS prestartup callback (mmaddcallback) to execute > a script synchronously that waits for the IB ports to become available > before returning and allowing GPFS to continue. Not systemd integrated but > it should work. > > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | *720-430-8821* <(720)%20430-8821> > *stockf at us.ibm.com* > > > > From: *david_johnson at brown.edu* > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 03/08/2018 07:34 AM > Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to > become active > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > > Until IBM provides a solution, here is my workaround. Add it so it runs > before the gpfs script, I call it from our custom xcat diskless boot > scripts. Based on rhel7, not fully systemd integrated. YMMV! > > Regards, > ? ddj > ??- > [ddj at storage041 ~]$ cat /etc/init.d/ibready > #! /bin/bash > # > # chkconfig: 2345 06 94 > # /etc/rc.d/init.d/ibready > # written in 2016 David D Johnson (ddj *brown.edu* > > ) > # > ### BEGIN INIT INFO > # Provides: ibready > # Required-Start: > # Required-Stop: > # Default-Stop: > # Description: Block until infiniband is ready > # Short-Description: Block until infiniband is ready > ### END INIT INFO > > RETVAL=0 > if [[ -d /sys/class/infiniband ]] > then > IBDEVICE=$(dirname $(grep -il infiniband > /sys/class/infiniband/*/ports/1/link* | head -n 1)) > fi > # See how we were called. > case "$1" in > start) > if [[ -n $IBDEVICE && -f $IBDEVICE/state ]] > then > echo -n "Polling for InfiniBand link up: " > for (( count = 60; count > 0; count-- )) > do > if grep -q ACTIVE $IBDEVICE/state > then > echo ACTIVE > break > fi > echo -n "." > sleep 5 > done > if (( count <= 0 )) > then > echo DOWN - $0 timed out > fi > fi > ;; > stop|restart|reload|force-reload|condrestart|try-restart) > ;; > status) > if [[ -n $IBDEVICE && -f $IBDEVICE/state ]] > then > echo "$IBDEVICE is $(< $IBDEVICE/state) $(< > $IBDEVICE/rate)" > else > echo "No IBDEVICE found" > fi > ;; > *) > echo "Usage: ibready {start|stop|status|restart| > reload|force-reload|condrestart|try-restart}" > exit 2 > esac > exit ${RETVAL} > ???? 
> > -- ddj > Dave Johnson > > On Mar 8, 2018, at 6:10 AM, Caubet Serrabou Marc (PSI) < > *marc.caubet at psi.ch* > wrote: > > Hi all, > > with autoload = yes we do not ensure that GPFS will be started after the > IB link becomes up. Is there a way to force GPFS waiting to start until IB > ports are up? This can be probably done by adding something like > After=network-online.target and Wants=network-online.target in the systemd > file but I would like to know if this is natively possible from the GPFS > configuration. > > Thanks a lot, > Marc > _________________________________________ > Paul Scherrer Institut > High Performance Computing > Marc Caubet Serrabou > WHGA/036 > 5232 Villigen PSI > Switzerland > > Telephone: *+41 56 310 46 67* <+41%2056%20310%2046%2067> > E-Mail: *marc.caubet at psi.ch* > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=u-EMob09-dkE6jZbD3dTjBi3vWhmDXtxiOK3nqFyIgY&s=JCfJgq6pZnKUI6d-rIgJXVcdZh7vmA5ypB1_goP_FFA&e=* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_ > iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m= > xImYTxt4pm1o5znVn5Vdoka2uxgsTRpmlCGdEWhB9vw&s= > veOZZz80aBzoCTKusx6WOpVlYs64eNkp5pM9kbHgvic&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Mon Apr 30 22:11:35 2018 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Mon, 30 Apr 2018 17:11:35 -0400 Subject: [gpfsug-discuss] Spectrum Scale CES and remote file system mounts Message-ID: <1516de0f-ba2a-40e7-9aa4-d7ea7bae3edf@Spark> Hello All, I read from the below link, that it is now possible to export remote mounts over NFS/SMB. https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_protocoloverremoteclu.htm I am thinking of using a single CES protocol cluster, with remote mounts from 3 storage clusters. May i know, if i will be able to export the 3 remote mounts(from 3 storage clusters) over NFS/SMB from a single CES protocol cluster? 
Because according to the limitations as mentioned in the below link: https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_limitationofprotocolonRMT.htm It says ?You can configure one storage cluster and up to five protocol clusters (current limit).? Regards, Lohit -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Apr 30 22:57:17 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 30 Apr 2018 21:57:17 +0000 Subject: [gpfsug-discuss] Spectrum Scale CES and remote file system mounts In-Reply-To: <1516de0f-ba2a-40e7-9aa4-d7ea7bae3edf@Spark> References: <1516de0f-ba2a-40e7-9aa4-d7ea7bae3edf@Spark> Message-ID: You have been able to do this for some time, though I think it's only just supported. We've been exporting remote mounts since CES was added. At some point we've had two storage clusters supplying data and at least 3 remote file-systems exported over NFS and SMB. One thing to watch, be careful if your CES root is on a remote fs, as if that goes away, so do all CES exports. We do have CES root on a remote fs and it works, just be aware... Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of valleru at cbio.mskcc.org [valleru at cbio.mskcc.org] Sent: 30 April 2018 22:11 To: gpfsug main discussion list Subject: [gpfsug-discuss] Spectrum Scale CES and remote file system mounts Hello All, I read from the below link, that it is now possible to export remote mounts over NFS/SMB. https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_protocoloverremoteclu.htm I am thinking of using a single CES protocol cluster, with remote mounts from 3 storage clusters. May i know, if i will be able to export the 3 remote mounts(from 3 storage clusters) over NFS/SMB from a single CES protocol cluster? Because according to the limitations as mentioned in the below link: https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_limitationofprotocolonRMT.htm It says ?You can configure one storage cluster and up to five protocol clusters (current limit).? Regards, Lohit From knop at us.ibm.com Mon Apr 2 04:16:25 2018 From: knop at us.ibm.com (Felipe Knop) Date: Sun, 1 Apr 2018 23:16:25 -0400 Subject: [gpfsug-discuss] sublocks per block in GPFS 5.0 In-Reply-To: References: <68905b2c-8b1a-4a3d-8ded-c5aa56b765aa@Spark><18518530-0d1f-4937-b2ec-9c16c6c80995@Spark> Message-ID: Folks, Also quoting a previous post: Thanks Mark, I did not know, we could explicitly mention sub-block size when creating File system. It is no-where mentioned in the ?man mmcrfs?. Is this a new GPFS 5.0 feature? Also, i see from the ?man mmcrfs? that the default sub-block size for 8M and 16M is 16K. Specifying the number of subblocks per block or the subblock size in mmcrfs is not currently supported. The subblock size is automatically chosen based on the block size, as described in 'Table 1. Block sizes and subblock sizes' in 'man mmcrfs'. 
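For example (just a sketch; the device name and NSD stanza file below are made up), you would give only the block size at file system creation time and then read back the subblock size that was chosen for you:

  # create the file system with a 16M block size; the subblock size is picked
  # automatically per Table 1 in 'man mmcrfs'
  mmcrfs fs1 -F nsd.stanza -B 16M

  # verify what was chosen: -B shows the block size, -f the minimum fragment (subblock) size
  mmlsfs fs1 -B -f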
Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 (845) 433-9314 T/L 293-9314 From: "Marc A Kaplan" To: gpfsug main discussion list Date: 03/30/2018 02:48 PM Subject: Re: [gpfsug-discuss] sublocks per block in GPFS 5.0 Sent by: gpfsug-discuss-bounces at spectrumscale.org Look at my example, again, closely. I chose the blocksize as 16M and subblock size as 4K and the inodesize as 1K.... Developer works is a good resource, but articles you read there may be incomplete or contain mistakes. The official IBM Spectrum Scale cmd and admin guide documents, are "trustworthy" but may not be perfect in all respects. "Trust but Verify" and YMMV. ;-) As for why/how to choose "good sizes", that depends what objectives you want to achieve, and "optimal" may depend on what hardware you are running. Run your own trials and/or ask performance experts. There are usually "tradeoffs" and OTOH when you get down to it, some choices may not be all-that-important in actual deployment and usage. That's why we have defaults values - try those first and leave the details and tweaking aside until you have good reason ;-) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=oNT2koCZX0xmWlSlLblR9Q&m=6_33HG_HPw9JkUKuyY_SrveiPQ_bnA4JHZ0F7l01ohc&s=HLsts8ySRm-SVYLUNhCt2SxsoP3Ph02ehKmGnqpXbPc&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From scale at us.ibm.com Mon Apr 2 08:11:54 2018 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Mon, 2 Apr 2018 12:41:54 +0530 Subject: [gpfsug-discuss] REST API function for 'mmsmb exportacl list' In-Reply-To: References: Message-ID: Hi Alexander, Markus, Can you please try to answer the below query. Or else forward this to the right folks. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Altenburger Ingo (ID SD)" To: "gpfsug-discuss at spectrumscale.org" Date: 03/29/2018 05:57 PM Subject: [gpfsug-discuss] REST API function for 'mmsmb exportacl list' Sent by: gpfsug-discuss-bounces at spectrumscale.org We were very hopeful to replace our storage provisioning automation based on cli commands with the new functions provided in REST API. Since it seems that almost all protocol related commands are already implemented with 5.0.0.1 REST interface, we have still not found an equivalent for mmsmb exportacl list to get the share permissions of a share. 
Does anybody know that this is already in but not yet documented or is it for sure still not under consideration? Thanks Ingo _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=djjNl-TRGJKujImpbqTzsuNhnILtchzGBzZBdLJbyY0&s=4e6Azge_v1-AApWi_xNPI6V8qSW58ZOxIwFma-A6nss&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From secretary at gpfsug.org Tue Apr 3 11:41:41 2018 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Tue, 03 Apr 2018 11:41:41 +0100 Subject: [gpfsug-discuss] Transforming Workflows at Scale Message-ID: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> Dear all, There's a Spectrum Scale for media breakfast briefing event being organised by IBM at IBM South Bank, London on 17th April (the day before the next UK meeting). The event has been designed for broadcasters, post production houses and visual effects organisations, where managing workflows between different islands of technology is a major challenge. If you're interested, you can read more and register at the IBM Registration Page [1]. Thanks, -- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org Links: ------ [1] https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP -------------- next part -------------- An HTML attachment was scrubbed... URL: From richardb+gpfsUG at ellexus.com Tue Apr 3 12:28:19 2018 From: richardb+gpfsUG at ellexus.com (Richard Booth) Date: Tue, 3 Apr 2018 12:28:19 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 2 In-Reply-To: References: Message-ID: Hi Claire The link at the bottom of your email, doesn't appear to be working. Richard On 3 April 2018 at 12:00, wrote: > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Transforming Workflows at Scale (Secretary GPFS UG) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 03 Apr 2018 11:41:41 +0100 > From: Secretary GPFS UG > To: gpfsug main discussion list > Subject: [gpfsug-discuss] Transforming Workflows at Scale > Message-ID: <037f89ab466334f83f235f357111a9d6 at webmail.gpfsug.org> > Content-Type: text/plain; charset="us-ascii" > > > > Dear all, > > There's a Spectrum Scale for media breakfast briefing event being > organised by IBM at IBM South Bank, London on 17th April (the day before > the next UK meeting). > > The event has been designed for broadcasters, post production houses and > visual effects organisations, where managing workflows between different > islands of technology is a major challenge. 
> > If you're interested, you can read more and register at the IBM > Registration Page [1]. > > Thanks, > -- > > Claire O'Toole > Spectrum Scale/GPFS User Group Secretary > +44 (0)7508 033896 > www.spectrumscaleug.org > > > Links: > ------ > [1] > https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp? > openform&seminar=B223GVES&locale=en_ZZ&cm_mmc= > Email_External-_-Systems_Systems+-+Hybrid+Cloud+ > Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_ > mmca1=000030YP&cm_mmca2=10001939&cvosrc=email. > External.NA&cvo_campaign=000030YP > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: 20180403/302ad054/attachment-0001.html> > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 75, Issue 2 > ********************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From secretary at gpfsug.org Tue Apr 3 12:56:33 2018 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Tue, 03 Apr 2018 12:56:33 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 2 In-Reply-To: References: Message-ID: <026b2aa97247b551b28ea13678484a4b@webmail.gpfsug.org> Hi Richard, My apologies, that is strange. This is the link and I have checked it works: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP [7] If you're still having problems or require further information, please send an e-mail to justine_ive at uk.ibm.com Many thanks, --- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org On , Richard Booth wrote: > Hi Claire > > The link at the bottom of your email, doesn't appear to be working. > > Richard > > On 3 April 2018 at 12:00, wrote: > >> Send gpfsug-discuss mailing list submissions to >> gpfsug-discuss at spectrumscale.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss [1] >> or, via email, send a message with subject or body 'help' to >> gpfsug-discuss-request at spectrumscale.org >> >> You can reach the person managing the list at >> gpfsug-discuss-owner at spectrumscale.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of gpfsug-discuss digest..." >> >> Today's Topics: >> >> 1. Transforming Workflows at Scale (Secretary GPFS UG) >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Tue, 03 Apr 2018 11:41:41 +0100 >> From: Secretary GPFS UG >> To: gpfsug main discussion list >> Subject: [gpfsug-discuss] Transforming Workflows at Scale >> Message-ID: <037f89ab466334f83f235f357111a9d6 at webmail.gpfsug.org> >> Content-Type: text/plain; charset="us-ascii" >> >> Dear all, >> >> There's a Spectrum Scale for media breakfast briefing event being >> organised by IBM at IBM South Bank, London on 17th April (the day before >> the next UK meeting). >> >> The event has been designed for broadcasters, post production houses and >> visual effects organisations, where managing workflows between different >> islands of technology is a major challenge. 
>> >> If you're interested, you can read more and register at the IBM >> Registration Page [1]. >> >> Thanks, >> -- >> >> Claire O'Toole >> Spectrum Scale/GPFS User Group Secretary >> +44 (0)7508 033896 [2] >> www.spectrumscaleug.org [3] >> >> Links: >> ------ >> [1] >> https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP [4] >> -------------- next part -------------- >> An HTML attachment was scrubbed... >> URL: >> >> ------------------------------ >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org [6] >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss [1] >> >> End of gpfsug-discuss Digest, Vol 75, Issue 2 >> ********************************************* > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss [1] Links: ------ [1] http://gpfsug.org/mailman/listinfo/gpfsug-discuss [2] tel:%2B44%20%280%297508%20033896 [3] http://www.spectrumscaleug.org [4] https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&amp;seminar=B223GVES&amp;locale=en_ZZ&amp;cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&amp;cm_mmca1=000030YP&amp;cm_mmca2=10001939&amp;cvosrc=email.External.NA&amp;cvo_campaign=000030YP [5] http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20180403/302ad054/attachment-0001.html [6] http://spectrumscale.org [7] https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP -------------- next part -------------- An HTML attachment was scrubbed... URL: From A.Wolf-Reber at de.ibm.com Tue Apr 3 16:26:45 2018 From: A.Wolf-Reber at de.ibm.com (Alexander Wolf) Date: Tue, 3 Apr 2018 15:26:45 +0000 Subject: [gpfsug-discuss] REST API function for 'mmsmb exportacl list' In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152274503780210.png Type: image/png Size: 1134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152274503780211.png Type: image/png Size: 6645 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152274503780212.png Type: image/png Size: 1134 bytes Desc: not available URL: From john.hearns at asml.com Wed Apr 4 10:11:48 2018 From: john.hearns at asml.com (John Hearns) Date: Wed, 4 Apr 2018 09:11:48 +0000 Subject: [gpfsug-discuss] Dual server NSDs Message-ID: I should say I already have a support ticket open for advice on this issue. We have a filesystem which has NSDs which have two servers defined, for instance: nsd: device=/dev/sdb servers=sn007,sn008 nsd=nsd1 usage=dataOnly Can I remove one of these servers? The object is to upgrade this server and change its hostname, the physical server will stay in place. Has anyone carried out an operation similar to this? 
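What I think the change itself would look like, once the file system is unmounted, is something along these lines (a sketch only; "gpfs01" is a placeholder for our file system name, and I am assuming mmchnsd accepts the same %nsd stanza format as mmcrnsd, so please correct me if that is wrong):

  # unmount everywhere first, since the NSD already belongs to a file system
  mmumount gpfs01 -a

  # change.stanza - nsd1 keeps only sn008 as its server
  %nsd:
    nsd=nsd1
    servers=sn008

  mmchnsd -F change.stanza

  # remount once all affected NSDs have been updated
  mmmount gpfs01 -a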
I guess the documentation here is quite clear: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20server%20balance "If you want to change configuration for a NSD which is already belongs to a file system, you need to unmount the file system before running mmchnsd command." -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Apr 4 19:56:56 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 4 Apr 2018 18:56:56 +0000 Subject: [gpfsug-discuss] Dual server NSDs Message-ID: <59DE0638-07DC-4C10-A981-F4EEE6A60D89@nuance.com> Short answer is that if you want to change/remove the NSD server config on an NSD and its part of a file systems, you need to remove it from the file system or unmount the file system. *Thankfully* this is changed in Scale 5.0. In your case (host name change) ? if the IP address of the NSD server stays the same you *may* be OK. Can you put a DNS alias in for the old host name? Well, now that I think about it the old host name will stick around in the config ? so maybe not such a great idea. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of John Hearns Reply-To: gpfsug main discussion list Date: Wednesday, April 4, 2018 at 1:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Dual server NSDs Can I remove one of these servers? The object is to upgrade this server and change its hostname, the physical server will stay in place. Has anyone carried out an operation similar to this? -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Wed Apr 4 11:02:09 2018 From: john.hearns at asml.com (John Hearns) Date: Wed, 4 Apr 2018 10:02:09 +0000 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname Message-ID: Following up from my previous email (I should reply to that email I know) What we really want to achieve is changing the FQDN of an existing server. The server will be reinstalled with an updated OS (RHEL 6---> RHEL 7) During the move we wish to change the domain name of the server. So we will be taking the server offline and bringing the same physical server back up with a new domain name. Has anyone done a procedure like this? Thankyou -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. 
Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From valdis.kletnieks at vt.edu  Wed Apr  4 20:59:56 2018
From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu)
Date: Wed, 04 Apr 2018 15:59:56 -0400
Subject: [gpfsug-discuss] Dual server NSDs - change of hostname
In-Reply-To: 
References: 
Message-ID: <49633.1522871996@turing-police.cc.vt.edu>

On Wed, 04 Apr 2018 10:02:09 -0000, John Hearns said:
> Has anyone done a procedure like this?

We recently got to rename all 10 nodes in a GPFS cluster to make the unqualified name unique (turned out that having 2 nodes called 'arnsd1.isb.mgt' and 'arnsd1.vtc.mgt' causes all sorts of confusion). So they got renamed to arnsd1-isb.yadda.yadda and arnsd1-vtc.yadda.yadda. Unmount, did the mmchnsd server list thing, start going through the servers, rename and reboot each one.

We did hit a whoopsie because I forgot to fix the list of quorum/manager nodes as we did each node - so don't forget to run mmchnode for each system if/when appropriate...

From Kevin.Buterbaugh at Vanderbilt.Edu  Wed Apr  4 21:50:13 2018
From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L)
Date: Wed, 4 Apr 2018 20:50:13 +0000
Subject: [gpfsug-discuss] Local event
Message-ID: <3C21055E-9268-4679-AB34-6917CAF24087@vanderbilt.edu>

Hi All,

According to the man page for mmaddcallback: A local event triggers a callback only on the node on which the event occurred, such as mounting a file system on one of the nodes.

We have two GPFS clusters here (well, three if you count our small test cluster). Cluster one has 8 NSD servers and one client, which is used only for tape backup - i.e. no one logs on to any of the nodes in the cluster. Files on it are accessed one of three ways: 1) CNFS mount to local computer, 2) SAMBA mount to local computer, 3) GPFS multi-cluster remote mount to cluster two. On cluster one there is a user callback for softQuotaExceeded that e-mails the user - and that we know works.

Cluster two has two local GPFS filesystems and over 600 clients natively mounting those filesystems (it's our HPC cluster). I'm trying to implement a similar callback for softQuotaExceeded events on cluster two as well. I've tested the callback by manually running the (Python) script and passing it in the parameters I want and it works - I get the e-mail. Then I added it via mmaddcallback, but only on the GPFS servers. I did that because I thought that since callbacks work on cluster one with no local access to the GPFS servers that "local" must mean "when an NSD server does a write that puts the user over quota". However, on cluster two the callback is not being triggered.

Does this mean that I actually need to install the callback on every node in cluster two? If so, then how / why are callbacks working on cluster one?

Thanks...

Kevin
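P.S. For reference, this is roughly how I registered it on the servers (a sketch; the callback name and script path here are made up, and the real --parms list passes a few more quota variables to the script):

  mmaddcallback quotaMail --command /usr/local/sbin/quota_mail.py \
      --event softQuotaExceeded --parms "%eventName %fsName" \
      -N nsdNodes

The "-N nsdNodes" part is how I limited it to the GPFS servers; if I understand the man page, leaving -N off registers the callback on all nodes.

--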
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Apr 4 19:52:33 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 4 Apr 2018 18:52:33 +0000 Subject: [gpfsug-discuss] Dual server NSDs In-Reply-To: References: Message-ID: <411E40D4-99AA-4032-B1D7-E16C89FAB0BD@vanderbilt.edu> Hi John, Yes, you can remove one of the servers and yes, we?ve done it and yes, the documentation is clear and correct. ;-) Last time I did this we were in a full cluster downtime, so unmounting wasn?t an issue. We were changing our network architecture and so the IP addresses of all NSD servers save one were changing. It was a bit ? uncomfortable ? for the brief period of time I had to make the one NSD server the one and only NSD server for ~1 PB of storage! But it worked just fine? HTHAL? Kevin On Apr 4, 2018, at 4:11 AM, John Hearns > wrote: I should say I already have a support ticket open for advice on this issue. We have a filesystem which has NSDs which have two servers defined, for instance: nsd: device=/dev/sdb servers=sn007,sn008 nsd=nsd1 usage=dataOnly Can I remove one of these servers? The object is to upgrade this server and change its hostname, the physical server will stay in place. Has anyone carried out an operation similar to this? I guess the documentation here is quite clear: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20server%20balance ?If you want to change configuration for a NSD which is already belongs to a file system, you need to unmount the file system before running mmchnsd command.? -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cf2ffa137afda4368e32708d59a5c513c%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636584643653030858&sdata=Wqpqck%2FuCuzJnolVxElWG6Eky5R%2Bsc4tyvEp6we85Sw%3D&reserved=0 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alevin at gmail.com Wed Apr 4 22:39:08 2018 From: alevin at gmail.com (Alex Levin) Date: Wed, 04 Apr 2018 21:39:08 +0000 Subject: [gpfsug-discuss] Dual server NSDs In-Reply-To: <411E40D4-99AA-4032-B1D7-E16C89FAB0BD@vanderbilt.edu> References: <411E40D4-99AA-4032-B1D7-E16C89FAB0BD@vanderbilt.edu> Message-ID: We are doing the similar procedure right now. Migrating from one group of nsd servers to another. Unfortunately, as I understand, if you can't afford the cluster/filesystem downtime and not ready for 5.0 upgrade yet ( personally I'm not comfortable with ".0" versions of software in production :) ) - the only way to do it is remove disk/nsd from filesystem and add it back with the new servers list. Taking a while , a lot of i/o ... John, in case the single nsd filesystem, I'm afraid, you'll have to unmount it to change .... --Alex On Wed, Apr 4, 2018, 2:25 PM Buterbaugh, Kevin L < Kevin.Buterbaugh at vanderbilt.edu> wrote: > Hi John, > > Yes, you can remove one of the servers and yes, we?ve done it and yes, the > documentation is clear and correct. ;-) > > Last time I did this we were in a full cluster downtime, so unmounting > wasn?t an issue. We were changing our network architecture and so the IP > addresses of all NSD servers save one were changing. It was a bit ? > uncomfortable ? for the brief period of time I had to make the one NSD > server the one and only NSD server for ~1 PB of storage! But it worked > just fine? > > HTHAL? > > Kevin > > On Apr 4, 2018, at 4:11 AM, John Hearns wrote: > > I should say I already have a support ticket open for advice on this issue. > We have a filesystem which has NSDs which have two servers defined, for > instance: > nsd: > device=/dev/sdb > servers=sn007,sn008 > nsd=nsd1 > usage=dataOnly > > Can I remove one of these servers? The object is to upgrade this server > and change its hostname, the physical server will stay in place. > Has anyone carried out an operation similar to this? > > I guess the documentation here is quite clear: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20server%20balance > > ?If you want to change configuration for a NSD which is already belongs > to a file system, you need to unmount the file system before running > mmchnsd command.? > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. Neither the sender nor the > company/group of companies he or she represents shall be liable for the > proper and complete transmission of the information contained in this > communication, or for any delay in its receipt. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cf2ffa137afda4368e32708d59a5c513c%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636584643653030858&sdata=Wqpqck%2FuCuzJnolVxElWG6Eky5R%2Bsc4tyvEp6we85Sw%3D&reserved=0 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Thu Apr 5 02:57:15 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Thu, 5 Apr 2018 03:57:15 +0200 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: Message-ID: Hm, you can change the host name of a Scale node. I've done that a while ago on one or two clusters. >From what I remember I'd follow these steps: 1. Upgrade the OS configuring/using the old IP addr/hostname (2. Reinstall Scale) (3. Replay the cluster data on the node) 4. Create an interface with the new IP address on the node (not necessarily connected) 5. Ensure the node is not required for quorum and has currently no mgr role. You might want to stop Scale on the node. 5. mmchnode -N --daemon-interface ; mmchnode -N --admin-interface . Now the node has kind of disappeared, if the new IF is not yet functional, until you bring that IF up (6.) (6. Activate connection to other cluster nodes via new IF) 2. and 3. are required if scale was removed / the system was re-set up from scratch 6. is required if the new IP connection config.ed in 4 is not operational at first (e.g. not yet linked, or routing not yet active, ...) Et voila, the server should be happy again, if stopped before, start up Scale and check. No warranties, But that's how I'd try. As usual: if messing with IP config, be sure to have a back door to the system in case you ground the OS network config . Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: John Hearns To: gpfsug main discussion list Date: 04/04/2018 21:33 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname Sent by: gpfsug-discuss-bounces at spectrumscale.org Following up from my previous email (I should reply to that email I know) What we really want to achieve is changing the FQDN of an existing server. The server will be reinstalled with an updated OS (RHEL 6-? RHEL 7) During the move we wish to change the domain name of the server. So we will be taking the server offline and bringing the same physical server back up with a new domain name. Has anyone done a procedure like this? 
Thankyou -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From UWEFALKE at de.ibm.com Thu Apr 5 03:25:18 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Thu, 5 Apr 2018 04:25:18 +0200 Subject: [gpfsug-discuss] Local event In-Reply-To: <3C21055E-9268-4679-AB34-6917CAF24087@vanderbilt.edu> References: <3C21055E-9268-4679-AB34-6917CAF24087@vanderbilt.edu> Message-ID: Hi Kevin , I suppose the quota check is done when the writing node allocates blocks to write to. mind: the detour via NSD servers is transparent for that layer, GPFS may switch between SCSI/SAN paths to a (direct-.attached) block device and the NSD service via a separate NSD server, both ways are logically similar for the writing node (or should be for your matter). In short: yes, I think you need to roll out your "quota exceeded" call-back to all nodes in the HPC cluster. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Buterbaugh, Kevin L" To: gpfsug main discussion list Date: 04/04/2018 22:51 Subject: [gpfsug-discuss] Local event Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi All, According to the man page for mmaddcallback: A local event triggers a callback only on the node on which the event occurred, such as mounting a file system on one of the nodes. We have two GPFS clusters here (well, three if you count our small test cluster). Cluster one has 8 NSD servers and one client, which is used only for tape backup ? i.e. no one logs on to any of the nodes in the cluster. Files on it are accessed one of three ways: 1) CNFS mount to local computer, 2) SAMBA mount to local computer, 3) GPFS multi-cluster remote mount to cluster two. On cluster one there is a user callback for softQuotaExceeded that e-mails the user ? and that we know works. 
Cluster two has two local GPFS filesystems and over 600 clients natively mounting those filesystems (it?s our HPC cluster). I?m trying to implement a similar callback for softQuotaExceeded events on cluster two as well. I?ve tested the callback by manually running the (Python) script and passing it in the parameters I want and it works - I get the e-mail. Then I added it via mmcallback, but only on the GPFS servers. I did that because I thought that since callbacks work on cluster one with no local access to the GPFS servers that ?local? must mean ?when an NSD server does a write that puts the user over quota?. However, on cluster two the callback is not being triggered. Does this mean that I actually need to install the callback on every node in cluster two? If so, then how / why are callbacks working on cluster one? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chair at spectrumscale.org Thu Apr 5 10:30:22 2018 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Thu, 05 Apr 2018 10:30:22 +0100 Subject: [gpfsug-discuss] RFE Process ... Burning Issues Message-ID: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> Just a reminder that if you want to submit for the pilot RFE process, submissions must be in by end of next week. Judging by the responses so far, apparently the product is perfect ? Simon From: on behalf of "chair at spectrumscale.org" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Monday, 26 March 2018 at 12:52 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] RFE Process ... Burning Issues Hi All, We?ve been talking with product management about the RFE process and have agreed that we?ll try out a community-voting process. First up, we are piloting this idea, hopefully it will work out, but it may also need tweaks as we move forward. One of the things we?ve been asking for is for a better way for the Spectrum Scale user group community to vote on RFEs. Sure we get people posting to the list, but we?re looking at if we can make it a better/more formal process to support this. Talking with IBM, we also recognise that with a large number of RFEs, it can be difficult for them to track work tasks being completed, but with the community RFEs, there is a commitment to try and track them closely and report back on progress later in the year. To submit an RFE using this process, you must complete the form available at: https://ibm.box.com/v/EnhBlitz (Enhancement Blitz template v1.pptx) The form provides some guidance on a good and bad RFE. Sure a lot of us are techie/engineers, so please try to explain what problem you are solving rather than trying to provide a solution. (i.e. leave the technical implementation details to those with the source code). Each site is limited to 2 submissions and they will be looked over by the Spectrum Scale community leaders, we may ask people to merge requests, send back for more info etc, or there may be some that we know will just never be progressed for various reasons. At the April user group in the UK, we have an RFE (Burning issues) session planned. Submitters of the RFE will be expected to provide a 1-3 minute pitch for their RFE. 
We?ve placed the session at the end of the day (UK time) to try and ensure USA people can participate. Remote presentation of your RFE is fine and we plan to live-stream the session. Each person will have 3 votes to choose what they think are their highest priority requests. Again remote voting is perfectly fine but only 3 votes per person. The requests with the highest number of votes will then be given a higher chance of being implemented. There?s a possibility that some may even make the winter release cycle. Either way, we plan to track the ?chosen? RFEs more closely and provide an update at the November USA meeting (likely the SC18 one). The submission and voting process is also planned to be run again in time for the November meeting. Anyone wanting to submit an RFE for consideration should submit the form by email to rfe at spectrumscaleug.org *before* 13th April. We?ll be posting the submitted RFEs up at the box site as well, you are encouraged to visit the site regularly and check the submissions as you may want to contact the author of an RFE to provide more information/support the RFE. Anything received after this date will be held over to the November cycle. The earlier you submit, the better chance it has of being included (we plan to limit the number to be considered) and will give us time to review the RFE and come back for more information/clarification if needed. You must also be prepared to provide a 1-3 minute pitch for your RFE (in person or remote) for the UK user group meeting. You are welcome to submit any RFE you have already put into the RFE portal for this process to garner community votes for it. There is space on the form to provide the existing RFE number. If you have any comments on the process, you can also email them to rfe at spectrumscaleug.org as well. Thanks to Carl Zeite for supporting this plan? Get submitting! Simon (UK Group Chair) -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Thu Apr 5 11:09:07 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 5 Apr 2018 10:09:07 +0000 Subject: [gpfsug-discuss] UK April meeting Message-ID: It?s now just two weeks until the UK meeting and we are down to our last few places available. If you were planning on attending, please register now! Simon From: on behalf of "chair at spectrumscale.org" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Thursday, 1 March 2018 at 11:26 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] UK April meeting Hi All, We?ve just posted the draft agenda for the UK meeting in April at: http://www.spectrumscaleug.org/event/uk-2018-user-group-event/ So far, we?ve issued over 50% of the available places, so if you are planning to attend, please do register now! Please register at: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-2018-registration-41489952565?aff=MailingList We?ve also confirmed our evening networking/social event between days 1 and 2 with thanks to our sponsors for supporting this. Please remember that we are currently limiting to two registrations per organisation. We?d like to thank our sponsors from DDN, E8, Ellexus, IBM, Lenovo, NEC and OCF for supporting the event. Simon -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From makaplan at us.ibm.com Thu Apr 5 14:37:35 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 5 Apr 2018 09:37:35 -0400 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: Message-ID: To my mind this is simpler: IF you can mmdelnode without too much suffering, do that. Then reconfigure the host name and whatever else you'd like to do. Then mmaddnode... -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Thu Apr 5 15:27:38 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 5 Apr 2018 14:27:38 +0000 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: References: Message-ID: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk> Yeah that was my thoughts too given Bob said you can update the server list for an NSD device in 5.0. I also thought that bringing up a second nic and changing the name etc could bring a whole world or danger from having split routing and rp_filter (been there, had the weirdness, RDMA traffic continues but admin traffic randomly fails, but hey, if you like the world crashing down around you?.) Simon From: on behalf of "makaplan at us.ibm.com" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Thursday, 5 April 2018 at 14:37 To: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] Dual server NSDs - change of hostname To my mind this is simpler: IF you can mmdelnode without too much suffering, do that. Then reconfigure the host name and whatever else you'd like to do. Then mmaddnode... -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Thu Apr 5 16:27:52 2018 From: john.hearns at asml.com (John Hearns) Date: Thu, 5 Apr 2018 15:27:52 +0000 Subject: [gpfsug-discuss] Dual server NSDs - change of hostname In-Reply-To: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk> References: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk> Message-ID: Thankyou everyone for replies on this issue. Very helpful. We have a test setup with three nodes, although no multi-pathed disks. So I can try out removing and replacing disks servers. I agree with Simon that bringing up a second NIC is probably inviting Murphy in to play merry hell? The option we are envisioning is re-installing the server(s) but leaving them with the existing FQDNs if we can. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Thursday, April 05, 2018 4:28 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Dual server NSDs - change of hostname Yeah that was my thoughts too given Bob said you can update the server list for an NSD device in 5.0. I also thought that bringing up a second nic and changing the name etc could bring a whole world or danger from having split routing and rp_filter (been there, had the weirdness, RDMA traffic continues but admin traffic randomly fails, but hey, if you like the world crashing down around you?.) Simon From: > on behalf of "makaplan at us.ibm.com" > Reply-To: "gpfsug-discuss at spectrumscale.org" > Date: Thursday, 5 April 2018 at 14:37 To: "gpfsug-discuss at spectrumscale.org" > Subject: Re: [gpfsug-discuss] Dual server NSDs - change of hostname To my mind this is simpler: IF you can mmdelnode without too much suffering, do that. Then reconfigure the host name and whatever else you'd like to do. Then mmaddnode... 
-- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From UWEFALKE at de.ibm.com  Fri Apr  6 00:12:52 2018
From: UWEFALKE at de.ibm.com (Uwe Falke)
Date: Fri, 6 Apr 2018 01:12:52 +0200
Subject: [gpfsug-discuss] Dual server NSDs - change of hostname
In-Reply-To: 
References: <8228031A-3817-4C7A-A4F4-6B31FAA62DB7@bham.ac.uk>
Message-ID: 

Hi John,
some last thoughts: mmdelnode/mmaddnode is an easy way to move non-NSD servers, but doing so for NSD servers requires running mmchnsd, and that again requires a downtime for the file system the NSDs are part of (in Scale 4 at least, which is what we are talking about here). That could only be circumvented by running mmdeldisk/mmadddisk for the NSDs of the NSD server to be moved (with all the restriping). If that's ok for you, go ahead. Else I think you might give the mmchnode way a second thought. I'd stop GPFS on the server to be moved (although that should also be hot-swappable), which should prevent any havoc for Scale and offers you plenty of opportunity to check your final new network set-up before starting Scale on that renewed node.
YMMV, and you might try different methods on your test system of course.

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
High Performance Computing Services / Integrated Technology Services / Data Center Services
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefalke at de.ibm.com
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: Thomas Wolter, Sven Schooß
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122

From sjhoward at iu.edu  Fri Apr  6 16:20:59 2018
From: sjhoward at iu.edu (Howard, Stewart Jameson)
Date: Fri, 6 Apr 2018 15:20:59 +0000
Subject: [gpfsug-discuss] Experiences with Export Node Transition from 3.5 -> 4.x
Message-ID: <1523028060.8115.12.camel@iu.edu>

Hi All,

We were wondering what the group's experiences have been with upgrading export nodes from 3.5, especially those upgrades that involved a transition from home-grown ADS domain integration to the new CES integration piece. Specifically, we're interested in:

1)  What changes were necessary to make in your domain to get it to interoperate with CES?

2)  Any good tips for CES workarounds in the case of domain configuration that cannot be changed?
3)  Experience with CES user-defined auth mode in particular?  Has anyone got this mode to work successfully?

Let us know.  Thanks!

Stewart Howard
Indiana University

From sjhoward at iu.edu  Fri Apr  6 16:14:48 2018
From: sjhoward at iu.edu (Howard, Stewart Jameson)
Date: Fri, 6 Apr 2018 15:14:48 +0000
Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2
Message-ID: <1523027688.8115.6.camel@iu.edu>

Hi All,

I was wondering what experiences the user group has had with stretch 4.x clusters.  Specifically, we're interested in:

1)  What SS version are you running?

2)  What hardware are you running it on?

3)  What has been your experience with testing of site-failover scenarios (e.g., full power loss at one site, interruption of the inter-site link)?

Thanks so much for your help!

Stewart

From r.sobey at imperial.ac.uk  Fri Apr  6 17:00:09 2018
From: r.sobey at imperial.ac.uk (Sobey, Richard A)
Date: Fri, 6 Apr 2018 16:00:09 +0000
Subject: Re: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2
In-Reply-To: <1523027688.8115.6.camel@iu.edu>
References: <1523027688.8115.6.camel@iu.edu>
Message-ID: 

Hi Stewart

We're running a synchronous replication cluster between our DCs in London and Slough, at a distance of ~63km. The latency is in the order of 700 microseconds over dark fibre.

Honestly... it's been a fine experience. We've never had a full connectivity loss mind you, but we have had to shut down one site fully whilst the other one carried on as normal. An mmrestripe afterwards of course.

We are running Scale version 4.2.3 and looking at v5. Hardware is IBM v3700 storage, IBM rackmount NSD/CES nodes. The storage is connected via FC.

Cheers
Richard

-----Original Message-----
From: gpfsug-discuss-bounces at spectrumscale.org  On Behalf Of Howard, Stewart Jameson
Sent: 06 April 2018 16:15
To: gpfsug-discuss at spectrumscale.org
Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2

Hi All,

I was wondering what experiences the user group has had with stretch 4.x clusters.  Specifically, we're interested in:

1)  What SS version are you running?

2)  What hardware are you running it on?

3)  What has been your experience with testing of site-failover scenarios (e.g., full power loss at one site, interruption of the inter-site link)?

Thanks so much for your help!

Stewart
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From christof.schmitt at us.ibm.com  Fri Apr  6 18:42:53 2018
From: christof.schmitt at us.ibm.com (Christof Schmitt)
Date: Fri, 6 Apr 2018 17:42:53 +0000
Subject: [gpfsug-discuss] Experiences with Export Node Transition from 3.5 -> 4.x
In-Reply-To: <1523028060.8115.12.camel@iu.edu>
References: <1523028060.8115.12.camel@iu.edu>
Message-ID: 

An HTML attachment was scrubbed...
URL: 

From YARD at il.ibm.com  Sat Apr  7 18:27:49 2018
From: YARD at il.ibm.com (Yaron Daniel)
Date: Sat, 7 Apr 2018 20:27:49 +0300
Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2
In-Reply-To: <1523027688.8115.6.camel@iu.edu>
References: <1523027688.8115.6.camel@iu.edu>
Message-ID: 

Hi

We have a few customers that have 2 sites (Active/Active using SS replication) + a 3rd site as Quorum Tie Breaker node.
1) Spectrum Scale 4.2.3.x
2) Lenovo x3650 M4 servers, connected via FC to SVC (Flash900 as external storage)
3) We run all tests before delivering the system to customer production.

Main items to take into account:

1) What is the latency you have between the 2 main sites?
2) What network bandwidth is there between the 2 sites?
3) What is the latency to the 3rd site from each site?
4) Which protocols are planned to be used? Do you have layer 2 between the 2 sites, or layer 3?
5) Do you plan to use a dedicated network for the GPFS daemon?

Regards

Yaron Daniel
Storage Architect, IBM Global Markets, Systems HW Sales
94 Em Ha'Moshavot Rd, Petach Tiqva, 49527, Israel
Phone: +972-3-916-5672
Fax: +972-3-916-5672
Mobile: +972-52-8395593
e-mail: yard at il.ibm.com

From: "Howard, Stewart Jameson" 
To: "gpfsug-discuss at spectrumscale.org" 
Date: 04/06/2018 06:24 PM
Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi All,

I was wondering what experiences the user group has had with stretch 4.x clusters. Specifically, we're interested in:

1) What SS version are you running?

2) What hardware are you running it on?

3) What has been your experience with testing of site-failover scenarios (e.g., full power loss at one site, interruption of the inter-site link)?

Thanks so much for your help!

Stewart
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From valdis.kletnieks at vt.edu  Sun Apr  8 17:21:34 2018
From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu)
Date: Sun, 08 Apr 2018 12:21:34 -0400
Subject: [gpfsug-discuss] Experiences with Synchronous Replication, Stretch Clusters, and Spectrum Scale <= 4.2
In-Reply-To: 
References: <1523027688.8115.6.camel@iu.edu> 
Message-ID: <230460.1523204494@turing-police.cc.vt.edu>

On Sat, 07 Apr 2018 20:27:49 +0300, "Yaron Daniel" said:
> Main items to take into account:
> 1) What is the latency you have between the 2 main sites?
> 2) What network bandwidth is there between the 2 sites?
> 3) What is the latency to the 3rd site from each site?
> 4) Which protocols are planned to be used? Do you have layer 2 between the 2 sites, or layer 3?
> 5) Do you plan to use dedicated network for GPFS daemon ? The answers to most of these questions are a huge "it depends". For instance, the bandwidth needed is dictated by the amount of data being replicated. The cluster I mentioned the other day was filling most of a 10Gbit link while we were importing 5 petabytes of data from our old archive solution, but now often fits its replication needs inside a few hundred mbits/sec. Similarly, the answers to (4) and (5) will depend on what long-haul network infrastructure the customer already has or can purchase. If they have layer 2 capability between the sites, that's an option. If they've just got commodity layer-3, you're designing with layer 3 in mind. If their network has VLAN capability between the sites, or a dedicated link, that will affect the answer for (5). And so on... -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From vpuvvada at in.ibm.com Mon Apr 9 05:52:56 2018 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Mon, 9 Apr 2018 10:22:56 +0530 Subject: [gpfsug-discuss] AFM-DR Questions In-Reply-To: References: Message-ID: Hi, > - Any reason why we changed the Recovery point objective (RPO) snapshots by 15 minutes to 720 minutes in the version 5.0.0 of IBM Spectrum Scale AFM-DR? AFM DR doesn't require RPO snapshots for replication, it is continuous replication. Unless there is a need for crash consistency snapshots (applications like databases need write ordering), RPO interval 15 minutes simply puts load on the system as they have to created and deleted for every 15 minutes. >- Can we use additional Independent Peer-snapshots to reduce the RPO interval (720 minutes) of IBM Spectrum Scale AFM-DR? Yes, command "mmpsnap --rpo" can be used to create RPO snapshots. Some users disable RPO on filesets and cron job is used to create RPO snapshots based on requirement. >- In addition to the above question, can we use these snapshots to update the new primary site after a failover occur for the most up to date snapshot? If applications can failover to live filesystem, it is not required to restore from the snapshot. Applications which needs crash consistency will restore from the latest snapshot during failover. AFM DR maintains at most 2 RPO snapshots. >- According to the documentation, we are not able to replicate Dependent filesets, but if these dependents filesets are under an existing Independent fileset. Do you see any issues/concerns with this? AFM DR doesn't support dependent filesets. Users won't be allowed to create them or convert to AFM DR fileset if they already exists. ~Venkat (vpuvvada at in.ibm.com) From: "Delmar Demarchi" To: gpfsug-discuss at spectrumscale.org Date: 03/29/2018 07:12 PM Subject: [gpfsug-discuss] AFM-DR Questions Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello experts. We have a Scale project with AFM-DR to be implemented and after read the KC documentation, we have some questions about. - Do you know any reason why we changed the Recovery point objective (RPO) snapshots by 15 to 720 minutes in the version 5.0.0 of IBM Spectrum Scale AFM-DR? - Can we use additional Independent Peer-snapshots to reduce the RPO interval (720 minutes) of IBM Spectrum Scale AFM-DR? - In addition to the above question, can we use these snapshots to update the new primary site after a failover occur for the most up to date snapshot? 
- According to the documentation, we are not able to replicate Dependent filesets, but if these dependents filesets are part of an existing Independent fileset. Do you see any issues/concerns with this? Thank you in advance. Delmar Demarchi .'. (delmard at br.ibm.com)_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=ERiLT5aa1e1r1QyLkokJhA1Q5frqqgQ-g90JT0MGQvQ&s=KVjGaS1dG0luvtm0yh4rBpKNbUquTGuf2FSmaNBIOIM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Apr 9 10:00:26 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 9 Apr 2018 09:00:26 +0000 Subject: [gpfsug-discuss] GUI not displaying node info correctly Message-ID: Hi all, We have a fairly significant number of filesets for which the GUI reports nothing at all in the "Max Inodes" column. I've verified with mmlsfileset -I that the inode limit is set. Has anyone seen this already and had a PMR for it? SS 4.2.3-7. Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.roth at de.ibm.com Mon Apr 9 10:38:48 2018 From: stefan.roth at de.ibm.com (Stefan Roth) Date: Mon, 9 Apr 2018 11:38:48 +0200 Subject: [gpfsug-discuss] GUI not displaying node info correctly In-Reply-To: References: Message-ID: Hello Richard, this is a known GUI bug that will be fixed in 4.2.3-8. Once this is available, just upgrade the GUI rpm. The 4.2.3-8 PTF is not yet available, but it should be in next days. This problem happens to all customers with more than 127 filesets, means you see a max inodes value for the first 127 filesets, but not for newer filesets. 
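Until the PTF is out, the per-fileset limits can still be read on the CLI; a quick check along these lines should work (the file system and fileset names below are just placeholders):

  mmlsfileset gpfs0 -L              # MaxInodes / AllocInodes columns for every fileset
  mmlsfileset gpfs0 somefileset -L  # or for a single fileset
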
Mit freundlichen Grüßen / Kind regards

Stefan Roth
Spectrum Scale GUI Development
Phone: +49-7034-643-1362
E-Mail: stefan.roth at de.ibm.com
IBM Deutschland, Am Weiher 24, 65451 Kelsterbach, Germany

IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 09.04.2018 11:01 Subject: [gpfsug-discuss] GUI not displaying node info correctly Sent by: gpfsug-discuss-bounces at spectrumscale.org
Hi all, We have a fairly significant number of filesets for which the GUI reports nothing at all in the "Max Inodes" column. I've verified with mmlsfileset -I that the inode limit is set. Has anyone seen this already and had a PMR for it? SS 4.2.3-7. Thanks Richard
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=FZXEKEUIKKfyO2oYq6outktXzRFzl0eKP2opGp7UNks&s=f3eT53kYib3aoHB5addQ_EyZRmCZM2gtiGsZj6aq2ZM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 19702371.gif Type: image/gif Size: 156 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 19171259.gif Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: 19868035.gif Type: image/gif Size: 63 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Mon Apr 9 11:19:50 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 9 Apr 2018 10:19:50 +0000 Subject: [gpfsug-discuss] GUI not displaying node info correctly In-Reply-To: References: Message-ID: Thanks Stefan, very interesting. Richard From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Stefan Roth Sent: 09 April 2018 10:39 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GUI not displaying node info correctly Hello Richard, this is a known GUI bug that will be fixed in 4.2.3-8. Once this is available, just upgrade the GUI rpm. The 4.2.3-8 PTF is not yet available, but it should be in next days. This problem happens to all customers with more than 127 filesets, means you see a max inodes value for the first 127 filesets, but not for newer filesets. Mit freundlichen Gr??en / Kind regards Stefan Roth Spectrum Scale GUI Development [cid:image002.gif at 01D3CFF4.ABF16450] Phone: +49-7034-643-1362 IBM Deutschland [cid:image003.gif at 01D3CFF4.ABF16450] E-Mail: stefan.roth at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany [cid:image002.gif at 01D3CFF4.ABF16450] IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 [Inactive hide details for "Sobey, Richard A" ---09.04.2018 11:01:01---Hi all, We have a fairly significant number of filesets f]"Sobey, Richard A" ---09.04.2018 11:01:01---Hi all, We have a fairly significant number of filesets for which the GUI reports nothing at all in From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 09.04.2018 11:01 Subject: [gpfsug-discuss] GUI not displaying node info correctly Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi all, We have a fairly significant number of filesets for which the GUI reports nothing at all in the ?Max Inodes? column. I?ve verified with mmlsfileset -I that the inode limit is set. Has anyone seen this already and had a PMR for it? SS 4.2.3-7. Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=FZXEKEUIKKfyO2oYq6outktXzRFzl0eKP2opGp7UNks&s=f3eT53kYib3aoHB5addQ_EyZRmCZM2gtiGsZj6aq2ZM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 166 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 156 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 1851 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.gif Type: image/gif Size: 63 bytes Desc: image004.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image005.gif Type: image/gif Size: 105 bytes Desc: image005.gif URL: From john.hearns at asml.com Mon Apr 9 15:43:21 2018 From: john.hearns at asml.com (John Hearns) Date: Mon, 9 Apr 2018 14:43:21 +0000 Subject: [gpfsug-discuss] Installer cannot find libdbgwrapper70.so In-Reply-To: References: Message-ID: And I have fixed my own issue... In the chroot environment: mount -t proc /proc /proc Rookie mistake. Head hung in shame. But I beg forgiveness. My first comps Sci lecturer, Jennifer Haselgrove at Glasgow, taught us an essential programming technique on day one. Always discuss your program with your cat. Sit down with him or her, and talk them through the algorithm, and any bugs which you have. It is a very effective technique. I thank you all for being stand-in cats. As an aside, I will not be at the London meeting next week. Would be good to put some faces to names, and to seek out beer. I am sure IBMers can point you all in the correct direction for that. From: John Hearns Sent: Monday, April 09, 2018 4:37 PM To: gpfsug main discussion list Subject: Installer cannot find libdbgwrapper70.so I am running the SpectrumScale install package on an chrooted image which is a RHEL 7.3 install (in -text-only mode) It fails with: /usr/lpp/mmfs/4.2.3.7/ibm-java-x86_64-71/jre/bin/java: error while loading shared libraries: libdbgwrapper70.so: cannot open shared object file: In the past I have fixed java issuew with the installer by using the 'alternatives' mechanism to switch to another java. This time this does not work. Ideas please... and thankyou in advance. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Mon Apr 9 15:37:21 2018 From: john.hearns at asml.com (John Hearns) Date: Mon, 9 Apr 2018 14:37:21 +0000 Subject: [gpfsug-discuss] Installer cannot find libdbgwrapper70.so Message-ID: I am running the SpectrumScale install package on an chrooted image which is a RHEL 7.3 install (in -text-only mode) It fails with: /usr/lpp/mmfs/4.2.3.7/ibm-java-x86_64-71/jre/bin/java: error while loading shared libraries: libdbgwrapper70.so: cannot open shared object file: In the past I have fixed java issuew with the installer by using the 'alternatives' mechanism to switch to another java. This time this does not work. Ideas please... and thankyou in advance. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. 
Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL:
From Kevin.Buterbaugh at Vanderbilt.Edu Mon Apr 9 18:17:52 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Mon, 9 Apr 2018 17:17:52 +0000 Subject: [gpfsug-discuss] GPFS GUI - DataPool_capUtil error Message-ID:
Hi All, I'm pretty new to using the GPFS GUI for health and performance monitoring, but am finding it very useful. I've got an issue that I can't figure out. In my events I see:

Event name: pool-data_high_error
Component: File System
Entity type: Pool
Entity name:
Event time: 3/26/18 4:44:10 PM
Message: The pool of file system reached a nearly exhausted data level. DataPool_capUtil
Description: The pool reached a nearly exhausted level.
Cause: The pool reached a nearly exhausted level.
User action: Add more capacity to pool or move data to different pool or delete data and/or snapshots.
Reporting node:
Event type: Active health state of an entity which is monitored by the system.

Now this is for a "capacity" pool - i.e. one that mmapplypolicy is going to fill up to 97% full. Therefore, I've modified the thresholds:

### Threshold Rules ###
rule_name             metric                error   warn    direction  filterBy  groupBy                                             sensitivity
--------------------------------------------------------------------------------------------------------------------------------------------
InodeCapUtil_Rule     Fileset_inode         90.0    80.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_fset_name       300
MemFree_Rule          mem_memfree           50000   100000  low                  node                                                300
MetaDataCapUtil_Rule  MetaDataPool_capUtil  90.0    80.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name   300
DataCapUtil_Rule      DataPool_capUtil      99.0    90.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name   300

But it's still in an "Error" state. I see that the time of the event is March 26th at 4:44 PM, so I'm thinking this is something that's just stale, but I can't figure out how to clear it. The mmhealth command shows the error, too, and from that message it appears as if the event was triggered prior to my adjusting the thresholds:

Event                 Parameter  Severity  Active Since         Event Message
----------------------------------------------------------------------------------------------------------------------------------
pool-data_high_error  redacted   ERROR     2018-03-26 16:44:10  The pool redacted of file system redacted reached a nearly exhausted data level. 90.0

What do I need to do to get the GUI / mmhealth to recognize the new thresholds and clear this error? I've searched and searched in the GUI for a way to clear it. I've read the "Monitoring and Managing IBM Spectrum Scale Using the GUI" redbook pretty much cover to cover and haven't found anything there about how to clear this. Thanks... Kevin
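P.S. For reference, the listings above came from roughly these commands (names and flags from memory, so they may need adjusting):

  mmhealth thresholds list                      # the "### Threshold Rules ###" listing
  mmhealth node show                            # still flags the pool-data_high_error event
  mmhealth node eventlog | grep -i pool-data    # history of when the event was raised
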
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Apr 9 18:20:38 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 9 Apr 2018 17:20:38 +0000 Subject: [gpfsug-discuss] Reminder: SSUG-US Spring meeting - May 16-17th, Cambridge, Ma Message-ID: Only a little over a month away! The registration for the Spring meeting of the SSUG-USA is now open. This is a Free two-day and will include a large number of Spectrum Scale updates and breakout tracks. W have limited meeting space so please register early if you plan on attending. Registration and agenda details: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2018-meeting-tickets-43662759489 DATE AND TIME Wed, May 16, 2018, 9:00 AM ? Thu, May 17, 2018, 5:00 PM EDT LOCATION IBM Cambridge Innovation Center One Rogers Street Cambridge, MA 02142-1203 Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... URL: From nick.savva at adventone.com Mon Apr 9 23:51:05 2018 From: nick.savva at adventone.com (Nick Savva) Date: Mon, 9 Apr 2018 22:51:05 +0000 Subject: [gpfsug-discuss] Device mapper Message-ID: Hi all, Apologies in advance if this has been covered already in discussions. I'm building a new spectrum scale cluster and I am trying to get consistent device names across all nodes. I am attempting to use aliases in the multipath.conf which actually works and creates the /dev/mapper/ link. I understand you can also copy the bindings file but I think aliases is probably easier to maintain. However Spectrum scale will not accept the /dev/mapper device it only looks for dm-X devices that are in the /proc/partitions file. I know SONAS is pointing to Device mapper so there must be a way? Im looking at the /var/mmfs/etc/nsddevices is it a case of editing this file to find the /dev/mapper device? Appreciate the help in advance, Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Tue Apr 10 01:04:12 2018 From: bbanister at jumptrading.com (Bryan Banister) Date: Tue, 10 Apr 2018 00:04:12 +0000 Subject: [gpfsug-discuss] Device mapper In-Reply-To: References: Message-ID: <6c952e81c58940a19114ee1c976501e0@jumptrading.com> Hi Nick, You are correct. You need to update the nsddevices file to look in /dev/mapper. Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Nick Savva Sent: Monday, April 09, 2018 5:51 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Device mapper Note: External Email ________________________________ Hi all, Apologies in advance if this has been covered already in discussions. I'm building a new spectrum scale cluster and I am trying to get consistent device names across all nodes. I am attempting to use aliases in the multipath.conf which actually works and creates the /dev/mapper/ link. I understand you can also copy the bindings file but I think aliases is probably easier to maintain. However Spectrum scale will not accept the /dev/mapper device it only looks for dm-X devices that are in the /proc/partitions file. I know SONAS is pointing to Device mapper so there must be a way? 
Im looking at the /var/mmfs/etc/nsddevices is it a case of editing this file to find the /dev/mapper device? Appreciate the help in advance, Nick ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Tue Apr 10 03:27:13 2018 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Tue, 10 Apr 2018 02:27:13 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 2 In-Reply-To: <026b2aa97247b551b28ea13678484a4b@webmail.gpfsug.org> Message-ID: Claire/ Richard et al. The link works for me also, but I agree that the URL is complex and ugly. I am sure there must be a simpler URL with less embedded metadata that could be used? eg. Cutting it down to this appears to still work: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES Daniel Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales +44-(0)7818 522 266 daniel.kidger at uk.ibm.com > On 3 Apr 2018, at 04:56, Secretary GPFS UG wrote: > > Hi Richard, > > My apologies, that is strange. This is the link and I have checked it works: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP > > If you're still having problems or require further information, please send an e-mail to justine_ive at uk.ibm.com > > Many thanks, > > --- > Claire O'Toole > Spectrum Scale/GPFS User Group Secretary > +44 (0)7508 033896 > www.spectrumscaleug.org >> On , Richard Booth wrote: >> >> Hi Claire >> >> The link at the bottom of your email, doesn't appear to be working. >> >> Richard >> >>> On 3 April 2018 at 12:00, wrote: >>> Send gpfsug-discuss mailing list submissions to >>> gpfsug-discuss at spectrumscale.org >>> >>> To subscribe or unsubscribe via the World Wide Web, visit >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> or, via email, send a message with subject or body 'help' to >>> gpfsug-discuss-request at spectrumscale.org >>> >>> You can reach the person managing the list at >>> gpfsug-discuss-owner at spectrumscale.org >>> >>> When replying, please edit your Subject line so it is more specific >>> than "Re: Contents of gpfsug-discuss digest..." >>> >>> >>> Today's Topics: >>> >>> 1. 
Transforming Workflows at Scale (Secretary GPFS UG) >>> >>> >>> ---------------------------------------------------------------------- >>> >>> Message: 1 >>> Date: Tue, 03 Apr 2018 11:41:41 +0100 >>> From: Secretary GPFS UG >>> To: gpfsug main discussion list >>> Subject: [gpfsug-discuss] Transforming Workflows at Scale >>> Message-ID: <037f89ab466334f83f235f357111a9d6 at webmail.gpfsug.org> >>> Content-Type: text/plain; charset="us-ascii" >>> >>> >>> >>> Dear all, >>> >>> There's a Spectrum Scale for media breakfast briefing event being >>> organised by IBM at IBM South Bank, London on 17th April (the day before >>> the next UK meeting). >>> >>> The event has been designed for broadcasters, post production houses and >>> visual effects organisations, where managing workflows between different >>> islands of technology is a major challenge. >>> >>> If you're interested, you can read more and register at the IBM >>> Registration Page [1]. >>> >>> Thanks, >>> -- >>> >>> Claire O'Toole >>> Spectrum Scale/GPFS User Group Secretary >>> +44 (0)7508 033896 >>> www.spectrumscaleug.org >>> >>> >>> Links: >>> ------ >>> [1] >>> https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP >>> -------------- next part -------------- >>> An HTML attachment was scrubbed... >>> URL: >>> >>> ------------------------------ >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> End of gpfsug-discuss Digest, Vol 75, Issue 2 >>> ********************************************* >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=3aZjjrv3ym45au9B33YgmVP51qvaHXYad4WRjccMOdk&s=rnsXK8Eibl0HLAElxCQexfrV8ReoB8hOYlkk3PmhqN4&e= Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From tortay at cc.in2p3.fr Tue Apr 10 06:51:25 2018 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Tue, 10 Apr 2018 07:51:25 +0200 Subject: [gpfsug-discuss] Device mapper In-Reply-To: References: Message-ID: <0b9f3629-146f-3720-fda8-3d51c0c37614@cc.in2p3.fr> On 10/04/2018 00:51, Nick Savva wrote: > Hi all, > > Apologies in advance if this has been covered already in discussions. > > I'm building a new spectrum scale cluster and I am trying to get consistent device names across all nodes. I am attempting to use aliases in the multipath.conf which actually works and creates the /dev/mapper/ link. > > I understand you can also copy the bindings file but I think aliases is probably easier to maintain. > > However Spectrum scale will not accept the /dev/mapper device it only looks for dm-X devices that are in the /proc/partitions file. 
I know SONAS is pointing to Device mapper so there must be a way? > > Im looking at the /var/mmfs/etc/nsddevices is it a case of editing this file to find the /dev/mapper device? >
Hello, We're doing this, indeed, using the "nsddevices" script. The names printed by the script must be relative to "/dev". Our script contains the following (our multipath aliases are "nsdXY"):

  cd /dev && for nsd in mapper/nsd* ; do
      [ -e $nsd ] && echo "$nsd dmm"
  done
  return 0

The meaning of "dmm" is described in "/usr/lpp/mmfs/bin/mmdevdiscover". Loïc. -- | Loïc Tortay - IN2P3 Computing Centre |
From rohwedder at de.ibm.com Tue Apr 10 08:57:44 2018 From: rohwedder at de.ibm.com (Markus Rohwedder) Date: Tue, 10 Apr 2018 09:57:44 +0200 Subject: [gpfsug-discuss] GPFS GUI - DataPool_capUtil error In-Reply-To: References: Message-ID:
Hello Kevin, it could be that the "hysteresis" parameter is still set to a non-zero value. You can check by using the mmhealth thresholds list --verbose command, or of course by using the Monitor>Thresholds page.
Mit freundlichen Grüßen / Kind regards
Dr. Markus Rohwedder
Spectrum Scale GUI Development
Phone: +49 7034 6430190
E-Mail: rohwedder at de.ibm.com
IBM Deutschland Research & Development, Am Weiher 24, 65451 Kelsterbach, Germany
From: "Buterbaugh, Kevin L" To: gpfsug main discussion list Date: 09.04.2018 19:18 Subject: [gpfsug-discuss] GPFS GUI - DataPool_capUtil error Sent by: gpfsug-discuss-bounces at spectrumscale.org
Hi All, I'm pretty new to using the GPFS GUI for health and performance monitoring, but am finding it very useful. I've got an issue that I can't figure out. In my events I see:

Event name: pool-data_high_error
Component: File System
Entity type: Pool
Entity name:
Event time: 3/26/18 4:44:10 PM
Message: The pool of file system reached a nearly exhausted data level. DataPool_capUtil
Description: The pool reached a nearly exhausted level.
Cause: The pool reached a nearly exhausted level.
User action: Add more capacity to pool or move data to different pool or delete data and/or snapshots.
Reporting node:
Event type: Active health state of an entity which is monitored by the system.

Now this is for a "capacity" pool - i.e. one that mmapplypolicy is going to fill up to 97% full. Therefore, I've modified the thresholds:

### Threshold Rules ###
rule_name             metric                error   warn    direction  filterBy  groupBy                                             sensitivity
--------------------------------------------------------------------------------------------------------------------------------------------
InodeCapUtil_Rule     Fileset_inode         90.0    80.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_fset_name       300
MemFree_Rule          mem_memfree           50000   100000  low                  node                                                300
MetaDataCapUtil_Rule  MetaDataPool_capUtil  90.0    80.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name   300
DataCapUtil_Rule      DataPool_capUtil      99.0    90.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name   300

But it's still in an "Error" state. I see that the time of the event is March 26th at 4:44 PM, so I'm thinking this is something that's just stale, but I can't figure out how to clear it.
The mmhealth command shows the error, too, and from that message it appears as if the event was triggered prior to my adjusting the thresholds: Event Parameter Severity Active Since Event Message ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- pool-data_high_error redacted ERROR 2018-03-26 16:44:10 The pool redacted of file system redacted reached a nearly exhausted data level. 90.0 What do I need to do to get the GUI / mmhealth to recognize the new thresholds and clear this error? I?ve searched and searched in the GUI for a way to clear it. I?ve read the ?Monitoring and Managing IBM Spectrum Scale Using the GUI? rebook pretty much cover to cover and haven?t found anything there about how to clear this. Thanks... Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=l6AoS-QQpHgDtZkWluGw6Lln0PEOyUeS1ujJR2o1Hjg&s=X6bQXF1YmSSq1QyOkQXHYF1NMhczdJSPtWL4fpjbZ24&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 1A990285.gif Type: image/gif Size: 4659 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Tue Apr 10 09:55:30 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 10 Apr 2018 08:55:30 +0000 Subject: [gpfsug-discuss] CES SMB export limit Message-ID: Is there a limit to the number of SMB exports we can create in CES? Figures being thrown around here suggest 256 but we'd like to know for sure. Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroche at lenovo.com Tue Apr 10 11:13:49 2018 From: jroche at lenovo.com (Jim Roche) Date: Tue, 10 Apr 2018 10:13:49 +0000 Subject: [gpfsug-discuss] Transforming Workflows at Scale In-Reply-To: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> References: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> Message-ID: Hi Claire, Can I add a registration to the Lenovo listing please? One of our technical architects from Israel would like to attend the event. Can we add: Gilad Berman HPC Architect Lenovo EMEA [Phone]+972-52-2554262 [Email]gberman at lenovo.com To the Attendee list? 
Thanks, Jim [http://lenovocentral.lenovo.com/marketing/branding/email_signature/images/gradient.gif] Jim Roche UK HPC Technical Sales Leader Discovery House 18 Bartley Wood Business Park Hook, RG27 9XA Lenovo United Kingdom [Phone]+44 (0)7702 678579 [Email]jroche at lenovo.com Lenovo.com /uk Twitter | Facebook | Instagram | Blogs | Forums [DifferentBetter-Laser] From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Secretary GPFS UG Sent: Tuesday, April 3, 2018 11:42 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Transforming Workflows at Scale Dear all, There's a Spectrum Scale for media breakfast briefing event being organised by IBM at IBM South Bank, London on 17th April (the day before the next UK meeting). The event has been designed for broadcasters, post production houses and visual effects organisations, where managing workflows between different islands of technology is a major challenge. If you're interested, you can read more and register at the IBM Registration Page. Thanks, -- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 92 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 128 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 1899 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.gif Type: image/gif Size: 7770 bytes Desc: image004.gif URL: From jroche at lenovo.com Tue Apr 10 11:30:37 2018 From: jroche at lenovo.com (Jim Roche) Date: Tue, 10 Apr 2018 10:30:37 +0000 Subject: [gpfsug-discuss] Transforming Workflows at Scale In-Reply-To: References: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> Message-ID: Hi All, sorry for the spam?. Finger troubles. ? Jim [http://lenovocentral.lenovo.com/marketing/branding/email_signature/images/gradient.gif] Jim Roche UK HPC Technical Sales Leader Discovery House 18 Bartley Wood Business Park Hook, RG27 9XA Lenovo United Kingdom [Phone]+44 (0)7702 678579 [Email]jroche at lenovo.com Lenovo.com /uk Twitter | Facebook | Instagram | Blogs | Forums [DifferentBetter-Laser] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 1899 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image005.gif Type: image/gif Size: 92 bytes Desc: image005.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image006.gif Type: image/gif Size: 128 bytes Desc: image006.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image007.gif Type: image/gif Size: 7770 bytes Desc: image007.gif URL: From carlz at us.ibm.com Tue Apr 10 16:33:54 2018 From: carlz at us.ibm.com (Carl Zetie) Date: Tue, 10 Apr 2018 15:33:54 +0000 Subject: [gpfsug-discuss] CES SMB export limit In-Reply-To: References: Message-ID: Hi Richard, KC says "IBM Spectrum Scale? can host a maximum of 1,000 SMB shares. 
There must be less than 3,000 SMB connections per protocol node and less than 20,000 SMB connections across all protocol nodes." Are those the numbers you are looking for?
Carl Zetie Offering Manager for Spectrum Scale, IBM (540) 882 9353 ][ Research Triangle Park carlz at us.ibm.com
From aaron.s.knister at nasa.gov Tue Apr 10 17:00:09 2018 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Tue, 10 Apr 2018 16:00:09 +0000 Subject: [gpfsug-discuss] Confusing I/O Behavior Message-ID:
I hate admitting this but I've found something that's got me stumped. We have a user running an MPI job on the system. Each rank opens up several output files to which it writes ASCII debug information. The net result across several hundred ranks is an absolute smattering of teeny tiny I/O requests to the underlying disks which they don't appreciate. Performance plummets. The I/O requests are 30 to 80 bytes in size. What I don't understand is why these write requests aren't getting batched up into larger write requests to the underlying disks. If I do something like "dd if=/dev/zero of=foo bs=8k" on a node I see that the nasty unaligned 8k I/O requests are batched up into nice 1M I/O requests before they hit the NSD. As best I can tell the application isn't doing any fsync's and isn't doing direct I/O to these files. Can anyone explain why seemingly very similar I/O workloads appear to result in well formed NSD I/O in one case and awful I/O in another? Thanks! -Stumped
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From aaron.s.knister at nasa.gov Tue Apr 10 17:22:46 2018 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 10 Apr 2018 12:22:46 -0400 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID:
I wonder if this is an artifact of pagepool exhaustion which makes me ask the question-- how do I see how much of the pagepool is in use and by what? I've looked at mmfsadm dump and mmdiag --memory and neither has provided me the information I'm looking for (or at least not in a format I understand). -Aaron
On 4/10/18 12:00 PM, Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP] wrote:
> > -Stumped > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From makaplan at us.ibm.com Tue Apr 10 17:28:29 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 10 Apr 2018 12:28:29 -0400 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cphoffma at uoregon.edu Tue Apr 10 17:18:49 2018 From: cphoffma at uoregon.edu (Chris Hoffman) Date: Tue, 10 Apr 2018 16:18:49 +0000 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: <1523377129792.79060@uoregon.edu> ?Hi Stumped, Is this MPI job on one machine? Multiple nodes? Are the tiny 8K writes to the same file or different ones? Chris ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP] Sent: Tuesday, April 10, 2018 9:00 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Confusing I/O Behavior I hate admitting this but I've found something that's got me stumped. We have a user running an MPI job on the system. Each rank opens up several output files to which it writes ASCII debug information. The net result across several hundred ranks is an absolute smattering of teeny tiny I/o requests to te underlying disks which they don't appreciate. Performance plummets. The I/o requests are 30 to 80 bytes in size. What I don't understand is why these write requests aren't getting batched up into larger write requests to the underlying disks. If I do something like "df if=/dev/zero of=foo bs=8k" on a node I see that the nasty unaligned 8k io requests are batched up into nice 1M I/o requests before they hit the NSD. As best I can tell the application isn't doing any fsync's and isn't doing direct io to these files. Can anyone explain why seemingly very similar io workloads appear to result in well formed NSD I/O in one case and awful I/o in another? Thanks! -Stumped -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Apr 10 17:52:30 2018 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 10 Apr 2018 12:52:30 -0400 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: <1523377129792.79060@uoregon.edu> References: <1523377129792.79060@uoregon.edu> Message-ID: Chris, The job runs across multiple nodes and the tinky 8K writes *should* be to different files that are unique per-rank. -Aaron On 4/10/18 12:18 PM, Chris Hoffman wrote: > ?Hi Stumped, > > > Is this MPI job on one machine? Multiple nodes? Are the tiny 8K writes > to the same file or different ones? > > > Chris > > ------------------------------------------------------------------------ > *From:* gpfsug-discuss-bounces at spectrumscale.org > on behalf of Knister, Aaron > S. 
(GSFC-606.2)[COMPUTER SCIENCE CORP] > *Sent:* Tuesday, April 10, 2018 9:00 AM > *To:* gpfsug main discussion list > *Subject:* [gpfsug-discuss] Confusing I/O Behavior > I hate admitting this but I?ve found something that?s got me stumped. > > We have a user running an MPI job on the system. Each rank opens up > several output files to which it writes ASCII debug information. The net > result across several hundred ranks is an absolute smattering of teeny > tiny I/o requests to te underlying disks which they don?t appreciate. > Performance plummets. The I/o requests are 30 to 80 bytes in size. What > I don?t understand is why these write requests aren?t getting batched up > into larger write requests to the underlying disks. > > If I do something like ?df if=/dev/zero of=foo bs=8k? on a node I see > that the nasty unaligned 8k io requests are batched up into nice 1M I/o > requests before they hit the NSD. > > As best I can tell the application isn?t doing any fsync?s and isn?t > doing direct io to these files. > > Can anyone explain why seemingly very similar io workloads appear to > result in well formed NSD I/O in one case and awful I/o in another? > > Thanks! > > -Stumped > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From UWEFALKE at de.ibm.com Tue Apr 10 22:43:30 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Tue, 10 Apr 2018 23:43:30 +0200 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Hi Aaron, to how many different files do these tiny I/O requests go? Mind that the write aggregates the I/O over a limited time (5 secs or so) and ***per file***. It is for that matter a large difference to write small chunks all to one file or to a large number of individual files . to fill a 1 MiB buffer you need about 13100 chunks of 80Bytes ***per file*** within those 5 secs. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" To: gpfsug main discussion list Date: 10/04/2018 18:09 Subject: [gpfsug-discuss] Confusing I/O Behavior Sent by: gpfsug-discuss-bounces at spectrumscale.org I hate admitting this but I?ve found something that?s got me stumped. We have a user running an MPI job on the system. Each rank opens up several output files to which it writes ASCII debug information. The net result across several hundred ranks is an absolute smattering of teeny tiny I/o requests to te underlying disks which they don?t appreciate. Performance plummets. The I/o requests are 30 to 80 bytes in size. 
What I don?t understand is why these write requests aren?t getting batched up into larger write requests to the underlying disks. If I do something like ?df if=/dev/zero of=foo bs=8k? on a node I see that the nasty unaligned 8k io requests are batched up into nice 1M I/o requests before they hit the NSD. As best I can tell the application isn?t doing any fsync?s and isn?t doing direct io to these files. Can anyone explain why seemingly very similar io workloads appear to result in well formed NSD I/O in one case and awful I/o in another? Thanks! -Stumped _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From r.sobey at imperial.ac.uk Wed Apr 11 09:22:12 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 11 Apr 2018 08:22:12 +0000 Subject: [gpfsug-discuss] CES SMB export limit In-Reply-To: References: Message-ID: Just the 1000 SMB shares limit was what I wanted but the other info was useful, thanks Carl. Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Carl Zetie Sent: 10 April 2018 16:34 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] CES SMB export limit Hi Richard, KC says "IBM Spectrum Scale? can host a maximum of 1,000 SMB shares. There must be less than 3,000 SMB connections per protocol node and less than 20,000 SMB connections across all protocol nodes." Are those the numbers you are looking for? Carl Zetie Offering Manager for Spectrum Scale, IBM (540) 882 9353 ][ Research Triangle Park carlz at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathan.buzzard at strath.ac.uk Wed Apr 11 11:14:21 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 11 Apr 2018 11:14:21 +0100 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: <1523441661.19449.153.camel@strath.ac.uk> On Tue, 2018-04-10 at 23:43 +0200, Uwe Falke wrote: > Hi Aaron,? > to how many different files do these tiny I/O requests go? > > Mind that the write aggregates the I/O over a limited time (5 secs or > so)?and ***per file***.? > It is for that matter a large difference to write small chunks all to > one? > file or to a large number of individual files . > to fill a??1 MiB buffer you need about 13100 chunks of??80Bytes > ***per? > file*** within those 5 secs.? > Something else to bear in mind is that you might be using a library that converts everything into putchar's. I have seen this in the past with Office on a Mac platform and made performance saving a file over SMB/NFS appalling. I mean really really bad, a?"save as" which didn't do that would take a second or two, a save would take like 15 minutes. To the local disk it was just fine. The GPFS angle is this was all on a self rolled clustered Samba GPFS setup back in the day. Took a long time to track down, and performance turned out to be just as appalling with a real Windows file server. JAB. -- Jonathan A. Buzzard?????????????????????????Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG From UWEFALKE at de.ibm.com Wed Apr 11 11:53:36 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Wed, 11 Apr 2018 12:53:36 +0200 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: <1523441661.19449.153.camel@strath.ac.uk> References: <1523441661.19449.153.camel@strath.ac.uk> Message-ID: It would be interesting in which chunks data arrive at the NSDs -- if those chunks are bigger than the individual I/Os (i.e. multiples of the record sizes), there is some data coalescing going on and it just needs to have its path well paved ... If not, there might be indeed something odd in the configuration. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 gpfsug-discuss-bounces at spectrumscale.org wrote on 11/04/2018 12:14:21: > From: Jonathan Buzzard > To: gpfsug main discussion list > Date: 11/04/2018 12:14 > Subject: Re: [gpfsug-discuss] Confusing I/O Behavior > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > On Tue, 2018-04-10 at 23:43 +0200, Uwe Falke wrote: > > Hi Aaron, > > to how many different files do these tiny I/O requests go? > > > > Mind that the write aggregates the I/O over a limited time (5 secs or > > so) and ***per file***. > > It is for that matter a large difference to write small chunks all to > > one > > file or to a large number of individual files . > > to fill a 1 MiB buffer you need about 13100 chunks of 80Bytes > > ***per > > file*** within those 5 secs. > > > > Something else to bear in mind is that you might be using a library > that converts everything into putchar's. I have seen this in the past > with Office on a Mac platform and made performance saving a file over > SMB/NFS appalling. I mean really really bad, a "save as" which didn't > do that would take a second or two, a save would take like 15 minutes. > To the local disk it was just fine. > > The GPFS angle is this was all on a self rolled clustered Samba GPFS > setup back in the day. Took a long time to track down, and performance > turned out to be just as appalling with a real Windows file server. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From peserocka at gmail.com Wed Apr 11 12:06:40 2018 From: peserocka at gmail.com (Peter Serocka) Date: Wed, 11 Apr 2018 13:06:40 +0200 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Let?s keep in mind that line buffering is a concept within the standard C library; if every log line triggers one write(2) system call, and it?s not direct io, then multiple write still get coalesced into few larger disk writes (as with the dd example). A logging application might choose to close(2) a log file after each write(2) ? that produces a different scenario, where the file system might guarantee that the data has been written to disk when close(2) return a success. (Local Linux file systems do not do this with default mounts, but networked filesystems usually do.) Aaron, can you trace your application to see what is going on in terms of system calls? ? Peter > On 2018 Apr 10 Tue, at 18:28, Marc A Kaplan wrote: > > Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chair at spectrumscale.org Wed Apr 11 12:21:04 2018 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Wed, 11 Apr 2018 12:21:04 +0100 Subject: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale Message-ID: Hi All, At the UK meeting next week, we?ve had a speaker slot become available, we?re planning to put in a BoF type session on tooling Spectrum Scale so we have space for a few 3-5 minute quick talks on what people are doing to automate. If you are coming along and interested, please drop me an email. Max of 3 slides! Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Wed Apr 11 15:36:29 2018 From: valleru at cbio.mskcc.org (Lohit Valleru) Date: Wed, 11 Apr 2018 10:36:29 -0400 Subject: [gpfsug-discuss] GPFS, MMAP and Pagepool In-Reply-To: References: Message-ID: Hey Sven, This is regarding mmap issues and GPFS. We had discussed previously of experimenting with GPFS 5. I now have upgraded all of compute nodes and NSD nodes to GPFS 5.0.0.2 I am yet to experiment with mmap performance, but before that - I am seeing weird hangs with GPFS 5 and I think it could be related to mmap. Have you seen GPFS ever hang on this syscall? [Tue Apr 10 04:20:13 2018] [] _ZN10gpfsNode_t8mmapLockEiiPKj+0xb5/0x140 [mmfs26] I see the above ,when kernel hangs and throws out a series of trace calls. I somehow think the above trace is related to processes hanging on GPFS forever. There are no errors in GPFS however. Also, I think the above happens only when the mmap threads go above a particular number. We had faced a similar issue in 4.2.3 and it was resolved in a patch to 4.2.3.2 . At that time , the issue happened when mmap threads go more than worker1threads. According to the ticket - it was a mmap race condition that GPFS was not handling well. I am not sure if this issue is a repeat and I am yet to isolate the incident and test with increasing number of mmap threads. 
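(Aside, a hedged sketch: the kind of data worth capturing on an affected node while processes appear stuck in mmap paths; the PID below is a placeholder and is not taken from this incident.)

mmdiag --waiters               # long-running Spectrum Scale waiters on the node
mmdiag --iohist | tail -n 20   # recent I/O history, to see whether I/O is still completing
cat /proc/12345/stack          # kernel stack of one hung process (substitute a real PID)
dmesg | grep -i mmfs           # any further kernel trace lines from the mmfs modules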
I am not 100 percent sure if this is related to mmap yet but just wanted to ask you if you have seen anything like above. Thanks, Lohit On Feb 22, 2018, 3:59 PM -0500, Sven Oehme , wrote: > Hi Lohit, > > i am working with ray on a mmap performance improvement right now, which most likely has the same root cause as yours , see -->??http://gpfsug.org/pipermail/gpfsug-discuss/2018-January/004411.html > the thread above is silent after a couple of back and rorth, but ray and i have active communication in the background and will repost as soon as there is something new to share. > i am happy to look at this issue after we finish with ray's workload if there is something missing, but first let's finish his, get you try the same fix and see if there is something missing. > > btw. if people would share their use of MMAP , what applications they use (home grown, just use lmdb which uses mmap under the cover, etc) please let me know so i get a better picture on how wide the usage is with GPFS. i know a lot of the ML/DL workloads are using it, but i would like to know what else is out there i might not think about. feel free to drop me a personal note, i might not reply to it right away, but eventually. > > thx. sven > > > > On Thu, Feb 22, 2018 at 12:33 PM wrote: > > > Hi all, > > > > > > I wanted to know, how does mmap interact with GPFS pagepool with respect to filesystem block-size? > > > Does the efficiency depend on the mmap read size and the block-size of the filesystem even if all the data is cached in pagepool? > > > > > > GPFS 4.2.3.2 and CentOS7. > > > > > > Here is what i observed: > > > > > > I was testing a user script that uses mmap to read from 100M to 500MB files. > > > > > > The above files are stored on 3 different filesystems. > > > > > > Compute nodes - 10G pagepool and 5G seqdiscardthreshold. > > > > > > 1. 4M block size GPFS filesystem, with separate metadata and data. Data on Near line and metadata on SSDs > > > 2. 1M block size GPFS filesystem as a AFM cache cluster, "with all the required files fully cached" from the above GPFS cluster as home. Data and Metadata together on SSDs > > > 3. 16M block size GPFS filesystem, with separate metadata and data. Data on Near line and metadata on SSDs > > > > > > When i run the script first time for ?each" filesystem: > > > I see that GPFS reads from the files, and caches into the pagepool as it reads, from mmdiag -- iohist > > > > > > When i run the second time, i see that there are no IO requests from the compute node to GPFS NSD servers, which is expected since all the data from the 3 filesystems is cached. > > > > > > However - the time taken for the script to run for the files in the 3 different filesystems is different - although i know that they are just "mmapping"/reading from pagepool/cache and not from disk. > > > > > > Here is the difference in time, for IO just from pagepool: > > > > > > 20s 4M block size > > > 15s 1M block size > > > 40S 16M block size. > > > > > > Why do i see a difference when trying to mmap reads from different block-size filesystems, although i see that the IO requests are not hitting disks and just the pagepool? > > > > > > I am willing to share the strace output and mmdiag outputs if needed. 
> > > > > > Thanks, > > > Lohit > > > > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Apr 11 17:51:33 2018 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 11 Apr 2018 16:51:33 +0000 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Just another thought here. If the debug output files fit in an inode, then these would be handled as metadata updates to the inode, which is typically much smaller than the file system blocksize. Looking at my storage that handles GPFS metadata shows avg KiB/IO at a horrendous 5-12 KiB! HTH, -B -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Peter Serocka Sent: Wednesday, April 11, 2018 6:07 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Confusing I/O Behavior Note: External Email ------------------------------------------------- Let?s keep in mind that line buffering is a concept within the standard C library; if every log line triggers one write(2) system call, and it?s not direct io, then multiple write still get coalesced into few larger disk writes (as with the dd example). A logging application might choose to close(2) a log file after each write(2) ? that produces a different scenario, where the file system might guarantee that the data has been written to disk when close(2) return a success. (Local Linux file systems do not do this with default mounts, but networked filesystems usually do.) Aaron, can you trace your application to see what is going on in terms of system calls? ? Peter > On 2018 Apr 10 Tue, at 18:28, Marc A Kaplan wrote: > > Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. 
From makaplan at us.ibm.com Wed Apr 11 18:23:02 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 11 Apr 2018 13:23:02 -0400 Subject: [gpfsug-discuss] Confusing I/O Behavior In-Reply-To: References: Message-ID: Good point about "tiny" files going into the inode and system pool. Which reminds one: Generally a bad idea to store metadata in wide striping disk base RAID (Type 5 with spinning media) Do use SSD or similar for metadata. Consider smaller block size for metadata / system pool than regular file data. From: Bryan Banister To: gpfsug main discussion list Date: 04/11/2018 12:51 PM Subject: Re: [gpfsug-discuss] Confusing I/O Behavior Sent by: gpfsug-discuss-bounces at spectrumscale.org Just another thought here. If the debug output files fit in an inode, then these would be handled as metadata updates to the inode, which is typically much smaller than the file system blocksize. Looking at my storage that handles GPFS metadata shows avg KiB/IO at a horrendous 5-12 KiB! HTH, -B -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Peter Serocka Sent: Wednesday, April 11, 2018 6:07 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Confusing I/O Behavior Note: External Email ------------------------------------------------- Let?s keep in mind that line buffering is a concept within the standard C library; if every log line triggers one write(2) system call, and it?s not direct io, then multiple write still get coalesced into few larger disk writes (as with the dd example). A logging application might choose to close(2) a log file after each write(2) ? that produces a different scenario, where the file system might guarantee that the data has been written to disk when close(2) return a success. (Local Linux file systems do not do this with default mounts, but networked filesystems usually do.) Aaron, can you trace your application to see what is going on in terms of system calls? ? Peter > On 2018 Apr 10 Tue, at 18:28, Marc A Kaplan wrote: > > Debug messages are typically unbuffered or "line buffered". If that is truly causing a performance problem AND you still want to collect the messages -- you'll need to find a better way to channel and collect those messages. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=xSLLpVdHbkGieYfTGPIJRMkA1AbwsYteS2lHR4_49ik&s=9BOhyKNgkkbcOv316JZXnRB4HpPK_x2hyLd0d_uLGos&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=xSLLpVdHbkGieYfTGPIJRMkA1AbwsYteS2lHR4_49ik&s=9BOhyKNgkkbcOv316JZXnRB4HpPK_x2hyLd0d_uLGos&e= ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. 
Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=xSLLpVdHbkGieYfTGPIJRMkA1AbwsYteS2lHR4_49ik&s=9BOhyKNgkkbcOv316JZXnRB4HpPK_x2hyLd0d_uLGos&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Apr 13 21:05:53 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Fri, 13 Apr 2018 20:05:53 +0000 Subject: [gpfsug-discuss] Replicated and non replicated data Message-ID: <98F781F7-7063-4293-A5BC-1E8F5A0C98EC@bham.ac.uk> I have a question about file-systems with replicated an non replicated data. We have a file-system where metadata is set to copies=2 and data copies=2, we then use a placement policy to selectively replicate some data only once based on file-set. We also place the non-replicated data into a specific pool (6tnlsas) to ensure we know where it is placed. My understanding was that in doing this, if we took the disks with the non replicated data offline, we?d still have the FS available for users as the metadata is replicated. Sure accessing a non-replicated data file would give an IO error, but the rest of the FS should be up. We had a situation today where we wanted to take stg01 offline today, so tried using mmchdisk stop -d ?. Once we got to about disk stg01-01_12_12, GPFS would refuse to stop any more disks and complain about too many disks, similarly if we shutdown the NSD servers hosting the disks, the filesystem would have an SGPanic and force unmount. First, am I correct in thinking that a FS with non-replicated data, but replicated metadata should still be accessible (not the non-replicated data) when the LUNS hosting it are down? If so, any suggestions why my FS is panic-ing when we take down the one set of disks? I thought at first we had some non-replicated metadata, tried a mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but this didn?t help. Running 5.0.0.2 on the NSD server nodes. 
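(Aside: a hedged sketch of the checks that go with this question, using the file system name from the listing that follows; the mmlsfs/mmlsconfig flags are standard options, not output captured from this cluster.)

mmlsfs castles -m -M -r -R       # default and maximum metadata/data replica counts
mmlsfs castles -K                # strict replication setting
mmlsconfig unmountOnDiskFail     # the cluster option that turns out to matter later in this thread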
(First time we went round this we didn?t have a FS descriptor disk, but you can see below that we added this) Thanks Simon [root at nsd01 ~]# mmlsdisk castles -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- CASTLES_GPFS_DESCONLY01 nsd 512 310 no no ready up 1 system desc stg01-01_3_3 nsd 4096 210 no yes ready down 4 6tnlsas stg01-01_4_4 nsd 4096 210 no yes ready down 5 6tnlsas stg01-01_5_5 nsd 4096 210 no yes ready down 6 6tnlsas stg01-01_6_6 nsd 4096 210 no yes ready down 7 6tnlsas stg01-01_7_7 nsd 4096 210 no yes ready down 8 6tnlsas stg01-01_8_8 nsd 4096 210 no yes ready down 9 6tnlsas stg01-01_9_9 nsd 4096 210 no yes ready down 10 6tnlsas stg01-01_10_10 nsd 4096 210 no yes ready down 11 6tnlsas stg01-01_11_11 nsd 4096 210 no yes ready down 12 6tnlsas stg01-01_12_12 nsd 4096 210 no yes ready down 13 6tnlsas stg01-01_13_13 nsd 4096 210 no yes ready down 14 6tnlsas stg01-01_14_14 nsd 4096 210 no yes ready down 15 6tnlsas stg01-01_15_15 nsd 4096 210 no yes ready down 16 6tnlsas stg01-01_16_16 nsd 4096 210 no yes ready down 17 6tnlsas stg01-01_17_17 nsd 4096 210 no yes ready down 18 6tnlsas stg01-01_18_18 nsd 4096 210 no yes ready down 19 6tnlsas stg01-01_19_19 nsd 4096 210 no yes ready down 20 6tnlsas stg01-01_20_20 nsd 4096 210 no yes ready down 21 6tnlsas stg01-01_21_21 nsd 4096 210 no yes ready down 22 6tnlsas stg01-01_ssd_54_54 nsd 4096 210 yes no ready down 23 system stg01-01_ssd_56_56 nsd 4096 210 yes no ready down 24 system stg02-01_0_0 nsd 4096 110 no yes ready up 25 6tnlsas stg02-01_1_1 nsd 4096 110 no yes ready up 26 6tnlsas stg02-01_2_2 nsd 4096 110 no yes ready up 27 6tnlsas stg02-01_3_3 nsd 4096 110 no yes ready up 28 6tnlsas stg02-01_4_4 nsd 4096 110 no yes ready up 29 6tnlsas stg02-01_5_5 nsd 4096 110 no yes ready up 30 6tnlsas stg02-01_6_6 nsd 4096 110 no yes ready up 31 6tnlsas stg02-01_7_7 nsd 4096 110 no yes ready up 32 6tnlsas stg02-01_8_8 nsd 4096 110 no yes ready up 33 6tnlsas stg02-01_9_9 nsd 4096 110 no yes ready up 34 6tnlsas stg02-01_10_10 nsd 4096 110 no yes ready up 35 6tnlsas stg02-01_11_11 nsd 4096 110 no yes ready up 36 6tnlsas stg02-01_12_12 nsd 4096 110 no yes ready up 37 6tnlsas stg02-01_13_13 nsd 4096 110 no yes ready up 38 6tnlsas stg02-01_14_14 nsd 4096 110 no yes ready up 39 6tnlsas stg02-01_15_15 nsd 4096 110 no yes ready up 40 6tnlsas stg02-01_16_16 nsd 4096 110 no yes ready up 41 6tnlsas stg02-01_17_17 nsd 4096 110 no yes ready up 42 6tnlsas stg02-01_18_18 nsd 4096 110 no yes ready up 43 6tnlsas stg02-01_19_19 nsd 4096 110 no yes ready up 44 6tnlsas stg02-01_20_20 nsd 4096 110 no yes ready up 45 6tnlsas stg02-01_21_21 nsd 4096 110 no yes ready up 46 6tnlsas stg02-01_ssd_22_22 nsd 4096 110 yes no ready up 47 system desc stg02-01_ssd_23_23 nsd 4096 110 yes no ready up 48 system stg02-01_ssd_24_24 nsd 4096 110 yes no ready up 49 system stg02-01_ssd_25_25 nsd 4096 110 yes no ready up 50 system stg01-01_22_22 nsd 4096 210 no yes ready up 51 6tnlsasnonrepl desc stg01-01_23_23 nsd 4096 210 no yes ready up 52 6tnlsasnonrepl stg01-01_24_24 nsd 4096 210 no yes ready up 53 6tnlsasnonrepl stg01-01_25_25 nsd 4096 210 no yes ready up 54 6tnlsasnonrepl stg01-01_26_26 nsd 4096 210 no yes ready up 55 6tnlsasnonrepl stg01-01_27_27 nsd 4096 210 no yes ready up 56 6tnlsasnonrepl stg01-01_31_31 nsd 4096 210 no yes ready up 58 6tnlsasnonrepl stg01-01_32_32 nsd 4096 210 no yes ready 
up 59 6tnlsasnonrepl stg01-01_33_33 nsd 4096 210 no yes ready up 60 6tnlsasnonrepl stg01-01_34_34 nsd 4096 210 no yes ready up 61 6tnlsasnonrepl stg01-01_35_35 nsd 4096 210 no yes ready up 62 6tnlsasnonrepl stg01-01_36_36 nsd 4096 210 no yes ready up 63 6tnlsasnonrepl stg01-01_37_37 nsd 4096 210 no yes ready up 64 6tnlsasnonrepl stg01-01_38_38 nsd 4096 210 no yes ready up 65 6tnlsasnonrepl stg01-01_39_39 nsd 4096 210 no yes ready up 66 6tnlsasnonrepl stg01-01_40_40 nsd 4096 210 no yes ready up 67 6tnlsasnonrepl stg01-01_41_41 nsd 4096 210 no yes ready up 68 6tnlsasnonrepl stg01-01_42_42 nsd 4096 210 no yes ready up 69 6tnlsasnonrepl stg01-01_43_43 nsd 4096 210 no yes ready up 70 6tnlsasnonrepl stg01-01_44_44 nsd 4096 210 no yes ready up 71 6tnlsasnonrepl stg01-01_45_45 nsd 4096 210 no yes ready up 72 6tnlsasnonrepl stg01-01_46_46 nsd 4096 210 no yes ready up 73 6tnlsasnonrepl stg01-01_47_47 nsd 4096 210 no yes ready up 74 6tnlsasnonrepl stg01-01_48_48 nsd 4096 210 no yes ready up 75 6tnlsasnonrepl stg01-01_49_49 nsd 4096 210 no yes ready up 76 6tnlsasnonrepl stg01-01_50_50 nsd 4096 210 no yes ready up 77 6tnlsasnonrepl stg01-01_51_51 nsd 4096 210 no yes ready up 78 6tnlsasnonrepl Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Fri Apr 13 21:17:11 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Fri, 13 Apr 2018 20:17:11 +0000 Subject: [gpfsug-discuss] [Replicated and non replicated data Message-ID: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> Add: unmountOnDiskFail=meta To your config. You can add it with ?-I? to have it take effect w/o reboot. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Simon Thompson (IT Research Support)" Reply-To: gpfsug main discussion list Date: Friday, April 13, 2018 at 3:06 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] [gpfsug-discuss] Replicated and non replicated data I have a question about file-systems with replicated an non replicated data. We have a file-system where metadata is set to copies=2 and data copies=2, we then use a placement policy to selectively replicate some data only once based on file-set. We also place the non-replicated data into a specific pool (6tnlsas) to ensure we know where it is placed. My understanding was that in doing this, if we took the disks with the non replicated data offline, we?d still have the FS available for users as the metadata is replicated. Sure accessing a non-replicated data file would give an IO error, but the rest of the FS should be up. We had a situation today where we wanted to take stg01 offline today, so tried using mmchdisk stop -d ?. Once we got to about disk stg01-01_12_12, GPFS would refuse to stop any more disks and complain about too many disks, similarly if we shutdown the NSD servers hosting the disks, the filesystem would have an SGPanic and force unmount. First, am I correct in thinking that a FS with non-replicated data, but replicated metadata should still be accessible (not the non-replicated data) when the LUNS hosting it are down? If so, any suggestions why my FS is panic-ing when we take down the one set of disks? I thought at first we had some non-replicated metadata, tried a mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but this didn?t help. Running 5.0.0.2 on the NSD server nodes. 
(First time we went round this we didn?t have a FS descriptor disk, but you can see below that we added this) Thanks Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From sxiao at us.ibm.com Sat Apr 14 02:42:28 2018 From: sxiao at us.ibm.com (Steve Xiao) Date: Fri, 13 Apr 2018 21:42:28 -0400 Subject: [gpfsug-discuss] Replicated and non replicated data In-Reply-To: References: Message-ID: What is your unmountOnDiskFail configuration setting on the cluster? You need to set unmountOnDiskFail to meta if you only have metadata replication. Steve Y. Xiao > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 13 Apr 2018 20:05:53 +0000 > From: "Simon Thompson (IT Research Support)" > To: "gpfsug-discuss at spectrumscale.org" > > Subject: [gpfsug-discuss] Replicated and non replicated data > Message-ID: <98F781F7-7063-4293-A5BC-1E8F5A0C98EC at bham.ac.uk> > Content-Type: text/plain; charset="utf-8" > > I have a question about file-systems with replicated an non replicated data. > > We have a file-system where metadata is set to copies=2 and data > copies=2, we then use a placement policy to selectively replicate > some data only once based on file-set. We also place the non- > replicated data into a specific pool (6tnlsas) to ensure we know > where it is placed. > > My understanding was that in doing this, if we took the disks with > the non replicated data offline, we?d still have the FS available > for users as the metadata is replicated. Sure accessing a non- > replicated data file would give an IO error, but the rest of the FS > should be up. > > We had a situation today where we wanted to take stg01 offline > today, so tried using mmchdisk stop -d ?. Once we got to about disk > stg01-01_12_12, GPFS would refuse to stop any more disks and > complain about too many disks, similarly if we shutdown the NSD > servers hosting the disks, the filesystem would have an SGPanic and > force unmount. > > First, am I correct in thinking that a FS with non-replicated data, > but replicated metadata should still be accessible (not the non- > replicated data) when the LUNS hosting it are down? > > If so, any suggestions why my FS is panic-ing when we take down the > one set of disks? > > I thought at first we had some non-replicated metadata, tried a > mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but > this didn?t help. > > Running 5.0.0.2 on the NSD server nodes. 
> > (First time we went round this we didn?t have a FS descriptor disk, > but you can see below that we added this) > > Thanks > > Simon > > [root at nsd01 ~]# mmlsdisk castles -L > disk driver sector failure holds holds storage > name type size group metadata data status > availability disk id pool remarks > ------------ -------- ------ ----------- -------- ----- > ------------- ------------ ------- ------------ --------- > CASTLES_GPFS_DESCONLY01 nsd 512 310 no no > ready up 1 system desc > stg01-01_3_3 nsd 4096 210 no yes ready > down 4 6tnlsas > stg01-01_4_4 nsd 4096 210 no yes ready > down 5 6tnlsas > stg01-01_5_5 nsd 4096 210 no yes ready > down 6 6tnlsas > stg01-01_6_6 nsd 4096 210 no yes ready > down 7 6tnlsas > stg01-01_7_7 nsd 4096 210 no yes ready > down 8 6tnlsas > stg01-01_8_8 nsd 4096 210 no yes ready > down 9 6tnlsas > stg01-01_9_9 nsd 4096 210 no yes ready > down 10 6tnlsas > stg01-01_10_10 nsd 4096 210 no yes ready > down 11 6tnlsas > stg01-01_11_11 nsd 4096 210 no yes ready > down 12 6tnlsas > stg01-01_12_12 nsd 4096 210 no yes ready > down 13 6tnlsas > stg01-01_13_13 nsd 4096 210 no yes ready > down 14 6tnlsas > stg01-01_14_14 nsd 4096 210 no yes ready > down 15 6tnlsas > stg01-01_15_15 nsd 4096 210 no yes ready > down 16 6tnlsas > stg01-01_16_16 nsd 4096 210 no yes ready > down 17 6tnlsas > stg01-01_17_17 nsd 4096 210 no yes ready > down 18 6tnlsas > stg01-01_18_18 nsd 4096 210 no yes ready > down 19 6tnlsas > stg01-01_19_19 nsd 4096 210 no yes ready > down 20 6tnlsas > stg01-01_20_20 nsd 4096 210 no yes ready > down 21 6tnlsas > stg01-01_21_21 nsd 4096 210 no yes ready > down 22 6tnlsas > stg01-01_ssd_54_54 nsd 4096 210 yes no ready > down 23 system > stg01-01_ssd_56_56 nsd 4096 210 yes no ready > down 24 system > stg02-01_0_0 nsd 4096 110 no yes ready > up 25 6tnlsas > stg02-01_1_1 nsd 4096 110 no yes ready > up 26 6tnlsas > stg02-01_2_2 nsd 4096 110 no yes ready > up 27 6tnlsas > stg02-01_3_3 nsd 4096 110 no yes ready > up 28 6tnlsas > stg02-01_4_4 nsd 4096 110 no yes ready > up 29 6tnlsas > stg02-01_5_5 nsd 4096 110 no yes ready > up 30 6tnlsas > stg02-01_6_6 nsd 4096 110 no yes ready > up 31 6tnlsas > stg02-01_7_7 nsd 4096 110 no yes ready > up 32 6tnlsas > stg02-01_8_8 nsd 4096 110 no yes ready > up 33 6tnlsas > stg02-01_9_9 nsd 4096 110 no yes ready > up 34 6tnlsas > stg02-01_10_10 nsd 4096 110 no yes ready > up 35 6tnlsas > stg02-01_11_11 nsd 4096 110 no yes ready > up 36 6tnlsas > stg02-01_12_12 nsd 4096 110 no yes ready > up 37 6tnlsas > stg02-01_13_13 nsd 4096 110 no yes ready > up 38 6tnlsas > stg02-01_14_14 nsd 4096 110 no yes ready > up 39 6tnlsas > stg02-01_15_15 nsd 4096 110 no yes ready > up 40 6tnlsas > stg02-01_16_16 nsd 4096 110 no yes ready > up 41 6tnlsas > stg02-01_17_17 nsd 4096 110 no yes ready > up 42 6tnlsas > stg02-01_18_18 nsd 4096 110 no yes ready > up 43 6tnlsas > stg02-01_19_19 nsd 4096 110 no yes ready > up 44 6tnlsas > stg02-01_20_20 nsd 4096 110 no yes ready > up 45 6tnlsas > stg02-01_21_21 nsd 4096 110 no yes ready > up 46 6tnlsas > stg02-01_ssd_22_22 nsd 4096 110 yes no ready > up 47 system desc > stg02-01_ssd_23_23 nsd 4096 110 yes no ready > up 48 system > stg02-01_ssd_24_24 nsd 4096 110 yes no ready > up 49 system > stg02-01_ssd_25_25 nsd 4096 110 yes no ready > up 50 system > stg01-01_22_22 nsd 4096 210 no yes ready > up 51 6tnlsasnonrepl desc > stg01-01_23_23 nsd 4096 210 no yes ready > up 52 6tnlsasnonrepl > stg01-01_24_24 nsd 4096 210 no yes ready > up 53 6tnlsasnonrepl > stg01-01_25_25 nsd 4096 210 no yes ready > up 54 
6tnlsasnonrepl > stg01-01_26_26 nsd 4096 210 no yes ready > up 55 6tnlsasnonrepl > stg01-01_27_27 nsd 4096 210 no yes ready > up 56 6tnlsasnonrepl > stg01-01_31_31 nsd 4096 210 no yes ready > up 58 6tnlsasnonrepl > stg01-01_32_32 nsd 4096 210 no yes ready > up 59 6tnlsasnonrepl > stg01-01_33_33 nsd 4096 210 no yes ready > up 60 6tnlsasnonrepl > stg01-01_34_34 nsd 4096 210 no yes ready > up 61 6tnlsasnonrepl > stg01-01_35_35 nsd 4096 210 no yes ready > up 62 6tnlsasnonrepl > stg01-01_36_36 nsd 4096 210 no yes ready > up 63 6tnlsasnonrepl > stg01-01_37_37 nsd 4096 210 no yes ready > up 64 6tnlsasnonrepl > stg01-01_38_38 nsd 4096 210 no yes ready > up 65 6tnlsasnonrepl > stg01-01_39_39 nsd 4096 210 no yes ready > up 66 6tnlsasnonrepl > stg01-01_40_40 nsd 4096 210 no yes ready > up 67 6tnlsasnonrepl > stg01-01_41_41 nsd 4096 210 no yes ready > up 68 6tnlsasnonrepl > stg01-01_42_42 nsd 4096 210 no yes ready > up 69 6tnlsasnonrepl > stg01-01_43_43 nsd 4096 210 no yes ready > up 70 6tnlsasnonrepl > stg01-01_44_44 nsd 4096 210 no yes ready > up 71 6tnlsasnonrepl > stg01-01_45_45 nsd 4096 210 no yes ready > up 72 6tnlsasnonrepl > stg01-01_46_46 nsd 4096 210 no yes ready > up 73 6tnlsasnonrepl > stg01-01_47_47 nsd 4096 210 no yes ready > up 74 6tnlsasnonrepl > stg01-01_48_48 nsd 4096 210 no yes ready > up 75 6tnlsasnonrepl > stg01-01_49_49 nsd 4096 210 no yes ready > up 76 6tnlsasnonrepl > stg01-01_50_50 nsd 4096 210 no yes ready > up 77 6tnlsasnonrepl > stg01-01_51_51 nsd 4096 210 no yes ready > up 78 6tnlsasnonrepl > Number of quorum disks: 3 > Read quorum value: 2 > Write quorum value: 2 > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20180413_c22c8133_attachment.html&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=ck4PYlaRFvCcNKlfHMPhoA&m=BX4uqSaNFY5Jl4ZNPLYjML8nanjAa57Nuz_7J2jSqMs&s=2P7GHehsFTuGZ39pBTBsUzcdwo9jkidie2etD8_llas&e= > > > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url? > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=ck4PYlaRFvCcNKlfHMPhoA&m=BX4uqSaNFY5Jl4ZNPLYjML8nanjAa57Nuz_7J2jSqMs&s=Q5EVJvSbunfieiHUrDHMpC3WAhP1fX2sQFwLLgLFb8Y&e= > > > End of gpfsug-discuss Digest, Vol 75, Issue 23 > ********************************************** > -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Apr 16 09:42:04 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 16 Apr 2018 08:42:04 +0000 Subject: [gpfsug-discuss] [Replicated and non replicated data In-Reply-To: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> References: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> Message-ID: Yeah that did it, it was set to the default value of ?no?. What exactly does ?no? mean as opposed to ?yes?? The docs https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adm_tuningguide.htm Aren?t very forthcoming on this ? (note it looks like we also have to set this in multi-cluster environments in client clusters as well) Simon From: "Robert.Oesterlin at nuance.com" Date: Friday, 13 April 2018 at 21:17 To: "gpfsug-discuss at spectrumscale.org" Cc: "Simon Thompson (IT Research Support)" Subject: Re: [Replicated and non replicated data Add: unmountOnDiskFail=meta To your config. 
You can add it with ?-I? to have it take effect w/o reboot. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Simon Thompson (IT Research Support)" Reply-To: gpfsug main discussion list Date: Friday, April 13, 2018 at 3:06 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] [gpfsug-discuss] Replicated and non replicated data I have a question about file-systems with replicated an non replicated data. We have a file-system where metadata is set to copies=2 and data copies=2, we then use a placement policy to selectively replicate some data only once based on file-set. We also place the non-replicated data into a specific pool (6tnlsas) to ensure we know where it is placed. My understanding was that in doing this, if we took the disks with the non replicated data offline, we?d still have the FS available for users as the metadata is replicated. Sure accessing a non-replicated data file would give an IO error, but the rest of the FS should be up. We had a situation today where we wanted to take stg01 offline today, so tried using mmchdisk stop -d ?. Once we got to about disk stg01-01_12_12, GPFS would refuse to stop any more disks and complain about too many disks, similarly if we shutdown the NSD servers hosting the disks, the filesystem would have an SGPanic and force unmount. First, am I correct in thinking that a FS with non-replicated data, but replicated metadata should still be accessible (not the non-replicated data) when the LUNS hosting it are down? If so, any suggestions why my FS is panic-ing when we take down the one set of disks? I thought at first we had some non-replicated metadata, tried a mmrestripefs -R ?metadata-only to force it to ensure 2 replicas, but this didn?t help. Running 5.0.0.2 on the NSD server nodes. (First time we went round this we didn?t have a FS descriptor disk, but you can see below that we added this) Thanks Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Apr 16 10:01:41 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 16 Apr 2018 09:01:41 +0000 Subject: [gpfsug-discuss] GUI not displaying node info correctly In-Reply-To: References: Message-ID: Just upgraded the GUI to 4.2.3.8, the bug is now fixed, thanks! Richard From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Stefan Roth Sent: 09 April 2018 10:39 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GUI not displaying node info correctly Hello Richard, this is a known GUI bug that will be fixed in 4.2.3-8. Once this is available, just upgrade the GUI rpm. The 4.2.3-8 PTF is not yet available, but it should be in next days. This problem happens to all customers with more than 127 filesets, means you see a max inodes value for the first 127 filesets, but not for newer filesets. 
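(Until the PTF is installed, a hedged workaround sketch: the per-fileset limits are still visible from the command line; the file system name below is a placeholder.)

mmlsfileset gpfs0 -L     # lists MaxInodes and AllocInodes for every fileset, however many exist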
Mit freundlichen Gr??en / Kind regards Stefan Roth Spectrum Scale GUI Development [cid:image002.gif at 01D3D569.E9989650] Phone: +49-7034-643-1362 IBM Deutschland [cid:image003.gif at 01D3D569.E9989650] E-Mail: stefan.roth at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany [cid:image002.gif at 01D3D569.E9989650] IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina Koederitz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 [Inactive hide details for "Sobey, Richard A" ---09.04.2018 11:01:01---Hi all, We have a fairly significant number of filesets f]"Sobey, Richard A" ---09.04.2018 11:01:01---Hi all, We have a fairly significant number of filesets for which the GUI reports nothing at all in From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 09.04.2018 11:01 Subject: [gpfsug-discuss] GUI not displaying node info correctly Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi all, We have a fairly significant number of filesets for which the GUI reports nothing at all in the ?Max Inodes? column. I?ve verified with mmlsfileset -I that the inode limit is set. Has anyone seen this already and had a PMR for it? SS 4.2.3-7. Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=FZXEKEUIKKfyO2oYq6outktXzRFzl0eKP2opGp7UNks&s=f3eT53kYib3aoHB5addQ_EyZRmCZM2gtiGsZj6aq2ZM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 166 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 156 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 1851 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.gif Type: image/gif Size: 63 bytes Desc: image004.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image005.gif Type: image/gif Size: 105 bytes Desc: image005.gif URL: From Robert.Oesterlin at nuance.com Mon Apr 16 12:34:36 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 16 Apr 2018 11:34:36 +0000 Subject: [gpfsug-discuss] [Replicated and non replicated data In-Reply-To: References: <19438566-254D-448A-89AA-E0317AFBFA64@nuance.com> Message-ID: A DW post from Yuri a few years back talks about it: https://www.ibm.com/developerworks/community/forums/html/topic?id=4cebdb97-3052-4cf2-abb1-462660a1489c Bob Oesterlin Sr Principal Storage Engineer, Nuance 507-269-0413 From: "Simon Thompson (IT Research Support)" Date: Monday, April 16, 2018 at 3:43 AM To: "Oesterlin, Robert" , gpfsug main discussion list Subject: [EXTERNAL] Re: [Replicated and non replicated data Yeah that did it, it was set to the default value of ?no?. What exactly does ?no? mean as opposed to ?yes?? The docs https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adm_tuningguide.htm Aren?t very forthcoming on this ? 
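(A hedged summary sketch for anyone landing on this thread; the commands use the option names already quoted above, and the comments paraphrase the behaviour reported in this thread rather than the formal documentation.)

mmlsconfig unmountOnDiskFail             # current value: no (default), yes, or meta
mmchconfig unmountOnDiskFail=meta -i     # or -I as Bob notes; -i also persists the change, -I is immediate-only
# Rough meaning, as observed in this thread: with "meta", the file system stays mounted as
# long as metadata is still accessible; reads of data on downed, non-replicated disks return
# I/O errors instead of the whole file system being force-unmounted.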
-------------- next part -------------- An HTML attachment was scrubbed... URL: From secretary at gpfsug.org Mon Apr 16 13:27:30 2018 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Mon, 16 Apr 2018 13:27:30 +0100 Subject: [gpfsug-discuss] Transforming Workflows at Scale In-Reply-To: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> References: <037f89ab466334f83f235f357111a9d6@webmail.gpfsug.org> Message-ID: <0d93fcd2f80d91ba958825c2bdd3d09d@webmail.gpfsug.org> Dear All, This event has been postponed and will now take place on 13TH JUNE. Details are on the link below: https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP [1] Many thanks, --- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org On , Secretary GPFS UG wrote: > Dear all, > > There's a Spectrum Scale for media breakfast briefing event being organised by IBM at IBM South Bank, London on 17th April (the day before the next UK meeting). > > The event has been designed for broadcasters, post production houses and visual effects organisations, where managing workflows between different islands of technology is a major challenge. > > If you're interested, you can read more and register at the IBM Registration Page [1]. > > Thanks, > -- > > Claire O'Toole > Spectrum Scale/GPFS User Group Secretary > +44 (0)7508 033896 > www.spectrumscaleug.org Links: ------ [1] https://www-01.ibm.com/events/wwe/grp/grp309.nsf/Agenda.xsp?openform&seminar=B223GVES&locale=en_ZZ&cm_mmc=Email_External-_-Systems_Systems+-+Hybrid+Cloud+Storage-_-IUK_IUK-_-ME+Spec+Scale+17th+April+BP+&cm_mmca1=000030YP&cm_mmca2=10001939&cvosrc=email.External.NA&cvo_campaign=000030YP -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.khiredine at meteo.dz Tue Apr 17 09:31:35 2018 From: a.khiredine at meteo.dz (atmane khiredine) Date: Tue, 17 Apr 2018 08:31:35 +0000 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location Message-ID: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> dear all, I want to understand how GNR/GSS/ESS stores information about the pdisk location I looked in the configuration file /var/mmfs/gen/mmfsNodeData /var/mmfs/gen/mmsdrfs /var/mmfs/gen/mmfs.cfg but no location of pdisk this is real scenario of unknown location this is the output from GNR/GSS/ESS ----------------------------------- [root at ess1 ~]# mmlspdisk BB1RGR --not-ok pdisk: replacementPriority = 2.00 name = "e2d3s11" device = "" recoveryGroup = "BB1RGR" declusteredArray = "DA2" state = "missing/noPath/systemDrain/noRGD/noVCD/noData" capacity = 3000034656256 freeSpace = 2997887172608 location = "" WWN = "naa.5000C50056717727" server = "ess1-ib0" reads = 106800946 writes = 10414075 IOErrors = 1216 IOTimeouts = 18 mediaErrors = 0 checksumErrors = 0 pathErrors = 0 relativePerformance = 1.000 userLocation = "" userCondition = "replaceable" hardware = " " hardwareType = Rotating 7200 nPaths = 0 active 0 total nsdFormatVersion = Unknown paxosAreaOffset = Unknown paxosAreaSize = Unknown logicalBlockSize = 512 ----------------------------------- I begin change the Hard disk mmchcarrier BB1RGR --release --pdisk "e2d3s11" I have this error Location of pdisk e2d3s11 of recovery group BB1RGR is not known. 
i know the location of the Hard disk i know the location of the Hard disk from old mmlspdisk file mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" Location of pdisk e2d3s11 of recovery group BB1RGR is not known. I read in the official documentation 6027-3001 [E] Location of pdisk pdiskName of recovery group recoveryGroupName is not known. Explanation: IBM Spectrum Scale is unable to find the location of the given pdisk. User response: Check the disk enclosure hardware. Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz From valdis.kletnieks at vt.edu Tue Apr 17 16:27:51 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Tue, 17 Apr 2018 11:27:51 -0400 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location In-Reply-To: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> Message-ID: <16306.1523978871@turing-police.cc.vt.edu> On Tue, 17 Apr 2018 08:31:35 -0000, atmane khiredine said: > but no location of pdisk > state = "missing/noPath/systemDrain/noRGD/noVCD/noData" That can't be good. That's just screaming "dead, uncabled, or removed". > WWN = "naa.5000C50056717727" Useful hint where to start if all else fails (see below) > i know the location of the Hard disk from old mmlspdisk file > mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" So you know where it was previously, and you can't find it now because it's either dead, missing, or there's no fiberchannel path to it. > User response: Check the disk enclosure hardware. Exactly as it says: Check the cabling, check the enclosure for a failed disk, and check if there's now an empty spot where a co-worker "helpfully" removed a bad disk. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From matthew.robinson02 at gmail.com Tue Apr 17 19:03:57 2018 From: matthew.robinson02 at gmail.com (Matthew Robinson) Date: Tue, 17 Apr 2018 14:03:57 -0400 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location In-Reply-To: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> Message-ID: Hi Valdis, Normally the name will indicate the physical location of the disk. So the name of the disk you have listed is "e2d3s11" this is Enclosure 2 Disk shelf 3 Disk slot 11. However, based on the location = "" this is the reason for the failure of the tscommand failure. Normally a recovery group rebuild fixes the issue from what I have seen in the past. 
On Tue, Apr 17, 2018 at 4:31 AM, atmane khiredine wrote: > dear all, > > I want to understand how GNR/GSS/ESS stores information about the pdisk > location > I looked in the configuration file > > /var/mmfs/gen/mmfsNodeData > /var/mmfs/gen/mmsdrfs > /var/mmfs/gen/mmfs.cfg > > but no location of pdisk > > this is real scenario of unknown location > this is the output from GNR/GSS/ESS > ----------------------------------- > [root at ess1 ~]# mmlspdisk BB1RGR --not-ok > pdisk: > replacementPriority = 2.00 > name = "e2d3s11" > device = "" > recoveryGroup = "BB1RGR" > declusteredArray = "DA2" > state = "missing/noPath/systemDrain/noRGD/noVCD/noData" > capacity = 3000034656256 > freeSpace = 2997887172608 > location = "" > WWN = "naa.5000C50056717727" > server = "ess1-ib0" > reads = 106800946 > writes = 10414075 > IOErrors = 1216 > IOTimeouts = 18 > mediaErrors = 0 > checksumErrors = 0 > pathErrors = 0 > relativePerformance = 1.000 > userLocation = "" > userCondition = "replaceable" > hardware = " " > hardwareType = Rotating 7200 > nPaths = 0 active 0 total > nsdFormatVersion = Unknown > paxosAreaOffset = Unknown > paxosAreaSize = Unknown > logicalBlockSize = 512 > ----------------------------------- > I begin change the Hard disk > mmchcarrier BB1RGR --release --pdisk "e2d3s11" > I have this error > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. > i know the location of the Hard disk > i know the location of the Hard disk from old mmlspdisk file > mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" > > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. > I read in the official documentation > 6027-3001 [E] Location of pdisk pdiskName of recovery > group recoveryGroupName is not known. > Explanation: IBM Spectrum Scale is unable to find the > location of the given pdisk. > User response: Check the disk enclosure hardware. > > > Atmane Khiredine > HPC System Administrator | Office National de la M?t?orologie > T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : > a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Tue Apr 17 19:24:04 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 17 Apr 2018 18:24:04 +0000 Subject: [gpfsug-discuss] Only 20 spots left! - SSUG-US Spring meeting - May 16-17th, Cambridge, Ma Message-ID: The registration for the Spring meeting of the SSUG-USA is now open. This is a Free two-day and will include a large number of Spectrum Scale updates and breakout tracks. We have limited meeting space so please register early if you plan on attending. Registration and agenda details: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2018-meeting-tickets-43662759489 DATE AND TIME Wed, May 16, 2018, 9:00 AM ? Thu, May 17, 2018, 5:00 PM EDT LOCATION IBM Cambridge Innovation Center One Rogers Street Cambridge, MA 02142-1203 Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From valdis.kletnieks at vt.edu Tue Apr 17 19:26:18 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Tue, 17 Apr 2018 14:26:18 -0400 Subject: [gpfsug-discuss] RedHat kernel support for GPFS 4.2.3.8? Message-ID: <41184.1523989578@turing-police.cc.vt.edu> So of course, the day after after I upgrade our GPFS/LTFS cluster to the latest releases of everything, RedHat drops about 300 new updates, include a kernel update, and I find out that GPFS 4.2.3.8 has also escaped. :) Any word if 4.2.3.7 or 4.2.3.8 play nice with the 3.10.0-862.el7 kernel? (Official support matrix still says 3.10.0-693 is "latest tested") -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From a.khiredine at meteo.dz Tue Apr 17 21:48:54 2018 From: a.khiredine at meteo.dz (atmane khiredine) Date: Tue, 17 Apr 2018 20:48:54 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 27 In-Reply-To: References: Message-ID: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0C1A@SDEB-EXC02.meteo.dz> thank you for the answer I use lsscsi the disk is in place "Linux sees the disk" i check the enclosure the disk is in use "I sees the disk :)" and I check the disk the indicator of disk flashes green and I connect to the NetApp DE6600 disk enclosure over telnet and the disk is in place "NetApp sees the disk" if i use this CMD mmchpdisk BB1RGR --pdisk e2d3s11 --identify on Location of pdisk e2d3s11 is not known the only cmd that works is mmchpdisk --suspend OR --diagnose e2d3s11 0, 0 DA2 2560 GiB normal missing/noPath/systemDrain/noRGD/noVCD/noData is change from missing to diagnosing e2d3s11 0, 0 DA2 2560 GiB normal diagnosing/noPath/noVCD and after one or 2 min is change from diagnosing to missing e2d3s11 0, 0 DA2 2582 GiB replaceable missing/noPath/systemDrain/noRGD/noVCD the disk is in place the GNR/GSS/ESS can not see the disk if I can find the file or GNR/GSS/ESS stores the disk location I can add the path that is missing Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz ________________________________________ De : gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] de la part de gpfsug-discuss-request at spectrumscale.org [gpfsug-discuss-request at spectrumscale.org] Envoy? : mardi 17 avril 2018 19:24 ? : gpfsug-discuss at spectrumscale.org Objet : gpfsug-discuss Digest, Vol 75, Issue 27 Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: GNR/GSS/ESS pdisk location (valdis.kletnieks at vt.edu) 2. Re: GNR/GSS/ESS pdisk location (Matthew Robinson) 3. Only 20 spots left! 
- SSUG-US Spring meeting - May 16-17th, Cambridge, Ma (Oesterlin, Robert) ---------------------------------------------------------------------- Message: 1 Date: Tue, 17 Apr 2018 11:27:51 -0400 From: valdis.kletnieks at vt.edu To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GNR/GSS/ESS pdisk location Message-ID: <16306.1523978871 at turing-police.cc.vt.edu> Content-Type: text/plain; charset="iso-8859-1" On Tue, 17 Apr 2018 08:31:35 -0000, atmane khiredine said: > but no location of pdisk > state = "missing/noPath/systemDrain/noRGD/noVCD/noData" That can't be good. That's just screaming "dead, uncabled, or removed". > WWN = "naa.5000C50056717727" Useful hint where to start if all else fails (see below) > i know the location of the Hard disk from old mmlspdisk file > mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" So you know where it was previously, and you can't find it now because it's either dead, missing, or there's no fiberchannel path to it. > User response: Check the disk enclosure hardware. Exactly as it says: Check the cabling, check the enclosure for a failed disk, and check if there's now an empty spot where a co-worker "helpfully" removed a bad disk. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: ------------------------------ Message: 2 Date: Tue, 17 Apr 2018 14:03:57 -0400 From: Matthew Robinson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GNR/GSS/ESS pdisk location Message-ID: Content-Type: text/plain; charset="utf-8" Hi Valdis, Normally the name will indicate the physical location of the disk. So the name of the disk you have listed is "e2d3s11" this is Enclosure 2 Disk shelf 3 Disk slot 11. However, based on the location = "" this is the reason for the failure of the tscommand failure. Normally a recovery group rebuild fixes the issue from what I have seen in the past. On Tue, Apr 17, 2018 at 4:31 AM, atmane khiredine wrote: > dear all, > > I want to understand how GNR/GSS/ESS stores information about the pdisk > location > I looked in the configuration file > > /var/mmfs/gen/mmfsNodeData > /var/mmfs/gen/mmsdrfs > /var/mmfs/gen/mmfs.cfg > > but no location of pdisk > > this is real scenario of unknown location > this is the output from GNR/GSS/ESS > ----------------------------------- > [root at ess1 ~]# mmlspdisk BB1RGR --not-ok > pdisk: > replacementPriority = 2.00 > name = "e2d3s11" > device = "" > recoveryGroup = "BB1RGR" > declusteredArray = "DA2" > state = "missing/noPath/systemDrain/noRGD/noVCD/noData" > capacity = 3000034656256 > freeSpace = 2997887172608 > location = "" > WWN = "naa.5000C50056717727" > server = "ess1-ib0" > reads = 106800946 > writes = 10414075 > IOErrors = 1216 > IOTimeouts = 18 > mediaErrors = 0 > checksumErrors = 0 > pathErrors = 0 > relativePerformance = 1.000 > userLocation = "" > userCondition = "replaceable" > hardware = " " > hardwareType = Rotating 7200 > nPaths = 0 active 0 total > nsdFormatVersion = Unknown > paxosAreaOffset = Unknown > paxosAreaSize = Unknown > logicalBlockSize = 512 > ----------------------------------- > I begin change the Hard disk > mmchcarrier BB1RGR --release --pdisk "e2d3s11" > I have this error > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. 
> i know the location of the Hard disk > i know the location of the Hard disk from old mmlspdisk file > mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" > > Location of pdisk e2d3s11 of recovery group BB1RGR is not known. > I read in the official documentation > 6027-3001 [E] Location of pdisk pdiskName of recovery > group recoveryGroupName is not known. > Explanation: IBM Spectrum Scale is unable to find the > location of the given pdisk. > User response: Check the disk enclosure hardware. > > > Atmane Khiredine > HPC System Administrator | Office National de la M?t?orologie > T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : > a.khiredine at meteo.dz > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 3 Date: Tue, 17 Apr 2018 18:24:04 +0000 From: "Oesterlin, Robert" To: gpfsug main discussion list Subject: [gpfsug-discuss] Only 20 spots left! - SSUG-US Spring meeting - May 16-17th, Cambridge, Ma Message-ID: Content-Type: text/plain; charset="utf-8" The registration for the Spring meeting of the SSUG-USA is now open. This is a Free two-day and will include a large number of Spectrum Scale updates and breakout tracks. We have limited meeting space so please register early if you plan on attending. Registration and agenda details: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2018-meeting-tickets-43662759489 DATE AND TIME Wed, May 16, 2018, 9:00 AM ? Thu, May 17, 2018, 5:00 PM EDT LOCATION IBM Cambridge Innovation Center One Rogers Street Cambridge, MA 02142-1203 Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 75, Issue 27 ********************************************** From scale at us.ibm.com Tue Apr 17 22:17:29 2018 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 17 Apr 2018 14:17:29 -0700 Subject: [gpfsug-discuss] RedHat kernel support for GPFS 4.2.3.8? In-Reply-To: <41184.1523989578@turing-police.cc.vt.edu> References: <41184.1523989578@turing-police.cc.vt.edu> Message-ID: Here is the link to our GPFS FAQ which list details on supported versions. https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html#linux Search for "Table 30. IBM Spectrum Scale for Linux RedHat kernel support" and it lists the details that you are looking for. Thanks, Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. 
If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: valdis.kletnieks at vt.edu To: gpfsug-discuss at spectrumscale.org Date: 04/17/2018 12:11 PM Subject: [gpfsug-discuss] RedHat kernel support for GPFS 4.2.3.8? Sent by: gpfsug-discuss-bounces at spectrumscale.org So of course, the day after after I upgrade our GPFS/LTFS cluster to the latest releases of everything, RedHat drops about 300 new updates, include a kernel update, and I find out that GPFS 4.2.3.8 has also escaped. :) Any word if 4.2.3.7 or 4.2.3.8 play nice with the 3.10.0-862.el7 kernel? (Official support matrix still says 3.10.0-693 is "latest tested") (See attached file: att7gkev.dat) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: att7gkev.dat Type: application/octet-stream Size: 497 bytes Desc: not available URL: From chair at spectrumscale.org Wed Apr 18 07:51:58 2018 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Wed, 18 Apr 2018 07:51:58 +0100 Subject: [gpfsug-discuss] UK Group Live Streams Message-ID: <1228FBE5-7050-443F-9514-446C28683711@spectrumscale.org> Hi All, We?re hoping to have live streaming of today and some of tomorrow?s sessions from London, I?ll post links to the streams on the Spectrum Scale User Group web-site as we go through the day. Note this is the first year we?ll have tried this, so we?ll have to see how it goes! Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Wed Apr 18 13:34:22 2018 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Wed, 18 Apr 2018 12:34:22 +0000 Subject: [gpfsug-discuss] RFE Process ... Burning Issues In-Reply-To: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> References: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> Message-ID: <709659D9-9244-4A81-9282-0FB7FB459D1A@nasa.gov> While I don?t own a DeLorean I work with someone who once fixed one up, which I *think* effectively means I can jump back in time to before the deadline to submit. (And let?s be honest, with the way HPC is going it feels like we have the requisite 1.21GW of power...) However, since I can?t actually time travel back to last week, is there any possibility of an extension? On April 5, 2018 at 05:30:42 EDT, Simon Thompson (Spectrum Scale User Group Chair) wrote: Just a reminder that if you want to submit for the pilot RFE process, submissions must be in by end of next week. Judging by the responses so far, apparently the product is perfect ? Simon From: on behalf of "chair at spectrumscale.org" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Monday, 26 March 2018 at 12:52 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] RFE Process ... 
Burning Issues Hi All, We?ve been talking with product management about the RFE process and have agreed that we?ll try out a community-voting process. First up, we are piloting this idea, hopefully it will work out, but it may also need tweaks as we move forward. One of the things we?ve been asking for is for a better way for the Spectrum Scale user group community to vote on RFEs. Sure we get people posting to the list, but we?re looking at if we can make it a better/more formal process to support this. Talking with IBM, we also recognise that with a large number of RFEs, it can be difficult for them to track work tasks being completed, but with the community RFEs, there is a commitment to try and track them closely and report back on progress later in the year. To submit an RFE using this process, you must complete the form available at: https://ibm.box.com/v/EnhBlitz (Enhancement Blitz template v1.pptx) The form provides some guidance on a good and bad RFE. Sure a lot of us are techie/engineers, so please try to explain what problem you are solving rather than trying to provide a solution. (i.e. leave the technical implementation details to those with the source code). Each site is limited to 2 submissions and they will be looked over by the Spectrum Scale community leaders, we may ask people to merge requests, send back for more info etc, or there may be some that we know will just never be progressed for various reasons. At the April user group in the UK, we have an RFE (Burning issues) session planned. Submitters of the RFE will be expected to provide a 1-3 minute pitch for their RFE. We?ve placed the session at the end of the day (UK time) to try and ensure USA people can participate. Remote presentation of your RFE is fine and we plan to live-stream the session. Each person will have 3 votes to choose what they think are their highest priority requests. Again remote voting is perfectly fine but only 3 votes per person. The requests with the highest number of votes will then be given a higher chance of being implemented. There?s a possibility that some may even make the winter release cycle. Either way, we plan to track the ?chosen? RFEs more closely and provide an update at the November USA meeting (likely the SC18 one). The submission and voting process is also planned to be run again in time for the November meeting. Anyone wanting to submit an RFE for consideration should submit the form by email to rfe at spectrumscaleug.org *before* 13th April. We?ll be posting the submitted RFEs up at the box site as well, you are encouraged to visit the site regularly and check the submissions as you may want to contact the author of an RFE to provide more information/support the RFE. Anything received after this date will be held over to the November cycle. The earlier you submit, the better chance it has of being included (we plan to limit the number to be considered) and will give us time to review the RFE and come back for more information/clarification if needed. You must also be prepared to provide a 1-3 minute pitch for your RFE (in person or remote) for the UK user group meeting. You are welcome to submit any RFE you have already put into the RFE portal for this process to garner community votes for it. There is space on the form to provide the existing RFE number. If you have any comments on the process, you can also email them to rfe at spectrumscaleug.org as well. Thanks to Carl Zeite for supporting this plan? Get submitting! 
Simon (UK Group Chair) -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Apr 18 16:03:17 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 18 Apr 2018 11:03:17 -0400 Subject: [gpfsug-discuss] RFE Process ... Burning Issues In-Reply-To: <709659D9-9244-4A81-9282-0FB7FB459D1A@nasa.gov> References: <1E633CDB-606F-42B5-970E-83EF816F4894@spectrumscale.org> <709659D9-9244-4A81-9282-0FB7FB459D1A@nasa.gov> Message-ID: No, I think you'll have to find a working DeLorean, get in it and while traveling at 88 mph (141.622 kph) submit your email over an amateur packet radio network .... -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Apr 18 16:54:45 2018 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 18 Apr 2018 11:54:45 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... Message-ID: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> A few months ago I asked about limits and dynamics of traversing depended .vs independent filesets on this forum. I used the information provided to make decisions and setup our new DSS based gpfs storage system. Now I have a problem I couldn't' yet figure out how to make it work: 'project' and 'scratch' are top *independent* filesets of the same file system. 'proj1', 'proj2' are dependent filesets nested under 'project' 'scra1', 'scra2' are dependent filesets nested under 'scratch' I would like to run a purging policy on all contents under 'scratch' (which includes 'scra1', 'scra2'), and TSM backup policies on all contents under 'project' (which includes 'proj1', 'proj2'). HOWEVER: When I run the purging policy on the whole gpfs device (with both 'project' and 'scratch' filesets) * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and 'scra2' filesets under scratch are excluded (totally unexpected) * if I use FOR FILESET('scra1') I get error that scra1 is dependent fileset (Ok, that is expected) * if I use /*FOR FILESET('scratch')*/, all contents under 'project', 'proj1', 'proj2' are traversed as well, and I don't want that (it takes too much time) * if I use /*FOR FILESET('scratch')*/, and instead of the whole device I apply the policy to the /scratch mount point only, the policy still traverses all the content of 'project', 'proj1', 'proj2', which I don't want. (again, totally unexpected) QUESTION: How can I craft the syntax of the mmapplypolicy in combination with the RULE filters, so that I can traverse all the contents under the 'scratch' independent fileset, including the nested dependent filesets 'scra1','scra2', and NOT traverse the other independent filesets at all (since this takes too much time)? Thanks Jaime PS: FOR FILESET('scra*') does not work. ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From stockf at us.ibm.com Wed Apr 18 18:38:36 2018 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 18 Apr 2018 13:38:36 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... 
In-Reply-To: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> Message-ID: Would the PATH_NAME LIKE option work? Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: "Jaime Pinto" To: "gpfsug main discussion list" Date: 04/18/2018 12:55 PM Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... Sent by: gpfsug-discuss-bounces at spectrumscale.org A few months ago I asked about limits and dynamics of traversing depended .vs independent filesets on this forum. I used the information provided to make decisions and setup our new DSS based gpfs storage system. Now I have a problem I couldn't' yet figure out how to make it work: 'project' and 'scratch' are top *independent* filesets of the same file system. 'proj1', 'proj2' are dependent filesets nested under 'project' 'scra1', 'scra2' are dependent filesets nested under 'scratch' I would like to run a purging policy on all contents under 'scratch' (which includes 'scra1', 'scra2'), and TSM backup policies on all contents under 'project' (which includes 'proj1', 'proj2'). HOWEVER: When I run the purging policy on the whole gpfs device (with both 'project' and 'scratch' filesets) * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and 'scra2' filesets under scratch are excluded (totally unexpected) * if I use FOR FILESET('scra1') I get error that scra1 is dependent fileset (Ok, that is expected) * if I use /*FOR FILESET('scratch')*/, all contents under 'project', 'proj1', 'proj2' are traversed as well, and I don't want that (it takes too much time) * if I use /*FOR FILESET('scratch')*/, and instead of the whole device I apply the policy to the /scratch mount point only, the policy still traverses all the content of 'project', 'proj1', 'proj2', which I don't want. (again, totally unexpected) QUESTION: How can I craft the syntax of the mmapplypolicy in combination with the RULE filters, so that I can traverse all the contents under the 'scratch' independent fileset, including the nested dependent filesets 'scra1','scra2', and NOT traverse the other independent filesets at all (since this takes too much time)? Thanks Jaime PS: FOR FILESET('scra*') does not work. ************************************ TELL US ABOUT YOUR SUCCESS STORIES https://urldefense.proofpoint.com/v2/url?u=http-3A__www.scinethpc.ca_testimonials&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=csxqKhhBsww-1H4lJlra9UtcoY0yG6PcOeV5jYf5pYo&s=tM9JZXsRNu6EEhoFlUuWvTLwMsqbDjfDj3NDZ6elACA&e= ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=csxqKhhBsww-1H4lJlra9UtcoY0yG6PcOeV5jYf5pYo&s=V6u0XsNxHj4Mp-mu7hCZKv1AD3_GYqU-4KZzvMSQ_MQ&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From makaplan at us.ibm.com Wed Apr 18 19:00:13 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 18 Apr 2018 14:00:13 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... In-Reply-To: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> Message-ID: I suggest you remove any FOR FILESET(...) specifications from your rules and then run mmapplypolicy /path/to/the/root/directory/of/the/independent-fileset-you-wish-to-scan ... --scope inodespace -P your-policy-rules-file ... See also the (RTFineM) for the --scope option and the Directory argument of the mmapplypolicy command. That is the best, most efficient way to scan all the files that are in a particular inode-space. Also, you must have all filesets of interest "linked" and the file system must be mounted. Notice that "independent" means that the fileset name is used to denote both a fileset and an inode-space, where said inode-space contains the fileset of that name and possibly other "dependent" filesets... IF one wished to search the entire file system for files within several different filesets, one could use rules with FOR FILESET('fileset1','fileset2','and-so-on') Or even more flexibly WHERE FILESET_NAME LIKE 'sql-like-pattern-with-%s-and-maybe-_s' Or even more powerfully WHERE regex(FILESET_NAME, 'extended-regular-.*-expression') From: "Jaime Pinto" To: "gpfsug main discussion list" Date: 04/18/2018 01:00 PM Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... Sent by: gpfsug-discuss-bounces at spectrumscale.org A few months ago I asked about limits and dynamics of traversing depended .vs independent filesets on this forum. I used the information provided to make decisions and setup our new DSS based gpfs storage system. Now I have a problem I couldn't' yet figure out how to make it work: 'project' and 'scratch' are top *independent* filesets of the same file system. 'proj1', 'proj2' are dependent filesets nested under 'project' 'scra1', 'scra2' are dependent filesets nested under 'scratch' I would like to run a purging policy on all contents under 'scratch' (which includes 'scra1', 'scra2'), and TSM backup policies on all contents under 'project' (which includes 'proj1', 'proj2'). HOWEVER: When I run the purging policy on the whole gpfs device (with both 'project' and 'scratch' filesets) * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and 'scra2' filesets under scratch are excluded (totally unexpected) * if I use FOR FILESET('scra1') I get error that scra1 is dependent fileset (Ok, that is expected) * if I use /*FOR FILESET('scratch')*/, all contents under 'project', 'proj1', 'proj2' are traversed as well, and I don't want that (it takes too much time) * if I use /*FOR FILESET('scratch')*/, and instead of the whole device I apply the policy to the /scratch mount point only, the policy still traverses all the content of 'project', 'proj1', 'proj2', which I don't want. (again, totally unexpected) QUESTION: How can I craft the syntax of the mmapplypolicy in combination with the RULE filters, so that I can traverse all the contents under the 'scratch' independent fileset, including the nested dependent filesets 'scra1','scra2', and NOT traverse the other independent filesets at all (since this takes too much time)? Thanks Jaime PS: FOR FILESET('scra*') does not work. 
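To make the approach above concrete, here is a minimal sketch of what the scratch purge scan could look like; the mount point, rule names, list name and the 60-day age threshold are illustrative assumptions, not taken from Jaime's setup:

RULE 'gen_list' EXTERNAL LIST 'expired' EXEC ''
RULE 'find_old_scratch' LIST 'expired'
  WHERE FILESET_NAME LIKE 'scra%'
    AND (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 60

mmapplypolicy /gpfs/fs0/scratch -P purge.rules --scope inodespace -I defer -f /tmp/scratch-purge

Pointed at the junction of the independent 'scratch' fileset with --scope inodespace, the scan stays inside that inode space (so it covers the dependent 'scra1' and 'scra2' filesets) and never walks 'project'. The LIKE pattern uses SQL wildcards, so 'scra%' also matches 'scratch' itself, which is what FOR FILESET('scra*') could not express.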
************************************ TELL US ABOUT YOUR SUCCESS STORIES https://urldefense.proofpoint.com/v2/url?u=http-3A__www.scinethpc.ca_testimonials&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=IpwHlr0YNr7rgV7gI8Y2sxIELLIwA15KK4nBnv9BYWk&e= ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=aff0vMJkKd-Z3pw3-jckmI3ejqXh8aSr8rxkKf3OGdk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Apr 18 19:51:29 2018 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 18 Apr 2018 14:51:29 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... In-Reply-To: References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> Message-ID: <20180418145129.63803tvsotr1960h@support.scinet.utoronto.ca> Ok Marc and Frederick, there is hope. I'll conduct more experiments and report back Thanks for the suggestions. Jaime Quoting "Marc A Kaplan" : > I suggest you remove any FOR FILESET(...) specifications from your rules > and then run > > mmapplypolicy > /path/to/the/root/directory/of/the/independent-fileset-you-wish-to-scan > ... --scope inodespace -P your-policy-rules-file ... > > See also the (RTFineM) for the --scope option and the Directory argument > of the mmapplypolicy command. > > That is the best, most efficient way to scan all the files that are in a > particular inode-space. Also, you must have all filesets of interest > "linked" and the file system must be mounted. > > Notice that "independent" means that the fileset name is used to denote > both a fileset and an inode-space, where said inode-space contains the > fileset of that name and possibly other "dependent" filesets... > > IF one wished to search the entire file system for files within several > different filesets, one could use rules with > > FOR FILESET('fileset1','fileset2','and-so-on') > > Or even more flexibly > > WHERE FILESET_NAME LIKE 'sql-like-pattern-with-%s-and-maybe-_s' > > Or even more powerfully > > WHERE regex(FILESET_NAME, 'extended-regular-.*-expression') > > > > > > From: "Jaime Pinto" > To: "gpfsug main discussion list" > Date: 04/18/2018 01:00 PM > Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > A few months ago I asked about limits and dynamics of traversing > depended .vs independent filesets on this forum. I used the > information provided to make decisions and setup our new DSS based > gpfs storage system. Now I have a problem I couldn't' yet figure out > how to make it work: > > 'project' and 'scratch' are top *independent* filesets of the same > file system. 
> > 'proj1', 'proj2' are dependent filesets nested under 'project' > 'scra1', 'scra2' are dependent filesets nested under 'scratch' > > I would like to run a purging policy on all contents under 'scratch' > (which includes 'scra1', 'scra2'), and TSM backup policies on all > contents under 'project' (which includes 'proj1', 'proj2'). > > HOWEVER: > When I run the purging policy on the whole gpfs device (with both > 'project' and 'scratch' filesets) > > * if I use FOR FILESET('scratch') on the list rules, the 'scra1' and > 'scra2' filesets under scratch are excluded (totally unexpected) > > * if I use FOR FILESET('scra1') I get error that scra1 is dependent > fileset (Ok, that is expected) > > * if I use /*FOR FILESET('scratch')*/, all contents under 'project', > 'proj1', 'proj2' are traversed as well, and I don't want that (it > takes too much time) > > * if I use /*FOR FILESET('scratch')*/, and instead of the whole device > I apply the policy to the /scratch mount point only, the policy still > traverses all the content of 'project', 'proj1', 'proj2', which I > don't want. (again, totally unexpected) > > QUESTION: > > How can I craft the syntax of the mmapplypolicy in combination with > the RULE filters, so that I can traverse all the contents under the > 'scratch' independent fileset, including the nested dependent filesets > 'scra1','scra2', and NOT traverse the other independent filesets at > all (since this takes too much time)? > > Thanks > Jaime > > > PS: FOR FILESET('scra*') does not work. > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > > https://urldefense.proofpoint.com/v2/url?u=http-3A__www.scinethpc.ca_testimonials&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=IpwHlr0YNr7rgV7gI8Y2sxIELLIwA15KK4nBnv9BYWk&e= > > ************************************ > --- > Jaime Pinto - Storage Analyst > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - www.computecanada.ca > University of Toronto > 661 University Ave. (MaRS), Suite 1140 > Toronto, ON, M5G1M1 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of > Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE&s=aff0vMJkKd-Z3pw3-jckmI3ejqXh8aSr8rxkKf3OGdk&e= > > > > > > > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From makaplan at us.ibm.com Wed Apr 18 22:22:22 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 18 Apr 2018 17:22:22 -0400 Subject: [gpfsug-discuss] mmapplypolicy on nested filesets ... 
In-Reply-To: <20180418145129.63803tvsotr1960h@support.scinet.utoronto.ca> References: <20180418115445.8603670sy6ee6fk5@support.scinet.utoronto.ca> <20180418145129.63803tvsotr1960h@support.scinet.utoronto.ca> Message-ID: It's more than hope. It works just as I wrote and documented and tested. Personally, I find the nomenclature for filesets and inode spaces as "independent filesets" unfortunate and prone to misunderstanding and confusion. But that train left the station a few years ago, so we just live with it... -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dirk.Thometzek at rohde-schwarz.com Thu Apr 19 08:44:00 2018 From: Dirk.Thometzek at rohde-schwarz.com (Dirk Thometzek) Date: Thu, 19 Apr 2018 07:44:00 +0000 Subject: [gpfsug-discuss] Career Opportunity Message-ID: <79f9ca3347214203b37003ac5dc288c7@rohde-schwarz.com> Dear all, I am working with a development team located in Hanover, Germany. We are currently looking for a Spectrum Scale professional with long-term experience to support our team in a senior development position. If you are interested, please send me a private message to: dirk.thometzek at rohde-schwarz.com Best regards, Dirk Thometzek Product Management File Based Media Solutions [RS_Logo_cyan_rgb - Klein] Rohde & Schwarz GmbH & Co. KG Pf. 80 14 69, D-81614 Muenchen Abt. MU Phone: +49 511 67807-0 Geschäftsführung / Executive Board: Christian Leicher (Vorsitzender / Chairman), Peter Riedel Sitz der Gesellschaft / Company's Place of Business: München | Registereintrag / Commercial Register No.: HRA 16 270 Persönlich haftender Gesellschafter / Personally Liable Partner: RUSEG Verwaltungs-GmbH | Sitz der Gesellschaft / Company's Place of Business: München | Registereintrag / Commercial Register No.: HRB 7 534 | Umsatzsteuer-Identifikationsnummer (USt-IdNr.) / VAT Identification No.: DE 130 256 683 | Elektro-Altgeräte Register (EAR) / WEEE Register No.: DE 240 437 86 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 6729 bytes Desc: not available URL: From delmard at br.ibm.com Thu Apr 19 14:37:14 2018 From: delmard at br.ibm.com (Delmar Demarchi) Date: Thu, 19 Apr 2018 11:37:14 -0200 Subject: [gpfsug-discuss] API - listing quotas Message-ID: Hello Experts. I'm trying to collect information on fileset quotas using the API. I'm using this link as reference: https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adm_apiv2version2.htm Do you know of any issue with Scale 5.0.x and the API, or what I have to change in my command to collect this information? Following the instructions in the Knowledge Center, we tried to list the FILESET quota using GET, but only USR and GRP were reported. Listing all quotas (also using GET), I found my quota there.
See my sample: curl -k -u admin:passw0rd -XGET -H content-type:application/json " https://xx.xx.xx.xx:443/scalemgmt/v2/filesystems/fs1/filesets/sicredi/quotas " { "quotas" : [ { "blockGrace" : "none", "blockLimit" : 0, "blockQuota" : 0, "filesGrace" : "none", "filesLimit" : 0, "filesQuota" : 0, "filesetName" : "sicredi", "filesystemName" : "fs1", "isDefaultQuota" : true, "objectId" : 0, "quotaId" : 454, "quotaType" : "GRP" }, { "blockGrace" : "none", "blockLimit" : 0, "blockQuota" : 0, "filesGrace" : "none", "filesLimit" : 0, "filesQuota" : 0, "filesetName" : "sicredi", "filesystemName" : "fs1", "isDefaultQuota" : true, "objectId" : 0, "quotaId" : 501, "quotaType" : "USR" } ], "status" : { "code" : 200, "message" : "The request finished successfully." } }[root at lbsgpfs05 ~]# curl -k -u admin:passw0rd -XGET -H content-type:application/json " https://xx.xx.xx.xx:443/scalemgmt/v2/filesystems/fs1/quotas" { "quotas" : [ { "blockGrace" : "none", "blockInDoubt" : 0, "blockLimit" : 0, "blockQuota" : 0, "blockUsage" : 512, "filesGrace" : "none", "filesInDoubt" : 0, "filesLimit" : 0, "filesQuota" : 0, "filesUsage" : 1, "filesystemName" : "fs1", "isDefaultQuota" : false, "objectId" : 0, "objectName" : "root", "quotaId" : 366, "quotaType" : "FILESET" }, { "blockGrace" : "none", "blockInDoubt" : 0, "blockLimit" : 6598656, "blockQuota" : 6598656, "blockUsage" : 5670208, "filesGrace" : "none", "filesInDoubt" : 0, "filesLimit" : 0, "filesQuota" : 0, "filesUsage" : 5, "filesystemName" : "fs1", "isDefaultQuota" : false, "objectId" : 1, "objectName" : "sicredi", "quotaId" : 367, "quotaType" : "FILESET" } "status" : { "code" : 200, "message" : "The request finished successfully." } } mmlsquota -j sicredi fs1 --block-size auto Block Limits | File Limits Filesystem type blocks quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs1 FILESET 5.408G 6.293G 6.293G 0 none | 5 0 0 0 none mmrepquota -a *** Report for USR GRP FILESET quotas on fs1 Block Limits | File Limits Name fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace entryType sicredi root FILESET 5670208 6598656 6598656 0 none | 5 0 0 0 none e Regards, | Abrazos, | Atenciosamente, Delmar Demarchi .'. Power and Storage Services Specialist Phone: 55-19-2132-9469 | Mobile: 55-19-9 9792-1323 E-mail: delmard at br.ibm.com www.ibm.com/systems/services/labservices -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 6614 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 2022 bytes Desc: not available URL: From A.Wolf-Reber at de.ibm.com Thu Apr 19 14:56:24 2018 From: A.Wolf-Reber at de.ibm.com (Alexander Wolf) Date: Thu, 19 Apr 2018 13:56:24 +0000 Subject: [gpfsug-discuss] API - listing quotas In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152411877729038.png Type: image/png Size: 1134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.152411877729039.png Type: image/png Size: 6645 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Image.152411877729040.png Type: image/png Size: 1134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image._2_37157E80371579900049F11E83258274.jpg Type: image/jpeg Size: 6614 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image._1_371596C0371591D00049F11E83258274.gif Type: image/gif Size: 2022 bytes Desc: not available URL: From Renar.Grunenberg at huk-coburg.de Fri Apr 20 15:01:55 2018 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Fri, 20 Apr 2018 14:01:55 +0000 Subject: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale Message-ID: Hallo Simon, is there any reason why the presentation from Yong ZY Zheng (Cognitive, ML, Hortonworks) is not linked? Renar Grunenberg Abteilung Informatik - Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas. ________________________________ Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Apr 20 15:12:11 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Fri, 20 Apr 2018 14:12:11 +0000 Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale In-Reply-To: References: Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6@bham.ac.uk> Sorry, it was a typo from my side. For the talks that are missing, we are chasing copies of the slides that we can release. Simon From: on behalf of "Renar.Grunenberg at huk-coburg.de" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Friday, 20 April 2018 at 15:02 To: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale Hallo Simon, is there any reason why the presentation from Yong ZY Zheng (Cognitive, ML, Hortonworks) is not linked? Renar Grunenberg Abteilung Informatik - Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr.
9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas. ________________________________ Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nick.savva at adventone.com Sun Apr 22 13:18:23 2018 From: nick.savva at adventone.com (Nick Savva) Date: Sun, 22 Apr 2018 12:18:23 +0000 Subject: [gpfsug-discuss] AFM cache re-link Message-ID: Hi all, I always preface my questions with an apology up front in case this has been covered before. I am curious whether anyone has tested relinking an AFM cache to a new home where the new home, the old home and the cache hold exactly the same data. What is the behaviour? The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work, but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case: I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production, where prod and DR are GPFS file systems holding a replica set of the data. Again, it is to avoid moving TBs across the link that are already there. Appreciate the help in advance, Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From sannaik2 at in.ibm.com Sun Apr 22 17:01:14 2018 From: sannaik2 at in.ibm.com (Sandeep Naik1) Date: Sun, 22 Apr 2018 21:31:14 +0530 Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location In-Reply-To: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz> Message-ID: Hi Atmane, Can you include the output of the command tslsenclslot -a from both nodes? Is there anything in the logs related to this pdisk?
Thanks, Sandeep Naik Elastic Storage server / GPFS Test ETZ-B, Hinjewadi Pune India (+91) 8600994314 From: atmane khiredine To: "gpfsug-discuss at spectrumscale.org" Date: 17/04/2018 02:09 PM Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location Sent by: gpfsug-discuss-bounces at spectrumscale.org dear all, I want to understand how GNR/GSS/ESS stores information about the pdisk location I looked in the configuration file /var/mmfs/gen/mmfsNodeData /var/mmfs/gen/mmsdrfs /var/mmfs/gen/mmfs.cfg but no location of pdisk this is real scenario of unknown location this is the output from GNR/GSS/ESS ----------------------------------- [root at ess1 ~]# mmlspdisk BB1RGR --not-ok pdisk: replacementPriority = 2.00 name = "e2d3s11" device = "" recoveryGroup = "BB1RGR" declusteredArray = "DA2" state = "missing/noPath/systemDrain/noRGD/noVCD/noData" capacity = 3000034656256 freeSpace = 2997887172608 location = "" WWN = "naa.5000C50056717727" server = "ess1-ib0" reads = 106800946 writes = 10414075 IOErrors = 1216 IOTimeouts = 18 mediaErrors = 0 checksumErrors = 0 pathErrors = 0 relativePerformance = 1.000 userLocation = "" userCondition = "replaceable" hardware = " " hardwareType = Rotating 7200 nPaths = 0 active 0 total nsdFormatVersion = Unknown paxosAreaOffset = Unknown paxosAreaSize = Unknown logicalBlockSize = 512 ----------------------------------- I begin change the Hard disk mmchcarrier BB1RGR --release --pdisk "e2d3s11" I have this error Location of pdisk e2d3s11 of recovery group BB1RGR is not known. i know the location of the Hard disk i know the location of the Hard disk from old mmlspdisk file mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" Location of pdisk e2d3s11 of recovery group BB1RGR is not known. I read in the official documentation 6027-3001 [E] Location of pdisk pdiskName of recovery group recoveryGroupName is not known. Explanation: IBM Spectrum Scale is unable to find the location of the given pdisk. User response: Check the disk enclosure hardware. Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=DXkezTwrVXsEOfvoqY7_DLS86P5FtQszjm9zok6upRU&m=dtQxM0x58-X-aWHl-3gNSQq_YWWdIMi_GcStOMr9Tt0&s=SJIGLOxE4hu-R8p5at9i6BvxDkyPQn4J6LiJjaQE180&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From a.khiredine at meteo.dz Sun Apr 22 18:07:01 2018 From: a.khiredine at meteo.dz (atmane khiredine) Date: Sun, 22 Apr 2018 17:07:01 +0000 Subject: Re: [gpfsug-discuss] GNR/GSS/ESS pdisk location In-Reply-To: References: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0B94@SDEB-EXC02.meteo.dz>, Message-ID: <4B32CB5C696F2849BDEF7DF9EACE884B85DD0D0D@SDEB-EXC02.meteo.dz> Hi Sandeep, after doing some research in the cluster I find that pdisk e3d3s06, which should be in Enclosure 3 Drawer 3 Slot 6, is now in Enclosure 2 Drawer 3 Slot 11, the slot that is supposed to belong to e2d3s11. I used the old drive of pdisk e2d3s11 for pdisk e3d3s06, so pdisk e3d3s06 is now in the wrong location, SV30708502-3-11. ---- mmlspdisk e3d3s06 name = "e3d3s06" device = "/dev/sdfa,/dev/sdir" location = "SV30708502-3-11" userLocation = "Rack ess1 U11-14, Enclosure 1818-80E-SV30708502 Drawer 3 Slot 11" ---- and pdisk e2d3s11 has no location: mmlspdisk e2d3s11 name = "e2d3s11" device = " " location = "" userLocation = "" --- If I use the script replace-at-location for e3d3s06 with SV25304899-3-6: replace-at-location BB1RGL e3d3s06 SV25304899-3-6 replace-at-location: error: pdisk e3d3s06 of RG BB1RGL is in location SV30708502-3-11, not SV25304899-3-6. Check the pdisk name and location code before continuing. If I use the script replace-at-location for e3d3s06 with SV30708502-3-11: replace-at-location BB1RGL e3d3s06 SV30708502-3-11 location SV30708502-3-11 has a location If I use replace-at-location BB1RGR e2d3s11 SV30708502-3-11: Disk descriptor for /dev/sdfc,/dev/sdiq refers to an existing pdisk. So pdisk e3d3s06, which should be in Enclosure 3 Drawer 3 Slot 6, is now in Enclosure 2 Drawer 3 Slot 11, which is supposed to belong to e2d3s11, and the disk found in location SV30708502-3-11 is not a blank disk because it is already used by e3d3s06. Why has e3d3s06 taken the place of e2d3s11, and why is it still working? Thanks Atmane Khiredine HPC System Administrator | Office National de la Météorologie Tél : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz ________________________________________ De : Sandeep Naik1 [sannaik2 at in.ibm.com] Envoyé : dimanche 22 avril 2018 17:01 À : atmane khiredine Cc : gpfsug main discussion list Objet : Re: [gpfsug-discuss] GNR/GSS/ESS pdisk location Hi Atmane, Can you include the output of the command tslsenclslot -a from both nodes? Is there anything in the logs related to this pdisk?
Thanks, Sandeep Naik Elastic Storage server / GPFS Test ETZ-B, Hinjewadi Pune India (+91) 8600994314 From: atmane khiredine To: "gpfsug-discuss at spectrumscale.org" Date: 17/04/2018 02:09 PM Subject: [gpfsug-discuss] GNR/GSS/ESS pdisk location Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ dear all, I want to understand how GNR/GSS/ESS stores information about the pdisk location I looked in the configuration file /var/mmfs/gen/mmfsNodeData /var/mmfs/gen/mmsdrfs /var/mmfs/gen/mmfs.cfg but no location of pdisk this is real scenario of unknown location this is the output from GNR/GSS/ESS ----------------------------------- [root at ess1 ~]# mmlspdisk BB1RGR --not-ok pdisk: replacementPriority = 2.00 name = "e2d3s11" device = "" recoveryGroup = "BB1RGR" declusteredArray = "DA2" state = "missing/noPath/systemDrain/noRGD/noVCD/noData" capacity = 3000034656256 freeSpace = 2997887172608 location = "" WWN = "naa.5000C50056717727" server = "ess1-ib0" reads = 106800946 writes = 10414075 IOErrors = 1216 IOTimeouts = 18 mediaErrors = 0 checksumErrors = 0 pathErrors = 0 relativePerformance = 1.000 userLocation = "" userCondition = "replaceable" hardware = " " hardwareType = Rotating 7200 nPaths = 0 active 0 total nsdFormatVersion = Unknown paxosAreaOffset = Unknown paxosAreaSize = Unknown logicalBlockSize = 512 ----------------------------------- I begin change the Hard disk mmchcarrier BB1RGR --release --pdisk "e2d3s11" I have this error Location of pdisk e2d3s11 of recovery group BB1RGR is not known. i know the location of the Hard disk i know the location of the Hard disk from old mmlspdisk file mmchcarrier BB1RGR --release --pdisk "e2d3s11" --location "SV30708502-3-11" Location of pdisk e2d3s11 of recovery group BB1RGR is not known. I read in the official documentation 6027-3001 [E] Location of pdisk pdiskName of recovery group recoveryGroupName is not known. Explanation: IBM Spectrum Scale is unable to find the location of the given pdisk. User response: Check the disk enclosure hardware. Atmane Khiredine HPC System Administrator | Office National de la M?t?orologie T?l : +213 21 50 73 93 # 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=DXkezTwrVXsEOfvoqY7_DLS86P5FtQszjm9zok6upRU&m=dtQxM0x58-X-aWHl-3gNSQq_YWWdIMi_GcStOMr9Tt0&s=SJIGLOxE4hu-R8p5at9i6BvxDkyPQn4J6LiJjaQE180&e= From coetzee.ray at gmail.com Sun Apr 22 23:38:41 2018 From: coetzee.ray at gmail.com (Ray Coetzee) Date: Sun, 22 Apr 2018 23:38:41 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34 In-Reply-To: References: Message-ID: Good evening all I'm working with IBM on a PMR where ganesha is segfaulting or causing kernel panics on one group of CES nodes. We have 12 identical CES nodes split into two groups of 6 nodes each & have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was released. Only one group started having issues Monday morning where ganesha would segfault and the mounts would move over to the remaining nodes. The remaining nodes then start to fall over like dominos within minutes or hours to the point that all CES nodes are "failed" according to "mmces node list" and the VIP's are unassigned. Recovering the nodes are extremely finicky and works for a few minutes or hours before segfaulting again. 
Most times, a complete stop of Ganesha on all nodes and then starting it on only two random nodes allows mounts to recover for a while. None of the following has helped: a reboot of all nodes; refreshing the CCR config file with mmsdrrestore; removing/adding CES from nodes; reinstalling the GPFS and protocol rpms; updating to 5.0.0-2; a fresh reinstall of a node. The network checks out with no dropped packets on either the data or export networks. The only temporary fix so far has been to downrev ganesha to 2.3.2 from 2.5.3 on the affected nodes. While waiting for IBM development, has anyone seen something similar? Kind regards Ray Coetzee
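Not something Ray asked for, but for anyone stuck in the same crash loop: a rough sketch of making sure the next segfault leaves a usable core behind for the PMR. The systemd unit name and paths are assumptions (the unit is assumed to be nfs-ganesha.service as shipped with the protocol rpms); adjust to whatever your CES nodes actually run:

mkdir -p /var/crash/ganesha
sysctl -w kernel.core_pattern=/var/crash/ganesha/core.%e.%p.%t
mkdir -p /etc/systemd/system/nfs-ganesha.service.d
printf '[Service]\nLimitCORE=infinity\n' > /etc/systemd/system/nfs-ganesha.service.d/core.conf
systemctl daemon-reload

The next time the daemon aborts (or CES restarts it), the core should land under /var/crash/ganesha and can be attached to the PMR.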
-------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Mon Apr 23 00:02:09 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Mon, 23 Apr 2018 01:02:09 +0200 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34 In-Reply-To: References: Message-ID: Yes, I've been struggling with something similar this week: Ganesha dying with SIGABRT -- nothing else logged. After catching a few coredumps, it has been identified as a problem with some UDP communication during mounts from Solaris clients. Disabling UDP as a transport on the shares server-side didn't help.
It was suggested to use "mount -o tcp" or whatever the Solaris equivalent of this is -- but we haven't tested this. So far the downgrade to v2.3.2 has been our workaround. PMR: 48669,080,678 -jf
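For reference, a sketch of what forcing TCP at mount time looks like on each client type; the server address, export path and mount point below are made up:

Solaris: mount -F nfs -o vers=3,proto=tcp ces-vip:/gpfs/exports/data /mnt/data
Linux (/etc/fstab): ces-vip:/gpfs/exports/data /mnt/data nfs vers=3,proto=tcp,mountproto=tcp,hard 0 0

On Linux, proto=tcp only covers the NFS traffic itself; with NFSv3 the initial MOUNT RPC can still go out over UDP unless mountproto=tcp is given as well, which may matter if the crashes really are triggered during mounts.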
J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >> ________________________________ >> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >> Informationen. >> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich >> erhalten haben, >> informieren Sie bitte sofort den Absender und vernichten Sie diese >> Nachricht. >> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >> ist nicht gestattet. >> >> This information may contain confidential and/or privileged information. >> If you are not the intended recipient (or have received this information >> in error) please notify the >> sender immediately and destroy this information. >> Any unauthorized copying, disclosure or distribution of the material in >> this information is strictly forbidden. >> ________________________________ >> -------------- next part -------------- >> An HTML attachment was scrubbed... >> URL: > 0420/91e3d84d/attachment-0001.html> >> >> ------------------------------ >> >> Message: 2 >> Date: Fri, 20 Apr 2018 14:12:11 +0000 >> From: "Simon Thompson (IT Research Support)" >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >> Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6 at bham.ac.uk> >> Content-Type: text/plain; charset="utf-8" >> >> Sorry, it was a typo from my side. >> >> The talks that are missing we are chasing for copies of the slides that >> we can release. >> >> Simon >> >> From: on behalf of " >> Renar.Grunenberg at huk-coburg.de" >> Reply-To: "gpfsug-discuss at spectrumscale.org" < >> gpfsug-discuss at spectrumscale.org> >> Date: Friday, 20 April 2018 at 15:02 >> To: "gpfsug-discuss at spectrumscale.org" >> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale >> >> Hallo Simon, >> are there any reason why the link of the presentation from Yong ZY >> Zheng(Cognitive, ML, Hortonworks) is not linked. >> >> Renar Grunenberg >> Abteilung Informatik ? Betrieb >> >> HUK-COBURG >> Bahnhofsplatz >> 96444 Coburg >> Telefon: >> >> 09561 96-44110 >> >> Telefax: >> >> 09561 96-44104 >> >> E-Mail: >> >> Renar.Grunenberg at huk-coburg.de >> >> Internet: >> >> www.huk.de >> >> ________________________________ >> HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter >> Deutschlands a. G. in Coburg >> Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 >> Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg >> Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. >> Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans >> Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. >> ________________________________ >> Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte >> Informationen. >> Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich >> erhalten haben, >> informieren Sie bitte sofort den Absender und vernichten Sie diese >> Nachricht. >> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht >> ist nicht gestattet. >> >> This information may contain confidential and/or privileged information. >> If you are not the intended recipient (or have received this information >> in error) please notify the >> sender immediately and destroy this information. >> Any unauthorized copying, disclosure or distribution of the material in >> this information is strictly forbidden. >> ________________________________ >> -------------- next part -------------- >> An HTML attachment was scrubbed... 
>> URL: > 0420/0b8e9ffa/attachment-0001.html> >> >> ------------------------------ >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> End of gpfsug-discuss Digest, Vol 75, Issue 34 >> ********************************************** >> > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From coetzee.ray at gmail.com Mon Apr 23 00:23:55 2018 From: coetzee.ray at gmail.com (Ray Coetzee) Date: Mon, 23 Apr 2018 00:23:55 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34 In-Reply-To: References: Message-ID: Hi Jan-Frode We've been told the same regarding mounts using UDP. Our exports are already explicitly configured for TCP and the client's fstab's set to use TCP. It would be infuriating if the clients are trying UDP first irrespective of the mount options configured. Why the problem started specifically last week for both of us is interesting. Kind regards Ray Coetzee Mob: +44 759 704 7060 Skype: ray.coetzee Email: coetzee.ray at gmail.com On Mon, Apr 23, 2018 at 12:02 AM, Jan-Frode Myklebust wrote: > > Yes, I've been struggelig with something similiar this week. Ganesha dying > with SIGABRT -- nothing else logged. After catching a few coredumps, it has > been identified as a problem with some udp-communication during mounts from > solaris clients. Disabling udp as transport on the shares serverside didn't > help. It was suggested to use "mount -o tcp" or whatever the solaris > version of this is -- but we haven't tested this. So far the downgrade to > v2.3.2 has been our workaround. > > PMR: 48669,080,678 > > > -jf > > > On Mon, Apr 23, 2018 at 12:38 AM, Ray Coetzee > wrote: > >> Good evening all >> >> I'm working with IBM on a PMR where ganesha is segfaulting or causing >> kernel panics on one group of CES nodes. >> >> We have 12 identical CES nodes split into two groups of 6 nodes each & >> have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was >> released. >> >> Only one group started having issues Monday morning where ganesha would >> segfault and the mounts would move over to the remaining nodes. >> The remaining nodes then start to fall over like dominos within minutes >> or hours to the point that all CES nodes are "failed" according to >> "mmces node list" and the VIP's are unassigned. >> >> Recovering the nodes are extremely finicky and works for a few minutes or >> hours before segfaulting again. >> Most times a complete stop of Ganesha on all nodes & then only starting >> it on two random nodes allow mounts to recover for a while. >> >> None of the following has helped: >> A reboot of all nodes. >> Refresh CCR config file with mmsdrrestore >> Remove/add CES from nodes. >> Reinstall GPFS & protocol rpms >> Update to 5.0.0-2 >> Fresh reinstall of a node >> Network checks out with no dropped packets on either data or export >> networks. >> >> The only temporary fix so far has been to downrev ganesha to 2.3.2 from >> 2.5.3 on the affected nodes. >> >> While waiting for IBM development, has anyone seen something similar >> maybe? 
>>
>> Kind regards
>>
>> Ray Coetzee
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From janfrode at tanso.net  Mon Apr 23 06:00:26 2018
From: janfrode at tanso.net (Jan-Frode Myklebust)
Date: Mon, 23 Apr 2018 05:00:26 +0000
Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34
In-Reply-To: 
References: 
Message-ID: 

It started for me after upgrade from v4.2.x.x to 5.0.0.1 with RHEL7.4. Strangely not immediately, but 2 days after the upgrade (wednesday evening CET). Also I have some doubts that mount -o tcp will help, since TCP should already be the default transport. Have asked for if we can rather block this serverside using iptables. But, I expect we should get a fix soon, and we'll stick with v2.3.2 until that.

-jf
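(For reference, forcing TCP explicitly on the client side looks roughly like the lines below. This is only a sketch: the server name, export path and NFS version are placeholders, and whether it actually stops Solaris clients from trying UDP during the initial MOUNT call is exactly what is still being tested in this thread.)

    # Linux client: request TCP and NFSv3 explicitly
    mount -t nfs -o proto=tcp,vers=3 ces-cluster:/export/data /mnt/data

    # Solaris client: same idea with the Solaris mount syntax
    mount -F nfs -o proto=tcp,vers=3 ces-cluster:/export/data /mnt/data

    # or persistently in /etc/fstab on the Linux clients
    ces-cluster:/export/data  /mnt/data  nfs  proto=tcp,vers=3  0 0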
man. 23. apr. 2018 kl. 01:23 skrev Ray Coetzee :

> Hi Jan-Frode
> We've been told the same regarding mounts using UDP.
> Our exports are already explicitly configured for TCP and the client's
> fstab's set to use TCP.
> It would be infuriating if the clients are trying UDP first irrespective
> of the mount options configured.
>
> Why the problem started specifically last week for both of us is
> interesting.
>
> Kind regards
>
> Ray Coetzee
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From vpuvvada at in.ibm.com  Mon Apr 23 11:56:19 2018
From: vpuvvada at in.ibm.com (Venkateswara R Puvvada)
Date: Mon, 23 Apr 2018 16:26:19 +0530
Subject: [gpfsug-discuss] AFM cache re-link
In-Reply-To: 
References: 
Message-ID: 

What is the fileset mode ? AFM won't attempt to copy the data back to home if file data already exists (checks if file size, mtime with nano seconds granularity and number of data blocks allocated are same). For example rsync version >= 3.1.0 keeps file mtime in sync with nano seconds granularity. Copy the data from old home to new home and run failover command from cache to avoid resynching the entire data.

~Venkat (vpuvvada at in.ibm.com)

From: Nick Savva 
To: "'gpfsug-discuss at spectrumscale.org'" 
Date: 04/22/2018 05:48 PM
Subject: [gpfsug-discuss] AFM cache re-link
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi all,

I was always preface my questions with an apology first up if this has been covered before. I am curious if anyone has tested relinking an AFM cache to a new home where the new home, old home and cache have the exact same data. What is the behaviour?
The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case, I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production. Where prod/dr are gpfs filesystems with a replica set of data. Again its to avoid moving TB?s across the link that are already there. Appreciate the help in advance, Nick _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=nXbwwQdO-Ul1CumnSmAKP5UCePJCaBVsley8z-eLJgw&s=Rho3eJsFXeOseZuGqDzP33yLYKUUpyIA1DUGGtmx_LU&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Mon Apr 23 15:10:41 2018 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Mon, 23 Apr 2018 14:10:41 +0000 Subject: [gpfsug-discuss] afmPrepopEnd Callback Message-ID: Good Afternoon AFM Experts, I looked in the manual for afmPreopopEnd event variables I can extract to log something useful after a prefetch event completes. Here is the manual entry: %prepopAlreadyCachedFiles Specifies the number of files that are cached. These number of files are not read into cache because data is same between cache and home. However, when I try to install a callback like this, I get the associated error: # mmaddcallback afmCompletionReport --command /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName %filesetName %prepopCompletedReads %prepopFailedReads %prepopAlreadyCachedFiles %prepopData" mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was specified. mmaddcallback: Command failed. Examine previous error messages to determine cause. I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three %prepop variables listed: %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; %prepopdata ) validCallbackVariable="%prepopData";; Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? Unusual to see the manual ahead of the code ;) Cheers, Luke -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Apr 23 16:08:14 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Mon, 23 Apr 2018 15:08:14 +0000 Subject: [gpfsug-discuss] afmPrepopEnd Callback In-Reply-To: References: Message-ID: My very unconsidered and unsupported suggestion would be to edit mmfsfuncs on your test cluster and see if it?s actually implemented further in the code ? Simon From: on behalf of "luke.raimbach at googlemail.com" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Monday, 23 April 2018 at 15:11 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] afmPrepopEnd Callback Good Afternoon AFM Experts, I looked in the manual for afmPreopopEnd event variables I can extract to log something useful after a prefetch event completes. Here is the manual entry: %prepopAlreadyCachedFiles Specifies the number of files that are cached. These number of files are not read into cache because data is same between cache and home. 
However, when I try to install a callback like this, I get the associated error: # mmaddcallback afmCompletionReport --command /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName %filesetName %prepopCompletedReads %prepopFailedReads %prepopAlreadyCachedFiles %prepopData" mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was specified. mmaddcallback: Command failed. Examine previous error messages to determine cause. I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three %prepop variables listed: %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; %prepopdata ) validCallbackVariable="%prepopData";; Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? Unusual to see the manual ahead of the code ;) Cheers, Luke -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Mon Apr 23 20:54:41 2018 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Mon, 23 Apr 2018 19:54:41 +0000 Subject: [gpfsug-discuss] afmPrepopEnd Callback In-Reply-To: References: Message-ID: Hi Simon, Thanks for the consideration. It's a little difficult, though, to give such a flannel answer to a customer, when the manual says one thing and then the supporting code doesn't exist. I had walked through how the callback might might be constructed with the customer and then put together a simple demo script to help them program things in the future. Slightly red faced when I got rejected by the terminal! Can someone from IBM say which callback parameters are actually valid and supported? I'm programming against 4.2.3.8 in this instance. Cheers, Luke. On Mon, 23 Apr 2018, 17:08 Simon Thompson (IT Research Support), < S.J.Thompson at bham.ac.uk> wrote: > My very unconsidered and unsupported suggestion would be to edit mmfsfuncs > on your test cluster and see if it?s actually implemented further in the > code ? > > > > Simon > > > > *From: * on behalf of " > luke.raimbach at googlemail.com" > *Reply-To: *"gpfsug-discuss at spectrumscale.org" < > gpfsug-discuss at spectrumscale.org> > *Date: *Monday, 23 April 2018 at 15:11 > *To: *"gpfsug-discuss at spectrumscale.org" > > *Subject: *[gpfsug-discuss] afmPrepopEnd Callback > > > > Good Afternoon AFM Experts, > > > > I looked in the manual for afmPreopopEnd event variables I can extract to > log something useful after a prefetch event completes. Here is the manual > entry: > > > > %prepopAlreadyCachedFiles > > Specifies the number of files that are cached. > > These number of files are not read into cache > > because data is same between cache and home. > > > > However, when I try to install a callback like this, I get the associated > error: > > > > # mmaddcallback afmCompletionReport --command > /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName > %filesetName %prepopCompletedReads %prepopFailedReads > %prepopAlreadyCachedFiles %prepopData" > > mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was > specified. > > mmaddcallback: Command failed. Examine previous error messages to > determine cause. 
> > > > I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three > %prepop variables listed: > > > > %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; > > %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; > > %prepopdata ) validCallbackVariable="%prepopData";; > > > > Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? > > > > Unusual to see the manual ahead of the code ;) > > > > Cheers, > > Luke > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nick.savva at adventone.com Tue Apr 24 07:47:54 2018 From: nick.savva at adventone.com (Nick Savva) Date: Tue, 24 Apr 2018 06:47:54 +0000 Subject: [gpfsug-discuss] AFM cache re-link In-Reply-To: References: Message-ID: The caches are RO. Thanks that?s exactly what I tested, its just the infocenter threw me when it said it expects the home to be empty?.. This was the command I used mmafmctl cachefs1 failover -j NICKTESTFSET --new-target nfs://10.0.0.142/ibm/scalefs2/fsettest Appreciate the confirmation Nick From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Venkateswara R Puvvada Sent: Monday, 23 April 2018 8:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM cache re-link What is the fileset mode ? AFM won't attempt to copy the data back to home if file data already exists (checks if file size, mtime with nano seconds granularity and number of data blocks allocated are same). For example rsync version >= 3.1.0 keeps file mtime in sync with nano seconds granularity. Copy the data from old home to new home and run failover command from cache to avoid resynching the entire data. ~Venkat (vpuvvada at in.ibm.com) From: Nick Savva > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 04/22/2018 05:48 PM Subject: [gpfsug-discuss] AFM cache re-link Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi all, I was always preface my questions with an apology first up if this has been covered before. I am curious if anyone has tested relinking an AFM cache to a new home where the new home, old home and cache have the exact same data. What is the behaviour? The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case, I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production. Where prod/dr are gpfs filesystems with a replica set of data. Again its to avoid moving TB?s across the link that are already there. Appreciate the help in advance, Nick _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=nXbwwQdO-Ul1CumnSmAKP5UCePJCaBVsley8z-eLJgw&s=Rho3eJsFXeOseZuGqDzP33yLYKUUpyIA1DUGGtmx_LU&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vpuvvada at in.ibm.com Tue Apr 24 08:38:17 2018 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 24 Apr 2018 13:08:17 +0530 Subject: [gpfsug-discuss] afmPrepopEnd Callback In-Reply-To: References: Message-ID: Hi Luke, This issue has been fixed now. You could either request efix or try workaround as suggested by Simon. The following parameters are supported. prepopCompletedReads prepopFailedReads prepopData This one is missing from the mmfsfuncs and is fixed now. prepopAlreadyCachedFiles ~Venkat (vpuvvada at in.ibm.com) From: Luke Raimbach To: gpfsug main discussion list Date: 04/24/2018 01:25 AM Subject: Re: [gpfsug-discuss] afmPrepopEnd Callback Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Simon, Thanks for the consideration. It's a little difficult, though, to give such a flannel answer to a customer, when the manual says one thing and then the supporting code doesn't exist. I had walked through how the callback might might be constructed with the customer and then put together a simple demo script to help them program things in the future. Slightly red faced when I got rejected by the terminal! Can someone from IBM say which callback parameters are actually valid and supported? I'm programming against 4.2.3.8 in this instance. Cheers, Luke. On Mon, 23 Apr 2018, 17:08 Simon Thompson (IT Research Support), < S.J.Thompson at bham.ac.uk> wrote: My very unconsidered and unsupported suggestion would be to edit mmfsfuncs on your test cluster and see if it?s actually implemented further in the code ? Simon From: on behalf of " luke.raimbach at googlemail.com" Reply-To: "gpfsug-discuss at spectrumscale.org" < gpfsug-discuss at spectrumscale.org> Date: Monday, 23 April 2018 at 15:11 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] afmPrepopEnd Callback Good Afternoon AFM Experts, I looked in the manual for afmPreopopEnd event variables I can extract to log something useful after a prefetch event completes. Here is the manual entry: %prepopAlreadyCachedFiles Specifies the number of files that are cached. These number of files are not read into cache because data is same between cache and home. However, when I try to install a callback like this, I get the associated error: # mmaddcallback afmCompletionReport --command /var/mmfs/etc/afmPrepopEnd.sh --event afmPrepopEnd -N afm --parms "%fsName %filesetName %prepopCompletedReads %prepopFailedReads %prepopAlreadyCachedFiles %prepopData" mmaddcallback: Invalid callback variable "%prepopAlreadyCachedFiles" was specified. mmaddcallback: Command failed. Examine previous error messages to determine cause. I have a butcher's in /usr/lpp/mmfs/bin/mmfsfuncs and see only these three %prepop variables listed: %prepopcompletedreads ) validCallbackVariable="%prepopCompletedReads";; %prepopfailedreads ) validCallbackVariable="%prepopFailedReads";; %prepopdata ) validCallbackVariable="%prepopData";; Is the %prepopAlreadyCachedFiles not implemented? Will it be implemented? 
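(With just the variables that are supported today, a working callback is straightforward: the script named in --command receives the expanded --parms values as positional arguments. A minimal logging sketch, with an arbitrary log path:

    #!/bin/bash
    # /var/mmfs/etc/afmPrepopEnd.sh - sketch only
    # arguments arrive in the order they were listed in --parms:
    #   $1=%fsName  $2=%filesetName  $3=%prepopCompletedReads  $4=%prepopFailedReads  $5=%prepopData
    echo "$(date '+%Y-%m-%d %H:%M:%S') afmPrepopEnd fs=$1 fileset=$2 completed=$3 failed=$4 data=$5" >> /var/log/afm-prefetch.log

registered, until the efix is in place, with the same mmaddcallback command as above but with %prepopAlreadyCachedFiles left out of the --parms string.)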
Unusual to see the manual ahead of the code ;) Cheers, Luke _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=CKY14hxZ-5Ur87lPVdFwwcpuP1lfw-0_vyYhZCcf1pk&s=C058esOcmGSwBjnUblCLIJEpF4CKsXAos0Ap57R6A4Q&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Tue Apr 24 08:42:25 2018 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 24 Apr 2018 13:12:25 +0530 Subject: [gpfsug-discuss] AFM cache re-link In-Reply-To: References: Message-ID: RO cache filesets doesn't support failover command. Is NICKTESTFSET RO mode fileset ? >The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. mmafmctl failover/resync commands does not remove extra files at home, if home is empty this won't be an issue. ~Venkat (vpuvvada at in.ibm.com) From: Nick Savva To: gpfsug main discussion list Date: 04/24/2018 12:18 PM Subject: Re: [gpfsug-discuss] AFM cache re-link Sent by: gpfsug-discuss-bounces at spectrumscale.org The caches are RO. Thanks that?s exactly what I tested, its just the infocenter threw me when it said it expects the home to be empty?.. This was the command I used mmafmctl cachefs1 failover -j NICKTESTFSET --new-target nfs://10.0.0.142/ibm/scalefs2/fsettest Appreciate the confirmation Nick From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Venkateswara R Puvvada Sent: Monday, 23 April 2018 8:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM cache re-link What is the fileset mode ? AFM won't attempt to copy the data back to home if file data already exists (checks if file size, mtime with nano seconds granularity and number of data blocks allocated are same). For example rsync version >= 3.1.0 keeps file mtime in sync with nano seconds granularity. Copy the data from old home to new home and run failover command from cache to avoid resynching the entire data. ~Venkat (vpuvvada at in.ibm.com) From: Nick Savva To: "'gpfsug-discuss at spectrumscale.org'" < gpfsug-discuss at spectrumscale.org> Date: 04/22/2018 05:48 PM Subject: [gpfsug-discuss] AFM cache re-link Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, I was always preface my questions with an apology first up if this has been covered before. I am curious if anyone has tested relinking an AFM cache to a new home where the new home, old home and cache have the exact same data. What is the behaviour? The infocenter and documentation say the cache expects home to be empty. I did a small test and it seems to work but it may have happened too fast for me to notice any data movement. If anyone is interested in the use case, I am attempting to avoid pulling data from production over the link. The idea is to sync the data locally in DR to the cache, and then relink the cache to production. Where prod/dr are gpfs filesystems with a replica set of data. Again its to avoid moving TB?s across the link that are already there. 
Appreciate the help in advance, Nick _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=nXbwwQdO-Ul1CumnSmAKP5UCePJCaBVsley8z-eLJgw&s=Rho3eJsFXeOseZuGqDzP33yLYKUUpyIA1DUGGtmx_LU&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=ZSOnMkeNsw6v92UHjeMBC3XPHfpzZlHBMAOJcNpXuNE&s=dZGOYMPF40W5oLiOu-cyilyYzFr4tWalJWKjo1D7PsQ&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Apr 24 10:20:41 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 24 Apr 2018 09:20:41 +0000 Subject: [gpfsug-discuss] Converting a dependent fileset to independent Message-ID: Hi all, Is there any way, without starting over, to convert a dependent fileset to independent? My gut says no but in the spirit of not making unnecessary work I wanted to ask. Also, the documentation states that I should see a "dpnd" next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. [root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L Filesets in file system 'gpfs': Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Tue Apr 24 11:42:47 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Tue, 24 Apr 2018 12:42:47 +0200 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Message-ID: Hello, World, I'd asked this on dW last week but got no reactions: I am unsure on the viability of this approach: In a given file system, there are two internal storage pools and one external (GHI/HPSS). Certain files are created in one internal pool and are to be migrated to the other internal pool about one day after file closure. In addition, all files are to be (pre-)migrated to the external pool within one hour after closure. Hence: typically, blocks of a file being already pre-migrated to external (but not purged on the initial internal pool) are to be moved from one internal pool to the other. A slight indication that there might be some issue with such an approach I've found in a dW post some years back: https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014797943&ps=25 However, time has passed and this might work now or might not. Is there some knowledge about that in the community? If it would not be a reasonable way, we would do the internal migration before the external one, but that imposes a timely dependance we'd not have otherwise. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 
7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From makaplan at us.ibm.com Tue Apr 24 13:38:16 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 24 Apr 2018 08:38:16 -0400 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: To address this question or problem, please be more specific about: 1) commands and policy rules used to perform or control the migrations and pre-migrations. 2) how the commands are scheduled, how often, and/or by events and mmXXcallbacks. 3) how many files are "okay" and how many are ill-placed. 4) how many blocks or KB of storage are ill-placed. (One could determine this by looking at pool free/used blocks stats, then running restripe to correct all ill-placements and then looking at pool free/used stats again.) 5) As commands and migrations are executed, what is happening that you do not understand or that you believe is incorrect? Marc K of GPFS (aka "Mister Mmapplypolicy") From: "Uwe Falke" To: "gpfsug main discussion list" Date: 04/24/2018 06:43 AM Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, World, I'd asked this on dW last week but got no reactions: I am unsure on the viability of this approach: In a given file system, there are two internal storage pools and one external (GHI/HPSS). Certain files are created in one internal pool and are to be migrated to the other internal pool about one day after file closure. In addition, all files are to be (pre-)migrated to the external pool within one hour after closure. Hence: typically, blocks of a file being already pre-migrated to external (but not purged on the initial internal pool) are to be moved from one internal pool to the other. A slight indication that there might be some issue with such an approach I've found in a dW post some years back: https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014797943&ps=25 However, time has passed and this might work now or might not. Is there some knowledge about that in the community? If it would not be a reasonable way, we would do the internal migration before the external one, but that imposes a timely dependance we'd not have otherwise. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? 
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From makaplan at us.ibm.com  Tue Apr 24 13:49:05 2018
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Tue, 24 Apr 2018 08:49:05 -0400
Subject: [gpfsug-discuss] Converting a dependent fileset to independent
In-Reply-To: 
References: 
Message-ID: 

To help make sense of this, one has to understand that "independent" means a different range of inode numbers. If you have a set of files within one range of inode numbers, say 3000-5000 and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have.

During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories.
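A rough sketch of that copy-and-rename approach, using the file system name from the listing above but made-up fileset and junction names, and assuming a quiet window in which nothing touches the data:

    # create a new independent fileset (its own inode space) and link it at a temporary junction
    mmcrfileset gpfs sro_indep --inode-space new
    mmlinkfileset gpfs sro_indep -J /gpfs/sro_indep

    # copy the data across (rsync shown here; use whatever preserves the attributes you care about)
    rsync -aHAXS /gpfs/studentrecruitmentandoutreach/ /gpfs/sro_indep/

    # once verified, and with applications stopped, swap the junctions so the old path points at the new fileset
    mmunlinkfileset gpfs studentrecruitmentandoutreach
    mmunlinkfileset gpfs sro_indep
    mmlinkfileset gpfs sro_indep -J /gpfs/studentrecruitmentandoutreach

The old dependent fileset can then be cleaned up and removed with mmdelfileset once its contents are no longer needed.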
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From UWEFALKE at de.ibm.com  Tue Apr 24 16:08:03 2018
From: UWEFALKE at de.ibm.com (Uwe Falke)
Date: Tue, 24 Apr 2018 17:08:03 +0200
Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external
In-Reply-To: 
References: 
Message-ID: 

Hi, the system is not in action yet. I am just to plan the migrations right now.

Premigration would be (not listing excludes here, actual cos and FS name replaced by nn, xxx):

/* Migrate all files that are smaller than 1 GB to hpss as aggregates. */
RULE 'toHsm_aggr_cosnn' MIGRATE FROM POOL 'pool1'
  WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME)
  TO POOL 'hsm_aggr_cosnn' SHOW ('-s' FILE_SIZE)
  WHERE path_name like '/xxx/%' AND FILE_SIZE <= 1073741824
    AND (CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL '120' MINUTES)

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
High Performance Computing Services / Integrated Technology Services / Data Center Services
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefalke at de.ibm.com
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: Thomas Wolter, Sven Schooß
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From UWEFALKE at de.ibm.com  Tue Apr 24 16:20:14 2018
From: UWEFALKE at de.ibm.com (Uwe Falke)
Date: Tue, 24 Apr 2018 17:20:14 +0200
Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external
In-Reply-To: 
References: 
Message-ID: 

(Sorry, vi commands and my mailing client do not co-operate well ... please forgive the premature posting just sent.)

Hi, the system is not in action yet. I am just to plan the migrations right now.

1) + 2) Premigration would be (not listing excludes here, actual cos and FS name replaced by nn, xxx), executed once per hour:

RULE EXTERNAL POOL 'hsm_aggr_cosnn' EXEC '/opt/hpss/bin/ghi_migrate' OPTS '-a -c nn'

/* Migrate all files that are smaller than 1 GB to hpss as aggregates. */
RULE 'toHsm_aggr_cosnn' MIGRATE FROM POOL 'pool1'
  WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME)
  TO POOL 'hsm_aggr_cosnn' SHOW ('-s' FILE_SIZE)
  WHERE path_name like '/xxx/%' AND FILE_SIZE <= 1073741824
    AND (CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL '120' MINUTES)

Internal migration was originally intended like

RULE "pool0_to_pool1" MIGRATE FROM POOL 'pool0' TO POOL 'pool1'
  WHERE (CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL '720' MINUTES)

to be run once per day (plus a threshold policy to prevent filling up of the relatively small pool0 should something go wrong).

3) + 4) As that is not yet set up - no answer. I would like to prevent such things happening in the first place, therefore asking.

5) I do not understand your question. I am planning the migration set-up, and I suppose there might be the risk of trouble when doing internal migrations while having data pre-migrated to external. Indeed I am not a developer of the ILM stuff in Scale, so I cannot fully foresee what'll happen. Therefore asking.
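As an illustration of how 2) could look, the two policy files might simply be driven from cron on a file system manager node; everything below (policy file names, node class, work directory) is made up for the sketch:

    # hourly premigration to HPSS, daily internal migration pool0 -> pool1
    0 * * * *   /usr/lpp/mmfs/bin/mmapplypolicy xxx -P /var/mmfs/etc/premigrate.pol -N managerNodes -g /xxx/.policytmp -I yes --single-instance
    15 3 * * *  /usr/lpp/mmfs/bin/mmapplypolicy xxx -P /var/mmfs/etc/internal.pol  -N managerNodes -g /xxx/.policytmp -I yes --single-instance

The threshold rule mentioned above would typically be wired up separately via mmaddcallback on the lowDiskSpace/noDiskSpace events rather than run from cron.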
Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
High Performance Computing Services / Integrated Technology Services / Data Center Services
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefalke at de.ibm.com
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: Thomas Wolter, Sven Schooß
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=j-JJhNxx8XhgizXxQDuJolplgYxVRVTS6zalCPshD-0&s=oHJJ8ZT4qE3GTSPKyNRVdybeeCQHTeyoDiLg5CvC5JM&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From makaplan at us.ibm.com Tue Apr 24 17:00:39 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 24 Apr 2018 12:00:39 -0400 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: I added support to mmapplypolicy & co for HPSS (and TSM/HSM) years ago. AFAIK, it works pretty well, and pretty much works in a not-too-surprising way. I believe that besides the Spectrum Scale Admin and Command docs there are some "red book"s to help, and some IBM support people know how to use these features. When using mmapplypolicy: a) Migration and pre-migration to an external pool requires at least two rules: An EXTERNAL POOL rule to define the external pool, say 'xpool' and a MIGRATE ... TO POOL 'xpool' rule. b) Migration between GPFS native pools requires at least one MIGRATE ... FROM POOL 'apool' TO POOL 'bpool' rule. c) During any one execution of mmapplypolicy any one particular file will be subject of at most one MIGRATE rule. In other words file x will be either (pre)MIGRATEd to an external pool. OR MIGRATED between gpfs pools. BUT not both. (Hmm... well you could write a custom migration script and name that script in your EXTERNAL POOL rule and do anything you like to each file x that is chosen for "external" MIGRATion... But still just one MIGRATE rule governs file x.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Tue Apr 24 22:10:31 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 24 Apr 2018 17:10:31 -0400 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: Uwe also asked: whether it is unwise to have the external and the internal migrations in an uncoordinated fashion so that it might happen some files have been migrated to external before they undergo migration from one internal pool (pool0) to the other (pool1) That's up to the admin. IOW coordinate it as you like or not at all, depending on what you're trying to acomplish.. But the admin should understand... Whether you use mmchattr -P newpool or mmapplypolicy/Migrate TO POOL 'newpool' to do an internal, GPFS pool to GPFS pool migration there are two steps: A) Mark the newly chosen, preferred newpool in the file's inode. Then, as long as any data blocks are on GPFS disks that are NOT in newpool, the file is considered "ill-placed". B) Migrate every datablock of the file to 'newpool', by allocating a block in newpool, copy a block of data, updating the file's data pointers, etc, etc. If you say "-I defer" then only (A) is done. You can force (B) later with a restripeXX command. 
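(Illustration, not from the original message: the two steps (A) and (B) described above map onto standard commands. A minimal sketch, reusing the pool names from this thread; the file system name 'xxx' and the file path are placeholders.)
# (A) only: record pool1 as the new target pool, defer the data movement (the file becomes "ill-placed")
mmchattr -P pool1 -I defer /xxx/some/file
# (B) later: move all ill-placed data blocks of the file system in one pass
mmrestripefs xxx -p
mmapplypolicy accepts the same -I yes / -I defer choice for its MIGRATE rules, so a policy-driven internal migration can likewise be split into the marking step and a later restripe.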
If you default or say "-I yes" then (A) is done and (B) is done as part of the work of the same command (mmchattr or mmapplypolicy) - (If the command is interrupted, (B) may happen for some subset of the data blocks, leaving the file "ill-placed") Putting "external" storage into the mix -- you can save time and go faster - if you migrate completely and directly from the original pool - skip the "internal" migrate! Maybe if you're migrating but leaving a first block "stub" - you'll want to migrate to external first, and then migrate just the one block "internally"... On the other hand, if you're going to keep the whole file on GPFS storage for a while, but want to free up space in the original pool, you'll want to migrate the data to a newpool at some point... In that case you might want to pre-migrate (make a copy on HSM but not free the GPFS copy) also. Should you pre-migrate from the original pool or the newpool? Your choice! Maybe you arrange things so you pre-migrate while the data is on the faster pool. Maybe it doesn't make much difference, so you don't even think about it anymore, now that you understand that GPFS doesn't care either! ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Wed Apr 25 00:10:52 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Wed, 25 Apr 2018 01:10:52 +0200 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: Hi, Marc, thx. I understand that being premigrated to an external pool should not affect the internal migration of a file. FYI: This is not the typical "gold - silver - bronze" setup with a one-dimensional migration path. Instead, one of the internal pools (pool0) is used to receive files written in very small records, the other (pool1) is the "normal" pool and receives all other files. Files written to pool0 should move to pool1 once they are closed (i.e. complete), but pool 0 has enough capacity to live without off-migration to pool1 for a few days, thus I'd thought to keep the frequency of that migration to not more than once per day. The external pool serves as a remote async mirror to achieve some resiliency against FS failures and also unintentional file deletion (metadata / SOBAR backups and file listings to keep the HPSS coordinates of GPFS files are done regularly), only in the long run data will be purged from pool1. Thus, migration to external should be done in shorter intervals. Sounds like I can go ahead without hesitation. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? 
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Marc A Kaplan" To: gpfsug main discussion list Date: 24/04/2018 23:10 Subject: Re: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org Uwe also asked: whether it is unwise to have the external and the internal migrations in an uncoordinated fashion so that it might happen some files have been migrated to external before they undergo migration from one internal pool (pool0) to the other (pool1) That's up to the admin. IOW coordinate it as you like or not at all, depending on what you're trying to acomplish.. But the admin should understand... Whether you use mmchattr -P newpool or mmapplypolicy/Migrate TO POOL 'newpool' to do an internal, GPFS pool to GPFS pool migration there are two steps: A) Mark the newly chosen, preferred newpool in the file's inode. Then, as long as any data blocks are on GPFS disks that are NOT in newpool, the file is considered "ill-placed". B) Migrate every datablock of the file to 'newpool', by allocating a block in newpool, copy a block of data, updating the file's data pointers, etc, etc. If you say "-I defer" then only (A) is done. You can force (B) later with a restripeXX command. If you default or say "-I yes" then (A) is done and (B) is done as part of the work of the same command (mmchattr or mmapplypolicy) - (If the command is interrupted, (B) may happen for some subset of the data blocks, leaving the file "ill-placed") Putting "external" storage into the mix -- you can save time and go faster - if you migrate completely and directly from the original pool - skip the "internal" migrate! Maybe if you're migrating but leaving a first block "stub" - you'll want to migrate to external first, and then migrate just the one block "internally"... On the other hand, if you're going to keep the whole file on GPFS storage for a while, but want to free up space in the original pool, you'll want to migrate the data to a newpool at some point... In that case you might want to pre-migrate (make a copy on HSM but not free the GPFS copy) also. Should you pre-migrate from the original pool or the newpool? Your choice! Maybe you arrange things so you pre-migrate while the data is on the faster pool. Maybe it doesn't make much difference, so you don't even think about it anymore, now that you understand that GPFS doesn't care either! ;-) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From valdis.kletnieks at vt.edu Wed Apr 25 01:09:52 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Tue, 24 Apr 2018 20:09:52 -0400 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: References: Message-ID: <108483.1524614992@turing-police.cc.vt.edu> On Wed, 25 Apr 2018 01:10:52 +0200, "Uwe Falke" said: > Instead, one of the internal pools (pool0) is used to receive files > written in very small records, the other (pool1) is the "normal" pool and > receives all other files. How do you arrange that to happen? As we found out on one of our GPFS clusters, you can't use filesize as a criterion in a file placement policy because it has to pick a pool before it knows what the final filesize will be. 
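(Illustration, not from the original posts: placement rules are evaluated when a file is created, so they can only key on attributes known at that moment, such as the fileset or the file name, never the eventual size. A minimal placement-policy sketch, with made-up fileset and name-pattern values and the pool names from this thread, installed with mmchpolicy:)
/* route files by creation-time attributes; size-based decisions must wait for later MIGRATE rules */
RULE 'small_by_fileset' SET POOL 'pool0' FOR FILESET ('smallrec')
RULE 'small_by_name' SET POOL 'pool0' WHERE NAME LIKE '%smallrec%'
RULE 'default' SET POOL 'pool1'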
(The obvious-to-me method is to set filesets pointed at pools, and then attach fileset to pathnames, and then tell the users "This path is for small files, this one is for other files" and thwap any who get it wrong with a clue-by-four. ;) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From UWEFALKE at de.ibm.com Wed Apr 25 08:48:18 2018 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Wed, 25 Apr 2018 09:48:18 +0200 Subject: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external In-Reply-To: <108483.1524614992@turing-police.cc.vt.edu> References: <108483.1524614992@turing-police.cc.vt.edu> Message-ID: Hi, we rely on some scheme of file names. Splitting by path / fileset does not work here as small- and large-record data have to be co-located. Small-record files will only be recognised if carrying some magic strings in the file name. This is not a normal user system, but ingests data generated automatically, and thus a systematic naming of files is possible to a large extent. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: valdis.kletnieks at vt.edu To: gpfsug main discussion list Date: 25/04/2018 02:10 Subject: Re: [gpfsug-discuss] ILM: migrating between internal pools while premigrated to external Sent by: gpfsug-discuss-bounces at spectrumscale.org On Wed, 25 Apr 2018 01:10:52 +0200, "Uwe Falke" said: > Instead, one of the internal pools (pool0) is used to receive files > written in very small records, the other (pool1) is the "normal" pool and > receives all other files. How do you arrange that to happen? As we found out on one of our GPFS clusters, you can't use filesize as a criterion in a file placement policy because it has to pick a pool before it knows what the final filesize will be. (The obvious-to-me method is to set filesets pointed at pools, and then attach fileset to pathnames, and then tell the users "This path is for small files, this one is for other files" and thwap any who get it wrong with a clue-by-four. ;) [attachment "attnayq3.dat" deleted by Uwe Falke/Germany/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Ivano.Talamo at psi.ch Wed Apr 25 09:46:40 2018 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Wed, 25 Apr 2018 10:46:40 +0200 Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 Message-ID: Hi all, I am actually testing the collector shipped with Spectrum Scale 5.0.0-1 together with the latest grafana bridge (version 3). At the UK UG meeting I learned that this is the multi-threaded setup, so hopefully we can get better performances. But we are having a problem. 
Our existing grafana dashboards have metrics like e.g. "hostname|CPU|cpu_user". It was working and it also had a very helpful completion when creating new graphs. After the upgrade these metrics are not recognized anymore, and we are getting the following error in the grafana bridge log file: 2018-04-25 09:35:24,999 - zimonGrafanaIntf - ERROR - Metric hostnameNetwork|team0|netdev_drops_s cannot be found. Please check if the corresponding sensor is configured The only way I found to make them work is using only the real metric name, e.g. "cpu_user", and then using a filter to restrict to a host ('node'='hostname'). The problem is that in many cases the metric is complex, e.g. you want to restrict to a filesystem, to a fileset, or to a network interface, and it is not easy to get the field names to be used in the filters. So my questions are: - is this supposed to be like that, or can the old metric names be enabled somehow? - if it has to be like that, how can I get the available field names to use in the filters? And then I saw in the new collector config file this: queryport = "9084" query2port = "9094" Which one should be used by the bridge? Thank you, Ivano From r.sobey at imperial.ac.uk Wed Apr 25 10:01:33 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 25 Apr 2018 09:01:33 +0000 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: Hi Marc Yes, copying the data to a freshly created fileset and unlinking/renaming/relinking is our solution; it was always a long shot expecting to be able to convert it. Richard From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Marc A Kaplan Sent: 24 April 2018 13:49 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent To help make sense of this, one has to understand that "independent" means a different range of inode numbers. If you have a set of files within one range of inode numbers, say 3000-5000, and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have. During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories. From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 04/24/2018 05:20 AM Subject: [gpfsug-discuss] Converting a dependent fileset to independent Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi all, Is there any way, without starting over, to convert a dependent fileset to independent? My gut says no but in the spirit of not making unnecessary work I wanted to ask. Also, the documentation states that I should see a 'dpnd' next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case.
[root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L Filesets in file system 'gpfs': Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Wed Apr 25 10:42:08 2018 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Wed, 25 Apr 2018 09:42:08 +0000 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From Renar.Grunenberg at huk-coburg.de Wed Apr 25 11:44:46 2018 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Wed, 25 Apr 2018 10:44:46 +0000 Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 In-Reply-To: References: Message-ID: <0249ecfd68714722a1d90ec196b711b5@SMXRF105.msg.hukrf.de> Hallo Ivano, we change the bridge port to query2port this is the multithreaded Query port. The Bridge in Version3 select these port automatically if the pmcollector config is updated(/opt/IBM/zimon/ZIMonCollector.cfg). # The query port number defaults to 9084. queryport = "9084" query2port = "9094" We use 5.0.0.2 here. What we also change was in the Datasource Panel for the bridge in Grafana the openTSDB to ==2.3. Hope this help. Regards Renar Renar Grunenberg Abteilung Informatik ? Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ======================================================================= HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. ======================================================================= Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ======================================================================= -----Urspr?ngliche Nachricht----- Von: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] Im Auftrag von Ivano Talamo Gesendet: Mittwoch, 25. 
April 2018 10:47 An: gpfsug main discussion list Betreff: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 Hi all, I am actually testing the collector shipped with Spectrum Scale 5.0.0-1 together with the latest grafana bridge (version 3). At the UK UG meeting I learned that this is the multi-threaded setup, so hopefully we can get better performances. But we are having a problem. Our existing grafana dashboard have metrics like eg. "hostname|CPU|cpu_user". It was working and it also had a very helpful completion when creating new graphs. After the upgrade these metrics are not recognized anymore, and we are getting the following errors in the grafana bridge log file: 2018-04-25 09:35:24,999 - zimonGrafanaIntf - ERROR - Metric hostnameNetwork|team0|netdev_drops_s cannot be found. Please check if the corresponding sensor is configured The only way I found to make them work is using only the real metric name, eg "cpu_user" and then use filter to restrict to a host ('node'='hostname'). The problem is that in many cases the metric is complex, eg. you want to restrict to a filesystem, to a fileset, to a network interface. And is not easy to get the field names to be used in the filters. So my questions are: - is this supposed to be like that or the old metrics name can be enabled somehow? - if it has to be like that, how can I get the available field names to use in the filters? And then I saw in the new collector config file this: queryport = "9084" query2port = "9094" Which one should be used by the bridge? Thank you, Ivano _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Ivano.Talamo at psi.ch Wed Apr 25 12:37:02 2018 From: Ivano.Talamo at psi.ch (Ivano Talamo) Date: Wed, 25 Apr 2018 13:37:02 +0200 Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 In-Reply-To: <0249ecfd68714722a1d90ec196b711b5@SMXRF105.msg.hukrf.de> References: <0249ecfd68714722a1d90ec196b711b5@SMXRF105.msg.hukrf.de> Message-ID: Hello Renar, I also changed the bridge to openTSDB 2.3 and set it use query2port. I was only not sure that this was the multi-threaded one. But are you using the pipe-based metrics (like "hostname|CPU|cpu_user") or you use filters? Thanks, Ivano Il 25/04/18 12:44, Grunenberg, Renar ha scritto: > Hallo Ivano, > > we change the bridge port to query2port this is the multithreaded Query port. The Bridge in Version3 select these port automatically if the pmcollector config is updated(/opt/IBM/zimon/ZIMonCollector.cfg). > # The query port number defaults to 9084. > queryport = "9084" > query2port = "9094" > We use 5.0.0.2 here. What we also change was in the Datasource Panel for the bridge in Grafana the openTSDB to ==2.3. Hope this help. > > Regards Renar > > > > Renar Grunenberg > Abteilung Informatik ? Betrieb > > HUK-COBURG > Bahnhofsplatz > 96444 Coburg > Telefon: 09561 96-44110 > Telefax: 09561 96-44104 > E-Mail: Renar.Grunenberg at huk-coburg.de > Internet: www.huk.de > ======================================================================= > HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg > Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 > Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg > Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. > Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. 
> ======================================================================= > Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. > Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, > informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. > Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. > > This information may contain confidential and/or privileged information. > If you are not the intended recipient (or have received this information in error) please notify the > sender immediately and destroy this information. > Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. > ======================================================================= > > -----Urspr?ngliche Nachricht----- > Von: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] Im Auftrag von Ivano Talamo > Gesendet: Mittwoch, 25. April 2018 10:47 > An: gpfsug main discussion list > Betreff: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 > > Hi all, > > I am actually testing the collector shipped with Spectrum Scale 5.0.0-1 > together with the latest grafana bridge (version 3). At the UK UG > meeting I learned that this is the multi-threaded setup, so hopefully we > can get better performances. > > But we are having a problem. Our existing grafana dashboard have metrics > like eg. "hostname|CPU|cpu_user". It was working and it also had a very > helpful completion when creating new graphs. > After the upgrade these metrics are not recognized anymore, and we are > getting the following errors in the grafana bridge log file: > > 2018-04-25 09:35:24,999 - zimonGrafanaIntf - ERROR - Metric > hostnameNetwork|team0|netdev_drops_s cannot be found. Please check if > the corresponding sensor is configured > > The only way I found to make them work is using only the real metric > name, eg "cpu_user" and then use filter to restrict to a host > ('node'='hostname'). The problem is that in many cases the metric is > complex, eg. you want to restrict to a filesystem, to a fileset, to a > network interface. And is not easy to get the field names to be used in > the filters. > > So my questions are: > - is this supposed to be like that or the old metrics name can be > enabled somehow? > - if it has to be like that, how can I get the available field names to > use in the filters? > > > And then I saw in the new collector config file this: > > queryport = "9084" > query2port = "9094" > > Which one should be used by the bridge? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From Renar.Grunenberg at huk-coburg.de Wed Apr 25 13:11:53 2018 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Wed, 25 Apr 2018 12:11:53 +0000 Subject: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 In-Reply-To: References: <0249ecfd68714722a1d90ec196b711b5@SMXRF105.msg.hukrf.de> Message-ID: <0815ca423fad48f9b3f149cd2eb1b143@SMXRF105.msg.hukrf.de> Hallo Ivano We use filter only. For cpu_user ->node = pm_filter($byNode) Regards Renar Renar Grunenberg Abteilung Informatik ? 
Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ======================================================================= HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. ======================================================================= Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ======================================================================= -----Urspr?ngliche Nachricht----- Von: Ivano Talamo [mailto:Ivano.Talamo at psi.ch] Gesendet: Mittwoch, 25. April 2018 13:37 An: gpfsug main discussion list ; Grunenberg, Renar Betreff: Re: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 Hello Renar, I also changed the bridge to openTSDB 2.3 and set it use query2port. I was only not sure that this was the multi-threaded one. But are you using the pipe-based metrics (like "hostname|CPU|cpu_user") or you use filters? Thanks, Ivano Il 25/04/18 12:44, Grunenberg, Renar ha scritto: > Hallo Ivano, > > we change the bridge port to query2port this is the multithreaded Query port. The Bridge in Version3 select these port automatically if the pmcollector config is updated(/opt/IBM/zimon/ZIMonCollector.cfg). > # The query port number defaults to 9084. > queryport = "9084" > query2port = "9094" > We use 5.0.0.2 here. What we also change was in the Datasource Panel for the bridge in Grafana the openTSDB to ==2.3. Hope this help. > > Regards Renar > > > > Renar Grunenberg > Abteilung Informatik ? Betrieb > > HUK-COBURG > Bahnhofsplatz > 96444 Coburg > Telefon: 09561 96-44110 > Telefax: 09561 96-44104 > E-Mail: Renar.Grunenberg at huk-coburg.de > Internet: www.huk.de > ======================================================================= > HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg > Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 > Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg > Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. > Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. > ======================================================================= > Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. > Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, > informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. 
> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. > > This information may contain confidential and/or privileged information. > If you are not the intended recipient (or have received this information in error) please notify the > sender immediately and destroy this information. > Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. > ======================================================================= > > -----Urspr?ngliche Nachricht----- > Von: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] Im Auftrag von Ivano Talamo > Gesendet: Mittwoch, 25. April 2018 10:47 > An: gpfsug main discussion list > Betreff: [gpfsug-discuss] problems with collector 5 and grafana bridge 3 > > Hi all, > > I am actually testing the collector shipped with Spectrum Scale 5.0.0-1 > together with the latest grafana bridge (version 3). At the UK UG > meeting I learned that this is the multi-threaded setup, so hopefully we > can get better performances. > > But we are having a problem. Our existing grafana dashboard have metrics > like eg. "hostname|CPU|cpu_user". It was working and it also had a very > helpful completion when creating new graphs. > After the upgrade these metrics are not recognized anymore, and we are > getting the following errors in the grafana bridge log file: > > 2018-04-25 09:35:24,999 - zimonGrafanaIntf - ERROR - Metric > hostnameNetwork|team0|netdev_drops_s cannot be found. Please check if > the corresponding sensor is configured > > The only way I found to make them work is using only the real metric > name, eg "cpu_user" and then use filter to restrict to a host > ('node'='hostname'). The problem is that in many cases the metric is > complex, eg. you want to restrict to a filesystem, to a fileset, to a > network interface. And is not easy to get the field names to be used in > the filters. > > So my questions are: > - is this supposed to be like that or the old metrics name can be > enabled somehow? > - if it has to be like that, how can I get the available field names to > use in the filters? > > > And then I saw in the new collector config file this: > > queryport = "9084" > query2port = "9094" > > Which one should be used by the bridge? > > Thank you, > Ivano > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From david_johnson at brown.edu Wed Apr 25 13:14:27 2018 From: david_johnson at brown.edu (david_johnson at brown.edu) Date: Wed, 25 Apr 2018 08:14:27 -0400 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: We use a dependent fileset for each research group / investigator. We do this mainly so we can apply fileset quotas. We tried independent filesets but they were quite inconvenient: 1) limited number independent filesets could be created compared to dependent 2) requirement to manage number of inodes allocated to each and every independent fileset There may have been other issues but we create a new dependent fileset whenever a new researcher joins our cluster. 
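(Illustration, not part of the original message: a sketch of the operational difference being discussed. The file system name 'gpfs' is taken from the mmlsfileset output quoted in this thread; fileset names, junction paths and limits are placeholders, and quotas are assumed to be enabled on the file system.)
# Dependent fileset: shares the root inode space, so no per-fileset inode sizing is needed
mmcrfileset gpfs labgroup1
mmlinkfileset gpfs labgroup1 -J /gpfs/labgroup1
mmsetquota gpfs:labgroup1 --block 10T:10T
# Independent fileset: gets its own inode space, whose size has to be chosen and managed per fileset
mmcrfileset gpfs labgroup2 --inode-space new --inode-limit 1000000:500000
mmlinkfileset gpfs labgroup2 -J /gpfs/labgroup2
mmsetquota gpfs:labgroup2 --block 10T:10T
(Only the independent variant can carry its own fileset-level snapshots, which is the trade-off raised elsewhere in this thread.)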
-- ddj Dave Johnson > On Apr 25, 2018, at 5:42 AM, Daniel Kidger wrote: > > It would though be a nice to have feature: to leave the file data where it was and just move the metadata into its own inode space? > > A related question though: > In what case do people create new *dependant* filesets ? I can't see many cases where an independent fileset would be just as valid. > Daniel > > > > > Dr Daniel Kidger > IBM Technical Sales Specialist > Software Defined Solution Sales > > +44-(0)7818 522 266 > daniel.kidger at uk.ibm.com > > > > ----- Original message ----- > From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > Date: Wed, Apr 25, 2018 10:01 AM > > Hi Marc > > > > Yes, copying the data to a freshly created fileset and unlinking/renaming/relinking is our solution; it was always a long shot expecting to be able to convert it ? > > > > Richard > > > > From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Marc A Kaplan > Sent: 24 April 2018 13:49 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > > > > To help make sense of this, one has to understand that "independent" means a different range of inode numbers. > > If you have a set of files within one range of inode numbers, say 3000-5000 and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have. > > During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories. > > > > From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 04/24/2018 05:20 AM > Subject: [gpfsug-discuss] Converting a dependent fileset to independent > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > > Hi all, > > Is there any way, without starting over, to convert a dependent fileset to independent? > > My gut says no but in the spirit of not making unnecessary work I wanted to ask. > > Also, the documentation states that I should see a ?dpnd? next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. 
> > [root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L > Filesets in file system 'gpfs': > Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment > studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 > > Thanks > Richard_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=q3YHYji1-ohlnu_yLuOGmxLnmk9znaxUJdg04aGLU_U&s=Axp36hWpcuKZbTMbdVde_ifZreMvOlR5sHUYX3A79hQ&e= > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Wed Apr 25 13:35:00 2018 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 25 Apr 2018 12:35:00 +0000 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: You can apply fileset quotas to independent filesets to you know. Sorry if that sounds passive aggressive, not meant to be! Main reason for us NOT to use dependent filesets is lack of snapshotting. Richard From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of david_johnson at brown.edu Sent: 25 April 2018 13:14 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent We use a dependent fileset for each research group / investigator. We do this mainly so we can apply fileset quotas. We tried independent filesets but they were quite inconvenient: 1) limited number independent filesets could be created compared to dependent 2) requirement to manage number of inodes allocated to each and every independent fileset There may have been other issues but we create a new dependent fileset whenever a new researcher joins our cluster. -- ddj Dave Johnson On Apr 25, 2018, at 5:42 AM, Daniel Kidger > wrote: It would though be a nice to have feature: to leave the file data where it was and just move the metadata into its own inode space? A related question though: In what case do people create new *dependant* filesets ? I can't see many cases where an independent fileset would be just as valid. Daniel [IBM Storage Professional Badge] [Image removed by sender.] 
Dr Daniel Kidger IBM Technical Sales Specialist Software Defined Solution Sales +44-(0)7818 522 266 daniel.kidger at uk.ibm.com ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list > Cc: Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent Date: Wed, Apr 25, 2018 10:01 AM Hi Marc Yes, copying the data to a freshly created fileset and unlinking/renaming/relinking is our solution; it was always a long shot expecting to be able to convert it ? Richard From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Marc A Kaplan Sent: 24 April 2018 13:49 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent To help make sense of this, one has to understand that "independent" means a different range of inode numbers. If you have a set of files within one range of inode numbers, say 3000-5000 and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have. During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories. From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 04/24/2018 05:20 AM Subject: [gpfsug-discuss] Converting a dependent fileset to independent Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi all, Is there any way, without starting over, to convert a dependent fileset to independent? My gut says no but in the spirit of not making unnecessary work I wanted to ask. Also, the documentation states that I should see a ?dpnd? next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. [root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L Filesets in file system 'gpfs': Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=q3YHYji1-ohlnu_yLuOGmxLnmk9znaxUJdg04aGLU_U&s=Axp36hWpcuKZbTMbdVde_ifZreMvOlR5sHUYX3A79hQ&e= Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ~WRD000.jpg Type: image/jpeg Size: 823 bytes Desc: ~WRD000.jpg URL: From david_johnson at brown.edu Wed Apr 25 14:07:19 2018 From: david_johnson at brown.edu (David Johnson) Date: Wed, 25 Apr 2018 09:07:19 -0400 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: Yes, independent snapshotting would be an issue. However at the moment we have 570 dependent filesets in our main filesystem, which is not all that far from the limit of 1000 independent filesets per filesystem. There was a thread concerning fileset issues back in February, wondering if the limits had changed between 4.x to 5.0.0 releases, but it went unanswered. We would love to be able to use the features independent filesets (quicker traversal by policy engine, snapshots as you mention, etc), but the thought that we could run out of them as our user base grows killed that idea. > On Apr 25, 2018, at 8:35 AM, Sobey, Richard A wrote: > > You can apply fileset quotas to independent filesets to you know. Sorry if that sounds passive aggressive, not meant to be! > > Main reason for us NOT to use dependent filesets is lack of snapshotting. > > Richard > > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of david_johnson at brown.edu > Sent: 25 April 2018 13:14 > To: gpfsug main discussion list > > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > > We use a dependent fileset for each research group / investigator. We do this mainly so we can apply fileset quotas. We tried independent filesets but they were quite inconvenient: > 1) limited number independent filesets could be created compared to dependent > 2) requirement to manage number of inodes allocated to each and every independent fileset > > There may have been other issues but we create a new dependent fileset whenever a new researcher joins our cluster. > -- ddj > Dave Johnson > > On Apr 25, 2018, at 5:42 AM, Daniel Kidger > wrote: > > It would though be a nice to have feature: to leave the file data where it was and just move the metadata into its own inode space? > > A related question though: > In what case do people create new *dependant* filesets ? I can't see many cases where an independent fileset would be just as valid. > Daniel > > > > <~WRD000.jpg> > > > > Dr Daniel Kidger > IBM Technical Sales Specialist > Software Defined Solution Sales > > +44-(0)7818 522 266 > daniel.kidger at uk.ibm.com > > > > ----- Original message ----- > From: "Sobey, Richard A" > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > > Cc: > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > Date: Wed, Apr 25, 2018 10:01 AM > > > Hi Marc > > > > Yes, copying the data to a freshly created fileset and unlinking/renaming/relinking is our solution; it was always a long shot expecting to be able to convert it ? > > > > Richard > > > > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of Marc A Kaplan > Sent: 24 April 2018 13:49 > To: gpfsug main discussion list > > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > > > > To help make sense of this, one has to understand that "independent" means a different range of inode numbers. 
> > If you have a set of files within one range of inode numbers, say 3000-5000 and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have. > > During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories. > > > > From: "Sobey, Richard A" > > To: "'gpfsug-discuss at spectrumscale.org '" > > Date: 04/24/2018 05:20 AM > Subject: [gpfsug-discuss] Converting a dependent fileset to independent > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Hi all, > > Is there any way, without starting over, to convert a dependent fileset to independent? > > My gut says no but in the spirit of not making unnecessary work I wanted to ask. > > Also, the documentation states that I should see a ?dpnd? next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. > > [root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L > Filesets in file system 'gpfs': > Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment > studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 > > Thanks > Richard_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=q3YHYji1-ohlnu_yLuOGmxLnmk9znaxUJdg04aGLU_U&s=Axp36hWpcuKZbTMbdVde_ifZreMvOlR5sHUYX3A79hQ&e= > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5458 bytes Desc: not available URL: From valdis.kletnieks at vt.edu Wed Apr 25 15:26:08 2018 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Wed, 25 Apr 2018 10:26:08 -0400 Subject: [gpfsug-discuss] Encryption and ISKLM Message-ID: <22132.1524666368@turing-police.cc.vt.edu> We're running GPFS 4.2.3.7 with encryption on disk, LTFS/EE 1.2.6.2 with encryption on tape, and ISKLM 2.6.0.2 to manage the keys. I'm in the middle of researching RHEL patches on the key servers. 
Do I want to stay at 2.6.0.2, or go to a later 2.6, or jump to 2.7 or 3.0? Not seeing a lot of guidance on that topic.... From truongv at us.ibm.com Wed Apr 25 18:44:16 2018 From: truongv at us.ibm.com (Truong Vu) Date: Wed, 25 Apr 2018 13:44:16 -0400 Subject: [gpfsug-discuss] Converting a dependent fileset to independent In-Reply-To: References: Message-ID: >>> There was a thread concerning fileset issues back in February, wondering if the limits had changed between 4.x to 5.0.0 releases, but it went unanswered Regarding the above query, it was answered on 5 Feb, 2018 > Subject: Re: [gpfsug-discuss] Maximum Number of filesets on GPFS v5? > Date: Mon, Feb 5, 2018 2:56 PM > Quoting "Truong Vu" : > >> >> Hi Jamie, >> >> The limits are the same in 5.0.0. We'll look into the FAQ. >> >> Thanks, >> Tru. BTW, the FAQ has been has been tweak a bit in this area. Thanks, Tru. From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 04/25/2018 09:09 AM Subject: gpfsug-discuss Digest, Vol 75, Issue 48 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=FtLzwY_unGN6f6WlAmJFeDFnMICYLyNlAAIxft_8wgM&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: Converting a dependent fileset to independent (David Johnson) ---------------------------------------------------------------------- Message: 1 Date: Wed, 25 Apr 2018 09:07:19 -0400 From: David Johnson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent Message-ID: Content-Type: text/plain; charset="utf-8" Yes, independent snapshotting would be an issue. However at the moment we have 570 dependent filesets in our main filesystem, which is not all that far from the limit of 1000 independent filesets per filesystem. There was a thread concerning fileset issues back in February, wondering if the limits had changed between 4.x to 5.0.0 releases, but it went unanswered. We would love to be able to use the features independent filesets (quicker traversal by policy engine, snapshots as you mention, etc), but the thought that we could run out of them as our user base grows killed that idea. > On Apr 25, 2018, at 8:35 AM, Sobey, Richard A wrote: > > You can apply fileset quotas to independent filesets to you know. Sorry if that sounds passive aggressive, not meant to be! > > Main reason for us NOT to use dependent filesets is lack of snapshotting. > > Richard > > From: gpfsug-discuss-bounces at spectrumscale.org < mailto:gpfsug-discuss-bounces at spectrumscale.org> > On Behalf Of david_johnson at brown.edu > Sent: 25 April 2018 13:14 > To: gpfsug main discussion list > > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > > We use a dependent fileset for each research group / investigator. We do this mainly so we can apply fileset quotas. 
We tried independent filesets but they were quite inconvenient: > 1) limited number independent filesets could be created compared to dependent > 2) requirement to manage number of inodes allocated to each and every independent fileset > > There may have been other issues but we create a new dependent fileset whenever a new researcher joins our cluster. > -- ddj > Dave Johnson > > On Apr 25, 2018, at 5:42 AM, Daniel Kidger > wrote: > > It would though be a nice to have feature: to leave the file data where it was and just move the metadata into its own inode space? > > A related question though: > In what case do people create new *dependant* filesets ? I can't see many cases where an independent fileset would be just as valid. > Daniel > > > < https://urldefense.proofpoint.com/v2/url?u=https-3A__www.youracclaim.com_user_danel-2Dkidger&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=ctx0N2K6fykHKd_4sWHgXk0eRcJLrWcWvCYS1ea7o-s&e= > > <~WRD000.jpg> < https://urldefense.proofpoint.com/v2/url?u=https-3A__www.youracclaim.com_user_danel-2Dkidger&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=ctx0N2K6fykHKd_4sWHgXk0eRcJLrWcWvCYS1ea7o-s&e= > > > > > Dr Daniel Kidger > IBM Technical Sales Specialist > Software Defined Solution Sales > > +44-(0)7818 522 266 > daniel.kidger at uk.ibm.com > > > > ----- Original message ----- > From: "Sobey, Richard A" > > Sent by: gpfsug-discuss-bounces at spectrumscale.org < mailto:gpfsug-discuss-bounces at spectrumscale.org> > To: gpfsug main discussion list > > Cc: > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > Date: Wed, Apr 25, 2018 10:01 AM > > > Hi Marc > > > > Yes, copying the data to a freshly created fileset and unlinking/renaming/relinking is our solution; it was always a long shot expecting to be able to convert it ? > > > > Richard > > > > From: gpfsug-discuss-bounces at spectrumscale.org < mailto:gpfsug-discuss-bounces at spectrumscale.org> > On Behalf Of Marc A Kaplan > Sent: 24 April 2018 13:49 > To: gpfsug main discussion list > > Subject: Re: [gpfsug-discuss] Converting a dependent fileset to independent > > > > To help make sense of this, one has to understand that "independent" means a different range of inode numbers. > > If you have a set of files within one range of inode numbers, say 3000-5000 and now you want to move some of them to a new range of inode numbers, say 7000-8000, you're going to have to create that new range as a new independent fileset, and then move (copy!) the files to the new fileset. And then rename directories so that you can once again refer to the files by the pathnames that they "used to" have. > > During the copying and renaming, you would have to make sure there are no applications trying to access those files and directories. > > > > From: "Sobey, Richard A" > > To: "'gpfsug-discuss at spectrumscale.org < mailto:gpfsug-discuss at spectrumscale.org>'" > > Date: 04/24/2018 05:20 AM > Subject: [gpfsug-discuss] Converting a dependent fileset to independent > Sent by: gpfsug-discuss-bounces at spectrumscale.org < mailto:gpfsug-discuss-bounces at spectrumscale.org> > > > > Hi all, > > Is there any way, without starting over, to convert a dependent fileset to independent? > > My gut says no but in the spirit of not making unnecessary work I wanted to ask. > > Also, the documentation states that I should see a ?dpnd? 
next to a dependent fileset when I run mmlsfileset with -L; this is not the case even though the parent fileset is root in this case. > > [root at nsd bin]# mmlsfileset gpfs studentrecruitmentandoutreach -L > Filesets in file system 'gpfs': > Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment > studentrecruitmentandoutreach 241 8700824 0 Wed Feb 14 14:25:49 2018 0 0 0 > > Thanks > Richard_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org < https://urldefense.proofpoint.com/v2/url?u=http-3A__spectrumscale.org_&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=N4Mu7JvqpvzZ7EN4r2Qj2Zafsn1e4kbbgjmWJWwKVgc&e= > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=W9WH1n93tRRhW5npPGd73Dqgcp4d0oAYKt4yOI02PWU&s=MnsN-hjjhZirIDfK1k-awRB9hodsX2Ylh1Z1IzoJij0&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org < https://urldefense.proofpoint.com/v2/url?u=http-3A__spectrumscale.org_&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=N4Mu7JvqpvzZ7EN4r2Qj2Zafsn1e4kbbgjmWJWwKVgc&e= > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=q3YHYji1-ohlnu_yLuOGmxLnmk9znaxUJdg04aGLU_U&s=Axp36hWpcuKZbTMbdVde_ifZreMvOlR5sHUYX3A79hQ&e= < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HlQDuUjgJx4p54QzcXd0_zTwf4Cr2t3NINalNhLTA2E&m=q3YHYji1-ohlnu_yLuOGmxLnmk9znaxUJdg04aGLU_U&s=Axp36hWpcuKZbTMbdVde_ifZreMvOlR5sHUYX3A79hQ&e= > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number 741598. 
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org < https://urldefense.proofpoint.com/v2/url?u=http-3A__spectrumscale.org_&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=N4Mu7JvqpvzZ7EN4r2Qj2Zafsn1e4kbbgjmWJWwKVgc&e= > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=FtLzwY_unGN6f6WlAmJFeDFnMICYLyNlAAIxft_8wgM&e= < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=FtLzwY_unGN6f6WlAmJFeDFnMICYLyNlAAIxft_8wgM&e= >_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org < https://urldefense.proofpoint.com/v2/url?u=http-3A__spectrumscale.org_&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=N4Mu7JvqpvzZ7EN4r2Qj2Zafsn1e4kbbgjmWJWwKVgc&e= > > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=FtLzwY_unGN6f6WlAmJFeDFnMICYLyNlAAIxft_8wgM&e= < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=FtLzwY_unGN6f6WlAmJFeDFnMICYLyNlAAIxft_8wgM&e= > -------------- next part -------------- An HTML attachment was scrubbed... URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20180425_908762a1_attachment.html&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=s-Q5O2mSZkgTv9Vw1sikGpoIyxbhCqQ0mpMD-M_8f_E&e= > -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5458 bytes Desc: not available URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20180425_908762a1_attachment.p7s&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=TM2ZRp5gYFZ5sI3I0Obdf5h-aNBKNzm9tuWX3rqepaM&e= > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=S98LocHjHQe1pGot4UAs0sYXj1NhEPyEXoEcqsDxCtM&s=FtLzwY_unGN6f6WlAmJFeDFnMICYLyNlAAIxft_8wgM&e= End of gpfsug-discuss Digest, Vol 75, Issue 48 ********************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL:

From ulmer at ulmer.org Thu Apr 26 03:48:23 2018
From: ulmer at ulmer.org (Stephen Ulmer)
Date: Wed, 25 Apr 2018 22:48:23 -0400
Subject: [gpfsug-discuss] Will GPFS recognize re-sized LUNs?
Message-ID: <7B09F35D-9172-4A07-8480-5829562629DD@ulmer.org>

I'm 80% sure that the answer to this is "no", but I promised a client that I'd get a fresh answer anyway.

If one extends a LUN that is under an NSD, and then does the OS-level magic to make that known to everyone that could write to it, can the NSD be extended to use the additional space?

I can think of lots of reasons why this would be madness, and the implementation would have very little return, but maybe a large customer or grant demanded it at some point?

Liberty,

-- Stephen

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From luis.bolinches at fi.ibm.com Thu Apr 26 07:36:13 2018
From: luis.bolinches at fi.ibm.com (Luis Bolinches)
Date: Thu, 26 Apr 2018 09:36:13 +0300
Subject: [gpfsug-discuss] Will GPFS recognize re-sized LUNs?
In-Reply-To: <7B09F35D-9172-4A07-8480-5829562629DD@ulmer.org>
References: <7B09F35D-9172-4A07-8480-5829562629DD@ulmer.org>
Message-ID:

Hi

You knew the answer, still is no.
https://www.mail-archive.com/gpfsug-discuss at spectrumscale.org/msg02249.html

--
Ystävällisin terveisin / Kind regards / Saludos cordiales / Salutations

Luis Bolinches
Consultant IT Specialist
Mobile Phone: +358503112585
https://www.youracclaim.com/user/luis-bolinches

"If you always give you will always have" -- Anonymous

From: Stephen Ulmer
To: gpfsug main discussion list
Date: 26/04/2018 05:58
Subject: [gpfsug-discuss] Will GPFS recognize re-sized LUNs?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

I'm 80% sure that the answer to this is "no", but I promised a client that I'd get a fresh answer anyway. If one extends a LUN that is under an NSD, and then does the OS-level magic to make that known to everyone that could write to it, can the NSD be extended to use the additional space? I can think of lots of reasons why this would be madness, and the implementation would have very little return, but maybe a large customer or grant demanded it at some point?

Liberty,

-- Stephen
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

Ellei edellä ole toisin mainittu: / Unless stated otherwise above:
Oy IBM Finland Ab
PL 265, 00101 Helsinki, Finland
Business ID, Y-tunnus: 0195876-3
Registered in Finland

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From Robert.Oesterlin at nuance.com Thu Apr 26 15:20:22 2018
From: Robert.Oesterlin at nuance.com (Oesterlin, Robert)
Date: Thu, 26 Apr 2018 14:20:22 +0000
Subject: [gpfsug-discuss] Singularity + GPFS
Message-ID: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com>

Anyone (including IBM) doing any work in this area? I would appreciate hearing from you.

Bob Oesterlin
Sr Principal Storage Engineer, Nuance

-------------- next part -------------- An HTML attachment was scrubbed...
URL: From nathan.harper at cfms.org.uk Thu Apr 26 15:35:29 2018 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Thu, 26 Apr 2018 15:35:29 +0100 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> References: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> Message-ID: We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. I'm interested to hear about experience with MPI-IO within Singularity. On 26 April 2018 at 15:20, Oesterlin, Robert wrote: > Anyone (including IBM) doing any work in this area? I would appreciate > hearing from you. > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Thu Apr 26 15:40:52 2018 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Thu, 26 Apr 2018 10:40:52 -0400 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> Message-ID: We do run Singularity + GPFS, on our production HPC clusters. Most of the time things are fine without any issues. However, i do see a significant performance loss when running some applications on singularity containers with GPFS. As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. I am yet to raise a PMR about this with IBM. I have not seen performance degradation for any other kind of IO, but i am not sure. Regards, Lohit On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , wrote: > We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. > > I'm interested to hear about experience with MPI-IO within Singularity. > > > On 26 April 2018 at 15:20, Oesterlin, Robert wrote: > > > Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. > > > > > > Bob Oesterlin > > > Sr Principal Storage Engineer, Nuance > > > > > > > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > -- > Nathan?Harper?//?IT Systems Lead > > > e:?nathan.harper at cfms.org.uk???t:?0117 906 1104??m:? 
0787 551 0891??w:?www.cfms.org.uk > CFMS Services Ltd?//?Bristol & Bath Science Park?//?Dirac Crescent?//?Emersons Green?//?Bristol?//?BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd > CFMS Services Ltd registered office //?43 Queens Square // Bristol // BS1 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Apr 26 15:51:30 2018 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 26 Apr 2018 14:51:30 +0000 Subject: [gpfsug-discuss] Singularity + GPFS Message-ID: Hi Lohit, Nathan Would you be willing to share some more details about your setup? We are just getting started here and I would like to hear about what your configuration looks like. Direct email to me is fine, thanks. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "valleru at cbio.mskcc.org" Reply-To: gpfsug main discussion list Date: Thursday, April 26, 2018 at 9:45 AM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS We do run Singularity + GPFS, on our production HPC clusters. Most of the time things are fine without any issues. However, i do see a significant performance loss when running some applications on singularity containers with GPFS. As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. I am yet to raise a PMR about this with IBM. I have not seen performance degradation for any other kind of IO, but i am not sure. Regards, Lohit On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , wrote: We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. I'm interested to hear about experience with MPI-IO within Singularity. On 26 April 2018 at 15:20, Oesterlin, Robert > wrote: Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. Bob Oesterlin Sr Principal Storage Engineer, Nuance _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Nathan Harper // IT Systems Lead [Image removed by sender.] e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR [Image removed by sender.] CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From daniel.kidger at uk.ibm.com Thu Apr 26 15:51:19 2018 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Thu, 26 Apr 2018 14:51:19 +0000 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: , <39293A22-ED75-41E8-AC4C-EE5138834F9C@nuance.com> Message-ID: An HTML attachment was scrubbed... URL: From yguvvala at cambridgecomputer.com Thu Apr 26 15:53:58 2018 From: yguvvala at cambridgecomputer.com (Yugendra Guvvala) Date: Thu, 26 Apr 2018 10:53:58 -0400 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: Message-ID: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> I am interested to learn this too. So please add me sending a direct mail. Thanks, Yugi > On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert wrote: > > Hi Lohit, Nathan > > Would you be willing to share some more details about your setup? We are just getting started here and I would like to hear about what your configuration looks like. Direct email to me is fine, thanks. > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > From: on behalf of "valleru at cbio.mskcc.org" > Reply-To: gpfsug main discussion list > Date: Thursday, April 26, 2018 at 9:45 AM > To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS > > We do run Singularity + GPFS, on our production HPC clusters. > Most of the time things are fine without any issues. > > However, i do see a significant performance loss when running some applications on singularity containers with GPFS. > > As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) > When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. > I am yet to raise a PMR about this with IBM. > I have not seen performance degradation for any other kind of IO, but i am not sure. > > Regards, > Lohit > > On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , wrote: > > We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. > > I'm interested to hear about experience with MPI-IO within Singularity. > > On 26 April 2018 at 15:20, Oesterlin, Robert wrote: > Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > -- > Nathan Harper // IT Systems Lead > > > > e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nathan.harper at cfms.org.uk Thu Apr 26 16:25:54 2018 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Thu, 26 Apr 2018 16:25:54 +0100 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> References: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> Message-ID: Happy to share on the list in case anyone else finds it useful: We use GPFS for home/scratch on our HPC clusters, supporting engineering applications, so 95+% of our jobs are multi-node MPI. We have had some questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with GPFS+MPI-IO in the past that was solved by building the applications against GPFS. If users start using Singularity containers, we then can't guarantee how the contained applications have been built. I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we can break it, before we deploy onto our production systems. Everything seems to be ok under synthetic benchmarks, but I've handed over to one of my chaos monkey users to let him do his worst. On 26 April 2018 at 15:53, Yugendra Guvvala wrote: > I am interested to learn this too. So please add me sending a direct mail. > > Thanks, > Yugi > > On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert < > Robert.Oesterlin at nuance.com> wrote: > > Hi Lohit, Nathan > > > > Would you be willing to share some more details about your setup? We are > just getting started here and I would like to hear about what your > configuration looks like. Direct email to me is fine, thanks. > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > > > *From: * on behalf of " > valleru at cbio.mskcc.org" > *Reply-To: *gpfsug main discussion list > *Date: *Thursday, April 26, 2018 at 9:45 AM > *To: *gpfsug main discussion list > *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS > > > > We do run Singularity + GPFS, on our production HPC clusters. > > Most of the time things are fine without any issues. > > > > However, i do see a significant performance loss when running some > applications on singularity containers with GPFS. > > > > As of now, the applications that have severe performance issues with > singularity on GPFS - seem to be because of ?mmap io?. (Deep learning > applications) > > When i run the same application on bare metal, they seem to have a huge > difference in GPFS IO when compared to running on singularity containers. > > I am yet to raise a PMR about this with IBM. > > I have not seen performance degradation for any other kind of IO, but i am > not sure. > > > Regards, > Lohit > > > On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , > wrote: > > We are running on a test system at the moment, and haven't run into any > issues yet, but so far it's only been 'hello world' and running FIO. > > > > I'm interested to hear about experience with MPI-IO within Singularity. > > > > On 26 April 2018 at 15:20, Oesterlin, Robert > wrote: > > Anyone (including IBM) doing any work in this area? I would appreciate > hearing from you. > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > -- > > *Nathan* *Harper* // IT Systems Lead > > > > [image: Image removed by sender.] 
> > > > *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 > *w: *www.cfms.org.uk > > > > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons > Green // Bristol // BS16 7FR > > [image: Image removed by sender.] > > CFMS Services Ltd is registered in England and Wales No 05742022 - a > subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 > 4QP > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_johnson at brown.edu Thu Apr 26 16:31:32 2018 From: david_johnson at brown.edu (David Johnson) Date: Thu, 26 Apr 2018 11:31:32 -0400 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: References: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> Message-ID: <0C11F444-7196-4B07-9646-5062CC8715B1@brown.edu> Regarding MPI-IO, how do you mean ?building the applications against GPFS?? We try to advise our users about things to avoid, but we have some poster-ready ?chaos monkeys? as well, who resist guidance. What apps do your users favor? Molpro is one of our heaviest apps right now. Thanks, ? ddj > On Apr 26, 2018, at 11:25 AM, Nathan Harper wrote: > > Happy to share on the list in case anyone else finds it useful: > > We use GPFS for home/scratch on our HPC clusters, supporting engineering applications, so 95+% of our jobs are multi-node MPI. We have had some questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with GPFS+MPI-IO in the past that was solved by building the applications against GPFS. If users start using Singularity containers, we then can't guarantee how the contained applications have been built. > > I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we can break it, before we deploy onto our production systems. Everything seems to be ok under synthetic benchmarks, but I've handed over to one of my chaos monkey users to let him do his worst. > > On 26 April 2018 at 15:53, Yugendra Guvvala > wrote: > I am interested to learn this too. So please add me sending a direct mail. > > Thanks, > Yugi > > On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert > wrote: > >> Hi Lohit, Nathan >> >> >> >> Would you be willing to share some more details about your setup? We are just getting started here and I would like to hear about what your configuration looks like. Direct email to me is fine, thanks. 
>> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> >> From: > on behalf of "valleru at cbio.mskcc.org " > >> Reply-To: gpfsug main discussion list > >> Date: Thursday, April 26, 2018 at 9:45 AM >> To: gpfsug main discussion list > >> Subject: [EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS >> >> >> >> We do run Singularity + GPFS, on our production HPC clusters. >> >> Most of the time things are fine without any issues. >> >> >> >> However, i do see a significant performance loss when running some applications on singularity containers with GPFS. >> >> >> >> As of now, the applications that have severe performance issues with singularity on GPFS - seem to be because of ?mmap io?. (Deep learning applications) >> >> When i run the same application on bare metal, they seem to have a huge difference in GPFS IO when compared to running on singularity containers. >> >> I am yet to raise a PMR about this with IBM. >> >> I have not seen performance degradation for any other kind of IO, but i am not sure. >> >> >> Regards, >> Lohit >> >> >> On Apr 26, 2018, 10:35 AM -0400, Nathan Harper >, wrote: >> >> >> We are running on a test system at the moment, and haven't run into any issues yet, but so far it's only been 'hello world' and running FIO. >> >> >> >> I'm interested to hear about experience with MPI-IO within Singularity. >> >> >> >> On 26 April 2018 at 15:20, Oesterlin, Robert > wrote: >> >> Anyone (including IBM) doing any work in this area? I would appreciate hearing from you. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> -- >> >> Nathan Harper // IT Systems Lead >> >> >> >> >> >> e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk >> >> CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR >> >> >> >> CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd >> CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > -- > Nathan Harper // IT Systems Lead > > > > e: nathan.harper at cfms.org.uk t: 0117 906 1104 m: 0787 551 0891 w: www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 5458 bytes Desc: not available URL: From nathan.harper at cfms.org.uk Thu Apr 26 17:00:56 2018 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Thu, 26 Apr 2018 17:00:56 +0100 Subject: [gpfsug-discuss] Singularity + GPFS In-Reply-To: <0C11F444-7196-4B07-9646-5062CC8715B1@brown.edu> References: <7E2C9BEF-7545-4484-A342-2ECF2CD64E38@cambridgecomputer.com> <0C11F444-7196-4B07-9646-5062CC8715B1@brown.edu> Message-ID: We had an issue with a particular application writing out output in parallel - (I think) including gpfs.h seemed to fix the problem, but we might also have had a clockskew issue on the compute nodes at the same time, so we aren't sure exactly which fixed it. My chaos monkeys aren't those that resist guidance, but instead are the ones that will employ all the tools at their disposal to improve performance. A lot of our applications aren't doing MPI-IO, so my very capable parallel filesystem is idling while a single rank is reading/writing. However, some will hit the filesystem much harder or exercise less used functionality, and I'm keen to make sure that works through Singularity as well. On 26 April 2018 at 16:31, David Johnson wrote: > Regarding MPI-IO, how do you mean ?building the applications against > GPFS?? > We try to advise our users about things to avoid, but we have some > poster-ready > ?chaos monkeys? as well, who resist guidance. What apps do your users > favor? > Molpro is one of our heaviest apps right now. > Thanks, > ? ddj > > > On Apr 26, 2018, at 11:25 AM, Nathan Harper > wrote: > > Happy to share on the list in case anyone else finds it useful: > > We use GPFS for home/scratch on our HPC clusters, supporting engineering > applications, so 95+% of our jobs are multi-node MPI. We have had some > questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with > GPFS+MPI-IO in the past that was solved by building the applications > against GPFS. If users start using Singularity containers, we then can't > guarantee how the contained applications have been built. > > I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we > can break it, before we deploy onto our production systems. Everything > seems to be ok under synthetic benchmarks, but I've handed over to one of > my chaos monkey users to let him do his worst. > > On 26 April 2018 at 15:53, Yugendra Guvvala com> wrote: > >> I am interested to learn this too. So please add me sending a direct >> mail. >> >> Thanks, >> Yugi >> >> On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert < >> Robert.Oesterlin at nuance.com> wrote: >> >> Hi Lohit, Nathan >> >> >> >> Would you be willing to share some more details about your setup? We are >> just getting started here and I would like to hear about what your >> configuration looks like. Direct email to me is fine, thanks. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> >> *From: * on behalf of " >> valleru at cbio.mskcc.org" >> *Reply-To: *gpfsug main discussion list > > >> *Date: *Thursday, April 26, 2018 at 9:45 AM >> *To: *gpfsug main discussion list >> *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS >> >> >> >> We do run Singularity + GPFS, on our production HPC clusters. >> >> Most of the time things are fine without any issues. >> >> >> >> However, i do see a significant performance loss when running some >> applications on singularity containers with GPFS. 
>> >> >> >> As of now, the applications that have severe performance issues with >> singularity on GPFS - seem to be because of ?mmap io?. (Deep learning >> applications) >> >> When i run the same application on bare metal, they seem to have a huge >> difference in GPFS IO when compared to running on singularity containers. >> >> I am yet to raise a PMR about this with IBM. >> >> I have not seen performance degradation for any other kind of IO, but i >> am not sure. >> >> >> Regards, >> Lohit >> >> >> On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , >> wrote: >> >> We are running on a test system at the moment, and haven't run into any >> issues yet, but so far it's only been 'hello world' and running FIO. >> >> >> >> I'm interested to hear about experience with MPI-IO within Singularity. >> >> >> >> On 26 April 2018 at 15:20, Oesterlin, Robert >> wrote: >> >> Anyone (including IBM) doing any work in this area? I would appreciate >> hearing from you. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> >> >> -- >> >> *Nathan* *Harper* // IT Systems Lead >> >> >> >> [image: Image removed by sender.] >> >> >> >> *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 >> *w: *www.cfms.org.uk >> >> >> >> CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons >> Green // Bristol // BS16 7FR >> >> [image: Image removed by sender.] >> >> CFMS Services Ltd is registered in England and Wales No 05742022 - a >> subsidiary of CFMS Ltd >> CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 >> 4QP >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > > > -- > *Nathan Harper* // IT Systems Lead > > > > > *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 > *w: *www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons > Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a > subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 > 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next 
part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Thu Apr 26 19:08:48 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 26 Apr 2018 18:08:48 +0000 Subject: [gpfsug-discuss] Pool migration and replicate Message-ID: Hi all, We'd like to move some data from a non replicated pool to another pool, but keep replication at 1 (the fs default is 2). When using an ILM policy, is the default to keep the current replication or use the fs default? I.e.just wondering if I need to include a "REPLICATE(1)" clause. Also if the data is already migrated to the pool, is it still considered by the policy engine, or should I include FROM POOL...? I.e. just wondering what is the most efficient way to target the files. Thanks Simon From olaf.weiser at de.ibm.com Thu Apr 26 19:53:42 2018 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 26 Apr 2018 11:53:42 -0700 Subject: [gpfsug-discuss] Pool migration and replicate In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From vborcher at linkedin.com Thu Apr 26 19:59:38 2018 From: vborcher at linkedin.com (Vanessa Borcherding) Date: Thu, 26 Apr 2018 18:59:38 +0000 Subject: [gpfsug-discuss] Singularity + GPFS Message-ID: <690DC273-833D-419F-84A0-7EE2EC7700C1@linkedin.biz> Hi All, In my previous life at Weill Cornell, I benchmarked Singularity pretty extensively for bioinformatics applications on a GPFS 4.2 cluster, and saw virtually no overhead whatsoever. However, I did not allow MPI jobs for those workloads, so that may be the key differentiator here. You may wish to reach out to Greg Kurtzer and his team too - they're super responsive on github and have a slack channel that you can join. His email address is gmkurtzer at gmail.com. Vanessa ?On 4/26/18, 9:01 AM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of gpfsug-discuss-request at spectrumscale.org" wrote: Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: Singularity + GPFS (Nathan Harper) ---------------------------------------------------------------------- Message: 1 Date: Thu, 26 Apr 2018 17:00:56 +0100 From: Nathan Harper To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Singularity + GPFS Message-ID: Content-Type: text/plain; charset="utf-8" We had an issue with a particular application writing out output in parallel - (I think) including gpfs.h seemed to fix the problem, but we might also have had a clockskew issue on the compute nodes at the same time, so we aren't sure exactly which fixed it. My chaos monkeys aren't those that resist guidance, but instead are the ones that will employ all the tools at their disposal to improve performance. A lot of our applications aren't doing MPI-IO, so my very capable parallel filesystem is idling while a single rank is reading/writing. However, some will hit the filesystem much harder or exercise less used functionality, and I'm keen to make sure that works through Singularity as well. 
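
As a concrete illustration of the pattern under test (image name, paths and rank count are all invented; it assumes the host-side MPI launches ranks which each run inside the container, with the GPFS mount bind-mounted in so the application sees the same paths as on bare metal):

mpirun -np 96 singularity exec --bind /gpfs/scratch:/gpfs/scratch solver.simg \
        ./solver -case /gpfs/scratch/case01

Without the --bind (or a system-wide bind path in singularity.conf) the contained application never touches GPFS at all, so its I/O behaviour can't be compared with bare metal.
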
On 26 April 2018 at 16:31, David Johnson wrote: > Regarding MPI-IO, how do you mean ?building the applications against > GPFS?? > We try to advise our users about things to avoid, but we have some > poster-ready > ?chaos monkeys? as well, who resist guidance. What apps do your users > favor? > Molpro is one of our heaviest apps right now. > Thanks, > ? ddj > > > On Apr 26, 2018, at 11:25 AM, Nathan Harper > wrote: > > Happy to share on the list in case anyone else finds it useful: > > We use GPFS for home/scratch on our HPC clusters, supporting engineering > applications, so 95+% of our jobs are multi-node MPI. We have had some > questions/concerns about GPFS+Singularity+MPI-IO, as we've had issues with > GPFS+MPI-IO in the past that was solved by building the applications > against GPFS. If users start using Singularity containers, we then can't > guarantee how the contained applications have been built. > > I've got a small test system (2 nsd nodes, 6 compute nodes) to see if we > can break it, before we deploy onto our production systems. Everything > seems to be ok under synthetic benchmarks, but I've handed over to one of > my chaos monkey users to let him do his worst. > > On 26 April 2018 at 15:53, Yugendra Guvvala com> wrote: > >> I am interested to learn this too. So please add me sending a direct >> mail. >> >> Thanks, >> Yugi >> >> On Apr 26, 2018, at 10:51 AM, Oesterlin, Robert < >> Robert.Oesterlin at nuance.com> wrote: >> >> Hi Lohit, Nathan >> >> >> >> Would you be willing to share some more details about your setup? We are >> just getting started here and I would like to hear about what your >> configuration looks like. Direct email to me is fine, thanks. >> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> >> *From: * on behalf of " >> valleru at cbio.mskcc.org" >> *Reply-To: *gpfsug main discussion list > > >> *Date: *Thursday, April 26, 2018 at 9:45 AM >> *To: *gpfsug main discussion list >> *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Singularity + GPFS >> >> >> >> We do run Singularity + GPFS, on our production HPC clusters. >> >> Most of the time things are fine without any issues. >> >> >> >> However, i do see a significant performance loss when running some >> applications on singularity containers with GPFS. >> >> >> >> As of now, the applications that have severe performance issues with >> singularity on GPFS - seem to be because of ?mmap io?. (Deep learning >> applications) >> >> When i run the same application on bare metal, they seem to have a huge >> difference in GPFS IO when compared to running on singularity containers. >> >> I am yet to raise a PMR about this with IBM. >> >> I have not seen performance degradation for any other kind of IO, but i >> am not sure. >> >> >> Regards, >> Lohit >> >> >> On Apr 26, 2018, 10:35 AM -0400, Nathan Harper , >> wrote: >> >> We are running on a test system at the moment, and haven't run into any >> issues yet, but so far it's only been 'hello world' and running FIO. >> >> >> >> I'm interested to hear about experience with MPI-IO within Singularity. >> >> >> >> On 26 April 2018 at 15:20, Oesterlin, Robert >> wrote: >> >> Anyone (including IBM) doing any work in this area? I would appreciate >> hearing from you. 
>> >> >> >> Bob Oesterlin >> >> Sr Principal Storage Engineer, Nuance >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> >> >> >> -- >> >> *Nathan* *Harper* // IT Systems Lead >> >> >> >> [image: Image removed by sender.] >> >> >> >> *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 >> *w: *www.cfms.org.uk >> >> >> >> CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons >> Green // Bristol // BS16 7FR >> >> [image: Image removed by sender.] >> >> CFMS Services Ltd is registered in England and Wales No 05742022 - a >> subsidiary of CFMS Ltd >> CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 >> 4QP >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > > > -- > *Nathan Harper* // IT Systems Lead > > > > > *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 > *w: *www.cfms.org.uk > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons > Green // Bristol // BS16 7FR > > CFMS Services Ltd is registered in England and Wales No 05742022 - a > subsidiary of CFMS Ltd > CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 > 4QP > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- *Nathan Harper* // IT Systems Lead *e: *nathan.harper at cfms.org.uk *t*: 0117 906 1104 *m*: 0787 551 0891 *w: *www.cfms.org.uk CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 75, Issue 56 ********************************************** From makaplan at us.ibm.com Thu Apr 26 21:30:14 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 26 Apr 2018 16:30:14 -0400 Subject: [gpfsug-discuss] Pool migration and replicate In-Reply-To: References: Message-ID: No need to specify REPLICATE(1), but no harm either. No need to specify a FROM POOL, unless you want to restrict the set of files considered. (consider a system with more than two pools...) If a file is already in the target (TO) POOL, then no harm, we just skip over that file. 
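
For anyone digging this out of the archive later, a minimal sketch of such a rule (the pool names, policy file path and the file system name 'gpfs' are placeholders for illustration):

RULE 'toCapacity' MIGRATE
    FROM POOL 'fastpool'        /* optional - only narrows which files are considered */
    TO POOL 'capacitypool'
    REPLICATE(1)                /* not required per the reply above; stating it just makes the intent explicit */

Then test and run it with something like:

mmapplypolicy gpfs -P migrate.pol -I test     # dry run: report what would be moved
mmapplypolicy gpfs -P migrate.pol -I yes      # actually migrate the data

A WHERE clause or FOR FILESET(...) can be added to narrow the candidate set further.
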
From: "Simon Thompson (IT Research Support)" To: gpfsug main discussion list Date: 04/26/2018 02:09 PM Subject: [gpfsug-discuss] Pool migration and replicate Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi all, We'd like to move some data from a non replicated pool to another pool, but keep replication at 1 (the fs default is 2). When using an ILM policy, is the default to keep the current replication or use the fs default? I.e.just wondering if I need to include a "REPLICATE(1)" clause. Also if the data is already migrated to the pool, is it still considered by the policy engine, or should I include FROM POOL...? I.e. just wondering what is the most efficient way to target the files. Thanks Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=9Ko588DKk_71GheOwRqmDO1vVI24OTUvUBdYv8YHIbU&s=04zxf_-EsPu_LN--gsPx7GEPRsqUW7jIZ1Biov8R3mY&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Fri Apr 27 09:40:44 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Fri, 27 Apr 2018 10:40:44 +0200 Subject: [gpfsug-discuss] GPFS autoload - wait for IB portstobecomeactive In-Reply-To: References: <0081EB235765E14395278B9AE1DF341846510A@MBX214.d.ethz.ch> <4AD44D34-5275-4ADB-8CC7-8E80170DDA7F@brown.edu> Message-ID: Alternative solution we're trying... Create the file /etc/systemd/system/gpfs.service.d/delay.conf containing: [Service] ExecStartPre=/bin/sleep 60 Then I expect we should have long enough delay for infiniband to start before starting gpfs.. -jf On Fri, Mar 16, 2018 at 1:05 PM, Frederick Stock wrote: > I have my doubts that mmdiag can be used in this script. In general the > guidance is to avoid or be very careful with mm* commands in a callback due > to the potential for deadlock. > > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | 720-430-8821 > stockf at us.ibm.com > > > > From: Jan-Frode Myklebust > To: gpfsug main discussion list > Date: 03/16/2018 04:30 AM > > Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports > tobecomeactive > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Thanks Olaf, but we don't use NetworkManager on this cluster.. > > I now created this simple script: > > > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------- > #! /bin/bash - > # > # Fail mmstartup if not all configured IB ports are active. > # > # Install with: > # > # mmaddcallback fail-if-ibfail --command /var/mmfs/etc/fail-if-ibfail > --event preStartup --sync --onerror shutdown > # > > for port in $(/usr/lpp/mmfs/bin/mmdiag --config|grep verbsPorts | cut -f > 4- -d " ") > do > grep -q ACTIVE /sys/class/infiniband/${port%/*}/ports/${port##*/}/state > || exit 1 > done > ------------------------------------------------------------ > ------------------------------------------------------------ > ------------------------------------- > > which I haven't tested, but assume should work. Suggestions for > improvements would be much appreciated! 
> > > > -jf > > > On Thu, Mar 15, 2018 at 6:30 PM, Olaf Weiser <*olaf.weiser at de.ibm.com* > > wrote: > > you can try : > systemctl enable NetworkManager-wait-online > ln -s '/usr/lib/systemd/system/NetworkManager-wait-online.service' > '/etc/systemd/system/multi-user.target.wants/NetworkManager-wait-online. > service' > > in many cases .. it helps .. > > > > > > From: Jan-Frode Myklebust <*janfrode at tanso.net* > > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 03/15/2018 06:18 PM > Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to > becomeactive > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > I found some discussion on this at > *https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=77777777-0000-0000-0000-000014471957&ps=25* > and > there it's claimed that none of the callback events are early enough to > resolve this. That we need a pre-preStartup trigger. Any idea if this has > changed -- or is the callback option then only to do a "--onerror > shutdown" if it has failed to connect IB ? > > > On Thu, Mar 8, 2018 at 1:42 PM, Frederick Stock <*stockf at us.ibm.com* > > wrote: > You could also use the GPFS prestartup callback (mmaddcallback) to execute > a script synchronously that waits for the IB ports to become available > before returning and allowing GPFS to continue. Not systemd integrated but > it should work. > > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | *720-430-8821* <(720)%20430-8821> > *stockf at us.ibm.com* > > > > From: *david_johnson at brown.edu* > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 03/08/2018 07:34 AM > Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to > become active > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > > Until IBM provides a solution, here is my workaround. Add it so it runs > before the gpfs script, I call it from our custom xcat diskless boot > scripts. Based on rhel7, not fully systemd integrated. YMMV! > > Regards, > ? ddj > ??- > [ddj at storage041 ~]$ cat /etc/init.d/ibready > #! /bin/bash > # > # chkconfig: 2345 06 94 > # /etc/rc.d/init.d/ibready > # written in 2016 David D Johnson (ddj *brown.edu* > > ) > # > ### BEGIN INIT INFO > # Provides: ibready > # Required-Start: > # Required-Stop: > # Default-Stop: > # Description: Block until infiniband is ready > # Short-Description: Block until infiniband is ready > ### END INIT INFO > > RETVAL=0 > if [[ -d /sys/class/infiniband ]] > then > IBDEVICE=$(dirname $(grep -il infiniband > /sys/class/infiniband/*/ports/1/link* | head -n 1)) > fi > # See how we were called. > case "$1" in > start) > if [[ -n $IBDEVICE && -f $IBDEVICE/state ]] > then > echo -n "Polling for InfiniBand link up: " > for (( count = 60; count > 0; count-- )) > do > if grep -q ACTIVE $IBDEVICE/state > then > echo ACTIVE > break > fi > echo -n "." > sleep 5 > done > if (( count <= 0 )) > then > echo DOWN - $0 timed out > fi > fi > ;; > stop|restart|reload|force-reload|condrestart|try-restart) > ;; > status) > if [[ -n $IBDEVICE && -f $IBDEVICE/state ]] > then > echo "$IBDEVICE is $(< $IBDEVICE/state) $(< > $IBDEVICE/rate)" > else > echo "No IBDEVICE found" > fi > ;; > *) > echo "Usage: ibready {start|stop|status|restart| > reload|force-reload|condrestart|try-restart}" > exit 2 > esac > exit ${RETVAL} > ???? 
> > -- ddj > Dave Johnson > > On Mar 8, 2018, at 6:10 AM, Caubet Serrabou Marc (PSI) < > *marc.caubet at psi.ch* > wrote: > > Hi all, > > with autoload = yes we do not ensure that GPFS will be started after the > IB link becomes up. Is there a way to force GPFS waiting to start until IB > ports are up? This can be probably done by adding something like > After=network-online.target and Wants=network-online.target in the systemd > file but I would like to know if this is natively possible from the GPFS > configuration. > > Thanks a lot, > Marc > _________________________________________ > Paul Scherrer Institut > High Performance Computing > Marc Caubet Serrabou > WHGA/036 > 5232 Villigen PSI > Switzerland > > Telephone: *+41 56 310 46 67* <+41%2056%20310%2046%2067> > E-Mail: *marc.caubet at psi.ch* > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > > *https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m=u-EMob09-dkE6jZbD3dTjBi3vWhmDXtxiOK3nqFyIgY&s=JCfJgq6pZnKUI6d-rIgJXVcdZh7vmA5ypB1_goP_FFA&e=* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_ > iaSHvJObTbx-siA1ZOg&r=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw&m= > xImYTxt4pm1o5znVn5Vdoka2uxgsTRpmlCGdEWhB9vw&s= > veOZZz80aBzoCTKusx6WOpVlYs64eNkp5pM9kbHgvic&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Mon Apr 30 22:11:35 2018 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Mon, 30 Apr 2018 17:11:35 -0400 Subject: [gpfsug-discuss] Spectrum Scale CES and remote file system mounts Message-ID: <1516de0f-ba2a-40e7-9aa4-d7ea7bae3edf@Spark> Hello All, I read from the below link, that it is now possible to export remote mounts over NFS/SMB. https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_protocoloverremoteclu.htm I am thinking of using a single CES protocol cluster, with remote mounts from 3 storage clusters. May i know, if i will be able to export the 3 remote mounts(from 3 storage clusters) over NFS/SMB from a single CES protocol cluster? 
Because according to the limitations as mentioned in the below link:
https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_limitationofprotocolonRMT.htm

It says "You can configure one storage cluster and up to five protocol clusters (current limit)."

Regards,
Lohit

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From S.J.Thompson at bham.ac.uk Mon Apr 30 22:57:17 2018
From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support))
Date: Mon, 30 Apr 2018 21:57:17 +0000
Subject: [gpfsug-discuss] Spectrum Scale CES and remote file system mounts
In-Reply-To: <1516de0f-ba2a-40e7-9aa4-d7ea7bae3edf@Spark>
References: <1516de0f-ba2a-40e7-9aa4-d7ea7bae3edf@Spark>
Message-ID:

You have been able to do this for some time, though I think it's only just supported. We've been exporting remote mounts since CES was added.

At some point we've had two storage clusters supplying data and at least 3 remote file-systems exported over NFS and SMB.

One thing to watch, be careful if your CES root is on a remote fs, as if that goes away, so do all CES exports. We do have CES root on a remote fs and it works, just be aware...

Simon
________________________________________
From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of valleru at cbio.mskcc.org [valleru at cbio.mskcc.org]
Sent: 30 April 2018 22:11
To: gpfsug main discussion list
Subject: [gpfsug-discuss] Spectrum Scale CES and remote file system mounts

Hello All,

I read from the below link, that it is now possible to export remote mounts over NFS/SMB.
https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_protocoloverremoteclu.htm

I am thinking of using a single CES protocol cluster, with remote mounts from 3 storage clusters.
May i know, if i will be able to export the 3 remote mounts(from 3 storage clusters) over NFS/SMB from a single CES protocol cluster?

Because according to the limitations as mentioned in the below link:
https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_limitationofprotocolonRMT.htm

It says "You can configure one storage cluster and up to five protocol clusters (current limit)."

Regards,
Lohit
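
(For the archives: once the remote file system is mounted on the protocol nodes, exporting it looks just like exporting a local file system. A rough sketch, assuming the mmauth/mmremotecluster key exchange between the clusters is already in place - the cluster, device, path and share names below are all invented:

# on the CES/protocol cluster: define and mount the remote file system
mmremotefs add store1_fs1 -f fs1 -C storage1.example.com -T /gpfs/store1_fs1
mmmount store1_fs1 -a

# then export it over NFS and/or SMB in the usual way
mmnfs export add /gpfs/store1_fs1/projects --client "10.10.0.0/24(Access_Type=RW,Squash=no_root_squash)"
mmsmb export add projects /gpfs/store1_fs1/projects

And per Simon's caveat above, check where the CES shared root lives: if it sits on one of the remote file systems and that mount drops, every export drops with it.)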