From novosirj at rutgers.edu Thu Jan 2 16:05:40 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Thu, 2 Jan 2020 16:05:40 +0000 Subject: [gpfsug-discuss] GPFS 5.0.4.1? Message-ID: <7A0BAAFD-AC63-4122-9D7A-BBC47B31E0B7@rutgers.edu> Hi there, I notice that the "Developer Edition" shared the other day provides GPFS 5.0.4.1. I've been unable to find that on Fix Central otherwise, or on Lenovo's new ESD site. Is that a preview of software yet to be released, and if so, is there any indication when 5.0.4.1 might be released? I'm always reluctant to deploy an x.0 release in production, but I also don't want to deploy something older than what's available. Thanks in advance, -- ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' From richard.rupp at us.ibm.com Thu Jan 2 17:18:42 2020 From: richard.rupp at us.ibm.com (RICHARD RUPP) Date: Thu, 2 Jan 2020 12:18:42 -0500 Subject: [gpfsug-discuss] GPFS 5.0.4.1? In-Reply-To: <7A0BAAFD-AC63-4122-9D7A-BBC47B31E0B7@rutgers.edu> References: <7A0BAAFD-AC63-4122-9D7A-BBC47B31E0B7@rutgers.edu> Message-ID: The Developer Edition can be accessed via https://www.ibm.com/us-en/marketplace/scale-out-file-and-object-storage It is a free edition, without support, and should not be used for production. There are licensed editions of 5.0.4.1 with support for production environments. 5.0.4.1 was released on 11/21/19 and is available on Fix Central for IBM customers under maintenance. Regards, Richard Rupp, Sales Specialist, Phone: 1-347-510-6746 From: Ryan Novosielski To: gpfsug main discussion list Date: 01/02/2020 11:06 AM Subject: [EXTERNAL] [gpfsug-discuss] GPFS 5.0.4.1? Sent by: gpfsug-discuss-bounces at spectrumscale.org [...] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From kkr at lbl.gov Mon Jan 6 23:41:28 2020 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Mon, 6 Jan 2020 15:41:28 -0800 Subject: [gpfsug-discuss] (Please help with) Planning US meeting for Spring 2020 In-Reply-To: <42F45E03-0AEC-422C-B3A9-4B5A21B1D8DF@lbl.gov> References: <42F45E03-0AEC-422C-B3A9-4B5A21B1D8DF@lbl.gov> Message-ID: Thank you to the 18 wonderful people who filled out the survey. However, there are well more than 18 people at any given UG meeting. Please submit your responses today; I promise it's really short and even painless. 2020 (how did *that* happen?!) is here, and we need to plan the next meeting. Happy New Year.
Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7 Thanks, Kristy > On Dec 16, 2019, at 11:05 AM, Kristy Kallback-Rose wrote: > > Hello, > > It's time already to plan for the next US event. We have a quick survey (seriously, it should take on the order of 2 minutes) to capture your thoughts on location and date. It would help us greatly if you can please fill it out. > > Best wishes to all in the new year. > > -Kristy > > > Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7 From faryarag at in.ibm.com Tue Jan 7 10:47:39 2020 From: faryarag at in.ibm.com (Farida Yaragatti1) Date: Tue, 7 Jan 2020 16:17:39 +0530 Subject: [gpfsug-discuss] Introduction: IBM Elastic Storage System (ESS) 3000 (Spectrum Scale) Message-ID: Hello All, My name is Farida Yaragatti and I am part of the IBM Elastic Storage System (ESS) 3000 team, India Systems Development Lab, IBM India Pvt. Ltd. IBM Elastic Storage System (ESS) 3000 installs and upgrades GPFS using containerization. For more details, please go through the following links, which were published and released on December 9th, 2019. The IBM Lab Services team can install an Elastic Storage Server 3000 as an included service part of the acquisition. Alternatively, the customer's IT team can do the installation. ? The ESS 3000 quick deployment documentation is at the following web page: https://ibm.biz/Bdz7qb The following documents provide information that you need for proper deployment, installation, and upgrade procedures for an IBM ESS 3000: ? IBM ESS 3000: Planning for the system, service maintenance packages, and service procedures: https://ibm.biz/Bdz7qp Our team would like to participate in the Spectrum Scale user group events happening across the world in 2020, as we are using Spectrum Scale. Please let us know how we can initiate or post our submission for the events.
Regards, Farida Yaragatti ESS Deployment (Testing Team), India Systems Development Lab IBM India Pvt. Ltd., EGL D Block, 6th Floor, Bangalore, Karnataka, 560071, India From Renar.Grunenberg at huk-coburg.de Tue Jan 7 11:58:13 2020 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Tue, 7 Jan 2020 11:58:13 +0000 Subject: [gpfsug-discuss] Introduction: IBM Elastic Storage System (ESS) 3000 (Spectrum Scale) In-Reply-To: References: Message-ID: <52b5b5557f3f44ce890fe141b670014b@huk-coburg.de> Hallo Farida, can you check your links? It seems they don't work for people outside the IBM network. Renar Grunenberg Abteilung Informatik - Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer, Sarah Rössler, Daniel Thomas. ________________________________ This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information.
Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org On behalf of Farida Yaragatti1 Sent: Tuesday, January 7, 2020 11:48 To: gpfsug-discuss at spectrumscale.org Cc: Wesley Jones ; Mohsin A Inamdar ; Sumit Kumar43 ; Ricardo Daniel Zamora Ruvalcaba ; Rajan Mishra1 ; Pramod T Achutha ; Rezaul Islam ; Ravindra Sure Subject: [gpfsug-discuss] Introduction: IBM Elastic Storage System (ESS) 3000 (Spectrum Scale) [...]
From rp2927 at gsb.columbia.edu Tue Jan 7 16:32:26 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Tue, 7 Jan 2020 16:32:26 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> Message-ID: <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> Hi Kuei-Yu (et al.), Happy New Year! I'd like to reiterate my follow-up question to your comments, in particular to the line copied below, which mentions a behavior that I'm seeking for this command but cannot reproduce (in version 5.0.3 on Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I'm seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error and applies the user's default restoration to *all filesystems and filesets*. Are you using a different version? Or a different implementation? I'm running SS 5.0.3 on Linux x64. I apologize for having to press this point, but this matter is of a certain importance to us, and the public documentation is mum in this regard. Furthermore, IBM support, on the first pass, was quite confused and unfocused. (I'm trying with a new case now, but my hopes are low.) You seem to be an IBM insider, so I count on your help. Sorry for the insistence. Best, Razvan Popescu Columbia Univ.
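For readers following the thread, the sequence being debated can be sketched as follows. This is a minimal illustration built only from commands already shown in this discussion (mmdefquotaon, mmdefedquota, mmrepquota, mmedquota); it assumes a cluster with --perfileset-quota enabled, and the fs:fset scoping argument on the last command is exactly the disputed syntax, so verify it against the man pages of your own release before relying on it:

```
# Assumes filesystem fs9 with --perfileset-quota enabled; names are the
# thread's examples (fs9, fset7, user pfs004), not a recommendation.

# 1. Enable and set a default user quota for the fileset:
mmdefquotaon -u fs9:fset7
mmdefedquota -u fs9:fset7      # opens an editor to set the default limits

# 2. Inspect a user's quota entry type ("e" = explicit, "d_fset" = fileset default):
mmrepquota -u -v fs9 | grep pfs004

# 3. Attempt to revert the user's explicit limits back to the defaults.
#    The fs9:fset7 scoping shown here is the syntax under dispute; on some
#    releases only the unscoped "mmedquota -d -u pfs004" form is accepted,
#    which resets the user across all filesystems and filesets:
mmedquota -d -u pfs004 fs9:fset7
```

Whether step 3 accepts the filesystem:fileset argument is precisely the question raised above; the thread reports it working in one environment and rejected in another on 5.0.3.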
-- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". You may need to play a little with your system to get the result, as you can have default quota per file system set and default quota per fileset enabled. An example to illustrate: user pfs004 in filesystem fs9 and fileset fset7 has an explicit quota set: # mmrepquota -u -v fs9 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e <=== explicit # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on <=== default quota limits for fs9:fset7, the default fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none <=== explicit # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none <=== takes the default value # mmrepquota -u -v fs9:fset7 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset <=== now user pfs004 in fset7 takes the default limits # ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I see. May I ask one follow-up question, please: what is "mmedquota -d -u" supposed to do in this case? Really appreciate your assistance. Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here.) Currently there is no function to revert an explicit quota entry (e) to an initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was Ok.
During operations I have edited *individual* quotas, for example to increase certain users' allocations. Now I want to *revert* (change back) one of these users to the (fileset) default quota! For example, I have used one user account to test the mmedquota command, setting his limits to a certain value (just testing). I'd now like to make that user's quota the default fileset quota, and not just numerically: his quota record should follow any changes in the fileset default quota limits. To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here.) mmedquota's "-d" option is supposed to reinstate the defaults, but it doesn't seem to work for fileset-based quotas! Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default per-fileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9: file system fs9 has default quota on and --perfileset-quota enabled. # mmlsfs fs9 -Q --perfileset-quota flag value description ------------------- ------------------------ ----------------------------------- -Q user;group;fileset Quotas accounting enabled user;fileset Quotas enforced user;group;fileset Default quotas enabled --perfileset-quota Yes Per-fileset quota enforcement # Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9". Then set the default quota for this fileset using "mmdefedquota": # mmdefedquota -u fs9:fset9 .. *** Edit quota limits for USR DEFAULT entry for fileset fset9 NOTE: block limits will be rounded up to the next multiple of the block size. block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I'd like to revert a user's quota to the fileset's default, but "mmedquota -d -u" fails because I have not set a filesystem default: [root at xxx]# mmedquota -d -u user gsb USR default quota is off (Spectrum Scale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn't seem useful, so I never defined one; however, I do have multiple fileset-level default quotas, and this is the level at which I'd like to be able to handle this matter. Have I hit a limitation of the implementation? Any workaround, if that's the case?
Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From novosirj at rutgers.edu Tue Jan 7 17:06:54 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Tue, 7 Jan 2020 17:06:54 +0000 Subject: [gpfsug-discuss] Snapshot migration of any kind? Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 We're in the process of figuring out how to rewrite our filesystems in order to take advantage of the 5.0.x variable subblock size enhancement. However, we keep generally 6 weeks of snapshots as a courtesy to the user community. I assume the answer is no, but is there any option for migrating snapshots, or, barring that, any recommended reading for what you /can/ do with a snapshot beyond create/destroy? Thanks in advance. I'm having trouble coming up with any useful search terms. - -- ____ || \\UTGERS, |----------------------*O*------------------------ ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr.
Technologist - 973/972.0922 ~*~ RBHS Campus || \\ of NJ | Office of Advanced Res. Comp. - MSB C630, Newark `' -----BEGIN PGP SIGNATURE----- iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXhS6qQAKCRCZv6Bp0Ryx vicZAJsHI/z7DXc8EV+sqExhVwMPomoBSQCgyIHgS1Z7RlhQMYAySvDOINAUWPk= =CqPO -----END PGP SIGNATURE----- From kywang at us.ibm.com Tue Jan 7 17:11:35 2020 From: kywang at us.ibm.com (Kuei-Yu Wang-Knop) Date: Tue, 7 Jan 2020 12:11:35 -0500 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu><794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu><746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> Message-ID: Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration and the code level to figure out whether what you are seeing is a problem, expected behavior, or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Kuei-Yu (et al.), Happy New Year! I'd like to reiterate my follow-up question to your comments,
in particular to the line copied below, which mentions a behavior that I'm seeking for this command but cannot reproduce (in version 5.0.3 on Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I'm seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error and applies the user's default restoration to *all filesystems and filesets*. Are you using a different version? Or a different implementation? I'm running SS 5.0.3 on Linux x64. I apologize for having to press this point, but this matter is of a certain importance to us, and the public documentation is mum in this regard. Furthermore, IBM support, on the first pass, was quite confused and unfocused. (I'm trying with a new case now, but my hopes are low.) You seem to be an IBM insider, so I count on your help. Sorry for the insistence. Best, Razvan Popescu Columbia Univ. -- [...] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From rp2927 at gsb.columbia.edu Tue Jan 7 17:23:40 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Tue, 7 Jan 2020 17:23:40 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> Message-ID: <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn't yield a solution. I just opened another one. My very specific question to you is: "Where have you seen this particular syntax work?" The 5.0.3 and 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: "fs9:fset7"), and my testing rejected such a parameter. What made you think this would work? # mmedquota -d -u pfs004 fs9:fset7 ^^^^^^^^^^^^^ Many thanks! Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic.
We would look at your system configuration and code level to figure out whether what you are seeing is a problem, an expected behavior, or a limitation; there are some limitations, specifically around moving default limits between file system and fileset default scope, that may not work for your scenario.

Thanks,
Kuei

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi Kuei-Yu (et al.)

Happy New Year!

I'd like to reiterate my follow-up question to your comments, in particular to the line copied below, which mentions a behavior that I'm seeking for this command, but cannot reproduce (in version 5.0.3 on Linux), namely (with my highlights):

# mmedquota -d -u pfs004 fs9:fset7   <=== run mmedquota -d -u to get default limits

The reduction of scope to a named filesystem/fileset is what I'm seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user's default restoration to *all filesystems and filesets*.

Are you using a different version? Or a different implementation? I'm running SS 5.0.3 on Linux x64.

I apologize for having to press this point, but this matter is of a certain importance to us and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I'm trying with a new case now, but my hopes are low.) You seem to be an IBM insider, so I count on your help. Sorry for the insistence.
Best,
Razvan Popescu
Columbia Univ.
--

From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

Razvan,

mmedquota -d -u fs:fset:

-d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset".

You may need to play a little bit with your system to get the result, as you can have default quota per file system set and default quota per fileset enabled. An example to illustrate:

User pfs004 in filesystem fs9 and fileset fset7 has explicit quota set:

# mmrepquota -u -v fs9 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e   <=== explicit

# mmlsquota -d fs9:fset7
Default Block Limits(KB) | Default File Limits
Filesystem Fileset type quota limit | quota limit entryType
fs9 fset7 USR 102400 1048576 | 10000 0 default on   <=== default quota limits for fs9:fset7, the default
fs9 fset7 GRP 0 0 | 0 0 i

# mmlsquota -u pfs004 fs9:fset7
Block Limits | File Limits
Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks
fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none   <=== explicit

# mmedquota -d -u pfs004 fs9:fset7   <=== run mmedquota -d -u to get default limits

# mmlsquota -u pfs004 fs9:fset7
Block Limits | File Limits
Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks
fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none   <=== takes the default value

# mmrepquota -u -v fs9:fset7 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset   <=== now user pfs004 in fset7 takes the default limits
#

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org

I see.

May I ask one follow-up question, please: what is "mmedquota -d -u" supposed to do in this case?

Really appreciate your assistance.

Razvan
--

From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

>> To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here)

Currently there is no function to revert an explicit quota entry (e) to initial (i) entry.

Kuei

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org

Thanks for your kind reply.

My problem is different though.
I have set a fileset default quota (doing all the steps you recommended) and all was OK. During operations I have edited *individual* quotas, for example to increase certain users' allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota!

For example, I have used one user account to test the mmedquota command, setting his limits to a certain value (just testing). I'd like now to make that user's quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits.

To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here)

mmedquota's "-d" option is supposed to reinstate the defaults, but it doesn't seem to work for fileset based quotas!?!

Razvan
--

From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

It sounds like you would like to have default per-fileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets?

For example, in a filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled.

# mmlsfs fs9 -Q --perfileset-quota
flag                value                    description
------------------- ------------------------ -----------------------------------
-Q                  user;group;fileset       Quotas accounting enabled
                    user;fileset             Quotas enforced
                    user;group;fileset       Default quotas enabled
--perfileset-quota  Yes                      Per-fileset quota enforcement
#

Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9".

Then set the default quota for this fileset using mmdefedquota:

# mmdefedquota -u fs9:fset9
..
*** Edit quota limits for USR DEFAULT entry for fileset fset9
NOTE: block limits will be rounded up to the next multiple of the block size.
block units may be: K, M, G, T or P, inode units may be: K, M or G.
fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K)
inodes in use: 0, limits (soft = 10000, hard = 22222)
...

Hope that this helps.

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi,

I'd like to revert a user's quota to the fileset's default, but "mmedquota -d -u" fails because I have not set a filesystem default:

[root at xxx]# mmedquota -d -u user
gsb USR default quota is off

(Spectrum Scale 5.0.3 Standard Ed. on RHEL7 x86)

Is this a limitation of the current mmedquota implementation, or of something more profound?...

I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn't seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I'd like to be able to handle this matter.

Have I hit a limitation of the implementation? Any workaround, if that's the case?
Many thanks,
Razvan Popescu
Columbia Business School

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From kywang at us.ibm.com Tue Jan 7 19:13:13 2020
From: kywang at us.ibm.com (Kuei-Yu Wang-Knop)
Date: Tue, 7 Jan 2020 14:13:13 -0500
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To: <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu>
References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu>
Message-ID:

Razvan,

You are right, the command below yields an error (as it interprets fs9:fset7 as a user name). Apologies for the confusion. This is what I see in my system:

# mmedquota -d -u pfs004 fs9:fset7
fs1 USR default quota is off
fs9:fset7 is not valid user
#

I must have confused the command names in the previous note. Instead of the "mmedquota -d" command I meant "mmlsquota -d":

# mmlsquota -d fs9:fset7
Default Block Limits(KB) | Default File Limits
Filesystem Fileset type quota limit | quota limit entryType
fs9 fset7 USR 102400 1048576 | 10000 0 default on
fs9 fset7 GRP 0 0 | 0 0 i

# mmlsquota -d fs9:fset9
Default Block Limits(KB) | Default File Limits
Filesystem Fileset type quota limit | quota limit entryType
fs9 fset9 USR 102400 1048576 | 10000 22222 default on
fs9 fset9 GRP 0 0 | 0 0 default off

# mmlsquota -d -u fs9
Default Block Limits(KB) | Default File Limits
Filesystem type quota limit | quota limit Remarks
fs9 USR 102400 1048576 | 10000 0
#

Upon further investigation, the current behavior of "mmedquota -d -u" is to restore the default quotas for the particular user on all filesystems and filesets. The ability to restore the default limits of a user for selected filesets and filesystems is not available at the moment.
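[Editorial aside, not part of the original exchange: since "mmedquota -d -u" resets the user in every filesystem and fileset, one conceivable workaround is to capture the explicit limits you want to keep before the reset and re-apply them afterwards with mmsetquota. The sketch below is a hypothetical helper, not an official IBM procedure; the mmrepquota field layout is assumed from the example output quoted in this thread, and the exact mmsetquota argument form should be checked against the man page for your release.]

```shell
#!/bin/sh
# Hypothetical workaround sketch: before running "mmedquota -d -u <user>"
# (which resets the user in ALL filesystems/filesets), save the explicit
# limits you want to keep, then re-apply them afterwards with mmsetquota.
# Field positions assume the "mmrepquota -u -v" layout shown in this thread:
#   $1=user $2=fileset $5=blockQuota(KB) $6=blockLimit(KB)
#   $11=filesQuota $12=filesLimit  ("|" is its own field, $9)

# Convert one mmrepquota line into the mmsetquota call that would restore
# those limits on <fs>:<fileset> (block values are KB, hence the K suffix).
repquota_line_to_mmsetquota() {
    fs="$1"
    line="$2"
    echo "$line" | awk -v fs="$fs" '{
        printf "mmsetquota %s:%s --user %s --block %sK:%sK --files %s:%s\n",
               fs, $2, $1, $5, $6, $11, $12
    }'
}
```

[Running the emitted command would then restore the saved limits for that one fileset, leaving the other filesets at their defaults.]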
Kuei

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop , gpfsug main discussion list Date: 01/07/2020 12:23 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

Kuei,

Thanks for replying. I did open a ticket before the holidays, but it didn't yield a solution. I just opened another one.

My very specific question to you is: "Where have you seen this particular syntax work?" The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: "fs9:fset7") and my testing rejected such a parameter. What made you think this would work?

# mmedquota -d -u pfs004 fs9:fset7
                         ^^^^^^^^^^^^^

Many thanks!
Razvan
--

From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default

Razvan,

Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration and code level to figure out whether what you are seeing is a problem, an expected behavior, or a limitation; there are some limitations, specifically around moving default limits between file system and fileset default scope, that may not work for your scenario.

Thanks,
Kuei

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com
From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Kuei-Yu (et al.) Happy New Year! I?d like to reiterate my follow-up question to your comments ? in particular to the line copied below, which mentions a behavior that I?m seeking for this command, but cannot reproduce (in vers, 5.0.3 at Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I?m seeking, * but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user?s default restoration to *all filesystems and filesets* Are you using a different version? Or a different implementations? I?m running SS 5.0.3 on Linux x64. I apologize for having to press this point, but this matter is of a certain importance to us and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I?m trying with a new case now, but my hopes are low). You seem to be an IBM insider, so ?. I count on your help ?. Sorry, for the insistence. ? Best, Razvan Popescu Columbia Univ. -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". 
You may need to play a little bit with your system to get the result, as you can have default quota per file system set and default quota per fileset enabled. An example to illustrate:

User pfs004 in filesystem fs9 and fileset fset7 has explicit quota set:

# mmrepquota -u -v fs9 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e   <=== explicit

# mmlsquota -d fs9:fset7
Default Block Limits(KB) | Default File Limits
Filesystem Fileset type quota limit | quota limit entryType
fs9 fset7 USR 102400 1048576 | 10000 0 default on   <=== default quota limits for fs9:fset7, the default
fs9 fset7 GRP 0 0 | 0 0 i

# mmlsquota -u pfs004 fs9:fset7
Block Limits | File Limits
Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks
fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none   <=== explicit

# mmedquota -d -u pfs004 fs9:fset7   <=== run mmedquota -d -u to get default limits

# mmlsquota -u pfs004 fs9:fset7
Block Limits | File Limits
Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks
fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none   <=== takes the default value

# mmrepquota -u -v fs9:fset7 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset   <=== now user pfs004 in fset7 takes the default limits
#

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org

I see.
May I ask one follow-up question, please: what is "mmedquota -d -u" supposed to do in this case?

Really appreciate your assistance.

Razvan
--

From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

>> To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here)

Currently there is no function to revert an explicit quota entry (e) to initial (i) entry.

Kuei

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org

Thanks for your kind reply.

My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was OK. During operations I have edited *individual* quotas, for example to increase certain users' allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota!

For example, I have used one user account to test the mmedquota command, setting his limits to a certain value (just testing). I'd like now to make that user's quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits.

To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "i".
(I hope I'm not talking nonsense here)

mmedquota's "-d" option is supposed to reinstate the defaults, but it doesn't seem to work for fileset based quotas!?!

Razvan
--

From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

It sounds like you would like to have default per-fileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets?

For example, in a filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled.

# mmlsfs fs9 -Q --perfileset-quota
flag                value                    description
------------------- ------------------------ -----------------------------------
-Q                  user;group;fileset       Quotas accounting enabled
                    user;fileset             Quotas enforced
                    user;group;fileset       Default quotas enabled
--perfileset-quota  Yes                      Per-fileset quota enforcement
#

Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9".

Then set the default quota for this fileset using mmdefedquota:

# mmdefedquota -u fs9:fset9
..

*** Edit quota limits for USR DEFAULT entry for fileset fset9
NOTE: block limits will be rounded up to the next multiple of the block size.
block units may be: K, M, G, T or P, inode units may be: K, M or G.
fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K)
inodes in use: 0, limits (soft = 10000, hard = 22222)
...

Hope that this helps.

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com
From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? fails because I do have not set a filesystem default?. [root at xxx]# mmedquota -d -u user gsb USR default quota is off (SpectrumScale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn?t seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I?d like to be able to handle this matter? Have I hit a limitation of the implementation? Any workaround, if that?s the case? Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
From carlz at us.ibm.com Tue Jan 7 19:28:46 2020
From: carlz at us.ibm.com (Carl Zetie - carlz at us.ibm.com)
Date: Tue, 7 Jan 2020 19:28:46 +0000
Subject: [gpfsug-discuss] Spectrum Scale 5.0.5 Beta participation
Message-ID: <3A32DB49-9552-465F-9727-C8E661A7E6EC@us.ibm.com>

We are accepting nominations for IBM Spectrum Scale 5.0.5 Beta participation here:

https://www.surveygizmo.com/s3/5356255/ee853c3af96a

The Beta begins in mid-February. Please note that you'll need your IBM account rep to nominate you.

Carl Zetie
Program Director
Offering Management
Spectrum Scale & Spectrum Discover
----
(919) 473 3318 ][ Research Triangle Park
carlz at us.ibm.com

From rp2927 at gsb.columbia.edu Tue Jan 7 19:39:22 2020
From: rp2927 at gsb.columbia.edu (Popescu, Razvan)
Date: Tue, 7 Jan 2020 19:39:22 +0000
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To:
References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu>
Message-ID: <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu>

Thank you very much, Kuei.
It's now clear where we stand, even though I would have liked to have that added selectivity in mmedquota.

We use filesets to separate classes of projects and classes of storage (backup/noBackup for example), and thus one user or one group (=project) has various resource allocations across filesets (enforced by quotas). Sometimes we need to roll back only certain allocations and leave others untouched. If no one else has encountered this need so far, I guess we twisted the model a bit too much. Maybe we can add this option to some list of desired new features for coming versions?...

Thanks,
Razvan
--

From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 2:13 PM To: "Popescu, Razvan" Cc: gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default

Razvan,

You are right, the command below yields an error (as it interprets fs9:fset7 as a user name). Apologies for the confusion. This is what I see in my system:

# mmedquota -d -u pfs004 fs9:fset7
fs1 USR default quota is off
fs9:fset7 is not valid user
#

I must have confused the command names in the previous note. Instead of the "mmedquota -d" command I meant "mmlsquota -d":

# mmlsquota -d fs9:fset7
Default Block Limits(KB) | Default File Limits
Filesystem Fileset type quota limit | quota limit entryType
fs9 fset7 USR 102400 1048576 | 10000 0 default on
fs9 fset7 GRP 0 0 | 0 0 i

# mmlsquota -d fs9:fset9
Default Block Limits(KB) | Default File Limits
Filesystem Fileset type quota limit | quota limit entryType
fs9 fset9 USR 102400 1048576 | 10000 22222 default on
fs9 fset9 GRP 0 0 | 0 0 default off

# mmlsquota -d -u fs9
Default Block Limits(KB) | Default File Limits
Filesystem type quota limit | quota limit Remarks
fs9 USR 102400 1048576 | 10000 0
#

Upon further investigation, the current behavior of "mmedquota -d -u" is to restore the default quotas for the particular user on all filesystems and filesets.
The ability to restore the default limits of a user for selected filesets and filesystems is not available at the moment.

Kuei

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop , gpfsug main discussion list Date: 01/07/2020 12:23 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

Kuei,

Thanks for replying. I did open a ticket before the holidays, but it didn't yield a solution. I just opened another one.

My very specific question to you is: "Where have you seen this particular syntax work?" The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: "fs9:fset7") and my testing rejected such a parameter. What made you think this would work?

# mmedquota -d -u pfs004 fs9:fset7
                         ^^^^^^^^^^^^^

Many thanks!
Razvan
--

From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default

Razvan,

Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration and code level to figure out whether what you are seeing is a problem, an expected behavior, or a limitation; there are some limitations, specifically around moving default limits between file system and fileset default scope, that may not work for your scenario.
Thanks,
Kuei

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi Kuei-Yu (et al.)

Happy New Year!

I'd like to reiterate my follow-up question to your comments, in particular to the line copied below, which mentions a behavior that I'm seeking for this command, but cannot reproduce (in version 5.0.3 on Linux), namely (with my highlights):

# mmedquota -d -u pfs004 fs9:fset7   <=== run mmedquota -d -u to get default limits

The reduction of scope to a named filesystem/fileset is what I'm seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user's default restoration to *all filesystems and filesets*.

Are you using a different version? Or a different implementation? I'm running SS 5.0.3 on Linux x64.

I apologize for having to press this point, but this matter is of a certain importance to us and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I'm trying with a new case now, but my hopes are low.) You seem to be an IBM insider, so I count on your help. Sorry for the insistence.

Best,
Razvan Popescu
Columbia Univ.
-- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". You may need to play a little bit with your system to get the result as you can have default quota per file system set and default quota per fileset enabled. An exemple to illustrate User pfs004 in filesystem fs9 and fileset fset7 has explicit quota set: # mmrepquota -u -v fs9 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e <=== explicit # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on <=== default quota limits for fs9:fset7, the default fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none <=== explicit # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none <=== takes the default value # mmrepquota -u -v fs9:fset7 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset <=== now user pfs004 in fset7 takes the default limits # ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 
433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I see. May I ask one follow-up question, please: what is "mmedquota -d -u " supposed to do in this case? Really appreciate your assistance. Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "I". (I hope I'm not talking nonsense here) Currently there is no function to revert an explicit quota entry (e) to initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was OK.
During operations I have edited *individual* quotas, for example to increase certain users' allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota! For example, I have used one user account to test the mmedquota command, setting his limits to a certain value (just testing). I'd now like to make that user's quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits. To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "I". (I hope I'm not talking nonsense here) mmedquota's "-d" option is supposed to reinstate the defaults, but it doesn't seem to work for fileset-based quotas!? Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default per-fileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9: file system fs9 has default quota on and --perfileset-quota enabled. # mmlsfs fs9 -Q --perfileset-quota flag value description ------------------- ------------------------ ----------------------------------- -Q user;group;fileset Quotas accounting enabled user;fileset Quotas enforced user;group;fileset Default quotas enabled --perfileset-quota Yes Per-fileset quota enforcement # Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9". Then set the default quota for this fileset using "mmdefedquota": # mmdefedquota -u fs9:fset9 .. *** Edit quota limits for USR DEFAULT entry for fileset fset9 NOTE: block limits will be rounded up to the next multiple of the block size.
block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I'd like to revert a user's quota to the fileset's default, but "mmedquota -d -u " fails because I have not set a filesystem default: [root at xxx]# mmedquota -d -u user gsb USR default quota is off (Spectrum Scale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn't seem useful so I never defined one; however, I do have multiple fileset-level default quotas, and this is the level at which I'd like to be able to handle this matter. Have I hit a limitation of the implementation? Any workaround, if that's the case?
Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From bbanister at jumptrading.com Tue Jan 7 19:40:43 2020 From: bbanister at jumptrading.com (Bryan Banister) Date: Tue, 7 Jan 2020 19:40:43 +0000 Subject: [gpfsug-discuss] Spectrum Scale 5.0.5 Beta participation In-Reply-To: <3A32DB49-9552-465F-9727-C8E661A7E6EC@us.ibm.com> References: <3A32DB49-9552-465F-9727-C8E661A7E6EC@us.ibm.com> Message-ID: Hi Carl, Without going through the form completely, is there a short breakdown of what features are available to test in the 5.0.5 beta? -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Carl Zetie - carlz at us.ibm.com Sent: Tuesday, January 7, 2020 1:29 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Spectrum Scale 5.0.5 Beta participation [EXTERNAL EMAIL] We are accepting nominations for IBM Spectrum Scale 5.0.5 Beta participation here: https://urldefense.com/v3/__https://www.surveygizmo.com/s3/5356255/ee853c3af96a__;!!GSt_xZU7050wKg!-45kSCmNkDN_VaPV5a_MRw-agaDN2iav0KlVEKh7tgnWfA2U0zeE7zenEXkA3iFaVHxF$ The Beta begins in mid-February. Please note that you'll need your IBM account rep to nominate you.
Carl Zetie Program Director Offering Management Spectrum Scale & Spectrum Discover ---- (919) 473 3318 ][ Research Triangle Park carlz at us.ibm.com From kywang at us.ibm.com Tue Jan 7 19:50:31 2020 From: kywang at us.ibm.com (Kuei-Yu Wang-Knop) Date: Tue, 7 Jan 2020 14:50:31 -0500 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Message-ID: Razvan, You can open an RFE (Request for Enhancement) for this issue if you would like this function to be considered for future versions. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop Cc: gpfsug main discussion list Date: 01/07/2020 02:39 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Thank you very much, Kuei. It's now clear where we stand, even though I would have liked to have that added selectivity in mmedquota. We use filesets to separate classes of projects and classes of storage (backup/noBackup for example), and thus one user or one group (=project) has various resource allocations across filesets (enforced by quotas). Sometimes we need to roll back only certain allocations and leave others untouched. If no one else encountered this need so far, I guess we twisted the model a bit too much. Maybe we can add this option to some list of desired new features for coming versions?...
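Until such a selective option exists, one conceivable workaround, sketched below as an untested dry run, is to record the explicit limits on the filesets that should keep them, revert the user everywhere with mmedquota -d -u, and then reapply the saved limits with mmsetquota. The function only prints the commands it would issue; the user name, fileset names, and limit values are placeholders, and the mmsetquota argument layout is assumed from the 5.0.x documentation:

```shell
# Dry-run sketch of a selective revert: print (not execute) the commands
# that would revert a user to defaults everywhere, then restore explicit
# limits on the filesets that should keep them. All names are examples.
plan_selective_revert() {
    user=$1 fs=$2
    shift 2
    # step 1: revert the user to default quotas (today this is all-or-nothing)
    echo "mmedquota -d -u $user"
    # step 2: reapply the previously recorded explicit limits where needed
    for fset in "$@"; do
        echo "mmsetquota $fs:$fset --user $user --block 100M:1G --files 10000:33333"
    done
}

plan_selective_revert pfs004 fs9 fset8 fset9
```

Replacing `echo` with direct execution would turn the plan into the real operation; between steps 1 and 2 the user briefly holds only default limits, which is the price of the missing per-fileset scope.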
Thanks, Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 2:13 PM To: "Popescu, Razvan" Cc: gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, You are right, the command below yields an error (as it interprets fs9:fset7 as a user name). Apologies for the confusion. This is what I see in my system: # mmedquota -d -u pfs004 fs9:fset7 fs1 USR default quota is off fs9:fset7 is not valid user # I must have confused the command names in the previous note. Instead of the "mmedquota -d" command I meant "mmlsquota -d": # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -d fs9:fset9 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset9 USR 102400 1048576 | 10000 22222 default on fs9 fset9 GRP 0 0 | 0 0 default off # mmlsquota -d -u fs9 Default Block Limits(KB) | Default File Limits Filesystem type quota limit | quota limit Remarks fs9 USR 102400 1048576 | 10000 0 # Upon further investigation, the current behavior of 'mmedquota -d -u ' is to restore the default quotas for the particular user on all filesystems and filesets. The ability to restore the default limits of a user for selected filesets and filesystems is not available at the moment. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com
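The entryType column in the listings above also makes it easy to check, before reverting anything, whether a fileset's per-fileset user defaults are active ("default on") or still initial ("i"). A small untested sketch over a simplified sample of that output; the column positions are assumed from the 5.0.x listing shown here, and real `mmlsquota -d` output would be piped in instead of the embedded sample:

```shell
# Report the entryType of each data row in simplified `mmlsquota -d`
# output. "default on" spans two fields, so it is rejoined below.
sample='Filesystem Fileset type quota limit entryType
fs9 fset7 USR 102400 1048576 default on
fs9 fset7 GRP 0 0 i'

default_states() {
    printf '%s\n' "$sample" | awk 'NR > 1 {
        state = $6
        if (NF > 6) state = $6 " " $7   # rejoin the two-word "default on"
        print $2, $3, state
    }'
}

default_states
# prints:
#   fset7 USR default on
#   fset7 GRP i
```

A fileset still reporting "i" for USR would need "mmdefquotaon -u fs:fset" and "mmdefedquota -u fs:fset" before a revert to defaults could have any effect there.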
From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop , gpfsug main discussion list Date: 01/07/2020 12:23 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default ________________________________ Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn't yield a solution. I just opened another one. My very specific question to you is: "Where have you seen this particular syntax work?" The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: "fs9:fset7") and my testing rejected such a parameter. What made you think this would work? # mmedquota -d -u pfs004 fs9:fset7 ^^^^^^^^^^^^^ Many thanks! Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration and the code level to figure out whether what you are seeing is a problem, an expected behavior, or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss
Name: 15076604.gif Type: image/gif Size: 106 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 15250423.gif Type: image/gif Size: 107 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 15764009.gif Type: image/gif Size: 108 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 15169856.gif Type: image/gif Size: 109 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 15169125.gif Type: image/gif Size: 110 bytes Desc: not available URL: From rp2927 at gsb.columbia.edu Tue Jan 7 19:51:19 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Tue, 7 Jan 2020 19:51:19 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Message-ID: How do I do that? (thnks!) Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 2:50 PM To: "Popescu, Razvan" Cc: gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, You can open an RFE (Request for Enhancement) for this issue if you would like this function to be considered for future versions. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---01/07/2020 02:39:30 PM---Thank you very much, Kuei. 
It?s now clear where we st]"Popescu, Razvan" ---01/07/2020 02:39:30 PM---Thank you very much, Kuei. It?s now clear where we stand, even though I would have liked to have th From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop Cc: gpfsug main discussion list Date: 01/07/2020 02:39 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default ________________________________ Thank you very much, Kuei. It?s now clear where we stand, even though I would have liked to have that added selectivity in mmedquota. We use filesets to separate classes of projects and classes of storage (backup/noBackup for example), and thus one user or one group(=project), has various resource allocations across filesets (enforced by quotas). Sometimes we need to roll back only certain allocations and leave other untouched ?. If no one else encountered this need so far, I guess we twisted the model a bit too much ?. Maybe we can add this option to some list of desired new features for coming versions?... Thanks, Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 2:13 PM To: "Popescu, Razvan" Cc: gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, You are right, the command below yields to an error (as it interprets fs9:fset7 as a user name). Apologies for the confusion. This is what I see in my system: # mmedquota -d -u pfs004 fs9:fset7 fs1 USR default quota is off fs9:fset7 is not valid user # I must have confused the command names in the previous note. Instead of "mmedquota -d" command I meant "mmlsquota -d" (??) 
# mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -d fs9:fset9 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset9 USR 102400 1048576 | 10000 22222 default on fs9 fset9 GRP 0 0 | 0 0 default off #mmlsquota -d -u fs9 Default Block Limits(KB) | Default File Limits Filesystem type quota limit | quota limit Remarks fs9 USR 102400 1048576 | 10000 0 # Upon further investigation, the current behavior of 'mmedquota -d -u ' is restore the default quotas for the particular user on all filesystems and filesets. The ability to restore the default limits of a user for selected filesets and filesystems is not available at the moment. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---01/07/2020 12:23:47 PM---Kuei, Thanks for replying. I did open a ticket before]"Popescu, Razvan" ---01/07/2020 12:23:47 PM---Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn?t yield a solution From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop , gpfsug main discussion list Date: 01/07/2020 12:23 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default ________________________________ Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn?t yield a solution. I just opened another one? My very specific question to you is: ?Where have you seen this particular syntax work?? The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: ?fs9:fset7?) and my testing rejected such a parameter. What made you think this would work? ? # mmedquota -d -u pfs004 fs9:fset7 ^^^^^^^^^^^^^ Many thanks! 
Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration, the code level to figure out whether what you are seeing is a problem or it is just an expected behavior or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---01/07/2020 11:32:42 AM---Hi Kuei-Yu (et al.) Happy New Year!]"Popescu, Razvan" ---01/07/2020 11:32:42 AM---Hi Kuei-Yu (et al.) Happy New Year! From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Kuei-Yu (et al.) Happy New Year! I?d like to reiterate my follow-up question to your comments ? in particular to the line copied below, which mentions a behavior that I?m seeking for this command, but cannot reproduce (in vers, 5.0.3 at Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I?m seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user?s default restoration to *all filesystems and filesets* Are you using a different version? Or a different implementations? I?m running SS 5.0.3 on Linux x64. 
I apologize for having to press this point, but this matter is of a certain importance to us and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I?m trying with a new case now, but my hopes are low). You seem to be an IBM insider, so ?. I count on your help ?. Sorry, for the insistence. ? Best, Razvan Popescu Columbia Univ. -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". You may need to play a little bit with your system to get the result as you can have default quota per file system set and default quota per fileset enabled. 
An exemple to illustrate User pfs004 in filesystem fs9 and fileset fset7 has explicit quota set: # mmrepquota -u -v fs9 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e <=== explicit # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on <=== default quota limits for fs9:fset7, the default fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none <=== explicit # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none <=== takes the default value # mmrepquota -u -v fs9:fset7 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset <=== now user pfs004 in fset7 takes the default limits # ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---12/19/2019 02:28:51 PM---I see. May I ask one follow-up question, please: what]"Popescu, Razvan" ---12/19/2019 02:28:51 PM---I see. May I ask one follow-up question, please: what is ?mmedquota -d -u ? supposed From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I see. May I ask one follow-up question, please: what is ?mmedquota -d -u ? supposed to do in this case? 
Really appreciate your assistance. Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical... This fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here) Currently there is no function to revert an explicit quota entry (e) to an initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was OK. During operations I have edited *individual* quotas, for example to increase certain users' allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota! For example, I have used one user account to test the mmedquota command, setting his limits to a certain value (just testing). I'd like now to make that user's quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits. To make it more technical... this fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here) mmedquota's "-d"
option is supposed to reinstate the defaults, but it doesn't seem to work for fileset-based quotas!? Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default per-fileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled:

# mmlsfs fs9 -Q --perfileset-quota
flag                value              description
------------------- ------------------------ -----------------------------------
-Q                  user;group;fileset Quotas accounting enabled
                    user;fileset       Quotas enforced
                    user;group;fileset Default quotas enabled
--perfileset-quota  Yes                Per-fileset quota enforcement

Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9". Then set the default quota for this fileset using mmdefedquota:

# mmdefedquota -u fs9:fset9
*** Edit quota limits for USR DEFAULT entry for fileset fset9
NOTE: block limits will be rounded up to the next multiple of the block size. block units may be: K, M, G, T or P, inode units may be: K, M or G.
fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K)
     inodes in use: 0, limits (soft = 10000, hard = 22222)

Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com
From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I'd like to revert a user's quota to the fileset's default, but "mmedquota -d -u" fails because I have not set a filesystem default: [root at xxx]# mmedquota -d -u user gsb USR default quota is off (Spectrum Scale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn't seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I'd like to be able to handle this matter. Have I hit a limitation of the implementation? Any workaround, if that's the case? Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From bipcuds at gmail.com Tue Jan 7 20:10:10 2020 From: bipcuds at gmail.com (Keith Ball) Date: Tue, 7 Jan 2020 15:10:10 -0500 Subject: [gpfsug-discuss] Grafana graph panels give "Cannot read property 'index' of undefined" using Grafana bridge Message-ID: Hi All, I am using the following combination of components on my GUI/pmcollector node: - RHEL 7.3 - Spectrum Scale 4.2.3.5 (actually part of a Lenovo DSS release) - gpfs.gss.pmcollector-4.2.3.5.el7.x86_64 - Python 3.6.8 - CherryPy 18.5.0 - Grafana bridge: no version actually appears in the python script, but a "buildDate.txt" file distributed with the bridge indicates "Thu Aug 16 10:48:21 CET 2016" (seems super-old for something downloaded in the last 2 months?). No other version info to be found in the script. It appears that I can add the bridge as an OpenTSDB-like data source to Grafana successfully (the "Save & Test" says that it was successful and working). When I create a graph panel, I am getting completion for perfmon metrics/timeseries and tag/filter values (but not tag keys for some reason).
However, whether I try to create my own simple graph, or use the canned dashboards (on the Scale wiki), every panel gives the same error (exclamation point in the red triangle in the upper-left corner of the graph): Cannot read property 'index' of undefined. An example query would be for gpfs_fs_bytes_read, Aggregator=avg, Disable Downsampling, Filters: cluster = literal_or(my.cluster.name), groupBy = false; filesystem = literal_or(homedirs), groupBy = false. Anyone know what exactly the "Cannot read property 'index' of undefined" really means (i.e. what is causing it), or has had to debug this on their own perfmon and Grafana setup? Am I using incompatible versions of components? I do not see anything that looks like error messages in the Grafana bridge log file, nor in the Grafana log file. Does anyone have anything to suggest? Many Thanks, Keith From jonathan.buzzard at strath.ac.uk Wed Jan 8 12:16:09 2020 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 8 Jan 2020 12:16:09 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Message-ID: On Tue, 2020-01-07 at 19:39 +0000, Popescu, Razvan wrote: > Thank you very much, Kuei. It's now clear where we stand, even > though I would have liked to have that added selectivity in > mmedquota.
> Note in the meantime you could "simulate" this with a relatively simple script that grabs the quota information for the relevant user, uses mmsetquota to wipe all the quota information for the user, and then some more mmsetquota to set all the ones you want. While not ideal, the window of opportunity for the end user to exploit not having any quotas would be a matter of seconds. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From knop at us.ibm.com Wed Jan 8 13:29:57 2020 From: knop at us.ibm.com (Felipe Knop) Date: Wed, 8 Jan 2020 13:29:57 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Message-ID:
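The wipe-and-reapply workaround Jonathan sketches above could look roughly like the following. This is only an illustrative sketch: the filesystem, fileset, user name, and limit values are placeholders, the mmsetquota argument form is assumed from the command's usual Device:Fileset syntax, and DRY_RUN (on by default) makes the helper print the commands instead of executing them, so nothing touches a real cluster by accident.

```shell
#!/bin/sh
# Sketch: capture a user's current limits for reference, wipe the explicit
# per-user entry, then re-apply the intended values. All names and limits
# below are hypothetical placeholders.
DRY_RUN=${DRY_RUN:-1}
FS=fs9; FSET=fset7; USERNAME=pfs004
NEW_BLOCK="100M:1G"     # soft:hard block limits to re-apply
NEW_FILES="10000:33333" # soft:hard inode limits to re-apply

run() {
    # With DRY_RUN=1 just show what would be executed.
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# 1. record the current limits for reference
run mmlsquota -u "$USERNAME" "$FS:$FSET"
# 2. wipe the explicit per-user entry (0:0 = no limits); the user is
#    briefly unlimited between steps 2 and 3
run mmsetquota "$FS:$FSET" --user "$USERNAME" --block 0:0 --files 0:0
# 3. re-apply the intended limits
run mmsetquota "$FS:$FSET" --user "$USERNAME" --block "$NEW_BLOCK" --files "$NEW_FILES"
```

Running it with DRY_RUN=0 would execute the mmsetquota calls for real; note this still leaves an explicit ("e") entry rather than a true default ("d_fset") one, which is exactly the gap discussed in this thread.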
From heinrich.billich at id.ethz.ch Wed Jan 8 17:02:18 2020 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Wed, 8 Jan 2020 17:02:18 +0000 Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Message-ID: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> Hello, still new to AFM, so some basic questions on how recovery works for a SW cache: we have an AFM SW cache in recovery mode. Recovery first did run policies on the cache cluster, but now I see a "tcpcachescan" process on cache slowly scanning home via NFS. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache AFM gateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/.../recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up file lists before recovery starts to update home. I see some risk: in this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can't evict? I wonder: * Is this to be expected and normal behavior? What to do about it? * Will every reboot of a gateway node trigger a recovery of all AFM filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way? Home is a GPFS cluster, hence we easily could produce the needed file list on home with a policy scan in a few minutes. Thank you, I will welcome any clarification, advice or comments. Kind regards, Heiner
-- ======================= Heinrich Billich ETH Zürich Informatikdienste Tel.: +41 44 632 72 56 heinrich.billich at id.ethz.ch ======================== From lgayne at us.ibm.com Wed Jan 8 18:15:47 2020 From: lgayne at us.ibm.com (Lyle Gayne) Date: Wed, 8 Jan 2020 18:15:47 +0000 Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? In-Reply-To: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> Message-ID: From rp2927 at gsb.columbia.edu Thu Jan 9 19:27:30 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Thu, 9 Jan 2020 19:27:30 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Message-ID: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Hi, I'm trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as "-L 5" listed all the scanned files but gave no info about which ones have been sent to the backup server. Is there a way (any way) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that's feasible. Thanks, Razvan N. Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business From Rafael.Cezario at ibm.com Thu Jan 9 19:48:07 2020 From: Rafael.Cezario at ibm.com (Rafael Cezario) Date: Thu, 9 Jan 2020 16:48:07 -0300 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: Hello, It's possible.
From the client, query the options to get the log configuration: # dsmc query options For example: SCHEDLOGNAME: /var/log/tsm/dsmsched.log # mmbackup /FS -t incremental -N Server --backup-threads 12 -v -L 6 --tsm-servers server --scope filesystem After that, check your log file /var/log/tsm/dsmsched.log:

01/09/20 00:51:45 Retry # 2 Normal File--> 1,356,789 /File/agent.log [Sent]
01/09/20 00:51:45 Retry # 1 Normal File--> 5,120,062 /File/agent.log.1 [Sent]
01/09/20 00:51:46 Successful incremental backup of '/File'

Regards, Rafael From: "Popescu, Razvan" To: gpfsug main discussion list Date: 09/01/2020 16:27 Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I'm trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as "-L 5" listed all the scanned files but gave no info about which ones have been sent to the backup server. Is there a way (any way) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that's feasible. Thanks, Razvan N. Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=efzA7AwXTDdK-0_uRBnvcy8-s5uewdL51EO34qmTe0I&m=xrmNBKF1K7yQh6tWtHfPemfaWt1wOT7LtKK83BFKE7g&s=HYvQUEzWuxhpP9FtEHHhY4ZV-UsGMJpGjccLEVgcPfk&e=
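To pull just the sent file names out of such a scheduler log, a small filter like the one below works on lines shaped like the excerpt above. The layout (size field followed by the path, with a trailing "[Sent]") is assumed from those sample lines, not from any formal log specification, so it may need adjusting for other client versions.

```shell
# Print the path of every file the client reports as sent, given a
# dsmsched.log in the format shown above ("... <size> <path> [Sent]").
extract_sent() {
    awk '/\[Sent\]$/ { print $(NF-1) }' "$1"
}

# Demo on a copy of the sample lines from the excerpt:
cat > /tmp/dsmsched.sample <<'EOF'
01/09/20 00:51:45 Retry # 2 Normal File--> 1,356,789 /File/agent.log [Sent]
01/09/20 00:51:45 Retry # 1 Normal File--> 5,120,062 /File/agent.log.1 [Sent]
01/09/20 00:51:46 Successful incremental backup of '/File'
EOF
extract_sent /tmp/dsmsched.sample
# -> /File/agent.log
# -> /File/agent.log.1
```

In practice one would point it at the real SCHEDLOGNAME path reported by `dsmc query options`.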
From scale at us.ibm.com Thu Jan 9 20:24:36 2020 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Thu, 9 Jan 2020 15:24:36 -0500 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: Under the FS mount dir, or the $MMBACKUP_RECORD_ROOT dir if you set it, mmbackup creates the following file that contains all backup candidate files: .mmbackupCfg/updatedFiles/.list* By default, mmbackup deletes the file upon successful backup completion, but keeps all temporary files until the next mmbackup invocation if DEBUGmmbackup=2 is set. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/09/2020 02:29 PM Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I'm trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as "-L 5" listed all the scanned files but gave no info about which ones have been sent to the backup server.
Is there a way (any way) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that's feasible. Thanks, Razvan N. Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=369ydzzb59Q4zfz0T74pkucjHcKuR63z0UAf2aMqAz0&s=3za7Rn3o9V7oajWNFe-U8PvMH8hQLUyVVrHuFCind0g&e= From rp2927 at gsb.columbia.edu Thu Jan 9 21:00:59 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Thu, 9 Jan 2020 21:00:59 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Message-ID: Hi Jonathan, Thanks for your kind reply. Indeed, I can always do that. Best, Razvan -- On 1/8/20, 7:17 AM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Jonathan Buzzard" wrote: On Tue, 2020-01-07 at 19:39 +0000, Popescu, Razvan wrote: > Thank you very much, Kuei. It's now clear where we stand, even > though I would have liked to have that added selectivity in > mmedquota.
> Note in the meantime you could "simulate" this with a relatively simple script that grabs the quota information for the relevant user, uses mmsetquota to wipe all the quota information for the user, and then some more mmsetquota to set all the ones you want. While not ideal, the window of opportunity for the end user to exploit not having any quotas would be a matter of seconds. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From rp2927 at gsb.columbia.edu Thu Jan 9 21:19:40 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Thu, 9 Jan 2020 21:19:40 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: Thanks, I'll set tonight's run with that debug flag. Best, Razvan -- From: on behalf of IBM Spectrum Scale Reply-To: gpfsug main discussion list Date: Thursday, January 9, 2020 at 3:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Under the FS mount dir, or the $MMBACKUP_RECORD_ROOT dir if you set it, mmbackup creates the following file that contains all backup candidate files: .mmbackupCfg/updatedFiles/.list* By default, mmbackup deletes the file upon successful backup completion, but keeps all temporary files until the next mmbackup invocation if DEBUGmmbackup=2 is set.
Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/09/2020 02:29 PM Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I'm trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as "-L 5" listed all the scanned files but gave no info about which ones have been sent to the backup server. Is there a way (any way) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that's feasible. Thanks, Razvan N.
Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From rp2927 at gsb.columbia.edu Thu Jan 9 21:38:02 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Thu, 9 Jan 2020 21:38:02 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> Hi Rafael, This looks awesomely promising, but I can't find the info you refer to here. My SCHEDLOGNAME points to /root/dsmsched.log but there is no file by that name in /root. I have the error and instrumentation logs (dsmerror.log and dsminstr.log) per their options, but not the scheduler. Could it be because I don't run mmbackup via the TSM scheduler?! (I run it as a cronjob, inside a little wrapper that takes care of preparing/deleting a snapshot for it). Must I run the scheduler to log the activity of the client? Thanks, Razvan -- From: on behalf of Rafael Cezario Reply-To: gpfsug main discussion list Date: Thursday, January 9, 2020 at 2:48 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Hello, It's possible.
From the client, query the options to get the log configuration: # dsmc query options For example: SCHEDLOGNAME: /var/log/tsm/dsmsched.log # mmbackup /FS -t incremental -N Server --backup-threads 12 -v -L 6 --tsm-servers server --scope filesystem After that, check your log file /var/log/tsm/dsmsched.log: 01/09/20 00:51:45 Retry # 2 Normal File--> 1,356,789 /File/agent.log [Sent] 01/09/20 00:51:45 Retry # 1 Normal File--> 5,120,062 /File/agent.log.1 [Sent] 01/09/20 00:51:46 Successful incremental backup of '/File' Regards, Rafael From: "Popescu, Razvan" To: gpfsug main discussion list Date: 09/01/2020 16:27 Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I'm trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as "-L 5" listed all the scanned files but gave no info about which ones have been sent to the backup server. Is there a way (any way) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that's feasible. Thanks, Razvan N. Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss
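Since the scheduler log is absent when mmbackup runs from cron, the DEBUGmmbackup=2 route mentioned earlier in the thread is the other way to see what was selected. The sketch below is illustrative only: the mmbackup invocation (mount point and server name are placeholders) is shown as a comment, and the small list-reading helper is demonstrated against a mock directory tree, since it merely reads whatever files are present.

```shell
# Read the candidate lists mmbackup leaves under
# <root>/.mmbackupCfg/updatedFiles/ when DEBUGmmbackup=2 keeps them around.
show_candidates() {
    root=$1
    cat "$root"/.mmbackupCfg/updatedFiles/.list* 2>/dev/null
}

# On a real system one would run something like (not executed here;
# names are placeholders):
#   DEBUGmmbackup=2 mmbackup /gpfs/fs1 -t incremental --tsm-servers server1
#   show_candidates /gpfs/fs1

# Demo against a mock tree standing in for a fileset root:
mkdir -p /tmp/mockfs/.mmbackupCfg/updatedFiles
printf '/gpfs/fs1/a.txt\n/gpfs/fs1/b.txt\n' > /tmp/mockfs/.mmbackupCfg/updatedFiles/.list.1
show_candidates /tmp/mockfs
```

The same helper works unchanged against $MMBACKUP_RECORD_ROOT if that variable is set for the mmbackup run.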
From NSCHULD at de.ibm.com Fri Jan 10 09:00:53 2020 From: NSCHULD at de.ibm.com (Norbert Schuld) Date: Fri, 10 Jan 2020 10:00:53 +0100 Subject: [gpfsug-discuss] Grafana graph panels give "Cannot read property 'index' of undefined" using Grafana bridge In-Reply-To: References: Message-ID: Hello Keith, please check for more recent versions of the bridge here: https://github.com/IBM/ibm-spectrum-scale-bridge-for-grafana Also, updating Grafana to some newer version could help; I found some older reports while searching for the error message. HTH, Norbert From: Keith Ball To: gpfsug-discuss at spectrumscale.org Date: 07.01.2020 21:10 Subject: [EXTERNAL] [gpfsug-discuss] Grafana graph panels give "Cannot read property 'index' of undefined" using Grafana bridge Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi All, I am using the following combination of components on my GUI/pmcollector node: - RHEL 7.3 - Spectrum Scale 4.2.3.5 (actually part of a Lenovo DSS release) - gpfs.gss.pmcollector-4.2.3.5.el7.x86_64 - Python 3.6.8 - CherryPy 18.5.0 - Grafana bridge: no version actually appears in the python script, but a "buildDate.txt" file distributed with the bridge indicates "Thu Aug 16 10:48:21 CET 2016" (seems super-old for something downloaded in the last 2 months?). No other version info to be found in the script. It appears that I can add the bridge as an OpenTSDB-like data source to Grafana successfully (the "Save & Test" says that it was successful and working). When I create a graph panel, I am getting completion for perfmon metrics/timeseries and tag/filter values (but not tag keys for some reason). However, whether I try to create my own simple graph, or use the canned dashboards (on the Scale wiki), every panel gives the same error (exclamation point in the red triangle in the upper-left corner of the graph): Cannot read property 'index' of undefined. An example query would be for gpfs_fs_bytes_read, Aggregator=avg, Disable Downsampling, Filters:
cluster = literal_or(my.cluster.name), groupBy = false; filesystem = literal_or(homedirs), groupBy = false. Anyone know what exactly the "Cannot read property 'index' of undefined" really means (i.e. what is causing it), or has had to debug this on their own perfmon and Grafana setup? Am I using incompatible versions of components? I do not see anything that looks like error messages in the Grafana bridge log file, nor in the Grafana log file. Does anyone have anything to suggest? Many Thanks, Keith _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=i4V0h7L9ElftZNfcuPIXmAHN2jl5TLcuyFLqtinu4j8&m=cqPhew27KzZmjx-Ai5Xk9NPLgCzZg6M2501wjjZ8ItY&s=jdSYaqQcp-DBBW6D0aax4E_qysldCTvWue3iMUemeuw&e= From jonathan.buzzard at strath.ac.uk Fri Jan 10 10:17:25 2020 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Fri, 10 Jan 2020 10:17:25 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> Message-ID: On Thu, 2020-01-09 at 21:38 +0000, Popescu, Razvan wrote: > Hi Rafael, > > This looks awesomely promising, but I can't find the info you refer > to here. > My SCHEDLOGNAME points to /root/dsmsched.log but there is no > file by that name in /root. I have the error and instrumentation > logs (dsmerror.log and dsminstr.log) per their options, but not the > scheduler.
> > Could it be because I don't run mmbackup via the TSM scheduler?! > (I run it as a cronjob, inside a little wrapper that takes care of > preparing/deleting a snapshot for it). Must I run the scheduler to > log the activity of the client? > That is not a "recommended" way to do a TSM backup. You should use a schedule where the action is command. See https://www.ibm.com/support/knowledgecenter/SSEQVQ_8.1.0/srv.reference/r_cmd_schedule_client_define.html and then set the command to be your script. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From rp2927 at gsb.columbia.edu Fri Jan 10 15:17:50 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Fri, 10 Jan 2020 15:17:50 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> Message-ID: <81809034-B68D-4E85-8CDD-50B2FE063755@gsb.columbia.edu>
    >
    > Could it be because I don't run mmbackup via the TSM scheduler ?!
    > (I run it as a cronjob, inside a little wrapper that takes care of
    > preparing/deleting a snapshot for it). Must I run the scheduler to
    > log the activity of the client?
    >

    That is not a "recommended" way to do a TSM backup. You should use a
    schedule where the action is command. See

    https://www.ibm.com/support/knowledgecenter/SSEQVQ_8.1.0/srv.reference/r_cmd_schedule_client_define.html

    and then set the command to be your script.

    JAB.

    -- 
    Jonathan A. Buzzard                         Tel: +44141-5483420
    HPC System Administrator, ARCHIE-WeSt.
    University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From vpuvvada at in.ibm.com Mon Jan 13 07:39:49 2020
From: vpuvvada at in.ibm.com (Venkateswara R Puvvada)
Date: Mon, 13 Jan 2020 13:09:49 +0530
Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
In-Reply-To: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch>
References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch>
Message-ID: 

AFM maintains an in-memory queue at the gateway node to keep track of changes happening on the fileset. If the in-memory queue is lost (memory pressure, daemon shutdown etc.), AFM runs a recovery process which involves creating the snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operations) dependency, any changes to the AFM fileset during the recovery won't get replicated until the recovery completes. AFM does the home directory scan for only dirty directories to get the names of the deleted and renamed files, because the old name of a renamed file and the deleted file name are not available at the cache on disk. Directories are made dirty when a rename or unlink operation is performed inside them.
In your case it may be that all the directories became dirty due to the rename/unlink operations. The AFM recovery process is single threaded.

>Is this to be expected and normal behavior? What to do about it?
>Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very unpractical, or is there some better way?

Only for the dirty directories, see above.

>Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes.

There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home so that the home directory scan can be avoided. There are some issues fixed in this regard. What is the Scale version?

https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436

~Venkat (vpuvvada at in.ibm.com)

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 01/08/2020 10:32 PM
Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hello,

still new to AFM, so some basic question on how Recovery works for a SW cache:

we have an AFM SW cache in recovery mode - recovery first did run policies on the cache cluster, but now I see a 'tspcachescan' process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/?/recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up filelists before recovery starts to update home. I see some risk: In this time new changes pile up on cache.
Memory may become an issue? Cache may fill up and we can't evict?

I wonder

* Is this to be expected and normal behavior? What to do about it?
* Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way?

Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes.

Thank you. I will welcome any clarification, advice or comments.

Kind regards,
Heiner

--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

From u.sibiller at science-computing.de Mon Jan 13 09:11:39 2020
From: u.sibiller at science-computing.de (Ulrich Sibiller)
Date: Mon, 13 Jan 2020 10:11:39 +0100
Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run
In-Reply-To: 
References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu>
Message-ID: 

On 09.01.20 22:19, Popescu, Razvan wrote:
> Thanks,
>
> I'll set tonight's run with that debug flag.

I have not tested this myself but if you enable auditlogging this should create corresponding logs.

Uli

-- 
Science + Computing AG
Vorstandsvorsitzender/Chairman of the board of management: Dr.
Martin Matzke
Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Philippe Miltin
Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196

From heinrich.billich at id.ethz.ch Mon Jan 13 11:59:11 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Mon, 13 Jan 2020 11:59:11 +0000
Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
In-Reply-To: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch>
References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch>
Message-ID: <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch>

Hello Venkat,

thank you, this seems to match our issue. I did trace tspcachescan and do see a long series of open()/read()/close() to the dirtyDirs file. The dirtyDirs file holds 130,000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing?

tspcachescan does the following again and again for different directories:

11:11:36.837032 stat("/fs3101/XXXXX/.snapshots/XXXXX.afm.75872/yyyyy/yyyy", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0
11:11:36.837092 open("/var/mmfs/afm/fs3101-43/recovery/policylist.data.list.dirtyDirs", O_RDONLY) = 8
11:11:36.837127 fstat(8, {st_mode=S_IFREG|0600, st_size=32564140, ...}) = 0
11:11:36.837160 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fff96930000
11:11:36.837192 read(8, "539492355 65537 2795 553648131 "..., 8192) = 8192
11:11:36.837317 read(8, "Caches/com.apple.helpd/Generated"..., 8192) = 8192
11:11:36.837439 read(8, "ish\n539848852 1509237202 2795 5"..., 8192) = 8192
[many more reads]
11:11:36.864104 close(8) = 0
11:11:36.864135 munmap(0x3fff96930000, 8192) = 0

A single iteration takes about 27ms.
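[Editorial aside: the per-iteration cost above lends itself to a quick back-of-the-envelope estimate. The following sketch uses only the figures quoted in this thread (27 ms per open/read/close cycle, 130,000 dirtyDirs lines, ~80M dirtyDirDirents entries); it is an illustration, not output of any Scale tool.]

```python
MS_PER_ITERATION = 27          # observed per-directory open/read/close cost
DIRTY_DIR_LINES = 130_000      # lines in the dirtyDirs file
DIRENT_ENTRIES = 80_000_000    # entries in dirtyDirDirents

def scan_hours(iterations, ms_per_iter=MS_PER_ITERATION):
    """Wall-clock hours if every iteration re-reads the dirtyDirs file."""
    return iterations * ms_per_iter / 1000 / 3600

# One pass per dirtyDirs line would finish in under an hour...
assert scan_hours(DIRTY_DIR_LINES) < 1
# ...but one pass per dirent entry is the 600-hour worst case.
assert scan_hours(DIRENT_ENTRIES) == 600.0
```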
Doing this 130,000 times would be o.k., but if tspcachescan does it 80M times we wait 600 hours. Is there a way to estimate how many iterations tspcachescan will do? The cache fileset holds 140M inodes. At the moment all we can do is to wait?

We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? There probably is no way to flush the pending queue entries while recovery is ongoing?

I did open a case with IBM TS003219893 and will continue there.

Kind regards,
Heiner

From: on behalf of Venkateswara R Puvvada
Reply to: gpfsug main discussion list
Date: Monday, 13 January 2020 at 08:40
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?

AFM maintains an in-memory queue at the gateway node to keep track of changes happening on the fileset. If the in-memory queue is lost (memory pressure, daemon shutdown etc.), AFM runs a recovery process which involves creating the snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operations) dependency, any changes to the AFM fileset during the recovery won't get replicated until the recovery completes. AFM does the home directory scan for only dirty directories to get the names of the deleted and renamed files, because the old name of a renamed file and the deleted file name are not available at the cache on disk. Directories are made dirty when a rename or unlink operation is performed inside them. In your case it may be that all the directories became dirty due to the rename/unlink operations. The AFM recovery process is single threaded.

>Is this to be expected and normal behavior? What to do about it?
>Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very unpractical, or is there some better way?

Only for the dirty directories, see above.
>Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes.

There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home so that the home directory scan can be avoided. There are some issues fixed in this regard. What is the Scale version?

https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436

~Venkat (vpuvvada at in.ibm.com)

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 01/08/2020 10:32 PM
Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

________________________________

Hello,

still new to AFM, so some basic question on how Recovery works for a SW cache:

we have an AFM SW cache in recovery mode - recovery first did run policies on the cache cluster, but now I see a 'tspcachescan' process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/?/recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up filelists before recovery starts to update home. I see some risk: In this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can't evict?

I wonder

* Is this to be expected and normal behavior? What to do about it?
* Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very unpractical, or is there some better way?
Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes.

Thank you. I will welcome any clarification, advice or comments.

Kind regards,
Heiner

--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

From rp2927 at gsb.columbia.edu Mon Jan 13 16:02:45 2020
From: rp2927 at gsb.columbia.edu (Popescu, Razvan)
Date: Mon, 13 Jan 2020 16:02:45 +0000
Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run
In-Reply-To: 
References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu>
Message-ID: <5709E6AE-5DD1-46A1-A1B7-C24BF6FFAF84@gsb.columbia.edu>

Thanks Uli,

I ran the backup with the flag mentioned by {the GPFS team} (thanks again, guys!!) and found the internal list files -- all super fine. I plan to keep that flag in place for a while, to have that info when I might need it (the large files that kept being backed up, and I wanted to trace, just disappeared...)

Razvan
--

On 1/13/20, 4:11 AM, "Ulrich Sibiller" wrote:

    On 09.01.20 22:19, Popescu, Razvan wrote:
    > Thanks,
    >
    > I'll set tonight's run with that debug flag.

    I have not tested this myself but if you enable auditlogging this should create corresponding logs.

    Uli

    -- 
    Science + Computing AG
    Vorstandsvorsitzender/Chairman of the board of management: Dr.
Martin Matzke
Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Philippe Miltin
Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196

From neil.wilson at metoffice.gov.uk Tue Jan 14 15:27:54 2020
From: neil.wilson at metoffice.gov.uk (Wilson, Neil)
Date: Tue, 14 Jan 2020 15:27:54 +0000
Subject: [gpfsug-discuss] mmapplypolicy - listing policy gets occasionally get stuck.
Message-ID: 

Hi All,

We are occasionally seeing an issue where an mmapplypolicy list job gets stuck; all it's doing is generating a listing from a fileset. The problem occurs intermittently and doesn't seem to show any particular pattern (i.e. not always on the same fileset). The policy job shows the usual output but then outputs the following until the process is killed.

[I] 2020-01-08@03:05:30.471 Directory entries scanned: 0.
[I] 2020-01-08@03:05:45.471 Directory entries scanned: 0.
[I] 2020-01-08@03:06:00.472 Directory entries scanned: 0.
[I] 2020-01-08@03:06:15.472 Directory entries scanned: 0.
[I] 2020-01-08@03:06:30.473 Directory entries scanned: 0.
[I] 2020-01-08@03:06:45.473 Directory entries scanned: 0.
[I] 2020-01-08@03:07:00.473 Directory entries scanned: 0.
[I] 2020-01-08@03:07:15.473 Directory entries scanned: 0.
[I] 2020-01-08@03:07:30.475 Directory entries scanned: 0.
[I] 2020-01-08@03:07:45.475 Directory entries scanned: 0.
[I] 2020-01-08@03:08:00.475 Directory entries scanned: 0.
[I] 2020-01-08@03:08:15.475 Directory entries scanned: 0.
[I] 2020-01-08@03:08:30.476 Directory entries scanned: 0.
[I] 2020-01-08@03:08:45.476 Directory entries scanned: 0.
[I] 2020-01-08@03:09:00.477 Directory entries scanned: 0.
[I] 2020-01-08@03:09:15.477 Directory entries scanned: 0.
[I] 2020-01-08@03:09:30.478 Directory entries scanned: 0.
[I] 2020-01-08@03:09:45.478 Directory entries scanned: 0.
[I] 2020-01-08@03:10:00.478 Directory entries scanned: 0.
[I] 2020-01-08@03:10:15.478 Directory entries scanned: 0.
[I] 2020-01-08@03:10:30.479 Directory entries scanned: 0.
[I] 2020-01-08@03:10:45.480 Directory entries scanned: 0.
[I] 2020-01-08@03:11:00.481 Directory entries scanned: 0.

Have any of you come across an issue like this before?

Kind regards
Neil

Neil Wilson  Senior IT Practitioner
Storage, Virtualisation and Mainframe Team  IT Services
Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom

From vpuvvada at in.ibm.com Tue Jan 14 16:50:17 2020
From: vpuvvada at in.ibm.com (Venkateswara R Puvvada)
Date: Tue, 14 Jan 2020 22:20:17 +0530
Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
In-Reply-To: <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch>
References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch>
Message-ID: 

Hi,

>The dirtyDirs file holds 130,000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing?

Yes, this is the major problem fixed as mentioned in the APAR below. The dirtyDirs file is opened for each entry in the dirtyDirDirents file, and this causes the performance overhead.

>At the moment all we can do is to wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release?
>There probably is no way to flush the pending queue entries while recovery is ongoing?

Later versions have the fix mentioned in that APAR, and I believe it should fix your current performance issue. Flushing the pending queue entries is not available as of today (5.0.4); we are currently working on this feature.
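[Editorial aside: the overhead described above — re-reading the whole dirtyDirs file once per dirtyDirDirents entry — is the classic O(N×M) versus O(N+M) pattern. A toy illustration follows; the record layout here is invented for the example and is NOT the real on-disk format of AFM's policylist.data.list.dirtyDirs file.]

```python
# Hypothetical "inode path" records standing in for dirtyDirs lines.
lines = [f"inode{i} /fs/dir{i}" for i in range(1000)]

def lookup_rescan(inode):
    # Analogue of an open()/read()/close() of the whole file per entry:
    # every lookup walks all N lines, so M lookups cost O(N*M).
    for line in lines:
        key, path = line.split()
        if key == inode:
            return path
    return None

# The fix: read the file once, index it, then do O(1) lookups (O(N+M) total).
index = dict(line.split() for line in lines)

assert lookup_rescan("inode500") == index["inode500"] == "/fs/dir500"
```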
~Venkat (vpuvvada at in.ibm.com)

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 01/13/2020 05:29 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hello Venkat,

thank you, this seems to match our issue. I did trace tspcachescan and do see a long series of open()/read()/close() to the dirtyDirs file. The dirtyDirs file holds 130,000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing?

tspcachescan does the following again and again for different directories:

11:11:36.837032 stat("/fs3101/XXXXX/.snapshots/XXXXX.afm.75872/yyyyy/yyyy", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0
11:11:36.837092 open("/var/mmfs/afm/fs3101-43/recovery/policylist.data.list.dirtyDirs", O_RDONLY) = 8
11:11:36.837127 fstat(8, {st_mode=S_IFREG|0600, st_size=32564140, ...}) = 0
11:11:36.837160 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fff96930000
11:11:36.837192 read(8, "539492355 65537 2795 553648131 "..., 8192) = 8192
11:11:36.837317 read(8, "Caches/com.apple.helpd/Generated"..., 8192) = 8192
11:11:36.837439 read(8, "ish\n539848852 1509237202 2795 5"..., 8192) = 8192
[many more reads]
11:11:36.864104 close(8) = 0
11:11:36.864135 munmap(0x3fff96930000, 8192) = 0

A single iteration takes about 27ms. Doing this 130,000 times would be o.k., but if tspcachescan does it 80M times we wait 600 hours. Is there a way to estimate how many iterations tspcachescan will do? The cache fileset holds 140M inodes. At the moment all we can do is to wait?

We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? There probably is no way to flush the pending queue entries while recovery is ongoing?

I did open a case with IBM TS003219893 and will continue there.
Kind regards,
Heiner

From: on behalf of Venkateswara R Puvvada
Reply to: gpfsug main discussion list
Date: Monday, 13 January 2020 at 08:40
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?

AFM maintains an in-memory queue at the gateway node to keep track of changes happening on the fileset. If the in-memory queue is lost (memory pressure, daemon shutdown etc.), AFM runs a recovery process which involves creating the snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operations) dependency, any changes to the AFM fileset during the recovery won't get replicated until the recovery completes. AFM does the home directory scan for only dirty directories to get the names of the deleted and renamed files, because the old name of a renamed file and the deleted file name are not available at the cache on disk. Directories are made dirty when a rename or unlink operation is performed inside them. In your case it may be that all the directories became dirty due to the rename/unlink operations. The AFM recovery process is single threaded.

>Is this to be expected and normal behavior? What to do about it?
>Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very unpractical, or is there some better way?

Only for the dirty directories, see above.

>Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes.

There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home so that the home directory scan can be avoided. There are some issues fixed in this regard. What is the Scale version?
https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436

~Venkat (vpuvvada at in.ibm.com)

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 01/08/2020 10:32 PM
Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hello,

still new to AFM, so some basic question on how Recovery works for a SW cache:

we have an AFM SW cache in recovery mode - recovery first did run policies on the cache cluster, but now I see a 'tspcachescan' process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/?/recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up filelists before recovery starts to update home. I see some risk: In this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can't evict?

I wonder

Is this to be expected and normal behavior? What to do about it?
Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very unpractical, or is there some better way?

Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes.

Thank you, I will welcome any clarification, advice or comments.

Kind regards,
Heiner
--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

From stockf at us.ibm.com Tue Jan 14 18:35:05 2020
From: stockf at us.ibm.com (Frederick Stock)
Date: Tue, 14 Jan 2020 18:35:05 +0000
Subject: [gpfsug-discuss] mmapplypolicy - listing policy gets occasionally get stuck.
In-Reply-To: 
References: 
Message-ID: 

An HTML attachment was scrubbed...

From S.J.Thompson at bham.ac.uk Tue Jan 14 20:21:12 2020
From: S.J.Thompson at bham.ac.uk (Simon Thompson)
Date: Tue, 14 Jan 2020 20:21:12 +0000
Subject: [gpfsug-discuss] London User Group
Message-ID: 

Hi All,

Just a date for your diary, the UK/WW/London user group will be taking place 13th/14th May. In addition to this, we're also running an introductory day on 12th May for those recently acquainted with Spectrum Scale. Please mark the dates in your diary!

If you have any topics you would like to hear about in London (or any of the other WW user groups) please let me know. Please also take some time to think about if you could provide a site-update or user talk for the event. The feedback we get is that people want to hear more of these, but we can only do this if you are prepared to volunteer a talk.
Everyone has something to say about their site deployment, maybe you want to talk about what you are doing with Scale, how you found deployment, or the challenges you face.

Finally, as in the past few years, we are looking for sponsors of the UK event; this funds our evening social/networking event which has been a great success over the past few years as the group has grown in size. I will be contacting companies who have supported us in the past, but please also drop me an email if you are interested in sponsoring the group and I will ensure I share the details of the sponsorship offering with you. When we advertise sponsorship, it will be offered on a first come, first served basis.

Thanks

Simon (UK/group chair)

From S.J.Thompson at bham.ac.uk Tue Jan 14 20:25:20 2020
From: S.J.Thompson at bham.ac.uk (Simon Thompson)
Date: Tue, 14 Jan 2020 20:25:20 +0000
Subject: [gpfsug-discuss] #spectrum-discover Slack channel
Message-ID: 

We've today added a new Slack channel to the SSUG/PowerAI UG Slack community, '#spectrum-discover'. Whilst we know that a lot of the people using Spectrum Discover are Spectrum Scale users, we welcome all discussion of Discover on the Slack channel, not just those using Spectrum Scale. As with the #spectrum-scale and #powerai channels, IBM are working to ensure there are appropriate people on the channel to help with discussion/queries.

If you are not already a member of the Slack community, please visit www.spectrumscaleug.org/join for details.

Thanks

Simon (UK/chair)

From heinrich.billich at id.ethz.ch Wed Jan 15 14:55:53 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Wed, 15 Jan 2020 14:55:53 +0000
Subject: [gpfsug-discuss] How to install efix with yum ?
Message-ID: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch>

Hello,

I will install efix9 on 5.0.4.1. The instructions ask to use

rpm --force -U gpfs.*.rpm

but give no yum command. I assume that this is not specific to this efix. I wonder if installing an efix with yum is supported and what the proper commands are? Using yum would make deployment much easier, but I don't see any yum options which match rpm's '--force' option.

--force
    Same as using --replacepkgs, --replacefiles, and --oldpackage.

Yum's 'upgrade' probably is the same as rpm's '--oldpackage', but what about 'replacepkgs' and 'oldpackage'?

Of course I can script this in several ways but using yum should be much easier.

Thank you, any comments are welcome.

Cheers,
Heiner

From Paul.Sanchez at deshaw.com Wed Jan 15 18:30:26 2020
From: Paul.Sanchez at deshaw.com (Sanchez, Paul)
Date: Wed, 15 Jan 2020 18:30:26 +0000
Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch>
References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch>
Message-ID: <8ee7ad0a895442abb843688936ac4d73@deshaw.com>

Yum generally only wants there to be a single version of any package (it is trying to eliminate conflicting provides/depends so that all of the packaging requirements are satisfied). So this alien packaging practice of installing an efix version of a package over the top of the base version is not compatible with yum.

The real issue for draconian sysadmins like us (whose systems must use and obey yum) is that there are files (*liblum.so) which are provided by the non-efix RPMs, but are not owned by the packages according to the RPM database since they're purposefully installed outside of RPM's tracking mechanism.
We work around this by repackaging the three affected RPMs to include the orphaned files from the original RPMs (and eliminating the related but problematic checks from the RPMs' scripts) so that our efix RPMs have been "un-efix-ified" and will install as expected when using "yum upgrade". To my knowledge no one's published a way to do this, so we all just have to figure this out and run rpmrebuild for ourselves.

IBM isn't the only vendor who is "bad at packaging" from a sysadmin's point of view, but they are the only one which owns RedHat (who are the de facto masters of RPM/YUM/DNF packaging) so this should probably get better one day.

Thx
Paul

From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Billich Heinrich Rainer (ID SD)
Sent: Wednesday, January 15, 2020 09:56
To: gpfsug main discussion list
Subject: [gpfsug-discuss] How to install efix with yum ?

This message was sent by an external party.

Hello,

I will install efix9 on 5.0.4.1. The instructions ask to use

rpm --force -U gpfs.*.rpm

but give no yum command. I assume that this is not specific to this efix. I wonder if installing an efix with yum is supported and what the proper commands are? Using yum would make deployment much easier, but I don't see any yum options which match rpm's '--force' option.

--force
    Same as using --replacepkgs, --replacefiles, and --oldpackage.

Yum's 'upgrade' probably is the same as rpm's '--oldpackage', but what about 'replacepkgs' and 'oldpackage'?

Of course I can script this in several ways but using yum should be much easier.

Thank you, any comments are welcome.

Cheers,
Heiner

From jonathan.buzzard at strath.ac.uk Wed Jan 15 19:10:20 2020
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Wed, 15 Jan 2020 19:10:20 +0000
Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: <8ee7ad0a895442abb843688936ac4d73@deshaw.com>
References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com>
Message-ID: <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk>

On 15/01/2020 18:30, Sanchez, Paul wrote:
> Yum generally only wants there to be single version of any package (it
> is trying to eliminate conflicting provides/depends so that all of the
> packaging requirements are satisfied). So this alien packaging practice
> of installing an efix version of a package over the top of the base
> version is not compatible with yum.

I would at this juncture note that IBM should be appending the efix number to the RPM so that for example gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 which would firstly make the problem go away, and second would allow one to know which version of GPFS you happen to have installed on a node without doing some sort of voodoo.

>
> The real issue for draconian sysadmins like us (whose systems must use
> and obey yum) is that there are files (*liblum.so) which are provided by
> the non-efix RPMS, but are not owned by the packages according to the
> RPM database since they're purposefully installed outside of RPM's
> tracking mechanism.
>

It's worse than that because if you install the RPM directly yum/dnf then start bitching about the RPM database being modified outside of themselves and all sorts of useful information gets lost when you purge the package installation history to make the error go away.

> We work around this by repackaging the three affected RPMs to include
> the orphaned files from the original RPMs (and eliminating the related
> but problematic checks from the RPMs' scripts) so that our efix RPMs
> have been "un-efix-ified" and will install as expected when using "yum
> upgrade". To my knowledge no one's published a way to do this, so we
> all just have to figure this out and run rpmrebuild for ourselves.
> IBM should be hanging their heads in shame if the replacement RPM is missing files. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From kkr at lbl.gov Wed Jan 15 18:20:04 2020 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Wed, 15 Jan 2020 10:20:04 -0800 Subject: [gpfsug-discuss] (Please help with) Planning US meeting for Spring 2020 In-Reply-To: References: <42F45E03-0AEC-422C-B3A9-4B5A21B1D8DF@lbl.gov> Message-ID: Now there are 27 wonderful people who have completed the poll. I will close it today, EOB. Please take the 2 minutes to fill it out before it closes. https://forms.gle/NFk5q4djJWvmDurW7 Thanks, Kristy > On Jan 6, 2020, at 3:41 PM, Kristy Kallback-Rose wrote: > > Thank you to the 18 wonderful people who filled out the survey. > > However, there are well more than 18 people at any given UG meeting. > > Please submit your responses today, I promise, it's really short and even painless. 2020 (how did *that* happen?!) is here, we need to plan the next meeting > > Happy New Year. > > Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7 > > Thanks, > Kristy > >> On Dec 16, 2019, at 11:05 AM, Kristy Kallback-Rose > wrote: >> >> Hello, >> >> It's time already to plan for the next US event. We have a quick, seriously, should take order of 2 minutes, survey to capture your thoughts on location and date. It would help us greatly if you can please fill it out. >> >> Best wishes to all in the new year. >> >> -Kristy >> >> >> Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7 -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Wed Jan 15 20:59:33 2020 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 15 Jan 2020 15:59:33 -0500 Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch><8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: >> I don't see any yum options which match rpm's '--force' option. Actually, you do not need to use the --force option since efix RPMs have an incremental efix number in the rpm name. The efix package provides update RPMs to be installed on top of the corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Jonathan Buzzard To: "gpfsug-discuss at spectrumscale.org" Date: 01/15/2020 02:09 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? Sent by: gpfsug-discuss-bounces at spectrumscale.org On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be single version of any package (it > is trying to eliminate conflicting provides/depends so that all of the > packaging requirements are satisfied). So this alien packaging practice > of installing an efix version of a package over the top of the base > version is not compatible with yum.
I would at this juncture note that IBM should be appending the efix number to the RPM so that for example gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 which would firstly make the problem go away, and second would allow one to know which version of GPFS you happen to have installed on a node without doing some sort of voodoo. > > The real issue for draconian sysadmins like us (whose systems must use > and obey yum) is that there are files (*liblum.so) which are provided by > the non-efix RPMS, but are not owned by the packages according to the > RPM database since they're purposefully installed outside of RPM's > tracking mechanism. > It's worse than that because if you install the RPM directly yum/dnf then start bitching about the RPM database being modified outside of themselves and all sorts of useful information gets lost when you purge the package installation history to make the error go away. > We work around this by repackaging the three affected RPMS to include > the orphaned files from the original RPMs (and eliminating the related > but problematic checks from the RPMs' scripts) so that our efix RPMs > have been "un-efix-ified" and will install as expected when using "yum > upgrade". To my knowledge no one's published a way to do this, so we > all just have to figure this out and run rpmrebuild for ourselves. > IBM should be hanging their heads in shame if the replacement RPM is missing files. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow.
G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=d-mEUJTkUy0f2Cth1wflA_xI_HiCKrrKZ_-SAjf2z5Q&s=wkv8CcIBgPcGbuG-aIGgcWZoZqzb6FvvjmKX-V728wE&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From novosirj at rutgers.edu Wed Jan 15 21:10:59 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Wed, 15 Jan 2020 21:10:59 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests Message-ID: Hi there, I know some of the Spectrum Scale developers look at this list. I'm having a little trouble with support on this problem. We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM guests with a portability layer that has been installed via gpfs.gplbin RPMs that we built at our site and have used to install GPFS all over our environment. We've not seen this problem so far on any physical hosts, but have now experienced it on guests running on a number of our KVM hypervisors, across vendors and firmware versions, etc. At one time I thought it was all happening on systems using Mellanox virtual functions for Infiniband, but we've now seen it on VMs without VFs. There may be an SELinux interaction, but some of our hosts have it disabled outright, some are Permissive, and some were working successfully with 5.0.2.x GPFS. What I've been instructed to try to solve this problem has been to run "mmbuildgpl", and it has solved the problem. I don't consider running "mmbuildgpl" a real solution, however. If RPMs are a supported means of installation, it should work.
Support told me that they?d seen this solve the problem at another site as well. Does anyone have any more information about this problem/whether there?s a fix in the pipeline, or something that can be done to cause this problem that we could remedy? Is there an easy place to see a list of eFixes to see if this has come up? I know it?s very similar to a problem that happened I believe it was after 5.0.2.2 and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x. Below is a sample of the crash output: [ 156.733477] kernel BUG at mm/slub.c:3772! [ 156.734212] invalid opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 [ 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: ffff8ae9f7a24000 [ 156.759326] RIP: 
0010:[] [] kfree+0x13c/0x140 [ 156.760749] RSP: 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ 156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: ffff8ae9f7a27938 [ 156.768229] FS: 00007f8ffff05700(0000) GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ 156.775154] [] cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] [] _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvPP10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 [mmfs26] [ 156.779378] [] _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_PcjjjP10ext_cred_t+0x46a/0x7e0 [mmfs26] [ 156.781689] [] ? _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 [mmfs26] [ 156.783565] [] _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 [mmfs26] [ 156.786228] [] _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7FilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 [mmfs26] [ 156.788681] [] ? _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 [mmfs26] [ 156.790448] [] _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVattr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 [mmfs26] [ 156.793032] [] ? 
_Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 [mmfslinux] [ 156.795838] [] ? _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6c0 [mmfs26] [ 156.797753] [] ? __d_alloc+0x122/0x180 [ 156.798763] [] ? d_alloc+0x60/0x70 [ 156.799700] [] lookup_real+0x23/0x60 [ 156.800651] [] __lookup_hash+0x42/0x60 [ 156.801675] [] lookup_slow+0x42/0xa7 [ 156.802634] [] link_path_walk+0x80f/0x8b0 [ 156.803666] [] path_lookupat+0x7a/0x8b0 [ 156.804690] [] ? lru_cache_add+0xe/0x10 [ 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ 156.806766] [] ? getname_flags+0x4f/0x1a0 [ 156.807817] [] filename_lookup+0x2b/0xc0 [ 156.808834] [] user_path_at_empty+0x67/0xc0 [ 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ 156.811017] [] user_path_at+0x11/0x20 [ 156.811983] [] vfs_fstatat+0x63/0xc0 [ 156.812951] [] SYSC_newstat+0x2e/0x60 [ 156.813931] [] ? trace_do_page_fault+0x56/0x150 [ 156.815050] [] SyS_newstat+0xe/0x10 [ 156.816010] [] system_call_fastpath+0x25/0x2a [ 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 [ 156.822192] RIP [] kfree+0x13c/0x140 [ 156.823180] RSP [ 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) -- ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. 
Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' From Paul.Sanchez at deshaw.com Wed Jan 15 22:35:23 2020 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Wed, 15 Jan 2020 22:35:23 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch><8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: This reminds me that there is one more thing which drives the convoluted process I described earlier: Automation. Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base-version and efix RPMS and in that order. IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal YUM that installing the efix requires that the base-version be installed first. (Our particular case is worse than just this though, since we prohibit installing two versions/releases for the same (non-kernel) package name. But that's not the case for everyone.) -Paul From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: Wednesday, January 15, 2020 16:00 To: gpfsug main discussion list Cc: gpfsug-discuss-bounces at spectrumscale.org Subject: Re: [gpfsug-discuss] How to install efix with yum ? This message was sent by an external party. >> I don't see any yum options which match rpm's '--force' option. Actually, you do not need to use the --force option since efix RPMs have an incremental efix number in the rpm name. The efix package provides update RPMs to be installed on top of the corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work.
Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Jonathan Buzzard > To: "gpfsug-discuss at spectrumscale.org" > Date: 01/15/2020 02:09 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be single version of any package (it > is trying to eliminate conflicting provides/depends so that all of the > packaging requirements are satisfied). So this alien packaging practice > of installing an efix version of a package over the top of the base > version is not compatible with yum. I would at this juncture note that IBM should be appending the efix number to the RPM so that for example gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 which would firstly make the problem go away, and second would allow one to know which version of GPFS you happen to have installed on a node without doing some sort of voodoo.
> > The real issue for draconian sysadmins like us (whose systems must use > and obey yum) is that there are files (*liblum.so) which are provided by > the non-efix RPMS, but are not owned by the packages according to the > RPM database since they're purposefully installed outside of RPM's > tracking mechanism. > It's worse than that because if you install the RPM directly yum/dnf then start bitching about the RPM database being modified outside of themselves and all sorts of useful information gets lost when you purge the package installation history to make the error go away. > We work around this by repackaging the three affected RPMS to include > the orphaned files from the original RPMs (and eliminating the related > but problematic checks from the RPMs' scripts) so that our efix RPMs > have been "un-efix-ified" and will install as expected when using "yum > upgrade". To my knowledge no one's published a way to do this, so we > all just have to figure this out and run rpmrebuild for ourselves. > IBM should be hanging their heads in shame if the replacement RPM is missing files. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 105 bytes Desc: image001.gif URL: From scale at us.ibm.com Wed Jan 15 23:50:50 2020 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 15 Jan 2020 18:50:50 -0500 Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch><8ee7ad0a895442abb843688936ac4d73@deshaw.com><3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: When requesting an efix, you can inform the service personnel that you need efix RPMs which don't have dependencies on the base-version. Our service team should be able to provide the appropriate efix RPMs that meet your needs. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. gpfsug-discuss-bounces at spectrumscale.org wrote on 01/15/2020 05:35:23 PM: > From: "Sanchez, Paul" > To: gpfsug main discussion list > Cc: "gpfsug-discuss-bounces at spectrumscale.org" > Date: 01/15/2020 05:34 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > This reminds me that there is one more thing which drives the > convoluted process I described earlier: > > Automation. Deployment solutions which use yum to build new hosts > are often the place where one notices the problem. They would need > to determine that they should install both the base-version and efix RPMS and > in that order.
IIRC, there were no RPM dependencies connecting the > efix RPMs to their base-version equivalents, so there was nothing to > signal YUM that installing the efix requires that the base-version > be installed first. > > (Our particular case is worse than just this though, since we > prohibit installing two versions/releases for the same (non-kernel) > package name. But that?s not the case for everyone.) > > -Paul > > From: gpfsug-discuss-bounces at spectrumscale.org bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale > Sent: Wednesday, January 15, 2020 16:00 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > This message was sent by an external party. > > >> I don't see any yum options which match rpm's '--force' option. > Actually, you do not need to use --force option since efix RPMs have > incremental efix number in rpm name. > > Efix package provides update RPMs to be installed on top of > corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5. > 0.4.1 is already installed on your system, "yum update" should work. > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum > Scale (GPFS), then please post it to the public IBM developerWroks Forum at > https://www.ibm.com/developerworks/community/forums/html/forum? > id=11111111-0000-0000-0000-000000000479. > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please > contact 1-800-237-5511 in the United States or your local IBM > Service Center in other countries. > > The forum is informally monitored as time permits and should not be > used for priority messages to the Spectrum Scale (GPFS) team. 
> > [image removed] Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/ > 01/2020 18:30, Sanchez, Paul wrote: > Yum generally only wants there > to be single version of a > > From: Jonathan Buzzard > To: "gpfsug-discuss at spectrumscale.org" > Date: 01/15/2020 02:09 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > Yum generally only wants there to be single version of any package (it > > is trying to eliminate conflicting provides/depends so that all of the > > packaging requirements are satisfied). So this alien packaging practice > > of installing an efix version of a package over the top of the base > > version is not compatible with yum. > > I would at this juncture note that IBM should be appending the efix > number to the RPM so that for example > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > which would firstly make the problem go away, and second would allow one > to know which version of GPFS you happen to have installed on a node > without doing some sort of voodoo. > > > > > The real issue for draconian sysadmins like us (whose systems must use > > and obey yum) is that there are files (*liblum.so) which are provided by > > the non-efix RPMS, but are not owned by the packages according to the > > RPM database since they?re purposefully installed outside of RPM?s > > tracking mechanism. > > > > It worse than that because if you install the RPM directly yum/dnf then > start bitching about the RPM database being modified outside of > themselves and all sorts of useful information gets lost when you purge > the package installation history to make the error go away. > > > We work around this by repackaging the three affected RPMS to include > > the orphaned files from the original RPMs (and eliminating the related > > but problematic checks from the RPMs? 
scripts) so that our efix RPMs > > have been ?un-efix-ified? and will install as expected when using ?yum > > upgrade?. To my knowledge no one?s published a way to do this, so we > > all just have to figure this out and run rpmrebuild for ourselves. > > > > IBM should be hanging their heads in shame if the replacement RPM is > missing files. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url? > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=aSasL0r- > NxIT9nDrkoQO6rcyV88VUM_oc6mYssN-_Ng&s=4- > wB8cR24x2P7Rpn_14fIXuwxCvvqwne7xcIp85dZoI&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From frankli at us.ibm.com Thu Jan 16 04:47:43 2020 From: frankli at us.ibm.com (Frank N Lee) Date: Wed, 15 Jan 2020 22:47:43 -0600 Subject: [gpfsug-discuss] #spectrum-discover Slack channel In-Reply-To: References: Message-ID: Simon, Thanks for launching this Slack channel! Copying some of my colleagues who works with Discover. 
Frank Frank Lee, PhD IBM Systems Group 314-482-5329 | @drfranknlee From: Simon Thompson To: "gpfsug-discuss at spectrumscale.org" Date: 01/14/2020 02:25 PM Subject: [EXTERNAL] [gpfsug-discuss] #spectrum-discover Slack channel Sent by: gpfsug-discuss-bounces at spectrumscale.org We've today added a new Slack channel to the SSUG/PowerAI UG Slack community, "#spectrum-discover". Whilst we know that a lot of the people using Spectrum Discover are Spectrum Scale users, we welcome all discussion of Discover on the Slack channel, not just those using Spectrum Scale. As with the #spectrum-scale and #powerai channels, IBM are working to ensure there are appropriate people on the channel to help with discussion/queries. If you are not already a member of the Slack community, please visit www.spectrumscaleug.org/join for details. Thanks Simon (UK/chair)_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HIs14G9Qcs5MqpsAFL5E0TH5hqFD-KbquYdQ_mTmTnI&m=ncahEw2s7R3QIxk7C4IZw2JyOd4_8dFtsAueY6L6dF8&s=fTL2YhTgik5-QpcxEHpoJLO5A9FfOF2ZyNK09_Zxfbc&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From phgrau at zedat.fu-berlin.de Thu Jan 16 10:51:43 2020 From: phgrau at zedat.fu-berlin.de (Philipp Grau) Date: Thu, 16 Jan 2020 11:51:43 +0100 Subject: [gpfsug-discuss] Welcome to the "gpfsug-discuss" mailing list In-Reply-To: References: Message-ID: <20200116105143.GA278757@CIS.FU-Berlin.DE> Hello, as requested: * gpfsug-discuss-request at spectrumscale.org [15.01.20 13:40]: > Please introduce yourself to the members with your first post. I'm Philipp from Berlin, Germany.
The IT department of the "Freie Universität Berlin" is my workplace. We have a DDN system with some PB of storage, and GPFS nodes for exporting the space. The use case is "scientific storage": research data and the like (no home or group shares). Regards, Philipp From knop at us.ibm.com Thu Jan 16 13:41:58 2020 From: knop at us.ibm.com (Felipe Knop) Date: Thu, 16 Jan 2020 13:41:58 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From skylar2 at uw.edu Thu Jan 16 15:32:27 2020 From: skylar2 at uw.edu (Skylar Thompson) Date: Thu, 16 Jan 2020 15:32:27 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Another problem we've run into with automating GPFS installs/upgrades is that the gplbin (kernel module) packages have a post-install script that will unmount the filesystem *even if the package isn't for the running kernel*. We needed to write some custom reporting in our configuration management system to only install gplbin if GPFS was already stopped on the node. On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: > This reminds me that there is one more thing which drives the convoluted process I described earlier: > > Automation. Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base-version and efix RPMS and in that order.
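The kind of check described above (only touching gplbin when it matches the running kernel) can be sketched as follows; the function name is made up, and the kernel strings are just examples, since the gpfs.gplbin package name encodes the kernel release it was built for:

```shell
# Sketch of a config-management guard: decide whether a gpfs.gplbin package
# built for a given kernel release should be installed on a host, based on
# the host's running kernel (normally obtained via uname -r).
gplbin_matches_kernel() {
    pkg_kernel=$1       # kernel release the gpfs.gplbin package targets
    running_kernel=$2   # the host's running kernel release
    [ "$pkg_kernel" = "$running_kernel" ]
}

running="3.10.0-1062.9.1.el7.x86_64"   # example; use "$(uname -r)" in practice
for pkg_kernel in "3.10.0-1062.9.1.el7.x86_64" "3.10.0-957.el7.x86_64"; do
    if gplbin_matches_kernel "$pkg_kernel" "$running"; then
        echo "install gpfs.gplbin-$pkg_kernel"
    else
        echo "skip gpfs.gplbin-$pkg_kernel"
    fi
done
```

A real deployment would additionally check that GPFS is stopped before letting the package scripts run, as noted above.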
IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal YUM that installing the efix requires that the base-version be installed first. > > (Our particular case is worse than just this though, since we prohibit installing two versions/releases for the same (non-kernel) package name. But that???s not the case for everyone.) > > -Paul > > From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale > Sent: Wednesday, January 15, 2020 16:00 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > > This message was sent by an external party. > > > >> I don't see any yum options which match rpm's '--force' option. > Actually, you do not need to use --force option since efix RPMs have incremental efix number in rpm name. > > Efix package provides update RPMs to be installed on top of corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. 
> > [Inactive hide details for Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generall]Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be single version of a > > From: Jonathan Buzzard > > To: "gpfsug-discuss at spectrumscale.org" > > Date: 01/15/2020 02:09 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ________________________________ > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > Yum generally only wants there to be single version of any package (it > > is trying to eliminate conflicting provides/depends so that all of the > > packaging requirements are satisfied). So this alien packaging practice > > of installing an efix version of a package over the top of the base > > version is not compatible with yum. > > I would at this juncture note that IBM should be appending the efix > number to the RPM so that for example > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > which would firstly make the problem go away, and second would allow one > to know which version of GPFS you happen to have installed on a node > without doing some sort of voodoo. > > > > > The real issue for draconian sysadmins like us (whose systems must use > > and obey yum) is that there are files (*liblum.so) which are provided by > > the non-efix RPMS, but are not owned by the packages according to the > > RPM database since they???re purposefully installed outside of RPM???s > > tracking mechanism. > > > > It worse than that because if you install the RPM directly yum/dnf then > start bitching about the RPM database being modified outside of > themselves and all sorts of useful information gets lost when you purge > the package installation history to make the error go away. 
> > > We work around this by repackaging the three affected RPMS to include > > the orphaned files from the original RPMs (and eliminating the related > > but problematic checks from the RPMs' scripts) so that our efix RPMs > > have been "un-efix-ified" and will install as expected when using "yum > > upgrade". To my knowledge no one's published a way to do this, so we > > all just have to figure this out and run rpmrebuild for ourselves. > > > > IBM should be hanging their heads in shame if the replacement RPM is > missing files. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From bbanister at jumptrading.com Thu Jan 16 17:12:04 2020 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 16 Jan 2020 17:12:04 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Message-ID: We actually add an ExecStartPre directive override (e.g.
/etc/systemd/system/gpfs.service.d/gpfs.service.conf) to the gpfs.service [Service] section that points to a simple script that does a check of the GPFS RPMs installed on the system and updates them to what our config management specifies should be installed (a simple txt file in /etc/sysconfig namespace), which ensures that GPFS RPMs are updated before GPFS is started, while GPFS is still down. Works very well for us. The script also does some other checks and updates too, such as adding the node into the right GPFS cluster if needed. Hope that helps, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Skylar Thompson Sent: Thursday, January 16, 2020 9:32 AM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] How to install efix with yum ? [EXTERNAL EMAIL] Another problem we've run into with automating GPFS installs/upgrades is that the gplbin (kernel module) packages have a post-install script that will unmount the filesystem *even if the package isn't for the running kernel*. We needed to write some custom reporting in our configuration management system to only install gplbin if GPFS was already stopped on the node. On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: > This reminds me that there is one more thing which drives the convoluted process I described earlier... > > Automation. Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base-version and efix RPMS and in that order. IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal YUM that installing the efix requires that the base-version be installed first. > > (Our particular case is worse than just this though, since we prohibit > installing two versions/releases for the same (non-kernel) package > name.
But that's not the case for everyone.) > > -Paul > > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum > Scale > Sent: Wednesday, January 15, 2020 16:00 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > > This message was sent by an external party. > > > >> I don't see any yum options which match rpm's '--force' option. > Actually, you do not need to use the --force option since efix RPMs have an incremental efix number in the rpm name. > > The efix package provides update RPMs to be installed on top of the corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. > > Regards, The Spectrum Scale (GPFS) team > > ---------------------------------------------------------------------- > -------------------------------------------- > If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://urldefense.com/v3/__https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479__;!!GSt_xZU7050wKg!_4v0rLABhTuazN2Q8qJhS71K6k5UYQXKY1twvbP4TBSvTjEZ8ejU_A8Ys5RT4kUZwbFD$ . > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.
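Bryan's ExecStartPre override, described at the top of this message, can be sketched roughly as below. The drop-in path matches the one he names, but the script name, the version-list location, and the script body are illustrative assumptions; everything is written under $PREFIX so the sketch can be exercised without root (a real node would use an empty prefix and run `systemctl daemon-reload` afterwards):

```shell
# Write the files under a prefix so this can run without root;
# on a real node PREFIX would be empty.
PREFIX="${PREFIX:-/tmp/gpfs-dropin-demo}"
mkdir -p "$PREFIX/etc/systemd/system/gpfs.service.d" "$PREFIX/usr/local/sbin"

# systemd drop-in: run a check script before mmfsd is started,
# i.e. while GPFS is still down
cat > "$PREFIX/etc/systemd/system/gpfs.service.d/gpfs.service.conf" <<'EOF'
[Service]
ExecStartPre=/usr/local/sbin/gpfs-rpm-sync
EOF

# Hypothetical check script: compare installed gpfs* RPMs against the
# list config management dropped in /etc/sysconfig, update on mismatch.
cat > "$PREFIX/usr/local/sbin/gpfs-rpm-sync" <<'EOF'
#!/bin/bash
want=/etc/sysconfig/gpfs-rpm-versions   # one package NEVRA per line
have=$(mktemp)
rpm -qa 'gpfs*' | sort > "$have"
if ! diff -q "$want" "$have" >/dev/null 2>&1; then
    yum -y install $(sort "$want")
fi
rm -f "$have"
EOF
chmod +x "$PREFIX/usr/local/sbin/gpfs-rpm-sync"
```

The key property is ordering: systemd guarantees the ExecStartPre script finishes before the service's main start command runs, so the RPM sync always happens while mmfsd is down.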
-- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!GSt_xZU7050wKg!_4v0rLABhTuazN2Q8qJhS71K6k5UYQXKY1twvbP4TBSvTjEZ8ejU_A8Ys5RT4p8oUpuH$ From novosirj at rutgers.edu Thu Jan 16 21:31:57 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Thu, 16 Jan 2020 21:31:57 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale
Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Felipe, I either misunderstood support or convinced them to take further action. It at first looked like they were suggesting "mmbuildgpl fixed it: case closed" (I know they wanted to close the SalesForce case anyway, which would prevent communication on the issue). At this point, they've asked for a bunch more information. Support is asking similar questions re: the speculations, and I'll provide them with the relevant output ASAP, but I did confirm all of that, including that there were no stray mmfs26/tracedev kernel modules anywhere else in the relevant /lib/modules PATHs. In the original case, I built on a machine running 3.10.0-957.27.2, but pointed to the 3.10.0-1062.9.1 source code/defined the relevant portions of usr/lpp/mmfs/src/config/env.mcr. That's always worked before, and rebuilding once the build system was running 3.10.0-1062.9.1 as well did not change anything either. In all cases, the GPFS version was Spectrum Scale Data Access Edition 5.0.4-1. If you build against either the wrong kernel version or the wrong GPFS version, both will appear right in the filename of the gpfs.gplbin RPM you build. Mine is called: gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm Anyway, thanks for your response; I know you might not be following/working on this directly, but I figured the extra info might be of interest. On 1/16/20 8:41 AM, Felipe Knop wrote: > Hi Ryan, > > I'm aware of this ticket, and I understand that there has been > active communication with the service team on this problem. 
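As a quick illustration of that last point — both the target kernel and the GPFS level are recoverable from the gplbin package filename — a small sketch using the filename given above (the parsing assumes a 5.x GPFS version string follows the kernel string):

```shell
rpmname="gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm"

base="${rpmname#gpfs.gplbin-}"     # strip the package-name prefix
kernel="${base%%-5.*}"             # everything before the GPFS 5.x version
gpfsver="${base#"$kernel"-}"       # 5.0.4-1.x86_64.rpm
gpfsver="${gpfsver%.x86_64.rpm}"   # 5.0.4-1

echo "kernel: $kernel"
echo "gpfs:   $gpfsver"
# On a node, compare against the running kernel: [ "$kernel" = "$(uname -r)" ]
```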
> > The crash itself, as you indicate, looks like a problem that has > been fixed: > > https://www.ibm.com/support/pages/ibm-spectrum-scale-gpfs-releases-42313-or-later-and-5022-or-later-have-issues-where-kernel-crashes-rhel76-0 > > The fact that the problem goes away when *mmbuildgpl* is issued > appears to point to some incompatibility with kernel levels and/or > Scale version levels. Just speculating, some possible areas may > be: > > > * The RPM might have been built on a version of Scale without the > fix * The RPM might have been built on a different (minor) version > of the kernel * Somehow the VM picked a "leftover" GPFS kernel > module, as opposed to the one included in gpfs.gplbin -- given > that mmfsd never complained about a missing GPL kernel module > > > Felipe > > ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM > Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 > (845) 433-9314 T/L 293-9314 > > > > > ----- Original message ----- From: Ryan Novosielski > Sent by: > gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion > list Cc: Subject: [EXTERNAL] > [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum > Scale Data Access Edition installed via gpfs.gplbin RPM on KVM > guests Date: Wed, Jan 15, 2020 4:11 PM > > Hi there, > > I know some of the Spectrum Scale developers look at this list. > I'm having a little trouble with support on this problem. > > We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM > guests with a portability layer that has been installed via > gpfs.gplbin RPMs that we built at our site and have used to > install GPFS all over our environment. We've not seen this problem > so far on any physical hosts, but have now experienced it on guests > running on a number of our KVM hypervisors, across vendors and > firmware versions, etc.
At one time I thought it was all happening > on systems using Mellanox virtual functions for Infiniband, but > we've now seen it on VMs without VFs. There may be an SELinux > interaction, but some of our hosts have it disabled outright, some > are Permissive, and some were working successfully with 5.0.2.x > GPFS. > > What I've been instructed to try to solve this problem has been to > run "mmbuildgpl", and it has solved the problem. I don't consider > running "mmbuildgpl" a real solution, however. If RPMs are a > supported means of installation, it should work. Support told me > that they'd seen this solve the problem at another site as well. > > Does anyone have any more information about this problem/whether > there's a fix in the pipeline, or something that can be done to > cause this problem that we could remedy? Is there an easy place to > see a list of eFixes to see if this has come up? I know it's very > similar to a problem that happened I believe it was after 5.0.2.2 > and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x. > > Below is a sample of the crash output: > > [ 156.733477] kernel BUG at mm/slub.c:3772!
[ 156.734212] invalid > opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat > ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) > mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) > iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) > mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 > ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 > ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat > iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 > xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter > iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul > ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper > ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 > virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c > mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic > pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul > crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core > devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy > virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ > 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE > ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] > Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 [ > 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: > ffff8ae9f7a24000 [ 156.759326] RIP: 0010:[] > [] kfree+0x13c/0x140 [ 156.760749] RSP: > 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: > 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ > 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: > ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: > 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: > 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ > 156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: > ffff8ae9f7a27938 [ 
156.768229] FS: 00007f8ffff05700(0000) > GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: > 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ > 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: > 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ > 156.775154] [] > cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] > [] > _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvP P10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 > > [mmfs26] > [ 156.779378] [] > _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_P cjjjP10ext_cred_t+0x46a/0x7e0 > > [mmfs26] > [ 156.781689] [] ? > _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 > > [mmfs26] > [ 156.783565] [] > _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNod e_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 > > [mmfs26] > [ 156.786228] [] > _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7F ilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 > > [mmfs26] > [ 156.788681] [] ? > _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 > [mmfs26] [ 156.790448] [] > _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVa ttr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 > > [mmfs26] > [ 156.793032] [] ? > _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ > 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 > [mmfslinux] [ 156.795838] [] ? > _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6 c0 > > [mmfs26] > [ 156.797753] [] ? __d_alloc+0x122/0x180 [ > 156.798763] [] ? 
d_alloc+0x60/0x70 [ > 156.799700] [] lookup_real+0x23/0x60 [ > 156.800651] [] __lookup_hash+0x42/0x60 [ > 156.801675] [] lookup_slow+0x42/0xa7 [ > 156.802634] [] link_path_walk+0x80f/0x8b0 [ > 156.803666] [] path_lookupat+0x7a/0x8b0 [ > 156.804690] [] ? lru_cache_add+0xe/0x10 [ > 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ > 156.806766] [] ? getname_flags+0x4f/0x1a0 [ > 156.807817] [] filename_lookup+0x2b/0xc0 [ > 156.808834] [] user_path_at_empty+0x67/0xc0 [ > 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ > 156.811017] [] user_path_at+0x11/0x20 [ > 156.811983] [] vfs_fstatat+0x63/0xc0 [ > 156.812951] [] SYSC_newstat+0x2e/0x60 [ > 156.813931] [] ? trace_do_page_fault+0x56/0x150 > [ 156.815050] [] SyS_newstat+0xe/0x10 [ > 156.816010] [] system_call_fastpath+0x25/0x2a [ > 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 > df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 > e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 > 41 55 41 54 [ 156.822192] RIP [] > kfree+0x13c/0x140 [ 156.823180] RSP [ > 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] > Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel > Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: > 0xffffffff80000000-0xffffffffbfffffff) > > -- ____ || \\UTGERS, > |---------------------------*O*--------------------------- ||_// > the State | Ryan Novosielski - novosirj at rutgers.edu || \\ > University | Sr. 
Technologist - 973/972.0922 (2x0922) ~*~ RBHS > Campus || \\ of NJ | Office of Advanced Research Computing - > MSB C630, Newark `' > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > - -- ____ || \\UTGERS, |----------------------*O*------------------------ ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus || \\ of NJ | Office of Advanced Res. Comp. - MSB C630, Newark `' -----BEGIN PGP SIGNATURE----- iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXiDWSgAKCRCZv6Bp0Ryx vpCsAKCQ2ykmeycbOVrHTGaFqb2SsU26NwCg3YyYi4Jy2d+xZjJkE6Vfht8O8gM= =9rKb -----END PGP SIGNATURE----- From scale at us.ibm.com Thu Jan 16 22:59:14 2020 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Thu, 16 Jan 2020 17:59:14 -0500 Subject: [gpfsug-discuss] How to install efix with yum ? 
In-Reply-To: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch><8ee7ad0a895442abb843688936ac4d73@deshaw.com><3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Message-ID: On Spectrum Scale 4.2.3.15 or later and 5.0.2.2 or later, you can install gplbin without stopping GPFS by using the following steps: Build gpfs.gplbin using mmbuildgpl --build-package Set environment variable MM_INSTALL_ONLY to 1 before installing the gpfs.gplbin package with rpm -i gpfs.gplbin*.rpm Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. gpfsug-discuss-bounces at spectrumscale.org wrote on 01/16/2020 10:32:27 AM: > From: Skylar Thompson > To: gpfsug-discuss at spectrumscale.org > Date: 01/16/2020 10:35 AM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > Another problem we've run into with automating GPFS installs/upgrades is > that the gplbin (kernel module) packages have a post-install script that > will unmount the filesystem *even if the package isn't for the running > kernel*.
We needed to write some custom reporting in our configuration > management system to only install gplbin if GPFS was already stopped on the > node. From novosirj at rutgers.edu Fri Jan 17 02:20:29 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Fri, 17 Jan 2020 02:20:29 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Message-ID: <47921BA1-A20B-4B55-876D-A26C082496BE@rutgers.edu> Thank you for the reminder. I've received that nasty surprise myself, but just long ago enough to have forgotten it. Would love to see that fixed.
> On Jan 16, 2020, at 10:32 AM, Skylar Thompson wrote: > > Another problem we've run into with automating GPFS installs/upgrades is > that the gplbin (kernel module) packages have a post-install script that > will unmount the filesystem *even if the package isn't for the running > kernel*. We needed to write some custom reporting in our configuration > management system to only install gplbin if GPFS was already stopped on the > node. > > On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: >> This reminds me that there is one more thing which drives the convoluted process I described earlier??? >> >> Automation. Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base-version and efix RPMS and in that order. IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal YUM that installing the efix requires that the base-version be installed first. >> >> (Our particular case is worse than just this though, since we prohibit installing two versions/releases for the same (non-kernel) package name. But that???s not the case for everyone.) >> >> -Paul >> >> From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale >> Sent: Wednesday, January 15, 2020 16:00 >> To: gpfsug main discussion list >> Cc: gpfsug-discuss-bounces at spectrumscale.org >> Subject: Re: [gpfsug-discuss] How to install efix with yum ? >> >> >> This message was sent by an external party. >> >> >>>> I don't see any yum options which match rpm's '--force' option. >> Actually, you do not need to use --force option since efix RPMs have incremental efix number in rpm name. >> >> Efix package provides update RPMs to be installed on top of corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. 
>> >> Regards, The Spectrum Scale (GPFS) team >> >> ------------------------------------------------------------------------------------------------------------------ >> If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. >> >> If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. >> >> The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. >> >> [Inactive hide details for Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generall]Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be single version of a >> >> From: Jonathan Buzzard > >> To: "gpfsug-discuss at spectrumscale.org" > >> Date: 01/15/2020 02:09 PM >> Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> ________________________________ >> >> >> >> On 15/01/2020 18:30, Sanchez, Paul wrote: >>> Yum generally only wants there to be single version of any package (it >>> is trying to eliminate conflicting provides/depends so that all of the >>> packaging requirements are satisfied). So this alien packaging practice >>> of installing an efix version of a package over the top of the base >>> version is not compatible with yum. 
>> >> I would at this juncture note that IBM should be appending the efix >> number to the RPM so that for example >> >> gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 >> >> which would firstly make the problem go away, and second would allow one >> to know which version of GPFS you happen to have installed on a node >> without doing some sort of voodoo. >> >>> >>> The real issue for draconian sysadmins like us (whose systems must use >>> and obey yum) is that there are files (*liblum.so) which are provided by >>> the non-efix RPMS, but are not owned by the packages according to the >>> RPM database since they're purposefully installed outside of RPM's >>> tracking mechanism. >>> >> >> It's worse than that, because if you install the RPM directly, yum/dnf then >> start bitching about the RPM database being modified outside of >> themselves, and all sorts of useful information gets lost when you purge >> the package installation history to make the error go away. >> >>> We work around this by repackaging the three affected RPMS to include >>> the orphaned files from the original RPMs (and eliminating the related >>> but problematic checks from the RPMs' scripts) so that our efix RPMs >>> have been "un-efix-ified" and will install as expected when using "yum >>> upgrade". To my knowledge no one's published a way to do this, so we >>> all just have to figure this out and run rpmrebuild for ourselves. >>> >> >> IBM should be hanging their heads in shame if the replacement RPM is >> missing files. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Tel: +44141-5483420 >> HPC System Administrator, ARCHIE-WeSt. >> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG -- ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr.
Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' From knop at us.ibm.com Fri Jan 17 15:35:19 2020 From: knop at us.ibm.com (Felipe Knop) Date: Fri, 17 Jan 2020 15:35:19 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From skylar2 at uw.edu Fri Jan 17 15:42:57 2020 From: skylar2 at uw.edu (Skylar Thompson) Date: Fri, 17 Jan 2020 15:42:57 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Message-ID: <20200117154257.45ioc4ugw7dvuwym@utumno.gs.washington.edu> Thanks for the pointer! We're in the process of upgrading from 4.2.3-6 to 4.2.3-19 so I'll make a note that we should start setting that environment variable when we build gplbin. On Thu, Jan 16, 2020 at 05:59:14PM -0500, IBM Spectrum Scale wrote: > On Spectrum Scale 4.2.3.15 or later and 5.0.2.2 or later, you can install > gplbin without stopping GPFS by using the following steps: > > Build gpfs.gplbin using mmbuildgpl --build-package > Set the environment variable MM_INSTALL_ONLY to 1 before installing the gpfs.gplbin > package with rpm -i gpfs.gplbin*.rpm > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale > (GPFS), then please post it to the public IBM developerWorks Forum at > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > .
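[Editor's note: the two steps IBM describes above can be sketched as a small script. The `run`/`DRY_RUN` wrapper is my own illustrative device (print the commands instead of executing them), and the RPM filename is an assumption; only the `mmbuildgpl --build-package` and `MM_INSTALL_ONLY=1 rpm -i` steps come from the note above.]

```shell
# Sketch of the stop-free gplbin install sequence described above
# (Scale 4.2.3.15+ / 5.0.2.2+). With DRY_RUN=1 the commands are only
# printed, which is handy for review; the wrapper is illustrative.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
install_gplbin() {
    run /usr/lpp/mmfs/bin/mmbuildgpl --build-package
    run env MM_INSTALL_ONLY=1 rpm -i "$1"
}
DRY_RUN=1 install_gplbin "gpfs.gplbin-$(uname -r)-5.0.4-1.$(uname -m).rpm"
```

Drop `DRY_RUN=1` to actually run it on a node where the built RPM path is known.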
> > If your query concerns a potential software error in Spectrum Scale (GPFS) > and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in > other countries. > > The forum is informally monitored as time permits and should not be used > for priority messages to the Spectrum Scale (GPFS) team. > > gpfsug-discuss-bounces at spectrumscale.org wrote on 01/16/2020 10:32:27 AM: > > > From: Skylar Thompson > > To: gpfsug-discuss at spectrumscale.org > > Date: 01/16/2020 10:35 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Another problem we've run into with automating GPFS installs/upgrades is > > that the gplbin (kernel module) packages have a post-install script that > > will unmount the filesystem *even if the package isn't for the running > > kernel*. We needed to write some custom reporting in our configuration > > management system to only install gplbin if GPFS was already stopped on > the > > node. > > > > On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: > > > This reminds me that there is one more thing which drives the > > convoluted process I described earlier??? > > > > > > Automation. Deployment solutions which use yum to build new hosts > > are often the place where one notices the problem. They would need > > to determine that they should install both the base-version and efix > > RPMS and in that order. IIRC, there were no RPM dependencies > > connecting the efix RPMs to their base-version equivalents, so > > there was nothing to signal YUM that installing the efix requires > > that the base-version be installed first. > > > > > > (Our particular case is worse than just this though, since we > > prohibit installing two versions/releases for the same (non-kernel) > > package name. But that???s not the case for everyone.) 
> > > > > > -Paul > > > > > > From: gpfsug-discuss-bounces at spectrumscale.org > bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale > > > Sent: Wednesday, January 15, 2020 16:00 > > > To: gpfsug main discussion list > > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > > > > > > > > This message was sent by an external party. > > > > > > > > > >> I don't see any yum options which match rpm's '--force' option. > > > Actually, you do not need to use --force option since efix RPMs > > have incremental efix number in rpm name. > > > > > > Efix package provides update RPMs to be installed on top of > > corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5. > > 0.4.1 is already installed on your system, "yum update" should work. > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > > > > > > ------------------------------------------------------------------------------------------------------------------ > > > If you feel that your question can benefit other users of Spectrum > > Scale (GPFS), then please post it to the public IBM developerWroks Forum > at > > https://www.ibm.com/developerworks/community/forums/html/forum? > > id=11111111-0000-0000-0000-000000000479. > > > > > > If your query concerns a potential software error in Spectrum > > Scale (GPFS) and you have an IBM software maintenance contract > > please contact 1-800-237-5511 in the United States or your local IBM > > Service Center in other countries. > > > > > > The forum is informally monitored as time permits and should not > > be used for priority messages to the Spectrum Scale (GPFS) team. 
> > > > > > [Inactive hide details for Jonathan Buzzard ---01/15/2020 02:09:33 > > PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum > > generall]Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 > > 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be > > single version of a > > > > > > From: Jonathan Buzzard > mailto:jonathan.buzzard at strath.ac.uk>> > > > To: "gpfsug-discuss at spectrumscale.org > discuss at spectrumscale.org>" > mailto:gpfsug-discuss at spectrumscale.org>> > > > Date: 01/15/2020 02:09 PM > > > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum > ? > > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > discuss-bounces at spectrumscale.org> > > > > > > ________________________________ > > > > > > > > > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > > > Yum generally only wants there to be single version of any package > (it > > > > is trying to eliminate conflicting provides/depends so that all of > the > > > > packaging requirements are satisfied). So this alien packaging > practice > > > > of installing an efix version of a package over the top of the base > > > > version is not compatible with yum. > > > > > > I would at this juncture note that IBM should be appending the efix > > > number to the RPM so that for example > > > > > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > > > > > which would firstly make the problem go away, and second would allow > one > > > to know which version of GPFS you happen to have installed on a node > > > without doing some sort of voodoo. > > > > > > > > > > > The real issue for draconian sysadmins like us (whose systems must > use > > > > and obey yum) is that there are files (*liblum.so) which are > provided by > > > > the non-efix RPMS, but are not owned by the packages according to > the > > > > RPM database since they???re purposefully installed outside of > RPM???s > > > > tracking mechanism. 
> > > > > > > > > > It worse than that because if you install the RPM directly yum/dnf > then > > > start bitching about the RPM database being modified outside of > > > themselves and all sorts of useful information gets lost when you > purge > > > the package installation history to make the error go away. > > > > > > > We work around this by repackaging the three affected RPMS to > include > > > > the orphaned files from the original RPMs (and eliminating the > related > > > > but problematic checks from the RPMs??? scripts) so that our efix > RPMs > > > > have been ???un-efix-ified??? and will install as expected when > > using ???yum > > > > upgrade???. To my knowledge no one???s published a way to do this, > so we > > > > all just have to figure this out and run rpmrebuild for ourselves. > > > > > > > > > > IBM should be hanging their heads in shame if the replacement RPM is > > > missing files. > > > > > > JAB. > > > > > > -- > > > Jonathan A. Buzzard Tel: +44141-5483420 > > > HPC System Administrator, ARCHIE-WeSt. > > > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url? > > > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f- > > > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e= > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url? 
> > -- > > -- Skylar Thompson (skylar2 at u.washington.edu) > > -- Genome Sciences Department, System Administrator > > -- Foege Building S046, (206)-685-7354 > > -- University of Washington School of Medicine > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From novosirj at rutgers.edu Fri Jan 17 15:55:54 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Fri, 17 Jan 2020 15:55:54 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: Message-ID: That /is/ interesting. I'm a little confused about how that could be playing out in a case where I'm building on -1062.9.1, building for -1062.9.1, and running on -1062.9.1. Is there something inherent in the RPM building process that hasn't caught up, or am I misunderstanding that change's impact on it?
-- ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Jan 17, 2020, at 10:35, Felipe Knop wrote: ? Hi Ryan, Some interesting IBM-internal communication overnight. The problems seems related to a change made to LINUX_KERNEL_VERSION_VERBOSE to handle the additional digit in the kernel numbering (3.10.0-1000+) . The GPL layer expected LINUX_KERNEL_VERSION_VERBOSE to have that extra digit, and its absence resulted in an incorrect function being compiled in, which led to the crash. This, at least, seems to make sense, in terms of matching to the symptoms of the problem. We are still in internal debates on whether/how update our guidelines for gplbin generation ... Regards, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 (845) 433-9314 T/L 293-9314 ----- Original message ----- From: Ryan Novosielski Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at spectrumscale.org" Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests Date: Thu, Jan 16, 2020 4:33 PM -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Felipe, I either misunderstood support or convinced them to take further action. It at first looked like they were suggesting "mmbuildgpl fixed it: case closed" (I know they wanted to close the SalesForce case anyway, which would prevent communication on the issue). At this point, they've asked for a bunch more information. 
Support is asking similar questions re: the speculations, and I'll provide them with the relevant output ASAP, but I did confirm all of that, including that there were no stray mmfs26/tracedev kernel modules anywhere else in the relevant /lib/modules PATHs. In the original case, I built on a machine running 3.10.0-957.27.2, but pointed to the 3.10.0-1062.9.1 source code/defined the relevant portions of usr/lpp/mmfs/src/config/env.mcr. That's always worked before, and rebuilding once the build system was running 3.10.0-1062.9.1 as well did not change anything either. In all cases, the GPFS version was Spectrum Scale Data Access Edition 5.0.4-1. If you build against either the wrong kernel version or the wrong GPFS version, both will appear right in the filename of the gpfs.gplbin RPM you build. Mine is called: gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm Anyway, thanks for your response; I know you might not be following/working on this directly, but I figured the extra info might be of interest. On 1/16/20 8:41 AM, Felipe Knop wrote: > Hi Ryan, > > I'm aware of this ticket, and I understand that there has been > active communication with the service team on this problem. > > The crash itself, as you indicate, looks like a problem that has > been fixed: > > https://www.ibm.com/support/pages/ibm-spectrum-scale-gpfs-releases-423 13-or-later-and-5022-or-later-have-issues-where-kernel-crashes-rhel76-0 > > The fact that the problem goes away when *mmbuildgpl* is issued > appears to point to some incompatibility with kernel levels and/or > Scale version levels. 
Just speculating, some possible areas may > be: > > > * The RPM might have been built on a version of Scale without the > fix * The RPM might have been built on a different (minor) version > of the kernel * Somehow the VM picked a "leftover" GPFS kernel > module, as opposed to the one included in gpfs.gplbin -- given > that mmfsd never complained about a missing GPL kernel module > > > Felipe > > ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM > Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 > (845) 433-9314 T/L 293-9314 > > > > > ----- Original message ----- From: Ryan Novosielski > Sent by: > gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion > list Cc: Subject: [EXTERNAL] > [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum > Scale Data Access Edition installed via gpfs.gplbin RPM on KVM > guests Date: Wed, Jan 15, 2020 4:11 PM > > Hi there, > > I know some of the Spectrum Scale developers look at this list. > I'm having a little trouble with support on this problem. > > We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM > guests with a portability layer that has been installed via > gpfs.gplbin RPMs that we built at our site and have used to > install GPFS all over our environment. We've not seen this problem > so far on any physical hosts, but have now experienced it on guests > running on a number of our KVM hypervisors, across vendors and > firmware versions, etc. At one time I thought it was all happening > on systems using Mellanox virtual functions for Infiniband, but > we've now seen it on VMs without VFs. There may be an SELinux > interaction, but some of our hosts have it disabled outright, some > are Permissive, and some were working successfully with 5.0.2.x > GPFS. > > What I've been instructed to try to solve this problem has been to > run "mmbuildgpl", and it has solved the problem. I don't consider > running "mmbuildgpl" a real solution, however.
If RPMs are a > supported means of installation, it should work. Support told me > that they'd seen this solve the problem at another site as well. > > Does anyone have any more information about this problem/whether > there's a fix in the pipeline, or something that can be done to > cause this problem that we could remedy? Is there an easy place to > see a list of eFixes to see if this has come up? I know it's very > similar to a problem that happened I believe it was after 5.0.2.2 > and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x. > > Below is a sample of the crash output: > > [ 156.733477] kernel BUG at mm/slub.c:3772! [ 156.734212] invalid > opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat > ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) > mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) > iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) > mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 > ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 > ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat > iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 > xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter > iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul > ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper > ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 > virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c > mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic > pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul > crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core > devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy > virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ > 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE > ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] > Hardware name: Red Hat KVM, BIOS
1.11.0-2.el7 04/01/2014 [ > 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: > ffff8ae9f7a24000 [ 156.759326] RIP: 0010:[] > [] kfree+0x13c/0x140 [ 156.760749] RSP: > 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: > 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ > 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: > ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: > 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: > 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ > 156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: > ffff8ae9f7a27938 [ 156.768229] FS: 00007f8ffff05700(0000) > GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: > 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ > 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: > 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ > 156.775154] [] > cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] > [] > _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvP P10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 > > [mmfs26] > [ 156.779378] [] > _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_P cjjjP10ext_cred_t+0x46a/0x7e0 > > [mmfs26] > [ 156.781689] [] ? > _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 > > [mmfs26] > [ 156.783565] [] > _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNod e_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 > > [mmfs26] > [ 156.786228] [] > _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7F ilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 > > [mmfs26] > [ 156.788681] [] ? 
> _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 > [mmfs26] [ 156.790448] [] > _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVa ttr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 > > [mmfs26] > [ 156.793032] [] ? > _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ > 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 > [mmfslinux] [ 156.795838] [] ? > _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6 c0 > > [mmfs26] > [ 156.797753] [] ? __d_alloc+0x122/0x180 [ > 156.798763] [] ? d_alloc+0x60/0x70 [ > 156.799700] [] lookup_real+0x23/0x60 [ > 156.800651] [] __lookup_hash+0x42/0x60 [ > 156.801675] [] lookup_slow+0x42/0xa7 [ > 156.802634] [] link_path_walk+0x80f/0x8b0 [ > 156.803666] [] path_lookupat+0x7a/0x8b0 [ > 156.804690] [] ? lru_cache_add+0xe/0x10 [ > 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ > 156.806766] [] ? getname_flags+0x4f/0x1a0 [ > 156.807817] [] filename_lookup+0x2b/0xc0 [ > 156.808834] [] user_path_at_empty+0x67/0xc0 [ > 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ > 156.811017] [] user_path_at+0x11/0x20 [ > 156.811983] [] vfs_fstatat+0x63/0xc0 [ > 156.812951] [] SYSC_newstat+0x2e/0x60 [ > 156.813931] [] ? 
trace_do_page_fault+0x56/0x150 > [ 156.815050] [] SyS_newstat+0xe/0x10 [ > 156.816010] [] system_call_fastpath+0x25/0x2a [ > 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 > df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 > e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 > 41 55 41 54 [ 156.822192] RIP [] > kfree+0x13c/0x140 [ 156.823180] RSP [ > 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] > Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel > Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: > 0xffffffff80000000-0xffffffffbfffffff) > > -- ____ || \\UTGERS, > |---------------------------*O*--------------------------- ||_// > the State | Ryan Novosielski - novosirj at rutgers.edu || \\ > University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS > Campus || \\ of NJ | Office of Advanced Research Computing - > MSB C630, Newark `' > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > - -- ____ || \\UTGERS, |----------------------*O*------------------------ ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus || \\ of NJ | Office of Advanced Res. Comp. 
- MSB C630, Newark `' -----BEGIN PGP SIGNATURE----- iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXiDWSgAKCRCZv6Bp0Ryx vpCsAKCQ2ykmeycbOVrHTGaFqb2SsU26NwCg3YyYi4Jy2d+xZjJkE6Vfht8O8gM= =9rKb -----END PGP SIGNATURE----- _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From knop at us.ibm.com Fri Jan 17 16:36:01 2020 From: knop at us.ibm.com (Felipe Knop) Date: Fri, 17 Jan 2020 16:36:01 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: , , , Message-ID: An HTML attachment was scrubbed... URL: From novosirj at rutgers.edu Fri Jan 17 16:58:58 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Fri, 17 Jan 2020 16:58:58 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: Message-ID: <45A9C66F-57E2-4EDA-B5AE-5775980DF88C@rutgers.edu> Yeah, support got back to me with a similar response earlier today that I'd not seen yet that made it a lot clearer what I "did wrong". This would appear to be the cause in my case: [root at master config]# diff env.mcr env.mcr-1062.9.1 4,5c4,5 < #define LINUX_KERNEL_VERSION 31000999 < #define LINUX_KERNEL_VERSION_VERBOSE 310001062009001 --- > #define LINUX_KERNEL_VERSION 31001062 > #define LINUX_KERNEL_VERSION_VERBOSE 31001062009001 ...the former having been generated by "make Autoconfig" and the latter generated by my brain. I'm surprised at the first line;
I'd have caught myself that something different might have been needed if 3.10.0-1062 didn't already fit in the number of digits. Anyway, I explained to support that the reason I do this is that I maintain a couple of copies of env.mcr because occasionally there will be reasons to need gpfs.gplbin for a few different kernel versions (other software that doesn't want to be upgraded, etc.). I see I originally got this practice from the README (or possibly our original installer consultants). Basically what's missing here, so far as I can see, is a way to use mmbuildgpl/make Autoconfig but specify a target kernel version (and I guess an update to the docs or at least /usr/lpp/mmfs/src/README) that doesn't suggest manually editing. Is there a way to at least find out what "make Autoconfig" would use for a target LINUX_KERNEL_VERSION_VERBOSE? From what I can see of makefile and config/configure, there's no option for specifying anything. -- ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' > On Jan 17, 2020, at 11:36 AM, Felipe Knop wrote: > > Hi Ryan, > > My interpretation of the analysis so far is that the content of LINUX_KERNEL_VERSION_VERBOSE in 'env.mcr' became incorrect. That is, it used to work well in a prior release of Scale, but not with 5.0.4.1. This is because of a code change that added another digit to the version in LINUX_KERNEL_VERSION_VERBOSE to account for the 4-digit "fix level" (3.10.0-1000+). Then, when the GPL layer was built, its sources saw the content of LINUX_KERNEL_VERSION_VERBOSE with the missing extra digit and compiled the 'wrong' pieces in -- in particular the incorrect value of SECURITY_INODE_INIT_SECURITY(). And that led to the crash.
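[Editor's note: the encoding Felipe describes can be sketched from the two values in the env.mcr diff above. The helper below is hypothetical, and the 5/3/3 field widths for fix level and sub-levels are inferred from those values, not taken from IBM documentation.]

```shell
# Sketch: derive LINUX_KERNEL_VERSION_VERBOSE from a "3.10.0-1062.9.1"
# style kernel release string. Field widths (%05d for the fix level,
# %03d for the sub-levels) are an inference from the diff above.
kver_verbose() {
    base=${1%%-*}        # "3.10.0"
    rel=${1#*-}          # "1062.9.1"
    v=${base%%.*}        # major
    ps=${base#*.}
    p=${ps%%.*}          # patchlevel
    s=${ps#*.}           # sublevel
    fix=${rel%%.*}       # 4-digit fix level, e.g. 1062
    ab=${rel#*.}
    a=${ab%%.*}
    b=${ab#*.}
    printf '%d%d%d%05d%03d%03d\n' "$v" "$p" "$s" "$fix" "$a" "$b"
}
kver_verbose 3.10.0-1062.9.1   # -> 310001062009001
```

With a 4-digit (rather than 5-digit) fix-level field the same input yields the wrong value from the diff, 31001062009001, which matches Felipe's "missing extra digit" explanation.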
> > The problem did not happen when mmbuildgpl was used since the correct value of LINUX_KERNEL_VERSION_VERBOSE was then set up. > > Felipe > > ---- > Felipe Knop knop at us.ibm.com > GPFS Development and Security > IBM Systems > IBM Building 008 > 2455 South Rd, Poughkeepsie, NY 12601 > (845) 433-9314 T/L 293-9314 > > > > ----- Original message ----- > From: Ryan Novosielski > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests > Date: Fri, Jan 17, 2020 10:56 AM > > That /is/ interesting. > > I?m a little confused about how that could be playing out in a case where I?m building on -1062.9.1, building for -1062.9.1, and running on -1062.9.1. Is there something inherent in the RPM building process that hasn?t caught up, or am I misunderstanding that change?s impact on it? > > -- > ____ > || \\UTGERS, |---------------------------*O*--------------------------- > ||_// the State | Ryan Novosielski - novosirj at rutgers.edu > || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus > || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark > `' > >> On Jan 17, 2020, at 10:35, Felipe Knop wrote: >> >> ? >> Hi Ryan, >> >> Some interesting IBM-internal communication overnight. The problems seems related to a change made to LINUX_KERNEL_VERSION_VERBOSE to handle the additional digit in the kernel numbering (3.10.0-1000+) . The GPL layer expected LINUX_KERNEL_VERSION_VERBOSE to have that extra digit, and its absence resulted in an incorrect function being compiled in, which led to the crash. >> >> This, at least, seems to make sense, in terms of matching to the symptoms of the problem. >> >> We are still in internal debates on whether/how update our guidelines for gplbin generation ... 
>> >> Regards, >> >> Felipe >> >> ---- >> Felipe Knop knop at us.ibm.com >> GPFS Development and Security >> IBM Systems >> IBM Building 008 >> 2455 South Rd, Poughkeepsie, NY 12601 >> (845) 433-9314 T/L 293-9314 >> >> >> >> ----- Original message ----- >> From: Ryan Novosielski >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> To: "gpfsug-discuss at spectrumscale.org" >> Cc: >> Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests >> Date: Thu, Jan 16, 2020 4:33 PM >> >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Hi Felipe, >> >> I either misunderstood support or convinced them to take further >> action. It at first looked like they were suggesting "mmbuildgpl fixed >> it: case closed" (I know they wanted to close the SalesForce case >> anyway, which would prevent communication on the issue). At this >> point, they've asked for a bunch more information. >> >> Support is asking similar questions re: the speculations, and I'll >> provide them with the relevant output ASAP, but I did confirm all of >> that, including that there were no stray mmfs26/tracedev kernel >> modules anywhere else in the relevant /lib/modules PATHs. In the >> original case, I built on a machine running 3.10.0-957.27.2, but >> pointed to the 3.10.0-1062.9.1 source code/defined the relevant >> portions of usr/lpp/mmfs/src/config/env.mcr. That's always worked >> before, and rebuilding once the build system was running >> 3.10.0-1062.9.1 as well did not change anything either. In all cases, >> the GPFS version was Spectrum Scale Data Access Edition 5.0.4-1. If >> you build against either the wrong kernel version or the wrong GPFS >> version, both will appear right in the filename of the gpfs.gplbin RPM >> you build. 
Mine is called: >> >> gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm >> >> Anyway, thanks for your response; I know you might not be >> following/working on this directly, but I figured the extra info might >> be of interest. >> >> On 1/16/20 8:41 AM, Felipe Knop wrote: >> > Hi Ryan, >> > >> > I'm aware of this ticket, and I understand that there has been >> > active communication with the service team on this problem. >> > >> > The crash itself, as you indicate, looks like a problem that has >> > been fixed: >> > >> > https://www.ibm.com/support/pages/ibm-spectrum-scale-gpfs-releases-423 >> 13-or-later-and-5022-or-later-have-issues-where-kernel-crashes-rhel76-0 >> > >> > The fact that the problem goes away when *mmbuildgpl* is issued >> > appears to point to some incompatibility with kernel levels and/or >> > Scale version levels. Just speculating, some possible areas may >> > be: >> > >> > >> > * The RPM might have been built on a version of Scale without the >> > fix * The RPM might have been built on a different (minor) version >> > of the kernel * Somehow the VM picked a "leftover" GPFS kernel >> > module, as opposed to the one included in gpfs.gplbin -- given >> > that mmfsd never complained about a missing GPL kernel module >> > >> > >> > Felipe >> > >> > ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM >> > Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 >> > (845) 433-9314 T/L 293-9314 >> > >> > >> > >> > >> > ----- Original message ----- From: Ryan Novosielski >> > Sent by: >> > gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion >> > list Cc: Subject: [EXTERNAL] >> > [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum >> > Scale Data Access Edition installed via gpfs.gplbin RPM on KVM >> > guests Date: Wed, Jan 15, 2020 4:11 PM >> > >> > Hi there, >> > >> > I know some of the Spectrum Scale developers look at this list. 
>> > I'm having a little trouble with support on this problem. >> > >> > We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM >> > guests with a portability layer that has been installed via >> > gpfs.gplbin RPMs that we built at our site and have used to >> > install GPFS all over our environment. We've not seen this problem >> > so far on any physical hosts, but have now experienced it on guests >> > running on a number of our KVM hypervisors, across vendors and >> > firmware versions, etc. At one time I thought it was all happening >> > on systems using Mellanox virtual functions for Infiniband, but >> > we've now seen it on VMs without VFs. There may be an SELinux >> > interaction, but some of our hosts have it disabled outright, some >> > are Permissive, and some were working successfully with 5.0.2.x >> > GPFS. >> > >> > What I've been instructed to try to solve this problem has been to >> > run 'mmbuildgpl', and it has solved the problem. I don't consider >> > running "mmbuildgpl" a real solution, however. If RPMs are a >> > supported means of installation, it should work. Support told me >> > that they'd seen this solve the problem at another site as well. >> > >> > Does anyone have any more information about this problem/whether >> > there's a fix in the pipeline, or something we're doing to >> > cause this problem that we could remedy? Is there an easy place to >> > see a list of eFixes to see if this has come up? I know it's very >> > similar to a problem that happened, I believe, after 5.0.2.2 >> > and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x. >> > >> > Below is a sample of the crash output: >> > >> > [ 156.733477] kernel BUG at mm/slub.c:3772!
[ 156.734212] invalid >> > opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat >> > ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) >> > mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) >> > iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) >> > mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 >> > ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 >> > ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat >> > iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 >> > xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter >> > iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul >> > ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper >> > ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 >> > virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c >> > mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic >> > pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul >> > crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core >> > devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy >> > virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ >> > 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE >> > ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] >> > Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 [ >> > 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: >> > ffff8ae9f7a24000 [ 156.759326] RIP: 0010:[] >> > [] kfree+0x13c/0x140 [ 156.760749] RSP: >> > 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: >> > 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ >> > 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: >> > ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: >> > 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: >> > 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ >> > 
156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: >> > ffff8ae9f7a27938 [ 156.768229] FS: 00007f8ffff05700(0000) >> > GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: >> > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: >> > 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ >> > 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> > 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: >> > 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ >> > 156.775154] [] >> > cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] >> > [] >> > _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvP >> P10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 >> > >> > >> [mmfs26] >> > [ 156.779378] [] >> > _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_P >> cjjjP10ext_cred_t+0x46a/0x7e0 >> > >> > >> [mmfs26] >> > [ 156.781689] [] ? >> > _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 >> > >> > >> [mmfs26] >> > [ 156.783565] [] >> > _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNod >> e_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 >> > >> > >> [mmfs26] >> > [ 156.786228] [] >> > _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7F >> ilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 >> > >> > >> [mmfs26] >> > [ 156.788681] [] ? >> > _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 >> > [mmfs26] [ 156.790448] [] >> > _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVa >> ttr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 >> > >> > >> [mmfs26] >> > [ 156.793032] [] ? >> > _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ >> > 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 >> > [mmfslinux] [ 156.795838] [] ? >> > _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6 >> c0 >> > >> > >> [mmfs26] >> > [ 156.797753] [] ? 
__d_alloc+0x122/0x180 [ >> > 156.798763] [] ? d_alloc+0x60/0x70 [ >> > 156.799700] [] lookup_real+0x23/0x60 [ >> > 156.800651] [] __lookup_hash+0x42/0x60 [ >> > 156.801675] [] lookup_slow+0x42/0xa7 [ >> > 156.802634] [] link_path_walk+0x80f/0x8b0 [ >> > 156.803666] [] path_lookupat+0x7a/0x8b0 [ >> > 156.804690] [] ? lru_cache_add+0xe/0x10 [ >> > 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ >> > 156.806766] [] ? getname_flags+0x4f/0x1a0 [ >> > 156.807817] [] filename_lookup+0x2b/0xc0 [ >> > 156.808834] [] user_path_at_empty+0x67/0xc0 [ >> > 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ >> > 156.811017] [] user_path_at+0x11/0x20 [ >> > 156.811983] [] vfs_fstatat+0x63/0xc0 [ >> > 156.812951] [] SYSC_newstat+0x2e/0x60 [ >> > 156.813931] [] ? trace_do_page_fault+0x56/0x150 >> > [ 156.815050] [] SyS_newstat+0xe/0x10 [ >> > 156.816010] [] system_call_fastpath+0x25/0x2a [ >> > 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 >> > df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 >> > e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 >> > 41 55 41 54 [ 156.822192] RIP [] >> > kfree+0x13c/0x140 [ 156.823180] RSP [ >> > 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] >> > Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel >> > Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: >> > 0xffffffff80000000-0xffffffffbfffffff) >> > >> > -- ____ || \\UTGERS, >> > |---------------------------*O*--------------------------- ||_// >> > the State | Ryan Novosielski - novosirj at rutgers.edu || \\ >> > University | Sr. 
Technologist - 973/972.0922 (2x0922) ~*~ RBHS >> > Campus || \\ of NJ | Office of Advanced Research Computing - >> > MSB C630, Newark `' >> > >> > _______________________________________________ gpfsug-discuss >> > mailing list gpfsug-discuss at spectrumscale.org >> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > >> > >> > >> > >> > >> > _______________________________________________ gpfsug-discuss >> > mailing list gpfsug-discuss at spectrumscale.org >> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > >> >> - -- >> ____ >> || \\UTGERS, |----------------------*O*------------------------ >> ||_// the State | Ryan Novosielski - novosirj at rutgers.edu >> || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus >> || \\ of NJ | Office of Advanced Res. Comp. - MSB C630, Newark >> `' >> -----BEGIN PGP SIGNATURE----- >> >> iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXiDWSgAKCRCZv6Bp0Ryx >> vpCsAKCQ2ykmeycbOVrHTGaFqb2SsU26NwCg3YyYi4Jy2d+xZjJkE6Vfht8O8gM= >> =9rKb >> -----END PGP SIGNATURE----- >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From ulmer at ulmer.org Fri Jan 17 17:39:32 2020 From: ulmer at ulmer.org (Stephen Ulmer) Date: Fri, 17 Jan 2020 12:39:32 -0500 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests 
In-Reply-To: <45A9C66F-57E2-4EDA-B5AE-5775980DF88C@rutgers.edu> References: <45A9C66F-57E2-4EDA-B5AE-5775980DF88C@rutgers.edu> Message-ID: Having a sanctioned way to compile targeting a version of the kernel that is installed, but not running, would be helpful in many circumstances. -- Stephen > On Jan 17, 2020, at 11:58 AM, Ryan Novosielski wrote: > > Yeah, support got back to me with a similar response earlier today that I'd not seen yet that made it a lot clearer what I "did wrong". This would appear to be the cause in my case: > > [root at master config]# diff env.mcr env.mcr-1062.9.1 > 4,5c4,5 > < #define LINUX_KERNEL_VERSION 31000999 > < #define LINUX_KERNEL_VERSION_VERBOSE 310001062009001 > --- >> #define LINUX_KERNEL_VERSION 31001062 >> #define LINUX_KERNEL_VERSION_VERBOSE 31001062009001 > > > ...the former having been generated by "make Autoconfig" and the latter generated by my brain. I'm surprised at the first line; I'd have caught myself that something different might have been needed if 3.10.0-1062 didn't already fit in the number of digits. > > Anyway, I explained to support that the reason I do this is that I maintain a couple of copies of env.mcr because occasionally there will be reasons to need gpfs.gplbin for a few different kernel versions (other software that doesn't want to be upgraded, etc.). I see I originally got this practice from the README (or possibly our original installer consultants). > > Basically what's missing here, so far as I can see, is a way to use mmbuildgpl/make Autoconfig but specify a target kernel version (and I guess an update to the docs or at least /usr/lpp/mmfs/src/README) that doesn't suggest manually editing. Is there a way to at least find out what "make Autoconfig" would use for a target LINUX_KERNEL_VERSION_VERBOSE? From what I can see of makefile and config/configure, there's no option for specifying anything.
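The effect of that missing digit can be sketched in a few lines of shell. The field widths below are inferred from the diff in this thread (major, two-digit minor, two-digit patch, four-digit release, then two three-digit fields); they are an assumption, not IBM's documented format, and the gate threshold is hypothetical. The point is only that a 14-digit hand-made value compares as numerically smaller than any 15-digit value, so a version gate compiled against the new scheme silently takes the wrong branch:

```shell
# Sketch of the LINUX_KERNEL_VERSION_VERBOSE encoding discussed above.
# Field widths inferred from the diff (assumption): major, 2-digit minor,
# 2-digit patch, 4-digit release, two 3-digit trailing fields.
encode_verbose() {
    # $1..$6: major minor patch release fix1 fix2
    printf '%d%02d%02d%04d%03d%03d\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

good=$(encode_verbose 3 10 0 1062 9 1)   # what "make Autoconfig" produced
hand=31001062009001                      # the hand-edited 14-digit value
echo "$good"                             # prints 310001062009001

# A hypothetical gate written for the 15-digit scheme rejects the
# hand-made value even though it describes the same kernel:
threshold=310001000000000
[ "$hand" -lt "$threshold" ] && echo "hand-made value fails the gate"
```

This matches the symptom in the thread: "make Autoconfig" (and mmbuildgpl) regenerate the 15-digit value and everything works, while a hand-maintained env.mcr carries the pre-5.0.4.1 width forward.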
> > -- > ____ > || \\UTGERS, |---------------------------*O*--------------------------- > ||_// the State | Ryan Novosielski - novosirj at rutgers.edu > || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus > || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark > `' > >> On Jan 17, 2020, at 11:36 AM, Felipe Knop wrote: >> >> Hi Ryan, >> >> My interpretation of the analysis so far is that the content of LINUX_KERNEL_VERSION_VERBOSE in ' env.mcr' became incorrect. That is, it used to work well in a prior release of Scale, but not with 5.0.4.1 . This is because of a code change that added another digit to the version in LINUX_KERNEL_VERSION_VERBOSE to account for the 4-digit "fix level" (3.10.0-1000+) . Then, when the GPL layer was built, its sources saw the content of LINUX_KERNEL_VERSION_VERBOSE with the missing extra digit and compiled the 'wrong' pieces in -- in particular the incorrect value of SECURITY_INODE_INIT_SECURITY() . And that led to the crash. >> >> The problem did not happen when mmbuildgpl was used since the correct value of LINUX_KERNEL_VERSION_VERBOSE was then set up. >> >> Felipe >> >> ---- >> Felipe Knop knop at us.ibm.com >> GPFS Development and Security >> IBM Systems >> IBM Building 008 >> 2455 South Rd, Poughkeepsie, NY 12601 >> (845) 433-9314 T/L 293-9314 >> >> >> >> ----- Original message ----- >> From: Ryan Novosielski >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> To: gpfsug main discussion list >> Cc: >> Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests >> Date: Fri, Jan 17, 2020 10:56 AM >> >> That /is/ interesting. >> >> I?m a little confused about how that could be playing out in a case where I?m building on -1062.9.1, building for -1062.9.1, and running on -1062.9.1. 
Is there something inherent in the RPM building process that hasn?t caught up, or am I misunderstanding that change?s impact on it? >> >> -- >> ____ >> || \\UTGERS, |---------------------------*O*--------------------------- >> ||_// the State | Ryan Novosielski - novosirj at rutgers.edu >> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus >> || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark >> `' >> >>> On Jan 17, 2020, at 10:35, Felipe Knop wrote: >>> >>> ? >>> Hi Ryan, >>> >>> Some interesting IBM-internal communication overnight. The problems seems related to a change made to LINUX_KERNEL_VERSION_VERBOSE to handle the additional digit in the kernel numbering (3.10.0-1000+) . The GPL layer expected LINUX_KERNEL_VERSION_VERBOSE to have that extra digit, and its absence resulted in an incorrect function being compiled in, which led to the crash. >>> >>> This, at least, seems to make sense, in terms of matching to the symptoms of the problem. >>> >>> We are still in internal debates on whether/how update our guidelines for gplbin generation ... >>> >>> Regards, >>> >>> Felipe >>> >>> ---- >>> Felipe Knop knop at us.ibm.com >>> GPFS Development and Security >>> IBM Systems >>> IBM Building 008 >>> 2455 South Rd, Poughkeepsie, NY 12601 >>> (845) 433-9314 T/L 293-9314 >>> >>> >>> >>> ----- Original message ----- >>> From: Ryan Novosielski >>> Sent by: gpfsug-discuss-bounces at spectrumscale.org >>> To: "gpfsug-discuss at spectrumscale.org" >>> Cc: >>> Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests >>> Date: Thu, Jan 16, 2020 4:33 PM >>> >>> -----BEGIN PGP SIGNED MESSAGE----- >>> Hash: SHA1 >>> >>> Hi Felipe, >>> >>> I either misunderstood support or convinced them to take further >>> action. 
It at first looked like they were suggesting "mmbuildgpl fixed >>> it: case closed" (I know they wanted to close the SalesForce case >>> anyway, which would prevent communication on the issue). At this >>> point, they've asked for a bunch more information. >>> >>> Support is asking similar questions re: the speculations, and I'll >>> provide them with the relevant output ASAP, but I did confirm all of >>> that, including that there were no stray mmfs26/tracedev kernel >>> modules anywhere else in the relevant /lib/modules PATHs. In the >>> original case, I built on a machine running 3.10.0-957.27.2, but >>> pointed to the 3.10.0-1062.9.1 source code/defined the relevant >>> portions of usr/lpp/mmfs/src/config/env.mcr. That's always worked >>> before, and rebuilding once the build system was running >>> 3.10.0-1062.9.1 as well did not change anything either. In all cases, >>> the GPFS version was Spectrum Scale Data Access Edition 5.0.4-1. If >>> you build against either the wrong kernel version or the wrong GPFS >>> version, both will appear right in the filename of the gpfs.gplbin RPM >>> you build. Mine is called: >>> >>> gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm >>> >>> Anyway, thanks for your response; I know you might not be >>> following/working on this directly, but I figured the extra info might >>> be of interest. >>> >>> On 1/16/20 8:41 AM, Felipe Knop wrote: >>>> Hi Ryan, >>>> >>>> I'm aware of this ticket, and I understand that there has been >>>> active communication with the service team on this problem. >>>> >>>> The crash itself, as you indicate, looks like a problem that has >>>> been fixed: >>>> >>>> https://www.ibm.com/support/pages/ibm-spectrum-scale-gpfs-releases-423 >>> 13-or-later-and-5022-or-later-have-issues-where-kernel-crashes-rhel76-0 >>>> >>>> The fact that the problem goes away when *mmbuildgpl* is issued >>>> appears to point to some incompatibility with kernel levels and/or >>>> Scale version levels. 
Just speculating, some possible areas may >>>> be: >>>> >>>> >>>> * The RPM might have been built on a version of Scale without the >>>> fix * The RPM might have been built on a different (minor) version >>>> of the kernel * Somehow the VM picked a "leftover" GPFS kernel >>>> module, as opposed to the one included in gpfs.gplbin -- given >>>> that mmfsd never complained about a missing GPL kernel module >>>> >>>> >>>> Felipe >>>> >>>> ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM >>>> Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 >>>> (845) 433-9314 T/L 293-9314 >>>> >>>> >>>> >>>> >>>> ----- Original message ----- From: Ryan Novosielski >>>> Sent by: >>>> gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion >>>> list Cc: Subject: [EXTERNAL] >>>> [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum >>>> Scale Data Access Edition installed via gpfs.gplbin RPM on KVM >>>> guests Date: Wed, Jan 15, 2020 4:11 PM >>>> >>>> Hi there, >>>> >>>> I know some of the Spectrum Scale developers look at this list. >>>> I?m having a little trouble with support on this problem. >>>> >>>> We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM >>>> guests with a portability layer that has been installed via >>>> gpfs.gplbin RPMs that we built at our site and have used to >>>> install GPFS all over our environment. We?ve not seen this problem >>>> so far on any physical hosts, but have now experienced it on guests >>>> running on number of our KVM hypervisors, across vendors and >>>> firmware versions, etc. At one time I thought it was all happening >>>> on systems using Mellanox virtual functions for Infiniband, but >>>> we?ve now seen it on VMs without VFs. There may be an SELinux >>>> interaction, but some of our hosts have it disabled outright, some >>>> are Permissive, and some were working successfully with 5.0.2.x >>>> GPFS. 
>>>> >>>> What I?ve been instructed to try to solve this problem has been to >>>> run ?mmbuildgpl?, and it has solved the problem. I don?t consider >>>> running "mmbuildgpl" a real solution, however. If RPMs are a >>>> supported means of installation, it should work. Support told me >>>> that they?d seen this solve the problem at another site as well. >>>> >>>> Does anyone have any more information about this problem/whether >>>> there?s a fix in the pipeline, or something that can be done to >>>> cause this problem that we could remedy? Is there an easy place to >>>> see a list of eFixes to see if this has come up? I know it?s very >>>> similar to a problem that happened I believe it was after 5.0.2.2 >>>> and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x. >>>> >>>> Below is a sample of the crash output: >>>> >>>> [ 156.733477] kernel BUG at mm/slub.c:3772! [ 156.734212] invalid >>>> opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat >>>> ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) >>>> mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) >>>> iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) >>>> mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 >>>> ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 >>>> ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat >>>> iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 >>>> xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter >>>> iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul >>>> ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper >>>> ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 >>>> virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c >>>> mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic >>>> pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul >>>> crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core >>>> 
devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy >>>> virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ >>>> 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE >>>> ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] >>>> Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 [ >>>> 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: >>>> ffff8ae9f7a24000 [ 156.759326] RIP: 0010:[] >>>> [] kfree+0x13c/0x140 [ 156.760749] RSP: >>>> 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: >>>> 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ >>>> 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: >>>> ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: >>>> 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: >>>> 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ >>>> 156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: >>>> ffff8ae9f7a27938 [ 156.768229] FS: 00007f8ffff05700(0000) >>>> GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: >>>> 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: >>>> 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ >>>> 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >>>> 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: >>>> 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ >>>> 156.775154] [] >>>> cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] >>>> [] >>>> _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvP >>> P10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 >>>> >>>> >>> [mmfs26] >>>> [ 156.779378] [] >>>> _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_P >>> cjjjP10ext_cred_t+0x46a/0x7e0 >>>> >>>> >>> [mmfs26] >>>> [ 156.781689] [] ? 
>>>> _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 >>>> >>>> >>> [mmfs26] >>>> [ 156.783565] [] >>>> _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNod >>> e_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 >>>> >>>> >>> [mmfs26] >>>> [ 156.786228] [] >>>> _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7F >>> ilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 >>>> >>>> >>> [mmfs26] >>>> [ 156.788681] [] ? >>>> _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 >>>> [mmfs26] [ 156.790448] [] >>>> _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVa >>> ttr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 >>>> >>>> >>> [mmfs26] >>>> [ 156.793032] [] ? >>>> _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ >>>> 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 >>>> [mmfslinux] [ 156.795838] [] ? >>>> _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6 >>> c0 >>>> >>>> >>> [mmfs26] >>>> [ 156.797753] [] ? __d_alloc+0x122/0x180 [ >>>> 156.798763] [] ? d_alloc+0x60/0x70 [ >>>> 156.799700] [] lookup_real+0x23/0x60 [ >>>> 156.800651] [] __lookup_hash+0x42/0x60 [ >>>> 156.801675] [] lookup_slow+0x42/0xa7 [ >>>> 156.802634] [] link_path_walk+0x80f/0x8b0 [ >>>> 156.803666] [] path_lookupat+0x7a/0x8b0 [ >>>> 156.804690] [] ? lru_cache_add+0xe/0x10 [ >>>> 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ >>>> 156.806766] [] ? getname_flags+0x4f/0x1a0 [ >>>> 156.807817] [] filename_lookup+0x2b/0xc0 [ >>>> 156.808834] [] user_path_at_empty+0x67/0xc0 [ >>>> 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ >>>> 156.811017] [] user_path_at+0x11/0x20 [ >>>> 156.811983] [] vfs_fstatat+0x63/0xc0 [ >>>> 156.812951] [] SYSC_newstat+0x2e/0x60 [ >>>> 156.813931] [] ? 
trace_do_page_fault+0x56/0x150 >>>> [ 156.815050] [] SyS_newstat+0xe/0x10 [ >>>> 156.816010] [] system_call_fastpath+0x25/0x2a [ >>>> 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 >>>> df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 >>>> e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 >>>> 41 55 41 54 [ 156.822192] RIP [] >>>> kfree+0x13c/0x140 [ 156.823180] RSP [ >>>> 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] >>>> Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel >>>> Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: >>>> 0xffffffff80000000-0xffffffffbfffffff) >>>> >>>> -- ____ || \\UTGERS, >>>> |---------------------------*O*--------------------------- ||_// >>>> the State | Ryan Novosielski - novosirj at rutgers.edu || \\ >>>> University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS >>>> Campus || \\ of NJ | Office of Advanced Research Computing - >>>> MSB C630, Newark `' >>>> >>>> _______________________________________________ gpfsug-discuss >>>> mailing list gpfsug-discuss at spectrumscale.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ gpfsug-discuss >>>> mailing list gpfsug-discuss at spectrumscale.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>> >>> - -- >>> ____ >>> || \\UTGERS, |----------------------*O*------------------------ >>> ||_// the State | Ryan Novosielski - novosirj at rutgers.edu >>> || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus >>> || \\ of NJ | Office of Advanced Res. Comp. 
- MSB C630, Newark >>> `' >>> -----BEGIN PGP SIGNATURE----- >>> >>> iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXiDWSgAKCRCZv6Bp0Ryx >>> vpCsAKCQ2ykmeycbOVrHTGaFqb2SsU26NwCg3YyYi4Jy2d+xZjJkE6Vfht8O8gM= >>> =9rKb >>> -----END PGP SIGNATURE----- >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From heinrich.billich at id.ethz.ch Mon Jan 20 15:06:52 2020 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Mon, 20 Jan 2020 15:06:52 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: Thank you, this did work. I did install efix9 for 5.0.4.1 using yum, just with a plain "yum update" after installing the base version. I placed the efix and base rpms in different yum repos and disabled the efix repo while installing the base version, and vice versa.
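The two-repo layout Heiner describes could look roughly like the fragment below. This is an illustrative sketch only; the repo ids and baseurl paths are invented, not IBM-provided:

```ini
# /etc/yum.repos.d/gpfs.repo -- sketch of the base/efix split repo layout.
# Keep the efix repo disabled while installing the base PTF, then flip
# which repo is enabled so a plain "yum update" picks up the efix RPMs.
[gpfs-base]
name=Spectrum Scale 5.0.4.1 base
baseurl=file:///repo/gpfs/5.0.4.1
gpgcheck=0
enabled=1

[gpfs-efix]
name=Spectrum Scale 5.0.4.1 efix9
baseurl=file:///repo/gpfs/5.0.4.1-efix9
gpgcheck=0
enabled=0
```

Instead of editing the file between steps, the same switch can be done per-command with `yum --disablerepo=gpfs-base --enablerepo=gpfs-efix update 'gpfs*'`.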
Kind regards, Heiner From: on behalf of IBM Spectrum Scale Reply to: gpfsug main discussion list Date: Wednesday, 15 January 2020 at 22:00 To: gpfsug main discussion list Cc: "gpfsug-discuss-bounces at spectrumscale.org" Subject: Re: [gpfsug-discuss] How to install efix with yum ? >> I don't see any yum options which match rpm's '--force' option. Actually, you do not need to use the --force option since efix RPMs have an incremental efix number in the rpm name. The efix package provides update RPMs to be installed on top of the corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Jonathan Buzzard To: "gpfsug-discuss at spectrumscale.org" Date: 01/15/2020 02:09 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ?
Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be a single version of any package (it > is trying to eliminate conflicting provides/depends so that all of the > packaging requirements are satisfied). So this alien packaging practice > of installing an efix version of a package over the top of the base > version is not compatible with yum. I would at this juncture note that IBM should be appending the efix number to the RPM so that, for example, gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9, which would firstly make the problem go away, and secondly would allow one to know which version of GPFS you happen to have installed on a node without doing some sort of voodoo. > > The real issue for draconian sysadmins like us (whose systems must use > and obey yum) is that there are files (*liblum.so) which are provided by > the non-efix RPMS, but are not owned by the packages according to the > RPM database since they're purposefully installed outside of RPM's > tracking mechanism. > It's worse than that, because if you install the RPM directly, yum/dnf then start bitching about the RPM database being modified outside of themselves, and all sorts of useful information gets lost when you purge the package installation history to make the error go away. > We work around this by repackaging the three affected RPMS to include > the orphaned files from the original RPMs (and eliminating the related > but problematic checks from the RPMs' scripts) so that our efix RPMs > have been 'un-efix-ified' and will install as expected when using 'yum > upgrade'. To my knowledge no one's published a way to do this, so we > all just have to figure this out and run rpmrebuild for ourselves. > IBM should be hanging their heads in shame if the replacement RPM is missing files. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From heinrich.billich at id.ethz.ch Mon Jan 20 15:20:46 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Mon, 20 Jan 2020 15:20:46 +0000
Subject: [gpfsug-discuss] Does an AFM recovery stop AFM from recalling files?
Message-ID:

Hello,

Do AFM recalls from home to cache still work when a fileset is in state 'Recovery'? Are there any other states that allow reads/writes on cache but won't allow recalls from home?

We announced to users that they can continue to work on cache while a recovery is running. But we got reports that evicted files weren't available. NFS did work, I could read the files on home via the nfs mount in /var/mmfs/afm/-/. But AFM didn't recall. If recalls are done by entries in the AFM queue I see why, but is this the case?

Kind regards, Heiner

From heinrich.billich at id.ethz.ch Mon Jan 20 15:15:33 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Mon, 20 Jan 2020 15:15:33 +0000
Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
In-Reply-To: References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch>
Message-ID:

Hello Venkat,

Thank you very much, upgrading to 5.0.4.1 did indeed fix the issue. AFM now compiles the list of pending changes in a few hours. Before we estimated >20 days.
We had to increase disk space in /var/mmfs/afm/ and /var/mmfs/tmp/ to allow AFM to store all intermediate file lists. The manual recommends providing ample disk space in /var/mmfs/afm/ only, but some processes doing a resync placed lists in /var/mmfs/tmp/, too.

Cheers, Heiner

From: on behalf of Venkateswara R Puvvada
Reply to: gpfsug main discussion list
Date: Tuesday, 14 January 2020 at 17:51
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?

Hi,

>The dirtyDirs file holds 130,000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing?

Yes, this is the major problem fixed as mentioned in the APAR below. The dirtyDirs file is opened for each entry in the dirtyDirDirents file, and this causes the performance overhead.

>At the moment all we can do is to wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release?
>There probably is no way to flush the pending queue entries while recovery is ongoing?

Later versions have the fix mentioned in that APAR, and I believe it should fix your current performance issue. Flushing the pending queue entries is not available as of today (5.0.4); we are currently working on this feature.

~Venkat (vpuvvada at in.ibm.com)

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 01/13/2020 05:29 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

________________________________

Hello Venkat,

thank you, this seems to match our issue. I did trace tspcachescan and do see a long series of open()/read()/close() to the dirtyDirs file. The dirtyDirs file holds 130,000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries.
Can we estimate how long it will take to finish processing? tspcachescan does the following again and again for different directories:

11:11:36.837032 stat("/fs3101/XXXXX/.snapshots/XXXXX.afm.75872/yyyyy/yyyy", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0
11:11:36.837092 open("/var/mmfs/afm/fs3101-43/recovery/policylist.data.list.dirtyDirs", O_RDONLY) = 8
11:11:36.837127 fstat(8, {st_mode=S_IFREG|0600, st_size=32564140, ...}) = 0
11:11:36.837160 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fff96930000
11:11:36.837192 read(8, "539492355 65537 2795 553648131 "..., 8192) = 8192
11:11:36.837317 read(8, "Caches/com.apple.helpd/Generated"..., 8192) = 8192
11:11:36.837439 read(8, "ish\n539848852 1509237202 2795 5"..., 8192) = 8192

Many more reads

11:11:36.864104 close(8) = 0
11:11:36.864135 munmap(0x3fff96930000, 8192) = 0

A single iteration takes about 27 ms. Doing this 130,000 times would be o.k., but if tspcachescan does it 80M times we wait 600 hours. Is there a way to estimate how many iterations tspcachescan will do? The cache fileset holds 140M inodes.

At the moment all we can do is to wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? There probably is no way to flush the pending queue entries while recovery is ongoing?

I did open a case with IBM TS003219893 and will continue there.

Kind regards, Heiner

From: on behalf of Venkateswara R Puvvada
Reply to: gpfsug main discussion list
Date: Monday, 13 January 2020 at 08:40
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?

AFM maintains an in-memory queue at the gateway node to keep track of changes happening on the fileset.
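The back-of-the-envelope arithmetic behind the 600-hour worry above can be checked in a few lines. This is only a sketch using the figures quoted in the message (27 ms per open/read/close pass, 130,000 dirtyDirs lines, 80M dirtyDirDirents entries), not anything derived from the AFM code itself:

```python
# Estimate tspcachescan runtime from the strace measurements quoted above.
per_iteration_s = 0.027     # ~27 ms per full pass over the dirtyDirs file
dirty_dirs = 130_000        # lines in the dirtyDirs file
dirty_dirents = 80_000_000  # entries in the dirtyDirDirents file

hours_if_per_dir = per_iteration_s * dirty_dirs / 3600
hours_if_per_dirent = per_iteration_s * dirty_dirents / 3600

print(f"{hours_if_per_dir:.1f} h if one pass per dirty directory")  # 1.0 h
print(f"{hours_if_per_dirent:.0f} h if one pass per dirent")        # 600 h
```

The two results show why the question matters: the same per-iteration cost is harmless if the scan runs once per dirty directory, and ruinous if it runs once per dirent, which is exactly the behavior the APAR fix addresses.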
If the in-memory queue is lost (memory pressure, daemon shutdown, etc.), AFM runs a recovery process which involves creating the snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operation) dependencies, any changes to the AFM fileset during the recovery won't get replicated until the recovery completes. AFM does the home directory scan only for dirty directories, to get the names of the deleted and renamed files, because the old names of renamed files and the names of deleted files are not available at the cache on disk. Directories are made dirty when a rename or unlink operation is performed inside them. In your case it may be that all the directories became dirty due to the rename/unlink operations. The AFM recovery process is single-threaded.

>Is this to be expected and normal behavior? What to do about it?
>Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way?

Only for the dirty directories, see above.

>Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes.

There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home, so that the home directory scan can be avoided. There are some issues fixed in this regard. What is the Scale version?

https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436

~Venkat (vpuvvada at in.ibm.com)

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 01/08/2020 10:32 PM
Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

________________________________

Hello,

still new to AFM, so some basic questions on how recovery works for a SW cache: we have an AFM SW cache in recovery mode:
recovery first did run policies on the cache cluster, but now I see a 'tspcachescan' process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/…/recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up the filelists before recovery starts to update home. I see some risk: in this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can't evict?

I wonder
* Is this to be expected and normal behavior? What to do about it?
* Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way?

Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes.

Thank you, I will welcome any clarification, advice or comments.

Kind regards, Heiner

--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From vpuvvada at in.ibm.com Mon Jan 20 17:32:07 2020
From: vpuvvada at in.ibm.com (Venkateswara R Puvvada)
Date: Mon, 20 Jan 2020 23:02:07 +0530
Subject: [gpfsug-discuss] Does an AFM recovery stop AFM from recalling files?
In-Reply-To: References:
Message-ID:

While the recovery is running, reading uncached (evicted) files is blocked until the recovery completes queueing the recovery operations. This is to make sure that recovery executes all the dependent operations first. For example, an evicted file might have been renamed in the cache but not yet replicated to the home site when the fileset went into the recovery state. Recovery first has to perform the rename operation at the home site and then allow the read operation on the file. Reads of uncached files may be blocked if the cache is in the Recovery/NeedsResync/Unmounted/Dropped/Stopped states.

~Venkat (vpuvvada at in.ibm.com)

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 01/20/2020 08:50 PM
Subject: [EXTERNAL] [gpfsug-discuss] Does an AFM recovery stop AFM from recalling files?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hello,

Do AFM recalls from home to cache still work when a fileset is in state 'Recovery'? Are there any other states that allow reads/writes on cache but won't allow recalls from home?

We announced to users that they can continue to work on cache while a recovery is running. But we got reports that evicted files weren't available. NFS did work, I could read the files on home via the nfs mount in /var/mmfs/afm/-/. But AFM didn't recall. If recalls are done by entries in the AFM queue I see why, but is this the case?
Kind regards, Heiner

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=zwJSIhs7R020CQybqTb86CxBGIhtULCJo_QEggx05Y4&s=TGxHcd4HcDF0hv621ilqJ56r26Ah4rlmNM7PcJ3yLEA&e=

From kkr at lbl.gov Thu Jan 23 22:16:20 2020
From: kkr at lbl.gov (Kristy Kallback-Rose)
Date: Thu, 23 Jan 2020 14:16:20 -0800
Subject: [gpfsug-discuss] UPDATE Planning US meeting for Spring 2020
In-Reply-To: References: <42F45E03-0AEC-422C-B3A9-4B5A21B1D8DF@lbl.gov>
Message-ID:

Thanks for your responses to the poll. We're still working on a venue, but working towards:

March 30 - New User Day (Tuesday)
April 1&2 - Regular User Group Meeting (Wednesday & Thursday)

Once it's confirmed we'll post something again.

Best, Kristy.

> On Jan 6, 2020, at 3:41 PM, Kristy Kallback-Rose wrote:
>
> Thank you to the 18 wonderful people who filled out the survey.
>
> However, there are well more than 18 people at any given UG meeting.
>
> Please submit your responses today, I promise, it's really short and even painless. 2020 (how did *that* happen?!) is here, we need to plan the next meeting
>
> Happy New Year.
>
> Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7
>
> Thanks,
> Kristy
>
>> On Dec 16, 2019, at 11:05 AM, Kristy Kallback-Rose wrote:
>>
>> Hello,
>>
>> It's time already to plan for the next US event. We have a quick, seriously, should take order of 2 minutes, survey to capture your thoughts on location and date. It would help us greatly if you can please fill it out.
>>
>> Best wishes to all in the new year.
>>
>> -Kristy
>>
>> Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7

From agostino.funel at enea.it Mon Jan 27 10:26:55 2020
From: agostino.funel at enea.it (Agostino Funel)
Date: Mon, 27 Jan 2020 11:26:55 +0100
Subject: [gpfsug-discuss] How to upgrade GPFS 4.2.3.2 version?
Message-ID:

Hi,

I was trying to upgrade our IBM Spectrum Scale (General Parallel File System Standard Edition) version "4.2.3.2" for Linux_x86 systems, but from the Passport Advantage download site the only available versions are 5.*. Moreover, from the Fix Central repository the only available patches are for the 4.1.0 version. What should I do?

Thank you in advance.

Best regards,

Agostino Funel

--
Agostino Funel
DTE-ICT-HPC
ENEA
P.le E. Fermi 1
80055 Portici (Napoli) Italy
Phone: (+39) 081-7723575
Fax: (+39) 081-7723344
E-mail: agostino.funel at enea.it
WWW: http://www.afs.enea.it/funel

From S.J.Thompson at bham.ac.uk Mon Jan 27 10:29:52 2020
From: S.J.Thompson at bham.ac.uk (Simon Thompson)
Date: Mon, 27 Jan 2020 10:29:52 +0000
Subject: [gpfsug-discuss] How to upgrade GPFS 4.2.3.2 version?
In-Reply-To: References: Message-ID: <78B13AB8-5258-426D-AC9E-5D4045A1E554@bham.ac.uk>

4.2.3 on Fix Central is called IBM Spectrum Scale, not GPFS. Try:

https://www-945.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%20defined%20storage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=4.2.3&platform=Linux+64-bit,x86_64&function=all

Simon

On 27/01/2020, 10:27, "gpfsug-discuss-bounces at spectrumscale.org on behalf of agostino.funel at enea.it" wrote:

Hi,

I was trying to upgrade our IBM Spectrum Scale (General Parallel File System Standard Edition) version "4.2.3.2" for Linux_x86 systems, but from the Passport Advantage download site the only available versions are 5.*. Moreover, from the Fix Central repository the only available patches are for the 4.1.0 version. What should I do?
Thank you in advance.

Best regards,

Agostino Funel

--
Agostino Funel
DTE-ICT-HPC
ENEA
P.le E. Fermi 1
80055 Portici (Napoli) Italy
Phone: (+39) 081-7723575
Fax: (+39) 081-7723344
E-mail: agostino.funel at enea.it
WWW: http://www.afs.enea.it/funel

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From stockf at us.ibm.com Mon Jan 27 11:33:11 2020
From: stockf at us.ibm.com (Frederick Stock)
Date: Mon, 27 Jan 2020 11:33:11 +0000
Subject: [gpfsug-discuss] How to upgrade GPFS 4.2.3.2 version?
In-Reply-To: <78B13AB8-5258-426D-AC9E-5D4045A1E554@bham.ac.uk> References: <78B13AB8-5258-426D-AC9E-5D4045A1E554@bham.ac.uk>,
Message-ID:

An HTML attachment was scrubbed...

From heinrich.billich at id.ethz.ch Wed Jan 29 13:05:30 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Wed, 29 Jan 2020 13:05:30 +0000
Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
Message-ID:

Hello,

Can I change the times at which the GUI runs HW_INVENTORY and related tasks?

We frequently get messages like

gui_refresh_task_failed GUI WARNING 12 hours ago The following GUI refresh task(s) failed: HW_INVENTORY

The tasks fail due to timeouts. Running the task manually most times succeeds. We do run two gui nodes per cluster, and I noted that both servers seem to run HW_INVENTORY at the exact same time, which may lead to locking or congestion issues; actually the logs show messages like

EFSSA0194I Waiting for concurrent operation to complete.

The gui calls 'rinv' on the xCat servers. Rinv for a single little-endian server takes a long time - about 2-3 minutes, while it finishes in about 15s for a big-endian server.
We run a mix of ppc64 and ppc64le systems, with a separate xCat/ems server for each type. The GUI nodes are ppc64le.

We did see this issue with several GPFS versions on the gui and with at least two ESS/xCat versions.

Just to be sure I did purge the PostgreSQL tables.

I did try

/usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY
/usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug

And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult.

Thank you, Heiner

--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

From u.sibiller at science-computing.de Thu Jan 30 14:43:54 2020
From: u.sibiller at science-computing.de (Ulrich Sibiller)
Date: Thu, 30 Jan 2020 15:43:54 +0100
Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
In-Reply-To: References:
Message-ID:

On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote:
> Hello,
>
> Can I change the times at which the GUI runs HW_INVENTORY and related tasks?
>
> we frequently get messages like
>
> gui_refresh_task_failed GUI WARNING 12 hours ago The following GUI
> refresh task(s) failed: HW_INVENTORY
>
> The tasks fail due to timeouts. Running the task manually most times succeeds. We do run two gui
> nodes per cluster and I noted that both servers seem to run the HW_INVENTORY at the exact same time
> which may lead to locking or congestion issues, actually the logs show messages like
>
> EFSSA0194I Waiting for concurrent operation to complete.
>
> The gui calls 'rinv' on the xCat servers. Rinv for a single little-endian server takes a long
> time - about 2-3 minutes, while it finishes in about 15s for a big-endian server.
>
> Hence the long runtime of rinv on little-endian systems may be an issue, too
>
> We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems (5.0.3.2 efix4). We run a mix
> of ppc64 and ppc64le systems, with a separate xCat/ems server for each type. The GUI nodes are ppc64le.
>
> We did see this issue with several GPFS versions on the gui and with at least two ESS/xCat versions.
>
> Just to be sure I did purge the PostgreSQL tables.
>
> I did try
>
> /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY
>
> /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug
>
> And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult.

I have seen the same on ppc64le. From time to time it recovers but then it starts again. The timeouts are okay, it is the hardware. I have opened a call at IBM and they suggested upgrading to ESS 5.3.5 because of the new firmwares, which I am currently doing. I can dig out more details if you want.

Uli
--
Science + Computing AG
Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke
Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Philippe Miltin
Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196

From heinrich.billich at id.ethz.ch Thu Jan 30 15:13:06 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Thu, 30 Jan 2020 15:13:06 +0000
Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
In-Reply-To: References:
Message-ID: <6FA0119F-1067-4528-A4D8-54FA61F19BE0@id.ethz.ch>

Hello Uli,

Thank you. Yes, I noticed that some commands like 'ipmitool fru' or 'rinv' take long or very long on LE systems - I've seen up to 7 minutes.
I tried to reset the BMC with 'ipmitool mc reset cold', but this breaks the OS's access to IPMI; you need to unload/load the kernel modules in the right order to fix it - or reboot. I also needed to restart goconserver to restore the console connection. Hence resetting the BMC is no real option for little-endian ESS servers.

I don't know yet whether the BMC reset fixed anything. So we'll wait for 5.3.5.

Kind regards, Heiner

--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

On 30.01.20, 15:44, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Ulrich Sibiller" wrote:

On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote:

I have seen the same on ppc64le. From time to time it recovers but then it starts again. The timeouts are okay, it is the hardware. I have opened a call at IBM and they suggested upgrading to ESS 5.3.5 because of the new firmwares, which I am currently doing. I can dig out more details if you want.

Uli
--
Science + Computing AG
Vorstandsvorsitzender/Chairman of the board of management: Dr.
Martin Matzke
Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Philippe Miltin
Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From rohwedder at de.ibm.com Thu Jan 30 15:31:32 2020
From: rohwedder at de.ibm.com (Markus Rohwedder)
Date: Thu, 30 Jan 2020 16:31:32 +0100
Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
In-Reply-To: References:
Message-ID:

Hello,

The GUI tasks which are not daily tasks will start periodically at a random time. The exceptions are daily tasks, which are defined at fixed start times. It seems this is the issue you are experiencing, as the HW_INVENTORY task only runs once a day and starts at identical times on both GUI nodes.

Tweaking the cache database is unfortunately not a workaround, as the hard-coded and fixed starting times will be reset on every GUI restart.

I have created a task to address this issue in a future release. We could for example add a random delay to the daily tasks, or a fixed delay based on the number of GUI nodes that are active.

Mit freundlichen Grüßen / Kind regards

Dr.
Markus Rohwedder Spectrum Scale GUI Development Phone: +49 162 4159920 IBM Deutschland Research & Development E-Mail: rohwedder at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 29.01.2020 14:41 Subject: [EXTERNAL] [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, Can I change the times at which the GUI runs HW_INVENTORY and related tasks? we frequently get messages like gui_refresh_task_failed GUI WARNING 12 hours ago The following GUI refresh task(s) failed: HW_INVENTORY The tasks fail due to timeouts. Running the task manually most times succeeds. We do run two gui nodes per cluster and I noted that both servers seem run the HW_INVENTORY at the exact same time which may lead to locking or congestion issues, actually the logs show messages like EFSSA0194I Waiting for concurrent operation to complete. The gui calls ?rinv? on the xCat servers. Rinv for a single little-endian server takes a long time ? about 2-3 minutes , while it finishes in about 15s for big-endian server. Hence the long runtime of rinv on little-endian systems may be an issue, too We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems (5.0.3.2 efix4). We run a mix of ppc64 and ppc64le systems, which a separate xCat/ems server for each type. The GUI nodes are ppc64le. We did see this issue with several gpfs version on the gui and with at least two ESS/xCat versions. Just to be sure I did purge the Posgresql tables. I did try /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY ?debug And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult. 
Thank you, Heiner

--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=3j7GTkRFLANP-V9nMPiOuUX-2D3ybbNTEc64kU-OQAM&s=sR1v63lEVWuEZTBgspG3imB0MN_-7ggA6zrmyvqfCzE&e=

From ewahl at osc.edu Thu Jan 30 15:52:27 2020
From: ewahl at osc.edu (Wahl, Edward)
Date: Thu, 30 Jan 2020 15:52:27 +0000
Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
In-Reply-To: References:
Message-ID:

Interesting. We just deployed an ESS here and are running into a very similar problem with the GUI refresh, it appears. Takes my ppc64le's about 45 seconds to run rinv when they are idle. I had just opened a support case on this last evening. We're on ESS 5.3.4 as well. I will wait to see what support says.
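Markus's suggestion two messages up (stagger the fixed daily start times so the GUI nodes don't fire HW_INVENTORY simultaneously) can be illustrated with a tiny sketch. The function name, the per-node index, and the 30-minute window are made up for illustration and are not part of the GUI code:

```python
# Hypothetical illustration of staggering a fixed daily start time across
# N GUI nodes, so they do not all run HW_INVENTORY at the same instant.
def staggered_start(base_minute, node_index, node_count, window_min=30):
    """Spread node start times evenly across a window after the base time."""
    return base_minute + (window_min * node_index) // node_count

# With two GUI nodes, the second node starts 15 minutes after the first.
starts = [staggered_start(0, i, 2) for i in range(2)]
print(starts)  # [0, 15]
```

A random per-node delay would achieve the same effect without needing a stable node index, at the cost of occasional collisions.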
Ed Wahl Ohio Supercomputer Center -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Ulrich Sibiller Sent: Thursday, January 30, 2020 9:44 AM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote: > Hello, > > Can I change the times at which the GUI runs HW_INVENTORY and related tasks? > > we frequently get ?messages like > > ?? gui_refresh_task_failed???? GUI?????????? WARNING???? 12 hours ago????? > The following GUI refresh task(s) failed: HW_INVENTORY > > The tasks fail due to timeouts. Running the task manually most times > succeeds. We do run two gui nodes per cluster and I noted that both > servers seem run the HW_INVENTORY at the exact same time which may > lead to locking or congestion issues, actually the logs show messages > like > > EFSSA0194I Waiting for concurrent operation to complete. > > The gui calls ?rinv? on the xCat servers. Rinv for a single ?? > little-endian ?server takes a long time ? about 2-3 minutes , while it finishes in ?about 15s for big-endian server. > > Hence the long runtime of rinv on little-endian systems may be an > issue, too > > We run 5.0.4-1 efix9 on the gui and ESS ?5.3.4.1 on the GNR systems? > (5.0.3.2 efix4). We run a mix of ppc64 and ppc64le systems, which a separate xCat/ems server for each type. The GUI nodes are ppc64le. > > We did see this issue with several gpfs version on the gui and with at least two ESS/xCat versions. > > Just to be sure I did purge the Posgresql tables. > > I did try > > /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY > > /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY ?debug > > And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult. I have seen the same on ppc64le. From time to time it recovers but then it starts again. The timeouts are okay, it is the hardware. 
I haven opened a call at IBM and they suggested upgrading to ESS 5.3.5 because of the new firmwares which I am currently doing. I can dig out more details if you want. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!KGKeukY!gqw1FGbrK5S4LZwnuFxwJtT6l9bm5S5mMjul3tadYbXRwk0eq6nesPhvndYl$ From janfrode at tanso.net Thu Jan 30 16:59:40 2020 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 30 Jan 2020 17:59:40 +0100 Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes In-Reply-To: References: Message-ID: I *think* this was a known bug in the Power firmware included with 5.3.4, and that it was fixed in the FW860.70. Something hanging/crashing in IPMI. -jf tor. 30. jan. 2020 kl. 17:10 skrev Wahl, Edward : > Interesting. We just deployed an ESS here and are running into a very > similar problem with the gui refresh it appears. Takes my ppc64le's about > 45 seconds to run rinv when they are idle. > I had just opened a support case on this last evening. We're on ESS > 5.3.4 as well. I will wait to see what support says. 
> > Ed Wahl > Ohio Supercomputer Center > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Ulrich Sibiller > Sent: Thursday, January 30, 2020 9:44 AM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY > with two active GUI nodes > > On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote: > > Hello, > > > > Can I change the times at which the GUI runs HW_INVENTORY and related > tasks? > > > > we frequently get messages like > > > > gui_refresh_task_failed GUI WARNING 12 hours > ago > > The following GUI refresh task(s) failed: HW_INVENTORY > > > > The tasks fail due to timeouts. Running the task manually most times > > succeeds. We do run two gui nodes per cluster and I noted that both > > servers seem run the HW_INVENTORY at the exact same time which may > > lead to locking or congestion issues, actually the logs show messages > > like > > > > EFSSA0194I Waiting for concurrent operation to complete. > > > > The gui calls ?rinv? on the xCat servers. Rinv for a single > > little-endian server takes a long time ? about 2-3 minutes , while it > finishes in about 15s for big-endian server. > > > > Hence the long runtime of rinv on little-endian systems may be an > > issue, too > > > > We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems > > (5.0.3.2 efix4). We run a mix of ppc64 and ppc64le systems, which a > separate xCat/ems server for each type. The GUI nodes are ppc64le. > > > > We did see this issue with several gpfs version on the gui and with at > least two ESS/xCat versions. > > > > Just to be sure I did purge the Posgresql tables. > > > > I did try > > > > /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY > > > > /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY ?debug > > > > And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are > difficult. > > > I have seen the same on ppc64le. 
From time to time it recovers but then it > starts again. The timeouts are okay, it is the hardware. I have opened a > call at IBM and they suggested upgrading to ESS 5.3.5 because of the new > firmwares, which I am currently doing. I can dig out more details if you > want. > > Uli > -- > Science + Computing AG > Vorstandsvorsitzender/Chairman of the board of management: > Dr. Martin Matzke > Vorstand/Board of Management: > Matthias Schempp, Sabine Hohenstein > Vorsitzender des Aufsichtsrats/ > Chairman of the Supervisory Board: > Philippe Miltin > Aufsichtsrat/Supervisory Board: > Martin Wibbe, Ursula Morgenstern > Sitz/Registered Office: Tuebingen > Registergericht/Registration Court: Stuttgart Registernummer/Commercial > Register No.: HRB 382196 _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!KGKeukY!gqw1FGbrK5S4LZwnuFxwJtT6l9bm5S5mMjul3tadYbXRwk0eq6nesPhvndYl$ > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From kkr at lbl.gov Mon Jan 6 23:41:28 2020 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Mon, 6 Jan 2020 15:41:28 -0800 Subject: [gpfsug-discuss] (Please help with) Planning US meeting for Spring 2020 In-Reply-To: <42F45E03-0AEC-422C-B3A9-4B5A21B1D8DF@lbl.gov> References: <42F45E03-0AEC-422C-B3A9-4B5A21B1D8DF@lbl.gov> Message-ID: Thank you to the 18 wonderful people who filled out the survey. However, there are well more than 18 people at any given UG meeting. Please submit your responses today; I promise, it's really short and even painless. 2020 (how did *that* happen?!) is here, and we need to plan the next meeting. Happy New Year. Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7 Thanks, Kristy > On Dec 16, 2019, at 11:05 AM, Kristy Kallback-Rose wrote: > > Hello, > > It's time already to plan for the next US event. We have a quick survey (seriously, it should take on the order of 2 minutes) to capture your thoughts on location and date. It would help us greatly if you can please fill it out. > > Best wishes to all in the new year. > > -Kristy > > > Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7 -------------- next part -------------- An HTML attachment was scrubbed...
URL: From faryarag at in.ibm.com Tue Jan 7 10:47:39 2020 From: faryarag at in.ibm.com (Farida Yaragatti1) Date: Tue, 7 Jan 2020 16:17:39 +0530 Subject: [gpfsug-discuss] Introduction: IBM Elastic Storage System (ESS) 3000 (Spectrum Scale) Message-ID: Hello All, My name is Farida Yaragatti and I am part of the IBM Elastic Storage System (ESS) 3000 team, India Systems Development Lab, IBM India Pvt. Ltd. IBM Elastic Storage System (ESS) 3000 installs and upgrades GPFS using containerization. For more details, please go through the following links, which were published and released recently, on December 9th, 2019. The IBM Lab Services team can install an Elastic Storage Server 3000 as an included service part of acquisition. Alternatively, the customer's IT team can do the installation. - The ESS 3000 quick deployment documentation is at the following web page: https://ibm.biz/Bdz7qb The following documents provide information that you need for proper deployment, installation, and upgrade procedures for an IBM ESS 3000: - IBM ESS 3000: Planning for the system, service maintenance packages, and service procedures: https://ibm.biz/Bdz7qp Our team would like to participate in the Spectrum Scale user group events that are happening across the world, as we are using Spectrum Scale in 2020. Please let us know how we can initiate or post our submission for the events. Regards, Farida Yaragatti ESS Deployment (Testing Team), India Systems Development Lab IBM India Pvt. Ltd., EGL D Block, 6th Floor, Bangalore, Karnataka, 560071, India -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Renar.Grunenberg at huk-coburg.de Tue Jan 7 11:58:13 2020 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Tue, 7 Jan 2020 11:58:13 +0000 Subject: [gpfsug-discuss] Introduction: IBM Elastic Storage System (ESS) 3000 (Spectrum Scale) In-Reply-To: References: Message-ID: <52b5b5557f3f44ce890fe141b670014b@huk-coburg.de> Hello Farida, can you check your links? It seems these don't work for people outside the IBM network. Renar Grunenberg Abteilung Informatik - Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer, Sarah Rössler, Daniel Thomas. ________________________________ Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ Von: gpfsug-discuss-bounces at spectrumscale.org Im Auftrag von Farida Yaragatti1 Gesendet: Dienstag, 7.
Januar 2020 11:48 An: gpfsug-discuss at spectrumscale.org Cc: Wesley Jones ; Mohsin A Inamdar ; Sumit Kumar43 ; Ricardo Daniel Zamora Ruvalcaba ; Rajan Mishra1 ; Pramod T Achutha ; Rezaul Islam ; Ravindra Sure Betreff: [gpfsug-discuss] Introduction: IBM Elastic Storage System (ESS) 3000 (Spectrum Scale) Hello All, My name is Farida Yaragatti and I am part IBM Elastic Storage System (ESS) 3000 Team, India Systems Development Lab, IBM India Pvt. Ltd. IBM Elastic Storage System (ESS) 3000 installs and upgrade GPFS using Containerization. For more details, please go through following links which has been published and released recently in December 9th 2019. The IBM Lab Services team can install an Elastic Storage Server 3000 as an included service part of acquisition. Alternatively, the customer?s IT team can do the installation. > The ESS 3000 quick deployment documentation is at the following web page: https://ibm.biz/Bdz7qb The following documents provide information that you need for proper deployment, installation, and upgrade procedures for an IBM ESS 3000: > IBM ESS 3000: Planning for the system, service maintenance packages, and service procedures: https://ibm.biz/Bdz7qp Our team would like to participate in Spectrum Scale user group events which is happening across the world as we are using Spectrum Scale in 2020. Please let us know how we can initiate or post our submission for the events. Regards, Farida Yaragatti ESS Deployment (Testing Team), India Systems Development Lab IBM India Pvt. Ltd., EGL D Block, 6th Floor, Bangalore, Karnataka, 560071, India -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rp2927 at gsb.columbia.edu Tue Jan 7 16:32:26 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Tue, 7 Jan 2020 16:32:26 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> Message-ID: <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> Hi Kuei-Yu (et al.) Happy New Year! I'd like to reiterate my follow-up question to your comments, in particular the line copied below, which mentions a behavior that I'm seeking for this command but cannot reproduce (in version 5.0.3 on Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I'm seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user's default restoration to *all filesystems and filesets*. Are you using a different version? Or a different implementation? I'm running SS 5.0.3 on Linux x64. I apologize for having to press this point, but this matter is of a certain importance to us, and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I'm trying with a new case now, but my hopes are low.) You seem to be an IBM insider, so... I count on your help. Sorry for the insistence. Best, Razvan Popescu Columbia Univ.
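[The behavior being contrasted here, a reset scoped to one filesystem:fileset versus a reset that touches every explicit entry, can be made concrete with a small sketch. This is hypothetical Python, not GPFS code; the (filesystem, fileset) keys and the entry-type strings ("e" explicit, "d_fset" fileset default) simply mirror the mmrepquota output quoted elsewhere in this thread:

```python
# Hypothetical sketch (not GPFS code): compare quota entry types per
# (filesystem, fileset) before and after an "mmedquota -d -u <user>" run,
# and report which filesets actually had their explicit ("e") entry reset.

def reset_scope(before, after):
    """Return the (fs, fileset) keys whose explicit entry was reset."""
    return sorted(k for k in before
                  if before[k] == "e" and after.get(k) != "e")

before = {("fs9", "fset7"): "e", ("fs9", "fset8"): "e", ("gsb", "home"): "e"}

# Scoped behavior (what the fs:fset parameter is expected to do):
# only fs9:fset7 reverts to the fileset default.
scoped = {**before, ("fs9", "fset7"): "d_fset"}
print(reset_scope(before, scoped))    # [('fs9', 'fset7')]

# Unscoped behavior (what is reported on 5.0.3): every explicit entry reverts.
unscoped = {k: "d_fset" for k in before}
print(reset_scope(before, unscoped))  # all three filesets
```

Feeding such a comparison with real entry types taken from mmrepquota output before and after the command would show directly whether the reset stayed scoped to the named fileset.]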
-- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". You may need to play a little bit with your system to get the result as you can have default quota per file system set and default quota per fileset enabled. An exemple to illustrate User pfs004 in filesystem fs9 and fileset fset7 has explicit quota set: # mmrepquota -u -v fs9 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e <=== explicit # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on <=== default quota limits for fs9:fset7, the default fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none <=== explicit # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none <=== takes the default value # mmrepquota -u -v fs9:fset7 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset <=== now user pfs004 in fset7 takes the default limits # ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 
433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I see. May I ask one follow-up question, please: what is 'mmedquota -d -u' supposed to do in this case? Really appreciate your assistance.
Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical: this fellow's quota entryType is now 'e'. I want to change it back to entryType 'I'. (I hope I'm not talking nonsense here) Currently there is no function to revert an explicit quota entry (e) to an initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was Ok.
block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset]"Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? fails because I do have not set a filesystem default?. [root at xxx]# mmedquota -d -u user gsb USR default quota is off (SpectrumScale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn?t seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I?d like to be able to handle this matter? Have I hit a limitation of the implementation? Any workaround, if that?s the case? 
Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 107 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 108 bytes Desc: image003.gif URL: From novosirj at rutgers.edu Tue Jan 7 17:06:54 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Tue, 7 Jan 2020 17:06:54 +0000 Subject: [gpfsug-discuss] Snapshot migration of any kind? Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 We're in the process of figuring out how to rewrite our fileystems in order to take advantage of the 5.0.x variable subblock size enhancement. However, we keep generally 6 weeks of snapshots as a courtesy to the user community. I assume the answer is no, but is there any option for migrating snapshots, or barring that, any recommended reading for what you /can/ do with a snapshot beyond create/destroy? Thanks in advance. I'm having trouble coming up with any useful search terms. - -- ____ || \\UTGERS, |----------------------*O*------------------------ ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. 
Technologist - 973/972.0922 ~*~ RBHS Campus || \\ of NJ | Office of Advanced Res. Comp. - MSB C630, Newark `' -----BEGIN PGP SIGNATURE----- iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXhS6qQAKCRCZv6Bp0Ryx vicZAJsHI/z7DXc8EV+sqExhVwMPomoBSQCgyIHgS1Z7RlhQMYAySvDOINAUWPk= =CqPO -----END PGP SIGNATURE----- From kywang at us.ibm.com Tue Jan 7 17:11:35 2020 From: kywang at us.ibm.com (Kuei-Yu Wang-Knop) Date: Tue, 7 Jan 2020 12:11:35 -0500 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu><794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu><746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> Message-ID: Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration, the code level to figure out whether what you are seeing is a problem or it is just an expected behavior or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Kuei-Yu (et al.) Happy New Year! I?d like to reiterate my follow-up question to your comments ? 
in particular to the line copied below, which mentions a behavior that I?m seeking for this command, but cannot reproduce (in vers, 5.0.3 at Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I?m seeking, * but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user?s default restoration to *all filesystems and filesets* Are you using a different version? Or a different implementations? I?m running SS 5.0.3 on Linux x64. I apologize for having to press this point, but this matter is of a certain importance to us and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I?m trying with a new case now, but my hopes are low). You seem to be an IBM insider, so ?. I count on your help ?. Sorry, for the insistence. ? Best, Razvan Popescu Columbia Univ. -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". You may need to play a little bit with your system to get the result as you can have default quota per file system set and default quota per fileset enabled. 
An exemple to illustrate User pfs004 in filesystem fs9 and fileset fset7 has explicit quota set: # mmrepquota -u -v fs9 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e <=== explicit # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on <=== default quota limits for fs9:fset7, the default fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none <=== explicit # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none <=== takes the default value # mmrepquota -u -v fs9:fset7 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset <=== now user pfs004 in fset7 takes the default limits # ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---12/19/2019 02:28:51 PM---I see. May I ask one follow-up question, please: what"Popescu, Razvan" ---12/19/2019 02:28:51 PM---I see. May I ask one follow-up question, please: what is ?mmedquota -d -u ? supposed From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org I see. May I ask one follow-up question, please: what is ?mmedquota -d -u ? supposed to do in this case? Really appreciate your assistance. 
Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. (I hope I?m not talking nonsense here) Currently there is no function to revert an explicit quota entry (e) to initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different tho"Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different though. From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was Ok. During operations I have edited *individual* quotas, for example to increase certain user?s allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota ! For example, I have used one user account to test the mmedquota command setting his limits to a certain value (just testing). I?d like now to make that user?s quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits. To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. (I hope I?m not talking nonsense here) mmedquota?s ?-d? option is supposed to reinstate the defaults, but it doesn?t seem to work for fileset based quotas ? 
!?! Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default perfileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled. # mmlsfs fs9 -Q --perfileset-quota flag value description ------------------- ------------------------ ----------------------------------- -Q user;group;fileset Quotas accounting enabled user;fileset Quotas enforced user;group;fileset Default quotas enabled --perfileset-quota Yes Per-fileset quota enforcement # Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9" Then set the default quota for this fileset using mmdefedquota" # mmdefedquota -u fs9:fset9 .. *** Edit quota limits for USR DEFAULT entry for fileset fset9 NOTE: block limits will be rounded up to the next multiple of the block size. block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset"Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? 
From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? fails because I do have not set a filesystem default?. [root at xxx]# mmedquota -d -u user gsb USR default quota is off (SpectrumScale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn?t seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I?d like to be able to handle this matter? Have I hit a limitation of the implementation? Any workaround, if that?s the case? Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=m2_UDb09pxCtr3QQCy-6gDUzpw-o_zJQig_xI3C2_1c&m=5FdKJTgMapLheSzY_a5KkY9OQL5m9TwMBD0Bsdt6p58&s=t7Z10OpvkLnFZB5iiF9k8KGVE4R1yitIwUgFfye2tuU&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From rp2927 at gsb.columbia.edu Tue Jan 7 17:23:40 2020
From: rp2927 at gsb.columbia.edu (Popescu, Razvan)
Date: Tue, 7 Jan 2020 17:23:40 +0000
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu>
Message-ID: <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu>

Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn't yield a solution. I just opened another one... My very specific question to you is: "Where have you seen this particular syntax work?" The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: "fs9:fset7") and my testing rejected such a parameter. What made you think this would work? # mmedquota -d -u pfs004 fs9:fset7 ^^^^^^^^^^^^^ Many thanks! Razvan

-- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default

Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic.
We would look at your system configuration, the code level to figure out whether what you are seeing is a problem or it is just an expected behavior or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---01/07/2020 11:32:42 AM---Hi Kuei-Yu (et al.) Happy New Year!]"Popescu, Razvan" ---01/07/2020 11:32:42 AM---Hi Kuei-Yu (et al.) Happy New Year! From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Kuei-Yu (et al.) Happy New Year! I?d like to reiterate my follow-up question to your comments ? in particular to the line copied below, which mentions a behavior that I?m seeking for this command, but cannot reproduce (in vers, 5.0.3 at Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I?m seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user?s default restoration to *all filesystems and filesets* Are you using a different version? Or a different implementations? I?m running SS 5.0.3 on Linux x64. I apologize for having to press this point, but this matter is of a certain importance to us and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I?m trying with a new case now, but my hopes are low). You seem to be an IBM insider, so ?. I count on your help ?. Sorry, for the insistence. ? 
Best, Razvan Popescu Columbia Univ. -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". You may need to play a little bit with your system to get the result as you can have default quota per file system set and default quota per fileset enabled. An exemple to illustrate User pfs004 in filesystem fs9 and fileset fset7 has explicit quota set: # mmrepquota -u -v fs9 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e <=== explicit # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on <=== default quota limits for fs9:fset7, the default fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none <=== explicit # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none <=== takes the default value # mmrepquota -u -v fs9:fset7 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset <=== now user pfs004 in fset7 takes the default limits # ------------------------------------ Kuei-Yu Wang-Knop 
IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---12/19/2019 02:28:51 PM---I see. May I ask one follow-up question, please: what]"Popescu, Razvan" ---12/19/2019 02:28:51 PM---I see. May I ask one follow-up question, please: what is ?mmedquota -d -u ? supposed From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I see. May I ask one follow-up question, please: what is ?mmedquota -d -u ? supposed to do in this case? Really appreciate your assistance. Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. (I hope I?m not talking nonsense here) Currently there is no function to revert an explicit quota entry (e) to initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different tho]"Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different though. From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thanks for your kind reply. My problem is different though. 
I have set a fileset default quota (doing all the steps you recommended) and all was Ok. During operations I have edited *individual* quotas, for example to increase certain user?s allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota ! For example, I have used one user account to test the mmedquota command setting his limits to a certain value (just testing). I?d like now to make that user?s quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits. To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. (I hope I?m not talking nonsense here) mmedquota?s ?-d? option is supposed to reinstate the defaults, but it doesn?t seem to work for fileset based quotas ? !?! Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default perfileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled. # mmlsfs fs9 -Q --perfileset-quota flag value description ------------------- ------------------------ ----------------------------------- -Q user;group;fileset Quotas accounting enabled user;fileset Quotas enforced user;group;fileset Default quotas enabled --perfileset-quota Yes Per-fileset quota enforcement # Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9" Then set the default quota for this fileset using mmdefedquota" # mmdefedquota -u fs9:fset9 .. 
*** Edit quota limits for USR DEFAULT entry for fileset fset9 NOTE: block limits will be rounded up to the next multiple of the block size. block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset]"Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? fails because I do have not set a filesystem default?. [root at xxx]# mmedquota -d -u user gsb USR default quota is off (SpectrumScale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn?t seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I?d like to be able to handle this matter? Have I hit a limitation of the implementation? Any workaround, if that?s the case? 
Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 107 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 108 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image004.gif Type: image/gif Size: 109 bytes Desc: image004.gif URL:

From kywang at us.ibm.com Tue Jan 7 19:13:13 2020
From: kywang at us.ibm.com (Kuei-Yu Wang-Knop)
Date: Tue, 7 Jan 2020 14:13:13 -0500
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To: <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu>
References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu>
Message-ID:

Razvan, You are right, the command below yields an error (as it interprets fs9:fset7 as a user name). Apologies for the confusion. This is what I see in my system: # mmedquota -d -u pfs004 fs9:fset7 fs1 USR default quota is off fs9:fset7 is not valid user # I must have confused the command names in the previous note. Instead of the "mmedquota -d" command I meant "mmlsquota -d" (??) # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -d fs9:fset9 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset9 USR 102400 1048576 | 10000 22222 default on fs9 fset9 GRP 0 0 | 0 0 default off # mmlsquota -d -u fs9 Default Block Limits(KB) | Default File Limits Filesystem type quota limit | quota limit Remarks fs9 USR 102400 1048576 | 10000 0 # Upon further investigation, the current behavior of 'mmedquota -d -u ' is to restore the default quotas for the particular user on all filesystems and filesets.
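Given that limitation, one possible workaround (a sketch only, using the hypothetical pfs004 / fs9:fset7 names from this thread, and assuming the "mmsetquota Device:Fileset --user" syntax of the 5.0.x command line) is to record the user's explicit limits in the filesets that should keep them, run the global "mmedquota -d -u" reset, and then re-apply the saved limits:

```shell
#!/bin/sh
# Sketch only: keep selected fileset limits across a global
# "mmedquota -d -u <user>" reset. The user/fileset names are the
# hypothetical ones used in the examples above.
QUSER=pfs004

# Pull the user's explicit (entryType "e") limits out of
# "mmrepquota -u -v" output: blockSoft blockHard fileSoft fileHard.
parse_limits() {
    awk -v u="$QUSER" '$1 == u && $NF == "e" {
        # name fileset USR usedKB bsoft bhard in_doubt grace | files fsoft fhard ...
        print $5, $6, $11, $12
    }'
}

# Sample line in the layout quoted earlier in the thread:
sample='pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e'
saved=$(printf '%s\n' "$sample" | parse_limits)
echo "saved limits: $saved"   # 102400 1048576 10000 33333

# Against a live cluster (not run here), the sequence would be roughly:
#   mmrepquota -u -v fs9:fset7          # snapshot the explicit entry
#   mmedquota -d -u "$QUSER"            # resets ALL filesystems/filesets
#   set -- $saved
#   mmsetquota fs9:fset7 --user "$QUSER" --block "${1}K:${2}K" --files "$3:$4"
```

Note that the re-applied entry comes back as an explicit ("e") entry, so it will not track later changes to the fileset default; there is currently no way to restore the "i"/"d_fset" tracking state for selected filesets only.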
Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop , gpfsug main discussion list Date: 01/07/2020 12:23 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn?t yield a solution. I just opened another one? My very specific question to you is: ?Where have you seen this particular syntax work?? The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: ?fs9:fset7?) and my testing rejected such a parameter. What made you think this would work? ? # mmedquota -d -u pfs004 fs9:fset7 ^^^^^^^^^^^^^ Many thanks! Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration, the code level to figure out whether what you are seeing is a problem or it is just an expected behavior or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---01/07/2020 11:32:42 AM---Hi Kuei-Yu (et al.) Happy New Year!"Popescu, Razvan" ---01/07/2020 11:32:42 AM---Hi Kuei-Yu (et al.) Happy New Year! 
From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Kuei-Yu (et al.) Happy New Year! I?d like to reiterate my follow-up question to your comments ? in particular to the line copied below, which mentions a behavior that I?m seeking for this command, but cannot reproduce (in vers, 5.0.3 at Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I?m seeking, * but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user?s default restoration to *all filesystems and filesets* Are you using a different version? Or a different implementations? I?m running SS 5.0.3 on Linux x64. I apologize for having to press this point, but this matter is of a certain importance to us and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I?m trying with a new case now, but my hopes are low). You seem to be an IBM insider, so ?. I count on your help ?. Sorry, for the insistence. ? Best, Razvan Popescu Columbia Univ. -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". 
You may need to play a little bit with your system to get the result as you can have default quota per file system set and default quota per fileset enabled. An exemple to illustrate User pfs004 in filesystem fs9 and fileset fset7 has explicit quota set: # mmrepquota -u -v fs9 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e <=== explicit # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on <=== default quota limits for fs9:fset7, the default fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none <=== explicit # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none <=== takes the default value # mmrepquota -u -v fs9:fset7 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset <=== now user pfs004 in fset7 takes the default limits # ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---12/19/2019 02:28:51 PM---I see. May I ask one follow-up question, please: what"Popescu, Razvan" ---12/19/2019 02:28:51 PM---I see. May I ask one follow-up question, please: what is ?mmedquota -d -u ? supposed From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org I see. 
May I ask one follow-up question, please: what is ?mmedquota -d -u ? supposed to do in this case? Really appreciate your assistance. Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. (I hope I?m not talking nonsense here) Currently there is no function to revert an explicit quota entry (e) to initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different tho"Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different though. From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was Ok. During operations I have edited *individual* quotas, for example to increase certain user?s allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota ! For example, I have used one user account to test the mmedquota command setting his limits to a certain value (just testing). I?d like now to make that user?s quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits. To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. 
(I hope I?m not talking nonsense here) mmedquota?s ?-d? option is supposed to reinstate the defaults, but it doesn?t seem to work for fileset based quotas ? !?! Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default perfileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled. # mmlsfs fs9 -Q --perfileset-quota flag value description ------------------- ------------------------ ----------------------------------- -Q user;group;fileset Quotas accounting enabled user;fileset Quotas enforced user;group;fileset Default quotas enabled --perfileset-quota Yes Per-fileset quota enforcement # Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9" Then set the default quota for this fileset using mmdefedquota" # mmdefedquota -u fs9:fset9 .. *** Edit quota limits for USR DEFAULT entry for fileset fset9 NOTE: block limits will be rounded up to the next multiple of the block size. block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset"Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? 
From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? fails because I do have not set a filesystem default?. [root at xxx]# mmedquota -d -u user gsb USR default quota is off (SpectrumScale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn?t seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I?d like to be able to handle this matter? Have I hit a limitation of the implementation? Any workaround, if that?s the case? Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 1B974314.gif Type: image/gif Size: 106 bytes Desc: not available URL:

From carlz at us.ibm.com Tue Jan 7 19:28:46 2020
From: carlz at us.ibm.com (Carl Zetie - carlz@us.ibm.com)
Date: Tue, 7 Jan 2020 19:28:46 +0000
Subject: [gpfsug-discuss] Spectrum Scale 5.0.5 Beta participation
Message-ID: <3A32DB49-9552-465F-9727-C8E661A7E6EC@us.ibm.com>

We are accepting nominations for IBM Spectrum Scale 5.0.5 Beta participation here: https://www.surveygizmo.com/s3/5356255/ee853c3af96a The Beta begins in mid-February. Please note that you'll need your IBM account rep to nominate you. Carl Zetie Program Director Offering Management Spectrum Scale & Spectrum Discover ---- (919) 473 3318 ][ Research Triangle Park carlz at us.ibm.com -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 69557 bytes Desc: image001.png URL:

From rp2927 at gsb.columbia.edu Tue Jan 7 19:39:22 2020
From: rp2927 at gsb.columbia.edu (Popescu, Razvan)
Date: Tue, 7 Jan 2020 19:39:22 +0000
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu>
Message-ID: <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu>

Thank you very much, Kuei.
It's now clear where we stand, even though I would have liked to have that added selectivity in mmedquota. We use filesets to separate classes of projects and classes of storage (backup/noBackup for example), and thus one user or one group (=project) has various resource allocations across filesets (enforced by quotas). Sometimes we need to roll back only certain allocations and leave others untouched. If no one else has encountered this need so far, I guess we twisted the model a bit too much. Maybe we can add this option to some list of desired new features for coming versions?... Thanks, Razvan

-- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 2:13 PM To: "Popescu, Razvan" Cc: gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default

Razvan, You are right, the command below yields an error (as it interprets fs9:fset7 as a user name). Apologies for the confusion. This is what I see in my system: # mmedquota -d -u pfs004 fs9:fset7 fs1 USR default quota is off fs9:fset7 is not valid user # I must have confused the command names in the previous note. Instead of the "mmedquota -d" command I meant "mmlsquota -d" (??) # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -d fs9:fset9 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset9 USR 102400 1048576 | 10000 22222 default on fs9 fset9 GRP 0 0 | 0 0 default off # mmlsquota -d -u fs9 Default Block Limits(KB) | Default File Limits Filesystem type quota limit | quota limit Remarks fs9 USR 102400 1048576 | 10000 0 # Upon further investigation, the current behavior of 'mmedquota -d -u ' is to restore the default quotas for the particular user on all filesystems and filesets.
The ability to restore the default limits of a user for selected filesets and filesystems is not available at the moment. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---01/07/2020 12:23:47 PM---Kuei, Thanks for replying. I did open a ticket before]"Popescu, Razvan" ---01/07/2020 12:23:47 PM---Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn?t yield a solution From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop , gpfsug main discussion list Date: 01/07/2020 12:23 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default ________________________________ Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn?t yield a solution. I just opened another one? My very specific question to you is: ?Where have you seen this particular syntax work?? The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: ?fs9:fset7?) and my testing rejected such a parameter. What made you think this would work? ? # mmedquota -d -u pfs004 fs9:fset7 ^^^^^^^^^^^^^ Many thanks! Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration, the code level to figure out whether what you are seeing is a problem or it is just an expected behavior or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario. 
Thanks,
Kuei
------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: gpfsug main discussion list
Date: 01/07/2020 11:32 AM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi Kuei-Yu (et al.),

Happy New Year!

I'd like to reiterate my follow-up question to your comments, in particular the line copied below, which mentions a behavior that I'm seeking for this command but cannot reproduce (in version 5.0.3 on Linux), namely (with my highlights):

# mmedquota -d -u pfs004 fs9:fset7    <=== run mmedquota -d -u to get default limits

The reduction of scope to a named filesystem/fileset is what I'm seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error and apply the user's default restoration to *all filesystems and filesets*.

Are you using a different version? Or a different implementation? I'm running SS 5.0.3 on Linux x64.

I apologize for having to press this point, but this matter is of a certain importance to us, and it appears that the public documentation is mum in this regard. Furthermore, IBM support, at the first pass, was quite confused and unfocused. (I'm trying with a new case now, but my hopes are low.) You seem to be an IBM insider, so I count on your help. Sorry for the insistence.

Best,
Razvan Popescu
Columbia Univ.
--

From: on behalf of Kuei-Yu Wang-Knop
Reply-To: gpfsug main discussion list
Date: Thursday, December 19, 2019 at 3:56 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

Razvan,

mmedquota -d -u fs:fset

-d   Reestablish default quota limits for a specific user, group, or fileset
     that had an explicit quota limit set by a previous invocation of the
     mmedquota command.

This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset".

You may need to play a little bit with your system to get the result, as you can have default quota per file system set and default quota per fileset enabled.

An example to illustrate: user pfs004 in filesystem fs9 and fileset fset7 has an explicit quota set:

# mmrepquota -u -v fs9 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e        <=== explicit

# mmlsquota -d fs9:fset7
                            Default Block Limits(KB) | Default File Limits
Filesystem  Fileset  type      quota      limit      |  quota  limit  entryType
fs9         fset7    USR      102400    1048576      |  10000      0  default on   <=== default quota limits for fs9:fset7
fs9         fset7    GRP           0          0      |      0      0  i

# mmlsquota -u pfs004 fs9:fset7
                            Block Limits                            |  File Limits
Filesystem  Fileset  type    KB    quota   limit   in_doubt  grace  |  files  quota  limit  in_doubt  grace  Remarks
fs9         fset7    USR   1088   102400 1048576         0   none   |     13  10000  33333         0  none            <=== explicit

# mmedquota -d -u pfs004 fs9:fset7    <=== run mmedquota -d -u to get default limits

# mmlsquota -u pfs004 fs9:fset7
                            Block Limits                            |  File Limits
Filesystem  Fileset  type    KB    quota   limit   in_doubt  grace  |  files  quota  limit  in_doubt  grace  Remarks
fs9         fset7    USR   1088   102400 1048576         0   none   |     13  10000      0         0  none            <=== takes the default value

# mmrepquota -u -v fs9:fset7 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset       <=== now user pfs004 in fset7 takes the default limits
#

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845)
433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: gpfsug main discussion list
Date: 12/19/2019 02:28 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org

I see. May I ask one follow-up question, please: what is "mmedquota -d -u" supposed to do in this case?

Really appreciate your assistance.
Razvan
--

From: on behalf of Kuei-Yu Wang-Knop
Reply-To: gpfsug main discussion list
Date: Thursday, December 19, 2019 at 2:25 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

>> To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here)

Currently there is no function to revert an explicit quota entry (e) to an initial (i) entry.

Kuei
------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: gpfsug main discussion list
Date: 12/19/2019 02:18 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Thanks for your kind reply. My problem is different though.

I have set a fileset default quota (doing all the steps you recommended) and all was OK.
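The entry-type behavior discussed in this thread can be sketched as a tiny state model. This is only our reading of the thread, not GPFS internals: `next_entry_type` is a name invented here, and the transitions encode exactly what was stated above (setting an explicit limit yields 'e'; `mmedquota -d` moves 'e' to 'd_fset' when a fileset default exists; nothing moves an entry back to 'i').

```python
# Conceptual model only (derived from this thread, not from GPFS source):
# 'i'      = initial entry (never explicitly set)
# 'e'      = explicit entry set via mmedquota/mmsetquota
# 'd_fset' = entry following the fileset's default limits
def next_entry_type(current, action, fileset_default_exists=True):
    if action == "set_explicit":        # mmedquota / mmsetquota on the user
        return "e"
    if action == "revert_default":      # mmedquota -d -u <user>
        if current == "e" and fileset_default_exists:
            return "d_fset"             # user now tracks the fileset default
        return current                  # e -> i is not available
    raise ValueError("unknown action: %s" % action)

print(next_entry_type("i", "set_explicit"))      # -> e
print(next_entry_type("e", "revert_default"))    # -> d_fset
```

The point of the model is the asymmetry: once an entry is explicit, the supported "revert" lands on d_fset (when a default exists), never back on the initial state.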
During operations I have edited *individual* quotas, for example to increase certain users' allocations. Now I want to *revert* (change back) one of these users to the (fileset) default quota!

For example, I have used one user account to test the mmedquota command, setting his limits to a certain value (just testing). I'd now like to make that user's quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in the fileset's default quota limits.

To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here)

mmedquota's "-d" option is supposed to reinstate the defaults, but it doesn't seem to work for fileset-based quotas!?!

Razvan
--

From: on behalf of Kuei-Yu Wang-Knop
Reply-To: gpfsug main discussion list
Date: Thursday, December 19, 2019 at 2:06 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

It sounds like you would like to have default per-fileset quota enabled. Have you tried to enable default quota on the filesets and then set the default quota limits for those filesets?

For example, take filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled:

# mmlsfs fs9 -Q --perfileset-quota
flag                value                    description
------------------- ------------------------ -----------------------------------
 -Q                 user;group;fileset       Quotas accounting enabled
                    user;fileset             Quotas enforced
                    user;group;fileset       Default quotas enabled
 --perfileset-quota Yes                      Per-fileset quota enforcement
#

Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9". Then set the default quota for this fileset using mmdefedquota:

# mmdefedquota -u fs9:fset9
..
*** Edit quota limits for USR DEFAULT entry for fileset fset9
NOTE: block limits will be rounded up to the next multiple of the block size.
block units may be: K, M, G, T or P, inode units may be: K, M or G.
fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K)
     inodes in use: 0, limits (soft = 10000, hard = 22222)
...

Hope that this helps.
------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: "gpfsug-discuss at spectrumscale.org"
Date: 12/19/2019 12:22 PM
Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi,

I'd like to revert a user's quota to the fileset's default, but "mmedquota -d -u" fails because I have not set a filesystem-wide default:

[root at xxx]# mmedquota -d -u user
gsb USR default quota is off

(Spectrum Scale 5.0.3 Standard Ed. on RHEL7 x86)

Is this a limitation of the current mmedquota implementation, or of something more profound?

I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn't seem useful, so I never defined one; however, I do have multiple fileset-level default quotas, and this is the level at which I'd like to be able to handle this matter.

Have I hit a limitation of the implementation? Any workaround, if that's the case?
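Before relying on fileset-level defaults as suggested above, it can help to confirm the prerequisites from the `mmlsfs <fs> -Q --perfileset-quota` listing quoted in this thread. The helper below is hypothetical (the name `perfileset_defaults_ready` and the exact output format are assumptions based on the sample output shown above), but it illustrates the two conditions to check: per-fileset quota enforcement is on, and fileset default quotas are enabled.

```python
# Hypothetical pre-flight check (output format assumed from the mmlsfs -Q
# sample quoted in this thread): True only if per-fileset quota enforcement
# is on AND fileset default quotas are enabled.
def perfileset_defaults_ready(mmlsfs_output):
    perfileset = defaults = False
    for line in mmlsfs_output.splitlines():
        if "--perfileset-quota" in line and "Yes" in line.split():
            perfileset = True
        if "Default quotas enabled" in line and "fileset" in line:
            defaults = True
    return perfileset and defaults

sample = """\
 -Q                 user;group;fileset       Quotas accounting enabled
                    user;fileset             Quotas enforced
                    user;group;fileset       Default quotas enabled
 --perfileset-quota Yes                      Per-fileset quota enforcement
"""
print(perfileset_defaults_ready(sample))   # -> True
```

If this returns False, mmdefedquota at the fileset level will not behave as hoped, which matches the "default quota is off" error seen in the original question.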
Many thanks,
Razvan Popescu
Columbia Business School

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From bbanister at jumptrading.com Tue Jan 7 19:40:43 2020
From: bbanister at jumptrading.com (Bryan Banister)
Date: Tue, 7 Jan 2020 19:40:43 +0000
Subject: [gpfsug-discuss] Spectrum Scale 5.0.5 Beta participation
In-Reply-To: <3A32DB49-9552-465F-9727-C8E661A7E6EC@us.ibm.com>
References: <3A32DB49-9552-465F-9727-C8E661A7E6EC@us.ibm.com>
Message-ID:

Hi Carl,

Without going through the form completely, is there a short breakdown of what features are available to test in the 5.0.5 beta?
-Bryan

-----Original Message-----
From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Carl Zetie - carlz at us.ibm.com
Sent: Tuesday, January 7, 2020 1:29 PM
To: gpfsug-discuss at spectrumscale.org
Subject: [gpfsug-discuss] Spectrum Scale 5.0.5 Beta participation

[EXTERNAL EMAIL]

We are accepting nominations for IBM Spectrum Scale 5.0.5 Beta participation here:

https://urldefense.com/v3/__https://www.surveygizmo.com/s3/5356255/ee853c3af96a__;!!GSt_xZU7050wKg!-45kSCmNkDN_VaPV5a_MRw-agaDN2iav0KlVEKh7tgnWfA2U0zeE7zenEXkA3iFaVHxF$

The Beta begins in mid-February. Please note that you'll need your IBM account rep to nominate you.
Carl Zetie
Program Director, Offering Management
Spectrum Scale & Spectrum Discover
(919) 473 3318 ][ Research Triangle Park
carlz at us.ibm.com

From kywang at us.ibm.com Tue Jan 7 19:50:31 2020
From: kywang at us.ibm.com (Kuei-Yu Wang-Knop)
Date: Tue, 7 Jan 2020 14:50:31 -0500
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To: <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu>
References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu>
Message-ID:

Razvan,

You can open an RFE (Request for Enhancement) for this issue if you would like this function to be considered for future versions.

Thanks,
Kuei
------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: Kuei-Yu Wang-Knop
Cc: gpfsug main discussion list
Date: 01/07/2020 02:39 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

Thank you very much, Kuei.

It's now clear where we stand, even though I would have liked to have that added selectivity in mmedquota. We use filesets to separate classes of projects and classes of storage (backup/noBackup, for example), and thus one user or one group (= project) has various resource allocations across filesets (enforced by quotas). Sometimes we need to roll back only certain allocations and leave others untouched. If no one else has encountered this need so far, I guess we twisted the model a bit too much. Maybe we can add this option to some list of desired new features for coming versions?
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From rp2927 at gsb.columbia.edu Tue Jan 7 19:51:19 2020
From: rp2927 at gsb.columbia.edu (Popescu, Razvan)
Date: Tue, 7 Jan 2020 19:51:19 +0000
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To:
References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu>
Message-ID:

How do I do that? (thanks!)

Razvan
--

From: Kuei-Yu Wang-Knop
Date: Tuesday, January 7, 2020 at 2:50 PM
To: "Popescu, Razvan"
Cc: gpfsug main discussion list
Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default

Razvan,

You can open an RFE (Request for Enhancement) for this issue if you would like this function to be considered for future versions.

Thanks,
Kuei
------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com
An example to illustrate: user pfs004 in filesystem fs9 and fileset fset7 has an explicit quota set:

# mmrepquota -u -v fs9 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e    <=== explicit

# mmlsquota -d fs9:fset7
                    Default Block Limits(KB) | Default File Limits
Filesystem Fileset  type   quota     limit   | quota  limit  entryType
fs9        fset7    USR   102400   1048576   | 10000      0  default on    <=== the default quota limits for fs9:fset7
fs9        fset7    GRP        0         0   |     0      0  i

# mmlsquota -u pfs004 fs9:fset7
                    Block Limits                               | File Limits
Filesystem Fileset  type   KB    quota    limit in_doubt grace | files quota limit in_doubt grace Remarks
fs9        fset7    USR  1088  102400  1048576        0 none   |    13 10000 33333        0 none    <=== explicit

# mmedquota -d -u pfs004 fs9:fset7    <=== run mmedquota -d -u to get default limits

# mmlsquota -u pfs004 fs9:fset7
                    Block Limits                               | File Limits
Filesystem Fileset  type   KB    quota    limit in_doubt grace | files quota limit in_doubt grace Remarks
fs9        fset7    USR  1088  102400  1048576        0 none   |    13 10000     0        0 none    <=== takes the default value

# mmrepquota -u -v fs9:fset7 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset    <=== now user pfs004 in fset7 takes the default limits
#

------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I see. May I ask one follow-up question, please: what is "mmedquota -d -u <user>" supposed to do in this case?
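[An editorial aside for readers following the entryType states in the example above: the last column of "mmrepquota -u -v" output distinguishes explicit ("e") entries from default ("d_fset") and initial ("i") entries, so the users who would be affected by a revert can be listed mechanically. A minimal Python sketch; the sample lines mimic the listing above and the column positions are illustrative, not guaranteed across Scale releases:]

```python
# Flag quota entries whose entryType is "e" (explicit), i.e. candidates
# for reverting to the fileset default. The sample text mimics the
# "mmrepquota -u -v" listing shown above (illustrative only).
SAMPLE = """\
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e
pfs005 fset7 USR  512 102400 1048576 0 none |  7 10000     0 0 none d_fset
"""

def explicit_entries(report):
    """Return (name, fileset) pairs whose last column is 'e'."""
    hits = []
    for line in report.splitlines():
        fields = line.split()
        if fields and fields[-1] == "e":
            hits.append((fields[0], fields[1]))
    return hits

print(explicit_entries(SAMPLE))  # [('pfs004', 'fset7')]
```

[On a live system the report text would come from running mmrepquota itself; header lines are skipped automatically because their last field is not "e".]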
Really appreciate your assistance. Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. (I hope I?m not talking nonsense here) Currently there is no function to revert an explicit quota entry (e) to initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different tho]"Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different though. From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was Ok. During operations I have edited *individual* quotas, for example to increase certain user?s allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota ! For example, I have used one user account to test the mmedquota command setting his limits to a certain value (just testing). I?d like now to make that user?s quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits. To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. (I hope I?m not talking nonsense here) mmedquota?s ?-d? 
option is supposed to reinstate the defaults, but it doesn?t seem to work for fileset based quotas ? !?! Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default perfileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled. # mmlsfs fs9 -Q --perfileset-quota flag value description ------------------- ------------------------ ----------------------------------- -Q user;group;fileset Quotas accounting enabled user;fileset Quotas enforced user;group;fileset Default quotas enabled --perfileset-quota Yes Per-fileset quota enforcement # Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9" Then set the default quota for this fileset using mmdefedquota" # mmdefedquota -u fs9:fset9 .. *** Edit quota limits for USR DEFAULT entry for fileset fset9 NOTE: block limits will be rounded up to the next multiple of the block size. block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset]"Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? 
From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? fails because I do have not set a filesystem default?. [root at xxx]# mmedquota -d -u user gsb USR default quota is off (SpectrumScale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn?t seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I?d like to be able to handle this matter? Have I hit a limitation of the implementation? Any workaround, if that?s the case? Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... 
From bipcuds at gmail.com Tue Jan 7 20:10:10 2020 From: bipcuds at gmail.com (Keith Ball) Date: Tue, 7 Jan 2020 15:10:10 -0500 Subject: [gpfsug-discuss] Grafana graph panels give "Cannot read property 'index' of undefined" using Grafana bridge Message-ID: Hi All, I am using the following combination of components on my GUI/pmcollector node:

- RHEL 7.3
- Spectrum Scale 4.2.3.5 (actually part of a Lenovo DSS release)
- gpfs.gss.pmcollector-4.2.3.5.el7.x86_64
- Python 3.6.8
- CherryPy 18.5.0
- Grafana bridge: no version actually appears in the python script, but a "buildDate.txt" file distributed with the bridge indicates "Thu Aug 16 10:48:21 CET 2016" (seems super-old for something downloaded in the last 2 months?). No other version info to be found in the script.

It appears that I can add the bridge as an OpenTSDB-like data source to Grafana successfully (the "Save & Test" says that it was successful and working). When I create a graph panel, I am getting completion for perfmon metrics/timeseries and tag/filter values (but not tag keys for some reason).
However, whether I try to create my own simple graph, or use the canned dashboards (on the Scale wiki), every panel gives the same error (exclamation point in the red triangle in the upper-left corner of the graph): Cannot read property 'index' of undefined. An example query would be for gpfs_fs_bytes_read, Aggregator=avg, Disable Downsampling, Filters:

cluster = literal_or(my.cluster.name), groupBy = false
filesystem = literal_or(homedirs), groupBy = false

Anyone know what exactly the "Cannot read property 'index' of undefined" really means (i.e. what is causing it), or has had to debug this on their own perfmon and Grafana setup? Am I using incompatible versions of components? I do not see anything that looks like error messages in the Grafana bridge log file, nor in the Grafana log file. Does anyone have anything to suggest? Many Thanks, Keith -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Jan 8 12:16:09 2020 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 8 Jan 2020 12:16:09 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Message-ID: On Tue, 2020-01-07 at 19:39 +0000, Popescu, Razvan wrote:
> Thank you very much, Kuei. It's now clear where we stand, even
> though I would have liked to have that added selectivity in
> mmedquota.
> Note in the meantime you could "simulate" this with a relatively simple script that grabs the quota information for the relevant user, uses mmsetquota to wipe all the quota information for the user and then some more mmsetquota to set all the ones you want. While not ideal the window of opportunity for the end user to exploit not having any quota's would be a matter of seconds. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From knop at us.ibm.com Wed Jan 8 13:29:57 2020 From: knop at us.ibm.com (Felipe Knop) Date: Wed, 8 Jan 2020 13:29:57 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: References: , <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu><794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu><746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu><770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu><4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu><8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.image001.gif at 01D5C569.EBA64CC0.gif Type: image/gif Size: 106 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.image002.gif at 01D5C569.EBA64CC0.gif Type: image/gif Size: 107 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.image003.gif at 01D5C569.EBA64CC0.gif Type: image/gif Size: 108 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.image004.gif at 01D5C569.EBA64CC0.gif Type: image/gif Size: 109 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
From heinrich.billich at id.ethz.ch Wed Jan 8 17:02:18 2020 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Wed, 8 Jan 2020 17:02:18 +0000 Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Message-ID: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> Hello, still new to AFM, so some basic questions on how recovery works for a SW cache: we have an AFM SW cache in recovery mode. Recovery first ran policies on the cache cluster, but now I see a "tcpcachescan" process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/.../recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up filelists before recovery starts to update home. I see some risk: in this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can't evict? I wonder:

* Is this to be expected and normal behavior? What to do about it?
* Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way?

Home is a gpfs cluster, hence we could easily produce the needed filelist on home with a policy scan in a few minutes. Thank you, I will welcome any clarification, advice or comments. Kind regards, Heiner
-- ======================= Heinrich Billich ETH Zürich Informatikdienste Tel.: +41 44 632 72 56 heinrich.billich at id.ethz.ch ======================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From lgayne at us.ibm.com Wed Jan 8 18:15:47 2020 From: lgayne at us.ibm.com (Lyle Gayne) Date: Wed, 8 Jan 2020 18:15:47 +0000 Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? In-Reply-To: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> Message-ID: An HTML attachment was scrubbed... URL: From rp2927 at gsb.columbia.edu Thu Jan 9 19:27:30 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Thu, 9 Jan 2020 19:27:30 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Message-ID: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Hi, I'm trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as "-L 5" listed all the scanned files but gave no info about which ones have been sent to the backup server. Is there a way (any way) to list the files selected by an incremental mmbackup run? Even using info from the Spectrum Protect server, if that's feasible. Thanks, Razvan N. Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business -------------- next part -------------- An HTML attachment was scrubbed... URL: From Rafael.Cezario at ibm.com Thu Jan 9 19:48:07 2020 From: Rafael.Cezario at ibm.com (Rafael Cezario) Date: Thu, 9 Jan 2020 16:48:07 -0300 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: Hello, it's possible.
From the server schedule do you get the log configuration: # dsmc query options For example: SCHEDLOGNAME: /var/log/tsm/dsmsched.log # mmbackup /FS -t incremental -N Server --backup-threads 12 -v -L 6 --tsm-servers server --scope filesystem After that, do you check your log file /var/log/tsm/dsmsched.log: 01/09/20 00:51:45 Retry # 2 Normal File--> 1,356,789 /File/agent.log [Sent] 01/09/20 00:51:45 Retry # 1 Normal File--> 5,120,062 /File/agent.log.1 [Sent] 01/09/20 00:51:46 Successful incremental backup of '/File' Regards, Rafael From: "Popescu, Razvan" To: gpfsug main discussion list Date: 09/01/2020 16:27 Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I?m trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as ?-L 5? listed all the scanned files but gave no info about which ones have been sent to the backup server. Is there a way (anyway) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that?s feasible. Thanks, Razvan N. Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=efzA7AwXTDdK-0_uRBnvcy8-s5uewdL51EO34qmTe0I&m=xrmNBKF1K7yQh6tWtHfPemfaWt1wOT7LtKK83BFKE7g&s=HYvQUEzWuxhpP9FtEHHhY4ZV-UsGMJpGjccLEVgcPfk&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From scale at us.ibm.com Thu Jan 9 20:24:36 2020 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Thu, 9 Jan 2020 15:24:36 -0500 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: Under the FS mount dir, or the $MMBACKUP_RECORD_ROOT dir if you set it, mmbackup creates the following files that contain all backup candidate files:

.mmbackupCfg/updatedFiles/.list*

By default, mmbackup deletes these files upon successful backup completion, but it keeps all temporary files until the next mmbackup invocation if DEBUGmmbackup=2 is set. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/09/2020 02:29 PM Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I'm trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as "-L 5" listed all the scanned files but gave no info about which ones have been sent to the backup server.
Is there a way (anyway) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that?s feasible. Thanks, Razvan N. Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=369ydzzb59Q4zfz0T74pkucjHcKuR63z0UAf2aMqAz0&s=3za7Rn3o9V7oajWNFe-U8PvMH8hQLUyVVrHuFCind0g&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From rp2927 at gsb.columbia.edu Thu Jan 9 21:00:59 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Thu, 9 Jan 2020 21:00:59 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Message-ID: Hi Jonathan, Thanks for you kind reply. Indeed, I can always do that. Best, Razvan -- ?On 1/8/20, 7:17 AM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Jonathan Buzzard" wrote: On Tue, 2020-01-07 at 19:39 +0000, Popescu, Razvan wrote: > Thank you very much, Kuei. It?s now clear where we stand, even > though I would have liked to have that added selectivity in > mmedquota. 
> Note in the meantime you could "simulate" this with a relatively simple script that grabs the quota information for the relevant user, uses mmsetquota to wipe all the quota information for the user and then some more mmsetquota to set all the ones you want. While not ideal the window of opportunity for the end user to exploit not having any quota's would be a matter of seconds. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From rp2927 at gsb.columbia.edu Thu Jan 9 21:19:40 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Thu, 9 Jan 2020 21:19:40 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: Thanks, I?ll set tonight?s run with that debug flag. Best, Razvan -- From: on behalf of IBM Spectrum Scale Reply-To: gpfsug main discussion list Date: Thursday, January 9, 2020 at 3:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Under FS mount dir or $MMBACKUP_RECORD_ROOT dir if you set, mmbackup creates the following file that contains all backup candidate files. .mmbackupCfg/updatedFiles/.list* As a default, mmbackup deletes the file upon successful backup completion but keeps all temporary files until next mmbackup invocation if DEBUGmmbackup=2 is set. 
Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. [Inactive hide details for "Popescu, Razvan" ---01/09/2020 02:29:25 PM---Hi, I?m trying to find out which files have been selec]"Popescu, Razvan" ---01/09/2020 02:29:25 PM---Hi, I?m trying to find out which files have been selected to be backed up by a run of (incremental) From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/09/2020 02:29 PM Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I?m trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as ?-L 5? listed all the scanned files but gave no info about which ones have been sent to the backup server. Is there a way (anyway) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that?s feasible. Thanks, Razvan N. 
Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: From rp2927 at gsb.columbia.edu Thu Jan 9 21:38:02 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Thu, 9 Jan 2020 21:38:02 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> Hi Rafael, This looks awesomely promising, but I can?t find the info your refer here. My SCHEDLOGNAME points to /root/dsmsched.log but there is no file by that name in /root. I have the error and instrumentation logs (dsmerror.log and dsminstr.log) per their options, but not the scheduler. Could it be because I don?t run mmbackup via the TSM scheduler ?! (I run it as a cronjob, inside a little wrapper that takes care of preparing/deleting a snapshot for it). Must I run the scheduler to log the activity of the client? Thanks, Razvan -- From: on behalf of Rafael Cezario Reply-To: gpfsug main discussion list Date: Thursday, January 9, 2020 at 2:48 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Hello, Its possible. 
From the server schedule do you get the log configuration: # dsmc query options For example: SCHEDLOGNAME: /var/log/tsm/dsmsched.log # mmbackup /FS -t incremental -N Server --backup-threads 12 -v -L 6 --tsm-servers server --scope filesystem After that, do you check your log file /var/log/tsm/dsmsched.log: 01/09/20 00:51:45 Retry # 2 Normal File--> 1,356,789 /File/agent.log [Sent] 01/09/20 00:51:45 Retry # 1 Normal File--> 5,120,062 /File/agent.log.1 [Sent] 01/09/20 00:51:46 Successful incremental backup of '/File' Regards, Rafael From: "Popescu, Razvan" To: gpfsug main discussion list Date: 09/01/2020 16:27 Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I?m trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as ?-L 5? listed all the scanned files but gave no info about which ones have been sent to the backup server. Is there a way (anyway) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that?s feasible. Thanks, Razvan N. Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From NSCHULD at de.ibm.com Fri Jan 10 09:00:53 2020 From: NSCHULD at de.ibm.com (Norbert Schuld) Date: Fri, 10 Jan 2020 10:00:53 +0100 Subject: [gpfsug-discuss] Grafana graph panels give "Cannot read property 'index' of undefined" using Grafana bridge In-Reply-To: References: Message-ID: Hello Keith, please check for more recent versions of the bridge here: https://github.com/IBM/ibm-spectrum-scale-bridge-for-grafana Also updating Grafana to some newer version could help; I found some older reports while searching for the error message. HTH Norbert From: Keith Ball To: gpfsug-discuss at spectrumscale.org Date: 07.01.2020 21:10 Subject: [EXTERNAL] [gpfsug-discuss] Grafana graph panels give "Cannot read property 'index' of undefined" using Grafana bridge Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi All, I am using the following combination of components on my GUI/pmcollector node: - RHEL 7.3 - Spectrum Scale 4.2.3.5 (actually part of a Lenovo DSS release) - gpfs.gss.pmcollector-4.2.3.5.el7.x86_64 - Python 3.6.8 - CherryPy 18.5.0 - Grafana bridge: no version actually appears in the python script, but a "buildDate.txt" file distributed with the bridge indicates "Thu Aug 16 10:48:21 CET 2016" (seems super-old for something downloaded in the last 2 months?). No other version info to be found in the script. It appears that I can add the bridge as an OpenTSDB-like data source to Grafana successfully (the "Save & Test" says that it was successful and working). When I create a graph panel, I am getting completion for perfmon metrics/timeseries and tag/filter values (but not tag keys for some reason). However, whether I try to create my own simple graph, or use the canned dashboards (on the Scale wiki), every panel gives the same error (exclamation point in the red triangle in the upper-left corner of the graph): "Cannot read property 'index' of undefined" An example query would be for gpfs_fs_bytes_read, Aggregator=avg, Disable Downsampling, Filters:
cluster = literal_or(my.cluster.name), groupBy = false; filesystem = literal_or(homedirs), groupBy = false. Anyone know what exactly the "Cannot read property 'index' of undefined" really means (i.e. what is causing it), or has had to debug this on their own perfmon and Grafana setup? Am I using incompatible versions of components? I do not see anything that looks like error messages in the Grafana bridge log file, nor in the Grafana log file. Does anyone have anything to suggest? Many Thanks, Keith _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From jonathan.buzzard at strath.ac.uk Fri Jan 10 10:17:25 2020 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Fri, 10 Jan 2020 10:17:25 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> Message-ID: On Thu, 2020-01-09 at 21:38 +0000, Popescu, Razvan wrote: > Hi Rafael, > > This looks awesomely promising, but I can't find the info you refer > to here. > My SCHEDLOGNAME points to /root/dsmsched.log but there is no > file by that name in /root. I have the error and instrumentation > logs (dsmerror.log and dsminstr.log) per their options, but not the > scheduler.
> > Could it be because I don't run mmbackup via the TSM scheduler?! > (I run it as a cronjob, inside a little wrapper that takes care of > preparing/deleting a snapshot for it). Must I run the scheduler to > log the activity of the client? > That is not a "recommended" way to do a TSM backup. You should use a schedule where the action is command. See https://www.ibm.com/support/knowledgecenter/SSEQVQ_8.1.0/srv.reference/r_cmd_schedule_client_define.html and then set the command to be your script. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From rp2927 at gsb.columbia.edu Fri Jan 10 15:17:50 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Fri, 10 Jan 2020 15:17:50 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> Message-ID: <81809034-B68D-4E85-8CDD-50B2FE063755@gsb.columbia.edu> Yes, I've seen the recommendation in the docs, but failed to see an obvious advantage for my case. I have 4 separate backup jobs (on the same client), for as many filesets, for which I can set separate schedules. I guess (?) I could do the same with the TSM scheduler, but it was simpler this way in the beginning when I set up the system, and nothing pushed me to change it since... Razvan -- On 1/10/20, 5:17 AM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Jonathan Buzzard" wrote: On Thu, 2020-01-09 at 21:38 +0000, Popescu, Razvan wrote: > Hi Rafael, > > This looks awesomely promising, but I can't find the info you refer > to here. > My SCHEDLOGNAME points to /root/dsmsched.log but there is no > file by that name in /root. I have the error and instrumentation > logs (dsmerror.log and dsminstr.log) per their options, but not the > scheduler.
> > Could it be because I don't run mmbackup via the TSM scheduler?! > (I run it as a cronjob, inside a little wrapper that takes care of > preparing/deleting a snapshot for it). Must I run the scheduler to > log the activity of the client? > That is not a "recommended" way to do a TSM backup. You should use a schedule where the action is command. See https://www.ibm.com/support/knowledgecenter/SSEQVQ_8.1.0/srv.reference/r_cmd_schedule_client_define.html and then set the command to be your script. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From vpuvvada at in.ibm.com Mon Jan 13 07:39:49 2020 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Mon, 13 Jan 2020 13:09:49 +0530 Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? In-Reply-To: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> Message-ID: AFM maintains an in-memory queue at the gateway node to keep track of changes happening on the fileset. If the in-memory queue is lost (memory pressure, daemon shutdown etc.), AFM runs a recovery process which involves creating a snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operations) dependency, any changes to the AFM fileset during the recovery won't get replicated until the recovery completion. AFM does the home directory scan only for dirty directories, to get the names of deleted and renamed files, because the old name of a renamed file and the name of a deleted file are not available in the cache on disk. A directory is made dirty when a rename or unlink operation is performed inside it.
In your case it may be that all the directories became dirty due to the rename/unlink operations. The AFM recovery process is single threaded. >Is this to be expected and normal behavior? What to do about it? >Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way? Only for the dirty directories, see above. >Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home, so that the home directory scan can be avoided. Some issues have already been fixed in this regard. What is the Scale version? https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436 ~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/08/2020 10:32 PM Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, still new to AFM, so some basic questions on how Recovery works for a SW cache: we have an AFM SW cache in recovery mode - recovery first ran policies on the cache cluster, but now I see a 'tspcachescan' process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/.../recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up the filelists before recovery starts to update home. I see some risk: in this time new changes pile up on cache.
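The rates quoted above can be sanity-checked with a little arithmetic (a sketch; the 70M entries, 8 hours and 100M files figures come from this post, nothing else is assumed):

```shell
# 70M homelist entries written in 8 hours ~= 2400+ lines/s,
# matching the ~2500 lines/s estimate in the post
entries=70000000
seconds=$((8 * 3600))
rate=$((entries / seconds))            # integer lines per second

# at ~2500 lines/s, listing 100M files on home needs ~11 hours
total_files=100000000
eta_hours=$((total_files / 2500 / 3600))

echo "$rate $eta_hours"                # prints: 2430 11
```

which is consistent with the "12 hours to build up the filelists" expectation stated above.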
Memory may become an issue? Cache may fill up and we can't evict? I wonder: Is this to be expected and normal behavior? What to do about it? Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way? Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. Thank you, I will welcome any clarification, advice or comments. Kind regards, Heiner . -- ======================= Heinrich Billich ETH Zürich Informatikdienste Tel.: +41 44 632 72 56 heinrich.billich at id.ethz.ch ======================== _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Mon Jan 13 09:11:39 2020 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Mon, 13 Jan 2020 10:11:39 +0100 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: On 09.01.20 22:19, Popescu, Razvan wrote: > Thanks, > > I'll set tonight's run with that debug flag. I have not tested this myself but if you enable auditlogging this should create according logs. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr.
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From heinrich.billich at id.ethz.ch Mon Jan 13 11:59:11 2020 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Mon, 13 Jan 2020 11:59:11 +0000 Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? In-Reply-To: References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> Message-ID: <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch> Hello Venkat, thank you, this seems to match our issue. I did trace tspcachescan and do see a long series of open()/read()/close() calls on the dirtyDirs file. The dirtyDirs file holds 130,000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing? tspcachescan does the following again and again for different directories: 11:11:36.837032 stat("/fs3101/XXXXX/.snapshots/XXXXX.afm.75872/yyyyy/yyyy", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0 11:11:36.837092 open("/var/mmfs/afm/fs3101-43/recovery/policylist.data.list.dirtyDirs", O_RDONLY) = 8 11:11:36.837127 fstat(8, {st_mode=S_IFREG|0600, st_size=32564140, ...}) = 0 11:11:36.837160 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fff96930000 11:11:36.837192 read(8, "539492355 65537 2795 553648131 "..., 8192) = 8192 11:11:36.837317 read(8, "Caches/com.apple.helpd/Generated"..., 8192) = 8192 11:11:36.837439 read(8, "ish\n539848852 1509237202 2795 5"..., 8192) = 8192 [many more reads] 11:11:36.864104 close(8) = 0 11:11:36.864135 munmap(0x3fff96930000, 8192) = 0 A single iteration takes about 27ms.
Doing this 130,000 times would be o.k., but if tspcachescan does it 80M times we wait 600 hours. Is there a way to estimate how many iterations tspcachescan will do? The cache fileset holds 140M inodes. At the moment all we can do is wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? There probably is no way to flush the pending queue entries while recovery is ongoing? I did open a case with IBM TS003219893 and will continue there. Kind regards, Heiner From: on behalf of Venkateswara R Puvvada Reply to: gpfsug main discussion list Date: Monday, 13 January 2020 at 08:40 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? AFM maintains an in-memory queue at the gateway node to keep track of changes happening on the fileset. If the in-memory queue is lost (memory pressure, daemon shutdown etc.), AFM runs a recovery process which involves creating a snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operations) dependency, any changes to the AFM fileset during the recovery won't get replicated until the recovery completion. AFM does the home directory scan only for dirty directories, to get the names of deleted and renamed files, because the old name of a renamed file and the name of a deleted file are not available in the cache on disk. A directory is made dirty when a rename or unlink operation is performed inside it. In your case it may be that all the directories became dirty due to the rename/unlink operations. The AFM recovery process is single threaded.
>Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home so that home directory scan can be avoided. These are some issues fixed in this regard. What is the scale version ? https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436 ~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/08/2020 10:32 PM Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hello, still new to AFM, so some basic question on how Recovery works for a SW cache: we have an AFM SW cache in recovery mode ? recovery first did run policies on the cache cluster, but now I see a ?tcpcachescan? process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8hours I see about 70M entries in the file /var/mmfs/afm/?/recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I?m not sure.) So I expect that 12hours pass to buildup filelists before recovery starts to update home. I see some risk: In this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can?t evict? I wonder * Is this to be expected and normal behavior? What to do about it? * Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very unpractical, or is there some better way? 
Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. Thank you, I will welcome any clarification, advice or comments. Kind regards, Heiner . -- ======================= Heinrich Billich ETH Zürich Informatikdienste Tel.: +41 44 632 72 56 heinrich.billich at id.ethz.ch ======================== _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From rp2927 at gsb.columbia.edu Mon Jan 13 16:02:45 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Mon, 13 Jan 2020 16:02:45 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: <5709E6AE-5DD1-46A1-A1B7-C24BF6FFAF84@gsb.columbia.edu> Thanks Uli, I ran the backup with the flag mentioned by {the GPFS team} (thanks again, guys!!) and found the internal list files -- all super fine. I plan to keep that flag in place for a while, to have that info when I might need it (the large files that kept being backed up, and that I wanted to trace, just disappeared...) Razvan -- On 1/13/20, 4:11 AM, "Ulrich Sibiller" wrote: On 09.01.20 22:19, Popescu, Razvan wrote: > Thanks, > > I'll set tonight's run with that debug flag. I have not tested this myself but if you enable auditlogging this should create according logs. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr.
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From neil.wilson at metoffice.gov.uk Tue Jan 14 15:27:54 2020 From: neil.wilson at metoffice.gov.uk (Wilson, Neil) Date: Tue, 14 Jan 2020 15:27:54 +0000 Subject: [gpfsug-discuss] mmapplypolicy - listing policy gets occasionally get stuck. Message-ID: Hi All, We are occasionally seeing an issue where an mmapplypolicy list job gets stuck; all it's doing is generating a listing from a fileset. The problem occurs intermittently and doesn't seem to show any particular pattern (i.e. not always on the same fileset). The policy job shows the usual output but then outputs the following until the process is killed. [I] 2020-01-08@03:05:30.471 Directory entries scanned: 0. [I] 2020-01-08@03:05:45.471 Directory entries scanned: 0. [I] 2020-01-08@03:06:00.472 Directory entries scanned: 0. [I] 2020-01-08@03:06:15.472 Directory entries scanned: 0. [I] 2020-01-08@03:06:30.473 Directory entries scanned: 0. [I] 2020-01-08@03:06:45.473 Directory entries scanned: 0. [I] 2020-01-08@03:07:00.473 Directory entries scanned: 0. [I] 2020-01-08@03:07:15.473 Directory entries scanned: 0. [I] 2020-01-08@03:07:30.475 Directory entries scanned: 0. [I] 2020-01-08@03:07:45.475 Directory entries scanned: 0. [I] 2020-01-08@03:08:00.475 Directory entries scanned: 0. [I] 2020-01-08@03:08:15.475 Directory entries scanned: 0. [I] 2020-01-08@03:08:30.476 Directory entries scanned: 0. [I] 2020-01-08@03:08:45.476 Directory entries scanned: 0. [I] 2020-01-08@03:09:00.477 Directory entries scanned: 0. [I] 2020-01-08@03:09:15.477 Directory entries scanned: 0.
[I] 2020-01-08@03:09:30.478 Directory entries scanned: 0. [I] 2020-01-08@03:09:45.478 Directory entries scanned: 0. [I] 2020-01-08@03:10:00.478 Directory entries scanned: 0. [I] 2020-01-08@03:10:15.478 Directory entries scanned: 0. [I] 2020-01-08@03:10:30.479 Directory entries scanned: 0. [I] 2020-01-08@03:10:45.480 Directory entries scanned: 0. [I] 2020-01-08@03:11:00.481 Directory entries scanned: 0. Have any of you come across an issue like this before? Kind regards Neil Neil Wilson Senior IT Practitioner Storage, Virtualisation and Mainframe Team IT Services Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom From vpuvvada at in.ibm.com Tue Jan 14 16:50:17 2020 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 14 Jan 2020 22:20:17 +0530 Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? In-Reply-To: <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch> References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch> Message-ID: Hi, >The dirtyDirs file holds 130,000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing? Yes, this is the major problem fixed as mentioned in the APAR below. The dirtyDirs file is opened for each entry in the dirtyDirDirents file, and this causes the performance overhead. >At the moment all we can do is to wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? >There probably is no way to flush the pending queue entries while recovery is ongoing? Later versions have the fix mentioned in that APAR, and I believe it should fix your current performance issue. Flushing the pending queue entries is not available as of today (5.0.4); we are currently working on this feature.
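To see why this APAR matters, one can plug in the numbers from this thread (a sketch; it assumes the ~27ms per pass over dirtyDirs measured earlier, 80M dirtyDirDirents entries, and 130,000 dirty directories):

```shell
ms_per_pass=27

# one pass over dirtyDirs per dirtyDirDirents entry (pre-fix behaviour):
hours_per_dirent=$((80000000 * ms_per_pass / 1000 / 3600))

# one pass per dirty directory only (the fixed behaviour):
minutes_per_dir=$((130000 * ms_per_pass / 1000 / 60))

echo "$hours_per_dirent $minutes_per_dir"   # prints: 600 58
```

That is, roughly 600 hours without the fix versus about an hour with it, matching the 600-hour estimate given earlier in the thread.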
~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/13/2020 05:29 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello Venkat, thank you, this seems to match our issue. I did trace tspcachescan and do see a long series of open()/read()/close() calls on the dirtyDirs file. The dirtyDirs file holds 130,000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing? tspcachescan does the following again and again for different directories: 11:11:36.837032 stat("/fs3101/XXXXX/.snapshots/XXXXX.afm.75872/yyyyy/yyyy", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0 11:11:36.837092 open("/var/mmfs/afm/fs3101-43/recovery/policylist.data.list.dirtyDirs", O_RDONLY) = 8 11:11:36.837127 fstat(8, {st_mode=S_IFREG|0600, st_size=32564140, ...}) = 0 11:11:36.837160 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fff96930000 11:11:36.837192 read(8, "539492355 65537 2795 553648131 "..., 8192) = 8192 11:11:36.837317 read(8, "Caches/com.apple.helpd/Generated"..., 8192) = 8192 11:11:36.837439 read(8, "ish\n539848852 1509237202 2795 5"..., 8192) = 8192 [many more reads] 11:11:36.864104 close(8) = 0 11:11:36.864135 munmap(0x3fff96930000, 8192) = 0 A single iteration takes about 27ms. Doing this 130,000 times would be o.k., but if tspcachescan does it 80M times we wait 600 hours. Is there a way to estimate how many iterations tspcachescan will do? The cache fileset holds 140M inodes. At the moment all we can do is wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? There probably is no way to flush the pending queue entries while recovery is ongoing? I did open a case with IBM TS003219893 and will continue there.
Kind regards, Heiner From: on behalf of Venkateswara R Puvvada Reply to: gpfsug main discussion list Date: Monday, 13 January 2020 at 08:40 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? AFM maintains an in-memory queue at the gateway node to keep track of changes happening on the fileset. If the in-memory queue is lost (memory pressure, daemon shutdown etc.), AFM runs a recovery process which involves creating a snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operations) dependency, any changes to the AFM fileset during the recovery won't get replicated until the recovery completion. AFM does the home directory scan only for dirty directories, to get the names of deleted and renamed files, because the old name of a renamed file and the name of a deleted file are not available in the cache on disk. A directory is made dirty when a rename or unlink operation is performed inside it. In your case it may be that all the directories became dirty due to the rename/unlink operations. The AFM recovery process is single threaded. >Is this to be expected and normal behavior? What to do about it? >Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way? Only for the dirty directories, see above. >Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home, so that the home directory scan can be avoided. Some issues have already been fixed in this regard. What is the Scale version?
https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436 ~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/08/2020 10:32 PM Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, still new to AFM, so some basic questions on how Recovery works for a SW cache: we have an AFM SW cache in recovery mode - recovery first ran policies on the cache cluster, but now I see a 'tspcachescan' process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/.../recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up the filelists before recovery starts to update home. I see some risk: in this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can't evict? I wonder: Is this to be expected and normal behavior? What to do about it? Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way? Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. Thank you, I will welcome any clarification, advice or comments. Kind regards, Heiner .
-- ======================= Heinrich Billich ETH Zürich Informatikdienste Tel.: +41 44 632 72 56 heinrich.billich at id.ethz.ch ======================== _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Tue Jan 14 18:35:05 2020 From: stockf at us.ibm.com (Frederick Stock) Date: Tue, 14 Jan 2020 18:35:05 +0000 Subject: [gpfsug-discuss] mmapplypolicy - listing policy gets occasionally get stuck. In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Jan 14 20:21:12 2020 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Tue, 14 Jan 2020 20:21:12 +0000 Subject: [gpfsug-discuss] London User Group Message-ID: Hi All, Just a date for your diary, the UK/WW/London user group will be taking place 13th/14th May. In addition to this, we're also running an introductory day on 12th May for those recently acquainted with Spectrum Scale. Please mark the dates in your diary! If you have any topics you would like to hear about in London (or any of the other WW user groups) please let me know. Please also take some time to think about if you could provide a site-update or user talk for the event. The feedback we get is that people want to hear more of these, but we can only do this if you are prepared to volunteer a talk.
Everyone has something to say about their site deployment, maybe you want to talk about what you are doing with Scale, how you found deployment, or the challenges you face. Finally, as in the past few years, we are looking for sponsors of the UK event; this funds our evening social/networking event, which has been a great success over the past few years as the group has grown in size. I will be contacting companies who have supported us in the past, but please also drop me an email if you are interested in sponsoring the group and I will ensure I share the details of the sponsorship offering with you - when we advertise sponsorship, it will be offered on a first come, first served basis. Thanks Simon (UK/group chair) -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Jan 14 20:25:20 2020 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Tue, 14 Jan 2020 20:25:20 +0000 Subject: [gpfsug-discuss] #spectrum-discover Slack channel Message-ID: We've today added a new Slack channel to the SSUG/PowerAI ug slack community, '#spectrum-discover'. Whilst we know that a lot of the people using Spectrum Discover are Spectrum Scale users, we welcome all discussion of Discover on the Slack channel, not just those using Spectrum Scale. As with the #spectrum-scale and #powerai channels, IBM are working to ensure there are appropriate people on the channel to help with discussion/queries. If you are not already a member of the Slack community, please visit www.spectrumscaleug.org/join for details. Thanks Simon (UK/chair) -------------- next part -------------- An HTML attachment was scrubbed... URL: From heinrich.billich at id.ethz.ch Wed Jan 15 14:55:53 2020 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Wed, 15 Jan 2020 14:55:53 +0000 Subject: [gpfsug-discuss] How to install efix with yum ?
Message-ID: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> Hello, I will install efix9 on 5.0.4.1. The instructions ask to use rpm --force -U gpfs.*.rpm but give no yum command. I assume that this is not specific to this efix. I wonder if installing an efix with yum is supported and what the proper commands are? Using yum would make deployment much easier, but I don't see any yum options which match rpm's '--force' option. --force Same as using --replacepkgs, --replacefiles, and --oldpackage. Yum's 'upgrade' probably is the same as rpm's '--oldpackage', but what about '--replacepkgs' and '--replacefiles'? Of course I can script this in several ways, but using yum should be much easier. Thank you, any comments are welcome. Cheers, Heiner From Paul.Sanchez at deshaw.com Wed Jan 15 18:30:26 2020 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Wed, 15 Jan 2020 18:30:26 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> Message-ID: <8ee7ad0a895442abb843688936ac4d73@deshaw.com> Yum generally only wants there to be a single version of any package (it is trying to eliminate conflicting provides/depends so that all of the packaging requirements are satisfied). So this alien packaging practice of installing an efix version of a package over the top of the base version is not compatible with yum. The real issue for draconian sysadmins like us (whose systems must use and obey yum) is that there are files (*liblum.so) which are provided by the non-efix RPMs, but are not owned by the packages according to the RPM database, since they're purposefully installed outside of RPM's tracking mechanism.
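As an aside on the --force question above: rpm rolls three behaviors into that one switch, while yum splits them across separate verbs. A rough sketch of choosing the right verb, purely an illustration and not an IBM-documented procedure (the helper function and the version strings are assumptions for the example):

```shell
#!/usr/bin/env bash
# Sketch: map "rpm --force -U" onto yum's separate verbs. Hypothetical helper,
# not an IBM-documented procedure; package names and versions are examples.
#
#   rpm --force -U gpfs.*.rpm   # what the efix README documents
#
# yum has no single --force equivalent; the verb depends on what is installed:
#   yum upgrade   <rpms>   # candidate is newer   (plain -U)
#   yum reinstall <rpms>   # same version-release (--replacepkgs/--replacefiles)
#   yum downgrade <rpms>   # candidate is older   (--oldpackage)

yum_verb_for() {  # usage: yum_verb_for <installed-ver-rel> <candidate-ver-rel>
  if [ "$1" = "$2" ]; then
    echo reinstall
    return
  fi
  # GNU "sort -V" orders version strings; the candidate is an upgrade when it
  # sorts last. (Real rpm version comparison differs in some corner cases.)
  newest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)
  if [ "$newest" = "$2" ]; then
    echo upgrade
  else
    echo downgrade
  fi
}

# e.g. on a node that already has gpfs.base installed:
#   verb=$(yum_verb_for "$(rpm -q --qf '%{VERSION}-%{RELEASE}' gpfs.base)" "5.0.4-2")
#   yum "$verb" gpfs.*.rpm
```

The commented rpm/yum invocations are left inert on purpose; only the pure version-comparison helper runs.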
We work around this by repackaging the three affected RPMs to include the orphaned files from the original RPMs (and eliminating the related but problematic checks from the RPMs' scripts) so that our efix RPMs have been "un-efix-ified" and will install as expected when using "yum upgrade". To my knowledge no one's published a way to do this, so we all just have to figure this out and run rpmrebuild for ourselves. IBM isn't the only vendor who is "bad at packaging" from a sysadmin's point of view, but they are the only one which owns Red Hat (who are the de facto masters of RPM/YUM/DNF packaging), so this should probably get better one day. Thx Paul From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Billich Heinrich Rainer (ID SD) Sent: Wednesday, January 15, 2020 09:56 To: gpfsug main discussion list Subject: [gpfsug-discuss] How to install efix with yum ? From jonathan.buzzard at strath.ac.uk Wed Jan 15 19:10:20 2020 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 15 Jan 2020 19:10:20 +0000 Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: <8ee7ad0a895442abb843688936ac4d73@deshaw.com> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> Message-ID: <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be a single version of any package (it > is trying to eliminate conflicting provides/depends so that all of the > packaging requirements are satisfied). So this alien packaging practice > of installing an efix version of a package over the top of the base > version is not compatible with yum. I would at this juncture note that IBM should be appending the efix number to the RPM so that, for example, gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9, which would firstly make the problem go away, and secondly would allow one to know which version of GPFS you happen to have installed on a node without doing some sort of voodoo. > > The real issue for draconian sysadmins like us (whose systems must use > and obey yum) is that there are files (*liblum.so) which are provided by > the non-efix RPMs, but are not owned by the packages according to the > RPM database since they're purposefully installed outside of RPM's > tracking mechanism. > It's worse than that, because if you install the RPM directly, yum/dnf then start bitching about the RPM database being modified outside of themselves, and all sorts of useful information gets lost when you purge the package installation history to make the error go away. > We work around this by repackaging the three affected RPMs to include > the orphaned files from the original RPMs (and eliminating the related > but problematic checks from the RPMs' scripts) so that our efix RPMs > have been "un-efix-ified" and will install as expected when using "yum > upgrade". To my knowledge no one's published a way to do this, so we > all just have to figure this out and run rpmrebuild for ourselves. > IBM should be hanging their heads in shame if the replacement RPM is missing files. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From kkr at lbl.gov Wed Jan 15 18:20:04 2020 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Wed, 15 Jan 2020 10:20:04 -0800 Subject: [gpfsug-discuss] (Please help with) Planning US meeting for Spring 2020 In-Reply-To: References: <42F45E03-0AEC-422C-B3A9-4B5A21B1D8DF@lbl.gov> Message-ID: Now there are 27 wonderful people who have completed the poll. I will close it today, EOB. Please take the 2 minutes to fill it out before it closes. https://forms.gle/NFk5q4djJWvmDurW7 Thanks, Kristy > On Jan 6, 2020, at 3:41 PM, Kristy Kallback-Rose wrote: > > Thank you to the 18 wonderful people who filled out the survey. > > However, there are well more than 18 people at any given UG meeting. > > Please submit your responses today, I promise, it's really short and even painless. 2020 (how did *that* happen?!) is here, we need to plan the next meeting > > Happy New Year. > > Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7 > > Thanks, > Kristy > >> On Dec 16, 2019, at 11:05 AM, Kristy Kallback-Rose wrote: >> >> Hello, >> >> It's time already to plan for the next US event. We have a quick, seriously, should take order of 2 minutes, survey to capture your thoughts on location and date. It would help us greatly if you can please fill it out. >> >> Best wishes to all in the new year. >> >> -Kristy >> >> >> Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7 From scale at us.ibm.com Wed Jan 15 20:59:33 2020 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 15 Jan 2020 15:59:33 -0500 Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: >> I don't see any yum options which match rpm's '--force' option. Actually, you do not need to use the --force option, since efix RPMs have an incremental efix number in the rpm name. The efix package provides update RPMs to be installed on top of the corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Jonathan Buzzard To: "gpfsug-discuss at spectrumscale.org" Date: 01/15/2020 02:09 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? Sent by: gpfsug-discuss-bounces at spectrumscale.org From novosirj at rutgers.edu Wed Jan 15 21:10:59 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Wed, 15 Jan 2020 21:10:59 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests Message-ID: Hi there, I know some of the Spectrum Scale developers look at this list. I'm having a little trouble with support on this problem. We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM guests with a portability layer that has been installed via gpfs.gplbin RPMs that we built at our site and have used to install GPFS all over our environment. We've not seen this problem so far on any physical hosts, but have now experienced it on guests running on a number of our KVM hypervisors, across vendors and firmware versions, etc. At one time I thought it was all happening on systems using Mellanox virtual functions for Infiniband, but we've now seen it on VMs without VFs. There may be an SELinux interaction, but some of our hosts have it disabled outright, some are Permissive, and some were working successfully with 5.0.2.x GPFS. What I've been instructed to try to solve this problem has been to run "mmbuildgpl", and it has solved the problem. I don't consider running "mmbuildgpl" a real solution, however. If RPMs are a supported means of installation, it should work.
Support told me that they'd seen this solve the problem at another site as well. Does anyone have any more information about this problem, whether there's a fix in the pipeline, or something we might be doing to cause this problem that we could remedy? Is there an easy place to see a list of eFixes to see if this has come up? I know it's very similar to a problem that happened, I believe, after 5.0.2.2 and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x. Below is a sample of the crash output: [ 156.733477] kernel BUG at mm/slub.c:3772! [ 156.734212] invalid opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 [ 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: ffff8ae9f7a24000 [ 156.759326] RIP:
0010:[] [] kfree+0x13c/0x140 [ 156.760749] RSP: 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ 156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: ffff8ae9f7a27938 [ 156.768229] FS: 00007f8ffff05700(0000) GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ 156.775154] [] cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] [] _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvPP10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 [mmfs26] [ 156.779378] [] _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_PcjjjP10ext_cred_t+0x46a/0x7e0 [mmfs26] [ 156.781689] [] ? _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 [mmfs26] [ 156.783565] [] _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 [mmfs26] [ 156.786228] [] _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7FilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 [mmfs26] [ 156.788681] [] ? _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 [mmfs26] [ 156.790448] [] _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVattr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 [mmfs26] [ 156.793032] [] ? 
_Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 [mmfslinux] [ 156.795838] [] ? _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6c0 [mmfs26] [ 156.797753] [] ? __d_alloc+0x122/0x180 [ 156.798763] [] ? d_alloc+0x60/0x70 [ 156.799700] [] lookup_real+0x23/0x60 [ 156.800651] [] __lookup_hash+0x42/0x60 [ 156.801675] [] lookup_slow+0x42/0xa7 [ 156.802634] [] link_path_walk+0x80f/0x8b0 [ 156.803666] [] path_lookupat+0x7a/0x8b0 [ 156.804690] [] ? lru_cache_add+0xe/0x10 [ 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ 156.806766] [] ? getname_flags+0x4f/0x1a0 [ 156.807817] [] filename_lookup+0x2b/0xc0 [ 156.808834] [] user_path_at_empty+0x67/0xc0 [ 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ 156.811017] [] user_path_at+0x11/0x20 [ 156.811983] [] vfs_fstatat+0x63/0xc0 [ 156.812951] [] SYSC_newstat+0x2e/0x60 [ 156.813931] [] ? trace_do_page_fault+0x56/0x150 [ 156.815050] [] SyS_newstat+0xe/0x10 [ 156.816010] [] system_call_fastpath+0x25/0x2a [ 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 [ 156.822192] RIP [] kfree+0x13c/0x140 [ 156.823180] RSP [ 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) -- ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. 
Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' From Paul.Sanchez at deshaw.com Wed Jan 15 22:35:23 2020 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Wed, 15 Jan 2020 22:35:23 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch><8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: This reminds me that there is one more thing which drives the convoluted process I described earlier... Automation. Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base version and efix RPMs, and in that order. IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal yum that installing the efix requires that the base version be installed first. (Our particular case is worse than just this, though, since we prohibit installing two versions/releases for the same (non-kernel) package name. But that's not the case for everyone.) -Paul From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale Sent: Wednesday, January 15, 2020 16:00 To: gpfsug main discussion list Cc: gpfsug-discuss-bounces at spectrumscale.org Subject: Re: [gpfsug-discuss] How to install efix with yum ? >> I don't see any yum options which match rpm's '--force' option. Actually, you do not need to use the --force option, since efix RPMs have an incremental efix number in the rpm name. The efix package provides update RPMs to be installed on top of the corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work.
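The missing linkage Paul describes is, mechanically, just an absent versioned dependency. A hypothetical spec-file fragment (purely illustrative; the package name, release scheme, and dependency line are assumptions, not IBM's actual packaging) of what a yum-orderable efix could declare:

```spec
# Hypothetical efix packaging (illustrative only; not IBM's actual spec).
# Shipping the efix under its own name with a versioned dependency would let
# yum resolve and order the install: base PTF first, then the efix.
Name:     gpfs.base-efix
Version:  5.0.4
Release:  1.efix9
Requires: gpfs.base = 5.0.4-1
```

With metadata like this, asking yum for the efix package on a bare host would pull in the matching base PTF automatically, instead of installing the efix alone or failing.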
Regards, The Spectrum Scale (GPFS) team From scale at us.ibm.com Wed Jan 15 23:50:50 2020 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 15 Jan 2020 18:50:50 -0500 Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch><8ee7ad0a895442abb843688936ac4d73@deshaw.com><3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: When requesting an efix, you can inform the service personnel that you need efix RPMs which don't have dependencies on the base version. Our service team should be able to provide the appropriate efix RPMs that meet your needs. Regards, The Spectrum Scale (GPFS) team gpfsug-discuss-bounces at spectrumscale.org wrote on 01/15/2020 05:35:23 PM: > From: "Sanchez, Paul" > To: gpfsug main discussion list > Cc: "gpfsug-discuss-bounces at spectrumscale.org" > Date: 01/15/2020 05:34 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ?
IIRC, there were no RPM dependencies connecting the > efix RPMs to their base-version equivalents, so there was nothing to > signal YUM that installing the efix requires that the base-version > be installed first. > > (Our particular case is worse than just this though, since we > prohibit installing two versions/releases for the same (non-kernel) > package name. But that?s not the case for everyone.) > > -Paul > > From: gpfsug-discuss-bounces at spectrumscale.org bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale > Sent: Wednesday, January 15, 2020 16:00 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > This message was sent by an external party. > > >> I don't see any yum options which match rpm's '--force' option. > Actually, you do not need to use --force option since efix RPMs have > incremental efix number in rpm name. > > Efix package provides update RPMs to be installed on top of > corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5. > 0.4.1 is already installed on your system, "yum update" should work. > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum > Scale (GPFS), then please post it to the public IBM developerWroks Forum at > https://www.ibm.com/developerworks/community/forums/html/forum? > id=11111111-0000-0000-0000-000000000479. > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please > contact 1-800-237-5511 in the United States or your local IBM > Service Center in other countries. > > The forum is informally monitored as time permits and should not be > used for priority messages to the Spectrum Scale (GPFS) team. 
> > [image removed] Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/ > 01/2020 18:30, Sanchez, Paul wrote: > Yum generally only wants there > to be single version of a > > From: Jonathan Buzzard > To: "gpfsug-discuss at spectrumscale.org" > Date: 01/15/2020 02:09 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > Yum generally only wants there to be single version of any package (it > > is trying to eliminate conflicting provides/depends so that all of the > > packaging requirements are satisfied). So this alien packaging practice > > of installing an efix version of a package over the top of the base > > version is not compatible with yum. > > I would at this juncture note that IBM should be appending the efix > number to the RPM so that for example > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > which would firstly make the problem go away, and second would allow one > to know which version of GPFS you happen to have installed on a node > without doing some sort of voodoo. > > > > > The real issue for draconian sysadmins like us (whose systems must use > > and obey yum) is that there are files (*liblum.so) which are provided by > > the non-efix RPMS, but are not owned by the packages according to the > > RPM database since they?re purposefully installed outside of RPM?s > > tracking mechanism. > > > > It worse than that because if you install the RPM directly yum/dnf then > start bitching about the RPM database being modified outside of > themselves and all sorts of useful information gets lost when you purge > the package installation history to make the error go away. > > > We work around this by repackaging the three affected RPMS to include > > the orphaned files from the original RPMs (and eliminating the related > > but problematic checks from the RPMs? 
scripts) so that our efix RPMs > > have been ?un-efix-ified? and will install as expected when using ?yum > > upgrade?. To my knowledge no one?s published a way to do this, so we > > all just have to figure this out and run rpmrebuild for ourselves. > > > > IBM should be hanging their heads in shame if the replacement RPM is > missing files. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url? > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=aSasL0r- > NxIT9nDrkoQO6rcyV88VUM_oc6mYssN-_Ng&s=4- > wB8cR24x2P7Rpn_14fIXuwxCvvqwne7xcIp85dZoI&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From frankli at us.ibm.com Thu Jan 16 04:47:43 2020 From: frankli at us.ibm.com (Frank N Lee) Date: Wed, 15 Jan 2020 22:47:43 -0600 Subject: [gpfsug-discuss] #spectrum-discover Slack channel In-Reply-To: References: Message-ID: Simon, Thanks for launching this Slack channel! Copying some of my colleagues who works with Discover. 
Frank Frank Lee, PhD IBM Systems Group 314-482-5329 | @drfranknlee From phgrau at zedat.fu-berlin.de Thu Jan 16 10:51:43 2020 From: phgrau at zedat.fu-berlin.de (Philipp Grau) Date: Thu, 16 Jan 2020 11:51:43 +0100 Subject: [gpfsug-discuss] Welcome to the "gpfsug-discuss" mailing list In-Reply-To: References: Message-ID: <20200116105143.GA278757@CIS.FU-Berlin.DE> Hello, as requested: * gpfsug-discuss-request at spectrumscale.org [15.01.20 13:40]: > Please introduce yourself to the members with your first post. I'm Philipp from Berlin, Germany.
The IT department of the "Freie Universität Berlin" is my workplace. We have a DDN system with some PB of storage, and GPFS nodes for exporting the space. The use case is "scientific storage": research data and the like (no home or group shares). Regards, Philipp From knop at us.ibm.com Thu Jan 16 13:41:58 2020 From: knop at us.ibm.com (Felipe Knop) Date: Thu, 16 Jan 2020 13:41:58 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From skylar2 at uw.edu Thu Jan 16 15:32:27 2020 From: skylar2 at uw.edu (Skylar Thompson) Date: Thu, 16 Jan 2020 15:32:27 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Another problem we've run into with automating GPFS installs/upgrades is that the gplbin (kernel module) packages have a post-install script that will unmount the filesystem *even if the package isn't for the running kernel*. We needed to write some custom reporting in our configuration management system to only install gplbin if GPFS was already stopped on the node. On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: > This reminds me that there is one more thing which drives the convoluted process I described earlier... > > Automation. Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base-version and efix RPMS and in that order. 
IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal YUM that installing the efix requires that the base-version be installed first. > > (Our particular case is worse than just this though, since we prohibit installing two versions/releases for the same (non-kernel) package name. But that's not the case for everyone.) > > -Paul > > From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale > Sent: Wednesday, January 15, 2020 16:00 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > > This message was sent by an external party. > > > >> I don't see any yum options which match rpm's '--force' option. > Actually, you do not need to use --force option since efix RPMs have incremental efix number in rpm name. > > Efix package provides update RPMs to be installed on top of corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. 
> > [Inactive hide details for Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generall]Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be single version of a > > From: Jonathan Buzzard > > To: "gpfsug-discuss at spectrumscale.org" > > Date: 01/15/2020 02:09 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ________________________________ > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > Yum generally only wants there to be single version of any package (it > > is trying to eliminate conflicting provides/depends so that all of the > > packaging requirements are satisfied). So this alien packaging practice > > of installing an efix version of a package over the top of the base > > version is not compatible with yum. > > I would at this juncture note that IBM should be appending the efix > number to the RPM so that for example > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > which would firstly make the problem go away, and second would allow one > to know which version of GPFS you happen to have installed on a node > without doing some sort of voodoo. > > > > > The real issue for draconian sysadmins like us (whose systems must use > > and obey yum) is that there are files (*liblum.so) which are provided by > > the non-efix RPMS, but are not owned by the packages according to the > > RPM database since they're purposefully installed outside of RPM's > > tracking mechanism. > > > > It's worse than that because if you install the RPM directly yum/dnf then > start bitching about the RPM database being modified outside of > themselves and all sorts of useful information gets lost when you purge > the package installation history to make the error go away. 
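(Editorial aside on Jonathan's naming suggestion quoted above: the ordering it relies on can be sanity-checked with plain coreutils. `sort -V` only approximates rpm's real version comparison, so treat this as a rough sketch with hypothetical package names, but it shows why a release tagged "1efix9" would be seen as newer than "1" and so be picked up by an ordinary "yum update".)

```shell
# Hypothetical RPM names following the suggested efix-in-release scheme.
# sort -V approximates (does not exactly replicate) rpm's version ordering:
# the efix-tagged release sorts after the GA build and before the next PTF.
printf '%s\n' gpfs.base-5.0.4-1 gpfs.base-5.0.4-1efix9 gpfs.base-5.0.4-2 | sort -V
```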
> > > We work around this by repackaging the three affected RPMS to include > > the orphaned files from the original RPMs (and eliminating the related > > but problematic checks from the RPMs' scripts) so that our efix RPMs > > have been "un-efix-ified" and will install as expected when using "yum > > upgrade". To my knowledge no one's published a way to do this, so we > > all just have to figure this out and run rpmrebuild for ourselves. > > > > IBM should be hanging their heads in shame if the replacement RPM is > missing files. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From bbanister at jumptrading.com Thu Jan 16 17:12:04 2020 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 16 Jan 2020 17:12:04 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Message-ID: We actually add an ExecStartPre directive override (e.g. 
/etc/systemd/system/gpfs.service.d/gpfs.service.conf) to the gpfs.service [Service] section that points to a simple script that does a check of the GPFS RPMs installed on the system and updates them to what our config management specifies should be installed (a simple txt file in /etc/sysconfig namespace), which ensures that GPFS RPMs are updated before GPFS is started, while GPFS is still down. Works very well for us. The script also does some other checks and updates too, such as adding the node into the right GPFS cluster if needed. Hope that helps, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Skylar Thompson Sent: Thursday, January 16, 2020 9:32 AM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] How to install efix with yum ? [EXTERNAL EMAIL] Another problem we've run into with automating GPFS installs/upgrades is that the gplbin (kernel module) packages have a post-install script that will unmount the filesystem *even if the package isn't for the running kernel*. We needed to write some custom reporting in our configuration management system to only install gplbin if GPFS was already stopped on the node. On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: > This reminds me that there is one more thing which drives the convoluted process I described earlier... > > Automation. Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base-version and efix RPMS and in that order. IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal YUM that installing the efix requires that the base-version be installed first. > > (Our particular case is worse than just this though, since we prohibit > installing two versions/releases for the same (non-kernel) package > name. 
But that's not the case for everyone.) > > -Paul > > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum > Scale > Sent: Wednesday, January 15, 2020 16:00 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > > This message was sent by an external party. > > > >> I don't see any yum options which match rpm's '--force' option. > Actually, you do not need to use --force option since efix RPMs have incremental efix number in rpm name. > > Efix package provides update RPMs to be installed on top of corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. > > Regards, The Spectrum Scale (GPFS) team > > ---------------------------------------------------------------------- > -------------------------------------------- > If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://urldefense.com/v3/__https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479__;!!GSt_xZU7050wKg!_4v0rLABhTuazN2Q8qJhS71K6k5UYQXKY1twvbP4TBSvTjEZ8ejU_A8Ys5RT4kUZwbFD$ . > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. 
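(Editorial sketch: Bryan's ExecStartPre drop-in described earlier in this thread might look roughly like the following. The drop-in path follows his example, but the preflight script name and the file contents are illustrative assumptions, not his actual config. The snippet writes under a scratch root so it is safe to run anywhere; on a real node you would write under /etc/systemd/system and run `systemctl daemon-reload`.)

```shell
# Create a systemd drop-in for gpfs.service under a scratch root.
root=$(mktemp -d)
unitdir="$root/etc/systemd/system/gpfs.service.d"
mkdir -p "$unitdir"
cat > "$unitdir/gpfs.service.conf" <<'EOF'
[Service]
# Runs before mmfsd starts, while GPFS is still down: compare installed
# GPFS RPMs against the versions config management says should be there,
# and update them if they differ. (Script name is a placeholder.)
ExecStartPre=/usr/local/sbin/gpfs-rpm-preflight.sh
EOF
cat "$unitdir/gpfs.service.conf"
```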
> > [Inactive hide details for Jonathan Buzzard ---01/15/2020 02:09:33 > PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generall]Jonathan > Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 18:30, Sanchez, Paul > wrote: > Yum generally only wants there to be single version of a > > From: Jonathan Buzzard > > > To: > "gpfsug-discuss at spectrumscale.org org>" > org>> > Date: 01/15/2020 02:09 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: > gpfsug-discuss-bounces at spectrumscale.org @spectrumscale.org> > > ________________________________ > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > Yum generally only wants there to be single version of any package > > (it is trying to eliminate conflicting provides/depends so that all > > of the packaging requirements are satisfied). So this alien > > packaging practice of installing an efix version of a package over > > the top of the base version is not compatible with yum. > > I would at this juncture note that IBM should be appending the efix > number to the RPM so that for example > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > which would firstly make the problem go away, and second would allow > one to know which version of GPFS you happen to have installed on a > node without doing some sort of voodoo. > > > > > The real issue for draconian sysadmins like us (whose systems must > > use and obey yum) is that there are files (*liblum.so) which are > > provided by the non-efix RPMS, but are not owned by the packages > > according to the RPM database since they're purposefully installed > > outside of RPM's tracking mechanism. > > > > It's worse than that because if you install the RPM directly yum/dnf > then start bitching about the RPM database being modified outside of > themselves and all sorts of useful information gets lost when you > purge the package installation history to make the error go away. 
> > > We work around this by repackaging the three affected RPMS to > > include the orphaned files from the original RPMs (and eliminating > > the related but problematic checks from the RPMs' scripts) so that > > our efix RPMs have been "un-efix-ified" and will install as > > expected when using "yum upgrade". To my knowledge no one's > > published a way to do this, so we all just have to figure this out and run rpmrebuild for ourselves. > > > > IBM should be hanging their heads in shame if the replacement RPM is > missing files. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug- > discuss__;!!GSt_xZU7050wKg!_4v0rLABhTuazN2Q8qJhS71K6k5UYQXKY1twvbP4TBS > vTjEZ8ejU_A8Ys5RT4p8oUpuH$ > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug- > discuss__;!!GSt_xZU7050wKg!_4v0rLABhTuazN2Q8qJhS71K6k5UYQXKY1twvbP4TBS > vTjEZ8ejU_A8Ys5RT4p8oUpuH$ -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!GSt_xZU7050wKg!_4v0rLABhTuazN2Q8qJhS71K6k5UYQXKY1twvbP4TBSvTjEZ8ejU_A8Ys5RT4p8oUpuH$ From novosirj at rutgers.edu Thu Jan 16 21:31:57 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Thu, 16 Jan 2020 21:31:57 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale 
Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Felipe, I either misunderstood support or convinced them to take further action. It at first looked like they were suggesting "mmbuildgpl fixed it: case closed" (I know they wanted to close the SalesForce case anyway, which would prevent communication on the issue). At this point, they've asked for a bunch more information. Support is asking similar questions re: the speculations, and I'll provide them with the relevant output ASAP, but I did confirm all of that, including that there were no stray mmfs26/tracedev kernel modules anywhere else in the relevant /lib/modules PATHs. In the original case, I built on a machine running 3.10.0-957.27.2, but pointed to the 3.10.0-1062.9.1 source code/defined the relevant portions of usr/lpp/mmfs/src/config/env.mcr. That's always worked before, and rebuilding once the build system was running 3.10.0-1062.9.1 as well did not change anything either. In all cases, the GPFS version was Spectrum Scale Data Access Edition 5.0.4-1. If you build against either the wrong kernel version or the wrong GPFS version, both will appear right in the filename of the gpfs.gplbin RPM you build. Mine is called: gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm Anyway, thanks for your response; I know you might not be following/working on this directly, but I figured the extra info might be of interest. On 1/16/20 8:41 AM, Felipe Knop wrote: > Hi Ryan, > > I'm aware of this ticket, and I understand that there has been > active communication with the service team on this problem. 
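(Editorial aside: as Ryan notes above, the gpfs.gplbin package name encodes both the target kernel and the GPFS version, so a deployment tool can sanity-check a package against the running kernel with nothing but shell string slicing. A small sketch, assuming the standard gpfs.gplbin-<kernel>-<gpfsver>-<rel>.<arch>.rpm naming shown in his message:)

```shell
# Example filename from the message above.
rpmfile="gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm"

kernel=${rpmfile#gpfs.gplbin-}   # drop the "gpfs.gplbin-" prefix
kernel=${kernel%.*.rpm}          # drop the trailing ".<arch>.rpm"
kernel=${kernel%-*-*}            # drop the "-<gpfsver>-<release>" tail
echo "target kernel: $kernel"

# Warn if the package was not built for the kernel we are running.
if [ "$kernel" != "$(uname -r)" ]; then
    echo "warning: gplbin built for $kernel but running $(uname -r)" >&2
fi
```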
> > The crash itself, as you indicate, looks like a problem that has > been fixed: > > https://www.ibm.com/support/pages/ibm-spectrum-scale-gpfs-releases-423 13-or-later-and-5022-or-later-have-issues-where-kernel-crashes-rhel76-0 > > The fact that the problem goes away when *mmbuildgpl* is issued > appears to point to some incompatibility with kernel levels and/or > Scale version levels. Just speculating, some possible areas may > be: > > > * The RPM might have been built on a version of Scale without the > fix * The RPM might have been built on a different (minor) version > of the kernel * Somehow the VM picked a "leftover" GPFS kernel > module, as opposed to the one included in gpfs.gplbin -- given > that mmfsd never complained about a missing GPL kernel module > > > Felipe > > ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM > Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 > (845) 433-9314 T/L 293-9314 > > > > > ----- Original message ----- From: Ryan Novosielski > Sent by: > gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion > list Cc: Subject: [EXTERNAL] > [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum > Scale Data Access Edition installed via gpfs.gplbin RPM on KVM > guests Date: Wed, Jan 15, 2020 4:11 PM > > Hi there, > > I know some of the Spectrum Scale developers look at this list. > I'm having a little trouble with support on this problem. > > We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM > guests with a portability layer that has been installed via > gpfs.gplbin RPMs that we built at our site and have used to > install GPFS all over our environment. We've not seen this problem > so far on any physical hosts, but have now experienced it on guests > running on a number of our KVM hypervisors, across vendors and > firmware versions, etc. 
At one time I thought it was all happening > on systems using Mellanox virtual functions for Infiniband, but > we've now seen it on VMs without VFs. There may be an SELinux > interaction, but some of our hosts have it disabled outright, some > are Permissive, and some were working successfully with 5.0.2.x > GPFS. > > What I've been instructed to try to solve this problem has been to > run "mmbuildgpl", and it has solved the problem. I don't consider > running "mmbuildgpl" a real solution, however. If RPMs are a > supported means of installation, it should work. Support told me > that they'd seen this solve the problem at another site as well. > > Does anyone have any more information about this problem/whether > there's a fix in the pipeline, or something that can be done to > cause this problem that we could remedy? Is there an easy place to > see a list of eFixes to see if this has come up? I know it's very > similar to a problem that happened I believe it was after 5.0.2.2 > and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x. > > Below is a sample of the crash output: > > [ 156.733477] kernel BUG at mm/slub.c:3772! 
[ 156.734212] invalid > opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat > ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) > mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) > iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) > mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 > ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 > ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat > iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 > xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter > iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul > ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper > ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 > virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c > mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic > pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul > crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core > devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy > virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ > 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE > ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] > Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 [ > 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: > ffff8ae9f7a24000 [ 156.759326] RIP: 0010:[] > [] kfree+0x13c/0x140 [ 156.760749] RSP: > 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: > 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ > 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: > ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: > 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: > 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ > 156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: > ffff8ae9f7a27938 [ 
156.768229] FS: 00007f8ffff05700(0000) > GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: > 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ > 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: > 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ > 156.775154] [] > cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] > [] > _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvP P10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 > > [mmfs26] > [ 156.779378] [] > _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_P cjjjP10ext_cred_t+0x46a/0x7e0 > > [mmfs26] > [ 156.781689] [] ? > _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 > > [mmfs26] > [ 156.783565] [] > _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNod e_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 > > [mmfs26] > [ 156.786228] [] > _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7F ilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 > > [mmfs26] > [ 156.788681] [] ? > _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 > [mmfs26] [ 156.790448] [] > _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVa ttr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 > > [mmfs26] > [ 156.793032] [] ? > _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ > 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 > [mmfslinux] [ 156.795838] [] ? > _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6 c0 > > [mmfs26] > [ 156.797753] [] ? __d_alloc+0x122/0x180 [ > 156.798763] [] ? 
d_alloc+0x60/0x70 [ > 156.799700] [] lookup_real+0x23/0x60 [ > 156.800651] [] __lookup_hash+0x42/0x60 [ > 156.801675] [] lookup_slow+0x42/0xa7 [ > 156.802634] [] link_path_walk+0x80f/0x8b0 [ > 156.803666] [] path_lookupat+0x7a/0x8b0 [ > 156.804690] [] ? lru_cache_add+0xe/0x10 [ > 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ > 156.806766] [] ? getname_flags+0x4f/0x1a0 [ > 156.807817] [] filename_lookup+0x2b/0xc0 [ > 156.808834] [] user_path_at_empty+0x67/0xc0 [ > 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ > 156.811017] [] user_path_at+0x11/0x20 [ > 156.811983] [] vfs_fstatat+0x63/0xc0 [ > 156.812951] [] SYSC_newstat+0x2e/0x60 [ > 156.813931] [] ? trace_do_page_fault+0x56/0x150 > [ 156.815050] [] SyS_newstat+0xe/0x10 [ > 156.816010] [] system_call_fastpath+0x25/0x2a [ > 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 > df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 > e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 > 41 55 41 54 [ 156.822192] RIP [] > kfree+0x13c/0x140 [ 156.823180] RSP [ > 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] > Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel > Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: > 0xffffffff80000000-0xffffffffbfffffff) > > -- ____ || \\UTGERS, > |---------------------------*O*--------------------------- ||_// > the State | Ryan Novosielski - novosirj at rutgers.edu || \\ > University | Sr. 
Technologist - 973/972.0922 (2x0922) ~*~ RBHS > Campus || \\ of NJ | Office of Advanced Research Computing - > MSB C630, Newark `' > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > - -- ____ || \\UTGERS, |----------------------*O*------------------------ ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus || \\ of NJ | Office of Advanced Res. Comp. - MSB C630, Newark `' -----BEGIN PGP SIGNATURE----- iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXiDWSgAKCRCZv6Bp0Ryx vpCsAKCQ2ykmeycbOVrHTGaFqb2SsU26NwCg3YyYi4Jy2d+xZjJkE6Vfht8O8gM= =9rKb -----END PGP SIGNATURE----- From scale at us.ibm.com Thu Jan 16 22:59:14 2020 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Thu, 16 Jan 2020 17:59:14 -0500 Subject: [gpfsug-discuss] How to install efix with yum ? 
In-Reply-To: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch><8ee7ad0a895442abb843688936ac4d73@deshaw.com><3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Message-ID: On Spectrum Scale 4.2.3.15 or later and 5.0.2.2 or later, you can install gplbin without stopping GPFS by using the following steps: Build gpfs.gplbin using "mmbuildgpl --build-package". Set environment variable MM_INSTALL_ONLY to 1 before installing the gpfs.gplbin package with "rpm -i gpfs.gplbin*.rpm". Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. gpfsug-discuss-bounces at spectrumscale.org wrote on 01/16/2020 10:32:27 AM: > From: Skylar Thompson > To: gpfsug-discuss at spectrumscale.org > Date: 01/16/2020 10:35 AM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > Another problem we've run into with automating GPFS installs/upgrades is > that the gplbin (kernel module) packages have a post-install script that > will unmount the filesystem *even if the package isn't for the running > kernel*. 
We needed to write some custom reporting in our configuration > management system to only install gplbin if GPFS was already stopped on the > node. > > On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: > > This reminds me that there is one more thing which drives the > convoluted process I described earlier... > > > > Automation. Deployment solutions which use yum to build new hosts > are often the place where one notices the problem. They would need > to determine that they should install both the base-version and efix > RPMS and in that order. IIRC, there were no RPM dependencies > connecting the efix RPMs to their base-version equivalents, so > there was nothing to signal YUM that installing the efix requires > that the base-version be installed first. > > > > (Our particular case is worse than just this though, since we > prohibit installing two versions/releases for the same (non-kernel) > package name. But that's not the case for everyone.) > > > > -Paul > > > > From: gpfsug-discuss-bounces at spectrumscale.org bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale > > Sent: Wednesday, January 15, 2020 16:00 > > To: gpfsug main discussion list > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > > > > > This message was sent by an external party. > > > > > > >> I don't see any yum options which match rpm's '--force' option. > > Actually, you do not need to use --force option since efix RPMs > have incremental efix number in rpm name. > > > > Efix package provides update RPMs to be installed on top of > corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5. > 0.4.1 is already installed on your system, "yum update" should work. 
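(Editorial sketch combining the Scale team's two steps, described earlier in this thread, into one snippet. It assumes Scale 4.2.3.15+/5.0.2.2+ and the usual rpmbuild output path, and it guards on mmbuildgpl being present so it is harmless on a non-GPFS host. The claim that MM_INSTALL_ONLY=1 lets the gplbin package install without stopping GPFS comes from the Scale team's message, not from independent verification.)

```shell
mmbin=/usr/lpp/mmfs/bin
if [ -x "$mmbin/mmbuildgpl" ]; then
    # Step 1: build the gplbin RPM for the running kernel.
    "$mmbin/mmbuildgpl" --build-package
    # Step 2: install it with MM_INSTALL_ONLY=1 so GPFS can stay up.
    MM_INSTALL_ONLY=1 rpm -i /root/rpmbuild/RPMS/"$(uname -m)"/gpfs.gplbin-"$(uname -r)"-*.rpm
    result="gplbin installed for $(uname -r)"
else
    result="skipped: mmbuildgpl not found on this host"
fi
echo "$result"
```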
> > > > Regards, The Spectrum Scale (GPFS) team > > > > > ------------------------------------------------------------------------------------------------------------------ > > If you feel that your question can benefit other users of Spectrum > Scale (GPFS), then please post it to the public IBM developerWorks Forum at > https://www.ibm.com/developerworks/community/forums/html/forum? > id=11111111-0000-0000-0000-000000000479. > > > > If your query concerns a potential software error in Spectrum > Scale (GPFS) and you have an IBM software maintenance contract > please contact 1-800-237-5511 in the United States or your local IBM > Service Center in other countries. > > > > The forum is informally monitored as time permits and should not > be used for priority messages to the Spectrum Scale (GPFS) team. > > > > [Inactive hide details for Jonathan Buzzard ---01/15/2020 02:09:33 > PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum > generall]Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 > 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be > single version of a > > > > From: Jonathan Buzzard mailto:jonathan.buzzard at strath.ac.uk>> > > To: "gpfsug-discuss at spectrumscale.org discuss at spectrumscale.org>" mailto:gpfsug-discuss at spectrumscale.org>> > > Date: 01/15/2020 02:09 PM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > > Sent by: gpfsug-discuss-bounces at spectrumscale.org discuss-bounces at spectrumscale.org> > > > > ________________________________ > > > > > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > > Yum generally only wants there to be single version of any package (it > > > is trying to eliminate conflicting provides/depends so that all of the > > > packaging requirements are satisfied). So this alien packaging practice > > > of installing an efix version of a package over the top of the base > > > version is not compatible with yum. 
> > > > I would at this juncture note that IBM should be appending the efix > > number to the RPM so that for example > > > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > > > which would firstly make the problem go away, and second would allow one > > to know which version of GPFS you happen to have installed on a node > > without doing some sort of voodoo. > > > > > > > > The real issue for draconian sysadmins like us (whose systems must use > > > and obey yum) is that there are files (*liblum.so) which are provided by > > > the non-efix RPMS, but are not owned by the packages according to the > > > RPM database since they're purposefully installed outside of RPM's > > > tracking mechanism. > > > > > > > It's worse than that because if you install the RPM directly yum/dnf then > > start bitching about the RPM database being modified outside of > > themselves and all sorts of useful information gets lost when you purge > > the package installation history to make the error go away. > > > > > We work around this by repackaging the three affected RPMS to include > > > the orphaned files from the original RPMs (and eliminating the related > > > but problematic checks from the RPMs' scripts) so that our efix RPMs > > > have been "un-efix-ified" and will install as expected when > using "yum upgrade". To my knowledge no one's published a way to do this, so we > > > all just have to figure this out and run rpmrebuild for ourselves. > > > > > > > IBM should be hanging their heads in shame if the replacement RPM is > > missing files. > > > > JAB. > > > > -- > > Jonathan A. Buzzard Tel: +44141-5483420 > > HPC System Administrator, ARCHIE-WeSt. > > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://urldefense.proofpoint.com/v2/url? 
> u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f- > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e= > > > > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://urldefense.proofpoint.com/v2/url? > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f- > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e= > > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department, System Administrator > -- Foege Building S046, (206)-685-7354 > -- University of Washington School of Medicine > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url? > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f- > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e= > -------------- next part -------------- An HTML attachment was scrubbed... URL: From novosirj at rutgers.edu Fri Jan 17 02:20:29 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Fri, 17 Jan 2020 02:20:29 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Message-ID: <47921BA1-A20B-4B55-876D-A26C082496BE@rutgers.edu> Thank you for the reminder. I?ve received that nasty surprise myself, but just long ago enough to have forgotten it. Would love to see that fixed. 
> On Jan 16, 2020, at 10:32 AM, Skylar Thompson wrote: > > Another problem we've run into with automating GPFS installs/upgrades is > that the gplbin (kernel module) packages have a post-install script that > will unmount the filesystem *even if the package isn't for the running > kernel*. We needed to write some custom reporting in our configuration > management system to only install gplbin if GPFS was already stopped on the > node. > > On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: >> This reminds me that there is one more thing which drives the convoluted process I described earlier??? >> >> Automation. Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base-version and efix RPMS and in that order. IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal YUM that installing the efix requires that the base-version be installed first. >> >> (Our particular case is worse than just this though, since we prohibit installing two versions/releases for the same (non-kernel) package name. But that???s not the case for everyone.) >> >> -Paul >> >> From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale >> Sent: Wednesday, January 15, 2020 16:00 >> To: gpfsug main discussion list >> Cc: gpfsug-discuss-bounces at spectrumscale.org >> Subject: Re: [gpfsug-discuss] How to install efix with yum ? >> >> >> This message was sent by an external party. >> >> >>>> I don't see any yum options which match rpm's '--force' option. >> Actually, you do not need to use --force option since efix RPMs have incremental efix number in rpm name. >> >> Efix package provides update RPMs to be installed on top of corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. 
>> >> Regards, The Spectrum Scale (GPFS) team >> >> ------------------------------------------------------------------------------------------------------------------ >> If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. >> >> If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. >> >> The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. >> >> [Inactive hide details for Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generall]Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be single version of a >> >> From: Jonathan Buzzard > >> To: "gpfsug-discuss at spectrumscale.org" > >> Date: 01/15/2020 02:09 PM >> Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> ________________________________ >> >> >> >> On 15/01/2020 18:30, Sanchez, Paul wrote: >>> Yum generally only wants there to be single version of any package (it >>> is trying to eliminate conflicting provides/depends so that all of the >>> packaging requirements are satisfied). So this alien packaging practice >>> of installing an efix version of a package over the top of the base >>> version is not compatible with yum. 
>> >> I would at this juncture note that IBM should be appending the efix >> number to the RPM so that for example >> >> gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 >> >> which would firstly make the problem go away, and second would allow one >> to know which version of GPFS you happen to have installed on a node >> without doing some sort of voodoo. >> >>> >>> The real issue for draconian sysadmins like us (whose systems must use >>> and obey yum) is that there are files (*liblum.so) which are provided by >>> the non-efix RPMS, but are not owned by the packages according to the >>> RPM database since they're purposefully installed outside of RPM's >>> tracking mechanism. >>> >> >> It's worse than that because if you install the RPM directly yum/dnf then >> start bitching about the RPM database being modified outside of >> themselves and all sorts of useful information gets lost when you purge >> the package installation history to make the error go away. >> >>> We work around this by repackaging the three affected RPMS to include >>> the orphaned files from the original RPMs (and eliminating the related >>> but problematic checks from the RPMs' scripts) so that our efix RPMs >>> have been "un-efix-ified" and will install as expected when using "yum >>> upgrade". To my knowledge no one's published a way to do this, so we >>> all just have to figure this out and run rpmrebuild for ourselves. >>> >> >> IBM should be hanging their heads in shame if the replacement RPM is >> missing files. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Tel: +44141-5483420 >> HPC System Administrator, ARCHIE-WeSt. >> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG -- ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr.
Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' From knop at us.ibm.com Fri Jan 17 15:35:19 2020 From: knop at us.ibm.com (Felipe Knop) Date: Fri, 17 Jan 2020 15:35:19 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From skylar2 at uw.edu Fri Jan 17 15:42:57 2020 From: skylar2 at uw.edu (Skylar Thompson) Date: Fri, 17 Jan 2020 15:42:57 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Message-ID: <20200117154257.45ioc4ugw7dvuwym@utumno.gs.washington.edu> Thanks for the pointer! We're in the process of upgrading from 4.2.3-6 to 4.2.3-19 so I'll make a note that we should start setting that environment variable when we build gplbin. On Thu, Jan 16, 2020 at 05:59:14PM -0500, IBM Spectrum Scale wrote: > On Spectrum Scale 4.2.3.15 or later and 5.0.2.2 or later, you can install > gplbin without stopping GPFS by using the following steps: > > Build gpfs.gplbin using mmbuildgpl --build-package > Set environment variable MM_INSTALL_ONLY to 1 before installing the gpfs.gplbin > package with rpm -i gpfs.gplbin*.rpm > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale > (GPFS), then please post it to the public IBM developerWorks Forum at > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > .
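[For reference, spelled out on a node the two quoted steps come to roughly the following. This is a sketch only: the path to mmbuildgpl and the rpmbuild output directory are assumptions that vary by site and distro.]

```shell
#!/bin/sh
# Sketch of the quoted build-and-install sequence for the GPFS
# portability layer (gplbin). Paths are site-specific assumptions.
if [ -x /usr/lpp/mmfs/bin/mmbuildgpl ]; then
    /usr/lpp/mmfs/bin/mmbuildgpl --build-package
    # Per the note above, MM_INSTALL_ONLY=1 keeps the package's
    # post-install scriptlet from touching the running GPFS instance,
    # so the filesystem stays mounted during the install.
    MM_INSTALL_ONLY=1 rpm -i /root/rpmbuild/RPMS/*/gpfs.gplbin-*.rpm
    result="installed"
else
    result="skipped: mmbuildgpl not present on this host"
fi
echo "$result"
```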
> > If your query concerns a potential software error in Spectrum Scale (GPFS) > and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in > other countries. > > The forum is informally monitored as time permits and should not be used > for priority messages to the Spectrum Scale (GPFS) team. > > gpfsug-discuss-bounces at spectrumscale.org wrote on 01/16/2020 10:32:27 AM: > > > From: Skylar Thompson > > To: gpfsug-discuss at spectrumscale.org > > Date: 01/16/2020 10:35 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Another problem we've run into with automating GPFS installs/upgrades is > > that the gplbin (kernel module) packages have a post-install script that > > will unmount the filesystem *even if the package isn't for the running > > kernel*. We needed to write some custom reporting in our configuration > > management system to only install gplbin if GPFS was already stopped on > the > > node. > > > > On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: > > > This reminds me that there is one more thing which drives the > > convoluted process I described earlier??? > > > > > > Automation. Deployment solutions which use yum to build new hosts > > are often the place where one notices the problem. They would need > > to determine that they should install both the base-version and efix > > RPMS and in that order. IIRC, there were no RPM dependencies > > connecting the efix RPMs to their base-version equivalents, so > > there was nothing to signal YUM that installing the efix requires > > that the base-version be installed first. > > > > > > (Our particular case is worse than just this though, since we > > prohibit installing two versions/releases for the same (non-kernel) > > package name. But that???s not the case for everyone.) 
> > > > > > -Paul > > > > > > From: gpfsug-discuss-bounces at spectrumscale.org > bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale > > > Sent: Wednesday, January 15, 2020 16:00 > > > To: gpfsug main discussion list > > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > > > > > > > > This message was sent by an external party. > > > > > > > > > >> I don't see any yum options which match rpm's '--force' option. > > > Actually, you do not need to use --force option since efix RPMs > > have incremental efix number in rpm name. > > > > > > Efix package provides update RPMs to be installed on top of > > corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5. > > 0.4.1 is already installed on your system, "yum update" should work. > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > > > > > > ------------------------------------------------------------------------------------------------------------------ > > > If you feel that your question can benefit other users of Spectrum > > Scale (GPFS), then please post it to the public IBM developerWroks Forum > at > > https://www.ibm.com/developerworks/community/forums/html/forum? > > id=11111111-0000-0000-0000-000000000479. > > > > > > If your query concerns a potential software error in Spectrum > > Scale (GPFS) and you have an IBM software maintenance contract > > please contact 1-800-237-5511 in the United States or your local IBM > > Service Center in other countries. > > > > > > The forum is informally monitored as time permits and should not > > be used for priority messages to the Spectrum Scale (GPFS) team. 
> > > > > > [Inactive hide details for Jonathan Buzzard ---01/15/2020 02:09:33 > > PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum > > generall]Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 > > 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be > > single version of a > > > > > > From: Jonathan Buzzard > mailto:jonathan.buzzard at strath.ac.uk>> > > > To: "gpfsug-discuss at spectrumscale.org > discuss at spectrumscale.org>" > mailto:gpfsug-discuss at spectrumscale.org>> > > > Date: 01/15/2020 02:09 PM > > > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum > ? > > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > discuss-bounces at spectrumscale.org> > > > > > > ________________________________ > > > > > > > > > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > > > Yum generally only wants there to be single version of any package > (it > > > > is trying to eliminate conflicting provides/depends so that all of > the > > > > packaging requirements are satisfied). So this alien packaging > practice > > > > of installing an efix version of a package over the top of the base > > > > version is not compatible with yum. > > > > > > I would at this juncture note that IBM should be appending the efix > > > number to the RPM so that for example > > > > > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > > > > > which would firstly make the problem go away, and second would allow > one > > > to know which version of GPFS you happen to have installed on a node > > > without doing some sort of voodoo. > > > > > > > > > > > The real issue for draconian sysadmins like us (whose systems must > use > > > > and obey yum) is that there are files (*liblum.so) which are > provided by > > > > the non-efix RPMS, but are not owned by the packages according to > the > > > > RPM database since they???re purposefully installed outside of > RPM???s > > > > tracking mechanism. 
> > > > > > > > > > It worse than that because if you install the RPM directly yum/dnf > then > > > start bitching about the RPM database being modified outside of > > > themselves and all sorts of useful information gets lost when you > purge > > > the package installation history to make the error go away. > > > > > > > We work around this by repackaging the three affected RPMS to > include > > > > the orphaned files from the original RPMs (and eliminating the > related > > > > but problematic checks from the RPMs??? scripts) so that our efix > RPMs > > > > have been ???un-efix-ified??? and will install as expected when > > using ???yum > > > > upgrade???. To my knowledge no one???s published a way to do this, > so we > > > > all just have to figure this out and run rpmrebuild for ourselves. > > > > > > > > > > IBM should be hanging their heads in shame if the replacement RPM is > > > missing files. > > > > > > JAB. > > > > > > -- > > > Jonathan A. Buzzard Tel: +44141-5483420 > > > HPC System Administrator, ARCHIE-WeSt. > > > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url? > > > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f- > > > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e= > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url? 
> > > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f- > > > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e= > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From novosirj at rutgers.edu Fri Jan 17 15:55:54 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Fri, 17 Jan 2020 15:55:54 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: Message-ID: That /is/ interesting. I'm a little confused about how that could be playing out in a case where I'm building on -1062.9.1, building for -1062.9.1, and running on -1062.9.1. Is there something inherent in the RPM building process that hasn't caught up, or am I misunderstanding that change's impact on it?
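[For what it's worth, the gpfs.gplbin package name encodes both the target kernel and the Scale version, so a quick script can confirm what a given build was aimed at. A sketch, assuming the naming our builds produce and an x86_64 arch:]

```shell
#!/bin/sh
# Sketch: compose the expected gpfs.gplbin package name for a target
# kernel and Scale version; compare it to the RPM a build produced.
kernel="3.10.0-1062.9.1.el7"   # target kernel (uname -r on the node)
scale="5.0.4-1"                # Scale/GPFS version the build used
arch="x86_64"
expected="gpfs.gplbin-${kernel}.${arch}-${scale}.${arch}.rpm"
echo "$expected"
# -> gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm
```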
-- ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Jan 17, 2020, at 10:35, Felipe Knop wrote: ? Hi Ryan, Some interesting IBM-internal communication overnight. The problems seems related to a change made to LINUX_KERNEL_VERSION_VERBOSE to handle the additional digit in the kernel numbering (3.10.0-1000+) . The GPL layer expected LINUX_KERNEL_VERSION_VERBOSE to have that extra digit, and its absence resulted in an incorrect function being compiled in, which led to the crash. This, at least, seems to make sense, in terms of matching to the symptoms of the problem. We are still in internal debates on whether/how update our guidelines for gplbin generation ... Regards, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 (845) 433-9314 T/L 293-9314 ----- Original message ----- From: Ryan Novosielski Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at spectrumscale.org" Cc: Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests Date: Thu, Jan 16, 2020 4:33 PM -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Felipe, I either misunderstood support or convinced them to take further action. It at first looked like they were suggesting "mmbuildgpl fixed it: case closed" (I know they wanted to close the SalesForce case anyway, which would prevent communication on the issue). At this point, they've asked for a bunch more information. 
Support is asking similar questions re: the speculations, and I'll provide them with the relevant output ASAP, but I did confirm all of that, including that there were no stray mmfs26/tracedev kernel modules anywhere else in the relevant /lib/modules PATHs. In the original case, I built on a machine running 3.10.0-957.27.2, but pointed to the 3.10.0-1062.9.1 source code/defined the relevant portions of usr/lpp/mmfs/src/config/env.mcr. That's always worked before, and rebuilding once the build system was running 3.10.0-1062.9.1 as well did not change anything either. In all cases, the GPFS version was Spectrum Scale Data Access Edition 5.0.4-1. If you build against either the wrong kernel version or the wrong GPFS version, both will appear right in the filename of the gpfs.gplbin RPM you build. Mine is called: gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm Anyway, thanks for your response; I know you might not be following/working on this directly, but I figured the extra info might be of interest. On 1/16/20 8:41 AM, Felipe Knop wrote: > Hi Ryan, > > I'm aware of this ticket, and I understand that there has been > active communication with the service team on this problem. > > The crash itself, as you indicate, looks like a problem that has > been fixed: > > https://www.ibm.com/support/pages/ibm-spectrum-scale-gpfs-releases-423 13-or-later-and-5022-or-later-have-issues-where-kernel-crashes-rhel76-0 > > The fact that the problem goes away when *mmbuildgpl* is issued > appears to point to some incompatibility with kernel levels and/or > Scale version levels. 
Just speculating, some possible areas may > be: > > > * The RPM might have been built on a version of Scale without the > fix * The RPM might have been built on a different (minor) version > of the kernel * Somehow the VM picked a "leftover" GPFS kernel > module, as opposed to the one included in gpfs.gplbin -- given > that mmfsd never complained about a missing GPL kernel module > > > Felipe > > ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM > Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 > (845) 433-9314 T/L 293-9314 > > > > > ----- Original message ----- From: Ryan Novosielski > Sent by: > gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion > list Cc: Subject: [EXTERNAL] > [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum > Scale Data Access Edition installed via gpfs.gplbin RPM on KVM > guests Date: Wed, Jan 15, 2020 4:11 PM > > Hi there, > > I know some of the Spectrum Scale developers look at this list. > I'm having a little trouble with support on this problem. > > We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM > guests with a portability layer that has been installed via > gpfs.gplbin RPMs that we built at our site and have used to > install GPFS all over our environment. We've not seen this problem > so far on any physical hosts, but have now experienced it on guests > running on a number of our KVM hypervisors, across vendors and > firmware versions, etc. At one time I thought it was all happening > on systems using Mellanox virtual functions for Infiniband, but > we've now seen it on VMs without VFs. There may be an SELinux > interaction, but some of our hosts have it disabled outright, some > are Permissive, and some were working successfully with 5.0.2.x > GPFS. > > What I've been instructed to try to solve this problem has been to > run "mmbuildgpl", and it has solved the problem. I don't consider > running "mmbuildgpl" a real solution, however.
If RPMs are a > supported means of installation, it should work. Support told me > that they'd seen this solve the problem at another site as well. > > Does anyone have any more information about this problem/whether > there's a fix in the pipeline, or something that can be done to > cause this problem that we could remedy? Is there an easy place to > see a list of eFixes to see if this has come up? I know it's very > similar to a problem that happened I believe it was after 5.0.2.2 > and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x. > > Below is a sample of the crash output: > > [ 156.733477] kernel BUG at mm/slub.c:3772! [ 156.734212] invalid > opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat > ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) > mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) > iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) > mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 > ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 > ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat > iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 > xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter > iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul > ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper > ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 > virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c > mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic > pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul > crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core > devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy > virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ > 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE > ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] > Hardware name: Red Hat KVM, BIOS
1.11.0-2.el7 04/01/2014 [ > 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: > ffff8ae9f7a24000 [ 156.759326] RIP: 0010:[] > [] kfree+0x13c/0x140 [ 156.760749] RSP: > 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: > 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ > 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: > ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: > 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: > 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ > 156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: > ffff8ae9f7a27938 [ 156.768229] FS: 00007f8ffff05700(0000) > GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: > 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ > 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: > 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ > 156.775154] [] > cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] > [] > _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvP P10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 > > [mmfs26] > [ 156.779378] [] > _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_P cjjjP10ext_cred_t+0x46a/0x7e0 > > [mmfs26] > [ 156.781689] [] ? > _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 > > [mmfs26] > [ 156.783565] [] > _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNod e_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 > > [mmfs26] > [ 156.786228] [] > _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7F ilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 > > [mmfs26] > [ 156.788681] [] ? 
> _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 > [mmfs26] [ 156.790448] [] > _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVa ttr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 > > [mmfs26] > [ 156.793032] [] ? > _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ > 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 > [mmfslinux] [ 156.795838] [] ? > _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6 c0 > > [mmfs26] > [ 156.797753] [] ? __d_alloc+0x122/0x180 [ > 156.798763] [] ? d_alloc+0x60/0x70 [ > 156.799700] [] lookup_real+0x23/0x60 [ > 156.800651] [] __lookup_hash+0x42/0x60 [ > 156.801675] [] lookup_slow+0x42/0xa7 [ > 156.802634] [] link_path_walk+0x80f/0x8b0 [ > 156.803666] [] path_lookupat+0x7a/0x8b0 [ > 156.804690] [] ? lru_cache_add+0xe/0x10 [ > 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ > 156.806766] [] ? getname_flags+0x4f/0x1a0 [ > 156.807817] [] filename_lookup+0x2b/0xc0 [ > 156.808834] [] user_path_at_empty+0x67/0xc0 [ > 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ > 156.811017] [] user_path_at+0x11/0x20 [ > 156.811983] [] vfs_fstatat+0x63/0xc0 [ > 156.812951] [] SYSC_newstat+0x2e/0x60 [ > 156.813931] [] ? 
trace_do_page_fault+0x56/0x150 > [ 156.815050] [] SyS_newstat+0xe/0x10 [ > 156.816010] [] system_call_fastpath+0x25/0x2a [ > 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 > df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 > e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 > 41 55 41 54 [ 156.822192] RIP [] > kfree+0x13c/0x140 [ 156.823180] RSP [ > 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] > Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel > Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: > 0xffffffff80000000-0xffffffffbfffffff) > > -- ____ || \\UTGERS, > |---------------------------*O*--------------------------- ||_// > the State | Ryan Novosielski - novosirj at rutgers.edu || \\ > University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS > Campus || \\ of NJ | Office of Advanced Research Computing - > MSB C630, Newark `' > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > - -- ____ || \\UTGERS, |----------------------*O*------------------------ ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus || \\ of NJ | Office of Advanced Res. Comp. 
- MSB C630, Newark `' -----BEGIN PGP SIGNATURE----- iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXiDWSgAKCRCZv6Bp0Ryx vpCsAKCQ2ykmeycbOVrHTGaFqb2SsU26NwCg3YyYi4Jy2d+xZjJkE6Vfht8O8gM= =9rKb -----END PGP SIGNATURE----- _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From knop at us.ibm.com Fri Jan 17 16:36:01 2020 From: knop at us.ibm.com (Felipe Knop) Date: Fri, 17 Jan 2020 16:36:01 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: , , , Message-ID: An HTML attachment was scrubbed... URL: From novosirj at rutgers.edu Fri Jan 17 16:58:58 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Fri, 17 Jan 2020 16:58:58 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: Message-ID: <45A9C66F-57E2-4EDA-B5AE-5775980DF88C@rutgers.edu> Yeah, support got back to me with a similar response earlier today that I'd not seen yet that made it a lot clearer what I "did wrong". This would appear to be the cause in my case: [root at master config]# diff env.mcr env.mcr-1062.9.1 4,5c4,5 < #define LINUX_KERNEL_VERSION 31000999 < #define LINUX_KERNEL_VERSION_VERBOSE 310001062009001 --- > #define LINUX_KERNEL_VERSION 31001062 > #define LINUX_KERNEL_VERSION_VERBOSE 31001062009001 ...the former having been generated by "make Autoconfig" and the latter generated by my brain. I'm surprised at the first line --
I'd have caught myself that something different might have been needed if 3.10.0-1062 didn't already fit in the number of digits.

Anyway, I explained to support that the reason I do this is that I maintain a couple of copies of env.mcr because occasionally there will be reasons to need gpfs.gplbin for a few different kernel versions (other software that doesn't want to be upgraded, etc.). I see I originally got this practice from the README (or possibly our original installer consultants).

Basically what's missing here, so far as I can see, is a way to use mmbuildgpl/make Autoconfig but specify a target kernel version (and I guess an update to the docs or at least /usr/lpp/mmfs/src/README) that doesn't suggest manually editing. Is there a way to at least find out what "make Autoconfig" would use for a target LINUX_KERNEL_VERSION_VERBOSE? From what I can see of makefile and config/configure, there's no option for specifying anything.

--
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'

> On Jan 17, 2020, at 11:36 AM, Felipe Knop wrote:
> 
> Hi Ryan,
> 
> My interpretation of the analysis so far is that the content of LINUX_KERNEL_VERSION_VERBOSE in 'env.mcr' became incorrect. That is, it used to work well in a prior release of Scale, but not with 5.0.4.1. This is because of a code change that added another digit to the version in LINUX_KERNEL_VERSION_VERBOSE to account for the 4-digit "fix level" (3.10.0-1000+). Then, when the GPL layer was built, its sources saw the content of LINUX_KERNEL_VERSION_VERBOSE with the missing extra digit and compiled the 'wrong' pieces in -- in particular the incorrect value of SECURITY_INODE_INIT_SECURITY(). And that led to the crash.
> > The problem did not happen when mmbuildgpl was used since the correct value of LINUX_KERNEL_VERSION_VERBOSE was then set up.
> 
> Felipe
> 
> ----
> Felipe Knop knop at us.ibm.com
> GPFS Development and Security
> IBM Systems
> IBM Building 008
> 2455 South Rd, Poughkeepsie, NY 12601
> (845) 433-9314 T/L 293-9314
> 
> ----- Original message -----
> From: Ryan Novosielski
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> To: gpfsug main discussion list
> Cc:
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests
> Date: Fri, Jan 17, 2020 10:56 AM
> 
> That /is/ interesting.
> 
> I'm a little confused about how that could be playing out in a case where I'm building on -1062.9.1, building for -1062.9.1, and running on -1062.9.1. Is there something inherent in the RPM building process that hasn't caught up, or am I misunderstanding that change's impact on it?
> 
> --
> ____
> || \\UTGERS, |---------------------------*O*---------------------------
> ||_// the State | Ryan Novosielski - novosirj at rutgers.edu
> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
> || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
> `'
> 
>> On Jan 17, 2020, at 10:35, Felipe Knop wrote:
>> 
>> Hi Ryan,
>> 
>> Some interesting IBM-internal communication overnight. The problem seems related to a change made to LINUX_KERNEL_VERSION_VERBOSE to handle the additional digit in the kernel numbering (3.10.0-1000+). The GPL layer expected LINUX_KERNEL_VERSION_VERBOSE to have that extra digit, and its absence resulted in an incorrect function being compiled in, which led to the crash.
>> 
>> This, at least, seems to make sense, in terms of matching to the symptoms of the problem.
>> 
>> We are still in internal debates on whether/how to update our guidelines for gplbin generation ...
>> >> Regards, >> >> Felipe >> >> ---- >> Felipe Knop knop at us.ibm.com >> GPFS Development and Security >> IBM Systems >> IBM Building 008 >> 2455 South Rd, Poughkeepsie, NY 12601 >> (845) 433-9314 T/L 293-9314 >> >> >> >> ----- Original message ----- >> From: Ryan Novosielski >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> To: "gpfsug-discuss at spectrumscale.org" >> Cc: >> Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests >> Date: Thu, Jan 16, 2020 4:33 PM >> >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Hi Felipe, >> >> I either misunderstood support or convinced them to take further >> action. It at first looked like they were suggesting "mmbuildgpl fixed >> it: case closed" (I know they wanted to close the SalesForce case >> anyway, which would prevent communication on the issue). At this >> point, they've asked for a bunch more information. >> >> Support is asking similar questions re: the speculations, and I'll >> provide them with the relevant output ASAP, but I did confirm all of >> that, including that there were no stray mmfs26/tracedev kernel >> modules anywhere else in the relevant /lib/modules PATHs. In the >> original case, I built on a machine running 3.10.0-957.27.2, but >> pointed to the 3.10.0-1062.9.1 source code/defined the relevant >> portions of usr/lpp/mmfs/src/config/env.mcr. That's always worked >> before, and rebuilding once the build system was running >> 3.10.0-1062.9.1 as well did not change anything either. In all cases, >> the GPFS version was Spectrum Scale Data Access Edition 5.0.4-1. If >> you build against either the wrong kernel version or the wrong GPFS >> version, both will appear right in the filename of the gpfs.gplbin RPM >> you build. 
Mine is called: >> >> gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm >> >> Anyway, thanks for your response; I know you might not be >> following/working on this directly, but I figured the extra info might >> be of interest. >> >> On 1/16/20 8:41 AM, Felipe Knop wrote: >> > Hi Ryan, >> > >> > I'm aware of this ticket, and I understand that there has been >> > active communication with the service team on this problem. >> > >> > The crash itself, as you indicate, looks like a problem that has >> > been fixed: >> > >> > https://www.ibm.com/support/pages/ibm-spectrum-scale-gpfs-releases-423 >> 13-or-later-and-5022-or-later-have-issues-where-kernel-crashes-rhel76-0 >> > >> > The fact that the problem goes away when *mmbuildgpl* is issued >> > appears to point to some incompatibility with kernel levels and/or >> > Scale version levels. Just speculating, some possible areas may >> > be: >> > >> > >> > * The RPM might have been built on a version of Scale without the >> > fix * The RPM might have been built on a different (minor) version >> > of the kernel * Somehow the VM picked a "leftover" GPFS kernel >> > module, as opposed to the one included in gpfs.gplbin -- given >> > that mmfsd never complained about a missing GPL kernel module >> > >> > >> > Felipe >> > >> > ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM >> > Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 >> > (845) 433-9314 T/L 293-9314 >> > >> > >> > >> > >> > ----- Original message ----- From: Ryan Novosielski >> > Sent by: >> > gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion >> > list Cc: Subject: [EXTERNAL] >> > [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum >> > Scale Data Access Edition installed via gpfs.gplbin RPM on KVM >> > guests Date: Wed, Jan 15, 2020 4:11 PM >> > >> > Hi there, >> > >> > I know some of the Spectrum Scale developers look at this list. 
>> > I'm having a little trouble with support on this problem.
>> >
>> > We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM
>> > guests with a portability layer that has been installed via
>> > gpfs.gplbin RPMs that we built at our site and have used to
>> > install GPFS all over our environment. We've not seen this problem
>> > so far on any physical hosts, but have now experienced it on guests
>> > running on a number of our KVM hypervisors, across vendors and
>> > firmware versions, etc. At one time I thought it was all happening
>> > on systems using Mellanox virtual functions for Infiniband, but
>> > we've now seen it on VMs without VFs. There may be an SELinux
>> > interaction, but some of our hosts have it disabled outright, some
>> > are Permissive, and some were working successfully with 5.0.2.x
>> > GPFS.
>> >
>> > What I've been instructed to try to solve this problem has been to
>> > run "mmbuildgpl", and it has solved the problem. I don't consider
>> > running "mmbuildgpl" a real solution, however. If RPMs are a
>> > supported means of installation, it should work. Support told me
>> > that they'd seen this solve the problem at another site as well.
>> >
>> > Does anyone have any more information about this problem/whether
>> > there's a fix in the pipeline, or something that can be done to
>> > cause this problem that we could remedy? Is there an easy place to
>> > see a list of eFixes to see if this has come up? I know it's very
>> > similar to a problem that happened I believe it was after 5.0.2.2
>> > and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x.
>> >
>> > Below is a sample of the crash output:
>> >
>> > [ 156.733477] kernel BUG at mm/slub.c:3772!
[ 156.734212] invalid >> > opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat >> > ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) >> > mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) >> > iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) >> > mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 >> > ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 >> > ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat >> > iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 >> > xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter >> > iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul >> > ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper >> > ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 >> > virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c >> > mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic >> > pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul >> > crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core >> > devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy >> > virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ >> > 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE >> > ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] >> > Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 [ >> > 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: >> > ffff8ae9f7a24000 [ 156.759326] RIP: 0010:[] >> > [] kfree+0x13c/0x140 [ 156.760749] RSP: >> > 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: >> > 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ >> > 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: >> > ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: >> > 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: >> > 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ >> > 
156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: >> > ffff8ae9f7a27938 [ 156.768229] FS: 00007f8ffff05700(0000) >> > GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: >> > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: >> > 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ >> > 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> > 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: >> > 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ >> > 156.775154] [] >> > cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] >> > [] >> > _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvP >> P10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 >> > >> > >> [mmfs26] >> > [ 156.779378] [] >> > _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_P >> cjjjP10ext_cred_t+0x46a/0x7e0 >> > >> > >> [mmfs26] >> > [ 156.781689] [] ? >> > _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 >> > >> > >> [mmfs26] >> > [ 156.783565] [] >> > _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNod >> e_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 >> > >> > >> [mmfs26] >> > [ 156.786228] [] >> > _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7F >> ilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 >> > >> > >> [mmfs26] >> > [ 156.788681] [] ? >> > _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 >> > [mmfs26] [ 156.790448] [] >> > _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVa >> ttr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 >> > >> > >> [mmfs26] >> > [ 156.793032] [] ? >> > _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ >> > 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 >> > [mmfslinux] [ 156.795838] [] ? >> > _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6 >> c0 >> > >> > >> [mmfs26] >> > [ 156.797753] [] ? 
__d_alloc+0x122/0x180 [ >> > 156.798763] [] ? d_alloc+0x60/0x70 [ >> > 156.799700] [] lookup_real+0x23/0x60 [ >> > 156.800651] [] __lookup_hash+0x42/0x60 [ >> > 156.801675] [] lookup_slow+0x42/0xa7 [ >> > 156.802634] [] link_path_walk+0x80f/0x8b0 [ >> > 156.803666] [] path_lookupat+0x7a/0x8b0 [ >> > 156.804690] [] ? lru_cache_add+0xe/0x10 [ >> > 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ >> > 156.806766] [] ? getname_flags+0x4f/0x1a0 [ >> > 156.807817] [] filename_lookup+0x2b/0xc0 [ >> > 156.808834] [] user_path_at_empty+0x67/0xc0 [ >> > 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ >> > 156.811017] [] user_path_at+0x11/0x20 [ >> > 156.811983] [] vfs_fstatat+0x63/0xc0 [ >> > 156.812951] [] SYSC_newstat+0x2e/0x60 [ >> > 156.813931] [] ? trace_do_page_fault+0x56/0x150 >> > [ 156.815050] [] SyS_newstat+0xe/0x10 [ >> > 156.816010] [] system_call_fastpath+0x25/0x2a [ >> > 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 >> > df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 >> > e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 >> > 41 55 41 54 [ 156.822192] RIP [] >> > kfree+0x13c/0x140 [ 156.823180] RSP [ >> > 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] >> > Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel >> > Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: >> > 0xffffffff80000000-0xffffffffbfffffff) >> > >> > -- ____ || \\UTGERS, >> > |---------------------------*O*--------------------------- ||_// >> > the State | Ryan Novosielski - novosirj at rutgers.edu || \\ >> > University | Sr. 
Technologist - 973/972.0922 (2x0922) ~*~ RBHS >> > Campus || \\ of NJ | Office of Advanced Research Computing - >> > MSB C630, Newark `' >> > >> > _______________________________________________ gpfsug-discuss >> > mailing list gpfsug-discuss at spectrumscale.org >> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > >> > >> > >> > >> > >> > _______________________________________________ gpfsug-discuss >> > mailing list gpfsug-discuss at spectrumscale.org >> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > >> >> - -- >> ____ >> || \\UTGERS, |----------------------*O*------------------------ >> ||_// the State | Ryan Novosielski - novosirj at rutgers.edu >> || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus >> || \\ of NJ | Office of Advanced Res. Comp. - MSB C630, Newark >> `' >> -----BEGIN PGP SIGNATURE----- >> >> iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXiDWSgAKCRCZv6Bp0Ryx >> vpCsAKCQ2ykmeycbOVrHTGaFqb2SsU26NwCg3YyYi4Jy2d+xZjJkE6Vfht8O8gM= >> =9rKb >> -----END PGP SIGNATURE----- >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From ulmer at ulmer.org Fri Jan 17 17:39:32 2020 From: ulmer at ulmer.org (Stephen Ulmer) Date: Fri, 17 Jan 2020 12:39:32 -0500 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests 
In-Reply-To: <45A9C66F-57E2-4EDA-B5AE-5775980DF88C@rutgers.edu>
References: <45A9C66F-57E2-4EDA-B5AE-5775980DF88C@rutgers.edu>
Message-ID: 

Having a sanctioned way to compile targeting a version of the kernel that is installed -- but not running -- would be helpful in many circumstances.

-- 
Stephen

> On Jan 17, 2020, at 11:58 AM, Ryan Novosielski wrote:
> 
> Yeah, support got back to me with a similar response earlier today that I'd not seen yet that made it a lot clearer what I "did wrong". This would appear to be the cause in my case:
> 
> [root at master config]# diff env.mcr env.mcr-1062.9.1
> 4,5c4,5
> < #define LINUX_KERNEL_VERSION 31000999
> < #define LINUX_KERNEL_VERSION_VERBOSE 310001062009001
> ---
>> #define LINUX_KERNEL_VERSION 31001062
>> #define LINUX_KERNEL_VERSION_VERBOSE 31001062009001
> 
> [...]
- MSB C630, Newark >>> `'

From heinrich.billich at id.ethz.ch Mon Jan 20 15:06:52 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Mon, 20 Jan 2020 15:06:52 +0000
Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk>
Message-ID:

Thank you, this did work. I installed efix9 for 5.0.4.1 using yum, with a plain "yum update" after installing the base version. I placed the efix and base RPMs in different yum repos and disabled the efix repo while installing the base version, and vice versa.
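For readers who want to reproduce the split-repo trick described above, a minimal sketch might look like the following; the repo ids, baseurl paths and package list are illustrative assumptions, not taken from the thread:

```shell
# Two repo definitions, e.g. in /etc/yum.repos.d/scale.repo (paths made up):
#
#   [scale-base]
#   name=Spectrum Scale 5.0.4.1 GA
#   baseurl=file:///install/repos/scale-5.0.4.1-base
#   gpgcheck=0
#
#   [scale-efix]
#   name=Spectrum Scale 5.0.4.1 efix9
#   baseurl=file:///install/repos/scale-5.0.4.1-efix9
#   gpgcheck=0
#
# Install the GA packages with the efix repo masked, then let a plain update
# pull in the efix RPMs (their incremental efix number makes them sort newer):
#
#   yum --disablerepo=scale-efix install gpfs.base gpfs.gpl gpfs.msg.en_US
#   yum --disablerepo=scale-base update
```

The only point of the two repos is that yum never sees the GA and efix versions of a package as candidates at the same time during the initial install.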
Kind regards,
Heiner

From: on behalf of IBM Spectrum Scale
Reply to: gpfsug main discussion list
Date: Wednesday, 15 January 2020 at 22:00
To: gpfsug main discussion list
Cc: "gpfsug-discuss-bounces at spectrumscale.org"
Subject: Re: [gpfsug-discuss] How to install efix with yum ?

>> I don't see any yum options which match rpm's '--force' option.

Actually, you do not need to use the --force option, since efix RPMs have an incremental efix number in the rpm name. An efix package provides update RPMs to be installed on top of the corresponding PTF GA version. When you install 5.0.4.1 efix9 and 5.0.4.1 is already installed on your system, "yum update" should work.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From: Jonathan Buzzard
To: "gpfsug-discuss at spectrumscale.org"
Date: 01/15/2020 02:09 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ?
Sent by: gpfsug-discuss-bounces at spectrumscale.org
________________________________

On 15/01/2020 18:30, Sanchez, Paul wrote:
> Yum generally only wants there to be a single version of any package (it
> is trying to eliminate conflicting provides/depends so that all of the
> packaging requirements are satisfied). So this alien packaging practice
> of installing an efix version of a package over the top of the base
> version is not compatible with yum.

I would at this juncture note that IBM should be appending the efix number to the RPM so that for example gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9, which would firstly make the problem go away, and secondly would allow one to know which version of GPFS you happen to have installed on a node without doing some sort of voodoo.

> The real issue for draconian sysadmins like us (whose systems must use
> and obey yum) is that there are files (*liblum.so) which are provided by
> the non-efix RPMS, but are not owned by the packages according to the
> RPM database since they're purposefully installed outside of RPM's
> tracking mechanism.

It's worse than that, because if you install the RPM directly, yum/dnf then start bitching about the RPM database being modified outside of themselves, and all sorts of useful information gets lost when you purge the package installation history to make the error go away.

> We work around this by repackaging the three affected RPMS to include
> the orphaned files from the original RPMs (and eliminating the related
> but problematic checks from the RPMs' scripts) so that our efix RPMs
> have been "un-efix-ified" and will install as expected when using "yum
> upgrade". To my knowledge no one's published a way to do this, so we
> all just have to figure this out and run rpmrebuild for ourselves.

IBM should be hanging their heads in shame if the replacement RPM is missing files.

JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From heinrich.billich at id.ethz.ch Mon Jan 20 15:20:46 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Mon, 20 Jan 2020 15:20:46 +0000
Subject: [gpfsug-discuss] Does an AFM recovery stop AFM from recalling files?
Message-ID:

Hello,

Do AFM recalls from home to cache still work when a fileset is in state "Recovery"? Are there any other states that allow writes/reads from cache but won't allow recalls from home?

We announced to users that they can continue to work on cache while a recovery is running. But we got reports that evicted files weren't available. NFS did work, I could read the files on home via the nfs mount in /var/mmfs/afm/-/. But AFM didn't recall. If recalls are done by entries in the AFM Queue I see why, but is this the case?

Kind regards, Heiner

From heinrich.billich at id.ethz.ch Mon Jan 20 15:15:33 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Mon, 20 Jan 2020 15:15:33 +0000
Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
In-Reply-To: References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch>
Message-ID:

Hello Venkat,

Thank you very much, upgrading to 5.0.4.1 did indeed fix the issue. AFM now compiles the list of pending changes in a few hours. Before, we estimated > 20 days.
We had to increase disk space in /var/mmfs/afm/ and /var/mmfs/tmp/ to allow AFM to store all intermediate file lists. The manual did recommend providing much disk space in /var/mmfs/afm/ only, but some processes doing a resync placed lists in /var/mmfs/tmp/, too.

Cheers, Heiner

From: on behalf of Venkateswara R Puvvada
Reply to: gpfsug main discussion list
Date: Tuesday, 14 January 2020 at 17:51
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?

Hi,

>The dirtyDirs file holds 130,000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing?

Yes, this is the major problem fixed as mentioned in the APAR below. The dirtyDirs file is opened for each entry in the dirtyDirDirents file, and this causes the performance overhead.

>At the moment all we can do is to wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release?

>There probably is no way to flush the pending queue entries while recovery is ongoing?

Later versions have the fix mentioned in that APAR, and I believe it should fix your current performance issue. Flushing the pending queue entries is not available as of today (5.0.4); we are currently working on this feature.

~Venkat (vpuvvada at in.ibm.com)

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 01/13/2020 05:29 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
Sent by: gpfsug-discuss-bounces at spectrumscale.org
________________________________

Hello Venkat,

thank you, this seems to match our issue. I did trace tspcachescan and do see a long series of open()/read()/close() to the dirtyDirs file. The dirtyDirs file holds 130,000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries.
Can we estimate how long it will take to finish processing?

tspcachescan does the following again and again for different directories:

11:11:36.837032 stat("/fs3101/XXXXX/.snapshots/XXXXX.afm.75872/yyyyy/yyyy", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0
11:11:36.837092 open("/var/mmfs/afm/fs3101-43/recovery/policylist.data.list.dirtyDirs", O_RDONLY) = 8
11:11:36.837127 fstat(8, {st_mode=S_IFREG|0600, st_size=32564140, ...}) = 0
11:11:36.837160 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fff96930000
11:11:36.837192 read(8, "539492355 65537 2795 553648131 "..., 8192) = 8192
11:11:36.837317 read(8, "Caches/com.apple.helpd/Generated"..., 8192) = 8192
11:11:36.837439 read(8, "ish\n539848852 1509237202 2795 5"..., 8192) = 8192
Many more reads
11:11:36.864104 close(8) = 0
11:11:36.864135 munmap(0x3fff96930000, 8192) = 0

A single iteration takes about 27 ms. Doing this 130,000 times would be o.k., but if tspcachescan does it 80M times we wait 600 hours. Is there a way to estimate how many iterations tspcachescan will do? The cache fileset holds 140M inodes.

At the moment all we can do is to wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? There probably is no way to flush the pending queue entries while recovery is ongoing?

I did open a case with IBM, TS003219893, and will continue there.

Kind regards, Heiner

From: on behalf of Venkateswara R Puvvada
Reply to: gpfsug main discussion list
Date: Monday, 13 January 2020 at 08:40
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?

AFM maintains an in-memory queue at the gateway node to keep track of changes happening on the fileset.
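To put the strace numbers above into perspective, the 600-hour figure follows directly from the arithmetic (one ~27 ms dirtyDirs pass per dirtyDirDirents entry); this is only a back-of-the-envelope sketch using the values quoted in the thread:

```shell
# Back-of-the-envelope check: one open/read/close pass over dirtyDirs takes
# ~27 ms; in the bad case tspcachescan repeats it once per dirtyDirDirents
# entry. Convert "n passes" into hours (LC_ALL=C keeps the decimal point).
scan_hours() {
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.1f\n", n * 0.027 / 3600 }'
}

scan_hours 130000     # one pass per dirtyDirs line: about an hour, harmless
scan_hours 80000000   # one pass per dirtyDirDirents entry: the ~600 hours above
```

The second figure is why the per-entry re-open of dirtyDirs (fixed by the APAR mentioned below) dominated the recovery time.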
If the in-memory queue is lost (memory pressure, daemon shutdown etc.), AFM runs a recovery process which involves creating the snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operation) dependencies, any changes to the AFM fileset during the recovery won't get replicated until the recovery completes. AFM does the home directory scan only for dirty directories, to get the names of the deleted and renamed files, because the old name of a renamed file and the name of a deleted file are not available at the cache on disk. Directories are made dirty when a rename or unlink operation is performed inside them. In your case it may be that all the directories became dirty due to the rename/unlink operations. The AFM recovery process is single threaded.

>Is this to be expected and normal behavior? What to do about it?
>Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very unpractical, or is there some better way?

Only for the dirty directories, see above.

>Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes.

There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home, so that the home directory scan can be avoided. There are some issues fixed in this regard. What is the Scale version?

https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436

~Venkat (vpuvvada at in.ibm.com)

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 01/08/2020 10:32 PM
Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
Sent by: gpfsug-discuss-bounces at spectrumscale.org
________________________________

Hello,

still new to AFM, so some basic questions on how Recovery works for a SW cache: we have an AFM SW cache in recovery mode -
recovery first did run policies on the cache cluster, but now I see a 'tspcachescan' process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/?/recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up filelists before recovery starts to update home. I see some risk: in this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can't evict?

I wonder

* Is this to be expected and normal behavior? What to do about it?
* Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very unpractical, or is there some better way?

Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes.

Thank you, I will welcome any clarification, advice or comments.

Kind regards, Heiner

--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

-------------- next part -------------- An HTML attachment was scrubbed...
URL:

From vpuvvada at in.ibm.com Mon Jan 20 17:32:07 2020
From: vpuvvada at in.ibm.com (Venkateswara R Puvvada)
Date: Mon, 20 Jan 2020 23:02:07 +0530
Subject: [gpfsug-discuss] Does an AFM recovery stop AFM from recalling files?
In-Reply-To: References: Message-ID:

While the recovery is running, reading the uncached files (evicted files) gets blocked until the recovery completes queueing the recovery operations. This is to make sure that recovery executes all the dependent operations first. For example, an evicted file might have been renamed in the cache but not yet replicated to the home site when the fileset went into the recovery state. Recovery first has to perform the rename operation at the home site and then allow the read operation on the file. Reads of uncached files may get blocked if the cache state is in the Recovery/NeedsResync/Unmounted/Dropped/Stopped states.

~Venkat (vpuvvada at in.ibm.com)

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 01/20/2020 08:50 PM
Subject: [EXTERNAL] [gpfsug-discuss] Does an AFM recovery stop AFM from recalling files?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hello,

Do AFM recalls from home to cache still work when a fileset is in state "Recovery"? Are there any other states that allow writes/reads from cache but won't allow recalls from home? We announced to users that they can continue to work on cache while a recovery is running. But we got reports that evicted files weren't available. NFS did work, I could read the files on home via the nfs mount in /var/mmfs/afm/-/. But AFM didn't recall. If recalls are done by entries in the AFM Queue I see why, but is this the case?
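The state list Venkat gives can be turned into a trivial guard before telling users to rely on recalls. The sketch below is purely illustrative: the helper function and the sample states are assumptions, and on a real cluster the state string would come from something like `mmafmctl <device> getstate` rather than being passed in by hand:

```shell
# Hypothetical helper: given an AFM cache-fileset state string, say whether
# reads of evicted (uncached) files are expected to succeed or to block,
# following the list of blocking states from the reply above.
recalls_ok() {
    case "$1" in
        Recovery|NeedsResync|Unmounted|Dropped|Stopped)
            echo "recalls may block ($1)" ;;
        *)
            echo "recalls expected to work ($1)" ;;
    esac
}

# Fed with made-up states for illustration:
recalls_ok Recovery   # -> recalls may block (Recovery)
recalls_ok Active     # -> recalls expected to work (Active)
```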
Kind regards, Heiner

From kkr at lbl.gov Thu Jan 23 22:16:20 2020
From: kkr at lbl.gov (Kristy Kallback-Rose)
Date: Thu, 23 Jan 2020 14:16:20 -0800
Subject: [gpfsug-discuss] UPDATE Planning US meeting for Spring 2020
In-Reply-To: References: <42F45E03-0AEC-422C-B3A9-4B5A21B1D8DF@lbl.gov>
Message-ID:

Thanks for your responses to the poll. We're still working on a venue, but working towards:

March 30 - New User Day (Tuesday)
April 1&2 - Regular User Group Meeting (Wednesday & Thursday)

Once it's confirmed we'll post something again.

Best, Kristy.

> On Jan 6, 2020, at 3:41 PM, Kristy Kallback-Rose wrote:
>
> Thank you to the 18 wonderful people who filled out the survey.
>
> However, there are well more than 18 people at any given UG meeting.
>
> Please submit your responses today, I promise, it's really short and even painless. 2020 (how did *that* happen?!) is here, we need to plan the next meeting
>
> Happy New Year.
>
> Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7
>
> Thanks,
> Kristy
>
>> On Dec 16, 2019, at 11:05 AM, Kristy Kallback-Rose wrote:
>>
>> Hello,
>>
>> It's time already to plan for the next US event. We have a quick (seriously, it should take on the order of 2 minutes) survey to capture your thoughts on location and date. It would help us greatly if you can please fill it out.
>>
>> Best wishes to all in the new year.
>>
>> -Kristy
>>
>> Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7

From agostino.funel at enea.it Mon Jan 27 10:26:55 2020
From: agostino.funel at enea.it (Agostino Funel)
Date: Mon, 27 Jan 2020 11:26:55 +0100
Subject: [gpfsug-discuss] How to upgrade GPFS 4.2.3.2 version?
Message-ID:

Hi,

I was trying to upgrade our IBM Spectrum Scale (General Parallel File System Standard Edition) version "4.2.3.2" for Linux_x86 systems, but from the Passport Advantage download site the only available versions are 5.*. Moreover, from the Fix Central repository the only available patches are for the 4.1.0 version. What should I do?

Thank you in advance.

Best regards,

Agostino Funel

--
Agostino Funel
DTE-ICT-HPC
ENEA
P.le E. Fermi 1
80055 Portici (Napoli) Italy
Phone: (+39) 081-7723575
Fax: (+39) 081-7723344
E-mail: agostino.funel at enea.it
WWW: http://www.afs.enea.it/funel

From S.J.Thompson at bham.ac.uk Mon Jan 27 10:29:52 2020
From: S.J.Thompson at bham.ac.uk (Simon Thompson)
Date: Mon, 27 Jan 2020 10:29:52 +0000
Subject: [gpfsug-discuss] How to upgrade GPFS 4.2.3.2 version?
In-Reply-To: References: Message-ID: <78B13AB8-5258-426D-AC9E-5D4045A1E554@bham.ac.uk>

4.2.3 on Fix Central is called IBM Spectrum Scale, not GPFS. Try:

https://www-945.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%20defined%20storage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=4.2.3&platform=Linux+64-bit,x86_64&function=all

Simon

On 27/01/2020, 10:27, "gpfsug-discuss-bounces at spectrumscale.org on behalf of agostino.funel at enea.it" wrote:

    Hi, I was trying to upgrade our IBM Spectrum Scale (General Parallel File System Standard Edition) version "4.2.3.2" for Linux_x86 systems, but from the Passport Advantage download site the only available versions are 5.*. Moreover, from the Fix Central repository the only available patches are for the 4.1.0 version. What should I do?
    Thank you in advance.

    Best regards,

    Agostino Funel

From stockf at us.ibm.com Mon Jan 27 11:33:11 2020
From: stockf at us.ibm.com (Frederick Stock)
Date: Mon, 27 Jan 2020 11:33:11 +0000
Subject: [gpfsug-discuss] How to upgrade GPFS 4.2.3.2 version?
In-Reply-To: <78B13AB8-5258-426D-AC9E-5D4045A1E554@bham.ac.uk> References: <78B13AB8-5258-426D-AC9E-5D4045A1E554@bham.ac.uk>
Message-ID: An HTML attachment was scrubbed...

From heinrich.billich at id.ethz.ch Wed Jan 29 13:05:30 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Wed, 29 Jan 2020 13:05:30 +0000
Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
Message-ID:

Hello,

Can I change the times at which the GUI runs HW_INVENTORY and related tasks?

We frequently get messages like

    gui_refresh_task_failed    GUI    WARNING    12 hours ago    The following GUI refresh task(s) failed: HW_INVENTORY

The tasks fail due to timeouts. Running the task manually most times succeeds. We do run two gui nodes per cluster and I noted that both servers seem to run HW_INVENTORY at the exact same time, which may lead to locking or congestion issues; actually the logs show messages like

    EFSSA0194I Waiting for concurrent operation to complete.

The gui calls "rinv" on the xCat servers. Rinv for a single little-endian server takes a long time - about 2-3 minutes - while it finishes in about 15s for a big-endian server. Hence the long runtime of rinv on little-endian systems may be an issue, too.

We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems (5.0.3.2 efix4).
We run a mix of ppc64 and ppc64le systems, with a separate xCat/ems server for each type. The GUI nodes are ppc64le. We did see this issue with several gpfs versions on the gui and with at least two ESS/xCat versions. Just to be sure I did purge the Postgresql tables.

I did try

/usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY
/usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug

And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult.

Thank you,

Heiner
--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

From u.sibiller at science-computing.de Thu Jan 30 14:43:54 2020
From: u.sibiller at science-computing.de (Ulrich Sibiller)
Date: Thu, 30 Jan 2020 15:43:54 +0100
Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
In-Reply-To: References: Message-ID:

On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote:
> Hello,
>
> Can I change the times at which the GUI runs HW_INVENTORY and related tasks?
>
> we frequently get messages like
>
> gui_refresh_task_failed    GUI    WARNING    12 hours ago    The following GUI
> refresh task(s) failed: HW_INVENTORY
>
> The tasks fail due to timeouts. Running the task manually most times succeeds. We do run two gui
> nodes per cluster and I noted that both servers seem to run HW_INVENTORY at the exact same time
> which may lead to locking or congestion issues, actually the logs show messages like
>
> EFSSA0194I Waiting for concurrent operation to complete.
>
> The gui calls "rinv" on the xCat servers. Rinv for a single little-endian server takes a long
> time - about 2-3 minutes - while it finishes in about 15s for a big-endian server.
>
> Hence the long runtime of rinv on little-endian systems may be an issue, too
>
> We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems (5.0.3.2 efix4). We run a mix
> of ppc64 and ppc64le systems, with a separate xCat/ems server for each type. The GUI nodes are ppc64le.
>
> We did see this issue with several gpfs versions on the gui and with at least two ESS/xCat versions.
>
> Just to be sure I did purge the Postgresql tables.
>
> I did try
>
> /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY
>
> /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug
>
> And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult.

I have seen the same on ppc64le. From time to time it recovers but then it starts again. The timeouts are okay, it is the hardware. I have opened a call at IBM and they suggested upgrading to ESS 5.3.5 because of the new firmwares, which I am currently doing. I can dig out more details if you want.

Uli

--
Science + Computing AG
Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke
Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Philippe Miltin
Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196

From heinrich.billich at id.ethz.ch Thu Jan 30 15:13:06 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Thu, 30 Jan 2020 15:13:06 +0000
Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
In-Reply-To: References: Message-ID: <6FA0119F-1067-4528-A4D8-54FA61F19BE0@id.ethz.ch>

Hello Uli,

Thank you. Yes, I noticed that some commands like 'ipmitool fru' or 'rinv' take long or very long on le systems - I've seen up to 7 minutes.
I tried to reset the bmc with 'ipmitool mc reset cold', but this breaks the os access to ipmi; you need to unload/load the kernel modules in the right order to fix it - or reboot. I also needed to restart goconserver to restore the console connection. Hence resetting the bmc is no real option for little-endian ESS servers. I don't know yet whether the bmc reset fixed anything.

So we'll wait for 5.3.5.

Kind regards,

Heiner
--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

On 30.01.20, 15:44, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Ulrich Sibiller" wrote:

    On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote:

    I have seen the same on ppc64le. From time to time it recovers but then it starts again. The timeouts are okay, it is the hardware. I have opened a call at IBM and they suggested upgrading to ESS 5.3.5 because of the new firmwares, which I am currently doing. I can dig out more details if you want.

    Uli

    --
    Science + Computing AG
    Vorstandsvorsitzender/Chairman of the board of management: Dr.
Martin Matzke
    Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein
    Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Philippe Miltin
    Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern
    Sitz/Registered Office: Tuebingen
    Registergericht/Registration Court: Stuttgart
    Registernummer/Commercial Register No.: HRB 382196

From rohwedder at de.ibm.com Thu Jan 30 15:31:32 2020
From: rohwedder at de.ibm.com (Markus Rohwedder)
Date: Thu, 30 Jan 2020 16:31:32 +0100
Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
In-Reply-To: References: Message-ID:

Hello,

The GUI tasks which are not daily tasks will start periodically at a random time. The exceptions are daily tasks, which are defined at fixed start times. It seems this is the issue you are experiencing, as the HW_INVENTORY task only runs once a day and starts at identical times on both GUI nodes.

Tweaking the cache database is unfortunately not a workaround, as the hard-coded and fixed starting times will be reset on every GUI restart.

I have created a task to address this issue in a future release. We could for example add a random delay to the daily tasks, or a fixed delay based on the number of GUI nodes that are active.

Mit freundlichen Grüßen / Kind regards

Dr.
Markus Rohwedder
Spectrum Scale GUI Development
Phone: +49 162 4159920
IBM Deutschland Research & Development
E-Mail: rohwedder at de.ibm.com
Am Weiher 24, 65451 Kelsterbach, Germany

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 29.01.2020 14:41
Subject: [EXTERNAL] [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hello,

Can I change the times at which the GUI runs HW_INVENTORY and related tasks?

We frequently get messages like

    gui_refresh_task_failed    GUI    WARNING    12 hours ago    The following GUI refresh task(s) failed: HW_INVENTORY

The tasks fail due to timeouts. Running the task manually most times succeeds. We do run two gui nodes per cluster and I noted that both servers seem to run HW_INVENTORY at the exact same time, which may lead to locking or congestion issues; actually the logs show messages like

    EFSSA0194I Waiting for concurrent operation to complete.

The gui calls "rinv" on the xCat servers. Rinv for a single little-endian server takes a long time - about 2-3 minutes - while it finishes in about 15s for a big-endian server. Hence the long runtime of rinv on little-endian systems may be an issue, too.

We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems (5.0.3.2 efix4). We run a mix of ppc64 and ppc64le systems, with a separate xCat/ems server for each type. The GUI nodes are ppc64le.

We did see this issue with several gpfs versions on the gui and with at least two ESS/xCat versions. Just to be sure I did purge the Postgresql tables.

I did try

/usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY
/usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug

And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult.
Thank you,

Heiner
--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed... URL:
-------------- next part --------------
A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL:
-------------- next part --------------
A non-text attachment was scrubbed... Name: 14272346.gif Type: image/gif Size: 4659 bytes Desc: not available URL:
-------------- next part --------------
A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL:

From ewahl at osc.edu  Thu Jan 30 15:52:27 2020
From: ewahl at osc.edu (Wahl, Edward)
Date: Thu, 30 Jan 2020 15:52:27 +0000
Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
In-Reply-To: References: Message-ID:

Interesting. We just deployed an ESS here and are running into a very similar problem with the GUI refresh, it appears. Takes my ppc64le's about 45 seconds to run rinv when they are idle. I had just opened a support case on this last evening. We're on ESS 5.3.4 as well. I will wait to see what support says.
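The fix Markus sketches earlier in this thread - staggering the fixed daily start time of HW_INVENTORY across GUI nodes so they stop colliding - can be illustrated with a few lines of code. This is only an illustrative sketch, not GUI code; the function and node names are invented:

```python
import hashlib

def staggered_start_minute(node_name, window_minutes=60):
    """Derive a stable per-node offset (in minutes) within a window, so a
    once-a-day task does not start at the same moment on every GUI node."""
    # Hash the node name so each node computes its own offset deterministically,
    # with no coordination between nodes required.
    digest = hashlib.sha256(node_name.encode("utf-8")).hexdigest()
    return int(digest, 16) % window_minutes

# Each node derives its own delay from its hostname:
offset_a = staggered_start_minute("gui-node-a")
offset_b = staggered_start_minute("gui-node-b")
```

The "fixed delay based on the number of GUI nodes" variant Markus mentions would work similarly: give node k of n an offset of k * window / n.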
Ed Wahl
Ohio Supercomputer Center

-----Original Message-----
From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Ulrich Sibiller
Sent: Thursday, January 30, 2020 9:44 AM
To: gpfsug-discuss at spectrumscale.org
Subject: Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes

On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote:
> Hello,
>
> Can I change the times at which the GUI runs HW_INVENTORY and related tasks?
>
> we frequently get messages like
>
> gui_refresh_task_failed    GUI    WARNING    12 hours ago
> The following GUI refresh task(s) failed: HW_INVENTORY
>
> The tasks fail due to timeouts. Running the task manually most times
> succeeds. We do run two gui nodes per cluster and I noted that both
> servers seem run the HW_INVENTORY at the exact same time which may
> lead to locking or congestion issues, actually the logs show messages
> like
>
> EFSSA0194I Waiting for concurrent operation to complete.
>
> The gui calls "rinv" on the xCat servers. Rinv for a single
> little-endian server takes a long time - about 2-3 minutes, while it finishes in about 15s for a big-endian server.
>
> Hence the long runtime of rinv on little-endian systems may be an
> issue, too
>
> We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems
> (5.0.3.2 efix4). We run a mix of ppc64 and ppc64le systems, with a separate xCat/ems server for each type. The GUI nodes are ppc64le.
>
> We did see this issue with several gpfs versions on the gui and with at least two ESS/xCat versions.
>
> Just to be sure I did purge the PostgreSQL tables.
>
> I did try
>
> /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY
>
> /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug
>
> And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult.

I have seen the same on ppc64le. From time to time it recovers but then it starts again. The timeouts are okay, it is the hardware.
I have opened a call at IBM and they suggested upgrading to ESS 5.3.5 because of the new firmwares, which I am currently doing. I can dig out more details if you want.

Uli
--
Science + Computing AG
Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke
Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Philippe Miltin
Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From janfrode at tanso.net  Thu Jan 30 16:59:40 2020
From: janfrode at tanso.net (Jan-Frode Myklebust)
Date: Thu, 30 Jan 2020 17:59:40 +0100
Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes
In-Reply-To: References: Message-ID:

I *think* this was a known bug in the Power firmware included with 5.3.4, and that it was fixed in FW860.70. Something hanging/crashing in IPMI.

-jf

On Thu, 30 Jan 2020 at 17:10, Wahl, Edward wrote:
> Interesting. We just deployed an ESS here and are running into a very
> similar problem with the gui refresh it appears. Takes my ppc64le's about
> 45 seconds to run rinv when they are idle.
> I had just opened a support case on this last evening. We're on ESS
> 5.3.4 as well. I will wait to see what support says.
>
> Ed Wahl
> Ohio Supercomputer Center
>
> -----Original Message-----
> From: gpfsug-discuss-bounces at spectrumscale.org < gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Ulrich Sibiller
> Sent: Thursday, January 30, 2020 9:44 AM
> To: gpfsug-discuss at spectrumscale.org
> Subject: Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY
> with two active GUI nodes
>
> On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote:
> > Hello,
> >
> > Can I change the times at which the GUI runs HW_INVENTORY and related tasks?
> >
> > we frequently get messages like
> >
> > gui_refresh_task_failed    GUI    WARNING    12 hours ago
> > The following GUI refresh task(s) failed: HW_INVENTORY
> >
> > The tasks fail due to timeouts. Running the task manually most times
> > succeeds. We do run two gui nodes per cluster and I noted that both
> > servers seem run the HW_INVENTORY at the exact same time which may
> > lead to locking or congestion issues, actually the logs show messages
> > like
> >
> > EFSSA0194I Waiting for concurrent operation to complete.
> >
> > The gui calls "rinv" on the xCat servers. Rinv for a single
> > little-endian server takes a long time - about 2-3 minutes, while it
> finishes in about 15s for a big-endian server.
> >
> > Hence the long runtime of rinv on little-endian systems may be an
> > issue, too
> >
> > We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems
> > (5.0.3.2 efix4). We run a mix of ppc64 and ppc64le systems, with a
> separate xCat/ems server for each type. The GUI nodes are ppc64le.
> >
> > We did see this issue with several gpfs versions on the gui and with at
> least two ESS/xCat versions.
> >
> > Just to be sure I did purge the PostgreSQL tables.
> >
> > I did try
> >
> > /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY
> >
> > /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug
> >
> > And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are
> difficult.
>
> I have seen the same on ppc64le.
> From time to time it recovers but then it starts again. The timeouts are okay, it is the hardware.
> I have opened a call at IBM and they suggested upgrading to ESS 5.3.5 because of the new
> firmwares which I am currently doing. I can dig out more details if you
> want.
>
> Uli
> --
> Science + Computing AG
> Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke
> Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein
> Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Philippe Miltin
> Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern
> Sitz/Registered Office: Tuebingen
> Registergericht/Registration Court: Stuttgart
> Registernummer/Commercial Register No.: HRB 382196
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From faryarag at in.ibm.com  Tue Jan 7 10:47:39 2020
From: faryarag at in.ibm.com (Farida Yaragatti1)
Date: Tue, 7 Jan 2020 16:17:39 +0530
Subject: [gpfsug-discuss] Introduction: IBM Elastic Storage System (ESS) 3000 (Spectrum Scale)
Message-ID:

Hello All,

My name is Farida Yaragatti and I am part of the IBM Elastic Storage System (ESS) 3000 team, India Systems Development Lab, IBM India Pvt. Ltd. IBM Elastic Storage System (ESS) 3000 installs and upgrades GPFS using containerization. For more details, please go through the following links, which were published and released on December 9th, 2019.

The IBM Lab Services team can install an Elastic Storage Server 3000 as an included service as part of the acquisition. Alternatively, the customer's IT team can do the installation.

- The ESS 3000 quick deployment documentation is at the following web page: https://ibm.biz/Bdz7qb

The following documents provide information that you need for proper deployment, installation, and upgrade procedures for an IBM ESS 3000:

- IBM ESS 3000: Planning for the system, service maintenance packages, and service procedures: https://ibm.biz/Bdz7qp

Our team would like to participate in Spectrum Scale user group events happening across the world, as we are using Spectrum Scale in 2020. Please let us know how we can initiate or post our submission for the events.

Regards,
Farida Yaragatti
ESS Deployment (Testing Team), India Systems Development Lab
IBM India Pvt. Ltd., EGL D Block, 6th Floor, Bangalore, Karnataka, 560071, India
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Renar.Grunenberg at huk-coburg.de  Tue Jan 7 11:58:13 2020
From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar)
Date: Tue, 7 Jan 2020 11:58:13 +0000
Subject: [gpfsug-discuss] Introduction: IBM Elastic Storage System (ESS) 3000 (Spectrum Scale)
In-Reply-To: References: Message-ID: <52b5b5557f3f44ce890fe141b670014b@huk-coburg.de>

Hallo Farida, can you check your links? It seems these don't work for people outside the IBM network.

Renar Grunenberg
Abteilung Informatik - Betrieb
HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44110
Telefax: 09561 96-44104
E-Mail: Renar.Grunenberg at huk-coburg.de
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer, Sarah Rössler, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________
Von: gpfsug-discuss-bounces at spectrumscale.org Im Auftrag von Farida Yaragatti1
Gesendet: Dienstag, 7.
Januar 2020 11:48
An: gpfsug-discuss at spectrumscale.org
Cc: Wesley Jones ; Mohsin A Inamdar ; Sumit Kumar43 ; Ricardo Daniel Zamora Ruvalcaba ; Rajan Mishra1 ; Pramod T Achutha ; Rezaul Islam ; Ravindra Sure
Betreff: [gpfsug-discuss] Introduction: IBM Elastic Storage System (ESS) 3000 (Spectrum Scale)

Hello All,

My name is Farida Yaragatti and I am part of the IBM Elastic Storage System (ESS) 3000 team, India Systems Development Lab, IBM India Pvt. Ltd. IBM Elastic Storage System (ESS) 3000 installs and upgrades GPFS using containerization. For more details, please go through the following links, which were published and released on December 9th, 2019.

The IBM Lab Services team can install an Elastic Storage Server 3000 as an included service as part of the acquisition. Alternatively, the customer's IT team can do the installation.

> The ESS 3000 quick deployment documentation is at the following web page: https://ibm.biz/Bdz7qb

The following documents provide information that you need for proper deployment, installation, and upgrade procedures for an IBM ESS 3000:

> IBM ESS 3000: Planning for the system, service maintenance packages, and service procedures: https://ibm.biz/Bdz7qp

Our team would like to participate in Spectrum Scale user group events happening across the world, as we are using Spectrum Scale in 2020. Please let us know how we can initiate or post our submission for the events.

Regards,
Farida Yaragatti
ESS Deployment (Testing Team), India Systems Development Lab
IBM India Pvt. Ltd., EGL D Block, 6th Floor, Bangalore, Karnataka, 560071, India
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rp2927 at gsb.columbia.edu  Tue Jan 7 16:32:26 2020
From: rp2927 at gsb.columbia.edu (Popescu, Razvan)
Date: Tue, 7 Jan 2020 16:32:26 +0000
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> Message-ID: <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu>

Hi Kuei-Yu (et al.)

Happy New Year! I'd like to reiterate my follow-up question to your comments - in particular to the line copied below, which mentions a behavior that I'm seeking for this command, but cannot reproduce (in version 5.0.3 on Linux), namely (with my highlights):

# mmedquota -d -u pfs004 fs9:fset7   <=== run mmedquota -d -u to get default limits

The reduction of scope to a named filesystem/fileset is what I'm seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user's default restoration to *all filesystems and filesets*. Are you using a different version? Or a different implementation? I'm running SS 5.0.3 on Linux x64.

I apologize for having to press this point, but this matter is of a certain importance to us, and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I'm trying with a new case now, but my hopes are low.) You seem to be an IBM insider, so I count on your help. Sorry for the insistence.

Best,
Razvan Popescu
Columbia Univ.
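The quota entry types being discussed in this thread ("i" initial, "d_fset" tracking the fileset default, "e" explicit) behave like a small state machine, per Kuei-Yu's earlier description: mmedquota with limits makes an entry explicit, mmedquota -d moves an explicit entry to the fileset default, and nothing moves an entry back to "i". A sketch of those rules as described in the thread - illustrative only, not IBM code:

```python
# Entry-type codes as shown in the mmrepquota output quoted in this thread.
TRANSITIONS = {
    ("i", "set_explicit"): "e",         # mmedquota with explicit limits
    ("d_fset", "set_explicit"): "e",
    ("e", "set_explicit"): "e",
    ("e", "revert_default"): "d_fset",  # mmedquota -d, per the example below
    ("d_fset", "revert_default"): "d_fset",
    # No transition leads back to "i": per the thread, an explicit entry
    # cannot be reverted to the initial state.
}

def apply_action(entry_type, action):
    """Return the resulting entry type; unknown combinations leave it unchanged."""
    return TRANSITIONS.get((entry_type, action), entry_type)

print(apply_action("e", "revert_default"))  # -> d_fset
```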
--

From: on behalf of Kuei-Yu Wang-Knop
Reply-To: gpfsug main discussion list
Date: Thursday, December 19, 2019 at 3:56 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

Razvan,

mmedquota -d -u fs:fset:

-d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command.

This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". You may need to play a little bit with your system to get the result, as you can have default quota per file system set and default quota per fileset enabled.

An example to illustrate: user pfs004 in filesystem fs9 and fileset fset7 has explicit quota set:

# mmrepquota -u -v fs9 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e   <=== explicit

# mmlsquota -d fs9:fset7
                         Default Block Limits(KB) | Default File Limits
Filesystem Fileset type  quota   limit            | quota  limit  entryType
fs9        fset7   USR   102400  1048576          | 10000  0      default on   <=== default quota limits for fs9:fset7, the default
fs9        fset7   GRP   0       0                | 0      0      i

# mmlsquota -u pfs004 fs9:fset7
                        Block Limits                        | File Limits
Filesystem Fileset type KB   quota  limit   in_doubt grace  | files quota limit in_doubt grace Remarks
fs9        fset7   USR  1088 102400 1048576 0        none   | 13    10000 33333 0        none   <=== explicit

# mmedquota -d -u pfs004 fs9:fset7   <=== run mmedquota -d -u to get default limits

# mmlsquota -u pfs004 fs9:fset7
                        Block Limits                        | File Limits
Filesystem Fileset type KB   quota  limit   in_doubt grace  | files quota limit in_doubt grace Remarks
fs9        fset7   USR  1088 102400 1048576 0        none   | 13    10000 0     0        none   <=== takes the default value

# mmrepquota -u -v fs9:fset7 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset   <=== now user pfs004 in fset7 takes the default limits
#
------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845)
433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: gpfsug main discussion list
Date: 12/19/2019 02:28 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org
________________________________

I see. May I ask one follow-up question, please: what is "mmedquota -d -u" supposed to do in this case?

Really appreciate your assistance.

Razvan
--

From: on behalf of Kuei-Yu Wang-Knop
Reply-To: gpfsug main discussion list
Date: Thursday, December 19, 2019 at 2:25 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

>> To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "I". (I hope I'm not talking nonsense here)

Currently there is no function to revert an explicit quota entry (e) to initial (i) entry.

Kuei
------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: gpfsug main discussion list
Date: 12/19/2019 02:18 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org
________________________________

Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was Ok.
During operations I have edited *individual* quotas, for example to increase certain users' allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota! For example, I have used one user account to test the mmedquota command, setting his limits to a certain value (just testing). I'd like now to make that user's quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits.

To make it more technical: this fellow's quota entryType is now "e". I want to change it back to entryType "I". (I hope I'm not talking nonsense here)

mmedquota's "-d" option is supposed to reinstate the defaults, but it doesn't seem to work for fileset-based quotas?!?

Razvan
--

From: on behalf of Kuei-Yu Wang-Knop
Reply-To: gpfsug main discussion list
Date: Thursday, December 19, 2019 at 2:06 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

It sounds like you would like to have default perfileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled.

# mmlsfs fs9 -Q --perfileset-quota
flag                value                    description
------------------- ------------------------ -----------------------------------
 -Q                 user;group;fileset       Quotas accounting enabled
                    user;fileset             Quotas enforced
                    user;group;fileset       Default quotas enabled
 --perfileset-quota Yes                      Per-fileset quota enforcement
#

Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9". Then set the default quota for this fileset using mmdefedquota:

# mmdefedquota -u fs9:fset9
..
*** Edit quota limits for USR DEFAULT entry for fileset fset9
NOTE: block limits will be rounded up to the next multiple of the block size.
block units may be: K, M, G, T or P, inode units may be: K, M or G.
fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K)
inodes in use: 0, limits (soft = 10000, hard = 22222)
...

Hope that this helps.
------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: "gpfsug-discuss at spectrumscale.org"
Date: 12/19/2019 12:22 PM
Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org
________________________________

Hi,

I'd like to revert a user's quota to the fileset's default, but "mmedquota -d -u" fails because I have not set a filesystem default:

[root at xxx]# mmedquota -d -u user
gsb USR default quota is off

(SpectrumScale 5.0.3 Standard Ed. on RHEL7 x86)

Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn't seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I'd like to be able to handle this matter. Have I hit a limitation of the implementation? Any workaround, if that's the case?
Many thanks,
Razvan Popescu
Columbia Business School
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed... URL:
-------------- next part --------------
A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL:
-------------- next part --------------
A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 107 bytes Desc: image002.gif URL:
-------------- next part --------------
A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 108 bytes Desc: image003.gif URL:

From novosirj at rutgers.edu  Tue Jan 7 17:06:54 2020
From: novosirj at rutgers.edu (Ryan Novosielski)
Date: Tue, 7 Jan 2020 17:06:54 +0000
Subject: [gpfsug-discuss] Snapshot migration of any kind?
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

We're in the process of figuring out how to rewrite our filesystems in order to take advantage of the 5.0.x variable subblock size enhancement. However, we keep generally 6 weeks of snapshots as a courtesy to the user community.

I assume the answer is no, but is there any option for migrating snapshots, or barring that, any recommended reading for what you /can/ do with a snapshot beyond create/destroy? Thanks in advance. I'm having trouble coming up with any useful search terms.

- --
____
|| \\UTGERS,    |----------------------*O*------------------------
||_// the State | Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr.
Technologist - 973/972.0922 ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Res. Comp. - MSB C630, Newark
`'
-----BEGIN PGP SIGNATURE-----

iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXhS6qQAKCRCZv6Bp0Ryx
vicZAJsHI/z7DXc8EV+sqExhVwMPomoBSQCgyIHgS1Z7RlhQMYAySvDOINAUWPk=
=CqPO
-----END PGP SIGNATURE-----

From kywang at us.ibm.com  Tue Jan 7 17:11:35 2020
From: kywang at us.ibm.com (Kuei-Yu Wang-Knop)
Date: Tue, 7 Jan 2020 12:11:35 -0500
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To: <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu>
References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu>
Message-ID:

Razvan,

Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration and the code level to figure out whether what you are seeing is a problem or just an expected behavior or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario.

Thanks,

Kuei
------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: gpfsug main discussion list
Date: 01/07/2020 11:32 AM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi Kuei-Yu (et al.)

Happy New Year! I'd like to reiterate my follow-up question to your comments -
in particular to the line copied below, which mentions a behavior that I'm seeking for this command, but cannot reproduce (in version 5.0.3 on Linux), namely (with my highlights):

# mmedquota -d -u pfs004 fs9:fset7   <=== run mmedquota -d -u to get default limits

The reduction of scope to a named filesystem/fileset is what I'm seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user's default restoration to *all filesystems and filesets*. Are you using a different version? Or a different implementation? I'm running SS 5.0.3 on Linux x64.

I apologize for having to press this point, but this matter is of a certain importance to us, and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I'm trying with a new case now, but my hopes are low.) You seem to be an IBM insider, so I count on your help. Sorry for the insistence.

Best,
Razvan Popescu
Columbia Univ.
--

From: on behalf of Kuei-Yu Wang-Knop
Reply-To: gpfsug main discussion list
Date: Thursday, December 19, 2019 at 3:56 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

Razvan,

mmedquota -d -u fs:fset:

-d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command.

This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". You may need to play a little bit with your system to get the result, as you can have default quota per file system set and default quota per fileset enabled.
An example to illustrate: user pfs004 in filesystem fs9 and fileset fset7 has an explicit quota set:

# mmrepquota -u -v fs9 | grep pfs004
pfs004  fset7  USR  1088  102400  1048576  0  none |  13  10000  33333  0  none  e   <=== explicit

# mmlsquota -d fs9:fset7
                            Default Block Limits(KB) | Default File Limits
Filesystem  Fileset  type   quota   limit            | quota  limit  entryType
fs9         fset7    USR    102400  1048576          | 10000  0      default on   <=== default quota limits for fs9:fset7
fs9         fset7    GRP    0       0                | 0      0      i

# mmlsquota -u pfs004 fs9:fset7
                            Block Limits                          | File Limits
Filesystem  Fileset  type   KB    quota   limit    in_doubt grace | files  quota  limit  in_doubt  grace  Remarks
fs9         fset7    USR    1088  102400  1048576  0        none  | 13     10000  33333  0         none   <=== explicit

# mmedquota -d -u pfs004 fs9:fset7   <=== run mmedquota -d -u to get default limits

# mmlsquota -u pfs004 fs9:fset7
                            Block Limits                          | File Limits
Filesystem  Fileset  type   KB    quota   limit    in_doubt grace | files  quota  limit  in_doubt  grace  Remarks
fs9         fset7    USR    1088  102400  1048576  0        none  | 13     10000  0      0         none   <=== takes the default value

# mmrepquota -u -v fs9:fset7 | grep pfs004
pfs004  fset7  USR  1088  102400  1048576  0  none |  13  10000  0  0  none  d_fset   <=== now user pfs004 in fset7 takes the default limits
#

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: gpfsug main discussion list
Date: 12/19/2019 02:28 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org

I see.

May I ask one follow-up question, please: what is "mmedquota -d -u" supposed to do in this case?

Really appreciate your assistance.
Razvan

--

From: on behalf of Kuei-Yu Wang-Knop
Reply-To: gpfsug main discussion list
Date: Thursday, December 19, 2019 at 2:25 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default

>> To make it more technical... This fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here)

Currently there is no function to revert an explicit quota entry (e) to an initial (i) entry.

Kuei

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: gpfsug main discussion list
Date: 12/19/2019 02:18 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Thanks for your kind reply. My problem is different though.
!?! Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default perfileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled. # mmlsfs fs9 -Q --perfileset-quota flag value description ------------------- ------------------------ ----------------------------------- -Q user;group;fileset Quotas accounting enabled user;fileset Quotas enforced user;group;fileset Default quotas enabled --perfileset-quota Yes Per-fileset quota enforcement # Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9" Then set the default quota for this fileset using mmdefedquota" # mmdefedquota -u fs9:fset9 .. *** Edit quota limits for USR DEFAULT entry for fileset fset9 NOTE: block limits will be rounded up to the next multiple of the block size. block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset"Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? 
From: "Popescu, Razvan"
To: "gpfsug-discuss at spectrumscale.org"
Date: 12/19/2019 12:22 PM
Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi,

I'd like to revert a user's quota to the fileset's default, but "mmedquota -d -u" fails because I have not set a filesystem default:

[root at xxx]# mmedquota -d -u user
gsb USR default quota is off

(SpectrumScale 5.0.3 Standard Ed. on RHEL7 x86)

Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn't seem useful so I never defined one; however, I do have multiple fileset-level default quotas, and this is the level at which I'd like to be able to handle this matter. Have I hit a limitation of the implementation? Any workaround, if that's the case?

Many thanks,
Razvan Popescu
Columbia Business School

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
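[A note on the "default quota is off" error above: "mmedquota -d" can only fall back to a default entry that actually exists at the scope in question, so a first sanity check is whether the fileset's USR row shows entryType "default on" in "mmlsquota -d" output. The sketch below only illustrates parsing that output layout; the sample text and the fs9:fset7 names are lifted from the worked example elsewhere in this thread, not from a live cluster.]

```shell
# Check whether USR default quotas are enabled for a fileset before trying
# "mmedquota -d".  On a real cluster you would pipe the actual command:
#   mmlsquota -d fs9:fset7 | awk ...
# Here the output is embedded as sample text so the parsing can be shown.
sample='fs9   fset7   USR   102400   1048576 | 10000   0   default on
fs9   fset7   GRP   0   0 | 0   0   i'

# A USR row ending in "default on" means mmedquota -d has a default to apply.
if printf '%s\n' "$sample" | awk '$3 == "USR"' | grep -q 'default on$'; then
  echo "USR default quota is enabled for fs9:fset7"
else
  echo "USR default quota is off -- enable it with mmdefquotaon/mmdefedquota"
fi
# prints: USR default quota is enabled for fs9:fset7
```

[The same pipeline over real "mmlsquota -d" output would show whether the error is expected at a given scope.]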
From rp2927 at gsb.columbia.edu Tue Jan 7 17:23:40 2020
From: rp2927 at gsb.columbia.edu (Popescu, Razvan)
Date: Tue, 7 Jan 2020 17:23:40 +0000
Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default
Message-ID: <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu>

Kuei,

Thanks for replying. I did open a ticket before the holidays, but it didn't yield a solution. I just opened another one...

My very specific question to you is: "Where have you seen this particular syntax work?" The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: "fs9:fset7"), and my testing rejected such a parameter. What made you think this would work?

# mmedquota -d -u pfs004 fs9:fset7
                         ^^^^^^^^^^^^^

Many thanks!
Razvan

--

From: Kuei-Yu Wang-Knop
Date: Tuesday, January 7, 2020 at 12:11 PM
To: "Popescu, Razvan", gpfsug main discussion list
Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default

Razvan,

Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic.
We would look at your system configuration, the code level to figure out whether what you are seeing is a problem or it is just an expected behavior or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario.

Thanks,

Kuei

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com
From kywang at us.ibm.com Tue Jan 7 19:13:13 2020
From: kywang at us.ibm.com (Kuei-Yu Wang-Knop)
Date: Tue, 7 Jan 2020 14:13:13 -0500
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To: <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu>
References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu>
Message-ID: 

Razvan,

You are right: the command below yields an error (as it interprets fs9:fset7 as a user name). Apologies for the confusion. This is what I see on my system:

# mmedquota -d -u pfs004 fs9:fset7
fs1 USR default quota is off
fs9:fset7 is not valid user
#

I must have confused the command names in the previous note. Instead of the "mmedquota -d" command I meant "mmlsquota -d":

# mmlsquota -d fs9:fset7
                            Default Block Limits(KB) | Default File Limits
Filesystem  Fileset  type   quota   limit            | quota  limit  entryType
fs9         fset7    USR    102400  1048576          | 10000  0      default on
fs9         fset7    GRP    0       0                | 0      0      i

# mmlsquota -d fs9:fset9
                            Default Block Limits(KB) | Default File Limits
Filesystem  Fileset  type   quota   limit            | quota  limit  entryType
fs9         fset9    USR    102400  1048576          | 10000  22222  default on
fs9         fset9    GRP    0       0                | 0      0      default off

# mmlsquota -d -u fs9
                   Default Block Limits(KB) | Default File Limits
Filesystem  type   quota   limit            | quota  limit  Remarks
fs9         USR    102400  1048576          | 10000  0
#

Upon further investigation, the current behavior of 'mmedquota -d -u <user>' is to restore the default quotas for the particular user on all filesystems and filesets. The ability to restore the default limits of a user for selected filesets and filesystems is not available at the moment.
Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop , gpfsug main discussion list Date: 01/07/2020 12:23 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn?t yield a solution. I just opened another one? My very specific question to you is: ?Where have you seen this particular syntax work?? The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: ?fs9:fset7?) and my testing rejected such a parameter. What made you think this would work? ? # mmedquota -d -u pfs004 fs9:fset7 ^^^^^^^^^^^^^ Many thanks! Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration, the code level to figure out whether what you are seeing is a problem or it is just an expected behavior or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---01/07/2020 11:32:42 AM---Hi Kuei-Yu (et al.) Happy New Year!"Popescu, Razvan" ---01/07/2020 11:32:42 AM---Hi Kuei-Yu (et al.) Happy New Year! 
From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Kuei-Yu (et al.) Happy New Year! I?d like to reiterate my follow-up question to your comments ? in particular to the line copied below, which mentions a behavior that I?m seeking for this command, but cannot reproduce (in vers, 5.0.3 at Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I?m seeking, * but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user?s default restoration to *all filesystems and filesets* Are you using a different version? Or a different implementations? I?m running SS 5.0.3 on Linux x64. I apologize for having to press this point, but this matter is of a certain importance to us and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I?m trying with a new case now, but my hopes are low). You seem to be an IBM insider, so ?. I count on your help ?. Sorry, for the insistence. ? Best, Razvan Popescu Columbia Univ. -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". 
You may need to play a little bit with your system to get the result as you can have default quota per file system set and default quota per fileset enabled. An exemple to illustrate User pfs004 in filesystem fs9 and fileset fset7 has explicit quota set: # mmrepquota -u -v fs9 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e <=== explicit # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on <=== default quota limits for fs9:fset7, the default fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none <=== explicit # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none <=== takes the default value # mmrepquota -u -v fs9:fset7 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset <=== now user pfs004 in fset7 takes the default limits # ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---12/19/2019 02:28:51 PM---I see. May I ask one follow-up question, please: what"Popescu, Razvan" ---12/19/2019 02:28:51 PM---I see. May I ask one follow-up question, please: what is ?mmedquota -d -u ? supposed From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org I see. 
May I ask one follow-up question, please: what is ?mmedquota -d -u ? supposed to do in this case? Really appreciate your assistance. Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. (I hope I?m not talking nonsense here) Currently there is no function to revert an explicit quota entry (e) to initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different tho"Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different though. From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was Ok. During operations I have edited *individual* quotas, for example to increase certain user?s allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota ! For example, I have used one user account to test the mmedquota command setting his limits to a certain value (just testing). I?d like now to make that user?s quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits. To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. 
(I hope I?m not talking nonsense here) mmedquota?s ?-d? option is supposed to reinstate the defaults, but it doesn?t seem to work for fileset based quotas ? !?! Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default perfileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled. # mmlsfs fs9 -Q --perfileset-quota flag value description ------------------- ------------------------ ----------------------------------- -Q user;group;fileset Quotas accounting enabled user;fileset Quotas enforced user;group;fileset Default quotas enabled --perfileset-quota Yes Per-fileset quota enforcement # Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9" Then set the default quota for this fileset using mmdefedquota" # mmdefedquota -u fs9:fset9 .. *** Edit quota limits for USR DEFAULT entry for fileset fset9 NOTE: block limits will be rounded up to the next multiple of the block size. block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com Inactive hide details for "Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset"Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? 
From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I'd like to revert a user's quota to the fileset's default, but "mmedquota -d -u" fails because I have not set a filesystem default. [root at xxx]# mmedquota -d -u user gsb USR default quota is off (Spectrum Scale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn't seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I'd like to be able to handle this matter. Have I hit a limitation of the implementation? Any workaround, if that's the case? Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From carlz at us.ibm.com Tue Jan 7 19:28:46 2020 From: carlz at us.ibm.com (Carl Zetie - carlz@us.ibm.com) Date: Tue, 7 Jan 2020 19:28:46 +0000 Subject: [gpfsug-discuss] Spectrum Scale 5.0.5 Beta participation Message-ID: <3A32DB49-9552-465F-9727-C8E661A7E6EC@us.ibm.com> We are accepting nominations for IBM Spectrum Scale 5.0.5 Beta participation here: https://www.surveygizmo.com/s3/5356255/ee853c3af96a The Beta begins in mid-February. Please note that you'll need your IBM account rep to nominate you. Carl Zetie Program Director Offering Management Spectrum Scale & Spectrum Discover ---- (919) 473 3318 ][ Research Triangle Park carlz at us.ibm.com From rp2927 at gsb.columbia.edu Tue Jan 7 19:39:22 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Tue, 7 Jan 2020 19:39:22 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> Message-ID: <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Thank you very much, Kuei.
It's now clear where we stand, even though I would have liked to have that added selectivity in mmedquota. We use filesets to separate classes of projects and classes of storage (backup/noBackup, for example), and thus one user or one group (= project) has various resource allocations across filesets (enforced by quotas). Sometimes we need to roll back only certain allocations and leave others untouched. If no one else has encountered this need so far, I guess we twisted the model a bit too much. Maybe we can add this option to some list of desired new features for coming versions?... Thanks, Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 2:13 PM To: "Popescu, Razvan" Cc: gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, You are right, the command below yields an error (as it interprets fs9:fset7 as a user name). Apologies for the confusion. This is what I see in my system: # mmedquota -d -u pfs004 fs9:fset7 fs1 USR default quota is off fs9:fset7 is not valid user # I must have confused the command names in the previous note. Instead of the "mmedquota -d" command I meant "mmlsquota -d". # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -d fs9:fset9 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset9 USR 102400 1048576 | 10000 22222 default on fs9 fset9 GRP 0 0 | 0 0 default off # mmlsquota -d -u fs9 Default Block Limits(KB) | Default File Limits Filesystem type quota limit | quota limit Remarks fs9 USR 102400 1048576 | 10000 0 # Upon further investigation, the current behavior of 'mmedquota -d -u ' is to restore the default quotas for the particular user on all filesystems and filesets.
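Since a plain "mmedquota -d -u" resets the user's entries everywhere, one rough workaround is to first list which quota entries are explicit before resetting. A sketch, assuming the mmrepquota field layout shown elsewhere in this thread (last field is the entryType) and hypothetical user names; in practice you would pipe real "mmrepquota -u -v" output instead of the sample text:

```shell
# Sample lines in the layout of "mmrepquota -u -v" output quoted in this
# thread (hypothetical users; the last field is the quota entryType).
sample='pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e
pfs005 fset7 USR 0 102400 1048576 0 none | 0 10000 0 0 none d_fset'

# Print users whose entryType is explicit ("e") -- the entries that a
# blanket "mmedquota -d -u" would reset across all filesystems and filesets.
printf '%s\n' "$sample" | awk '$NF == "e" { print $1 }'
```

The recorded limits could then be re-applied selectively afterwards (e.g. with mmsetquota), though that leaves those entries explicit rather than default-tracking.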
The ability to restore the default limits of a user for selected filesets and filesystems is not available at the moment. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop , gpfsug main discussion list Date: 01/07/2020 12:23 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default ________________________________ Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn't yield a solution. I just opened another one. My very specific question to you is: "Where have you seen this particular syntax work?" The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: "fs9:fset7") and my testing rejected such a parameter. What made you think this would work? # mmedquota -d -u pfs004 fs9:fset7 ^^^^^^^^^^^^^ Many thanks! Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration and the code level to figure out whether what you are seeing is a problem, an expected behavior, or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario.
Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Kuei-Yu (et al.), Happy New Year! I'd like to reiterate my follow-up question to your comments, in particular to the line copied below, which mentions a behavior that I'm seeking for this command but cannot reproduce (in version 5.0.3 on Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I'm seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user's default restoration to *all filesystems and filesets*. Are you using a different version? Or a different implementation? I'm running SS 5.0.3 on Linux x64. I apologize for having to press this point, but this matter is of a certain importance to us and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I'm trying with a new case now, but my hopes are low.) You seem to be an IBM insider, so... I count on your help. Sorry for the insistence. Best, Razvan Popescu Columbia Univ.
-- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". You may need to play a little bit with your system to get the result, as you can have default quota per file system set and default quota per fileset enabled. An example to illustrate: user pfs004 in filesystem fs9 and fileset fset7 has explicit quota set: # mmrepquota -u -v fs9 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e <=== explicit # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on <=== default quota limits for fs9:fset7, the default fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none <=== explicit # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none <=== takes the default value # mmrepquota -u -v fs9:fset7 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset <=== now user pfs004 in fset7 takes the default limits # ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845)
433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org I see. May I ask one follow-up question, please: what is "mmedquota -d -u" supposed to do in this case? Really appreciate your assistance. Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical... This fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here) Currently there is no function to revert an explicit quota entry (e) to an initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was OK.
During operations I have edited *individual* quotas, for example to increase certain users' allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota! For example, I have used one user account to test the mmedquota command, setting his limits to a certain value (just testing). I'd like now to make that user's quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits. To make it more technical... This fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here) mmedquota's "-d" option is supposed to reinstate the defaults, but it doesn't seem to work for fileset-based quotas!?! Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default perfileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9: file system fs9 has default quota on and --perfileset-quota enabled. # mmlsfs fs9 -Q --perfileset-quota flag value description ------------------- ------------------------ ----------------------------------- -Q user;group;fileset Quotas accounting enabled user;fileset Quotas enforced user;group;fileset Default quotas enabled --perfileset-quota Yes Per-fileset quota enforcement # Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9". Then set the default quota for this fileset using "mmdefedquota": # mmdefedquota -u fs9:fset9 .. *** Edit quota limits for USR DEFAULT entry for fileset fset9 NOTE: block limits will be rounded up to the next multiple of the block size.
block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I'd like to revert a user's quota to the fileset's default, but "mmedquota -d -u" fails because I have not set a filesystem default. [root at xxx]# mmedquota -d -u user gsb USR default quota is off (Spectrum Scale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn't seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I'd like to be able to handle this matter. Have I hit a limitation of the implementation? Any workaround, if that's the case?
Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From bbanister at jumptrading.com Tue Jan 7 19:40:43 2020 From: bbanister at jumptrading.com (Bryan Banister) Date: Tue, 7 Jan 2020 19:40:43 +0000 Subject: [gpfsug-discuss] Spectrum Scale 5.0.5 Beta participation In-Reply-To: <3A32DB49-9552-465F-9727-C8E661A7E6EC@us.ibm.com> References: <3A32DB49-9552-465F-9727-C8E661A7E6EC@us.ibm.com> Message-ID: Hi Carl, Without going through the form completely, is there a short breakdown of what features are available to test in the 5.0.5 beta? -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Carl Zetie - carlz at us.ibm.com Sent: Tuesday, January 7, 2020 1:29 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Spectrum Scale 5.0.5 Beta participation [EXTERNAL EMAIL] We are accepting nominations for IBM Spectrum Scale 5.0.5 Beta participation here: https://www.surveygizmo.com/s3/5356255/ee853c3af96a The Beta begins in mid-February. Please note that you'll need your IBM account rep to nominate you.
Carl Zetie Program Director Offering Management Spectrum Scale & Spectrum Discover ---- (919) 473 3318 ][ Research Triangle Park carlz at us.ibm.com From kywang at us.ibm.com Tue Jan 7 19:50:31 2020 From: kywang at us.ibm.com (Kuei-Yu Wang-Knop) Date: Tue, 7 Jan 2020 14:50:31 -0500 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Message-ID: Razvan, You can open an RFE (Request for Enhancement) for this issue if you would like this function to be considered for future versions. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop Cc: gpfsug main discussion list Date: 01/07/2020 02:39 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Thank you very much, Kuei. It's now clear where we stand, even though I would have liked to have that added selectivity in mmedquota. We use filesets to separate classes of projects and classes of storage (backup/noBackup, for example), and thus one user or one group (= project) has various resource allocations across filesets (enforced by quotas). Sometimes we need to roll back only certain allocations and leave others untouched. If no one else has encountered this need so far, I guess we twisted the model a bit too much. Maybe we can add this option to some list of desired new features for coming versions?...
Thanks, Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 2:13 PM To: "Popescu, Razvan" Cc: gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, You are right, the command below yields an error (as it interprets fs9:fset7 as a user name). Apologies for the confusion. This is what I see in my system: # mmedquota -d -u pfs004 fs9:fset7 fs1 USR default quota is off fs9:fset7 is not valid user # I must have confused the command names in the previous note. Instead of the "mmedquota -d" command I meant "mmlsquota -d". # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -d fs9:fset9 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset9 USR 102400 1048576 | 10000 22222 default on fs9 fset9 GRP 0 0 | 0 0 default off # mmlsquota -d -u fs9 Default Block Limits(KB) | Default File Limits Filesystem type quota limit | quota limit Remarks fs9 USR 102400 1048576 | 10000 0 # Upon further investigation, the current behavior of 'mmedquota -d -u ' is to restore the default quotas for the particular user on all filesystems and filesets. The ability to restore the default limits of a user for selected filesets and filesystems is not available at the moment. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com
From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop , gpfsug main discussion list Date: 01/07/2020 12:23 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn't yield a solution. I just opened another one. My very specific question to you is: "Where have you seen this particular syntax work?" The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: "fs9:fset7") and my testing rejected such a parameter. What made you think this would work? # mmedquota -d -u pfs004 fs9:fset7 ^^^^^^^^^^^^^ Many thanks! Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration and the code level to figure out whether what you are seeing is a problem, an expected behavior, or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Kuei-Yu (et al.), Happy New Year!
I'd like to reiterate my follow-up question to your comments, in particular to the line copied below, which mentions a behavior that I'm seeking for this command but cannot reproduce (in version 5.0.3 on Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I'm seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user's default restoration to *all filesystems and filesets*. Are you using a different version? Or a different implementation? I'm running SS 5.0.3 on Linux x64. I apologize for having to press this point, but this matter is of a certain importance to us and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I'm trying with a new case now, but my hopes are low.) You seem to be an IBM insider, so... I count on your help. Sorry for the insistence. Best, Razvan Popescu Columbia Univ. -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". You may need to play a little bit with your system to get the result, as you can have default quota per file system set and default quota per fileset enabled.
An example to illustrate: user pfs004 in filesystem fs9 and fileset fset7 has explicit quota set: # mmrepquota -u -v fs9 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e <=== explicit # mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on <=== default quota limits for fs9:fset7, the default fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none <=== explicit # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits # mmlsquota -u pfs004 fs9:fset7 Block Limits | File Limits Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks fs9 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none <=== takes the default value # mmrepquota -u -v fs9:fset7 | grep pfs004 pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset <=== now user pfs004 in fset7 takes the default limits # ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:28 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org I see. May I ask one follow-up question, please: what is "mmedquota -d -u" supposed to do in this case? Really appreciate your assistance.
Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical... This fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here) Currently there is no function to revert an explicit quota entry (e) to an initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was OK. During operations I have edited *individual* quotas, for example to increase certain users' allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota! For example, I have used one user account to test the mmedquota command, setting his limits to a certain value (just testing). I'd like now to make that user's quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits. To make it more technical... This fellow's quota entryType is now "e". I want to change it back to entryType "i". (I hope I'm not talking nonsense here) mmedquota's "-d" option is supposed to reinstate the defaults, but it doesn't seem to work for fileset-based quotas
!?! Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default perfileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9: file system fs9 has default quota on and --perfileset-quota enabled. # mmlsfs fs9 -Q --perfileset-quota flag value description ------------------- ------------------------ ----------------------------------- -Q user;group;fileset Quotas accounting enabled user;fileset Quotas enforced user;group;fileset Default quotas enabled --perfileset-quota Yes Per-fileset quota enforcement # Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9". Then set the default quota for this fileset using "mmdefedquota": # mmdefedquota -u fs9:fset9 .. *** Edit quota limits for USR DEFAULT entry for fileset fset9 NOTE: block limits will be rounded up to the next multiple of the block size. block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com
From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? fails because I do have not set a filesystem default?. [root at xxx]# mmedquota -d -u user gsb USR default quota is off (SpectrumScale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn?t seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I?d like to be able to handle this matter? Have I hit a limitation of the implementation? Any workaround, if that?s the case? Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 15076604.gif Type: image/gif Size: 106 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 15250423.gif Type: image/gif Size: 107 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 15764009.gif Type: image/gif Size: 108 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 15169856.gif Type: image/gif Size: 109 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 15169125.gif Type: image/gif Size: 110 bytes Desc: not available URL: From rp2927 at gsb.columbia.edu Tue Jan 7 19:51:19 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Tue, 7 Jan 2020 19:51:19 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Message-ID: How do I do that? (thnks!) Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 2:50 PM To: "Popescu, Razvan" Cc: gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, You can open an RFE (Request for Enhancement) for this issue if you would like this function to be considered for future versions. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---01/07/2020 02:39:30 PM---Thank you very much, Kuei. 
It?s now clear where we st]"Popescu, Razvan" ---01/07/2020 02:39:30 PM---Thank you very much, Kuei. It?s now clear where we stand, even though I would have liked to have th From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop Cc: gpfsug main discussion list Date: 01/07/2020 02:39 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default ________________________________ Thank you very much, Kuei. It?s now clear where we stand, even though I would have liked to have that added selectivity in mmedquota. We use filesets to separate classes of projects and classes of storage (backup/noBackup for example), and thus one user or one group(=project), has various resource allocations across filesets (enforced by quotas). Sometimes we need to roll back only certain allocations and leave other untouched ?. If no one else encountered this need so far, I guess we twisted the model a bit too much ?. Maybe we can add this option to some list of desired new features for coming versions?... Thanks, Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 2:13 PM To: "Popescu, Razvan" Cc: gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, You are right, the command below yields to an error (as it interprets fs9:fset7 as a user name). Apologies for the confusion. This is what I see in my system: # mmedquota -d -u pfs004 fs9:fset7 fs1 USR default quota is off fs9:fset7 is not valid user # I must have confused the command names in the previous note. Instead of "mmedquota -d" command I meant "mmlsquota -d" (??) 
# mmlsquota -d fs9:fset7 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset7 USR 102400 1048576 | 10000 0 default on fs9 fset7 GRP 0 0 | 0 0 i # mmlsquota -d fs9:fset9 Default Block Limits(KB) | Default File Limits Filesystem Fileset type quota limit | quota limit entryType fs9 fset9 USR 102400 1048576 | 10000 22222 default on fs9 fset9 GRP 0 0 | 0 0 default off #mmlsquota -d -u fs9 Default Block Limits(KB) | Default File Limits Filesystem type quota limit | quota limit Remarks fs9 USR 102400 1048576 | 10000 0 # Upon further investigation, the current behavior of 'mmedquota -d -u ' is restore the default quotas for the particular user on all filesystems and filesets. The ability to restore the default limits of a user for selected filesets and filesystems is not available at the moment. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---01/07/2020 12:23:47 PM---Kuei, Thanks for replying. I did open a ticket before]"Popescu, Razvan" ---01/07/2020 12:23:47 PM---Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn?t yield a solution From: "Popescu, Razvan" To: Kuei-Yu Wang-Knop , gpfsug main discussion list Date: 01/07/2020 12:23 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default ________________________________ Kuei, Thanks for replying. I did open a ticket before the holidays, but it didn?t yield a solution. I just opened another one? My very specific question to you is: ?Where have you seen this particular syntax work?? The 5.0.3 & 5.0.4 mmedquota man pages do not mention the availability of a filesystem:fileset parameter (as in your example: ?fs9:fset7?) and my testing rejected such a parameter. What made you think this would work? ? # mmedquota -d -u pfs004 fs9:fset7 ^^^^^^^^^^^^^ Many thanks! 
Razvan -- From: Kuei-Yu Wang-Knop Date: Tuesday, January 7, 2020 at 12:11 PM To: "Popescu, Razvan" , gpfsug main discussion list Subject: RE: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, Perhaps you should open a ticket (so that specific configuration data could be collected and analyzed) on this topic. We would look at your system configuration, the code level to figure out whether what you are seeing is a problem or it is just an expected behavior or a limitation; there are some limitations, specifically moving default limits between file system and fileset default scope, that may not work for your scenario. Thanks, Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---01/07/2020 11:32:42 AM---Hi Kuei-Yu (et al.) Happy New Year!]"Popescu, Razvan" ---01/07/2020 11:32:42 AM---Hi Kuei-Yu (et al.) Happy New Year! From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/07/2020 11:32 AM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Kuei-Yu (et al.) Happy New Year! I?d like to reiterate my follow-up question to your comments ? in particular to the line copied below, which mentions a behavior that I?m seeking for this command, but cannot reproduce (in vers, 5.0.3 at Linux), namely (with my highlights): # mmedquota -d -u pfs004 fs9:fset7 <=== run mmedquota -d -u to get default limits The reduction of scope to a named filesystem/fileset is what I?m seeking, *but* at least the 5.0.3 version seems to reject that parameter with an error, and apply the user?s default restoration to *all filesystems and filesets* Are you using a different version? Or a different implementations? I?m running SS 5.0.3 on Linux x64. 
I apologize for having to press this point, but this matter is of a certain importance to us and it appears that the public documentation is mum in this regard. Furthermore, the IBM support, at the first pass, was quite confused and unfocused. (I?m trying with a new case now, but my hopes are low). You seem to be an IBM insider, so ?. I count on your help ?. Sorry, for the insistence. ? Best, Razvan Popescu Columbia Univ. -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 3:56 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Razvan, mmedquota -d -u fs:fset: -d Reestablish default quota limits for a specific user, group, or fileset that had an explicit quota limit set by a previous invocation of the mmedquota command. This option will assign the default quota to the user. The quota entry type will change from "e" to "d_fset". You may need to play a little bit with your system to get the result as you can have default quota per file system set and default quota per fileset enabled. 
An example to illustrate: user pfs004 in filesystem fs9 and fileset fset7 has an explicit quota set:

# mmrepquota -u -v fs9 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 33333 0 none e  <=== explicit

# mmlsquota -d fs9:fset7
                         Default Block Limits(KB) | Default File Limits
Filesystem Fileset type  quota   limit            | quota  limit  entryType
fs9        fset7   USR   102400  1048576          | 10000  0      default on  <=== default quota limits for fs9:fset7, the default
fs9        fset7   GRP   0       0                | 0      0      i

# mmlsquota -u pfs004 fs9:fset7
                         Block Limits                            | File Limits
Filesystem Fileset type  KB    quota   limit    in_doubt  grace  | files  quota  limit  in_doubt  grace  Remarks
fs9        fset7   USR   1088  102400  1048576  0         none   | 13     10000  33333  0         none   <=== explicit

# mmedquota -d -u pfs004 fs9:fset7  <=== run mmedquota -d -u to get default limits

# mmlsquota -u pfs004 fs9:fset7
                         Block Limits                            | File Limits
Filesystem Fileset type  KB    quota   limit    in_doubt  grace  | files  quota  limit  in_doubt  grace  Remarks
fs9        fset7   USR   1088  102400  1048576  0         none   | 13     10000  0      0         none   <=== takes the default value

# mmrepquota -u -v fs9:fset7 | grep pfs004
pfs004 fset7 USR 1088 102400 1048576 0 none | 13 10000 0 0 none d_fset  <=== now user pfs004 in fset7 takes the default limits
#

------------------------------------
Kuei-Yu Wang-Knop
IBM Scalable I/O development
(845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com

From: "Popescu, Razvan"
To: gpfsug main discussion list
Date: 12/19/2019 02:28 PM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default
Sent by: gpfsug-discuss-bounces at spectrumscale.org
________________________________

I see. May I ask one follow-up question, please: what is "mmedquota -d -u" supposed to do in this case?
Really appreciate your assistance. Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default >> To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. (I hope I?m not talking nonsense here) Currently there is no function to revert an explicit quota entry (e) to initial (i) entry. Kuei ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different tho]"Popescu, Razvan" ---12/19/2019 02:18:54 PM---Thanks for your kind reply. My problem is different though. From: "Popescu, Razvan" To: gpfsug main discussion list Date: 12/19/2019 02:18 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Thanks for your kind reply. My problem is different though. I have set a fileset default quota (doing all the steps you recommended) and all was Ok. During operations I have edited *individual* quotas, for example to increase certain user?s allocations. Now, I want to *revert* (change back) one of these users to the (fileset) default quota ! For example, I have used one user account to test the mmedquota command setting his limits to a certain value (just testing). I?d like now to make that user?s quota be the default fileset quota, and not just numerically, but have his quota record follow the changes in fileset default quota limits. To make it more technical ?. This fellow?s quota entryType is now ?e? . I want to change it back to entryType ?I?. (I hope I?m not talking nonsense here) mmedquota?s ?-d? 
option is supposed to reinstate the defaults, but it doesn?t seem to work for fileset based quotas ? !?! Razvan -- From: on behalf of Kuei-Yu Wang-Knop Reply-To: gpfsug main discussion list Date: Thursday, December 19, 2019 at 2:06 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quota: revert user quota to FILESET default It sounds like you would like to have default perfileset quota enabled. Have you tried to enable the default quota on the filesets and then set the default quota limits for those filesets? For example, in a filesystem fs9 and fileset fset9. File system fs9 has default quota on and --perfileset-quota enabled. # mmlsfs fs9 -Q --perfileset-quota flag value description ------------------- ------------------------ ----------------------------------- -Q user;group;fileset Quotas accounting enabled user;fileset Quotas enforced user;group;fileset Default quotas enabled --perfileset-quota Yes Per-fileset quota enforcement # Enable default user quota for fileset fset9, if not enabled yet, e.g. "mmdefquotaon -u fs9:fset9" Then set the default quota for this fileset using mmdefedquota" # mmdefedquota -u fs9:fset9 .. *** Edit quota limits for USR DEFAULT entry for fileset fset9 NOTE: block limits will be rounded up to the next multiple of the block size. block units may be: K, M, G, T or P, inode units may be: K, M or G. fs9: blocks in use: 0K, limits (soft = 102400K, hard = 1048576K) inodes in use: 0, limits (soft = 10000, hard = 22222) ... Hope that this helps. ------------------------------------ Kuei-Yu Wang-Knop IBM Scalable I/O development (845) 433-9333 T/L 293-9333, E-mail: kywang at us.ibm.com [Inactive hide details for "Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset]"Popescu, Razvan" ---12/19/2019 12:22:34 PM---Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? 
From: "Popescu, Razvan" To: "gpfsug-discuss at spectrumscale.org" Date: 12/19/2019 12:22 PM Subject: [EXTERNAL] [gpfsug-discuss] Quota: revert user quota to FILESET default Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I?d like to revert a user?s quota to the fileset?s default, but ?mmedquota -d -u ? fails because I do have not set a filesystem default?. [root at xxx]# mmedquota -d -u user gsb USR default quota is off (SpectrumScale 5.0.3 Standard Ed. on RHEL7 x86) Is this a limitation of the current mmedquota implementation, or of something more profound?... I have several filesets within this filesystem, each with various quota structures. A filesystem-wide default quota didn?t seem useful so I never defined one; however I do have multiple fileset-level default quotas, and this is the level at which I?d like to be able to handle this matter? Have I hit a limitation of the implementation? Any workaround, if that?s the case? Many thanks, Razvan Popescu Columbia Business School _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... 
From bipcuds at gmail.com Tue Jan 7 20:10:10 2020
From: bipcuds at gmail.com (Keith Ball)
Date: Tue, 7 Jan 2020 15:10:10 -0500
Subject: [gpfsug-discuss] Grafana graph panels give "Cannot read property 'index' of undefined" using Grafana bridge
Message-ID:

Hi All,

I am using the following combination of components on my GUI/pmcollector node:
- RHEL 7.3
- Spectrum Scale 4.2.3.5 (actually part of a Lenovo DSS release)
- gpfs.gss.pmcollector-4.2.3.5.el7.x86_64
- Python 3.6.8
- CherryPy 18.5.0
- Grafana bridge: no version actually appears in the python script, but a "buildDate.txt" file distributed with the bridge indicates "Thu Aug 16 10:48:21 CET 2016" (seems super-old for something downloaded in the last 2 months?). No other version info to be found in the script.

It appears that I can add the bridge as an OpenTSDB-like data source to Grafana successfully (the "Save & Test" says that it was successful and working). When I create a graph panel, I am getting completion for perfmon metrics/timeseries and tag/filter values (but not tag keys for some reason).
However, whether I try to create my own simple graph or use the canned dashboards (on the Scale wiki), every panel gives the same error (exclamation point in the red triangle in the upper-left corner of the graph):

Cannot read property 'index' of undefined

An example query would be for gpfs_fs_bytes_read, Aggregator=avg, Disable Downsampling, Filters:
cluster = literal_or(my.cluster.name) , groupBy = false
filesystem = literal_or(homedirs) , groupBy = false

Anyone know what exactly the "Cannot read property 'index' of undefined" really means (i.e. what is causing it), or has had to debug this on their own perfmon and Grafana setup? Am I using incompatible versions of components? I do not see anything that looks like error messages in the Grafana bridge log file, nor in the Grafana log file. Does anyone have anything to suggest?

Many Thanks,
Keith

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jonathan.buzzard at strath.ac.uk Wed Jan 8 12:16:09 2020
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Wed, 8 Jan 2020 12:16:09 +0000
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To: <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu>
References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu>
Message-ID:

On Tue, 2020-01-07 at 19:39 +0000, Popescu, Razvan wrote:
> Thank you very much, Kuei. It's now clear where we stand, even
> though I would have liked to have that added selectivity in
> mmedquota.
>
Note in the meantime you could "simulate" this with a relatively simple script that grabs the quota information for the relevant user, uses mmsetquota to wipe all the quota information for the user and then some more mmsetquota to set all the ones you want. While not ideal, the window of opportunity for the end user to exploit not having any quotas would be a matter of seconds.

JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From knop at us.ibm.com Wed Jan 8 13:29:57 2020
From: knop at us.ibm.com (Felipe Knop)
Date: Wed, 8 Jan 2020 13:29:57 +0000
Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default
In-Reply-To: References: , <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu><794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu><746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu><770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu><4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu><8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu>
Message-ID:

An HTML attachment was scrubbed...
URL:
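The wipe-and-reset wrapper Jonathan describes could be sketched roughly as below. This is a hypothetical illustration: the user, device, fileset, and limit values are placeholders, and the commands are only echoed rather than executed, so the exact mmedquota/mmsetquota syntax should still be checked against the command reference for the installed release.

```shell
# Sketch of the "revert then re-apply" sequence. User, device:fileset,
# and limits are placeholders; 'run' echoes instead of executing.
user=pfs004
run() { echo "would run: $*"; }   # swap 'echo' for the real invocation

# 1. Capture the current limits first, e.g.:
#    mmlsquota -u "$user" -Y > saved.quota
# 2. Revert all of the user's explicit entries to the defaults
#    (the brief window with no explicit limits starts here):
run mmedquota -d -u "$user"
# 3. Immediately re-apply the explicit limits that should survive,
#    using the values captured in step 1:
run mmsetquota fs9:fset7 --user "$user" --block 102400K:1048576K --files 10000:33333
```

The window Jonathan mentions is the gap between steps 2 and 3, which a script keeps to a few seconds.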
From heinrich.billich at id.ethz.ch Wed Jan 8 17:02:18 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Wed, 8 Jan 2020 17:02:18 +0000
Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
Message-ID: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch>

Hello,

Still new to AFM, so some basic questions on how Recovery works for a SW cache: we have an AFM SW cache in recovery mode. Recovery first ran policies on the cache cluster, but now I see a "tcpcachescan" process on cache slowly scanning home via NFS. A single host, a single process, no parallelism as far as I can see, but I may be wrong.

This scan of home on a cache afmgateway takes very long, while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/.../recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours will pass to build up the filelists before recovery starts to update home. I see some risk: in this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can't evict?

I wonder:
* Is this to be expected and normal behavior? What to do about it?
* Will every reboot of a gateway node trigger a recovery of all AFM filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way?

Home is a GPFS cluster, hence we easily could produce the needed filelist on home with a policy scan in a few minutes.

Thank you. I will welcome any clarification, advice or comments.

Kind regards,
Heiner
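The home-side policy scan Heiner mentions could look roughly like this. The paths and list name are placeholders, and the rule syntax should be verified against the ILM chapter of the Scale documentation for the installed release; the mmapplypolicy call itself is commented out because it needs a live GPFS cluster.

```shell
# Hypothetical sketch of a deferred LIST policy scan on the home cluster.
cat > /tmp/homelist.pol <<'EOF'
/* EXEC '' together with -I defer writes the matched pathnames to a
   list file instead of piping them to an external program. */
RULE EXTERNAL LIST 'homefiles' EXEC ''
RULE 'all' LIST 'homefiles' DIRECTORIES_PLUS
EOF

# Needs a live cluster, so commented out here; on home it would write
# the pathnames to /tmp/home.list.homefiles:
# mmapplypolicy /gpfs/home -P /tmp/homelist.pol -f /tmp/home -I defer

cat /tmp/homelist.pol
rm -f /tmp/homelist.pol
```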
--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lgayne at us.ibm.com Wed Jan 8 18:15:47 2020
From: lgayne at us.ibm.com (Lyle Gayne)
Date: Wed, 8 Jan 2020 18:15:47 +0000
Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
In-Reply-To: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch>
References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch>
Message-ID:

An HTML attachment was scrubbed...
URL:

From rp2927 at gsb.columbia.edu Thu Jan 9 19:27:30 2020
From: rp2927 at gsb.columbia.edu (Popescu, Razvan)
Date: Thu, 9 Jan 2020 19:27:30 +0000
Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run
Message-ID: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu>

Hi,

I'm trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as "-L 5" listed all the scanned files but gave no info about which ones have been sent to the backup server.

Is there a way (any way) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that's feasible.

Thanks,
Razvan N. Popescu
Research Computing Director
Office: (212) 851-9298
razvan.popescu at columbia.edu
Columbia Business School
At the Very Center of Business

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Rafael.Cezario at ibm.com Thu Jan 9 19:48:07 2020
From: Rafael.Cezario at ibm.com (Rafael Cezario)
Date: Thu, 9 Jan 2020 16:48:07 -0300
Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run
In-Reply-To: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu>
References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu>
Message-ID:

Hello,

It's possible.
From the server schedule do you get the log configuration: # dsmc query options For example: SCHEDLOGNAME: /var/log/tsm/dsmsched.log # mmbackup /FS -t incremental -N Server --backup-threads 12 -v -L 6 --tsm-servers server --scope filesystem After that, do you check your log file /var/log/tsm/dsmsched.log: 01/09/20 00:51:45 Retry # 2 Normal File--> 1,356,789 /File/agent.log [Sent] 01/09/20 00:51:45 Retry # 1 Normal File--> 5,120,062 /File/agent.log.1 [Sent] 01/09/20 00:51:46 Successful incremental backup of '/File' Regards, Rafael From: "Popescu, Razvan" To: gpfsug main discussion list Date: 09/01/2020 16:27 Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I?m trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as ?-L 5? listed all the scanned files but gave no info about which ones have been sent to the backup server. Is there a way (anyway) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that?s feasible. Thanks, Razvan N. Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=efzA7AwXTDdK-0_uRBnvcy8-s5uewdL51EO34qmTe0I&m=xrmNBKF1K7yQh6tWtHfPemfaWt1wOT7LtKK83BFKE7g&s=HYvQUEzWuxhpP9FtEHHhY4ZV-UsGMJpGjccLEVgcPfk&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From scale at us.ibm.com Thu Jan 9 20:24:36 2020 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Thu, 9 Jan 2020 15:24:36 -0500 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: Under FS mount dir or $MMBACKUP_RECORD_ROOT dir if you set, mmbackup creates the following file that contains all backup candidate files. .mmbackupCfg/updatedFiles/.list* As a default, mmbackup deletes the file upon successful backup completion but keeps all temporary files until next mmbackup invocation if DEBUGmmb ackup=2 is set. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/09/2020 02:29 PM Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I?m trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as ?-L 5? listed all the scanned files but gave no info about which ones have been sent to the backup server. 
Is there a way (anyway) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that?s feasible. Thanks, Razvan N. Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=369ydzzb59Q4zfz0T74pkucjHcKuR63z0UAf2aMqAz0&s=3za7Rn3o9V7oajWNFe-U8PvMH8hQLUyVVrHuFCind0g&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From rp2927 at gsb.columbia.edu Thu Jan 9 21:00:59 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Thu, 9 Jan 2020 21:00:59 +0000 Subject: [gpfsug-discuss] Quota: revert user quota to FILESET default In-Reply-To: References: <4DA994EE-3452-4783-B376-54ED17F56966@gsb.columbia.edu> <794A8C83-B179-4A7B-85F2-DC2EA97EFCDD@gsb.columbia.edu> <746C7F06-1F3A-4C5C-A2AA-BC0B2C52A0F9@gsb.columbia.edu> <770C6EE1-DD81-4E16-9A89-A16BA6E3282A@gsb.columbia.edu> <4A28FC24-C3AF-4296-920D-E8FB5080B8AA@gsb.columbia.edu> <8E54D44F-6E4B-49BC-B3A5-3DBBCACE106C@gsb.columbia.edu> Message-ID: Hi Jonathan, Thanks for you kind reply. Indeed, I can always do that. Best, Razvan -- ?On 1/8/20, 7:17 AM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Jonathan Buzzard" wrote: On Tue, 2020-01-07 at 19:39 +0000, Popescu, Razvan wrote: > Thank you very much, Kuei. It?s now clear where we stand, even > though I would have liked to have that added selectivity in > mmedquota. 
> Note in the meantime you could "simulate" this with a relatively simple script that grabs the quota information for the relevant user, uses mmsetquota to wipe all the quota information for the user and then some more mmsetquota to set all the ones you want. While not ideal, the window of opportunity for the end user to exploit not having any quotas would be a matter of seconds. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From rp2927 at gsb.columbia.edu Thu Jan 9 21:19:40 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Thu, 9 Jan 2020 21:19:40 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: Thanks, I'll set tonight's run with that debug flag. Best, Razvan -- From: on behalf of IBM Spectrum Scale Reply-To: gpfsug main discussion list Date: Thursday, January 9, 2020 at 3:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Under the FS mount dir (or the $MMBACKUP_RECORD_ROOT dir, if you set it), mmbackup creates the following file that contains all backup candidate files: .mmbackupCfg/updatedFiles/.list* By default, mmbackup deletes the file upon successful backup completion, but it keeps all temporary files until the next mmbackup invocation if DEBUGmmbackup=2 is set.
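Returning to the quota thread above: Jonathan's wipe-then-reset script idea might look roughly like the following dry-run sketch. The exact mmsetquota argument forms vary by Scale release, so treat the device, fileset, user and limit values as placeholders and check the mmsetquota man page before running anything for real; here the commands are only echoed, never executed.

```shell
#!/bin/sh
# Dry-run sketch of a "wipe then re-set" per-user quota script.
# DEVICE/FILESET/QUSER and the mmsetquota syntax are assumptions for
# illustration -- verify against your release before use.
DEVICE=fs1; FILESET=homedirs; QUSER=jdoe
RUN=echo   # change 'echo' to '' to actually execute the commands

# 1. capture current limits first, e.g. with mmrepquota -u (not parsed here)
# 2. wipe the explicit per-user limits so defaults apply again
$RUN mmsetquota "$DEVICE:$FILESET" --user "$QUSER" --block 0:0 --files 0:0
# 3. re-apply only the limits you still want
$RUN mmsetquota "$DEVICE:$FILESET" --user "$QUSER" --block 10G:12G
```

As Jonathan notes, the user is without limits only between steps 2 and 3, a matter of seconds.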
Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Popescu, Razvan" To: gpfsug main discussion list Date: 01/09/2020 02:29 PM Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I'm trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as "-L 5" listed all the scanned files but gave no info about which ones have been sent to the backup server. Is there a way (anyway) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that's feasible. Thanks, Razvan N.
Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From rp2927 at gsb.columbia.edu Thu Jan 9 21:38:02 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Thu, 9 Jan 2020 21:38:02 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> Hi Rafael, This looks awesomely promising, but I can't find the info you refer to here. My SCHEDLOGNAME points to /root/dsmsched.log but there is no file by that name in /root. I have the error and instrumentation logs (dsmerror.log and dsminstr.log) per their options, but not the scheduler log. Could it be because I don't run mmbackup via the TSM scheduler?! (I run it as a cronjob, inside a little wrapper that takes care of preparing/deleting a snapshot for it). Must I run the scheduler to log the activity of the client? Thanks, Razvan -- From: on behalf of Rafael Cezario Reply-To: gpfsug main discussion list Date: Thursday, January 9, 2020 at 2:48 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Hello, It's possible.
You can get the log configuration from the client schedule options: # dsmc query options For example: SCHEDLOGNAME: /var/log/tsm/dsmsched.log # mmbackup /FS -t incremental -N Server --backup-threads 12 -v -L 6 --tsm-servers server --scope filesystem After that, check your log file /var/log/tsm/dsmsched.log: 01/09/20 00:51:45 Retry # 2 Normal File--> 1,356,789 /File/agent.log [Sent] 01/09/20 00:51:45 Retry # 1 Normal File--> 5,120,062 /File/agent.log.1 [Sent] 01/09/20 00:51:46 Successful incremental backup of '/File' Regards, Rafael From: "Popescu, Razvan" To: gpfsug main discussion list Date: 09/01/2020 16:27 Subject: [EXTERNAL] [gpfsug-discuss] Mmbackup -- list files backed up by incremental run Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, I'm trying to find out which files have been selected to be backed up by a run of (incremental) mmbackup. Logging as high as "-L 5" listed all the scanned files but gave no info about which ones have been sent to the backup server. Is there a way (anyway) to list the files selected by an incremental mmbackup run? Even using info from the SpecProtect server, if that's feasible. Thanks, Razvan N. Popescu Research Computing Director Office: (212) 851-9298 razvan.popescu at columbia.edu Columbia Business School At the Very Center of Business _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss
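One way to pull just the transmitted files out of a dsmsched.log like Rafael's excerpt above is to filter on the [Sent] tag. A hedged sketch, with the sample log text inlined so it is self-contained; in practice you would point LOG at your real SCHEDLOGNAME instead:

```shell
#!/bin/sh
# Sketch: extract the paths of files the TSM client actually sent,
# based on the dsmsched.log format shown in the thread above.
LOG=/tmp/dsmsched.sample.log
cat > "$LOG" <<'EOF'
01/09/20 00:51:45 Retry # 2 Normal File-->     1,356,789 /File/agent.log [Sent]
01/09/20 00:51:45 Retry # 1 Normal File-->     5,120,062 /File/agent.log.1 [Sent]
01/09/20 00:51:46 Successful incremental backup of '/File'
EOF

# The path is the field just before the [Sent] tag.
awk '/\[Sent\]/ { print $(NF-1) }' "$LOG"
```

This prints /File/agent.log and /File/agent.log.1 for the sample; real logs may have additional line shapes, so treat the field position as an assumption to verify against your own log.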
From NSCHULD at de.ibm.com Fri Jan 10 09:00:53 2020 From: NSCHULD at de.ibm.com (Norbert Schuld) Date: Fri, 10 Jan 2020 10:00:53 +0100 Subject: [gpfsug-discuss] Grafana graph panels give "Cannot read property 'index' of undefined" using Grafana bridge In-Reply-To: References: Message-ID: Hello Keith, please check for more recent versions of the bridge here: https://github.com/IBM/ibm-spectrum-scale-bridge-for-grafana Also updating Grafana to some newer version could help; I found some older reports while searching for the error message. HTH Norbert From: Keith Ball To: gpfsug-discuss at spectrumscale.org Date: 07.01.2020 21:10 Subject: [EXTERNAL] [gpfsug-discuss] Grafana graph panels give "Cannot read property 'index' of undefined" using Grafana bridge Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi All, I am using the following combination of components on my GUI/pmcollector node: - RHEL 7.3 - Spectrum Scale 4.2.3.5 (actually part of a Lenovo DSS release) - gpfs.gss.pmcollector-4.2.3.5.el7.x86_64 - Python 3.6.8 - CherryPy 18.5.0 - Grafana bridge: no version actually appears in the python script, but a "buildDate.txt" file distributed with the bridge indicates "Thu Aug 16 10:48:21 CET 2016" (seems super-old for something downloaded in the last 2 months?). No other version info to be found in the script. It appears that I can add the bridge as an OpenTSDB-like data source to Grafana successfully (the "Save & Test" says that it was successful and working). When I create a graph panel, I am getting completion for perfmon metrics/timeseries and tag/filter values (but not tag keys for some reason). However, whether I try to create my own simple graph, or use the canned dashboards (on the Scale wiki), every panel gives the same error (exclamation point in the red triangle in the upper-left corner of the graph): "Cannot read property 'index' of undefined" An example query would be for gpfs_fs_bytes_read, Aggregator=avg, Disable Downsampling, Filters:
cluster = literal_or(my.cluster.name), groupBy = false; filesystem = literal_or(homedirs), groupBy = false. Anyone know what exactly the "Cannot read property 'index' of undefined" really means (i.e. what is causing it), or has had to debug this on their own perfmon and Grafana setup? Am I using incompatible versions of components? I do not see anything that looks like error messages in the Grafana bridge log file, nor in the Grafana log file. Does anyone have anything to suggest? Many Thanks, Keith _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=i4V0h7L9ElftZNfcuPIXmAHN2jl5TLcuyFLqtinu4j8&m=cqPhew27KzZmjx-Ai5Xk9NPLgCzZg6M2501wjjZ8ItY&s=jdSYaqQcp-DBBW6D0aax4E_qysldCTvWue3iMUemeuw&e= From jonathan.buzzard at strath.ac.uk Fri Jan 10 10:17:25 2020 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Fri, 10 Jan 2020 10:17:25 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> Message-ID: On Thu, 2020-01-09 at 21:38 +0000, Popescu, Razvan wrote: > Hi Rafael, > > This looks awesomely promising, but I can't find the info you refer > to here. > My SCHEDLOGNAME points to /root/dsmsched.log but there is no > file by that name in /root. I have the error and instrumentation > logs (dsmerror.log and dsminstr.log) per their options, but not the > scheduler.
> > Could it be because I don't run mmbackup via the TSM scheduler ?! > (I run it as a cronjob, inside a little wrapper that takes care of > preparing/deleting a snapshot for it). Must I run the scheduler to > log the activity of the client? > That is not a "recommended" way to do a TSM backup. You should use a schedule where the action is command. See https://www.ibm.com/support/knowledgecenter/SSEQVQ_8.1.0/srv.reference/r_cmd_schedule_client_define.html and then set the command to be your script. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From rp2927 at gsb.columbia.edu Fri Jan 10 15:17:50 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Fri, 10 Jan 2020 15:17:50 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> <5300AD58-8879-45FC-9204-54BFC477F05D@gsb.columbia.edu> Message-ID: <81809034-B68D-4E85-8CDD-50B2FE063755@gsb.columbia.edu> Yes, I've seen the recommendation in the docs, but failed to see an obvious advantage for my case. I have 4 separate backup jobs (on the same client), for as many filesets, for which I can set separate schedules. I guess (?) I could do the same with the TSM scheduler, but it was simpler this way in the beginning when I set up the system, and nothing pushed me to change it since... Razvan -- On 1/10/20, 5:17 AM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Jonathan Buzzard" wrote: On Thu, 2020-01-09 at 21:38 +0000, Popescu, Razvan wrote: > Hi Rafael, > > This looks awesomely promising, but I can't find the info you refer > to here. > My SCHEDLOGNAME points to /root/dsmsched.log but there is no > file by that name in /root. I have the error and instrumentation > logs (dsmerror.log and dsminstr.log) per their options, but not the > scheduler.
> > Could it be because I don't run mmbackup via the TSM scheduler ?! > (I run it as a cronjob, inside a little wrapper that takes care of > preparing/deleting a snapshot for it). Must I run the scheduler to > log the activity of the client? > That is not a "recommended" way to do a TSM backup. You should use a schedule where the action is command. See https://www.ibm.com/support/knowledgecenter/SSEQVQ_8.1.0/srv.reference/r_cmd_schedule_client_define.html and then set the command to be your script. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From vpuvvada at in.ibm.com Mon Jan 13 07:39:49 2020 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Mon, 13 Jan 2020 13:09:49 +0530 Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? In-Reply-To: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> Message-ID: AFM maintains an in-memory queue at the gateway node to keep track of changes happening on the fileset. If the in-memory queue is lost (memory pressure, daemon shutdown, etc.), AFM runs a recovery process which involves creating the snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operations) dependency, any changes to the AFM fileset during the recovery won't get replicated until the recovery completes. AFM does the home directory scan for only the dirty directories to get the names of the deleted and renamed files, because the old name of a renamed file and the name of a deleted file are not available at the cache on disk. Directories are marked dirty when a rename or unlink operation is performed inside them.
In your case it may be that all the directories became dirty due to the rename/unlink operations. The AFM recovery process is single-threaded. >Is this to be expected and normal behavior? What to do about it? >Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way? Only for the dirty directories, see above. >Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home so that the home directory scan can be avoided. There are some issues fixed in this regard. What is the Scale version? https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436 ~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/08/2020 10:32 PM Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, still new to AFM, so a basic question on how Recovery works for a SW cache: we have an AFM SW cache in recovery mode: recovery first did run policies on the cache cluster, but now I see a 'tspcachescan' process on cache slowly scanning home via NFS. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/?/recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up filelists before recovery starts to update home. I see some risk: In this time new changes pile up on cache.
Memory may become an issue? Cache may fill up and we can't evict? I wonder: Is this to be expected and normal behavior? What to do about it? Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way? Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. Thank you, I will welcome any clarification, advice or comments. Kind regards, Heiner . -- ======================= Heinrich Billich ETH Zürich Informatikdienste Tel.: +41 44 632 72 56 heinrich.billich at id.ethz.ch ======================== _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=WwGGO3WlGLmgMZX-tb_xjLEk0paAJ_Tekt6NNrxJgPM&s=_oss6YKaJwm5PEi1xqqpwxOstqR0Pqw6hdhOwZ3gsAw&e= From u.sibiller at science-computing.de Mon Jan 13 09:11:39 2020 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Mon, 13 Jan 2020 10:11:39 +0100 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: On 09.01.20 22:19, Popescu, Razvan wrote: > Thanks, > > I'll set tonight's run with that debug flag. I have not tested this myself but if you enable auditlogging this should create corresponding logs. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr.
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From heinrich.billich at id.ethz.ch Mon Jan 13 11:59:11 2020 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Mon, 13 Jan 2020 11:59:11 +0000 Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? In-Reply-To: References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> Message-ID: <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch> Hello Venkat, thank you, this seems to match our issue. I did trace tspcachescan and I do see a long series of open()/read()/close() to the dirtyDirs file. The dirtyDirs file holds 130'000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing? tspcachescan does the following again and again for different directories: 11:11:36.837032 stat("/fs3101/XXXXX/.snapshots/XXXXX.afm.75872/yyyyy/yyyy", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0 11:11:36.837092 open("/var/mmfs/afm/fs3101-43/recovery/policylist.data.list.dirtyDirs", O_RDONLY) = 8 11:11:36.837127 fstat(8, {st_mode=S_IFREG|0600, st_size=32564140, ...}) = 0 11:11:36.837160 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fff96930000 11:11:36.837192 read(8, "539492355 65537 2795 553648131 "..., 8192) = 8192 11:11:36.837317 read(8, "Caches/com.apple.helpd/Generated"..., 8192) = 8192 11:11:36.837439 read(8, "ish\n539848852 1509237202 2795 5"..., 8192) = 8192 (many more reads) 11:11:36.864104 close(8) = 0 11:11:36.864135 munmap(0x3fff96930000, 8192) = 0 A single iteration takes about 27ms.
Doing this 130'000 times would be o.k., but if tspcachescan does it 80M times we wait 600 hours. Is there a way to estimate how many iterations tspcachescan will do? The cache fileset holds 140M inodes. At the moment all we can do is to wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? There probably is no way to flush the pending queue entries while recovery is ongoing? I did open a case with IBM TS003219893 and will continue there. Kind regards, Heiner From: on behalf of Venkateswara R Puvvada Reply to: gpfsug main discussion list Date: Monday, 13 January 2020 at 08:40 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? AFM maintains an in-memory queue at the gateway node to keep track of changes happening on the fileset. If the in-memory queue is lost (memory pressure, daemon shutdown, etc.), AFM runs a recovery process which involves creating the snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operations) dependency, any changes to the AFM fileset during the recovery won't get replicated until the recovery completes. AFM does the home directory scan for only the dirty directories to get the names of the deleted and renamed files, because the old name of a renamed file and the name of a deleted file are not available at the cache on disk. Directories are marked dirty when a rename or unlink operation is performed inside them. In your case it may be that all the directories became dirty due to the rename/unlink operations. The AFM recovery process is single-threaded. >Is this to be expected and normal behavior? What to do about it? >Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way? Only for the dirty directories, see above.
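Heiner's back-of-the-envelope numbers above are easy to check: 27 ms per open/scan/close pass over dirtyDirs, done once per entry that has to be processed. A quick awk sketch of the arithmetic:

```shell
#!/bin/sh
# Recovery-time estimate from the thread: 27 ms per pass over the
# dirtyDirs file, once per processed entry.
awk 'BEGIN {
    ms = 27
    # 130 thousand passes (one per dirtyDirs line):
    printf "130k passes: %.1f minutes\n", 130000 * ms / 1000 / 60
    # 80 million passes (one per dirtyDirDirents entry):
    printf "80M passes:  %.0f hours\n",   80e6  * ms / 1000 / 3600
}'
```

This prints roughly 58.5 minutes for the 130'000-pass case and 600 hours for the 80M-pass case, matching the "would be o.k." versus "we wait 600 hours" contrast above.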
>Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home so that home directory scan can be avoided. These are some issues fixed in this regard. What is the scale version ? https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436 ~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/08/2020 10:32 PM Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hello, still new to AFM, so some basic question on how Recovery works for a SW cache: we have an AFM SW cache in recovery mode ? recovery first did run policies on the cache cluster, but now I see a ?tcpcachescan? process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8hours I see about 70M entries in the file /var/mmfs/afm/?/recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I?m not sure.) So I expect that 12hours pass to buildup filelists before recovery starts to update home. I see some risk: In this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can?t evict? I wonder * Is this to be expected and normal behavior? What to do about it? * Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very unpractical, or is there some better way? 
Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. Thank you, I will welcome and clarification, advice or comments. Kind regards, Heiner . -- ======================= Heinrich Billich ETH Z?rich Informatikdienste Tel.: +41 44 632 72 56 heinrich.billich at id.ethz.ch ======================== _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From rp2927 at gsb.columbia.edu Mon Jan 13 16:02:45 2020 From: rp2927 at gsb.columbia.edu (Popescu, Razvan) Date: Mon, 13 Jan 2020 16:02:45 +0000 Subject: [gpfsug-discuss] Mmbackup -- list files backed up by incremental run In-Reply-To: References: <3F8D00F2-1325-4F40-89B6-0BB86E3CBA2E@gsb.columbia.edu> Message-ID: <5709E6AE-5DD1-46A1-A1B7-C24BF6FFAF84@gsb.columbia.edu> Thanks Uli, I ran the backup with the flag mentioned by {the GPFS team} (thanks again, guys!!) and found the internal list files -- all super fine. I plan to keep that flag in place for a while, to have that info when I might need it (the large files that kept being backed up, and I wanted to trace, just disappeared... __ ) Razvan -- ?On 1/13/20, 4:11 AM, "Ulrich Sibiller" wrote: On 09.01.20 22:19, Popescu, Razvan wrote: > Thanks, > > I?ll set tonight?s run with that debug flag. I have not tested this myself but if you enable auditlogging this should create according logs. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. 
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From neil.wilson at metoffice.gov.uk Tue Jan 14 15:27:54 2020 From: neil.wilson at metoffice.gov.uk (Wilson, Neil) Date: Tue, 14 Jan 2020 15:27:54 +0000 Subject: [gpfsug-discuss] mmapplypolicy - listing policy occasionally gets stuck. Message-ID: Hi All, We are occasionally seeing an issue where an mmapplypolicy list job gets stuck; all it's doing is generating a listing from a fileset. The problem occurs intermittently and doesn't seem to show any particular pattern (i.e. not always on the same fileset). The policy job shows the usual output but then outputs the following until the process is killed. [I] 2020-01-08 at 03:05:30.471 Directory entries scanned: 0. [I] 2020-01-08 at 03:05:45.471 Directory entries scanned: 0. [I] 2020-01-08 at 03:06:00.472 Directory entries scanned: 0. [I] 2020-01-08 at 03:06:15.472 Directory entries scanned: 0. [I] 2020-01-08 at 03:06:30.473 Directory entries scanned: 0. [I] 2020-01-08 at 03:06:45.473 Directory entries scanned: 0. [I] 2020-01-08 at 03:07:00.473 Directory entries scanned: 0. [I] 2020-01-08 at 03:07:15.473 Directory entries scanned: 0. [I] 2020-01-08 at 03:07:30.475 Directory entries scanned: 0. [I] 2020-01-08 at 03:07:45.475 Directory entries scanned: 0. [I] 2020-01-08 at 03:08:00.475 Directory entries scanned: 0. [I] 2020-01-08 at 03:08:15.475 Directory entries scanned: 0. [I] 2020-01-08 at 03:08:30.476 Directory entries scanned: 0. [I] 2020-01-08 at 03:08:45.476 Directory entries scanned: 0. [I] 2020-01-08 at 03:09:00.477 Directory entries scanned: 0. [I] 2020-01-08 at 03:09:15.477 Directory entries scanned: 0.
[I] 2020-01-08 at 03:09:30.478 Directory entries scanned: 0. [I] 2020-01-08 at 03:09:45.478 Directory entries scanned: 0. [I] 2020-01-08 at 03:10:00.478 Directory entries scanned: 0. [I] 2020-01-08 at 03:10:15.478 Directory entries scanned: 0. [I] 2020-01-08 at 03:10:30.479 Directory entries scanned: 0. [I] 2020-01-08 at 03:10:45.480 Directory entries scanned: 0. [I] 2020-01-08 at 03:11:00.481 Directory entries scanned: 0. Have any of you come across an issue like this before? Kind regards Neil Neil Wilson Senior IT Practitioner Storage, Virtualisation and Mainframe Team IT Services Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom From vpuvvada at in.ibm.com Tue Jan 14 16:50:17 2020 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 14 Jan 2020 22:20:17 +0530 Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? In-Reply-To: <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch> References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch> Message-ID: Hi, >The dirtyDirs file holds 130'000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing? Yes, this is the major problem fixed as mentioned in the APAR below. The dirtyDirs file is opened for each entry in the dirtyDirDirents file, and this causes the performance overhead. >At the moment all we can do is to wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? >There probably is no way to flush the pending queue entries while recovery is ongoing? Later versions have the fix mentioned in that APAR, and I believe it should fix your current performance issue. Flushing the pending queue entries is not available as of today (5.0.4); we are currently working on this feature.
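Returning to Neil's stuck mmapplypolicy report earlier in the digest: one quick way to spot the stall from the outside is to count consecutive zero-progress lines in the log. A hedged sketch — the sample log is inlined for illustration, and the 15-second reporting interval and threshold of 3 are assumptions you would tune for your own runs:

```shell
#!/bin/sh
# Sketch of a watchdog for the symptom above: progress lines appear every
# ~15s, so several consecutive "Directory entries scanned: 0." lines at the
# end of the log suggest the list job is stuck.
LOG=/tmp/mmapplypolicy.sample.log
cat > "$LOG" <<'EOF'
[I] 2020-01-08 at 03:05:30.471 Directory entries scanned: 0.
[I] 2020-01-08 at 03:05:45.471 Directory entries scanned: 0.
[I] 2020-01-08 at 03:06:00.472 Directory entries scanned: 0.
[I] 2020-01-08 at 03:06:15.472 Directory entries scanned: 0.
EOF

THRESHOLD=3
awk -v t="$THRESHOLD" '
    /Directory entries scanned: 0\.$/ { n++; next }
    { n = 0 }                    # any other line counts as progress
    END {
        if (n >= t) print "STUCK: " n " zero-scan intervals"
        else        print "OK"
    }
' "$LOG"
```

For the four-line sample this prints "STUCK: 4 zero-scan intervals"; a cron job could run the same check and alert or kill/restart the listing job.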
~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/13/2020 05:29 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello Venkat, thank you, this seems to match our issue. did trace tspcachescan and do see a long series of open()/read()/close() to the dirtyDirs file. The dirtyDirs file holds 130?000 lines, which don?t seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing? tspcachescan does the following again and again for different directories 11:11:36.837032 stat("/fs3101/XXXXX/.snapshots/XXXXX.afm.75872/yyyyy/yyyy", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0 11:11:36.837092 open("/var/mmfs/afm/fs3101-43/recovery/policylist.data.list.dirtyDirs", O_RDONLY) = 8 11:11:36.837127 fstat(8, {st_mode=S_IFREG|0600, st_size=32564140, ...}) = 0 11:11:36.837160 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fff96930000 11:11:36.837192 read(8, "539492355 65537 2795 553648131 "..., 8192) = 8192 11:11:36.837317 read(8, "Caches/com.apple.helpd/Generated"..., 8192) = 8192 11:11:36.837439 read(8, "ish\n539848852 1509237202 2795 5"..., 8192) = 8192 Many more reads 11:11:36.864104 close(8) = 0 11:11:36.864135 munmap(0x3fff96930000, 8192) = 0 A single iteration takes about 27ms. Doing this 130?000 times would be o.k., but if tspcachescan does it 80M times we wait 600hours. Is there a way to estimate how many iteration tspcachescan will do? The cache fileset holds 140M inodes. At the moment all we can do is to wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? There probably is no way to flush the pending queue entries while recovery is ongoing? I did open a case with IBM TS003219893 and will continue there. 
Kind regards, Heiner From: on behalf of Venkateswara R Puvvada Reply to: gpfsug main discussion list Date: Monday, 13 January 2020 at 08:40 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? AFM maintains in-memory queue at the gateway node to keep track of changes happening on the fileset. If the in-memory queue is lost (memory pressure, daemon shutdown etc..), AFM runs recovery process which involves creating the snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operations) dependency, any changes to the AFM fileset during the recovery won't get replicated until the recovery the completion. AFM does the home directory scan for only dirty directories to get the names of the deleted and renamed files because old name for renamed file and deleted file name are not available at the cache on disk. Directories are made dirty when there is a rename or unlink operation is performed inside it. In your case it may be that all the directories became dirty due to the rename/unlink operations. AFM recovery process is single threaded. >Is this to be expected and normal behavior? What to do about it? >Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very unpractical, or is there some better way? Only for the dirty directories, see above. >Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home so that home directory scan can be avoided. These are some issues fixed in this regard. What is the scale version ? 
https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436

~Venkat (vpuvvada at in.ibm.com)

From: "Billich Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 01/08/2020 10:32 PM
Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hello,

still new to AFM, so some basic questions on how Recovery works for a SW cache:

we have an AFM SW cache in recovery mode: recovery first ran policies on the cache cluster, but now I see a 'tspcachescan' process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afm gateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/.../recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up the file lists before recovery starts to update home. I see some risk: in this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can't evict?

I wonder:

Is this to be expected and normal behavior? What to do about it?

Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very unpractical, or is there some better way?

Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes.

Thank you, I will welcome any clarification, advice or comments.

Kind regards, Heiner
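On the point about producing the filelist on home with a policy scan, a minimal sketch of such a scan is below. This is an assumption-laden illustration rather than a tested recipe: the filesystem path, node list, and output prefix are placeholders, and whatever list format AFM actually expects is not addressed here.

```shell
# Hypothetical policy scan listing every file and directory to a flat file.
# /path/to/fs, /tmp/homelist and somenodes are placeholders.
cat > /tmp/listall.pol <<'EOF'
RULE EXTERNAL LIST 'allfiles' EXEC ''
RULE 'listall' LIST 'allfiles' DIRECTORIES_PLUS
EOF
mmapplypolicy /path/to/fs -P /tmp/listall.pol -I defer -f /tmp/homelist -N somenodes
```

With -I defer and an empty EXEC, mmapplypolicy leaves the generated list files under the -f prefix instead of invoking an external script, which is the usual way to get a raw file list out of the parallel policy engine.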
--
=======================
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.billich at id.ethz.ch
========================

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From stockf at us.ibm.com Tue Jan 14 18:35:05 2020
From: stockf at us.ibm.com (Frederick Stock)
Date: Tue, 14 Jan 2020 18:35:05 +0000
Subject: [gpfsug-discuss] mmapplypolicy - listing policy occasionally gets stuck.
In-Reply-To: References: Message-ID:

An HTML attachment was scrubbed...

From S.J.Thompson at bham.ac.uk Tue Jan 14 20:21:12 2020
From: S.J.Thompson at bham.ac.uk (Simon Thompson)
Date: Tue, 14 Jan 2020 20:21:12 +0000
Subject: [gpfsug-discuss] London User Group
Message-ID:

Hi All,

Just a date for your diary, the UK/WW/London user group will be taking place 13th/14th May. In addition to this, we're also running an introductory day on 12th May for those recently acquainted with Spectrum Scale. Please mark the dates in your diary!

If you have any topics you would like to hear about in London (or any of the other WW user groups) please let me know. Please also take some time to think about if you could provide a site-update or user talk for the event. The feedback we get is that people want to hear more of these, but we can only do this if you are prepared to volunteer a talk.
Everyone has something to say about their site deployment: maybe you want to talk about what you are doing with Scale, how you found deployment, or the challenges you face.

Finally, as in the past few years, we are looking for sponsors of the UK event; this funds our evening social/networking event, which has been a great success over the past few years as the group has grown in size. I will be contacting companies who have supported us in the past, but please also drop me an email if you are interested in sponsoring the group and I will ensure I share the details of the sponsorship offering with you. When we advertise sponsorship, it will be offered on a first come, first served basis.

Thanks

Simon (UK/group chair)

From S.J.Thompson at bham.ac.uk Tue Jan 14 20:25:20 2020
From: S.J.Thompson at bham.ac.uk (Simon Thompson)
Date: Tue, 14 Jan 2020 20:25:20 +0000
Subject: [gpfsug-discuss] #spectrum-discover Slack channel
Message-ID:

We've today added a new Slack channel to the SSUG/PowerAI UG Slack community, '#spectrum-discover'. Whilst we know that a lot of the people using Spectrum Discover are Spectrum Scale users, we welcome all discussion of Discover on the Slack channel, not just from those using Spectrum Scale. As with the #spectrum-scale and #powerai channels, IBM are working to ensure there are appropriate people on the channel to help with discussion/queries.

If you are not already a member of the Slack community, please visit www.spectrumscaleug.org/join for details.

Thanks

Simon (UK/chair)

From heinrich.billich at id.ethz.ch Wed Jan 15 14:55:53 2020
From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD))
Date: Wed, 15 Jan 2020 14:55:53 +0000
Subject: [gpfsug-discuss] How to install efix with yum ?
Message-ID: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch>

Hello,

I will install efix9 on 5.0.4.1. The instructions ask to use

rpm --force -U gpfs.*.rpm

but give no yum command. I assume that this is not specific to this efix. I wonder if installing an efix with yum is supported and what the proper commands are? Using yum would make deployment much easier, but I don't see any yum options which match rpm's '--force' option.

--force Same as using --replacepkgs, --replacefiles, and --oldpackage.

Yum's 'upgrade' probably is the same as rpm's '--oldpackage', but what about '--replacepkgs' and '--replacefiles'?

Of course I can script this in several ways, but using yum should be much easier.

Thank you, any comments are welcome.

Cheers, Heiner

From Paul.Sanchez at deshaw.com Wed Jan 15 18:30:26 2020
From: Paul.Sanchez at deshaw.com (Sanchez, Paul)
Date: Wed, 15 Jan 2020 18:30:26 +0000
Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch>
References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch>
Message-ID: <8ee7ad0a895442abb843688936ac4d73@deshaw.com>

Yum generally only wants there to be a single version of any package (it is trying to eliminate conflicting provides/depends so that all of the packaging requirements are satisfied). So this alien packaging practice of installing an efix version of a package over the top of the base version is not compatible with yum.

The real issue for draconian sysadmins like us (whose systems must use and obey yum) is that there are files (*liblum.so) which are provided by the non-efix RPMS, but are not owned by the packages according to the RPM database, since they're purposefully installed outside of RPM's tracking mechanism.
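One way to see the effect described here is to ask RPM which files in the GPFS install tree it actually owns; `rpm -qf` exits non-zero for paths that no installed package claims. A small sketch, where `check_orphans` is a hypothetical helper and /usr/lpp/mmfs/lib is only an assumed example location:

```shell
# List files in a directory that no installed package claims ownership of.
check_orphans() {
    dir="$1"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        rpm -qf "$f" >/dev/null 2>&1 || echo "orphan: $f"
    done
}
# Example invocation; adjust the path for your own layout.
if [ -d /usr/lpp/mmfs/lib ]; then
    check_orphans /usr/lpp/mmfs/lib
fi
```

Anything the helper reports as an orphan is a file that a later "yum upgrade" or "yum remove" will neither replace nor clean up, which is the crux of the complaint above.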
We work around this by repackaging the three affected RPMS to include the orphaned files from the original RPMs (and eliminating the related but problematic checks from the RPMs' scripts) so that our efix RPMs have been "un-efix-ified" and will install as expected when using "yum upgrade". To my knowledge no one's published a way to do this, so we all just have to figure this out and run rpmrebuild for ourselves.

IBM isn't the only vendor who is "bad at packaging" from a sysadmin's point of view, but they are the only one which owns RedHat (who are the de facto masters of RPM/YUM/DNF packaging), so this should probably get better one day.

Thx
Paul

From jonathan.buzzard at strath.ac.uk Wed Jan 15 19:10:20 2020
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Wed, 15 Jan 2020 19:10:20 +0000
Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: <8ee7ad0a895442abb843688936ac4d73@deshaw.com>
References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com>
Message-ID: <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk>

On 15/01/2020 18:30, Sanchez, Paul wrote:
> Yum generally only wants there to be single version of any package (it
> is trying to eliminate conflicting provides/depends so that all of the
> packaging requirements are satisfied). So this alien packaging practice
> of installing an efix version of a package over the top of the base
> version is not compatible with yum.

I would at this juncture note that IBM should be appending the efix number to the RPM so that for example

gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9

which would firstly make the problem go away, and second would allow one to know which version of GPFS you happen to have installed on a node without doing some sort of voodoo.

> The real issue for draconian sysadmins like us (whose systems must use
> and obey yum) is that there are files (*liblum.so) which are provided by
> the non-efix RPMS, but are not owned by the packages according to the
> RPM database since they're purposefully installed outside of RPM's
> tracking mechanism.

It's worse than that, because if you install the RPM directly, yum/dnf then start bitching about the RPM database being modified outside of themselves, and all sorts of useful information gets lost when you purge the package installation history to make the error go away.

> We work around this by repackaging the three affected RPMS to include
> the orphaned files from the original RPMs (and eliminating the related
> but problematic checks from the RPMs' scripts) so that our efix RPMs
> have been "un-efix-ified" and will install as expected when using "yum
> upgrade". To my knowledge no one's published a way to do this, so we
> all just have to figure this out and run rpmrebuild for ourselves.
IBM should be hanging their heads in shame if the replacement RPM is missing files.

JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From kkr at lbl.gov Wed Jan 15 18:20:04 2020
From: kkr at lbl.gov (Kristy Kallback-Rose)
Date: Wed, 15 Jan 2020 10:20:04 -0800
Subject: [gpfsug-discuss] (Please help with) Planning US meeting for Spring 2020
In-Reply-To: References: <42F45E03-0AEC-422C-B3A9-4B5A21B1D8DF@lbl.gov>
Message-ID:

Now there are 27 wonderful people who have completed the poll. I will close it today, EOB. Please take the 2 minutes to fill it out before it closes. https://forms.gle/NFk5q4djJWvmDurW7

Thanks,
Kristy

From scale at us.ibm.com Wed Jan 15 20:59:33 2020
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Wed, 15 Jan 2020 15:59:33 -0500
Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk>
References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk>
Message-ID:

>> I don't see any yum options which match rpm's '--force' option.

Actually, you do not need to use the --force option, since efix RPMs have an incremental efix number in the rpm name.

The efix package provides update RPMs to be installed on top of the corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.

If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From novosirj at rutgers.edu Wed Jan 15 21:10:59 2020
From: novosirj at rutgers.edu (Ryan Novosielski)
Date: Wed, 15 Jan 2020 21:10:59 +0000
Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests
Message-ID:

Hi there,

I know some of the Spectrum Scale developers look at this list. I'm having a little trouble with support on this problem. We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM guests with a portability layer that has been installed via gpfs.gplbin RPMs that we built at our site and have used to install GPFS all over our environment. We've not seen this problem so far on any physical hosts, but have now experienced it on guests running on a number of our KVM hypervisors, across vendors and firmware versions, etc. At one time I thought it was all happening on systems using Mellanox virtual functions for Infiniband, but we've now seen it on VMs without VFs. There may be an SELinux interaction, but some of our hosts have it disabled outright, some are Permissive, and some were working successfully with 5.0.2.x GPFS.

What I've been instructed to try to solve this problem has been to run "mmbuildgpl", and it has solved the problem. I don't consider running "mmbuildgpl" a real solution, however. If RPMs are a supported means of installation, it should work.
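A small sanity check that sits between the two approaches is verifying that the installed portability-layer package matches the running kernel. This is a hedged sketch: the gpfs.gplbin-&lt;kernelver&gt; naming follows the usual convention for packages produced with mmbuildgpl's package-build option, but verify it against your own package names before relying on it.

```shell
# Check whether a portability-layer package matching the running kernel
# is installed. The package-name pattern is an assumed convention.
kver=$(uname -r)
if rpm -q "gpfs.gplbin-$kver" >/dev/null 2>&1; then
    echo "portability layer package present for $kver"
else
    echo "no gpfs.gplbin package for $kver"
fi
```

On a fleet installed purely from RPMs, running this across nodes quickly shows which guests are carrying a portability layer built for a different kernel than the one they boot.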
Support told me that they'd seen this solve the problem at another site as well. Does anyone have any more information about this problem, whether there's a fix in the pipeline, or something that can be done to cause this problem that we could remedy? Is there an easy place to see a list of eFixes to see if this has come up? I know it's very similar to a problem that happened, I believe it was after 5.0.2.2 and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x.

Below is a sample of the crash output:

[ 156.733477] kernel BUG at mm/slub.c:3772!
[ 156.734212] invalid opcode: 0000 [#1] SMP
[ 156.735017] Modules linked in: ebtable_nat ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod
[ 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE ------------ 3.10.0-1062.9.1.el7.x86_64 #1
[ 156.756782] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
[ 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: ffff8ae9f7a24000
[ 156.759326] RIP: 0010:[] [] kfree+0x13c/0x140
[ 156.760749] RSP: 0018:ffff8ae9f7a27278 EFLAGS: 00010246
[ 156.761717] RAX: 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60
[ 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: ffffffffbc6974bf
[ 156.764321] RBP: ffff8ae9f7a27290 R08: 0000000000000014 R09: 0000000000000003
[ 156.765612] R10: 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0
[ 156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: ffff8ae9f7a27938
[ 156.768229] FS: 00007f8ffff05700(0000) GS:ffff8aedbfd80000(0000) knlGS:0000000000000000
[ 156.769708] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 156.770754] CR2: 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0
[ 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 156.773367] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 156.774663] Call Trace:
[ 156.775154] [] cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux]
[ 156.776568] [] _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvPP10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 [mmfs26]
[ 156.779378] [] _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_PcjjjP10ext_cred_t+0x46a/0x7e0 [mmfs26]
[ 156.781689] [] ? _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 [mmfs26]
[ 156.783565] [] _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 [mmfs26]
[ 156.786228] [] _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7FilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 [mmfs26]
[ 156.788681] [] ? _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 [mmfs26]
[ 156.790448] [] _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVattr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 [mmfs26]
[ 156.793032] [] ? _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26]
[ 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 [mmfslinux]
[ 156.795838] [] ? _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6c0 [mmfs26]
[ 156.797753] [] ? __d_alloc+0x122/0x180
[ 156.798763] [] ? d_alloc+0x60/0x70
[ 156.799700] [] lookup_real+0x23/0x60
[ 156.800651] [] __lookup_hash+0x42/0x60
[ 156.801675] [] lookup_slow+0x42/0xa7
[ 156.802634] [] link_path_walk+0x80f/0x8b0
[ 156.803666] [] path_lookupat+0x7a/0x8b0
[ 156.804690] [] ? lru_cache_add+0xe/0x10
[ 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0
[ 156.806766] [] ? getname_flags+0x4f/0x1a0
[ 156.807817] [] filename_lookup+0x2b/0xc0
[ 156.808834] [] user_path_at_empty+0x67/0xc0
[ 156.809923] [] ? handle_mm_fault+0x39d/0x9b0
[ 156.811017] [] user_path_at+0x11/0x20
[ 156.811983] [] vfs_fstatat+0x63/0xc0
[ 156.812951] [] SYSC_newstat+0x2e/0x60
[ 156.813931] [] ? trace_do_page_fault+0x56/0x150
[ 156.815050] [] SyS_newstat+0xe/0x10
[ 156.816010] [] system_call_fastpath+0x25/0x2a
[ 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54
[ 156.822192] RIP [] kfree+0x13c/0x140
[ 156.823180] RSP
[ 156.823872] ---[ end trace 142960be4a4feed8 ]---
[ 156.824806] Kernel panic - not syncing: Fatal exception
[ 156.826475] Kernel Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

--
 ____
|| \\UTGERS,   |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ | Office of Advanced Research Computing - MSB C630, Newark
     `'

From Paul.Sanchez at deshaw.com Wed Jan 15 22:35:23 2020
From: Paul.Sanchez at deshaw.com (Sanchez, Paul)
Date: Wed, 15 Jan 2020 22:35:23 +0000
Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk>
Message-ID:

This reminds me that there is one more thing which drives the convoluted process I described earlier: automation.

Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base-version and efix RPMS, and in that order. IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal yum that installing the efix requires that the base-version be installed first.

(Our particular case is worse than just this though, since we prohibit installing two versions/releases for the same (non-kernel) package name. But that's not the case for everyone.)

-Paul

From scale at us.ibm.com Wed Jan 15 23:50:50 2020
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Wed, 15 Jan 2020 18:50:50 -0500
Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch><8ee7ad0a895442abb843688936ac4d73@deshaw.com><3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: When requesting efix, you can inform the service personnel that you need efix RPMs which don't have dependencies on the base-version. Our service team should be able to provide the appropriate efix RPMs that meet your needs. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. gpfsug-discuss-bounces at spectrumscale.org wrote on 01/15/2020 05:35:23 PM: > From: "Sanchez, Paul" > To: gpfsug main discussion list > Cc: "gpfsug-discuss-bounces at spectrumscale.org" bounces at spectrumscale.org> > Date: 01/15/2020 05:34 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > This reminds me that there is one more thing which drives the > convoluted process I described earlier? > > Automation. Deployment solutions which use yum to build new hosts > are often the place where one notices the problem. They would need > to determine that they should install both the base-version and efixRPMS and > in that order. 
IIRC, there were no RPM dependencies connecting the > efix RPMs to their base-version equivalents, so there was nothing to > signal YUM that installing the efix requires that the base-version > be installed first. > > (Our particular case is worse than just this though, since we > prohibit installing two versions/releases for the same (non-kernel) > package name. But that?s not the case for everyone.) > > -Paul > > From: gpfsug-discuss-bounces at spectrumscale.org bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale > Sent: Wednesday, January 15, 2020 16:00 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > This message was sent by an external party. > > >> I don't see any yum options which match rpm's '--force' option. > Actually, you do not need to use --force option since efix RPMs have > incremental efix number in rpm name. > > Efix package provides update RPMs to be installed on top of > corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5. > 0.4.1 is already installed on your system, "yum update" should work. > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum > Scale (GPFS), then please post it to the public IBM developerWroks Forum at > https://www.ibm.com/developerworks/community/forums/html/forum? > id=11111111-0000-0000-0000-000000000479. > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please > contact 1-800-237-5511 in the United States or your local IBM > Service Center in other countries. > > The forum is informally monitored as time permits and should not be > used for priority messages to the Spectrum Scale (GPFS) team. 
> > From: Jonathan Buzzard > To: "gpfsug-discuss at spectrumscale.org" > Date: 01/15/2020 02:09 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > Yum generally only wants there to be a single version of any package (it > > is trying to eliminate conflicting provides/depends so that all of the > > packaging requirements are satisfied). So this alien packaging practice > > of installing an efix version of a package over the top of the base > > version is not compatible with yum. > > I would at this juncture note that IBM should be appending the efix > number to the RPM so that for example > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > which would firstly make the problem go away, and second would allow one > to know which version of GPFS you happen to have installed on a node > without doing some sort of voodoo. > > > > > The real issue for draconian sysadmins like us (whose systems must use > > and obey yum) is that there are files (*liblum.so) which are provided by > > the non-efix RPMS, but are not owned by the packages according to the > > RPM database since they're purposefully installed outside of RPM's > > tracking mechanism. > > > > It's worse than that because if you install the RPM directly yum/dnf then > start bitching about the RPM database being modified outside of > themselves and all sorts of useful information gets lost when you purge > the package installation history to make the error go away. > > > We work around this by repackaging the three affected RPMS to include > > the orphaned files from the original RPMs (and eliminating the related > > but problematic checks from the RPMs'
scripts) so that our efix RPMs > > have been 'un-efix-ified' and will install as expected when using 'yum > > upgrade'. To my knowledge no one's published a way to do this, so we > > all just have to figure this out and run rpmrebuild for ourselves. > > > > IBM should be hanging their heads in shame if the replacement RPM is > missing files. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From frankli at us.ibm.com Thu Jan 16 04:47:43 2020 From: frankli at us.ibm.com (Frank N Lee) Date: Wed, 15 Jan 2020 22:47:43 -0600 Subject: [gpfsug-discuss] #spectrum-discover Slack channel In-Reply-To: References: Message-ID: Simon, Thanks for launching this Slack channel! Copying some of my colleagues who work with Discover.
Frank Frank Lee, PhD IBM Systems Group 314-482-5329 | @drfranknlee From: Simon Thompson To: "gpfsug-discuss at spectrumscale.org" Date: 01/14/2020 02:25 PM Subject: [EXTERNAL] [gpfsug-discuss] #spectrum-discover Slack channel Sent by: gpfsug-discuss-bounces at spectrumscale.org We've today added a new Slack channel to the SSUG/PowerAI ug slack community '#spectrum-discover'. Whilst we know that a lot of the people using Spectrum Discover are Spectrum Scale users, we welcome all discussion of Discover on the Slack channel, not just those using Spectrum Scale. As with the #spectrum-scale and #powerai channels, IBM are working to ensure there are appropriate people on the channel to help with discussion/queries. If you are not already a member of the Slack community, please visit www.spectrumscaleug.org/join for details. Thanks Simon (UK/chair) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From phgrau at zedat.fu-berlin.de Thu Jan 16 10:51:43 2020 From: phgrau at zedat.fu-berlin.de (Philipp Grau) Date: Thu, 16 Jan 2020 11:51:43 +0100 Subject: [gpfsug-discuss] Welcome to the "gpfsug-discuss" mailing list In-Reply-To: References: Message-ID: <20200116105143.GA278757@CIS.FU-Berlin.DE> Hello, as requested: * gpfsug-discuss-request at spectrumscale.org [15.01.20 13:40]: > Please introduce yourself to the members with your first post. I'm Philipp from Berlin, Germany.
The IT department of the "Freie Universität Berlin" is my workplace. We have a DDN system with some PB of storage, and GPFS nodes for exporting the space. The use case is "scientific storage": research data and such (no home or group shares). Regards, Philipp From knop at us.ibm.com Thu Jan 16 13:41:58 2020 From: knop at us.ibm.com (Felipe Knop) Date: Thu, 16 Jan 2020 13:41:58 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From skylar2 at uw.edu Thu Jan 16 15:32:27 2020 From: skylar2 at uw.edu (Skylar Thompson) Date: Thu, 16 Jan 2020 15:32:27 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Another problem we've run into with automating GPFS installs/upgrades is that the gplbin (kernel module) packages have a post-install script that will unmount the filesystem *even if the package isn't for the running kernel*. We needed to write some custom reporting in our configuration management system to only install gplbin if GPFS was already stopped on the node. On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: > This reminds me that there is one more thing which drives the convoluted process I described earlier… > > Automation. Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base-version and efix RPMS and in that order.
IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal YUM that installing the efix requires that the base-version be installed first. > > (Our particular case is worse than just this though, since we prohibit installing two versions/releases for the same (non-kernel) package name. But that's not the case for everyone.) > > -Paul > > From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale > Sent: Wednesday, January 15, 2020 16:00 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > > This message was sent by an external party. > > > >> I don't see any yum options which match rpm's '--force' option. > Actually, you do not need to use --force option since efix RPMs have incremental efix number in rpm name. > > Efix package provides update RPMs to be installed on top of corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.
> > From: Jonathan Buzzard > > To: "gpfsug-discuss at spectrumscale.org" > > Date: 01/15/2020 02:09 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ________________________________ > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > Yum generally only wants there to be a single version of any package (it > > is trying to eliminate conflicting provides/depends so that all of the > > packaging requirements are satisfied). So this alien packaging practice > > of installing an efix version of a package over the top of the base > > version is not compatible with yum. > > I would at this juncture note that IBM should be appending the efix > number to the RPM so that for example > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > which would firstly make the problem go away, and second would allow one > to know which version of GPFS you happen to have installed on a node > without doing some sort of voodoo. > > > > > The real issue for draconian sysadmins like us (whose systems must use > > and obey yum) is that there are files (*liblum.so) which are provided by > > the non-efix RPMS, but are not owned by the packages according to the > > RPM database since they're purposefully installed outside of RPM's > > tracking mechanism. > > > > It's worse than that because if you install the RPM directly yum/dnf then > start bitching about the RPM database being modified outside of > themselves and all sorts of useful information gets lost when you purge > the package installation history to make the error go away.
> > We work around this by repackaging the three affected RPMS to include > > the orphaned files from the original RPMs (and eliminating the related > > but problematic checks from the RPMs' scripts) so that our efix RPMs > > have been 'un-efix-ified' and will install as expected when using 'yum > > upgrade'. To my knowledge no one's published a way to do this, so we > > all just have to figure this out and run rpmrebuild for ourselves. > > > > IBM should be hanging their heads in shame if the replacement RPM is > missing files. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From bbanister at jumptrading.com Thu Jan 16 17:12:04 2020 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 16 Jan 2020 17:12:04 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Message-ID: We actually add an ExecStartPre directive override (e.g.
/etc/systemd/system/gpfs.service.d/gpfs.service.conf) to the gpfs.service [Service] section that points to a simple script that does a check of the GPFS RPMs installed on the system and updates them to what our config management specifies should be installed (a simple txt file in /etc/sysconfig namespace), which ensures that GPFS RPMs are updated before GPFS is started, while GPFS is still down. Works very well for us. The script also does some other checks and updates too, such as adding the node into the right GPFS cluster if needed. Hope that helps, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Skylar Thompson Sent: Thursday, January 16, 2020 9:32 AM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] How to install efix with yum ? [EXTERNAL EMAIL] Another problem we've run into with automating GPFS installs/upgrades is that the gplbin (kernel module) packages have a post-install script that will unmount the filesystem *even if the package isn't for the running kernel*. We needed to write some custom reporting in our configuration management system to only install gplbin if GPFS was already stopped on the node. On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: > This reminds me that there is one more thing which drives the convoluted process I described earlier… > > Automation. Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base-version and efix RPMS and in that order. IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal YUM that installing the efix requires that the base-version be installed first. > > (Our particular case is worse than just this though, since we prohibit > installing two versions/releases for the same (non-kernel) package > name.
But that's not the case for everyone.) > > -Paul > > From: gpfsug-discuss-bounces at spectrumscale.org > On Behalf Of IBM Spectrum > Scale > Sent: Wednesday, January 15, 2020 16:00 > To: gpfsug main discussion list > Cc: gpfsug-discuss-bounces at spectrumscale.org > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > > This message was sent by an external party. > > > >> I don't see any yum options which match rpm's '--force' option. > Actually, you do not need to use --force option since efix RPMs have incremental efix number in rpm name. > > Efix package provides update RPMs to be installed on top of corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. > > Regards, The Spectrum Scale (GPFS) team > > ---------------------------------------------------------------------- > -------------------------------------------- > If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.
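The ExecStartPre override Bryan describes earlier in this message might look roughly like the following drop-in. The pre-flight script path and the sysconfig file are illustrative guesses, not his actual configuration:

```ini
# /etc/systemd/system/gpfs.service.d/gpfs.service.conf (hypothetical drop-in)
# Runs before mmfsd starts, i.e. while GPFS is still down on this node.
[Service]
# Hypothetical script: compares the installed GPFS RPMs against the
# version list that config management writes (e.g. a text file under
# /etc/sysconfig), updates them if needed, and can also join the node
# to the right GPFS cluster before GPFS comes up.
ExecStartPre=/usr/local/sbin/gpfs-preflight.sh
```

After adding or changing a drop-in like this, `systemctl daemon-reload` is needed before the next start of the gpfs service picks it up.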
> > From: Jonathan Buzzard > > To: "gpfsug-discuss at spectrumscale.org" > > Date: 01/15/2020 02:09 PM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > > Sent by: > gpfsug-discuss-bounces at spectrumscale.org > > > > ________________________________ > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > > Yum generally only wants there to be a single version of any package > > > (it is trying to eliminate conflicting provides/depends so that all > > > of the packaging requirements are satisfied). So this alien > > > packaging practice of installing an efix version of a package over > > > the top of the base version is not compatible with yum. > > > > I would at this juncture note that IBM should be appending the efix > > number to the RPM so that for example > > > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > > > which would firstly make the problem go away, and second would allow > > one to know which version of GPFS you happen to have installed on a > > node without doing some sort of voodoo. > > > > > > > > The real issue for draconian sysadmins like us (whose systems must > > > use and obey yum) is that there are files (*liblum.so) which are > > > provided by the non-efix RPMS, but are not owned by the packages > > > according to the RPM database since they're purposefully installed > > > outside of RPM's tracking mechanism. > > > > > > > It's worse than that because if you install the RPM directly yum/dnf > > then start bitching about the RPM database being modified outside of > > themselves and all sorts of useful information gets lost when you > > purge the package installation history to make the error go away.
> > > We work around this by repackaging the three affected RPMS to > > include the orphaned files from the original RPMs (and eliminating > > the related but problematic checks from the RPMs' scripts) so that > > our efix RPMs have been 'un-efix-ified' and will install as > > expected when using 'yum upgrade'. To my knowledge no one's > > published a way to do this, so we all just have to figure this out and run rpmrebuild for ourselves. > > > > IBM should be hanging their heads in shame if the replacement RPM is > missing files. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From novosirj at rutgers.edu Thu Jan 16 21:31:57 2020 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Thu, 16 Jan 2020 21:31:57 +0000 Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale
Data Access Edition installed via gpfs.gplbin RPM on KVM guests In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Felipe, I either misunderstood support or convinced them to take further action. It at first looked like they were suggesting "mmbuildgpl fixed it: case closed" (I know they wanted to close the SalesForce case anyway, which would prevent communication on the issue). At this point, they've asked for a bunch more information. Support is asking similar questions re: the speculations, and I'll provide them with the relevant output ASAP, but I did confirm all of that, including that there were no stray mmfs26/tracedev kernel modules anywhere else in the relevant /lib/modules PATHs. In the original case, I built on a machine running 3.10.0-957.27.2, but pointed to the 3.10.0-1062.9.1 source code/defined the relevant portions of usr/lpp/mmfs/src/config/env.mcr. That's always worked before, and rebuilding once the build system was running 3.10.0-1062.9.1 as well did not change anything either. In all cases, the GPFS version was Spectrum Scale Data Access Edition 5.0.4-1. If you build against either the wrong kernel version or the wrong GPFS version, both will appear right in the filename of the gpfs.gplbin RPM you build. Mine is called: gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm Anyway, thanks for your response; I know you might not be following/working on this directly, but I figured the extra info might be of interest. On 1/16/20 8:41 AM, Felipe Knop wrote: > Hi Ryan, > > I'm aware of this ticket, and I understand that there has been > active communication with the service team on this problem. 
> > The crash itself, as you indicate, looks like a problem that has > been fixed: > > https://www.ibm.com/support/pages/ibm-spectrum-scale-gpfs-releases-423 13-or-later-and-5022-or-later-have-issues-where-kernel-crashes-rhel76-0 > > The fact that the problem goes away when *mmbuildgpl* is issued > appears to point to some incompatibility with kernel levels and/or > Scale version levels. Just speculating, some possible areas may > be: > > > * The RPM might have been built on a version of Scale without the > fix * The RPM might have been built on a different (minor) version > of the kernel * Somehow the VM picked a "leftover" GPFS kernel > module, as opposed to the one included in gpfs.gplbin -- given > that mmfsd never complained about a missing GPL kernel module > > > Felipe > > ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM > Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 > (845) 433-9314 T/L 293-9314 > > > > > ----- Original message ----- From: Ryan Novosielski > Sent by: > gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion > list Cc: Subject: [EXTERNAL] > [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum > Scale Data Access Edition installed via gpfs.gplbin RPM on KVM > guests Date: Wed, Jan 15, 2020 4:11 PM > > Hi there, > > I know some of the Spectrum Scale developers look at this list. > I?m having a little trouble with support on this problem. > > We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM > guests with a portability layer that has been installed via > gpfs.gplbin RPMs that we built at our site and have used to > install GPFS all over our environment. We?ve not seen this problem > so far on any physical hosts, but have now experienced it on guests > running on number of our KVM hypervisors, across vendors and > firmware versions, etc. 
At one time I thought it was all happening > on systems using Mellanox virtual functions for Infiniband, but > we've now seen it on VMs without VFs. There may be an SELinux > interaction, but some of our hosts have it disabled outright, some > are Permissive, and some were working successfully with 5.0.2.x > GPFS. > > What I've been instructed to try to solve this problem has been to > run "mmbuildgpl", and it has solved the problem. I don't consider > running "mmbuildgpl" a real solution, however. If RPMs are a > supported means of installation, it should work. Support told me > that they'd seen this solve the problem at another site as well. > > Does anyone have any more information about this problem/whether > there's a fix in the pipeline, or something that can be done to > cause this problem that we could remedy? Is there an easy place to > see a list of eFixes to see if this has come up? I know it's very > similar to a problem that happened I believe it was after 5.0.2.2 > and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x. > > Below is a sample of the crash output: > > [ 156.733477] kernel BUG at mm/slub.c:3772!
[ 156.734212] invalid > opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat > ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) > mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) > iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) > mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 > ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 > ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat > iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 > xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter > iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul > ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper > ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 > virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c > mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic > pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul > crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core > devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy > virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ > 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE > ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] > Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 [ > 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: > ffff8ae9f7a24000 [ 156.759326] RIP: 0010:[] > [] kfree+0x13c/0x140 [ 156.760749] RSP: > 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: > 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ > 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: > ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: > 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: > 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ > 156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: > ffff8ae9f7a27938 [ 
156.768229] FS: 00007f8ffff05700(0000) > GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: > 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ > 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: > 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ > 156.775154] [] > cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] > [] > _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvP P10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 > > [mmfs26] > [ 156.779378] [] > _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_P cjjjP10ext_cred_t+0x46a/0x7e0 > > [mmfs26] > [ 156.781689] [] ? > _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 > > [mmfs26] > [ 156.783565] [] > _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNod e_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 > > [mmfs26] > [ 156.786228] [] > _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7F ilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 > > [mmfs26] > [ 156.788681] [] ? > _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 > [mmfs26] [ 156.790448] [] > _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVa ttr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 > > [mmfs26] > [ 156.793032] [] ? > _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ > 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 > [mmfslinux] [ 156.795838] [] ? > _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6 c0 > > [mmfs26] > [ 156.797753] [] ? __d_alloc+0x122/0x180 [ > 156.798763] [] ? 
d_alloc+0x60/0x70 [ > 156.799700] [] lookup_real+0x23/0x60 [ > 156.800651] [] __lookup_hash+0x42/0x60 [ > 156.801675] [] lookup_slow+0x42/0xa7 [ > 156.802634] [] link_path_walk+0x80f/0x8b0 [ > 156.803666] [] path_lookupat+0x7a/0x8b0 [ > 156.804690] [] ? lru_cache_add+0xe/0x10 [ > 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ > 156.806766] [] ? getname_flags+0x4f/0x1a0 [ > 156.807817] [] filename_lookup+0x2b/0xc0 [ > 156.808834] [] user_path_at_empty+0x67/0xc0 [ > 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ > 156.811017] [] user_path_at+0x11/0x20 [ > 156.811983] [] vfs_fstatat+0x63/0xc0 [ > 156.812951] [] SYSC_newstat+0x2e/0x60 [ > 156.813931] [] ? trace_do_page_fault+0x56/0x150 > [ 156.815050] [] SyS_newstat+0xe/0x10 [ > 156.816010] [] system_call_fastpath+0x25/0x2a [ > 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 > df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 > e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 > 41 55 41 54 [ 156.822192] RIP [] > kfree+0x13c/0x140 [ 156.823180] RSP [ > 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] > Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel > Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: > 0xffffffff80000000-0xffffffffbfffffff) > > -- ____ || \\UTGERS, > |---------------------------*O*--------------------------- ||_// > the State | Ryan Novosielski - novosirj at rutgers.edu || \\ > University | Sr. 
Technologist - 973/972.0922 (2x0922) ~*~ RBHS > Campus || \\ of NJ | Office of Advanced Research Computing - > MSB C630, Newark `' > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > - -- ____ || \\UTGERS, |----------------------*O*------------------------ ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus || \\ of NJ | Office of Advanced Res. Comp. - MSB C630, Newark `' -----BEGIN PGP SIGNATURE----- iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXiDWSgAKCRCZv6Bp0Ryx vpCsAKCQ2ykmeycbOVrHTGaFqb2SsU26NwCg3YyYi4Jy2d+xZjJkE6Vfht8O8gM= =9rKb -----END PGP SIGNATURE----- From scale at us.ibm.com Thu Jan 16 22:59:14 2020 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Thu, 16 Jan 2020 17:59:14 -0500 Subject: [gpfsug-discuss] How to install efix with yum ? 
In-Reply-To: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch><8ee7ad0a895442abb843688936ac4d73@deshaw.com><3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu> Message-ID: On Spectrum Scale 4.2.3.15 or later and 5.0.2.2 or later, you can install gplbin without stopping GPFS by using the following steps: (1) build gpfs.gplbin using mmbuildgpl --build-package; (2) set environment variable MM_INSTALL_ONLY to 1 before installing the gpfs.gplbin package with rpm -i gpfs.gplbin*.rpm. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. gpfsug-discuss-bounces at spectrumscale.org wrote on 01/16/2020 10:32:27 AM: > From: Skylar Thompson > To: gpfsug-discuss at spectrumscale.org > Date: 01/16/2020 10:35 AM > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > Another problem we've run into with automating GPFS installs/upgrades is > that the gplbin (kernel module) packages have a post-install script that > will unmount the filesystem *even if the package isn't for the running > kernel*.
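The gplbin-for-the-wrong-kernel problem quoted above is detectable before install, because a gpfs.gplbin package name embeds the kernel it was built for (for example, gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm, the filename shown earlier in this digest). A sketch of a pre-install guard based on that naming convention; the pattern is an assumption from that one example, so adjust it if your builds name packages differently:

```python
import re

# Matches gpfs.gplbin-<kernel>-<gpfs-version>.<arch>.rpm, where <kernel>
# is what `uname -r` reports on the target node.
_GPLBIN_RE = re.compile(
    r"^gpfs\.gplbin-(?P<kernel>.+?\.el\d+\.(?:x86_64|ppc64le|s390x))"
    r"-(?P<gpfs>\d+\.\d+\.\d+-\d+)\.[^.]+\.rpm$"
)

def gplbin_matches_kernel(rpm_name, running_kernel):
    """Return True only if the gplbin RPM was built for `running_kernel`;
    a check one could run before `rpm -i` to skip mismatched packages."""
    m = _GPLBIN_RE.match(rpm_name)
    if not m:
        raise ValueError("unrecognised gplbin package name: %s" % rpm_name)
    return m.group("kernel") == running_kernel

print(gplbin_matches_kernel(
    "gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm",
    "3.10.0-1062.9.1.el7.x86_64"))  # True: built for this kernel
```

In a deployment tool, the second argument would come from `uname -r` (e.g. `platform.release()` in Python), and a False result would skip the install rather than rely on the package's post-install script behaving sensibly.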
We needed to write some custom reporting in our configuration
> management system to only install gplbin if GPFS was already stopped on the
> node.
>
> On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote:
> > This reminds me that there is one more thing which drives the
> > convoluted process I described earlier...
> >
> > Automation. Deployment solutions which use yum to build new hosts
> > are often the place where one notices the problem. They would need
> > to determine that they should install both the base-version and efix
> > RPMS and in that order. IIRC, there were no RPM dependencies
> > connecting the efix RPMs to their base-version equivalents, so
> > there was nothing to signal YUM that installing the efix requires
> > that the base-version be installed first.
> >
> > (Our particular case is worse than just this though, since we
> > prohibit installing two versions/releases for the same (non-kernel)
> > package name. But that's not the case for everyone.)
> >
> > -Paul
> >
> > From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale
> > Sent: Wednesday, January 15, 2020 16:00
> > To: gpfsug main discussion list
> > Cc: gpfsug-discuss-bounces at spectrumscale.org
> > Subject: Re: [gpfsug-discuss] How to install efix with yum ?
> >
> > This message was sent by an external party.
> >
> > >> I don't see any yum options which match rpm's '--force' option.
> > Actually, you do not need to use the --force option since efix RPMs
> > have an incremental efix number in the rpm name.
> >
> > The efix package provides update RPMs to be installed on top of the
> > corresponding PTF GA version. When you install 5.0.4.1 efix9, if
> > 5.0.4.1 is already installed on your system, "yum update" should work.
> > Regards, The Spectrum Scale (GPFS) team
> >
> > ------------------------------------------------------------------------------------------------------------------
> > If you feel that your question can benefit other users of Spectrum
> > Scale (GPFS), then please post it to the public IBM developerWorks Forum at
> > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.
> >
> > If your query concerns a potential software error in Spectrum
> > Scale (GPFS) and you have an IBM software maintenance contract
> > please contact 1-800-237-5511 in the United States or your local IBM
> > Service Center in other countries.
> >
> > The forum is informally monitored as time permits and should not
> > be used for priority messages to the Spectrum Scale (GPFS) team.
> >
> > From: Jonathan Buzzard <jonathan.buzzard at strath.ac.uk>
> > To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
> > Date: 01/15/2020 02:09 PM
> > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ?
> > Sent by: gpfsug-discuss-bounces at spectrumscale.org
> >
> > ________________________________
> >
> > On 15/01/2020 18:30, Sanchez, Paul wrote:
> > > Yum generally only wants there to be a single version of any package (it
> > > is trying to eliminate conflicting provides/depends so that all of the
> > > packaging requirements are satisfied). So this alien packaging practice
> > > of installing an efix version of a package over the top of the base
> > > version is not compatible with yum.
> > I would at this juncture note that IBM should be appending the efix
> > number to the RPM so that for example
> >
> > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9
> >
> > which would firstly make the problem go away, and second would allow one
> > to know which version of GPFS you happen to have installed on a node
> > without doing some sort of voodoo.
> >
> > > The real issue for draconian sysadmins like us (whose systems must use
> > > and obey yum) is that there are files (*liblum.so) which are provided by
> > > the non-efix RPMS, but are not owned by the packages according to the
> > > RPM database since they're purposefully installed outside of RPM's
> > > tracking mechanism.
> >
> > It's worse than that because if you install the RPM directly, yum/dnf then
> > start bitching about the RPM database being modified outside of
> > themselves, and all sorts of useful information gets lost when you purge
> > the package installation history to make the error go away.
> >
> > > We work around this by repackaging the three affected RPMS to include
> > > the orphaned files from the original RPMs (and eliminating the related
> > > but problematic checks from the RPMs' scripts) so that our efix RPMs
> > > have been "un-efix-ified" and will install as expected when using "yum
> > > upgrade". To my knowledge no one's published a way to do this, so we
> > > all just have to figure this out and run rpmrebuild for ourselves.
> >
> > IBM should be hanging their heads in shame if the replacement RPM is
> > missing files.
> >
> > JAB.
> >
> > --
> > Jonathan A. Buzzard Tel: +44141-5483420
> > HPC System Administrator, ARCHIE-WeSt.
> > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > https://urldefense.proofpoint.com/v2/url?
> > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-
> > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f-
> > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e=
> >
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > https://urldefense.proofpoint.com/v2/url?
> > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-
> > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f-
> > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e=
>
> --
> -- Skylar Thompson (skylar2 at u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> https://urldefense.proofpoint.com/v2/url?
> u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-
> siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f-
> YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e=
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From novosirj at rutgers.edu Fri Jan 17 02:20:29 2020
From: novosirj at rutgers.edu (Ryan Novosielski)
Date: Fri, 17 Jan 2020 02:20:29 +0000
Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu>
References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu>
Message-ID: <47921BA1-A20B-4B55-876D-A26C082496BE@rutgers.edu>

Thank you for the reminder. I've received that nasty surprise myself, but just long ago enough to have forgotten it. Would love to see that fixed.
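The gplbin procedure the Scale team posted earlier in this thread (supported on 4.2.3.15 or later and 5.0.2.2 or later) can be sketched as a short script; the rpmbuild output path below is an assumption and will vary by build host.

```shell
# Sketch of the gplbin procedure described in this thread: build the
# portability layer as an RPM, then install it with MM_INSTALL_ONLY=1 so
# the package scripts do not touch the running GPFS instance.
# The rpmbuild output path is an assumption; adjust it for your build host.
stage_gplbin() {
    /usr/lpp/mmfs/bin/mmbuildgpl --build-package || return 1
    MM_INSTALL_ONLY=1 rpm -i /root/rpmbuild/RPMS/x86_64/gpfs.gplbin-*.rpm
}
```

Defined as a function so the ordering is explicit; call it only on a node where installing the new portability layer is actually intended.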
> On Jan 16, 2020, at 10:32 AM, Skylar Thompson wrote:
>
> Another problem we've run into with automating GPFS installs/upgrades is
> that the gplbin (kernel module) packages have a post-install script that
> will unmount the filesystem *even if the package isn't for the running
> kernel*. We needed to write some custom reporting in our configuration
> management system to only install gplbin if GPFS was already stopped on the
> node.
>
> On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote:
>> This reminds me that there is one more thing which drives the convoluted process I described earlier...
>>
>> Automation. Deployment solutions which use yum to build new hosts are often the place where one notices the problem. They would need to determine that they should install both the base-version and efix RPMS and in that order. IIRC, there were no RPM dependencies connecting the efix RPMs to their base-version equivalents, so there was nothing to signal YUM that installing the efix requires that the base-version be installed first.
>>
>> (Our particular case is worse than just this though, since we prohibit installing two versions/releases for the same (non-kernel) package name. But that's not the case for everyone.)
>>
>> -Paul
>>
>> From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of IBM Spectrum Scale
>> Sent: Wednesday, January 15, 2020 16:00
>> To: gpfsug main discussion list
>> Cc: gpfsug-discuss-bounces at spectrumscale.org
>> Subject: Re: [gpfsug-discuss] How to install efix with yum ?
>>
>> This message was sent by an external party.
>>
>>>> I don't see any yum options which match rpm's '--force' option.
>> Actually, you do not need to use the --force option since efix RPMs have an incremental efix number in the rpm name.
>>
>> The efix package provides update RPMs to be installed on top of the corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work.
>> Regards, The Spectrum Scale (GPFS) team
>>
>> ------------------------------------------------------------------------------------------------------------------
>> If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.
>>
>> If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries.
>>
>> The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.
>>
>> From: Jonathan Buzzard <jonathan.buzzard at strath.ac.uk>
>> To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
>> Date: 01/15/2020 02:09 PM
>> Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ?
>> Sent by: gpfsug-discuss-bounces at spectrumscale.org
>>
>> ________________________________
>>
>> On 15/01/2020 18:30, Sanchez, Paul wrote:
>>> Yum generally only wants there to be a single version of any package (it
>>> is trying to eliminate conflicting provides/depends so that all of the
>>> packaging requirements are satisfied). So this alien packaging practice
>>> of installing an efix version of a package over the top of the base
>>> version is not compatible with yum.
>> I would at this juncture note that IBM should be appending the efix
>> number to the RPM so that for example
>>
>> gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9
>>
>> which would firstly make the problem go away, and second would allow one
>> to know which version of GPFS you happen to have installed on a node
>> without doing some sort of voodoo.
>>
>>> The real issue for draconian sysadmins like us (whose systems must use
>>> and obey yum) is that there are files (*liblum.so) which are provided by
>>> the non-efix RPMS, but are not owned by the packages according to the
>>> RPM database since they're purposefully installed outside of RPM's
>>> tracking mechanism.
>>
>> It's worse than that because if you install the RPM directly, yum/dnf then
>> start bitching about the RPM database being modified outside of
>> themselves, and all sorts of useful information gets lost when you purge
>> the package installation history to make the error go away.
>>
>>> We work around this by repackaging the three affected RPMS to include
>>> the orphaned files from the original RPMs (and eliminating the related
>>> but problematic checks from the RPMs' scripts) so that our efix RPMs
>>> have been "un-efix-ified" and will install as expected when using "yum
>>> upgrade". To my knowledge no one's published a way to do this, so we
>>> all just have to figure this out and run rpmrebuild for ourselves.
>>
>> IBM should be hanging their heads in shame if the replacement RPM is
>> missing files.
>>
>> JAB.
>>
>> --
>> Jonathan A. Buzzard Tel: +44141-5483420
>> HPC System Administrator, ARCHIE-WeSt.
>> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

--
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr.
Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'

From knop at us.ibm.com Fri Jan 17 15:35:19 2020
From: knop at us.ibm.com (Felipe Knop)
Date: Fri, 17 Jan 2020 15:35:19 +0000
Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests
In-Reply-To: References: , Message-ID:

An HTML attachment was scrubbed...
URL:

From skylar2 at uw.edu Fri Jan 17 15:42:57 2020
From: skylar2 at uw.edu (Skylar Thompson)
Date: Fri, 17 Jan 2020 15:42:57 +0000
Subject: [gpfsug-discuss] How to install efix with yum ?
In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> <20200116153227.6v7o6awta5yy32tj@utumno.gs.washington.edu>
Message-ID: <20200117154257.45ioc4ugw7dvuwym@utumno.gs.washington.edu>

Thanks for the pointer! We're in the process of upgrading from 4.2.3-6 to 4.2.3-19 so I'll make a note that we should start setting that environment variable when we build gplbin.

On Thu, Jan 16, 2020 at 05:59:14PM -0500, IBM Spectrum Scale wrote:
> On Spectrum Scale 4.2.3.15 or later and 5.0.2.2 or later, you can install
> gplbin without stopping GPFS by using the following steps:
>
> Build gpfs.gplbin using mmbuildgpl --build-package
> Set environment variable MM_INSTALL_ONLY to 1 before installing the gpfs.gplbin
> package with rpm -i gpfs.gplbin*.rpm
>
> Regards, The Spectrum Scale (GPFS) team
>
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of Spectrum Scale
> (GPFS), then please post it to the public IBM developerWorks Forum at
> https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479
> > If your query concerns a potential software error in Spectrum Scale (GPFS) > and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in > other countries. > > The forum is informally monitored as time permits and should not be used > for priority messages to the Spectrum Scale (GPFS) team. > > gpfsug-discuss-bounces at spectrumscale.org wrote on 01/16/2020 10:32:27 AM: > > > From: Skylar Thompson > > To: gpfsug-discuss at spectrumscale.org > > Date: 01/16/2020 10:35 AM > > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ? > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Another problem we've run into with automating GPFS installs/upgrades is > > that the gplbin (kernel module) packages have a post-install script that > > will unmount the filesystem *even if the package isn't for the running > > kernel*. We needed to write some custom reporting in our configuration > > management system to only install gplbin if GPFS was already stopped on > the > > node. > > > > On Wed, Jan 15, 2020 at 10:35:23PM +0000, Sanchez, Paul wrote: > > > This reminds me that there is one more thing which drives the > > convoluted process I described earlier??? > > > > > > Automation. Deployment solutions which use yum to build new hosts > > are often the place where one notices the problem. They would need > > to determine that they should install both the base-version and efix > > RPMS and in that order. IIRC, there were no RPM dependencies > > connecting the efix RPMs to their base-version equivalents, so > > there was nothing to signal YUM that installing the efix requires > > that the base-version be installed first. > > > > > > (Our particular case is worse than just this though, since we > > prohibit installing two versions/releases for the same (non-kernel) > > package name. But that???s not the case for everyone.) 
> > > > > > -Paul > > > > > > From: gpfsug-discuss-bounces at spectrumscale.org > bounces at spectrumscale.org> On Behalf Of IBM Spectrum Scale > > > Sent: Wednesday, January 15, 2020 16:00 > > > To: gpfsug main discussion list > > > Cc: gpfsug-discuss-bounces at spectrumscale.org > > > Subject: Re: [gpfsug-discuss] How to install efix with yum ? > > > > > > > > > This message was sent by an external party. > > > > > > > > > >> I don't see any yum options which match rpm's '--force' option. > > > Actually, you do not need to use --force option since efix RPMs > > have incremental efix number in rpm name. > > > > > > Efix package provides update RPMs to be installed on top of > > corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5. > > 0.4.1 is already installed on your system, "yum update" should work. > > > > > > Regards, The Spectrum Scale (GPFS) team > > > > > > > > > ------------------------------------------------------------------------------------------------------------------ > > > If you feel that your question can benefit other users of Spectrum > > Scale (GPFS), then please post it to the public IBM developerWroks Forum > at > > https://www.ibm.com/developerworks/community/forums/html/forum? > > id=11111111-0000-0000-0000-000000000479. > > > > > > If your query concerns a potential software error in Spectrum > > Scale (GPFS) and you have an IBM software maintenance contract > > please contact 1-800-237-5511 in the United States or your local IBM > > Service Center in other countries. > > > > > > The forum is informally monitored as time permits and should not > > be used for priority messages to the Spectrum Scale (GPFS) team. 
> > > > > > [Inactive hide details for Jonathan Buzzard ---01/15/2020 02:09:33 > > PM---On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum > > generall]Jonathan Buzzard ---01/15/2020 02:09:33 PM---On 15/01/2020 > > 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be > > single version of a > > > > > > From: Jonathan Buzzard > mailto:jonathan.buzzard at strath.ac.uk>> > > > To: "gpfsug-discuss at spectrumscale.org > discuss at spectrumscale.org>" > mailto:gpfsug-discuss at spectrumscale.org>> > > > Date: 01/15/2020 02:09 PM > > > Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum > ? > > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > discuss-bounces at spectrumscale.org> > > > > > > ________________________________ > > > > > > > > > > > > On 15/01/2020 18:30, Sanchez, Paul wrote: > > > > Yum generally only wants there to be single version of any package > (it > > > > is trying to eliminate conflicting provides/depends so that all of > the > > > > packaging requirements are satisfied). So this alien packaging > practice > > > > of installing an efix version of a package over the top of the base > > > > version is not compatible with yum. > > > > > > I would at this juncture note that IBM should be appending the efix > > > number to the RPM so that for example > > > > > > gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 > > > > > > which would firstly make the problem go away, and second would allow > one > > > to know which version of GPFS you happen to have installed on a node > > > without doing some sort of voodoo. > > > > > > > > > > > The real issue for draconian sysadmins like us (whose systems must > use > > > > and obey yum) is that there are files (*liblum.so) which are > provided by > > > > the non-efix RPMS, but are not owned by the packages according to > the > > > > RPM database since they???re purposefully installed outside of > RPM???s > > > > tracking mechanism. 
> > > > > > > > > > It worse than that because if you install the RPM directly yum/dnf > then > > > start bitching about the RPM database being modified outside of > > > themselves and all sorts of useful information gets lost when you > purge > > > the package installation history to make the error go away. > > > > > > > We work around this by repackaging the three affected RPMS to > include > > > > the orphaned files from the original RPMs (and eliminating the > related > > > > but problematic checks from the RPMs??? scripts) so that our efix > RPMs > > > > have been ???un-efix-ified??? and will install as expected when > > using ???yum > > > > upgrade???. To my knowledge no one???s published a way to do this, > so we > > > > all just have to figure this out and run rpmrebuild for ourselves. > > > > > > > > > > IBM should be hanging their heads in shame if the replacement RPM is > > > missing files. > > > > > > JAB. > > > > > > -- > > > Jonathan A. Buzzard Tel: +44141-5483420 > > > HPC System Administrator, ARCHIE-WeSt. > > > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url? > > > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f- > > > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e= > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > https://urldefense.proofpoint.com/v2/url? 
> > > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-
> > > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f-
> > > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e=
> >
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > https://urldefense.proofpoint.com/v2/url?
> > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-
> > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f-
> > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e=
> >
> > --
> > -- Skylar Thompson (skylar2 at u.washington.edu)
> > -- Genome Sciences Department, System Administrator
> > -- Foege Building S046, (206)-685-7354
> > -- University of Washington School of Medicine
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > https://urldefense.proofpoint.com/v2/url?
> > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-
> > siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JuqnpjbB7dT517f-
> > YI9EzaM_C0i4QvKJIJn_Vsre80k&s=T9L8T-cXzxzJGTWfpHOTFoExTltGDVXmHFuv9_Jeyjo&e=
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

--
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine

From novosirj at rutgers.edu Fri Jan 17 15:55:54 2020
From: novosirj at rutgers.edu (Ryan Novosielski)
Date: Fri, 17 Jan 2020 15:55:54 +0000
Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests
In-Reply-To: References: , , Message-ID:

That /is/ interesting. I'm a little confused about how that could be playing out in a case where I'm building on -1062.9.1, building for -1062.9.1, and running on -1062.9.1. Is there something inherent in the RPM building process that hasn't caught up, or am I misunderstanding that change's impact on it?
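Since both the target kernel and the Scale version are encoded in the gpfs.gplbin file name, a build/run mismatch can at least be ruled out mechanically. A rough sketch; the parsing is a guess based on the file names quoted in this thread:

```shell
# Extract the target kernel release embedded in a gpfs.gplbin file name,
# e.g. gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm.
# The parsing below is an assumption based on names quoted in this thread.
gplbin_kernel() {
    echo "$1" | sed -e 's/^gpfs\.gplbin-//' -e 's/-[0-9.]*-[0-9]*\.[^.]*\.rpm$//'
}

# Before installing, compare against the running kernel, e.g.:
#   [ "$(gplbin_kernel "$pkg")" = "$(uname -r)" ] || echo "kernel mismatch"
```

This only checks the advertised target kernel, of course; it would not have caught the LINUX_KERNEL_VERSION_VERBOSE issue, where the build itself miscompiled against a correctly named kernel.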
--
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'

On Jan 17, 2020, at 10:35, Felipe Knop wrote:

Hi Ryan,

Some interesting IBM-internal communication overnight. The problem seems related to a change made to LINUX_KERNEL_VERSION_VERBOSE to handle the additional digit in the kernel numbering (3.10.0-1000+). The GPL layer expected LINUX_KERNEL_VERSION_VERBOSE to have that extra digit, and its absence resulted in an incorrect function being compiled in, which led to the crash. This, at least, seems to make sense, in terms of matching the symptoms of the problem.

We are still in internal debates on whether/how to update our guidelines for gplbin generation ...

Regards, Felipe

---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 (845) 433-9314 T/L 293-9314

----- Original message -----
From: Ryan Novosielski
Sent by: gpfsug-discuss-bounces at spectrumscale.org
To: "gpfsug-discuss at spectrumscale.org"
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests
Date: Thu, Jan 16, 2020 4:33 PM

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Felipe,

I either misunderstood support or convinced them to take further action. It at first looked like they were suggesting "mmbuildgpl fixed it: case closed" (I know they wanted to close the SalesForce case anyway, which would prevent communication on the issue). At this point, they've asked for a bunch more information.
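The "leftover kernel module" speculation raised in this thread is straightforward to check by hand; a sketch (the mmfs26/tracedev module names are taken from the discussion, everything else is stock kmod tooling):

```shell
# Look for copies of the GPFS portability modules (mmfs26/tracedev, the
# names discussed in this thread) across every installed kernel tree,
# then show the vermagic of the module that would be loaded, if any.
find /lib/modules \( -name 'mmfs26.ko*' -o -name 'tracedev.ko*' \) 2>/dev/null
modinfo -F vermagic mmfs26 2>/dev/null || echo "no mmfs26 module found for $(uname -r)"
```

If the reported vermagic does not match `uname -r`, or extra copies of the modules turn up under other kernel trees, a stale portability layer is a plausible culprit.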
Support is asking similar questions re: the speculations, and I'll provide them with the relevant output ASAP, but I did confirm all of that, including that there were no stray mmfs26/tracedev kernel modules anywhere else in the relevant /lib/modules PATHs.

In the original case, I built on a machine running 3.10.0-957.27.2, but pointed to the 3.10.0-1062.9.1 source code/defined the relevant portions of /usr/lpp/mmfs/src/config/env.mcr. That's always worked before, and rebuilding once the build system was running 3.10.0-1062.9.1 as well did not change anything either. In all cases, the GPFS version was Spectrum Scale Data Access Edition 5.0.4-1. If you build against either the wrong kernel version or the wrong GPFS version, both will appear right in the filename of the gpfs.gplbin RPM you build. Mine is called:

gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm

Anyway, thanks for your response; I know you might not be following/working on this directly, but I figured the extra info might be of interest.

On 1/16/20 8:41 AM, Felipe Knop wrote:
> Hi Ryan,
>
> I'm aware of this ticket, and I understand that there has been
> active communication with the service team on this problem.
>
> The crash itself, as you indicate, looks like a problem that has
> been fixed:
>
> https://www.ibm.com/support/pages/ibm-spectrum-scale-gpfs-releases-42313-or-later-and-5022-or-later-have-issues-where-kernel-crashes-rhel76-0
>
> The fact that the problem goes away when *mmbuildgpl* is issued
> appears to point to some incompatibility with kernel levels and/or
> Scale version levels.
Just speculating, some possible areas may be:
>
> * The RPM might have been built on a version of Scale without the fix
> * The RPM might have been built on a different (minor) version of the kernel
> * Somehow the VM picked a "leftover" GPFS kernel module, as opposed to
> the one included in gpfs.gplbin -- given that mmfsd never complained
> about a missing GPL kernel module
>
> Felipe
>
> ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM
> Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601
> (845) 433-9314 T/L 293-9314
>
> ----- Original message -----
> From: Ryan Novosielski
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> To: gpfsug main discussion list
> Cc: Subject: [EXTERNAL] [gpfsug-discuss] Kernel BUG/panic in
> mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via
> gpfs.gplbin RPM on KVM guests
> Date: Wed, Jan 15, 2020 4:11 PM
>
> Hi there,
>
> I know some of the Spectrum Scale developers look at this list.
> I'm having a little trouble with support on this problem.
>
> We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM
> guests with a portability layer that has been installed via
> gpfs.gplbin RPMs that we built at our site and have used to
> install GPFS all over our environment. We've not seen this problem
> so far on any physical hosts, but have now experienced it on guests
> running on a number of our KVM hypervisors, across vendors and
> firmware versions, etc. At one time I thought it was all happening
> on systems using Mellanox virtual functions for Infiniband, but
> we've now seen it on VMs without VFs. There may be an SELinux
> interaction, but some of our hosts have it disabled outright, some
> are Permissive, and some were working successfully with 5.0.2.x
> GPFS.
>
> What I've been instructed to try to solve this problem has been to
> run "mmbuildgpl", and it has solved the problem. I don't consider
> running "mmbuildgpl" a real solution, however.
If RPMs are a > supported means of installation, it should work. Support told me > that they?d seen this solve the problem at another site as well. > > Does anyone have any more information about this problem/whether > there?s a fix in the pipeline, or something that can be done to > cause this problem that we could remedy? Is there an easy place to > see a list of eFixes to see if this has come up? I know it?s very > similar to a problem that happened I believe it was after 5.0.2.2 > and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x. > > Below is a sample of the crash output: > > [ 156.733477] kernel BUG at mm/slub.c:3772! [ 156.734212] invalid > opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat > ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) > mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) > iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) > mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 > ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 > ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat > iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 > xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter > iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul > ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper > ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 > virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c > mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic > pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul > crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core > devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy > virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ > 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE > ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] > Hardware name: Red Hat KVM, BIOS 
1.11.0-2.el7 04/01/2014 [ > 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: > ffff8ae9f7a24000 [ 156.759326] RIP: 0010:[] > [] kfree+0x13c/0x140 [ 156.760749] RSP: > 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: > 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ > 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: > ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: > 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: > 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ > 156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: > ffff8ae9f7a27938 [ 156.768229] FS: 00007f8ffff05700(0000) > GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: > 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ > 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: > 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ > 156.775154] [] > cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] > [] > _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvP P10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 > > [mmfs26] > [ 156.779378] [] > _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_P cjjjP10ext_cred_t+0x46a/0x7e0 > > [mmfs26] > [ 156.781689] [] ? > _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 > > [mmfs26] > [ 156.783565] [] > _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNod e_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 > > [mmfs26] > [ 156.786228] [] > _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7F ilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 > > [mmfs26] > [ 156.788681] [] ? 
> _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 > [mmfs26] [ 156.790448] [] > _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVa ttr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 > > [mmfs26] > [ 156.793032] [] ? > _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ > 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 > [mmfslinux] [ 156.795838] [] ? > _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6 c0 > > [mmfs26] > [ 156.797753] [] ? __d_alloc+0x122/0x180 [ > 156.798763] [] ? d_alloc+0x60/0x70 [ > 156.799700] [] lookup_real+0x23/0x60 [ > 156.800651] [] __lookup_hash+0x42/0x60 [ > 156.801675] [] lookup_slow+0x42/0xa7 [ > 156.802634] [] link_path_walk+0x80f/0x8b0 [ > 156.803666] [] path_lookupat+0x7a/0x8b0 [ > 156.804690] [] ? lru_cache_add+0xe/0x10 [ > 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ > 156.806766] [] ? getname_flags+0x4f/0x1a0 [ > 156.807817] [] filename_lookup+0x2b/0xc0 [ > 156.808834] [] user_path_at_empty+0x67/0xc0 [ > 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ > 156.811017] [] user_path_at+0x11/0x20 [ > 156.811983] [] vfs_fstatat+0x63/0xc0 [ > 156.812951] [] SYSC_newstat+0x2e/0x60 [ > 156.813931] [] ? 
trace_do_page_fault+0x56/0x150 > [ 156.815050] [] SyS_newstat+0xe/0x10 [ > 156.816010] [] system_call_fastpath+0x25/0x2a [ > 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 > df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 > e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 > 41 55 41 54 [ 156.822192] RIP [] > kfree+0x13c/0x140 [ 156.823180] RSP [ > 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] > Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel > Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: > 0xffffffff80000000-0xffffffffbfffffff) > > -- ____ || \\UTGERS, > |---------------------------*O*--------------------------- ||_// > the State | Ryan Novosielski - novosirj at rutgers.edu || \\ > University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS > Campus || \\ of NJ | Office of Advanced Research Computing - > MSB C630, Newark `' > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ gpfsug-discuss > mailing list gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > - -- ____ || \\UTGERS, |----------------------*O*------------------------ ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus || \\ of NJ | Office of Advanced Res. Comp. 
- MSB C630, Newark `' -----BEGIN PGP SIGNATURE-----

iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXiDWSgAKCRCZv6Bp0Ryx vpCsAKCQ2ykmeycbOVrHTGaFqb2SsU26NwCg3YyYi4Jy2d+xZjJkE6Vfht8O8gM= =9rKb
-----END PGP SIGNATURE-----
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From knop at us.ibm.com Fri Jan 17 16:36:01 2020
From: knop at us.ibm.com (Felipe Knop)
Date: Fri, 17 Jan 2020 16:36:01 +0000
Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests
In-Reply-To: References: , , , Message-ID:

From novosirj at rutgers.edu Fri Jan 17 16:58:58 2020
From: novosirj at rutgers.edu (Ryan Novosielski)
Date: Fri, 17 Jan 2020 16:58:58 +0000
Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests
In-Reply-To: References: Message-ID: <45A9C66F-57E2-4EDA-B5AE-5775980DF88C@rutgers.edu>

Yeah, support got back to me with a similar response earlier today that I'd not seen yet, which made it a lot clearer what I "did wrong". This would appear to be the cause in my case:

[root at master config]# diff env.mcr env.mcr-1062.9.1
4,5c4,5
< #define LINUX_KERNEL_VERSION 31000999
< #define LINUX_KERNEL_VERSION_VERBOSE 310001062009001
---
> #define LINUX_KERNEL_VERSION 31001062
> #define LINUX_KERNEL_VERSION_VERBOSE 31001062009001

The former was generated by "make Autoconfig" and the latter by my brain. I'm surprised at the first line:
I'd have caught myself that something different might have been needed if 3.10.0-1062 didn't already fit in the number of digits.

Anyway, I explained to support that the reason I do this is that I maintain a couple of copies of env.mcr, because occasionally there will be reasons to need gpfs.gplbin for a few different kernel versions (other software that doesn't want to be upgraded, etc.). I see I originally got this practice from the README (or possibly our original installer consultants).

Basically what's missing here, so far as I can see, is a way to use mmbuildgpl/make Autoconfig but specify a target kernel version (and, I guess, an update to the docs, or at least /usr/lpp/mmfs/src/README, so that it doesn't suggest manual editing). Is there a way to at least find out what "make Autoconfig" would use for a target LINUX_KERNEL_VERSION_VERBOSE? From what I can see of makefile and config/configure, there's no option for specifying anything.

-- ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `'

> On Jan 17, 2020, at 11:36 AM, Felipe Knop wrote:
>
> Hi Ryan,
>
> My interpretation of the analysis so far is that the content of LINUX_KERNEL_VERSION_VERBOSE in 'env.mcr' became incorrect. That is, it used to work well in a prior release of Scale, but not with 5.0.4.1. This is because of a code change that added another digit to the version in LINUX_KERNEL_VERSION_VERBOSE to account for the 4-digit "fix level" (3.10.0-1000+). Then, when the GPL layer was built, its sources saw the content of LINUX_KERNEL_VERSION_VERBOSE with the missing extra digit and compiled the 'wrong' pieces in -- in particular the incorrect value of SECURITY_INODE_INIT_SECURITY(). And that led to the crash.
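The two LINUX_KERNEL_VERSION_VERBOSE values in the diff above differ only in the width of the release field. A minimal sketch of the apparent encoding (the field widths are inferred from the two values quoted in this thread, not taken from GPFS source, so treat them as assumptions):

```python
def verbose(version: str, release_width: int) -> int:
    """Encode e.g. '3.10.0-1062.9.1' the way env.mcr's
    LINUX_KERNEL_VERSION_VERBOSE appears to be built.
    Field widths are inferred from this thread (assumption)."""
    head, _, tail = version.partition("-")
    major, minor, patch = (int(x) for x in head.split("."))
    release, fix_major, fix_minor = (int(x) for x in tail.split("."))
    # major/minor/patch unpadded, then the release field, then
    # three digits each for the two fix-level components
    return int(f"{major}{minor}{patch}"
               f"{release:0{release_width}d}"
               f"{fix_major:03d}{fix_minor:03d}")

# Old layout (4-digit release field): the hand-edited value
print(verbose("3.10.0-1062.9.1", 4))  # 31001062009001
# 5.0.4.1 layout (5-digit release field): the "make Autoconfig" value
print(verbose("3.10.0-1062.9.1", 5))  # 310001062009001
```

With a 4-digit release field the function reproduces the hand-edited value; with the 5-digit field introduced for 4-digit release numbers, it reproduces what "make Autoconfig" wrote.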
> > The problem did not happen when mmbuildgpl was used since the correct value of LINUX_KERNEL_VERSION_VERBOSE was then set up.
> >
> > Felipe
> >
> > ---- > Felipe Knop knop at us.ibm.com > GPFS Development and Security > IBM Systems > IBM Building 008 > 2455 South Rd, Poughkeepsie, NY 12601 > (845) 433-9314 T/L 293-9314
> >
> > ----- Original message ----- > From: Ryan Novosielski > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests > Date: Fri, Jan 17, 2020 10:56 AM
> >
> > That /is/ interesting.
> >
> > I'm a little confused about how that could be playing out in a case where I'm building on -1062.9.1, building for -1062.9.1, and running on -1062.9.1. Is there something inherent in the RPM building process that hasn't caught up, or am I misunderstanding that change's impact on it?
> >
> > -- > ____ > || \\UTGERS, |---------------------------*O*--------------------------- > ||_// the State | Ryan Novosielski - novosirj at rutgers.edu > || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus > || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark > `'
> >
>> On Jan 17, 2020, at 10:35, Felipe Knop wrote:
>>
>> Hi Ryan,
>>
>> Some interesting IBM-internal communication overnight. The problem seems related to a change made to LINUX_KERNEL_VERSION_VERBOSE to handle the additional digit in the kernel numbering (3.10.0-1000+). The GPL layer expected LINUX_KERNEL_VERSION_VERBOSE to have that extra digit, and its absence resulted in an incorrect function being compiled in, which led to the crash.
>>
>> This, at least, seems to make sense, in terms of matching the symptoms of the problem.
>>
>> We are still in internal debates on whether/how to update our guidelines for gplbin generation ...
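A one-liner illustrates why the missing digit is so damaging, assuming (as the description above implies) that the GPL-layer sources compare LINUX_KERNEL_VERSION_VERBOSE numerically against per-kernel thresholds; the threshold value below is illustrative, not taken from the GPFS sources:

```python
# Hypothetical threshold: 3.10.0-957.19.1 (the kernel tied to the
# earlier, similar bug in this thread) written in the new 15-digit
# layout with a 5-digit release field.
new_layout_threshold = 310000957019001

# 3.10.0-1062.9.1 hand-encoded in the old 14-digit layout
# (4-digit release field), as in the hand-edited env.mcr:
hand_edited = 31001062009001

# Any 14-digit number is smaller than any 15-digit number, so the
# newer kernel compares as "older than 957.19.1" and the wrong
# conditionally-compiled variant is selected.
print(hand_edited < new_layout_threshold)  # True
```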
>> >> Regards, >> >> Felipe >> >> ---- >> Felipe Knop knop at us.ibm.com >> GPFS Development and Security >> IBM Systems >> IBM Building 008 >> 2455 South Rd, Poughkeepsie, NY 12601 >> (845) 433-9314 T/L 293-9314 >> >> >> >> ----- Original message ----- >> From: Ryan Novosielski >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> To: "gpfsug-discuss at spectrumscale.org" >> Cc: >> Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests >> Date: Thu, Jan 16, 2020 4:33 PM >> >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Hi Felipe, >> >> I either misunderstood support or convinced them to take further >> action. It at first looked like they were suggesting "mmbuildgpl fixed >> it: case closed" (I know they wanted to close the SalesForce case >> anyway, which would prevent communication on the issue). At this >> point, they've asked for a bunch more information. >> >> Support is asking similar questions re: the speculations, and I'll >> provide them with the relevant output ASAP, but I did confirm all of >> that, including that there were no stray mmfs26/tracedev kernel >> modules anywhere else in the relevant /lib/modules PATHs. In the >> original case, I built on a machine running 3.10.0-957.27.2, but >> pointed to the 3.10.0-1062.9.1 source code/defined the relevant >> portions of usr/lpp/mmfs/src/config/env.mcr. That's always worked >> before, and rebuilding once the build system was running >> 3.10.0-1062.9.1 as well did not change anything either. In all cases, >> the GPFS version was Spectrum Scale Data Access Edition 5.0.4-1. If >> you build against either the wrong kernel version or the wrong GPFS >> version, both will appear right in the filename of the gpfs.gplbin RPM >> you build. 
Mine is called: >> >> gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm >> >> Anyway, thanks for your response; I know you might not be >> following/working on this directly, but I figured the extra info might >> be of interest. >> >> On 1/16/20 8:41 AM, Felipe Knop wrote: >> > Hi Ryan, >> > >> > I'm aware of this ticket, and I understand that there has been >> > active communication with the service team on this problem. >> > >> > The crash itself, as you indicate, looks like a problem that has >> > been fixed: >> > >> > https://www.ibm.com/support/pages/ibm-spectrum-scale-gpfs-releases-423 >> 13-or-later-and-5022-or-later-have-issues-where-kernel-crashes-rhel76-0 >> > >> > The fact that the problem goes away when *mmbuildgpl* is issued >> > appears to point to some incompatibility with kernel levels and/or >> > Scale version levels. Just speculating, some possible areas may >> > be: >> > >> > >> > * The RPM might have been built on a version of Scale without the >> > fix * The RPM might have been built on a different (minor) version >> > of the kernel * Somehow the VM picked a "leftover" GPFS kernel >> > module, as opposed to the one included in gpfs.gplbin -- given >> > that mmfsd never complained about a missing GPL kernel module >> > >> > >> > Felipe >> > >> > ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM >> > Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 >> > (845) 433-9314 T/L 293-9314 >> > >> > >> > >> > >> > ----- Original message ----- From: Ryan Novosielski >> > Sent by: >> > gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion >> > list Cc: Subject: [EXTERNAL] >> > [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum >> > Scale Data Access Edition installed via gpfs.gplbin RPM on KVM >> > guests Date: Wed, Jan 15, 2020 4:11 PM >> > >> > Hi there, >> > >> > I know some of the Spectrum Scale developers look at this list. 
>> > I'm having a little trouble with support on this problem.
>> >
>> > We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM >> > guests with a portability layer that has been installed via >> > gpfs.gplbin RPMs that we built at our site and have used to >> > install GPFS all over our environment. We've not seen this problem >> > so far on any physical hosts, but have now experienced it on guests >> > running on a number of our KVM hypervisors, across vendors and >> > firmware versions, etc. At one time I thought it was all happening >> > on systems using Mellanox virtual functions for Infiniband, but >> > we've now seen it on VMs without VFs. There may be an SELinux >> > interaction, but some of our hosts have it disabled outright, some >> > are Permissive, and some were working successfully with 5.0.2.x >> > GPFS.
>> >
>> > What I've been instructed to try to solve this problem has been to >> > run "mmbuildgpl", and it has solved the problem. I don't consider >> > running "mmbuildgpl" a real solution, however. If RPMs are a >> > supported means of installation, it should work. Support told me >> > that they'd seen this solve the problem at another site as well.
>> >
>> > Does anyone have any more information about this problem/whether >> > there's a fix in the pipeline, or something that can be done to >> > cause this problem that we could remedy? Is there an easy place to >> > see a list of eFixes to see if this has come up? I know it's very >> > similar to a problem that happened, I believe, after 5.0.2.2 >> > and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x.
>> >
>> > Below is a sample of the crash output:
>> >
>> > [ 156.733477] kernel BUG at mm/slub.c:3772!
[ 156.734212] invalid >> > opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat >> > ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) >> > mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) >> > iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) >> > mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 >> > ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 >> > ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat >> > iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 >> > xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter >> > iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul >> > ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper >> > ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 >> > virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c >> > mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic >> > pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul >> > crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core >> > devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy >> > virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ >> > 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE >> > ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] >> > Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 [ >> > 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: >> > ffff8ae9f7a24000 [ 156.759326] RIP: 0010:[] >> > [] kfree+0x13c/0x140 [ 156.760749] RSP: >> > 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: >> > 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ >> > 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: >> > ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: >> > 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: >> > 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ >> > 
156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: >> > ffff8ae9f7a27938 [ 156.768229] FS: 00007f8ffff05700(0000) >> > GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: >> > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: >> > 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ >> > 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> > 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: >> > 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ >> > 156.775154] [] >> > cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] >> > [] >> > _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvP >> P10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 >> > >> > >> [mmfs26] >> > [ 156.779378] [] >> > _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_P >> cjjjP10ext_cred_t+0x46a/0x7e0 >> > >> > >> [mmfs26] >> > [ 156.781689] [] ? >> > _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 >> > >> > >> [mmfs26] >> > [ 156.783565] [] >> > _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNod >> e_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 >> > >> > >> [mmfs26] >> > [ 156.786228] [] >> > _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7F >> ilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 >> > >> > >> [mmfs26] >> > [ 156.788681] [] ? >> > _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 >> > [mmfs26] [ 156.790448] [] >> > _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVa >> ttr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 >> > >> > >> [mmfs26] >> > [ 156.793032] [] ? >> > _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ >> > 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 >> > [mmfslinux] [ 156.795838] [] ? >> > _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6 >> c0 >> > >> > >> [mmfs26] >> > [ 156.797753] [] ? 
__d_alloc+0x122/0x180 [ >> > 156.798763] [] ? d_alloc+0x60/0x70 [ >> > 156.799700] [] lookup_real+0x23/0x60 [ >> > 156.800651] [] __lookup_hash+0x42/0x60 [ >> > 156.801675] [] lookup_slow+0x42/0xa7 [ >> > 156.802634] [] link_path_walk+0x80f/0x8b0 [ >> > 156.803666] [] path_lookupat+0x7a/0x8b0 [ >> > 156.804690] [] ? lru_cache_add+0xe/0x10 [ >> > 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ >> > 156.806766] [] ? getname_flags+0x4f/0x1a0 [ >> > 156.807817] [] filename_lookup+0x2b/0xc0 [ >> > 156.808834] [] user_path_at_empty+0x67/0xc0 [ >> > 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ >> > 156.811017] [] user_path_at+0x11/0x20 [ >> > 156.811983] [] vfs_fstatat+0x63/0xc0 [ >> > 156.812951] [] SYSC_newstat+0x2e/0x60 [ >> > 156.813931] [] ? trace_do_page_fault+0x56/0x150 >> > [ 156.815050] [] SyS_newstat+0xe/0x10 [ >> > 156.816010] [] system_call_fastpath+0x25/0x2a [ >> > 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 >> > df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 >> > e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 >> > 41 55 41 54 [ 156.822192] RIP [] >> > kfree+0x13c/0x140 [ 156.823180] RSP [ >> > 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] >> > Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel >> > Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: >> > 0xffffffff80000000-0xffffffffbfffffff) >> > >> > -- ____ || \\UTGERS, >> > |---------------------------*O*--------------------------- ||_// >> > the State | Ryan Novosielski - novosirj at rutgers.edu || \\ >> > University | Sr. 
Technologist - 973/972.0922 (2x0922) ~*~ RBHS >> > Campus || \\ of NJ | Office of Advanced Research Computing - >> > MSB C630, Newark `'
>> >
>> > _______________________________________________ gpfsug-discuss >> > mailing list gpfsug-discuss at spectrumscale.org >> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>> >
>> >> - -- >> ____ >> || \\UTGERS, |----------------------*O*------------------------ >> ||_// the State | Ryan Novosielski - novosirj at rutgers.edu >> || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus >> || \\ of NJ | Office of Advanced Res. Comp. - MSB C630, Newark >> `'
>> -----BEGIN PGP SIGNATURE-----
>>
>> iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXiDWSgAKCRCZv6Bp0Ryx >> vpCsAKCQ2ykmeycbOVrHTGaFqb2SsU26NwCg3YyYi4Jy2d+xZjJkE6Vfht8O8gM= >> =9rKb
>> -----END PGP SIGNATURE-----
>> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From ulmer at ulmer.org Fri Jan 17 17:39:32 2020
From: ulmer at ulmer.org (Stephen Ulmer)
Date: Fri, 17 Jan 2020 12:39:32 -0500
Subject: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests
In-Reply-To: <45A9C66F-57E2-4EDA-B5AE-5775980DF88C@rutgers.edu> References: <45A9C66F-57E2-4EDA-B5AE-5775980DF88C@rutgers.edu> Message-ID:

Having a sanctioned way to compile targeting a version of the kernel that is installed, but not running, would be helpful in many circumstances.

-- Stephen

> On Jan 17, 2020, at 11:58 AM, Ryan Novosielski wrote:
>
> Yeah, support got back to me with a similar response earlier today that I'd not seen yet, which made it a lot clearer what I "did wrong". This would appear to be the cause in my case:
>
> [root at master config]# diff env.mcr env.mcr-1062.9.1
> 4,5c4,5
> < #define LINUX_KERNEL_VERSION 31000999
> < #define LINUX_KERNEL_VERSION_VERBOSE 310001062009001
> ---
>> #define LINUX_KERNEL_VERSION 31001062
>> #define LINUX_KERNEL_VERSION_VERBOSE 31001062009001
>
> The former was generated by "make Autoconfig" and the latter by my brain. I'm surprised at the first line; I'd have caught myself that something different might have been needed if 3.10.0-1062 didn't already fit in the number of digits.
>
> Anyway, I explained to support that the reason I do this is that I maintain a couple of copies of env.mcr because occasionally there will be reasons to need gpfs.gplbin for a few different kernel versions (other software that doesn't want to be upgraded, etc.). I see I originally got this practice from the README (or possibly our original installer consultants).
>
> Basically what's missing here, so far as I can see, is a way to use mmbuildgpl/make Autoconfig but specify a target kernel version (and, I guess, an update to the docs, or at least /usr/lpp/mmfs/src/README, so that it doesn't suggest manual editing). Is there a way to at least find out what "make Autoconfig" would use for a target LINUX_KERNEL_VERSION_VERBOSE? From what I can see of makefile and config/configure, there's no option for specifying anything.
> > -- > ____ > || \\UTGERS, |---------------------------*O*--------------------------- > ||_// the State | Ryan Novosielski - novosirj at rutgers.edu > || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus > || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark > `' > >> On Jan 17, 2020, at 11:36 AM, Felipe Knop wrote: >> >> Hi Ryan, >> >> My interpretation of the analysis so far is that the content of LINUX_KERNEL_VERSION_VERBOSE in ' env.mcr' became incorrect. That is, it used to work well in a prior release of Scale, but not with 5.0.4.1 . This is because of a code change that added another digit to the version in LINUX_KERNEL_VERSION_VERBOSE to account for the 4-digit "fix level" (3.10.0-1000+) . Then, when the GPL layer was built, its sources saw the content of LINUX_KERNEL_VERSION_VERBOSE with the missing extra digit and compiled the 'wrong' pieces in -- in particular the incorrect value of SECURITY_INODE_INIT_SECURITY() . And that led to the crash. >> >> The problem did not happen when mmbuildgpl was used since the correct value of LINUX_KERNEL_VERSION_VERBOSE was then set up. >> >> Felipe >> >> ---- >> Felipe Knop knop at us.ibm.com >> GPFS Development and Security >> IBM Systems >> IBM Building 008 >> 2455 South Rd, Poughkeepsie, NY 12601 >> (845) 433-9314 T/L 293-9314 >> >> >> >> ----- Original message ----- >> From: Ryan Novosielski >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> To: gpfsug main discussion list >> Cc: >> Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests >> Date: Fri, Jan 17, 2020 10:56 AM >> >> That /is/ interesting. >> >> I?m a little confused about how that could be playing out in a case where I?m building on -1062.9.1, building for -1062.9.1, and running on -1062.9.1. 
Is there something inherent in the RPM building process that hasn?t caught up, or am I misunderstanding that change?s impact on it? >> >> -- >> ____ >> || \\UTGERS, |---------------------------*O*--------------------------- >> ||_// the State | Ryan Novosielski - novosirj at rutgers.edu >> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus >> || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark >> `' >> >>> On Jan 17, 2020, at 10:35, Felipe Knop wrote: >>> >>> ? >>> Hi Ryan, >>> >>> Some interesting IBM-internal communication overnight. The problems seems related to a change made to LINUX_KERNEL_VERSION_VERBOSE to handle the additional digit in the kernel numbering (3.10.0-1000+) . The GPL layer expected LINUX_KERNEL_VERSION_VERBOSE to have that extra digit, and its absence resulted in an incorrect function being compiled in, which led to the crash. >>> >>> This, at least, seems to make sense, in terms of matching to the symptoms of the problem. >>> >>> We are still in internal debates on whether/how update our guidelines for gplbin generation ... >>> >>> Regards, >>> >>> Felipe >>> >>> ---- >>> Felipe Knop knop at us.ibm.com >>> GPFS Development and Security >>> IBM Systems >>> IBM Building 008 >>> 2455 South Rd, Poughkeepsie, NY 12601 >>> (845) 433-9314 T/L 293-9314 >>> >>> >>> >>> ----- Original message ----- >>> From: Ryan Novosielski >>> Sent by: gpfsug-discuss-bounces at spectrumscale.org >>> To: "gpfsug-discuss at spectrumscale.org" >>> Cc: >>> Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum Scale Data Access Edition installed via gpfs.gplbin RPM on KVM guests >>> Date: Thu, Jan 16, 2020 4:33 PM >>> >>> -----BEGIN PGP SIGNED MESSAGE----- >>> Hash: SHA1 >>> >>> Hi Felipe, >>> >>> I either misunderstood support or convinced them to take further >>> action. 
It at first looked like they were suggesting "mmbuildgpl fixed >>> it: case closed" (I know they wanted to close the SalesForce case >>> anyway, which would prevent communication on the issue). At this >>> point, they've asked for a bunch more information. >>> >>> Support is asking similar questions re: the speculations, and I'll >>> provide them with the relevant output ASAP, but I did confirm all of >>> that, including that there were no stray mmfs26/tracedev kernel >>> modules anywhere else in the relevant /lib/modules PATHs. In the >>> original case, I built on a machine running 3.10.0-957.27.2, but >>> pointed to the 3.10.0-1062.9.1 source code/defined the relevant >>> portions of usr/lpp/mmfs/src/config/env.mcr. That's always worked >>> before, and rebuilding once the build system was running >>> 3.10.0-1062.9.1 as well did not change anything either. In all cases, >>> the GPFS version was Spectrum Scale Data Access Edition 5.0.4-1. If >>> you build against either the wrong kernel version or the wrong GPFS >>> version, both will appear right in the filename of the gpfs.gplbin RPM >>> you build. Mine is called: >>> >>> gpfs.gplbin-3.10.0-1062.9.1.el7.x86_64-5.0.4-1.x86_64.rpm >>> >>> Anyway, thanks for your response; I know you might not be >>> following/working on this directly, but I figured the extra info might >>> be of interest. >>> >>> On 1/16/20 8:41 AM, Felipe Knop wrote: >>>> Hi Ryan, >>>> >>>> I'm aware of this ticket, and I understand that there has been >>>> active communication with the service team on this problem. >>>> >>>> The crash itself, as you indicate, looks like a problem that has >>>> been fixed: >>>> >>>> https://www.ibm.com/support/pages/ibm-spectrum-scale-gpfs-releases-423 >>> 13-or-later-and-5022-or-later-have-issues-where-kernel-crashes-rhel76-0 >>>> >>>> The fact that the problem goes away when *mmbuildgpl* is issued >>>> appears to point to some incompatibility with kernel levels and/or >>>> Scale version levels. 
Just speculating, some possible areas may >>>> be: >>>> >>>> >>>> * The RPM might have been built on a version of Scale without the >>>> fix * The RPM might have been built on a different (minor) version >>>> of the kernel * Somehow the VM picked a "leftover" GPFS kernel >>>> module, as opposed to the one included in gpfs.gplbin -- given >>>> that mmfsd never complained about a missing GPL kernel module >>>> >>>> >>>> Felipe >>>> >>>> ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM >>>> Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 >>>> (845) 433-9314 T/L 293-9314 >>>> >>>> >>>> >>>> >>>> ----- Original message ----- From: Ryan Novosielski >>>> Sent by: >>>> gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion >>>> list Cc: Subject: [EXTERNAL] >>>> [gpfsug-discuss] Kernel BUG/panic in mm/slub.c:3772 on Spectrum >>>> Scale Data Access Edition installed via gpfs.gplbin RPM on KVM >>>> guests Date: Wed, Jan 15, 2020 4:11 PM >>>> >>>> Hi there, >>>> >>>> I know some of the Spectrum Scale developers look at this list. >>>> I?m having a little trouble with support on this problem. >>>> >>>> We are seeing crashes with GPFS 5.0.4-1 Data Access Edition on KVM >>>> guests with a portability layer that has been installed via >>>> gpfs.gplbin RPMs that we built at our site and have used to >>>> install GPFS all over our environment. We?ve not seen this problem >>>> so far on any physical hosts, but have now experienced it on guests >>>> running on number of our KVM hypervisors, across vendors and >>>> firmware versions, etc. At one time I thought it was all happening >>>> on systems using Mellanox virtual functions for Infiniband, but >>>> we?ve now seen it on VMs without VFs. There may be an SELinux >>>> interaction, but some of our hosts have it disabled outright, some >>>> are Permissive, and some were working successfully with 5.0.2.x >>>> GPFS. 
>>>> >>>> What I?ve been instructed to try to solve this problem has been to >>>> run ?mmbuildgpl?, and it has solved the problem. I don?t consider >>>> running "mmbuildgpl" a real solution, however. If RPMs are a >>>> supported means of installation, it should work. Support told me >>>> that they?d seen this solve the problem at another site as well. >>>> >>>> Does anyone have any more information about this problem/whether >>>> there?s a fix in the pipeline, or something that can be done to >>>> cause this problem that we could remedy? Is there an easy place to >>>> see a list of eFixes to see if this has come up? I know it?s very >>>> similar to a problem that happened I believe it was after 5.0.2.2 >>>> and Linux 3.10.0-957.19.1, but that was fixed already in 5.0.3.x. >>>> >>>> Below is a sample of the crash output: >>>> >>>> [ 156.733477] kernel BUG at mm/slub.c:3772! [ 156.734212] invalid >>>> opcode: 0000 [#1] SMP [ 156.735017] Modules linked in: ebtable_nat >>>> ebtable_filter ebtable_broute bridge stp llc ebtables mmfs26(OE) >>>> mmfslinux(OE) tracedev(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) >>>> iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) >>>> mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) ip6table_nat nf_nat_ipv6 >>>> ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 >>>> ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat >>>> iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 >>>> xt_comment xt_multiport xt_conntrack nf_conntrack iptable_filter >>>> iptable_security nfit libnvdimm ppdev iosf_mbi crc32_pclmul >>>> ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper >>>> ablk_helper sg joydev pcspkr cryptd parport_pc parport i2c_piix4 >>>> virtio_balloon knem(OE) binfmt_misc ip_tables xfs libcrc32c >>>> mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sr_mod cdrom ata_generic >>>> pata_acpi virtio_console virtio_net virtio_blk crct10dif_pclmul >>>> crct10dif_common mlx5_core(OE) mlxfw(OE) crc32c_intel ptp pps_core >>>> 
devlink ata_piix serio_raw mlx_compat(OE) libata virtio_pci floppy >>>> virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [ >>>> 156.754814] CPU: 3 PID: 11826 Comm: request_handle* Tainted: G OE >>>> ------------ 3.10.0-1062.9.1.el7.x86_64 #1 [ 156.756782] >>>> Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 [ >>>> 156.757978] task: ffff8aeca5bf8000 ti: ffff8ae9f7a24000 task.ti: >>>> ffff8ae9f7a24000 [ 156.759326] RIP: 0010:[] >>>> [] kfree+0x13c/0x140 [ 156.760749] RSP: >>>> 0018:ffff8ae9f7a27278 EFLAGS: 00010246 [ 156.761717] RAX: >>>> 001fffff00000400 RBX: ffffffffbc6974bf RCX: ffffa74dc1bcfb60 [ >>>> 156.763030] RDX: 001fffff00000000 RSI: ffff8aed90fc6500 RDI: >>>> ffffffffbc6974bf [ 156.764321] RBP: ffff8ae9f7a27290 R08: >>>> 0000000000000014 R09: 0000000000000003 [ 156.765612] R10: >>>> 0000000000000048 R11: ffffdb5a82d125c0 R12: ffffa74dc4fd36c0 [ >>>> 156.766938] R13: ffffffffc0a1c562 R14: ffff8ae9f7a272f8 R15: >>>> ffff8ae9f7a27938 [ 156.768229] FS: 00007f8ffff05700(0000) >>>> GS:ffff8aedbfd80000(0000) knlGS:0000000000000000 [ 156.769708] CS: >>>> 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.770754] CR2: >>>> 000055963330e2b0 CR3: 0000000325ad2000 CR4: 00000000003606e0 [ >>>> 156.772076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >>>> 0000000000000000 [ 156.773367] DR3: 0000000000000000 DR6: >>>> 00000000fffe0ff0 DR7: 0000000000000400 [ 156.774663] Call Trace: [ >>>> 156.775154] [] >>>> cxiInitInodeSecurityCleanup+0x12/0x20 [mmfslinux] [ 156.776568] >>>> [] >>>> _Z17newInodeInitLinuxP15KernelOperationP13gpfsVfsData_tPP8OpenFilePPvP >>> P10gpfsNode_tP7FileUIDS6_N5LkObj12LockModeEnumE+0x152/0x290 >>>> >>>> >>> [mmfs26] >>>> [ 156.779378] [] >>>> _Z9gpfsMkdirP13gpfsVfsData_tP15KernelOperationP9cxiNode_tPPvPS4_PyS5_P >>> cjjjP10ext_cred_t+0x46a/0x7e0 >>>> >>>> >>> [mmfs26] >>>> [ 156.781689] [] ? 
>>>> _ZN14BaseMutexClass15releaseLockHeldEP16KernelSynchState+0x18/0x130 >>>> >>>> >>> [mmfs26] >>>> [ 156.783565] [] >>>> _ZL21pcacheHandleCacheMissP13gpfsVfsData_tP15KernelOperationP10gpfsNod >>> e_tPvPcPyP12pCacheResp_tPS5_PS4_PjSA_j+0x4bd/0x760 >>>> >>>> >>> [mmfs26] >>>> [ 156.786228] [] >>>> _Z12pcacheLookupP13gpfsVfsData_tP15KernelOperationP10gpfsNode_tPvPcP7F >>> ilesetjjjPS5_PS4_PyPjS9_+0x1ff5/0x21a0 >>>> >>>> >>> [mmfs26] >>>> [ 156.788681] [] ? >>>> _Z15findFilesetByIdP15KernelOperationjjPP7Filesetj+0x4f/0xa0 >>>> [mmfs26] [ 156.790448] [] >>>> _Z10gpfsLookupP13gpfsVfsData_tPvP9cxiNode_tS1_S1_PcjPS1_PS3_PyP10cxiVa >>> ttr_tPjP10ext_cred_tjS5_PiS4_SD_+0x65c/0xad0 >>>> >>>> >>> [mmfs26] >>>> [ 156.793032] [] ? >>>> _Z33gpfsIsCifsBypassTraversalCheckingv+0xe2/0x130 [mmfs26] [ >>>> 156.794588] [] gpfs_i_lookup+0x2e6/0x5a0 >>>> [mmfslinux] [ 156.795838] [] ? >>>> _Z8gpfsLinkP13gpfsVfsData_tP9cxiNode_tS2_PvPcjjP10ext_cred_t+0x6c0/0x6 >>> c0 >>>> >>>> >>> [mmfs26] >>>> [ 156.797753] [] ? __d_alloc+0x122/0x180 [ >>>> 156.798763] [] ? d_alloc+0x60/0x70 [ >>>> 156.799700] [] lookup_real+0x23/0x60 [ >>>> 156.800651] [] __lookup_hash+0x42/0x60 [ >>>> 156.801675] [] lookup_slow+0x42/0xa7 [ >>>> 156.802634] [] link_path_walk+0x80f/0x8b0 [ >>>> 156.803666] [] path_lookupat+0x7a/0x8b0 [ >>>> 156.804690] [] ? lru_cache_add+0xe/0x10 [ >>>> 156.805690] [] ? kmem_cache_alloc+0x35/0x1f0 [ >>>> 156.806766] [] ? getname_flags+0x4f/0x1a0 [ >>>> 156.807817] [] filename_lookup+0x2b/0xc0 [ >>>> 156.808834] [] user_path_at_empty+0x67/0xc0 [ >>>> 156.809923] [] ? handle_mm_fault+0x39d/0x9b0 [ >>>> 156.811017] [] user_path_at+0x11/0x20 [ >>>> 156.811983] [] vfs_fstatat+0x63/0xc0 [ >>>> 156.812951] [] SYSC_newstat+0x2e/0x60 [ >>>> 156.813931] [] ? 
trace_do_page_fault+0x56/0x150 >>>> [ 156.815050] [] SyS_newstat+0xe/0x10 [ >>>> 156.816010] [] system_call_fastpath+0x25/0x2a [ >>>> 156.817104] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 >>>> df e8 89 2f fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 >>>> e9 28 ff ff ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 >>>> 41 55 41 54 [ 156.822192] RIP [] >>>> kfree+0x13c/0x140 [ 156.823180] RSP [ >>>> 156.823872] ---[ end trace 142960be4a4feed8 ]--- [ 156.824806] >>>> Kernel panic - not syncing: Fatal exception [ 156.826475] Kernel >>>> Offset: 0x3ac00000 from 0xffffffff81000000 (relocation range: >>>> 0xffffffff80000000-0xffffffffbfffffff) >>>> >>>> -- ____ || \\UTGERS, >>>> |---------------------------*O*--------------------------- ||_// >>>> the State | Ryan Novosielski - novosirj at rutgers.edu || \\ >>>> University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS >>>> Campus || \\ of NJ | Office of Advanced Research Computing - >>>> MSB C630, Newark `' >>>> >>>> _______________________________________________ gpfsug-discuss >>>> mailing list gpfsug-discuss at spectrumscale.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ gpfsug-discuss >>>> mailing list gpfsug-discuss at spectrumscale.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>> >>> - -- >>> ____ >>> || \\UTGERS, |----------------------*O*------------------------ >>> ||_// the State | Ryan Novosielski - novosirj at rutgers.edu >>> || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus >>> || \\ of NJ | Office of Advanced Res. Comp. 
- MSB C630, Newark >>> `' >>> -----BEGIN PGP SIGNATURE----- >>> >>> iF0EARECAB0WIQST3OUUqPn4dxGCSm6Zv6Bp0RyxvgUCXiDWSgAKCRCZv6Bp0Ryx >>> vpCsAKCQ2ykmeycbOVrHTGaFqb2SsU26NwCg3YyYi4Jy2d+xZjJkE6Vfht8O8gM= >>> =9rKb >>> -----END PGP SIGNATURE----- >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From heinrich.billich at id.ethz.ch Mon Jan 20 15:06:52 2020 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Mon, 20 Jan 2020 15:06:52 +0000 Subject: [gpfsug-discuss] How to install efix with yum ? In-Reply-To: References: <4385C4C6-845B-409C-81F2-56A54EBC1E3E@id.ethz.ch> <8ee7ad0a895442abb843688936ac4d73@deshaw.com> <3386ebe6-6d11-4f84-fad8-bac2e3c4416e@strath.ac.uk> Message-ID: Thank you, this did work. I did install efix9 for 5.0.4.1 using yum, just with a plain "yum update" after installing the base version. I placed efix and base rpms in different yum repos and did disable the efix-repo while installing the base version, and vice versa.
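For reference, a minimal sketch of such a split-repo setup (the repo ids, paths and file name below are made up for illustration; only the enabled/disable mechanics matter):

```ini
# /etc/yum.repos.d/spectrum-scale.repo -- hypothetical repo ids and paths
[ss-base]
name=Spectrum Scale 5.0.4.1 base (PTF) rpms
baseurl=file:///install/ss-base/
gpgcheck=0
enabled=1

[ss-efix]
name=Spectrum Scale 5.0.4.1 efix rpms
baseurl=file:///install/ss-efix/
gpgcheck=0
enabled=1
```

With that in place the two steps become e.g. `yum --disablerepo=ss-efix install gpfs.base gpfs.gpl ...` followed by `yum --disablerepo=ss-base update 'gpfs*'`, matching the procedure described above.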
Kind regards, Heiner From: on behalf of IBM Spectrum Scale Reply to: gpfsug main discussion list Date: Wednesday, 15 January 2020 at 22:00 To: gpfsug main discussion list Cc: "gpfsug-discuss-bounces at spectrumscale.org" Subject: Re: [gpfsug-discuss] How to install efix with yum ? >> I don't see any yum options which match rpm's '--force' option. Actually, you do not need to use --force option since efix RPMs have incremental efix number in rpm name. Efix package provides update RPMs to be installed on top of corresponding PTF GA version. When you install 5.0.4.1 efix9, if 5.0.4.1 is already installed on your system, "yum update" should work. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Jonathan Buzzard To: "gpfsug-discuss at spectrumscale.org" Date: 01/15/2020 02:09 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] How to install efix with yum ?
Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ On 15/01/2020 18:30, Sanchez, Paul wrote: > Yum generally only wants there to be a single version of any package (it > is trying to eliminate conflicting provides/depends so that all of the > packaging requirements are satisfied). So this alien packaging practice > of installing an efix version of a package over the top of the base > version is not compatible with yum. I would at this juncture note that IBM should be appending the efix number to the RPM so that for example gpfs.base-5.0.4-1 becomes gpfs.base-5.0.4-1efix9 which would firstly make the problem go away, and second would allow one to know which version of GPFS you happen to have installed on a node without doing some sort of voodoo. > > The real issue for draconian sysadmins like us (whose systems must use > and obey yum) is that there are files (*liblum.so) which are provided by > the non-efix RPMS, but are not owned by the packages according to the > RPM database since they're purposefully installed outside of RPM's > tracking mechanism. > It's worse than that because if you install the RPM directly yum/dnf then start bitching about the RPM database being modified outside of themselves and all sorts of useful information gets lost when you purge the package installation history to make the error go away. > We work around this by repackaging the three affected RPMS to include > the orphaned files from the original RPMs (and eliminating the related > but problematic checks from the RPMs' scripts) so that our efix RPMs > have been 'un-efix-ified' and will install as expected when using 'yum > upgrade'. To my knowledge no one's published a way to do this, so we > all just have to figure this out and run rpmrebuild for ourselves. > IBM should be hanging their heads in shame if the replacement RPM is missing files. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: From heinrich.billich at id.ethz.ch Mon Jan 20 15:20:46 2020 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Mon, 20 Jan 2020 15:20:46 +0000 Subject: [gpfsug-discuss] Does an AFM recovery stop AFM from recalling files? Message-ID: Hello, Do AFM recalls from home to cache still work when a fileset is in state "Recovery"? Are there any other states that allow to write/read from cache but won't allow to recall from home? We announced to users that they can continue to work on cache while a recovery is running. But we got reports that evicted files weren't available. NFS did work, I could read the files on home via the nfs mount in /var/mmfs/afm/-/. But AFM didn't recall. If recalls are done by entries in the AFM Queue I see why, but is this the case? Kind regards, Heiner -------------- next part -------------- An HTML attachment was scrubbed... URL: From heinrich.billich at id.ethz.ch Mon Jan 20 15:15:33 2020 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Mon, 20 Jan 2020 15:15:33 +0000 Subject: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? In-Reply-To: References: <79F871C6-A27D-4462-B124-CFB0891A36FD@id.ethz.ch> <4FF430B7-287E-4005-8ECE-E45820938BCA@id.ethz.ch> Message-ID: Hello Venkat, Thank you very much, upgrading to 5.0.4.1 did indeed fix the issue. AFM now compiles the list of pending changes in a few hours. Before we estimated >20 days.
We had to increase disk space in /var/mmfs/afm/ and /var/mmfs/tmp/ to allow AFM to store all intermediate file lists. The manual recommends providing ample disk space in /var/mmfs/afm/ only, but some processes doing a resync placed lists in /var/mmfs/tmp/, too. Cheers, Heiner From: on behalf of Venkateswara R Puvvada Reply to: gpfsug main discussion list Date: Tuesday, 14 January 2020 at 17:51 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Hi, >The dirtyDirs file holds 130'000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries. Can we estimate how long it will take to finish processing? Yes, this is the major problem fixed as mentioned in the APAR below. The dirtyDirs file is opened for each entry in the dirtyDirDirents file, and this causes the performance overhead. >At the moment all we can do is to wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? >There probably is no way to flush the pending queue entries while recovery is ongoing? Later versions have the fix mentioned in that APAR, and I believe it should fix your current performance issue. Flushing the pending queue entries is not available as of today (5.0.4), we are currently working on this feature. ~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/13/2020 05:29 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hello Venkat, thank you, this seems to match our issue. I did trace tspcachescan and do see a long series of open()/read()/close() to the dirtyDirs file. The dirtyDirs file holds 130'000 lines, which don't seem to be so many. But dirtyDirDirents holds about 80M entries.
Can we estimate how long it will take to finish processing? tspcachescan does the following again and again for different directories 11:11:36.837032 stat("/fs3101/XXXXX/.snapshots/XXXXX.afm.75872/yyyyy/yyyy", {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0 11:11:36.837092 open("/var/mmfs/afm/fs3101-43/recovery/policylist.data.list.dirtyDirs", O_RDONLY) = 8 11:11:36.837127 fstat(8, {st_mode=S_IFREG|0600, st_size=32564140, ...}) = 0 11:11:36.837160 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fff96930000 11:11:36.837192 read(8, "539492355 65537 2795 553648131 "..., 8192) = 8192 11:11:36.837317 read(8, "Caches/com.apple.helpd/Generated"..., 8192) = 8192 11:11:36.837439 read(8, "ish\n539848852 1509237202 2795 5"..., 8192) = 8192 Many more reads 11:11:36.864104 close(8) = 0 11:11:36.864135 munmap(0x3fff96930000, 8192) = 0 A single iteration takes about 27ms. Doing this 130'000 times would be o.k., but if tspcachescan does it 80M times we wait 600 hours. Is there a way to estimate how many iterations tspcachescan will do? The cache fileset holds 140M inodes. At the moment all we can do is to wait? We run version 5.0.2.3. Would version 5.0.3 or 5.0.4 show a different behavior? Is this fixed/improved in a later release? There probably is no way to flush the pending queue entries while recovery is ongoing? I did open a case with IBM TS003219893 and will continue there. Kind regards, Heiner From: on behalf of Venkateswara R Puvvada Reply to: gpfsug main discussion list Date: Monday, 13 January 2020 at 08:40 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? AFM maintains an in-memory queue at the gateway node to keep track of changes happening on the fileset.
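As an aside, the 600-hour figure quoted above is easy to reproduce as a back-of-the-envelope calculation (assuming, as measured in the strace output, ~27 ms per open/read/close cycle and ~80M dirtyDirDirents entries):

```shell
# 0.027 s per iteration * 80M iterations, converted to hours.
# Both input figures are the observations reported in this thread.
awk 'BEGIN { printf "about %.0f hours\n", 0.027 * 80000000 / 3600 }'
# prints: about 600 hours
```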
If the in-memory queue is lost (memory pressure, daemon shutdown etc.), AFM runs a recovery process which involves creating the snapshot, running the policy scan and finally queueing the recovered operations. Due to message (operations) dependency, any changes to the AFM fileset during the recovery won't get replicated until the recovery completion. AFM does the home directory scan for only dirty directories to get the names of the deleted and renamed files because the old name for a renamed file and the deleted file name are not available at the cache on disk. Directories are made dirty when a rename or unlink operation is performed inside them. In your case it may be that all the directories became dirty due to the rename/unlink operations. AFM recovery process is single threaded. >Is this to be expected and normal behavior? What to do about it? >Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way? Only for the dirty directories, see above. >Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. There is some work going on to preserve the file names of the unlinked/renamed files in the cache until they get replicated to home so that the home directory scan can be avoided. These are some issues fixed in this regard. What is the scale version? https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436 ~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/08/2020 10:32 PM Subject: [EXTERNAL] [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hello, still new to AFM, so some basic questions on how Recovery works for a SW cache: we have an AFM SW cache in recovery mode -
recovery first did run policies on the cache cluster, but now I see a "tspcachescan" process on cache slowly scanning home via nfs. Single host, single process, no parallelism as far as I can see, but I may be wrong. This scan of home on a cache afmgateway takes very long while further updates on cache queue up. Home has about 100M files. After 8 hours I see about 70M entries in the file /var/mmfs/afm/?/recovery/homelist, i.e. we get about 2500 lines/s. (We may have very many changes on cache due to some recursive ACL operations, but I'm not sure.) So I expect that 12 hours pass to build up filelists before recovery starts to update home. I see some risk: In this time new changes pile up on cache. Memory may become an issue? Cache may fill up and we can't evict? I wonder * Is this to be expected and normal behavior? What to do about it? * Will every reboot of a gateway node trigger a recovery of all afm filesets and a full scan of home? This would make normal rolling updates very impractical, or is there some better way? Home is a gpfs cluster, hence we easily could produce the needed filelist on home with a policyscan in a few minutes. Thank you, I will welcome any clarification, advice or comments. Kind regards, Heiner . -- ======================= Heinrich Billich ETH Zürich Informatikdienste Tel.: +41 44 632 72 56 heinrich.billich at id.ethz.ch ======================== _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed...
URL: From vpuvvada at in.ibm.com Mon Jan 20 17:32:07 2020 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Mon, 20 Jan 2020 23:02:07 +0530 Subject: [gpfsug-discuss] Does an AFM recovery stop AFM from recalling files? In-Reply-To: References: Message-ID: While the recovery is running, reading the uncached files (evicted files) gets blocked until the recovery completes queueing the recovery operations. This is to make sure that recovery executes all the dependent operations first. For example, an evicted file might have been renamed in the cache, but not yet replicated to the home site, and the fileset went into the recovery state. First recovery has to perform the rename operation to the home site and then allow the read operation on it. Read on the uncached files may get blocked if the cache state is in Recovery/NeedsResync/Unmounted/Dropped/Stopped states. ~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/20/2020 08:50 PM Subject: [EXTERNAL] [gpfsug-discuss] Does an AFM recovery stop AFM from recalling files? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, Do AFM recalls from home to cache still work when a fileset is in state "Recovery"? Are there any other states that allow to write/read from cache but won't allow to recall from home? We announced to users that they can continue to work on cache while a recovery is running. But we got reports that evicted files weren't available. NFS did work, I could read the files on home via the nfs mount in /var/mmfs/afm/-/. But AFM didn't recall. If recalls are done by entries in the AFM Queue I see why, but is this the case?
Kind regards, Heiner _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=zwJSIhs7R020CQybqTb86CxBGIhtULCJo_QEggx05Y4&s=TGxHcd4HcDF0hv621ilqJ56r26Ah4rlmNM7PcJ3yLEA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkr at lbl.gov Thu Jan 23 22:16:20 2020 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Thu, 23 Jan 2020 14:16:20 -0800 Subject: [gpfsug-discuss] UPDATE Planning US meeting for Spring 2020 In-Reply-To: References: <42F45E03-0AEC-422C-B3A9-4B5A21B1D8DF@lbl.gov> Message-ID: Thanks for your responses to the poll. We're still working on a venue, but working towards: March 30 - New User Day (Tuesday) April 1&2 - Regular User Group Meeting (Wednesday & Thursday) Once it's confirmed we'll post something again. Best, Kristy. > On Jan 6, 2020, at 3:41 PM, Kristy Kallback-Rose wrote: > > Thank you to the 18 wonderful people who filled out the survey. > > However, there are well more than 18 people at any given UG meeting. > > Please submit your responses today, I promise, it's really short and even painless. 2020 (how did *that* happen?!) is here, we need to plan the next meeting > > Happy New Year. > > Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7 > > Thanks, > Kristy > >> On Dec 16, 2019, at 11:05 AM, Kristy Kallback-Rose > wrote: >> >> Hello, >> >> It's time already to plan for the next US event. We have a quick, seriously, should take order of 2 minutes, survey to capture your thoughts on location and date. It would help us greatly if you can please fill it out. >> >> Best wishes to all in the new year.
>> >> -Kristy >> >> >> Please give us 2 minutes of your time here: https://forms.gle/NFk5q4djJWvmDurW7 -------------- next part -------------- An HTML attachment was scrubbed... URL: From agostino.funel at enea.it Mon Jan 27 10:26:55 2020 From: agostino.funel at enea.it (Agostino Funel) Date: Mon, 27 Jan 2020 11:26:55 +0100 Subject: [gpfsug-discuss] How to upgrade GPFS 4.2.3.2 version? Message-ID: Hi, I was trying to upgrade our IBM Spectrum Scale (General Parallel File System Standard Edition) version "4.2.3.2 " for Linux_x86 systems but from the Passport Advantage download site the only available versions are 5.* Moreover, from the Fix Central repository the only available patches are for the 4.1.0 version. What should I do? Thank you in advance. Best regards, Agostino Funel -- Agostino Funel DTE-ICT-HPC ENEA P.le E. Fermi 1 80055 Portici (Napoli) Italy Phone: (+39) 081-7723575 Fax: (+39) 081-7723344 E-mail: agostino.funel at enea.it WWW: http://www.afs.enea.it/funel From S.J.Thompson at bham.ac.uk Mon Jan 27 10:29:52 2020 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Mon, 27 Jan 2020 10:29:52 +0000 Subject: [gpfsug-discuss] How to upgrade GPFS 4.2.3.2 version? In-Reply-To: References: Message-ID: <78B13AB8-5258-426D-AC9E-5D4045A1E554@bham.ac.uk> 4.2.3 on Fix Central is called IBM Spectrum Scale, not GPFS. Try: https://www-945.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%20defined%20storage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=4.2.3&platform=Linux+64-bit,x86_64&function=all Simon On 27/01/2020, 10:27, "gpfsug-discuss-bounces at spectrumscale.org on behalf of agostino.funel at enea.it" wrote: Hi, I was trying to upgrade our IBM Spectrum Scale (General Parallel File System Standard Edition) version "4.2.3.2 " for Linux_x86 systems but from the Passport Advantage download site the only available versions are 5.* Moreover, from the Fix Central repository the only available patches are for the 4.1.0 version. What should I do?
Thank you in advance. Best regards, Agostino Funel -- Agostino Funel DTE-ICT-HPC ENEA P.le E. Fermi 1 80055 Portici (Napoli) Italy Phone: (+39) 081-7723575 Fax: (+39) 081-7723344 E-mail: agostino.funel at enea.it WWW: http://www.afs.enea.it/funel _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From stockf at us.ibm.com Mon Jan 27 11:33:11 2020 From: stockf at us.ibm.com (Frederick Stock) Date: Mon, 27 Jan 2020 11:33:11 +0000 Subject: [gpfsug-discuss] How to upgrade GPFS 4.2.3.2 version? In-Reply-To: <78B13AB8-5258-426D-AC9E-5D4045A1E554@bham.ac.uk> References: <78B13AB8-5258-426D-AC9E-5D4045A1E554@bham.ac.uk>, Message-ID: An HTML attachment was scrubbed... URL: From heinrich.billich at id.ethz.ch Wed Jan 29 13:05:30 2020 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Wed, 29 Jan 2020 13:05:30 +0000 Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes Message-ID: Hello, Can I change the times at which the GUI runs HW_INVENTORY and related tasks? We frequently get messages like gui_refresh_task_failed GUI WARNING 12 hours ago The following GUI refresh task(s) failed: HW_INVENTORY The tasks fail due to timeouts. Running the task manually most times succeeds. We do run two gui nodes per cluster and I noted that both servers seem to run the HW_INVENTORY at the exact same time which may lead to locking or congestion issues, actually the logs show messages like EFSSA0194I Waiting for concurrent operation to complete. The gui calls "rinv" on the xCat servers. Rinv for a single little-endian server takes a long time - about 2-3 minutes, while it finishes in about 15s for a big-endian server. Hence the long runtime of rinv on little-endian systems may be an issue, too. We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems (5.0.3.2 efix4).
We run a mix of ppc64 and ppc64le systems, with a separate xCat/ems server for each type. The GUI nodes are ppc64le. We did see this issue with several gpfs versions on the gui and with at least two ESS/xCat versions. Just to be sure I did purge the Postgresql tables. I did try /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult. Thank you, Heiner -- ======================= Heinrich Billich ETH Zürich Informatikdienste Tel.: +41 44 632 72 56 heinrich.billich at id.ethz.ch ======================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Thu Jan 30 14:43:54 2020 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Thu, 30 Jan 2020 15:43:54 +0100 Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes In-Reply-To: References: Message-ID: On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote: > Hello, > > Can I change the times at which the GUI runs HW_INVENTORY and related tasks? > > we frequently get messages like > > gui_refresh_task_failed GUI WARNING 12 hours ago The following GUI > refresh task(s) failed: HW_INVENTORY > > The tasks fail due to timeouts. Running the task manually most times succeeds. We do run two gui > nodes per cluster and I noted that both servers seem to run the HW_INVENTORY at the exact same time > which may lead to locking or congestion issues, actually the logs show messages like > > EFSSA0194I Waiting for concurrent operation to complete. > > The gui calls "rinv" on the xCat servers. Rinv for a single little-endian server takes a long > time - about 2-3 minutes, while it finishes in about 15s for a big-endian server.
> > Hence the long runtime of rinv on little-endian systems may be an issue, too > > We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems (5.0.3.2 efix4). We run a mix > of ppc64 and ppc64le systems, with a separate xCat/ems server for each type. The GUI nodes are ppc64le. > > We did see this issue with several gpfs versions on the gui and with at least two ESS/xCat versions. > > Just to be sure I did purge the Postgresql tables. > > I did try > > /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY > > /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug > > And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult. I have seen the same on ppc64le. From time to time it recovers but then it starts again. The timeouts are okay, it is the hardware. I have opened a call at IBM and they suggested upgrading to ESS 5.3.5 because of the new firmwares which I am currently doing. I can dig out more details if you want. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From heinrich.billich at id.ethz.ch Thu Jan 30 15:13:06 2020 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Thu, 30 Jan 2020 15:13:06 +0000 Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes In-Reply-To: References: Message-ID: <6FA0119F-1067-4528-A4D8-54FA61F19BE0@id.ethz.ch> Hello Uli, Thank you. Yes, I noticed that some commands like 'ipmitool fru' or 'rinv' take long or very long on le systems - I've seen up to 7 minutes.
I tried to reset the bmc with 'ipmitool mc reset cold' but this breaks the os access to ipmi, you need to unload/load the kernel modules in the right order to fix it - or reboot. I also needed to restart goconserver to restore the console connection. Hence resetting the bmc is no real option for little-endian ESS servers. I don't know yet whether the bmc reset fixed anything. So we'll wait for 5.3.5 Kind regards, Heiner -- ======================= Heinrich Billich ETH Zürich Informatikdienste Tel.: +41 44 632 72 56 heinrich.billich at id.ethz.ch ======================== On 30.01.20, 15:44, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Ulrich Sibiller" wrote: On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote: I have seen the same on ppc64le. From time to time it recovers but then it starts again. The timeouts are okay, it is the hardware. I have opened a call at IBM and they suggested upgrading to ESS 5.3.5 because of the new firmwares which I am currently doing. I can dig out more details if you want. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr.
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From rohwedder at de.ibm.com Thu Jan 30 15:31:32 2020 From: rohwedder at de.ibm.com (Markus Rohwedder) Date: Thu, 30 Jan 2020 16:31:32 +0100 Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes In-Reply-To: References: Message-ID: Hello, The GUI tasks which are not daily tasks will start periodically at a random time. The exceptions are daily tasks, which are defined at fixed start times. It seems this is the issue you are experiencing, as the HW_INVENTORY task only runs once a day and starts at identical times on both GUI nodes. Tweaking the cache database is unfortunately not a workaround, as the hard-coded fixed starting times will be reset on every GUI restart. I have created a task to address this issue in a future release. We could for example add a random delay to the daily tasks, or a fixed delay based on the number of GUI nodes that are active. Mit freundlichen Grüßen / Kind regards Dr.
Markus Rohwedder Spectrum Scale GUI Development Phone: +49 162 4159920 IBM Deutschland Research & Development E-Mail: rohwedder at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 29.01.2020 14:41 Subject: [EXTERNAL] [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, Can I change the times at which the GUI runs HW_INVENTORY and related tasks? We frequently get messages like gui_refresh_task_failed GUI WARNING 12 hours ago The following GUI refresh task(s) failed: HW_INVENTORY The tasks fail due to timeouts. Running the task manually most times succeeds. We do run two gui nodes per cluster and I noted that both servers seem to run HW_INVENTORY at the exact same time, which may lead to locking or congestion issues; actually the logs show messages like EFSSA0194I Waiting for concurrent operation to complete. The gui calls 'rinv' on the xCat servers. Rinv for a single little-endian server takes a long time - about 2-3 minutes - while it finishes in about 15s for a big-endian server. Hence the long runtime of rinv on little-endian systems may be an issue, too. We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems (5.0.3.2 efix4). We run a mix of ppc64 and ppc64le systems, with a separate xCat/ems server for each type. The GUI nodes are ppc64le. We did see this issue with several gpfs versions on the gui and with at least two ESS/xCat versions. Just to be sure I did purge the Postgresql tables. I did try /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult.
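The staggering Markus proposes above (a fixed per-node delay derived from the number of active GUI nodes, so two GUI nodes never fire HW_INVENTORY at the same moment) might look like the following sketch. This is purely illustrative, not Spectrum Scale code; the variable names and the one-hour spread window are assumptions.

```shell
# Hypothetical sketch of a per-node start-time offset for daily GUI tasks.
# NODE_INDEX, NUM_GUI_NODES, BASE_MIN and WINDOW_MIN are illustrative names,
# not real Spectrum Scale settings.
NODE_INDEX=${NODE_INDEX:-1}       # 0 for the first GUI node, 1 for the second
NUM_GUI_NODES=${NUM_GUI_NODES:-2}
BASE_MIN=${BASE_MIN:-120}         # nominal start: 02:00 = 120 min after midnight
WINDOW_MIN=60                     # spread the nodes across one hour

# each node gets a fixed slot inside the window, wrapping at 24h
OFFSET=$(( (NODE_INDEX % NUM_GUI_NODES) * (WINDOW_MIN / NUM_GUI_NODES) ))
START_MIN=$(( (BASE_MIN + OFFSET) % (24 * 60) ))
printf 'node %d starts HW_INVENTORY at minute %d\n' "$NODE_INDEX" "$START_MIN"
```

With two GUI nodes and a nominal 02:00 start, node 0 keeps minute 120 and node 1 is shifted to minute 150, so the two rinv sweeps no longer collide on the xCat server.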
Thank you, Heiner -- ======================= Heinrich Billich ETH Zürich Informatikdienste Tel.: +41 44 632 72 56 heinrich.billich at id.ethz.ch ======================== _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=3j7GTkRFLANP-V9nMPiOuUX-2D3ybbNTEc64kU-OQAM&s=sR1v63lEVWuEZTBgspG3imB0MN_-7ggA6zrmyvqfCzE&e= From ewahl at osc.edu Thu Jan 30 15:52:27 2020 From: ewahl at osc.edu (Wahl, Edward) Date: Thu, 30 Jan 2020 15:52:27 +0000 Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes In-Reply-To: References: Message-ID: Interesting. We just deployed an ESS here and are running into a very similar problem with the gui refresh, it appears. Takes my ppc64le's about 45 seconds to run rinv when they are idle. I had just opened a support case on this last evening. We're on ESS 5.3.4 as well. I will wait to see what support says.
Ed Wahl Ohio Supercomputer Center -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Ulrich Sibiller Sent: Thursday, January 30, 2020 9:44 AM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote: > Hello, > > Can I change the times at which the GUI runs HW_INVENTORY and related tasks? > > we frequently get messages like > > gui_refresh_task_failed GUI WARNING 12 hours ago > The following GUI refresh task(s) failed: HW_INVENTORY > > The tasks fail due to timeouts. Running the task manually most times > succeeds. We do run two gui nodes per cluster and I noted that both > servers seem to run HW_INVENTORY at the exact same time which may > lead to locking or congestion issues, actually the logs show messages > like > > EFSSA0194I Waiting for concurrent operation to complete. > > The gui calls 'rinv' on the xCat servers. Rinv for a single > little-endian server takes a long time - about 2-3 minutes - while it finishes in about 15s for a big-endian server. > > Hence the long runtime of rinv on little-endian systems may be an > issue, too > > We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems > (5.0.3.2 efix4). We run a mix of ppc64 and ppc64le systems, with a separate xCat/ems server for each type. The GUI nodes are ppc64le. > > We did see this issue with several gpfs versions on the gui and with at least two ESS/xCat versions. > > Just to be sure I did purge the Postgresql tables. > > I did try > > /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY > > /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug > > And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are difficult. I have seen the same on ppc64le. From time to time it recovers but then it starts again. The timeouts are okay, it is the hardware.
I have opened a call at IBM and they suggested upgrading to ESS 5.3.5 because of the new firmwares, which I am currently doing. I can dig out more details if you want. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!KGKeukY!gqw1FGbrK5S4LZwnuFxwJtT6l9bm5S5mMjul3tadYbXRwk0eq6nesPhvndYl$ From janfrode at tanso.net Thu Jan 30 16:59:40 2020 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 30 Jan 2020 17:59:40 +0100 Subject: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes In-Reply-To: References: Message-ID: I *think* this was a known bug in the Power firmware included with 5.3.4, and that it was fixed in FW860.70. Something hanging/crashing in IPMI. -jf On Thu, 30 Jan 2020 at 17:10, Wahl, Edward wrote: > Interesting. We just deployed an ESS here and are running into a very > similar problem with the gui refresh, it appears. Takes my ppc64le's about > 45 seconds to run rinv when they are idle. > I had just opened a support case on this last evening. We're on ESS > 5.3.4 as well. I will wait to see what support says.
> > Ed Wahl > Ohio Supercomputer Center > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org < > gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Ulrich Sibiller > Sent: Thursday, January 30, 2020 9:44 AM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY > with two active GUI nodes > > On 1/29/20 2:05 PM, Billich Heinrich Rainer (ID SD) wrote: > > Hello, > > > > Can I change the times at which the GUI runs HW_INVENTORY and related > tasks? > > > > we frequently get messages like > > > > gui_refresh_task_failed GUI WARNING 12 hours > ago > > The following GUI refresh task(s) failed: HW_INVENTORY > > > > The tasks fail due to timeouts. Running the task manually most times > > succeeds. We do run two gui nodes per cluster and I noted that both > > servers seem to run HW_INVENTORY at the exact same time which may > > lead to locking or congestion issues, actually the logs show messages > > like > > > > EFSSA0194I Waiting for concurrent operation to complete. > > > > The gui calls 'rinv' on the xCat servers. Rinv for a single > > little-endian server takes a long time - about 2-3 minutes - while it > finishes in about 15s for a big-endian server. > > > > Hence the long runtime of rinv on little-endian systems may be an > > issue, too > > > > We run 5.0.4-1 efix9 on the gui and ESS 5.3.4.1 on the GNR systems > > (5.0.3.2 efix4). We run a mix of ppc64 and ppc64le systems, with a > separate xCat/ems server for each type. The GUI nodes are ppc64le. > > > > We did see this issue with several gpfs versions on the gui and with at > least two ESS/xCat versions. > > > > Just to be sure I did purge the Postgresql tables. > > > > I did try > > > > /usr/lpp/mmfs/gui/cli/lstasklog HW_INVENTORY > > > > /usr/lpp/mmfs/gui/cli/runtask HW_INVENTORY --debug > > > > And also tried to read the logs in /var/log/cnlog/mgtsrv/ - but they are > difficult. > > > I have seen the same on ppc64le.
From time to time it recovers but then it > starts again. The timeouts are okay, it is the hardware. I have opened a > call at IBM and they suggested upgrading to ESS 5.3.5 because of the new > firmwares, which I am currently doing. I can dig out more details if you > want. > > Uli > -- > Science + Computing AG > Vorstandsvorsitzender/Chairman of the board of management: > Dr. Martin Matzke > Vorstand/Board of Management: > Matthias Schempp, Sabine Hohenstein > Vorsitzender des Aufsichtsrats/ > Chairman of the Supervisory Board: > Philippe Miltin > Aufsichtsrat/Supervisory Board: > Martin Wibbe, Ursula Morgenstern > Sitz/Registered Office: Tuebingen > Registergericht/Registration Court: Stuttgart Registernummer/Commercial > Register No.: HRB 382196 _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!KGKeukY!gqw1FGbrK5S4LZwnuFxwJtT6l9bm5S5mMjul3tadYbXRwk0eq6nesPhvndYl$ > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss >