From Renar.Grunenberg at huk-coburg.de Sat Dec 1 10:52:21 2018
From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar)
Date: Sat, 1 Dec 2018 10:52:21 +0000
Subject: [gpfsug-discuss] Fwd: Status for Alert: remotely mounted filesystem panic on accessing cluster after upgrading the owning cluster first
In-Reply-To: <88303B8D-1447-45BC-AAF8-A16000C4B4DC@gmx.de>
References: , <88303B8D-1447-45BC-AAF8-A16000C4B4DC@gmx.de>
Message-ID: <5DEF2C75-EA8F-4A2F-B0A7-391DCECFAB6E@huk-coburg.de>

Hallo All,
We updated our owning cluster to 5.0.2.1 today. After that we tested our case, and our problem seems to be fixed. Thanks to all for the hints.
Regards Renar

Renar Grunenberg
Abteilung Informatik - Betrieb
HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44110
Telefax: 09561 96-44104
E-Mail: Renar.Grunenberg at huk-coburg.de
Internet: www.huk.de
=======================================================================
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas.
=======================================================================
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.
This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
=======================================================================

From alvise.dorigo at psi.ch Thu Dec 6 09:22:48 2018
From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI))
Date: Thu, 6 Dec 2018 09:22:48 +0000
Subject: [gpfsug-discuss] PMSensors 5.0.2 vs PMCollector 4.2.3
Message-ID: <83A6EEB0EC738F459A39439733AE8045267ADADC@MBX114.d.ethz.ch>

Good morning,
I wonder if pmsensors 5.x is supported with a 4.2.3 collector. Since I've upgraded a couple of nodes (AFM gateways) to GPFS 5, while the rest of the cluster is still running 4.2.3-7 (including the collector), I haven't got any more metrics from the upgraded nodes. Further, I do not see any errors in /var/log/zimon/ZIMonSensors.log nor in /var/log/zimon/ZIMonCollector.log.
Does anybody have any idea?

Thanks,
Regards.
Alvise
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From scale at us.ibm.com Thu Dec 6 11:35:59 2018
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Thu, 6 Dec 2018 17:05:59 +0530
Subject: [gpfsug-discuss] PMSensors 5.0.2 vs PMCollector 4.2.3
In-Reply-To: <83A6EEB0EC738F459A39439733AE8045267ADADC@MBX114.d.ethz.ch>
References: <83A6EEB0EC738F459A39439733AE8045267ADADC@MBX114.d.ethz.ch>
Message-ID:

Hi Mathias, can you help with the query below?
Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From: "Dorigo Alvise (PSI)"
To: "gpfsug-discuss at spectrumscale.org"
Date: 12/06/2018 02:53 PM
Subject: [gpfsug-discuss] PMSensors 5.0.2 vs PMCollector 4.2.3
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Good morning,
I wonder if pmsensors 5.x is supported with a 4.2.3 collector. Since I've upgraded a couple of nodes (AFM gateways) to GPFS 5, while the rest of the cluster is still running 4.2.3-7 (including the collector), I haven't got any more metrics from the upgraded nodes. Further, I do not see any errors in /var/log/zimon/ZIMonSensors.log nor in /var/log/zimon/ZIMonCollector.log.
Does anybody have any idea?

Thanks,
Regards.
Alvise
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alvise.dorigo at psi.ch Thu Dec 6 16:06:03 2018
From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI))
Date: Thu, 6 Dec 2018 16:06:03 +0000
Subject: [gpfsug-discuss] PMSensors 5.0.2 vs PMCollector 4.2.3
In-Reply-To:
References: <83A6EEB0EC738F459A39439733AE8045267ADADC@MBX114.d.ethz.ch>,
Message-ID: <83A6EEB0EC738F459A39439733AE8045267ADBC5@MBX114.d.ethz.ch>

Hi,
to be precise, it seems that the Infiniband metrics have disappeared, as confirmed by their absence from the official documentation:

https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/com.ibm.spectrum.scale.v5r02.doc/bl1adv_listofmetricsPMT.htm

So the refined question is: how can we monitor Infiniband using ZIMon sensors?

Thanks,
Alvise

________________________________
From: Karthik G Iyer1 [karthik.iyer at in.ibm.com] on behalf of IBM Spectrum Scale [scale at us.ibm.com]
Sent: Thursday, December 06, 2018 12:35 PM
To: Mathias Dietz
Cc: gpfsug main discussion list; Dorigo Alvise (PSI)
Subject: Re: [gpfsug-discuss] PMSensors 5.0.2 vs PMCollector 4.2.3

Hi Mathias, can you help with the query below?

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries.
From: "Dorigo Alvise (PSI)" To: "gpfsug-discuss at spectrumscale.org" Date: 12/06/2018 02:53 PM Subject: [gpfsug-discuss] PMSensors 5.0.2 vs PMCollector 4.2.3 Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Good morning, I wonder if pmsensors 5.x is supported with a 4.2.3 collector. Since I've upgraded a couple of nodes (afm gateways) to GPFS 5, while the rest of the cluster is still running 4.2.3-7 (including the collector), I haven't got anymore metrics from the upgraded nodes. Further, I do not see any error in /var/log/zimon/ZIMonSensors.log neither in /var/log/zimon/ZIMonCollector.log Does anybody has any idea ? thanks, Regards. Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdratlif at iu.edu Thu Dec 6 16:36:35 2018 From: jdratlif at iu.edu (Ratliff, John) Date: Thu, 6 Dec 2018 16:36:35 +0000 Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan Message-ID: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> We're trying to run a policy scan to get a list of all the files in one of our filesets. There are approximately 600 million inodes in this space. We're running GPFS 3.5. Every time we run the policy scan, the node that is running it ends up crashing. It makes it through a quarter of the inodes before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS logs shows anything. It just notes that the node rebooted. In the crash logs of all the systems we've tried this on, we see the same line. <1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8 <1>IP: [] _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0 x590 [mmfs26] Our policy scan rule is pretty simple: RULE 'list-homedirs' LIST 'list-homedirs' mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1 Has anyone experienced something like this or have any suggestions on what to do to avoid it? Thanks. John Ratliff | Pervasive Technology Institute | UITS | Research Storage - Indiana University | http://pti.iu.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5670 bytes Desc: not available URL: From Renar.Grunenberg at huk-coburg.de Thu Dec 6 17:03:28 2018 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Thu, 6 Dec 2018 17:03:28 +0000 Subject: [gpfsug-discuss] Spectrum Scale for Windows Domain -User Requirements Message-ID: <50178605f3f643e8a72c5f7f3a547bc7@SMXRF105.msg.hukrf.de> Hallo All, i had a question about the domain-user root account on Windows. We have some requirements to restrict these level of authorization and found no info what is possible to change here. Two questions: 1. It is possible to define a other Domain-Account other than as root for this. 2. If not, is it possible to define a local account as root on Windows-Clients? Any hints are appreciate. Thanks Renar Renar Grunenberg Abteilung Informatik ? 
Betrieb
HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44110
Telefax: 09561 96-44104
E-Mail: Renar.Grunenberg at huk-coburg.de
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.
This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stockf at us.ibm.com Thu Dec 6 17:15:15 2018
From: stockf at us.ibm.com (Frederick Stock)
Date: Thu, 6 Dec 2018 12:15:15 -0500
Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan
In-Reply-To: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu>
References: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu>
Message-ID:

Hopefully you are aware that GPFS 3.5 has been out of service since April 2017 unless you are on extended service. Might be a good time to consider upgrading.

Fred
__________________________________________________
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
stockf at us.ibm.com

From: "Ratliff, John"
To: "gpfsug-discuss at spectrumscale.org"
Date: 12/06/2018 11:53 AM
Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan
Sent by: gpfsug-discuss-bounces at spectrumscale.org

We're trying to run a policy scan to get a list of all the files in one of our filesets. There are approximately 600 million inodes in this space. We're running GPFS 3.5. Every time we run the policy scan, the node that is running it ends up crashing. It makes it through a quarter of the inodes before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS logs shows anything. It just notes that the node rebooted.

In the crash logs of all the systems we've tried this on, we see the same line.

<1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8
<1>IP: [] _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0x590 [mmfs26]

Our policy scan rule is pretty simple:

RULE 'list-homedirs'
LIST 'list-homedirs'

mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1

Has anyone experienced something like this or have any suggestions on what to do to avoid it?

Thanks.

John Ratliff | Pervasive Technology Institute | UITS | Research Storage -
Indiana University | http://pti.iu.edu [attachment "smime.p7s" deleted by Frederick Stock/Pittsburgh/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.robinson02 at gmail.com Thu Dec 6 17:41:50 2018 From: matthew.robinson02 at gmail.com (Matthew Robinson) Date: Thu, 6 Dec 2018 12:41:50 -0500 Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan In-Reply-To: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> References: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> Message-ID: call IBM you ass,. submit an SOS report a\nd fuckofff' On Thu, Dec 6, 2018 at 11:46 AM Ratliff, John wrote: > We?re trying to run a policy scan to get a list of all the files in one of > our filesets. There are approximately 600 million inodes in this space. > We?re running GPFS 3.5. Every time we run the policy scan, the node that is > running it ends up crashing. It makes it through a quarter of the inodes > before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS > logs shows anything. It just notes that the node rebooted. > > > > In the crash logs of all the systems we?ve tried this on, we see the same > line. > > > > <1>BUG: unable to handle kernel NULL pointer dereference at > 00000000000000d8 > > <1>IP: [] > _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0x590 > [mmfs26] > > > > Our policy scan rule is pretty simple: > > > > RULE 'list-homedirs' > > LIST 'list-homedirs' > > > > mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N > gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1 > > > > Has anyone experienced something like this or have any suggestions on what > to do to avoid it? > > > > Thanks. > > > > John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? > Indiana University | http://pti.iu.edu > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Thu Dec 6 18:47:05 2018 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 6 Dec 2018 13:47:05 -0500 Subject: [gpfsug-discuss] Bizarre fcntl locking behavior Message-ID: <7ec54f20-77c9-8534-c3f4-3cd270f1c2a3@nasa.gov> I've been trying to chase down an error one of our users periodically sees with Intel MPI. The body of the error is this: This requires fcntl(2) to be implemented. As of 8/25/2011 it is not. Generic MPICH Message: File locking failed in ADIOI_Set_lock(fd F,cmd F_SETLKW/7,type F_RDLCK/0,whence 0) with return value FFFFFFFF and errno 25. - If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running on all the machines, and mount the directory with the 'noac' option (no attribute caching). - If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option. 
ADIOI_Set_lock:: No locks available ADIOI_Set_lock:offset 0, length 8 When this happens, a new job is reading back-in the checkpoint files a previous job wrote. Consistently it's the reading in of previously written files that triggers this although the occurrence is sporadic and if the job retries enough times the error will go away. The really curious thing, is there is only one byte range lock per file per-node open at any time, so the error 37 (I know it says 25 but that's actually in hex even though it's not prefixed with 0x) of being out of byte range locks is a little odd to me. The default is 200 but we should be no way near that. I've been trying to frantically chase this down with various MPI reproducers but alas I came up short, until this morning, when I gave up on the MPI approach and tried something a little more simple. I've discovered that when: - A file is opened by node A (a key requirement to reproduce seems to be that node A is *also* the metanode for the file. I've not been able to reproduce if node A is *not* the metanode) - Node A Acquires a bunch of write locks in the file - Node B then also acquires a bunch of write locks in the file - Node B then acquires a bunch of read locks in the file - Node A then also acquires a bunch of read locks in the file At that last step, Node A will experience the errno 37 attempting to acquire read locks. Here are the actual commands to reproduce this (source code for fcntl_stress.c is attached): Node A: rm /gpfs/aaronFS/testFile; dd if=/dev/zero of=/gpfs/aaronFS/testFile bs=1M count=4000 Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) 1 Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) 1 Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) Now that I've typed this out, I realize this really should be a PMR not a post to the mailing list :) but I thought it was interesting and wanted to share. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 -------------- next part -------------- /* Aaron Knister Program to acquire a bunch of byte range locks in a file */ #include #include #include #include #include #include #include int main(int argc, char **argv) { char *filename; int fd; struct stat statBuf; int highRand; int lowRand; unsigned int l_start = 0; unsigned int l_len; int openMode; int lockType; struct flock lock; unsigned int stride; filename = argv[1]; stride = atoi(argv[2]); l_len = atoi(argv[3]); if ( argc > 4 ) { openMode = O_WRONLY; lockType = F_WRLCK; } else { openMode = O_RDONLY; lockType = F_RDLCK; } printf("Opening file '%s' in %s mode. stride = %d. l_len = %d\n", filename, (openMode == O_WRONLY) ? "write" : "read", stride, l_len); assert( (fd = open(filename, openMode)) >= 0 ); assert( fstat(fd, &statBuf) == 0 ); while(1) { if ( l_start >= statBuf.st_size ) { break; l_start = 0; } highRand = rand(); lowRand = rand(); lock.l_type = lockType; lock.l_whence = 0; lock.l_start = l_start; lock.l_len = l_len; if (fcntl(fd, F_SETLKW, &lock) != 0) { fprintf(stderr, "Non-zero return from fcntl. 
errno = %d (%s)\n", errno, strerror(errno)); abort(); } lock.l_type = F_UNLCK; lock.l_whence = 0; lock.l_start = l_start; lock.l_len = l_len; assert(fcntl(fd, F_SETLKW, &lock) != -1); l_start += stride; } } From aaron.s.knister at nasa.gov Thu Dec 6 18:56:44 2018 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 6 Dec 2018 13:56:44 -0500 Subject: [gpfsug-discuss] Bizarre fcntl locking behavior In-Reply-To: <7ec54f20-77c9-8534-c3f4-3cd270f1c2a3@nasa.gov> References: <7ec54f20-77c9-8534-c3f4-3cd270f1c2a3@nasa.gov> Message-ID: <5a5c70c6-c67f-1b5e-77ae-ef38553cb7e4@nasa.gov> Just for the sake of completeness, when the test program fails in the expected fashion this is the message it prints: Opening file 'read' in /gpfs/aaronFS/testFile mode. stride = 1048576 l_len = 262144 Non-zero return from fcntl. errno = 37 (No locks available) Aborted -Aaron On 12/6/18 1:47 PM, Aaron Knister wrote: > I've been trying to chase down an error one of our users periodically > sees with Intel MPI. The body of the error is this: > > This requires fcntl(2) to be implemented. As of 8/25/2011 it is not. > Generic MPICH Message: File locking failed in ADIOI_Set_lock(fd F,cmd > F_SETLKW/7,type F_RDLCK/0,whence 0) with return value FFFFFFFF and errno > 25. > - If the file system is NFS, you need to use NFS version 3, ensure that > the lockd daemon is running on all the machines, and mount the directory > with the 'noac' option (no attribute caching). > - If the file system is LUSTRE, ensure that the directory is mounted > with the 'flock' option. > ADIOI_Set_lock:: No locks available > ADIOI_Set_lock:offset 0, length 8 > > When this happens, a new job is reading back-in the checkpoint files a > previous job wrote. Consistently it's the reading in of previously > written files that triggers this although the occurrence is sporadic and > if the job retries enough times the error will go away. > > The really curious thing, is there is only one byte range lock per file > per-node open at any time, so the error 37 (I know it says 25 but that's > actually in hex even though it's not prefixed with 0x) of being out of > byte range locks is a little odd to me. The default is 200 but we should > be no way near that. > > I've been trying to frantically chase this down with various MPI > reproducers but alas I came up short, until this morning, when I gave up > on the MPI approach and tried something a little more simple. I've > discovered that when: > > - A file is opened by node A (a key requirement to reproduce seems to be > that node A is *also* the metanode for the file. I've not been able to > reproduce if node A is *not* the metanode) > - Node A Acquires a bunch of write locks in the file > - Node B then also acquires a bunch of write locks in the file > - Node B then acquires a bunch of read locks in the file > - Node A then also acquires a bunch of read locks in the file > > At that last step, Node A will experience the errno 37 attempting to > acquire read locks. 
> > Here are the actual commands to reproduce this (source code for > fcntl_stress.c is attached): > > Node A: rm /gpfs/aaronFS/testFile; dd if=/dev/zero > of=/gpfs/aaronFS/testFile bs=1M count=4000 > Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) > $((256*1024)) 1 > Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) > $((256*1024)) 1 > Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) > Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) > > Now that I've typed this out, I realize this really should be a PMR not > a post to the mailing list :) but I thought it was interesting and > wanted to share. > > -Aaron > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From S.J.Thompson at bham.ac.uk Thu Dec 6 19:24:33 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Thu, 6 Dec 2018 19:24:33 +0000 Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan In-Reply-To: References: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu>, Message-ID: Just a gentle reminder that this is a community based list and that we expect people to be respectful of each other on the list. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of matthew.robinson02 at gmail.com [matthew.robinson02 at gmail.com] Sent: 06 December 2018 17:41 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS nodes crashing during policy scan On Thu, Dec 6, 2018 at 11:46 AM Ratliff, John > wrote: We?re trying to run a policy scan to get a list of all the files in one of our filesets. There are approximately 600 million inodes in this space. We?re running GPFS 3.5. Every time we run the policy scan, the node that is running it ends up crashing. It makes it through a quarter of the inodes before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS logs shows anything. It just notes that the node rebooted. In the crash logs of all the systems we?ve tried this on, we see the same line. <1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8 <1>IP: [] _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0x590 [mmfs26] Our policy scan rule is pretty simple: RULE 'list-homedirs' LIST 'list-homedirs' mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1 Has anyone experienced something like this or have any suggestions on what to do to avoid it? Thanks. John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? Indiana University | http://pti.iu.edu _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. From aaron.s.knister at nasa.gov Fri Dec 7 13:38:35 2018 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[InuTeq, LLC]) Date: Fri, 7 Dec 2018 13:38:35 +0000 Subject: [gpfsug-discuss] Test? Message-ID: <9272165F-DF81-4B01-91B5-D04B07E17AC7@nasa.gov> I sent a couple messages to the list earlier that made it to the archives online but seemingly never made it to anyone else I talked to. 
I'm curious to see if this message goes through.

-Aaron
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kraemerf at de.ibm.com Fri Dec 7 13:47:12 2018
From: kraemerf at de.ibm.com (Frank Kraemer)
Date: Fri, 7 Dec 2018 14:47:12 +0100
Subject: [gpfsug-discuss] SAVE the date - Spectrum Scale Strategy Days 2019 at IBM Ehningen, Germany 19., 20./21. March 2019
In-Reply-To:
References:
Message-ID:

Spectrum Scale Strategy Days 2019 at IBM Ehningen, Germany 19., 20./21. March 2019

https://www.ibm.com/events/wwe/grp/grp308.nsf/Agenda.xsp?openform&seminar=Z94GKRES&locale=de_DE

Save the date :-)
-frank-

Frank Kraemer
IBM Consulting IT Specialist / Client Technical Architect
Am Weiher 24, 65451 Kelsterbach, Germany
mailto:kraemerf at de.ibm.com
Mobile +49171-3043699
IBM Germany
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From abeattie at au1.ibm.com Fri Dec 7 22:05:53 2018
From: abeattie at au1.ibm.com (Andrew Beattie)
Date: Fri, 7 Dec 2018 22:05:53 +0000
Subject: [gpfsug-discuss] Test?
In-Reply-To: <9272165F-DF81-4B01-91B5-D04B07E17AC7@nasa.gov>
References: <9272165F-DF81-4B01-91B5-D04B07E17AC7@nasa.gov>
Message-ID:

An HTML attachment was scrubbed...
URL:

From scale at us.ibm.com Fri Dec 7 22:55:06 2018
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Fri, 7 Dec 2018 14:55:06 -0800
Subject: [gpfsug-discuss] Spectrum Scale for Windows Domain -User Requirements
In-Reply-To: <50178605f3f643e8a72c5f7f3a547bc7@SMXRF105.msg.hukrf.de>
References: <50178605f3f643e8a72c5f7f3a547bc7@SMXRF105.msg.hukrf.de>
Message-ID:

Hello,

Unfortunately, to allow bidirectional passwordless ssh between Linux/Windows (for the sole purpose of mm* commands), the literal username 'root' is a requirement. Here are a few variations.

1. Use domain account 'root', where 'root' belongs to the "Domain Admins" group. This is the easiest one-step approach and the recommended way.
or
2. Use domain account 'root', where 'root' does NOT belong to the "Domain Admins" group. In this case, on each and every GPFS Windows node, add this 'domain\root' account to the local "Administrators" group.
or
3. On each and every GPFS Windows node, create a local 'root' account as a member of the local "Administrators" group.

(1) and (2) work reliably with Cygwin. I have seen inconsistent results with approach (3), wherein Cygwin passwordless ssh in the incoming direction (Linux->Windows) sometimes breaks and prompts for a password. Give it a try to see if you get better results.

If you cannot get around the 'root' literal username requirement, the suggested alternative is to use GPFS multi-clustering. Create a separate cluster of all Windows-only nodes (using mmwinrsh/mmwinrcp instead of ssh/scp... so that the 'root' requirement is eliminated). And then remote mount from the Linux cluster (all non-Windows nodes) via mmauth, mmremotecluster and mmremotefs et al.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries.
The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Grunenberg, Renar" To: "gpfsug-discuss at spectrumscale.org" Date: 12/06/2018 09:05 AM Subject: [gpfsug-discuss] Spectrum Scale for Windows Domain -User Requirements Sent by: gpfsug-discuss-bounces at spectrumscale.org Hallo All, i had a question about the domain-user root account on Windows. We have some requirements to restrict these level of authorization and found no info what is possible to change here. Two questions: 1. It is possible to define a other Domain-Account other than as root for this. 2. If not, is it possible to define a local account as root on Windows-Clients? Any hints are appreciate. Thanks Renar Renar Grunenberg Abteilung Informatik ? Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Tue Dec 11 11:17:10 2018 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Tue, 11 Dec 2018 12:17:10 +0100 Subject: [gpfsug-discuss] Spectrum Scale *News* Dec 12th 2018 Message-ID: IBM SpectrumAI with NVIDIA DGX is the key to data science productivity https://www.ibm.com/blogs/systems/introducing-spectrumai-with-nvidia-dgx/ To drive AI development productivity and streamline the AI data pipeline, IBM is introducing IBM Spectrum AI with NVIDIA DGX. This converged solution combines the industry acclaimed software-defined storage scale-out file system, IBM Spectrum Scale on flash with NVIDIA DGX-1. It provides the highest performance in any tested converged system with the unique ability to support a growing data science practice. https://public.dhe.ibm.com/common/ssi/ecm/81/en/81022381usen/ibm-spectrumai-ref-arch-dec10-v6_81022381USEN.pdf Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Am Weiher 24, 65451 Kelsterbach, Germany mailto:kraemerf at de.ibm.com Mobile +49171-3043699 IBM Germany -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From u.sibiller at science-computing.de Thu Dec 13 13:52:42 2018 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Thu, 13 Dec 2018 14:52:42 +0100 Subject: [gpfsug-discuss] Filesystem access issues via CES NFS In-Reply-To: <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se> References: <717f49aade0b439eb1b99fc620a21cac@maxiv.lu.se> <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se> Message-ID: On 23.11.2018 14:41, Andreas Mattsson wrote: > Yes, this is repeating. > > We?ve ascertained that it has nothing to do at all with file operations on the GPFS side. > > Randomly throughout the filesystem mounted via NFS, ls or file access will give > > ? > > > ls: reading directory /gpfs/filessystem/test/testdir: Invalid argument > > ? > > Trying again later might work on that folder, but might fail somewhere else. > > We have tried exporting the same filesystem via a standard kernel NFS instead of the CES > Ganesha-NFS, and then the problem doesn?t exist. > > So it is definitely related to the Ganesha NFS server, or its interaction with the file system. > > Will see if I can get a tcpdump of the issue. We see this, too. We cannot trigger it. Fortunately I have managed to capture some logs with debugging enabled. I have now dug into the ganesha 2.5.3 code and I think the netgroup caching is the culprit. Here some FULL_DEBUG output: 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :Check for address 1.2.3.4 for export id 1 path /gpfsexport 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcf7fe0 NETGROUP_CLIENT: netgroup1 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcfe320 NETGROUP_CLIENT: netgroup2 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcfe380 NETGROUP_CLIENT: netgroup3 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :EXPORT (options=03303002 , , , , , -- Deleg, , ) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :EXPORT_DEFAULTS (options=42102002root_squash , ----, 3--, ---, TCP, ----, Manage_Gids , , anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :default options (options=03303002root_squash , ----, 34-, UDP, TCP, ----, No Manage_Gids, -- Deleg, anon_uid= -2, anon_gid= -2, none, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : 
gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :Final options (options=42102002root_squash , ----, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_rpc_execute :DISP :INFO :DISP: INFO: Client ::ffff:1.2.3.4 is not allowed to access Export_Id 1 /gpfsexport, vers=3, proc=18 The client "client1" is definitely a member of the "netgroup1". But the NETGROUP_CLIENT lookups for "netgroup2" and "netgroup3" can only happen if the netgroup caching code reports that "client1" is NOT a member of "netgroup1". I have also opened a support case at IBM for this. @Malahal: Looks like you have written the netgroup caching code, feel free to ask for further details if required. Kind regards, Ulrich Sibiller -- Dipl.-Inf. Ulrich Sibiller science + computing ag System Administration Hagellocher Weg 73 72070 Tuebingen, Germany https://atos.net/de/deutschland/sc -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From jcram at ddn.com Fri Dec 14 14:45:52 2018 From: jcram at ddn.com (Jeno Cram) Date: Fri, 14 Dec 2018 14:45:52 +0000 Subject: [gpfsug-discuss] Filesystem access issues via CES NFS In-Reply-To: References: <717f49aade0b439eb1b99fc620a21cac@maxiv.lu.se> <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se> Message-ID: Are you using Extended attributes on the directories in question? Jeno Cram | Systems Engineer Mobile: 517-980-0495 jcram at ddn.com DDN.com ?On 12/13/18, 9:02 AM, "Ulrich Sibiller" wrote: On 23.11.2018 14:41, Andreas Mattsson wrote: > Yes, this is repeating. > > We?ve ascertained that it has nothing to do at all with file operations on the GPFS side. > > Randomly throughout the filesystem mounted via NFS, ls or file access will give > > ? > > > ls: reading directory /gpfs/filessystem/test/testdir: Invalid argument > > ? > > Trying again later might work on that folder, but might fail somewhere else. > > We have tried exporting the same filesystem via a standard kernel NFS instead of the CES > Ganesha-NFS, and then the problem doesn?t exist. > > So it is definitely related to the Ganesha NFS server, or its interaction with the file system. > > Will see if I can get a tcpdump of the issue. We see this, too. We cannot trigger it. Fortunately I have managed to capture some logs with debugging enabled. I have now dug into the ganesha 2.5.3 code and I think the netgroup caching is the culprit. 
Here some FULL_DEBUG output: 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :Check for address 1.2.3.4 for export id 1 path /gpfsexport 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcf7fe0 NETGROUP_CLIENT: netgroup1 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcfe320 NETGROUP_CLIENT: netgroup2 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcfe380 NETGROUP_CLIENT: netgroup3 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :EXPORT (options=03303002 , , , , , -- Deleg, , ) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :EXPORT_DEFAULTS (options=42102002root_squash , ----, 3--, ---, TCP, ----, Manage_Gids , , anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :default options (options=03303002root_squash , ----, 34-, UDP, TCP, ----, No Manage_Gids, -- Deleg, anon_uid= -2, anon_gid= -2, none, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :Final options (options=42102002root_squash , ----, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_rpc_execute :DISP :INFO :DISP: INFO: Client ::ffff:1.2.3.4 is not allowed to access Export_Id 1 /gpfsexport, vers=3, proc=18 The client "client1" is definitely a member of the "netgroup1". But the NETGROUP_CLIENT lookups for "netgroup2" and "netgroup3" can only happen if the netgroup caching code reports that "client1" is NOT a member of "netgroup1". I have also opened a support case at IBM for this. @Malahal: Looks like you have written the netgroup caching code, feel free to ask for further details if required. Kind regards, Ulrich Sibiller -- Dipl.-Inf. Ulrich Sibiller science + computing ag System Administration Hagellocher Weg 73 72070 Tuebingen, Germany https://atos.net/de/deutschland/sc -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. 
Martin Matzke
Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Philippe Miltin
Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From Kevin.Buterbaugh at Vanderbilt.Edu Thu Dec 13 20:54:39 2018
From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L)
Date: Thu, 13 Dec 2018 20:54:39 +0000
Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI?
Message-ID: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu>

Hi All,

Googling "GPFS and iSCSI" doesn't produce a ton of hits! But we are interested to know if anyone is actually using GPFS over iSCSI?

The reason why I'm asking is that we currently use an 8 Gb FC SAN - QLogic SANbox 5800s, QLogic HBAs in our NSD servers - but we're seeing signs that, especially when we start using beefier storage arrays with more disks behind the controllers, the 8 Gb FC could be a bottleneck.

As many / most of you are already aware, I'm sure, while 16 Gb FC exists, there's basically only one vendor in that game. And guess what happens to prices when there's only one vendor??? We bought our 8 Gb FC switches for approximately $5K apiece. List price on a 16 Gb FC switch - $40K. Ouch.

So the idea of being able to use commodity 10 or 40 Gb Ethernet switches and HBAs is very appealing - both from a cost and a performance perspective (last I checked 40 Gb was more than twice 16 Gb!). Anybody doing this already?

As those of you who've been on this list for a while and don't filter out e-mails from me () already know, we have a much beefier Infortrend storage array we've purchased that I'm currently using to test various metadata configurations (and I will report back results on that when done, I promise). That array also supports iSCSI, so I actually have our test cluster GPFS filesystem up and running over iSCSI. It was surprisingly easy to set up. But any tips, suggestions, warnings, etc. about running GPFS over iSCSI are appreciated!

Two things that I am already aware of are: 1) use jumbo frames, and 2) run iSCSI over its own private network. Other things I should be aware of?!?

Thanks all...

--
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aaron.knister at gmail.com Sat Dec 15 14:44:25 2018
From: aaron.knister at gmail.com (Aaron Knister)
Date: Sat, 15 Dec 2018 09:44:25 -0500
Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI?
In-Reply-To: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu>
References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu>
Message-ID: <9955EFB8-B50B-4F8C-8803-468626BE23FB@gmail.com>

Hi Kevin,

I don't have any experience running GPFS over iSCSI (although does iSER count?). For what it's worth there are, I believe, two vendors in the FC space: Cisco and Brocade. You can get a fully licensed 16 Gb Cisco MDS edge switch for not a whole lot (https://m.cdw.com/product/cisco-mds-9148s-switch-48-ports-managed-rack-mountable/3520640). If you start with fewer ports licensed the cost goes down dramatically.
I've also found that, if it's an option, IB is stupidly cheap from a dollar per unit of bandwidth perspective and makes a great SAN backend.

One other, other thought is that FCoE seems very attractive. Not sure if your arrays support that but I believe you get closer performance and behavior to FC with FCoE than with iSCSI and I don't think there's a huge cost difference. It's even more fun if you have multi-fabric FC switches that can do FC and FCoE because you can in theory bridge the two fabrics (e.g. use FCoE on your NSD servers to some 40G switches that support DCB and connect the 40G eth switches to an FC/FCoE switch and then address your 8Gb FC storage and FCoE storage using the same fabric).

-Aaron

Sent from my iPhone

> On Dec 13, 2018, at 15:54, Buterbaugh, Kevin L wrote:
>
> Hi All,
>
> Googling "GPFS and iSCSI" doesn't produce a ton of hits! But we are interested to know if anyone is actually using GPFS over iSCSI?
>
> The reason why I'm asking is that we currently use an 8 Gb FC SAN - QLogic SANbox 5800s, QLogic HBAs in our NSD servers - but we're seeing signs that, especially when we start using beefier storage arrays with more disks behind the controllers, the 8 Gb FC could be a bottleneck.
>
> As many / most of you are already aware, I'm sure, while 16 Gb FC exists, there's basically only one vendor in that game. And guess what happens to prices when there's only one vendor??? We bought our 8 Gb FC switches for approximately $5K apiece. List price on a 16 Gb FC switch - $40K. Ouch.
>
> So the idea of being able to use commodity 10 or 40 Gb Ethernet switches and HBAs is very appealing - both from a cost and a performance perspective (last I checked 40 Gb was more than twice 16 Gb!). Anybody doing this already?
>
> As those of you who've been on this list for a while and don't filter out e-mails from me () already know, we have a much beefier Infortrend storage array we've purchased that I'm currently using to test various metadata configurations (and I will report back results on that when done, I promise). That array also supports iSCSI, so I actually have our test cluster GPFS filesystem up and running over iSCSI. It was surprisingly easy to set up. But any tips, suggestions, warnings, etc. about running GPFS over iSCSI are appreciated!
>
> Two things that I am already aware of are: 1) use jumbo frames, and 2) run iSCSI over its own private network. Other things I should be aware of?!?
>
> Thanks all...
>
> --
> Kevin Buterbaugh - Senior System Administrator
> Vanderbilt University - Advanced Computing Center for Research and Education
> Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kraemerf at de.ibm.com Sun Dec 16 11:59:39 2018
From: kraemerf at de.ibm.com (Frank Kraemer)
Date: Sun, 16 Dec 2018 12:59:39 +0100
Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? -
In-Reply-To:
References:
Message-ID:

Kevin,

Ethernet networking of today is changing very fast as the driving forces are the "Hyperscale" datacenters. This big innovation is changing the world and is happening right now. You must understand the conversation by breaking down the differences between ASICs, FPGAs, and NPUs in modern Ethernet networking.
1) Mellanox has a very good answer here based on the Spectrum-2 chip
http://www.mellanox.com/page/press_release_item?id=1933

2) Broadcom's answer to this is the 12.8 Tb/s StrataXGS Tomahawk 3 Ethernet Switch Series
https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm56980-series
https://globenewswire.com/news-release/2018/10/24/1626188/0/en/Broadcom-Achieves-Mass-Production-on-Industry-Leading-12-8-Tbps-Tomahawk-3-Ethernet-Switch-Family.html

3) Barefoot's Tofino 2 is another valid answer to this problem as it's programmable with the P4 language (important for Hyperscale Datacenters)
https://www.barefootnetworks.com/

The P4 language itself is open source. There are details at p4.org, or you can download code at GitHub: https://github.com/p4lang/

4) The last newcomer to this party comes from Innovium, named Teralynx
https://innovium.com/products/teralynx/
https://innovium.com/2018/03/20/innovium-releases-industrys-most-advanced-switch-software-platform-for-high-performance-data-center-networking-2-2-2-2-2-2/

(Most of the new Cisco switches are powered by the Teralynx silicon, as Cisco seems to be late to this game with its own development.)

So back to your question - iSCSI is not the future! NVMe and its variants are the way to go, and these new Ethernet switching products have this in focus. Due to the performance demands of NVMe, high performance and low latency networking is required, and Ethernet-based RDMA - RoCE, RoCEv2 or iWARP - are the leading choices.

-frank-

P.S. My Xmas wishlist to the IBM Spectrum Scale development team would be a "2019 HighSpeed Ethernet Networking optimization for Spectrum Scale" to make use of all these new things and options :-)

Frank Kraemer
IBM Consulting IT Specialist / Client Technical Architect
Am Weiher 24, 65451 Kelsterbach, Germany
mailto:kraemerf at de.ibm.com
Mobile +49171-3043699
IBM Germany
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From janfrode at tanso.net Sun Dec 16 13:45:47 2018
From: janfrode at tanso.net (Jan-Frode Myklebust)
Date: Sun, 16 Dec 2018 14:45:47 +0100
Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? -
In-Reply-To:
References:
Message-ID:

I have been running GPFS over iSCSI, and know of customers who are also. Probably not in the most demanding environments, but from my experience iSCSI works perfectly fine as long as you have a stable network. Having a dedicated (simple) storage network for iSCSI is probably a good idea (just like for FC), otherwise iSCSI or GPFS is going to look bad when your network admins cause problems on the shared network.

-jf

søn. 16. des. 2018 kl. 12:59 skrev Frank Kraemer :

> Kevin,
>
> Ethernet networking of today is changing very fast as the driving forces are the "Hyperscale" datacenters. This big innovation is changing the world and is happening right now. You must understand the conversation by breaking down the differences between ASICs, FPGAs, and NPUs in modern Ethernet networking.
> > 1) Mellanox has a very good answer here based on the Spectrum-2 chip > http://www.mellanox.com/page/press_release_item?id=1933 > > 2) Broadcom's answer to this is the 12.8 Tb/s StrataXGS Tomahawk 3 > Ethernet Switch Series > > https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm56980-series > > https://globenewswire.com/news-release/2018/10/24/1626188/0/en/Broadcom-Achieves-Mass-Production-on-Industry-Leading-12-8-Tbps-Tomahawk-3-Ethernet-Switch-Family.html > > 3) Barefoots Tofinio2 is another valid answer to this problem as it's > programmable with the P4 language (important for Hyperscale Datacenters) > https://www.barefootnetworks.com/ > > The P4 language itself is open source. There?s details at p4.org, or you > can download code at GitHub: https://github.com/p4lang/ > > 4) The last newcomer to this party comes from Innovium named Teralynx > https://innovium.com/products/teralynx/ > > https://innovium.com/2018/03/20/innovium-releases-industrys-most-advanced-switch-software-platform-for-high-performance-data-center-networking-2-2-2-2-2-2/ > > (Most of the new Cisco switches are powered by the Teralynx silicon, as > Cisco seems to be late to this game with it's own development.) > > So back your question - iSCSI is not the future! NVMe and it's variants is > the way to go and these new ethernet swichting products does have this in > focus. > Due to the performance demands of NVMe, high performance and low latency > networking is required and Ethernet based RDMA ? RoCE, RoCEv2 or iWARP are > the leading choices. > > -frank- > > P.S. My Xmas wishlist to the IBM Spectrum Scale development team would be > a "2019 HighSpeed Ethernet Networking optimization for Spectrum Scale" to > make use of all these new things and options :-) > > Frank Kraemer > IBM Consulting IT Specialist / Client Technical Architect > Am Weiher 24, 65451 Kelsterbach, Germany > mailto:kraemerf at de.ibm.com > Mobile +49171-3043699 > IBM Germany > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Paul.Sanchez at deshaw.com Sun Dec 16 17:25:13 2018 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Sun, 16 Dec 2018 17:25:13 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? - In-Reply-To: References: Message-ID: Using iSCSI with Spectrum Scale is definitely do-able. As with running Scale in general, your networking needs to be very solid. For iSCSI the best practice I?m aware of is the dedicated/simple approach described by JF below: one subnet per switch (failure domain), nothing fancy like VRRP/HSRP/STP, and let multipathd do its job at ensuring that the available paths are the ones being used. We have also had some good experiences using routed iSCSI (which fits the rackscale/hyperscale style deployment model too, but this implies that you have a good QoS plan to assure that markings are correct and any link which can become congested can?t completely starve the dedicated queue you should be using for iSCSI. It?s also a good practice for the other TCP traffic in your non-iSCSI queue to use ECN in order to keep switch buffer utilization low. (As of today, I haven?t seen any iSCSI arrays which support ECN.) If you?re sharing arrays with multiple clusters/filesystems (i.e. 
not a single workload), then I would also recommend using iSCSI arrays which support per-volume/volume-group QOS limits to avoid noisy-neighbor problems in the iSCSI realm. As of today, there are even 100GbE capable all-flash solutions available which work well with Scale. Lastly, I?d say that iSCSI might not be the future? but NVMeOF hasn?t exactly given us many products ready to be the present. Most of the early offerings in this space are under-featured, over-priced, inflexible, proprietary, or fragile. We are successfully using non-standards based NVMe solutions today with Scale, but they have much more stringent and sensitive networking requirements (e.g. non-routed dedicated networking with PFC for RoCE) in order to provide reliable performance. So far, we?ve found these early offerings best-suited for single-workload use cases. I do expect this to continue to develop and improve on price, features, reliability/fragility. Thx Paul From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Jan-Frode Myklebust Sent: Sunday, December 16, 2018 8:46 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Anybody running GPFS over iSCSI? - I have been running GPFS over iSCSI, and know of customers who are also. Probably not in the most demanding environments, but from my experience iSCSI works perfectly fine as long as you have a stable network. Having a dedicated (simple) storage network for iSCSI is probably a good idea (just like for FC), otherwise iSCSI or GPFS is going to look bad when your network admins cause problems on the shared network. -jf s?n. 16. des. 2018 kl. 12:59 skrev Frank Kraemer >: Kevin, Ethernet networking of today is changing very fast as the driving forces are the "Hyperscale" datacenters. This big innovation is changing the world and is happening right now. You must understand the conversation by breaking down the differences between ASICs, FPGAs, and NPUs in modern Ethernet networking. 1) Mellanox has a very good answer here based on the Spectrum-2 chip http://www.mellanox.com/page/press_release_item?id=1933 2) Broadcom's answer to this is the 12.8 Tb/s StrataXGS Tomahawk 3 Ethernet Switch Series https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm56980-series https://globenewswire.com/news-release/2018/10/24/1626188/0/en/Broadcom-Achieves-Mass-Production-on-Industry-Leading-12-8-Tbps-Tomahawk-3-Ethernet-Switch-Family.html 3) Barefoots Tofinio2 is another valid answer to this problem as it's programmable with the P4 language (important for Hyperscale Datacenters) https://www.barefootnetworks.com/ The P4 language itself is open source. There?s details at p4.org, or you can download code at GitHub: https://github.com/p4lang/ 4) The last newcomer to this party comes from Innovium named Teralynx https://innovium.com/products/teralynx/ https://innovium.com/2018/03/20/innovium-releases-industrys-most-advanced-switch-software-platform-for-high-performance-data-center-networking-2-2-2-2-2-2/ (Most of the new Cisco switches are powered by the Teralynx silicon, as Cisco seems to be late to this game with it's own development.) So back your question - iSCSI is not the future! NVMe and it's variants is the way to go and these new ethernet swichting products does have this in focus. Due to the performance demands of NVMe, high performance and low latency networking is required and Ethernet based RDMA ? RoCE, RoCEv2 or iWARP are the leading choices. -frank- P.S. 
My Xmas wishlist to the IBM Spectrum Scale development team would be a "2019 HighSpeed Ethernet Networking optimization for Spectrum Scale" to make use of all these new things and options :-) Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Am Weiher 24, 65451 Kelsterbach, Germany mailto:kraemerf at de.ibm.com Mobile +49171-3043699 IBM Germany _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Mon Dec 17 00:21:57 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 00:21:57 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: On 13/12/2018 20:54, Buterbaugh, Kevin L wrote: [SNIP] > > Two things that I am already aware of are: 1) use jumbo frames, and 2) > run iSCSI over its own private network. Other things I should be aware > of?!? > Yes, don't do it. Really do not do it unless you have datacenter Ethernet switches and adapters. Those are the ones required for FCoE. Basically unless you have per channel pause on your Ethernet fabric then performance will at some point all go to shit. So what happens is your NSD server makes a whole bunch of requests to read blocks off the storage array. Requests are small, the response is not. The response can overwhelm the Ethernet channel, at which point performance falls through the floor. Now you might be lucky not to see this, especially if you have, say, 10Gbps links from the storage and 40Gbps links to the NSD servers, but you are taking a gamble. Also the more storage arrays you have the more likely you are to see the problem. To fix this you have two options. The first is datacenter Ethernet with per channel pause. This option is expensive, probably in the same ball park as fibre channel. At least it was last time I looked, though this was some time ago now. The second option is dedicated links between the storage array and the NSD server. That is, the cable goes directly between the storage array and the NSD server with no switches involved. This option is a maintenance nightmare. At the site where I did this, we had to go with option two because I needed to make it work. We ended up ripping it all out and replacing it with FC. Personally I would see what price you can get DSS storage for, or use SAS arrays. Note iSCSI can in theory work; it's just the issue with GPFS scattering stuff to the winds over multiple storage arrays, so your Ethernet channel gets swamped and standard Ethernet pauses all the upstream traffic. The vast majority of iSCSI use cases don't see this effect. There is a reason that, to run FC over Ethernet, they had to make Ethernet lossless. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From janfrode at tanso.net Mon Dec 17 07:50:01 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Mon, 17 Dec 2018 08:50:01 +0100 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: I'd be curious to hear if all these arguments against iSCSI shouldn't also apply to the NSD protocol over TCP/IP? -jf man. 17. des. 2018 kl.
01:22 skrev Jonathan Buzzard < jonathan.buzzard at strath.ac.uk>: > On 13/12/2018 20:54, Buterbaugh, Kevin L wrote: > > [SNIP] > > > > > Two things that I am already aware of are: 1) use jumbo frames, and 2) > > run iSCSI over it?s own private network. Other things I should be aware > > of?!? > > > > Yes, don't do it. Really do not do it unless you have datacenter > Ethernet switches and adapters. Those are the ones required for FCoE. > Basically unless you have per channel pause on your Ethernet fabric then > performance will at some point all go to shit. > > So what happens is your NSD makes a whole bunch of requests to read > blocks off the storage array. Requests are small, response is not. The > response can overwhelms the Ethernet channel at which point performance > falls through the floor. Now you might be lucky not to see this, > especially if you have say have 10Gbps links from the storage and 40Gbps > links to the NSD servers, but you are taking a gamble. Also the more > storage arrays you have the more likely you are to see the problem. > > To fix this you have two options. The first is datacenter Ethernet with > per channel pause. This option is expensive, probably in the same ball > park as fibre channel. At least it was last time I looked, though this > was some time ago now. > > The second option is dedicated links between the storage array and the > NSD server. That is the cable goes directly between the storage array > and the NSD server with no switches involved. This option is a > maintenance nightmare. > > At he site where I did this, we had to go option two because I need to > make it work, We ended up ripping it all out are replacing with FC. > > Personally I would see what price you can get DSS storage for, or use > SAS arrays. > > Note iSCSI can in theory work, it's just the issue with GPFS scattering > stuff to the winds over multiple storage arrays so your ethernet channel > gets swamped and standard ethernet pauses all the upstream traffic. The > vast majority of iSCSI use cases don't see this effect. > > There is a reason that to run FC over ethernet they had to turn ethernet > lossless. > > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Mon Dec 17 08:50:54 2018 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Mon, 17 Dec 2018 08:50:54 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: References: , <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Mon Dec 17 10:22:26 2018 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Mon, 17 Dec 2018 11:22:26 +0100 Subject: [gpfsug-discuss] Filesystem access issues via CES NFS In-Reply-To: References: <717f49aade0b439eb1b99fc620a21cac@maxiv.lu.se> <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se> Message-ID: <57c7a736-0774-2701-febc-3cdc57a50d86@science-computing.de> On 14.12.2018 15:45, Jeno Cram wrote: > Are you using Extended attributes on the directories in question? No. What's the background of your question? 
Kind regards, Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From S.J.Thompson at bham.ac.uk Mon Dec 17 15:07:39 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Mon, 17 Dec 2018 15:07:39 +0000 Subject: [gpfsug-discuss] GPFS API Message-ID: Hi, This is all probably perfectly clear to someone with the GPFS source code but ... we're looking at writing some code using the API documented at: https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.1/com.ibm.spectrum.scale.v5r01.doc/pdf/scale_cpr.pdf Specifically the gpfs_getacl() function: in the docs you pass acl, and the notes say "Pointer to a buffer mapped by the structure gpfs_opaque_acl_t or gpfs_acl_t, depending on the value of flags. The first four bytes of the buffer must contain its total size.". Reading the docs for gpfs_opaque_acl_t, this is a struct of which the first element is an int. Is this the same 4 bytes referred to above as containing the size, and is this the size of the struct, or of the acl_var_data entry? It strikes me it should probably be the length of acl_var_data, but it is not entirely clear? Thanks Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Mon Dec 17 15:47:18 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Mon, 17 Dec 2018 10:47:18 -0500 Subject: [gpfsug-discuss] GPFS API In-Reply-To: References: Message-ID: Look in gpfs.h.... I think the comment for acl_buffer_len is clear enough! /* Mapping of buffer for gpfs_getacl, gpfs_putacl. */ typedef struct gpfs_opaque_acl { int acl_buffer_len; /* INPUT: Total size of buffer (including this field). OUTPUT: Actual size of the ACL information. */ unsigned short acl_version; /* INPUT: Set to zero. OUTPUT: Current version of the returned ACL. */ unsigned char acl_type; /* INPUT: Type of ACL: access (1) or default (2). */ char acl_var_data[1]; /* OUTPUT: Remainder of the ACL information. */ } gpfs_opaque_acl_t; From: Simon Thompson To: "gpfsug-discuss at spectrumscale.org" Date: 12/17/2018 10:13 AM Subject: [gpfsug-discuss] GPFS API Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, This is all probably perfectly clear to someone with the GPFS source code but ... we're looking at writing some code using the API documented at: https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.1/com.ibm.spectrum.scale.v5r01.doc/pdf/scale_cpr.pdf Specifically the gpfs_getacl() function: in the docs you pass acl, and the notes say "Pointer to a buffer mapped by the structure gpfs_opaque_acl_t or gpfs_acl_t, depending on the value of flags. The first four bytes of the buffer must contain its total size.". Reading the docs for gpfs_opaque_acl_t, this is a struct of which the first element is an int. Is this the same 4 bytes referred to above as containing the size, and is this the size of the struct, or of the acl_var_data entry? It strikes me it should probably be the length of acl_var_data, but it is not entirely clear?
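To make the header comment concrete, here is a minimal, untested sketch of the call sequence (compile and link with -lgpfs). The path is a placeholder, and it assumes that a flags value of 0 selects the gpfs_opaque_acl_t mapping (GPFS_GETACL_STRUCT selects gpfs_acl_t) and that an ENOSPC failure reports the size actually needed back in acl_buffer_len, mirroring the retry pattern described for the gpfs_acl_t variant later in this thread. The key point: acl_buffer_len is the size of the whole buffer, header fields included, not just acl_var_data.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <gpfs.h>

int main(void)
{
    const char *path = "/gpfs/somefile";   /* placeholder path */
    int size = 1024;                       /* initial guess for the whole buffer */
    gpfs_opaque_acl_t *acl = malloc(size); /* sketch only: no NULL checks */

    acl->acl_buffer_len = size;  /* first four bytes = total buffer size, including this field */
    acl->acl_version = 0;        /* per the header comment: set to zero */
    acl->acl_type = 1;           /* 1 = access ACL, 2 = default ACL (see comment above) */

    /* Assumption: flags 0 selects the opaque mapping. Error handling trimmed. */
    if (gpfs_getacl(path, 0, acl) != 0 && errno == ENOSPC) {
        size = acl->acl_buffer_len;        /* assumed to now hold the size GPFS needs */
        acl = realloc(acl, size);
        acl->acl_buffer_len = size;
        acl->acl_version = 0;
        acl->acl_type = 1;
        gpfs_getacl(path, 0, acl);
    }
    printf("ACL blob: %d bytes\n", acl->acl_buffer_len);
    free(acl);
    return 0;
}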
Thanks Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Mon Dec 17 16:22:35 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 16:22:35 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: <7d438bada357989f56eeb6f347ba2ddb5661679a.camel@strath.ac.uk> On Mon, 2018-12-17 at 08:50 +0100, Jan-Frode Myklebust wrote: > I?d be curious to hear if all these arguments against iSCSI shouldn?t > also apply to NSD protocol over TCP/IP? > They don't is the simple answer. Either theoretically or practically. It won't necessarily be a problem for iSCSI either. It is the potential for wild over subscription of the links with the total naivety of the block based iSCSI protocol that is the issue. I suspect that with NSD traffic it is self limiting. That said looking about higher end switches seem to do DCE now, though FCoE seems to have gone nowhere. Anyway as I said if you want lower cost use SAS. Even if you don't want to do ESS/SSS you can architecture something very similar using SAS based arrays from your favourite vendor, and just skip the native RAID bit. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From jonathan.buzzard at strath.ac.uk Mon Dec 17 16:46:32 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 16:46:32 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: References: Message-ID: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> On Mon, 2018-12-17 at 10:47 -0500, Marc A Kaplan wrote: > Look in gps.h.... I think the comment for acl_buffer_len is clear > enough! > I guess everyone does not read header files by default looking for comments on the structure ;-) One thing to watch out for is to check the return from gpfs_getacl, and if you get an ENOSPC error then your buffer is not big enough to hold the ACL and the first four bytes are set to the size you need. SO you need to do something like the following to safely get the ACL. acl = malloc(1024); acl->acl_len = 1024; acl->acl_level = 0; acl->acl_version = 0; acl->acl_type = 0; acl->acl_nace = 0; err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl); if ((err==-1) && (errno=ENOSPC)) { int acl_size = acl->acl_len; free(acl); acl = malloc(acl_size); acl->acl_len = acl_size; acl->acl_level = 0; acl->acl_version = 0; acl->acl_type = 0; acl->acl_nace = 0; err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl); } Noting that unless you have some monster ACL 1KB is going to be more than enough. It is less than 16 bytes per ACL. What is not clear is the following in the gpfs_acl struct. v4Level1_t v4Level1; /* when GPFS_ACL_LEVEL_V4FLAGS */ What's that about, because there is zero documentation on it. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG From S.J.Thompson at bham.ac.uk Mon Dec 17 17:35:31 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Mon, 17 Dec 2018 17:35:31 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> References: , <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> Message-ID: Indeed. Actually this was exactly what we were trying to work out. We'd set the buf size to 0 hoping it would tell us how much we need, but we kept getting EINVAL back - which the docs states is invalid path, but actually it can be invalid bufsize as well apparently (the header file comments are different again to the docs). Anyway, we're looking at patching mpifileutils to support GPFS ACLs to help with migration of between old and new file-systems. I was actually using the opaque call on the assumption that it would be a binary blob of data I could poke to the new file. (I was too scared to use the attr functions as that copies DMAPI info as well and I'm not sure I want to "copy" my ILM files without recall!). Its not clear what DEFAULT and ACCESS ACLs are. I'm guessing something to do with inheritance maybe? Thanks Marc, Jonathan for the pointers! Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Jonathan Buzzard [jonathan.buzzard at strath.ac.uk] Sent: 17 December 2018 16:46 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS API On Mon, 2018-12-17 at 10:47 -0500, Marc A Kaplan wrote: > Look in gps.h.... I think the comment for acl_buffer_len is clear > enough! > I guess everyone does not read header files by default looking for comments on the structure ;-) One thing to watch out for is to check the return from gpfs_getacl, and if you get an ENOSPC error then your buffer is not big enough to hold the ACL and the first four bytes are set to the size you need. SO you need to do something like the following to safely get the ACL. acl = malloc(1024); acl->acl_len = 1024; acl->acl_level = 0; acl->acl_version = 0; acl->acl_type = 0; acl->acl_nace = 0; err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl); if ((err==-1) && (errno=ENOSPC)) { int acl_size = acl->acl_len; free(acl); acl = malloc(acl_size); acl->acl_len = acl_size; acl->acl_level = 0; acl->acl_version = 0; acl->acl_type = 0; acl->acl_nace = 0; err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl); } Noting that unless you have some monster ACL 1KB is going to be more than enough. It is less than 16 bytes per ACL. What is not clear is the following in the gpfs_acl struct. v4Level1_t v4Level1; /* when GPFS_ACL_LEVEL_V4FLAGS */ What's that about, because there is zero documentation on it. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathan.buzzard at strath.ac.uk Mon Dec 17 22:44:14 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 22:44:14 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: References: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> Message-ID: <69e9c4fa-4bd6-de23-add9-a0572d6d093a@strath.ac.uk> On 17/12/2018 17:35, Simon Thompson wrote: > Indeed. > > Actually this was exactly what we were trying to work out. 
We'd set > the buf size to 0 hoping it would tell us how much we need, but we > kept getting EINVAL back - which the docs states is invalid path, but > actually it can be invalid bufsize as well apparently (the header > file comments are different again to the docs). > Well duh, if you pass in a zero sized buffer, then there is no space to pass the size back because it's in the first four bytes of the returned buffer. Further it needs to be big enough to hold the main fields otherwise did you request POSIX or NFSv4 ACL's etc. > Anyway, we're looking at patching mpifileutils to support GPFS ACLs > to help wi th migration of between old and new file-systems. > > I was actually using the opaque call on the assumption that it would > be a binary blob of data I could poke to the new file. (I was too > scared to use the attr functions as that copies DMAPI info as well > and I'm not sure I want to "copy" my ILM files without recall!). > > Its not clear what DEFAULT and ACCESS ACLs are. I'm guessing > something to do with inheritance maybe? > A default ACL is the default ACL that would be given to the file. Access ACL's are all the other ones. That is pretty basic ACL stuff. I would really like some info on the v4Level1_t stuff, as I ma reluctant to release my mmsetfacl code until I do. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From jonathan.buzzard at strath.ac.uk Mon Dec 17 23:04:15 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 23:04:15 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: References: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> Message-ID: <6e34f47e-4fc0-9726-0c1a-38720f5c26db@strath.ac.uk> On 17/12/2018 17:35, Simon Thompson wrote: > Indeed. > > Actually this was exactly what we were trying to work out. We'd set > the buf size to 0 hoping it would tell us how much we need, but we > kept getting EINVAL back - which the docs states is invalid path, but > actually it can be invalid bufsize as well apparently (the header > file comments are different again to the docs). Forgot to say performance wise calling with effectively a zero buffer size and then calling again with precisely the right buffer size is not sensible plan. Basically you end up having to do two API calls instead of one to save less than 1KB of RAM in 99.999% of cases. Let's face it any machine running GPFS is going to have GB of RAM. That said I am not sure even allocating 1KB of RAM is sensible either. One suspects allocating less than a whole page might have performance implications. What I do know from testing is that, two API calls when iterating over millions of files has a significant impact on run time over just allocating a bunch of memory up front on only making the one call. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From S.J.Thompson at bham.ac.uk Tue Dec 18 07:05:18 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Tue, 18 Dec 2018 07:05:18 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: <69e9c4fa-4bd6-de23-add9-a0572d6d093a@strath.ac.uk> References: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> , <69e9c4fa-4bd6-de23-add9-a0572d6d093a@strath.ac.uk> Message-ID: No, it comes back as the first four bytes of the structure in the size field. 
So if we set it the data field to zero, then it still has space in the size filed to return the required size. That was sort of the point.... The size value needs to be the size of the struct rather than the data field within it. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Jonathan Buzzard [jonathan.buzzard at strath.ac.uk] Sent: 17 December 2018 22:44 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GPFS API On 17/12/2018 17:35, Simon Thompson wrote: > Indeed. > > Actually this was exactly what we were trying to work out. We'd set > the buf size to 0 hoping it would tell us how much we need, but we > kept getting EINVAL back - which the docs states is invalid path, but > actually it can be invalid bufsize as well apparently (the header > file comments are different again to the docs). > Well duh, if you pass in a zero sized buffer, then there is no space to pass the size back because it's in the first four bytes of the returned buffer. Further it needs to be big enough to hold the main fields otherwise did you request POSIX or NFSv4 ACL's etc. > Anyway, we're looking at patching mpifileutils to support GPFS ACLs > to help wi th migration of between old and new file-systems. > > I was actually using the opaque call on the assumption that it would > be a binary blob of data I could poke to the new file. (I was too > scared to use the attr functions as that copies DMAPI info as well > and I'm not sure I want to "copy" my ILM files without recall!). > > Its not clear what DEFAULT and ACCESS ACLs are. I'm guessing > something to do with inheritance maybe? > A default ACL is the default ACL that would be given to the file. Access ACL's are all the other ones. That is pretty basic ACL stuff. I would really like some info on the v4Level1_t stuff, as I ma reluctant to release my mmsetfacl code until I do. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Kevin.Buterbaugh at Vanderbilt.Edu Mon Dec 17 22:01:41 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Mon, 17 Dec 2018 22:01:41 +0000 Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy Message-ID: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> Hi All, As those of you who suffered thru my talk at SC18 already know, we?re really short on space on one of our GPFS filesystems as the output of mmdf piped to grep pool shows: Disks in storage pool: system (Maximum disk size allowed is 24 TB) (pool total) 4.318T 1.078T ( 25%) 79.47G ( 2%) Disks in storage pool: data (Maximum disk size allowed is 262 TB) (pool total) 494.7T 38.15T ( 8%) 4.136T ( 1%) Disks in storage pool: capacity (Maximum disk size allowed is 519 TB) (pool total) 640.2T 14.56T ( 2%) 716.4G ( 0%) The system pool is metadata only. The data pool is the default pool. The capacity pool is where files with an atime (yes, atime) > 90 days get migrated. The capacity pool is comprised of NSDs that are 8+2P RAID 6 LUNs of 8 TB drives, so roughly 58.2 TB usable space per NSD. 
We have the new storage we purchased, but that?s still being tested and held in reserve for after the first of the year when we create a new GPFS 5 formatted filesystem and start migrating everything to the new filesystem. In the meantime, we have also purchased a 60-bay JBOD and 30 x 12 TB drives and will be hooking it up to one of our existing storage arrays on Wednesday. My plan is to create another 3 8+2P RAID 6 LUNs and present those to GPFS as NSDs. They will be about 88 TB usable space each (because ? beginning rant ? a 12 TB drive is < 11 TB is size ? and don?t get me started on so-called ?4K? TV?s ? end rant). A very wise man who used to work at IBM but now hangs out with people in red polos () once told me that it?s OK to mix NSDs of slightly different sizes in the same pool, but you don?t want to put NSDs of vastly different sizes in the same pool because the smaller ones will fill first and then the larger ones will have to take all the I/O. I consider 58 TB and 88 TB to be pretty significantly different and am therefore planning on creating yet another pool called ?oc? (over capacity if a user asks, old crap internally!) and migrating files with an atime greater than, say, 1 year to that pool. But since ALL of the files in the capacity pool haven?t even been looked at in at least 90 days already, does it really matter? I.e. should I just add the NSDs to the capacity pool and be done with it? If it?s a good idea to create another pool, then I have a question about mmapplypolicy and migrations. I believe I understand how things work, but after spending over an hour looking at the documentation I cannot find anything that explicitly confirms my understanding ? so if I have another pool called oc that?s ~264 TB in size and I write a policy file that looks like: define(access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) RULE 'ReallyOldStuff' MIGRATE FROM POOL 'capacity' TO POOL 'oc' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 365) AND (KB_ALLOCATED > 3584)) RULE 'OldStuff' MIGRATE FROM POOL 'data' TO POOL 'capacity' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 90) AND (KB_ALLOCATED > 3584)) Keeping in mind that my capacity pool is already 98% full, is mmapplypolicy smart enough to calculate how much space it?s going to free up in the capacity pool by the ?ReallyOldStuff? rule and therefore be able to potentially also move a ton of stuff from the data pool to the capacity pool via the 2nd rule with just one invocation of mmapplypolicy? That?s what I expect that it will do. I?m hoping I don?t have to run the mmapplypolicy twice ? the first to move stuff from capacity to oc and then a second time for it to realize, oh, I?ve got a much of space free in the capacity pool now. Thanks in advance... Kevin P.S. In case you?re scratching your head over the fact that we have files that people haven?t even looked at for months and months (more than a year in some cases) sitting out there ? we sell quota in 1 TB increments ? once they?ve bought the quota, it?s theirs. As long as they?re paying us the monthly fee if they want to keep files relating to research they did during the George Bush Presidency out there ? and I mean Bush 41, not Bush 43 ?.then that?s their choice. We do not purge files. ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From janfrode at tanso.net Tue Dec 18 09:13:05 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 18 Dec 2018 10:13:05 +0100 Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy In-Reply-To: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> References: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> Message-ID: I don't think it's going to figure this out automatically between the two rules.. I believe you will need to do something like the below (untested, and definitely not perfect!!) rebalancing: define( weight_junkedness, (CASE /* Create 3 classes of files */ /* ReallyOldStuff */ WHEN ((access_age > 365) AND (KB_ALLOCATED > 3584)) THEN 1000 /* OldStuff */ WHEN ((access_age > 90) AND (KB_ALLOCATED > 3584)) THEN 100 /* everything else */ ELSE 0 END) ) RULE 'defineTiers' GROUP POOL 'TIERS' IS 'data' LIMIT(98) THEN 'capacity' LIMIT(98) THEN 'oc' RULE 'Rebalance' MIGRATE FROM POOL 'TIERS' TO POOL 'TIERS' WEIGHT(weight_junkedness) Based on /usr/lpp/mmfs/samples/ilm/mmpolicy-fileheat.sample. -jf On Tue, Dec 18, 2018 at 9:10 AM Buterbaugh, Kevin L < Kevin.Buterbaugh at vanderbilt.edu> wrote: > Hi All, > > As those of you who suffered thru my talk at SC18 already know, we?re > really short on space on one of our GPFS filesystems as the output of mmdf > piped to grep pool shows: > > Disks in storage pool: system (Maximum disk size allowed is 24 TB) > (pool total) 4.318T 1.078T ( 25%) > 79.47G ( 2%) > Disks in storage pool: data (Maximum disk size allowed is 262 TB) > (pool total) 494.7T 38.15T ( 8%) > 4.136T ( 1%) > Disks in storage pool: capacity (Maximum disk size allowed is 519 TB) > (pool total) 640.2T 14.56T ( 2%) > 716.4G ( 0%) > > The system pool is metadata only. The data pool is the default pool. The > capacity pool is where files with an atime (yes, atime) > 90 days get > migrated. The capacity pool is comprised of NSDs that are 8+2P RAID 6 LUNs > of 8 TB drives, so roughly 58.2 TB usable space per NSD. > > We have the new storage we purchased, but that?s still being tested and > held in reserve for after the first of the year when we create a new GPFS 5 > formatted filesystem and start migrating everything to the new filesystem. > > In the meantime, we have also purchased a 60-bay JBOD and 30 x 12 TB > drives and will be hooking it up to one of our existing storage arrays on > Wednesday. My plan is to create another 3 8+2P RAID 6 LUNs and present > those to GPFS as NSDs. They will be about 88 TB usable space each (because > ? beginning rant ? a 12 TB drive is < 11 TB is size ? and don?t get me > started on so-called ?4K? TV?s ? end rant). > > A very wise man who used to work at IBM but now hangs out with people in > red polos () once told me that it?s OK to mix NSDs of slightly > different sizes in the same pool, but you don?t want to put NSDs of vastly > different sizes in the same pool because the smaller ones will fill first > and then the larger ones will have to take all the I/O. I consider 58 TB > and 88 TB to be pretty significantly different and am therefore planning on > creating yet another pool called ?oc? (over capacity if a user asks, old > crap internally!) and migrating files with an atime greater than, say, 1 > year to that pool. But since ALL of the files in the capacity pool haven?t > even been looked at in at least 90 days already, does it really matter? > I.e. should I just add the NSDs to the capacity pool and be done with it? 
> > If it?s a good idea to create another pool, then I have a question about > mmapplypolicy and migrations. I believe I understand how things work, but > after spending over an hour looking at the documentation I cannot find > anything that explicitly confirms my understanding ? so if I have another > pool called oc that?s ~264 TB in size and I write a policy file that looks > like: > > define(access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) > > RULE 'ReallyOldStuff' > MIGRATE FROM POOL 'capacity' > TO POOL 'oc' > LIMIT(98) > SIZE(KB_ALLOCATED/NLINK) > WHERE ((access_age > 365) AND (KB_ALLOCATED > 3584)) > > RULE 'OldStuff' > MIGRATE FROM POOL 'data' > TO POOL 'capacity' > LIMIT(98) > SIZE(KB_ALLOCATED/NLINK) > WHERE ((access_age > 90) AND (KB_ALLOCATED > 3584)) > > Keeping in mind that my capacity pool is already 98% full, is > mmapplypolicy smart enough to calculate how much space it?s going to free > up in the capacity pool by the ?ReallyOldStuff? rule and therefore be able > to potentially also move a ton of stuff from the data pool to the capacity > pool via the 2nd rule with just one invocation of mmapplypolicy? That?s > what I expect that it will do. I?m hoping I don?t have to run the > mmapplypolicy twice ? the first to move stuff from capacity to oc and then > a second time for it to realize, oh, I?ve got a much of space free in the > capacity pool now. > > Thanks in advance... > > Kevin > > P.S. In case you?re scratching your head over the fact that we have files > that people haven?t even looked at for months and months (more than a year > in some cases) sitting out there ? we sell quota in 1 TB increments ? once > they?ve bought the quota, it?s theirs. As long as they?re paying us the > monthly fee if they want to keep files relating to research they did during > the George Bush Presidency out there ? and I mean Bush 41, not Bush 43 > ?.then that?s their choice. We do not purge files. > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and > Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Wed Dec 19 08:49:42 2018 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Wed, 19 Dec 2018 08:49:42 +0000 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Message-ID: <83A6EEB0EC738F459A39439733AE8045267C22D4@MBX114.d.ethz.ch> Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? 
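For context, this is the sort of one-line monitoring check the old output made easy. A sketch only: the grep string is taken from the 4.2.3 output above, the mmfsadm path is the usual GPFS install location, and the Nagios-style exit codes are just an illustration.

#!/bin/bash
# Illustrative sketch: wrap the 4.2.3-style status line in a monitoring check.
# On 5.0.2 this subcommand only prints usage, which is exactly the problem described above.
out=$(/usr/lpp/mmfs/bin/mmfsadm test verbs status 2>/dev/null)
if echo "$out" | grep -q "VERBS RDMA status: started"; then
    echo "OK - $out"
    exit 0
else
    echo "CRITICAL - VERBS RDMA not reported as started"
    exit 2
fi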
Thanks, Alvise -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Wed Dec 19 09:29:08 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Wed, 19 Dec 2018 09:29:08 +0000 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Message-ID: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk> Hmm interesting ? # mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise -------------- next part -------------- An HTML attachment was scrubbed... URL: From TOMP at il.ibm.com Wed Dec 19 09:47:03 2018 From: TOMP at il.ibm.com (Tomer Perry) Date: Wed, 19 Dec 2018 11:47:03 +0200 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk> References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk> Message-ID: Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hmm interesting ? 
# mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Wed Dec 19 09:51:58 2018 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Wed, 19 Dec 2018 09:51:58 +0000 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, Message-ID: <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch> Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? thanks, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hmm interesting ? 
# mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From TOMP at il.ibm.com Wed Dec 19 10:05:04 2018 From: TOMP at il.ibm.com (Tomer Perry) Date: Wed, 19 Dec 2018 12:05:04 +0200 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch> References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch> Message-ID: Changed means it provides some functions/information in a different way. So, I guess the question is what information do you need? ( and "officially" why isn't mmdiag good enough - what is missing. As you probably know, mmfsadm might cause crashes and deadlock from time to time, this is why we're trying to provide "safe ways" to get the required information). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 11:53 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? thanks, Alvise From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. 
And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hmm interesting ? # mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Wed Dec 19 10:22:32 2018 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Wed, 19 Dec 2018 10:22:32 +0000 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch>, Message-ID: <83A6EEB0EC738F459A39439733AE8045267C2354@MBX114.d.ethz.ch> I'd like just one line that says "RDMA ON" or "RMDA OFF" (as was reported more or less by mmfsadm). I can get info about RMDA using mmdiag, but is much more output to parse (e.g. by a nagios script or just a human eye). Ok, never mind, I understand your explanation and it is not definitely a big issue... it was, above all, a curiosity to understand if the command was modified to get the same behavior as before, but in a different way. 
Cheers, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 11:05 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Changed means it provides some functions/information in a different way. So, I guess the question is what information do you need? ( and "officially" why isn't mmdiag good enough - what is missing. As you probably know, mmfsadm might cause crashes and deadlock from time to time, this is why we're trying to provide "safe ways" to get the required information). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 11:53 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? thanks, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hmm interesting ? # mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". 
Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From TOMP at il.ibm.com Wed Dec 19 10:35:48 2018 From: TOMP at il.ibm.com (Tomer Perry) Date: Wed, 19 Dec 2018 12:35:48 +0200 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: <83A6EEB0EC738F459A39439733AE8045267C2354@MBX114.d.ethz.ch> References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch>, <83A6EEB0EC738F459A39439733AE8045267C2354@MBX114.d.ethz.ch> Message-ID: Hi, So, with all the usual disclaimers... mmfsadm saferdump verbs is not enough? or even mmfsadm saferdump verbs | grep VerbsRdmaStarted Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 12:22 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org I'd like just one line that says "RDMA ON" or "RMDA OFF" (as was reported more or less by mmfsadm). I can get info about RMDA using mmdiag, but is much more output to parse (e.g. by a nagios script or just a human eye). Ok, never mind, I understand your explanation and it is not definitely a big issue... it was, above all, a curiosity to understand if the command was modified to get the same behavior as before, but in a different way. Cheers, Alvise From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 11:05 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Changed means it provides some functions/information in a different way. So, I guess the question is what information do you need? ( and "officially" why isn't mmdiag good enough - what is missing. As you probably know, mmfsadm might cause crashes and deadlock from time to time, this is why we're trying to provide "safe ways" to get the required information). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 11:53 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? 
thanks, Alvise From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hmm interesting ? # mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Dec 19 19:45:24 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 19 Dec 2018 14:45:24 -0500 Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy In-Reply-To: References: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> Message-ID: Regarding mixing different sized NSDs in the same pool... 
GPFS has gotten somewhat smarter about striping over the years and also offers some options about how blocks are allocated over NSDs.... And then there's mmrestripe and its several options/flavors... You probably do want to segregate NSDs into different pools if the performance varies significantly among them. SO old advice may not apply 100%. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Dec 19 19:13:16 2018 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 19 Dec 2018 19:13:16 +0000 Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy In-Reply-To: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> References: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> Message-ID: Hadn?t seen a response, but here?s one thing that might make your decision easier on this question: ?But since ALL of the files in the capacity pool haven?t even been looked at in at least 90 days already, does it really matter? I.e. should I just add the NSDs to the capacity pool and be done with it?? Does the performance matter for accessing files in this capacity pool? If not, then just add it in. If it does, then you?ll need to concern yourself with the performance you?ll get from the NSDs that still have free space to store new data once the smaller NSDs become full. If that?s enough then just add it in. Old data will still be spread across the current storage in the capacity pool, so you?ll get current read performance rates for that data. By creating a new pool, oc, and then migrating data that hasn?t been accessed in over 1 year to it from the capacity pool, you?re freeing up new space to store new data on the capacity pool. This seems to really only be a benefit if the performance of the capacity pool is a lot greater than the oc pool and your users need that performance to satisfy their application workloads. Of course moving data around on a regular basis also has an impact to overall performance during these operations too, but maybe there are times when the system is idle and these operations will not really cause any performance heartburn. I think Marc will have to answer your other question? ;o) Hope that helps! -Bryan From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Buterbaugh, Kevin L Sent: Monday, December 17, 2018 4:02 PM To: gpfsug main discussion list Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy [EXTERNAL EMAIL] Hi All, As those of you who suffered thru my talk at SC18 already know, we?re really short on space on one of our GPFS filesystems as the output of mmdf piped to grep pool shows: Disks in storage pool: system (Maximum disk size allowed is 24 TB) (pool total) 4.318T 1.078T ( 25%) 79.47G ( 2%) Disks in storage pool: data (Maximum disk size allowed is 262 TB) (pool total) 494.7T 38.15T ( 8%) 4.136T ( 1%) Disks in storage pool: capacity (Maximum disk size allowed is 519 TB) (pool total) 640.2T 14.56T ( 2%) 716.4G ( 0%) The system pool is metadata only. The data pool is the default pool. The capacity pool is where files with an atime (yes, atime) > 90 days get migrated. The capacity pool is comprised of NSDs that are 8+2P RAID 6 LUNs of 8 TB drives, so roughly 58.2 TB usable space per NSD. We have the new storage we purchased, but that?s still being tested and held in reserve for after the first of the year when we create a new GPFS 5 formatted filesystem and start migrating everything to the new filesystem. 
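(For anyone double-checking the usable-capacity figures in this thread, they are just the decimal-TB to binary-TiB conversion for the 8 data drives of an 8+2P LUN; the quick sketch below reproduces both the 58.2 TB number above and the "about 88 TB" LUNs described below.)

# Where "roughly 58.2 TB" and "about 88 TB" usable per 8+2P RAID 6 LUN come from:
# 8 data drives of decimal-TB disks, reported in binary TiB (parity excluded).
TB = 10 ** 12
TiB = 2 ** 40

for drive_tb in (8, 12):
    usable = 8 * drive_tb * float(TB) / TiB
    print("%2d TB drives -> %.1f TiB usable per LUN" % (drive_tb, usable))
# 8 TB drives -> 58.2 TiB usable per LUN
# 12 TB drives -> 87.3 TiB usable per LUN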
In the meantime, we have also purchased a 60-bay JBOD and 30 x 12 TB drives and will be hooking it up to one of our existing storage arrays on Wednesday. My plan is to create another 3 8+2P RAID 6 LUNs and present those to GPFS as NSDs. They will be about 88 TB usable space each (because ? beginning rant ? a 12 TB drive is < 11 TB is size ? and don?t get me started on so-called ?4K? TV?s ? end rant). A very wise man who used to work at IBM but now hangs out with people in red polos () once told me that it?s OK to mix NSDs of slightly different sizes in the same pool, but you don?t want to put NSDs of vastly different sizes in the same pool because the smaller ones will fill first and then the larger ones will have to take all the I/O. I consider 58 TB and 88 TB to be pretty significantly different and am therefore planning on creating yet another pool called ?oc? (over capacity if a user asks, old crap internally!) and migrating files with an atime greater than, say, 1 year to that pool. But since ALL of the files in the capacity pool haven?t even been looked at in at least 90 days already, does it really matter? I.e. should I just add the NSDs to the capacity pool and be done with it? If it?s a good idea to create another pool, then I have a question about mmapplypolicy and migrations. I believe I understand how things work, but after spending over an hour looking at the documentation I cannot find anything that explicitly confirms my understanding ? so if I have another pool called oc that?s ~264 TB in size and I write a policy file that looks like: define(access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) RULE 'ReallyOldStuff' MIGRATE FROM POOL 'capacity' TO POOL 'oc' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 365) AND (KB_ALLOCATED > 3584)) RULE 'OldStuff' MIGRATE FROM POOL 'data' TO POOL 'capacity' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 90) AND (KB_ALLOCATED > 3584)) Keeping in mind that my capacity pool is already 98% full, is mmapplypolicy smart enough to calculate how much space it?s going to free up in the capacity pool by the ?ReallyOldStuff? rule and therefore be able to potentially also move a ton of stuff from the data pool to the capacity pool via the 2nd rule with just one invocation of mmapplypolicy? That?s what I expect that it will do. I?m hoping I don?t have to run the mmapplypolicy twice ? the first to move stuff from capacity to oc and then a second time for it to realize, oh, I?ve got a much of space free in the capacity pool now. Thanks in advance... Kevin P.S. In case you?re scratching your head over the fact that we have files that people haven?t even looked at for months and months (more than a year in some cases) sitting out there ? we sell quota in 1 TB increments ? once they?ve bought the quota, it?s theirs. As long as they?re paying us the monthly fee if they want to keep files relating to research they did during the George Bush Presidency out there ? and I mean Bush 41, not Bush 43 ?.then that?s their choice. We do not purge files. ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential, or privileged information and/or personal data. 
If you are not the intended recipient, you are hereby notified that any review, dissemination, or copying of this email is strictly prohibited, and requested to notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request, or solicitation of any kind to buy, sell, subscribe, redeem, or perform any type of transaction of a financial product. Personal data, as defined by applicable data privacy laws, contained in this email may be processed by the Company, and any of its affiliated or related companies, for potential ongoing compliance and/or business-related purposes. You may have rights regarding your personal data; for information on exercising these rights or the Company?s treatment of personal data, please email datarequests at jumptrading.com. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Sun Dec 23 13:29:42 2018 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Sun, 23 Dec 2018 14:29:42 +0100 Subject: [gpfsug-discuss] [Announcement] IBM Storage Enabler for Containers v2.0 has been released for general availability Message-ID: We're happy to announce the release of IBM Storage Enabler for Containers v2.0for general availability on Fix Central. IBM Storage Enabler for Containers v2.0extends IBM support for Kubernetes and IBM Cloud Private orchestrated container environments by supporting IBM Spectrum Scale(formerly IBM GPFS). IBM Storage Enabler for Containers v2.0introduces the following new functionalities IBM Spectrum Scale v5.0+support Support orchestration platforms IBM Cloud Private (ICP) v3.1.1 and Kubernetes v1.12 Support mixed deployment of Fibre Channel and iSCSI in the same cluster Kubernetes Service Accounts for more effective pod authorization procedure Required Components IBM Spectrum Connect v3.6 Installer for IBM Storage Enabler for Containers Fix Central publication link http://www.ibm.com/support/fixcentral/swg/quickorder?parent=Software%20defined%20storage&product=ibm/StorageSoftware/IBM +Storage+Enabler+for +Containers&release=All&platform=All&function=all&source=fc Cheers,Tal Mailto: talsha at il.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From roblogie at au1.ibm.com Thu Dec 27 22:38:29 2018 From: roblogie at au1.ibm.com (Rob Logie) Date: Thu, 27 Dec 2018 22:38:29 +0000 Subject: [gpfsug-discuss] Introduction Message-ID: Hi I am a new member to the list. I am an IBMer (GBS) based in Ballarat Australia. Part of my role is supporting a small Spectrum Scale (GPFS) cluster on behalf of an IBM customer in Australia. Cluster is hosted on Redhat AWS EC2 instances and is accessed via CES SMB shares . Cheers (And happy Holidays !) Regards, Rob Logie IT Specialist IBM A/NZ GBS -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From jdratlif at iu.edu  Thu Dec  6 16:36:35 2018
From: jdratlif at iu.edu (Ratliff, John)
Date: Thu, 6 Dec 2018 16:36:35 +0000
Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan
Message-ID: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu>

We're trying to run a policy scan to get a list of all the files in one of our filesets. There are approximately 600 million inodes in this space. We're running GPFS 3.5. Every time we run the policy scan, the node that is running it ends up crashing. It makes it through a quarter of the inodes before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS logs shows anything. It just notes that the node rebooted.

In the crash logs of all the systems we've tried this on, we see the same line.

<1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8
<1>IP: [] _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0x590 [mmfs26]

Our policy scan rule is pretty simple:

RULE 'list-homedirs' LIST 'list-homedirs'

mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1

Has anyone experienced something like this or have any suggestions on what to do to avoid it?

Thanks.

John Ratliff | Pervasive Technology Institute | UITS | Research Storage - Indiana University | http://pti.iu.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 5670 bytes
Desc: not available
URL: 

From Renar.Grunenberg at huk-coburg.de  Thu Dec  6 17:03:28 2018
From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar)
Date: Thu, 6 Dec 2018 17:03:28 +0000
Subject: [gpfsug-discuss] Spectrum Scale for Windows Domain -User Requirements
Message-ID: <50178605f3f643e8a72c5f7f3a547bc7@SMXRF105.msg.hukrf.de>

Hallo All,
i had a question about the domain-user root account on Windows. We have some requirements to restrict these level of authorization and found no info what is possible to change here. Two questions:
1. It is possible to define a other Domain-Account other than as root for this.
2. If not, is it possible to define a local account as root on Windows-Clients?
Any hints are appreciate.
Thanks Renar

Renar Grunenberg
Abteilung Informatik ?
Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Thu Dec 6 17:15:15 2018 From: stockf at us.ibm.com (Frederick Stock) Date: Thu, 6 Dec 2018 12:15:15 -0500 Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan In-Reply-To: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> References: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> Message-ID: Hopefully you are aware that GPFS 3.5 has been out of service since April 2017 unless you are on extended service. Might be a good time to consider upgrading. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: "Ratliff, John" To: "gpfsug-discuss at spectrumscale.org" Date: 12/06/2018 11:53 AM Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan Sent by: gpfsug-discuss-bounces at spectrumscale.org We?re trying to run a policy scan to get a list of all the files in one of our filesets. There are approximately 600 million inodes in this space. We?re running GPFS 3.5. Every time we run the policy scan, the node that is running it ends up crashing. It makes it through a quarter of the inodes before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS logs shows anything. It just notes that the node rebooted. In the crash logs of all the systems we?ve tried this on, we see the same line. <1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8 <1>IP: [] _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0x590 [mmfs26] Our policy scan rule is pretty simple: RULE 'list-homedirs' LIST 'list-homedirs' mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1 Has anyone experienced something like this or have any suggestions on what to do to avoid it? Thanks. John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? 
Indiana University | http://pti.iu.edu [attachment "smime.p7s" deleted by Frederick Stock/Pittsburgh/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.robinson02 at gmail.com Thu Dec 6 17:41:50 2018 From: matthew.robinson02 at gmail.com (Matthew Robinson) Date: Thu, 6 Dec 2018 12:41:50 -0500 Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan In-Reply-To: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> References: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> Message-ID: call IBM you ass,. submit an SOS report a\nd fuckofff' On Thu, Dec 6, 2018 at 11:46 AM Ratliff, John wrote: > We?re trying to run a policy scan to get a list of all the files in one of > our filesets. There are approximately 600 million inodes in this space. > We?re running GPFS 3.5. Every time we run the policy scan, the node that is > running it ends up crashing. It makes it through a quarter of the inodes > before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS > logs shows anything. It just notes that the node rebooted. > > > > In the crash logs of all the systems we?ve tried this on, we see the same > line. > > > > <1>BUG: unable to handle kernel NULL pointer dereference at > 00000000000000d8 > > <1>IP: [] > _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0x590 > [mmfs26] > > > > Our policy scan rule is pretty simple: > > > > RULE 'list-homedirs' > > LIST 'list-homedirs' > > > > mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N > gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1 > > > > Has anyone experienced something like this or have any suggestions on what > to do to avoid it? > > > > Thanks. > > > > John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? > Indiana University | http://pti.iu.edu > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Thu Dec 6 18:47:05 2018 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 6 Dec 2018 13:47:05 -0500 Subject: [gpfsug-discuss] Bizarre fcntl locking behavior Message-ID: <7ec54f20-77c9-8534-c3f4-3cd270f1c2a3@nasa.gov> I've been trying to chase down an error one of our users periodically sees with Intel MPI. The body of the error is this: This requires fcntl(2) to be implemented. As of 8/25/2011 it is not. Generic MPICH Message: File locking failed in ADIOI_Set_lock(fd F,cmd F_SETLKW/7,type F_RDLCK/0,whence 0) with return value FFFFFFFF and errno 25. - If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running on all the machines, and mount the directory with the 'noac' option (no attribute caching). - If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option. 
ADIOI_Set_lock:: No locks available
ADIOI_Set_lock:offset 0, length 8

When this happens, a new job is reading back in the checkpoint files a previous job wrote. Consistently it's the reading in of previously written files that triggers this, although the occurrence is sporadic, and if the job retries enough times the error will go away.

The really curious thing is that there is only one byte range lock per file per node open at any time, so the error 37 (I know it says 25, but that's actually in hex even though it's not prefixed with 0x) of being out of byte range locks is a little odd to me. The default is 200 but we should be nowhere near that.

I've been trying to frantically chase this down with various MPI reproducers but alas I came up short, until this morning, when I gave up on the MPI approach and tried something a little more simple. I've discovered that when:

- A file is opened by node A (a key requirement to reproduce seems to be that node A is *also* the metanode for the file. I've not been able to reproduce if node A is *not* the metanode)
- Node A acquires a bunch of write locks in the file
- Node B then also acquires a bunch of write locks in the file
- Node B then acquires a bunch of read locks in the file
- Node A then also acquires a bunch of read locks in the file

At that last step, Node A will experience the errno 37 attempting to acquire read locks.

Here are the actual commands to reproduce this (source code for fcntl_stress.c is attached):

Node A: rm /gpfs/aaronFS/testFile; dd if=/dev/zero of=/gpfs/aaronFS/testFile bs=1M count=4000
Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) 1
Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) 1
Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024))
Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024))

Now that I've typed this out, I realize this really should be a PMR not a post to the mailing list :) but I thought it was interesting and wanted to share.

-Aaron

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
-------------- next part --------------
/*
  Aaron Knister
  Program to acquire a bunch of byte range locks in a file
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <assert.h>

int main(int argc, char **argv)
{
    char *filename;
    int fd;
    struct stat statBuf;
    int highRand;
    int lowRand;
    unsigned int l_start = 0;
    unsigned int l_len;
    int openMode;
    int lockType;
    struct flock lock;
    unsigned int stride;

    filename = argv[1];
    stride = atoi(argv[2]);
    l_len = atoi(argv[3]);

    /* a fourth argument selects write locks; otherwise take read locks */
    if ( argc > 4 ) {
        openMode = O_WRONLY;
        lockType = F_WRLCK;
    } else {
        openMode = O_RDONLY;
        lockType = F_RDLCK;
    }

    printf("Opening file '%s' in %s mode. stride = %d. l_len = %d\n",
           filename, (openMode == O_WRONLY) ? "write" : "read", stride, l_len);

    assert( (fd = open(filename, openMode)) >= 0 );
    assert( fstat(fd, &statBuf) == 0 );

    /* lock and unlock one l_len-sized range per stride across the file */
    while(1) {
        if ( l_start >= statBuf.st_size ) {
            break;
            l_start = 0;
        }

        highRand = rand();
        lowRand = rand();

        lock.l_type = lockType;
        lock.l_whence = 0;
        lock.l_start = l_start;
        lock.l_len = l_len;
        if (fcntl(fd, F_SETLKW, &lock) != 0) {
            fprintf(stderr, "Non-zero return from fcntl.
errno = %d (%s)\n", errno, strerror(errno)); abort(); } lock.l_type = F_UNLCK; lock.l_whence = 0; lock.l_start = l_start; lock.l_len = l_len; assert(fcntl(fd, F_SETLKW, &lock) != -1); l_start += stride; } } From aaron.s.knister at nasa.gov Thu Dec 6 18:56:44 2018 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 6 Dec 2018 13:56:44 -0500 Subject: [gpfsug-discuss] Bizarre fcntl locking behavior In-Reply-To: <7ec54f20-77c9-8534-c3f4-3cd270f1c2a3@nasa.gov> References: <7ec54f20-77c9-8534-c3f4-3cd270f1c2a3@nasa.gov> Message-ID: <5a5c70c6-c67f-1b5e-77ae-ef38553cb7e4@nasa.gov> Just for the sake of completeness, when the test program fails in the expected fashion this is the message it prints: Opening file 'read' in /gpfs/aaronFS/testFile mode. stride = 1048576 l_len = 262144 Non-zero return from fcntl. errno = 37 (No locks available) Aborted -Aaron On 12/6/18 1:47 PM, Aaron Knister wrote: > I've been trying to chase down an error one of our users periodically > sees with Intel MPI. The body of the error is this: > > This requires fcntl(2) to be implemented. As of 8/25/2011 it is not. > Generic MPICH Message: File locking failed in ADIOI_Set_lock(fd F,cmd > F_SETLKW/7,type F_RDLCK/0,whence 0) with return value FFFFFFFF and errno > 25. > - If the file system is NFS, you need to use NFS version 3, ensure that > the lockd daemon is running on all the machines, and mount the directory > with the 'noac' option (no attribute caching). > - If the file system is LUSTRE, ensure that the directory is mounted > with the 'flock' option. > ADIOI_Set_lock:: No locks available > ADIOI_Set_lock:offset 0, length 8 > > When this happens, a new job is reading back-in the checkpoint files a > previous job wrote. Consistently it's the reading in of previously > written files that triggers this although the occurrence is sporadic and > if the job retries enough times the error will go away. > > The really curious thing, is there is only one byte range lock per file > per-node open at any time, so the error 37 (I know it says 25 but that's > actually in hex even though it's not prefixed with 0x) of being out of > byte range locks is a little odd to me. The default is 200 but we should > be no way near that. > > I've been trying to frantically chase this down with various MPI > reproducers but alas I came up short, until this morning, when I gave up > on the MPI approach and tried something a little more simple. I've > discovered that when: > > - A file is opened by node A (a key requirement to reproduce seems to be > that node A is *also* the metanode for the file. I've not been able to > reproduce if node A is *not* the metanode) > - Node A Acquires a bunch of write locks in the file > - Node B then also acquires a bunch of write locks in the file > - Node B then acquires a bunch of read locks in the file > - Node A then also acquires a bunch of read locks in the file > > At that last step, Node A will experience the errno 37 attempting to > acquire read locks. 
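(As an aside, the same lock pattern can be sketched in a few lines of Python with the standard fcntl module. This is purely illustrative, not the attached C reproducer; run the write sweep on node A then node B, then the read sweep on node B then node A, against the same test file created by the dd command quoted below.)

# Illustrative sketch of the byte-range lock sweep (assumes the test file
# created with dd as in the reproduction commands).
import fcntl
import os

PATH = "/gpfs/aaronFS/testFile"
STRIDE = 1024 * 1024
LENGTH = 256 * 1024

def lock_sweep(lock_type):
    """Acquire and release one byte-range lock per stride across the file."""
    fd = os.open(PATH, os.O_RDWR)
    size = os.fstat(fd).st_size
    start = 0
    while start < size:
        fcntl.lockf(fd, lock_type, LENGTH, start, os.SEEK_SET)      # take lock
        fcntl.lockf(fd, fcntl.LOCK_UN, LENGTH, start, os.SEEK_SET)  # drop lock
        start += STRIDE
    os.close(fd)

# lock_sweep(fcntl.LOCK_EX)  # write locks
# lock_sweep(fcntl.LOCK_SH)  # read locks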
> > Here are the actual commands to reproduce this (source code for > fcntl_stress.c is attached): > > Node A: rm /gpfs/aaronFS/testFile; dd if=/dev/zero > of=/gpfs/aaronFS/testFile bs=1M count=4000 > Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) > $((256*1024)) 1 > Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) > $((256*1024)) 1 > Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) > Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) > > Now that I've typed this out, I realize this really should be a PMR not > a post to the mailing list :) but I thought it was interesting and > wanted to share. > > -Aaron > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From S.J.Thompson at bham.ac.uk Thu Dec 6 19:24:33 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Thu, 6 Dec 2018 19:24:33 +0000 Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan In-Reply-To: References: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu>, Message-ID: Just a gentle reminder that this is a community based list and that we expect people to be respectful of each other on the list. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of matthew.robinson02 at gmail.com [matthew.robinson02 at gmail.com] Sent: 06 December 2018 17:41 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS nodes crashing during policy scan On Thu, Dec 6, 2018 at 11:46 AM Ratliff, John > wrote: We?re trying to run a policy scan to get a list of all the files in one of our filesets. There are approximately 600 million inodes in this space. We?re running GPFS 3.5. Every time we run the policy scan, the node that is running it ends up crashing. It makes it through a quarter of the inodes before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS logs shows anything. It just notes that the node rebooted. In the crash logs of all the systems we?ve tried this on, we see the same line. <1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8 <1>IP: [] _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0x590 [mmfs26] Our policy scan rule is pretty simple: RULE 'list-homedirs' LIST 'list-homedirs' mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1 Has anyone experienced something like this or have any suggestions on what to do to avoid it? Thanks. John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? Indiana University | http://pti.iu.edu _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. From aaron.s.knister at nasa.gov Fri Dec 7 13:38:35 2018 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[InuTeq, LLC]) Date: Fri, 7 Dec 2018 13:38:35 +0000 Subject: [gpfsug-discuss] Test? Message-ID: <9272165F-DF81-4B01-91B5-D04B07E17AC7@nasa.gov> I sent a couple messages to the list earlier that made it to the archives online but seemingly never made it to anyone else I talked to. 
I?m curious to see if this message goes through. -Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Fri Dec 7 13:47:12 2018 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Fri, 7 Dec 2018 14:47:12 +0100 Subject: [gpfsug-discuss] SAVE the date - Spectrum Scale Strategy Days 2019 at IBM Ehningen, Germany 19., 20./21. March 2018 In-Reply-To: References: Message-ID: Spectrum Scale Strategy Days 2019 at IBM Ehningen, Germany 19., 20./21. March 2018 https://www.ibm.com/events/wwe/grp/grp308.nsf/Agenda.xsp?openform&seminar=Z94GKRES&locale=de_DE Save the date :-) -frank- Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Am Weiher 24, 65451 Kelsterbach, Germany mailto:kraemerf at de.ibm.com Mobile +49171-3043699 IBM Germany -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Fri Dec 7 22:05:53 2018 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Fri, 7 Dec 2018 22:05:53 +0000 Subject: [gpfsug-discuss] Test? In-Reply-To: <9272165F-DF81-4B01-91B5-D04B07E17AC7@nasa.gov> References: <9272165F-DF81-4B01-91B5-D04B07E17AC7@nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: From scale at us.ibm.com Fri Dec 7 22:55:06 2018 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Fri, 7 Dec 2018 14:55:06 -0800 Subject: [gpfsug-discuss] Spectrum Scale for Windows Domain -UserRequirements In-Reply-To: <50178605f3f643e8a72c5f7f3a547bc7@SMXRF105.msg.hukrf.de> References: <50178605f3f643e8a72c5f7f3a547bc7@SMXRF105.msg.hukrf.de> Message-ID: Hello, Unfortunately, to allow bidirectional passwordless ssh between Linux/Windows (for sole purpose of mm* commands), the literal username 'root' is a requirement. Here are a few variations. 1. Use domain account 'root', where 'root' belongs to "Domain Admins" group. This is the easiest 1-step and the recommended way. or 2. Use domain account 'root', where 'root' does NOT belong to "Domain Admins" group. In this case, on each and every GPFS Windows node, add this 'domain\root' account to local "Administrators" group. or 3. On each and every GPFS Windows node, create a local 'root' account as a member of local "Administrators" group. (1) and (2) work well reliably with Cygwin. I have seen inconsistent results with approach (3) wherein Cygwin passwordless ssh in incoming direction (linux->windows) sometimes breaks and prompts for password. Give it a try to see if you get better results. If you cannot get around the 'root' literal username requirement, the suggested alternative is to use GPFS multi-clustering. Create a separate cluster of all Windows-only nodes (using mmwinrsh/mmwinrcp instead of ssh/scp... so that 'root' requirement is eliminated). And then remote mount from the Linux cluster (all non-Windows nodes) via mmauth, mmremotecluster and mmremotefs et al. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. 
The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Grunenberg, Renar" To: "gpfsug-discuss at spectrumscale.org" Date: 12/06/2018 09:05 AM Subject: [gpfsug-discuss] Spectrum Scale for Windows Domain -User Requirements Sent by: gpfsug-discuss-bounces at spectrumscale.org Hallo All, i had a question about the domain-user root account on Windows. We have some requirements to restrict these level of authorization and found no info what is possible to change here. Two questions: 1. It is possible to define a other Domain-Account other than as root for this. 2. If not, is it possible to define a local account as root on Windows-Clients? Any hints are appreciate. Thanks Renar Renar Grunenberg Abteilung Informatik ? Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Tue Dec 11 11:17:10 2018 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Tue, 11 Dec 2018 12:17:10 +0100 Subject: [gpfsug-discuss] Spectrum Scale *News* Dec 12th 2018 Message-ID: IBM SpectrumAI with NVIDIA DGX is the key to data science productivity https://www.ibm.com/blogs/systems/introducing-spectrumai-with-nvidia-dgx/ To drive AI development productivity and streamline the AI data pipeline, IBM is introducing IBM Spectrum AI with NVIDIA DGX. This converged solution combines the industry acclaimed software-defined storage scale-out file system, IBM Spectrum Scale on flash with NVIDIA DGX-1. It provides the highest performance in any tested converged system with the unique ability to support a growing data science practice. https://public.dhe.ibm.com/common/ssi/ecm/81/en/81022381usen/ibm-spectrumai-ref-arch-dec10-v6_81022381USEN.pdf Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Am Weiher 24, 65451 Kelsterbach, Germany mailto:kraemerf at de.ibm.com Mobile +49171-3043699 IBM Germany -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From u.sibiller at science-computing.de Thu Dec 13 13:52:42 2018 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Thu, 13 Dec 2018 14:52:42 +0100 Subject: [gpfsug-discuss] Filesystem access issues via CES NFS In-Reply-To: <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se> References: <717f49aade0b439eb1b99fc620a21cac@maxiv.lu.se> <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se> Message-ID: On 23.11.2018 14:41, Andreas Mattsson wrote: > Yes, this is repeating. > > We?ve ascertained that it has nothing to do at all with file operations on the GPFS side. > > Randomly throughout the filesystem mounted via NFS, ls or file access will give > > ? > > > ls: reading directory /gpfs/filessystem/test/testdir: Invalid argument > > ? > > Trying again later might work on that folder, but might fail somewhere else. > > We have tried exporting the same filesystem via a standard kernel NFS instead of the CES > Ganesha-NFS, and then the problem doesn?t exist. > > So it is definitely related to the Ganesha NFS server, or its interaction with the file system. > > Will see if I can get a tcpdump of the issue. We see this, too. We cannot trigger it. Fortunately I have managed to capture some logs with debugging enabled. I have now dug into the ganesha 2.5.3 code and I think the netgroup caching is the culprit. Here some FULL_DEBUG output: 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :Check for address 1.2.3.4 for export id 1 path /gpfsexport 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcf7fe0 NETGROUP_CLIENT: netgroup1 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcfe320 NETGROUP_CLIENT: netgroup2 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcfe380 NETGROUP_CLIENT: netgroup3 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :EXPORT (options=03303002 , , , , , -- Deleg, , ) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :EXPORT_DEFAULTS (options=42102002root_squash , ----, 3--, ---, TCP, ----, Manage_Gids , , anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :default options (options=03303002root_squash , ----, 34-, UDP, TCP, ----, No Manage_Gids, -- Deleg, anon_uid= -2, anon_gid= -2, none, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : 
gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :Final options (options=42102002root_squash , ----, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_rpc_execute :DISP :INFO :DISP: INFO: Client ::ffff:1.2.3.4 is not allowed to access Export_Id 1 /gpfsexport, vers=3, proc=18 The client "client1" is definitely a member of the "netgroup1". But the NETGROUP_CLIENT lookups for "netgroup2" and "netgroup3" can only happen if the netgroup caching code reports that "client1" is NOT a member of "netgroup1". I have also opened a support case at IBM for this. @Malahal: Looks like you have written the netgroup caching code, feel free to ask for further details if required. Kind regards, Ulrich Sibiller -- Dipl.-Inf. Ulrich Sibiller science + computing ag System Administration Hagellocher Weg 73 72070 Tuebingen, Germany https://atos.net/de/deutschland/sc -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From jcram at ddn.com Fri Dec 14 14:45:52 2018 From: jcram at ddn.com (Jeno Cram) Date: Fri, 14 Dec 2018 14:45:52 +0000 Subject: [gpfsug-discuss] Filesystem access issues via CES NFS In-Reply-To: References: <717f49aade0b439eb1b99fc620a21cac@maxiv.lu.se> <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se> Message-ID: Are you using Extended attributes on the directories in question? Jeno Cram | Systems Engineer Mobile: 517-980-0495 jcram at ddn.com DDN.com ?On 12/13/18, 9:02 AM, "Ulrich Sibiller" wrote: On 23.11.2018 14:41, Andreas Mattsson wrote: > Yes, this is repeating. > > We?ve ascertained that it has nothing to do at all with file operations on the GPFS side. > > Randomly throughout the filesystem mounted via NFS, ls or file access will give > > ? > > > ls: reading directory /gpfs/filessystem/test/testdir: Invalid argument > > ? > > Trying again later might work on that folder, but might fail somewhere else. > > We have tried exporting the same filesystem via a standard kernel NFS instead of the CES > Ganesha-NFS, and then the problem doesn?t exist. > > So it is definitely related to the Ganesha NFS server, or its interaction with the file system. > > Will see if I can get a tcpdump of the issue. We see this, too. We cannot trigger it. Fortunately I have managed to capture some logs with debugging enabled. I have now dug into the ganesha 2.5.3 code and I think the netgroup caching is the culprit. 
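(One quick cross-check, independent of Ganesha's cache, is to ask the resolver directly which hosts it returns for the netgroup. A rough sketch follows; "netgroup1" and "client1.domain" are the placeholder names from the logs, and nested netgroups may need extra handling depending on the NSS backend.)

# Ask the resolver (via getent) which host triples a netgroup expands to,
# then check whether the client is among them.
import subprocess

def netgroup_hosts(netgroup):
    out = subprocess.check_output(["getent", "netgroup", netgroup]).decode()
    hosts = set()
    for triple in out.split("(")[1:]:          # entries look like (host,user,domain)
        host = triple.split(",")[0].strip()
        if host:
            hosts.add(host)
    return hosts

print("client1.domain" in netgroup_hosts("netgroup1"))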
Here some FULL_DEBUG output: 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :Check for address 1.2.3.4 for export id 1 path /gpfsexport 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcf7fe0 NETGROUP_CLIENT: netgroup1 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcfe320 NETGROUP_CLIENT: netgroup2 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcfe380 NETGROUP_CLIENT: netgroup3 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :EXPORT (options=03303002 , , , , , -- Deleg, , ) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :EXPORT_DEFAULTS (options=42102002root_squash , ----, 3--, ---, TCP, ----, Manage_Gids , , anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :default options (options=03303002root_squash , ----, 34-, UDP, TCP, ----, No Manage_Gids, -- Deleg, anon_uid= -2, anon_gid= -2, none, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :Final options (options=42102002root_squash , ----, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_rpc_execute :DISP :INFO :DISP: INFO: Client ::ffff:1.2.3.4 is not allowed to access Export_Id 1 /gpfsexport, vers=3, proc=18 The client "client1" is definitely a member of the "netgroup1". But the NETGROUP_CLIENT lookups for "netgroup2" and "netgroup3" can only happen if the netgroup caching code reports that "client1" is NOT a member of "netgroup1". I have also opened a support case at IBM for this. @Malahal: Looks like you have written the netgroup caching code, feel free to ask for further details if required. Kind regards, Ulrich Sibiller -- Dipl.-Inf. Ulrich Sibiller science + computing ag System Administration Hagellocher Weg 73 72070 Tuebingen, Germany https://atos.net/de/deutschland/sc -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. 
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Kevin.Buterbaugh at Vanderbilt.Edu Thu Dec 13 20:54:39 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Thu, 13 Dec 2018 20:54:39 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? Message-ID: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Hi All, Googling ?GPFS and iSCSI? doesn?t produce a ton of hits! But we are interested to know if anyone is actually using GPFS over iSCSI? The reason why I?m asking is that we currently use an 8 Gb FC SAN ? QLogic SANbox 5800?s, QLogic HBA?s in our NSD servers ? but we?re seeing signs that, especially when we start using beefier storage arrays with more disks behind the controllers, the 8 Gb FC could be a bottleneck. As many / most of you are already aware, I?m sure, while 16 Gb FC exists, there?s basically only one vendor in that game. And guess what happens to prices when there?s only one vendor??? We bought our 8 Gb FC switches for approximately $5K apiece. List price on a 16 Gb FC switch - $40K. Ouch. So the idea of being able to use commodity 10 or 40 Gb Ethernet switches and HBA?s is very appealing ? both from a cost and a performance perspective (last I checked 40 Gb was more than twice 16 Gb!). Anybody doing this already? As those of you who?ve been on this list for a while and don?t filter out e-mails from me () already know, we have a much beefier Infortrend storage array we?ve purchased that I?m currently using to test various metadata configurations (and I will report back results on that when done, I promise). That array also supports iSCSI, so I actually have our test cluster GPFS filesystem up and running over iSCSI. It was surprisingly easy to set up. But any tips, suggestions, warnings, etc. about running GPFS over iSCSI are appreciated! Two things that I am already aware of are: 1) use jumbo frames, and 2) run iSCSI over it?s own private network. Other things I should be aware of?!? Thanks all? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.knister at gmail.com Sat Dec 15 14:44:25 2018 From: aaron.knister at gmail.com (Aaron Knister) Date: Sat, 15 Dec 2018 09:44:25 -0500 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: <9955EFB8-B50B-4F8C-8803-468626BE23FB@gmail.com> Hi Kevin, I don?t have any experience running GPFS over iSCSI (although does iSER count?). For what it?s worth there are, I believe, 2 vendors in the FC space? Cisco and Brocade. You can get a fully licensed 16GB Cisco MDS edge switch for not a whole lot (https://m.cdw.com/product/cisco-mds-9148s-switch-48-ports-managed-rack-mountable/3520640). If you start with fewer ports licensed the cost goes down dramatically. 
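(On the jumbo-frames point in the original note, quoted below: a quick sanity check that every NSD server really carries the large MTU on its iSCSI interface can be a few lines of script. A sketch only, with placeholder node and interface names, and assuming passwordless ssh.)

# Verify MTU on the iSCSI-facing interface of each NSD server (placeholders).
import subprocess

NODES = ["nsd1", "nsd2"]   # placeholder NSD server names
IFACE = "eth1"             # placeholder: whichever interface carries iSCSI

for node in NODES:
    mtu = subprocess.check_output(
        ["ssh", node, "cat", "/sys/class/net/%s/mtu" % IFACE]).decode().strip()
    flag = "ok" if mtu == "9000" else "check"
    print("%s %s mtu=%s [%s]" % (node, IFACE, mtu, flag))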
I?ve also found that, if it?s an option, IB is stupidly cheap from a dollar per unit of bandwidth perspective and makes a great SAN backend. One other, other thought is that FCoE seems very attractive. Not sure if your arrays support that but I believe you get closer performance and behavior to FC with FCoE than with iSCSI and I don?t think there?s a huge cost difference. It?s even more fun if you have multi-fabric FC switches that can do FC and FCoE because you can in theory bridge the two fabrics (e.g. use FCoE on your NSD servers to some 40G switches that support DCB and connect the 40G eth switches to an FC/FCoE switch and then address your 8Gb FC storage and FCoE storage using the same fabric). -Aaron Sent from my iPhone > On Dec 13, 2018, at 15:54, Buterbaugh, Kevin L wrote: > > Hi All, > > Googling ?GPFS and iSCSI? doesn?t produce a ton of hits! But we are interested to know if anyone is actually using GPFS over iSCSI? > > The reason why I?m asking is that we currently use an 8 Gb FC SAN ? QLogic SANbox 5800?s, QLogic HBA?s in our NSD servers ? but we?re seeing signs that, especially when we start using beefier storage arrays with more disks behind the controllers, the 8 Gb FC could be a bottleneck. > > As many / most of you are already aware, I?m sure, while 16 Gb FC exists, there?s basically only one vendor in that game. And guess what happens to prices when there?s only one vendor??? We bought our 8 Gb FC switches for approximately $5K apiece. List price on a 16 Gb FC switch - $40K. Ouch. > > So the idea of being able to use commodity 10 or 40 Gb Ethernet switches and HBA?s is very appealing ? both from a cost and a performance perspective (last I checked 40 Gb was more than twice 16 Gb!). Anybody doing this already? > > As those of you who?ve been on this list for a while and don?t filter out e-mails from me () already know, we have a much beefier Infortrend storage array we?ve purchased that I?m currently using to test various metadata configurations (and I will report back results on that when done, I promise). That array also supports iSCSI, so I actually have our test cluster GPFS filesystem up and running over iSCSI. It was surprisingly easy to set up. But any tips, suggestions, warnings, etc. about running GPFS over iSCSI are appreciated! > > Two things that I am already aware of are: 1) use jumbo frames, and 2) run iSCSI over it?s own private network. Other things I should be aware of?!? > > Thanks all? > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Sun Dec 16 11:59:39 2018 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Sun, 16 Dec 2018 12:59:39 +0100 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? - In-Reply-To: References: Message-ID: Kevin, Ethernet networking of today is changing very fast as the driving forces are the "Hyperscale" datacenters. This big innovation is changing the world and is happening right now. You must understand the conversation by breaking down the differences between ASICs, FPGAs, and NPUs in modern Ethernet networking. 
1) Mellanox has a very good answer here based on the Spectrum-2 chip http://www.mellanox.com/page/press_release_item?id=1933 2) Broadcom's answer to this is the 12.8 Tb/s StrataXGS Tomahawk 3 Ethernet Switch Series https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm56980-series https://globenewswire.com/news-release/2018/10/24/1626188/0/en/Broadcom-Achieves-Mass-Production-on-Industry-Leading-12-8-Tbps-Tomahawk-3-Ethernet-Switch-Family.html 3) Barefoots Tofinio2 is another valid answer to this problem as it's programmable with the P4 language (important for Hyperscale Datacenters) https://www.barefootnetworks.com/ The P4 language itself is open source. There?s details at p4.org, or you can download code at GitHub: https://github.com/p4lang/ 4) The last newcomer to this party comes from Innovium named Teralynx https://innovium.com/products/teralynx/ https://innovium.com/2018/03/20/innovium-releases-industrys-most-advanced-switch-software-platform-for-high-performance-data-center-networking-2-2-2-2-2-2/ (Most of the new Cisco switches are powered by the Teralynx silicon, as Cisco seems to be late to this game with it's own development.) So back your question - iSCSI is not the future! NVMe and it's variants is the way to go and these new ethernet swichting products does have this in focus. Due to the performance demands of NVMe, high performance and low latency networking is required and Ethernet based RDMA ? RoCE, RoCEv2 or iWARP are the leading choices. -frank- P.S. My Xmas wishlist to the IBM Spectrum Scale development team would be a "2019 HighSpeed Ethernet Networking optimization for Spectrum Scale" to make use of all these new things and options :-) Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Am Weiher 24, 65451 Kelsterbach, Germany mailto:kraemerf at de.ibm.com Mobile +49171-3043699 IBM Germany -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Sun Dec 16 13:45:47 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Sun, 16 Dec 2018 14:45:47 +0100 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? - In-Reply-To: References: Message-ID: I have been running GPFS over iSCSI, and know of customers who are also. Probably not in the most demanding environments, but from my experience iSCSI works perfectly fine as long as you have a stable network. Having a dedicated (simple) storage network for iSCSI is probably a good idea (just like for FC), otherwise iSCSI or GPFS is going to look bad when your network admins cause problems on the shared network. -jf s?n. 16. des. 2018 kl. 12:59 skrev Frank Kraemer : > Kevin, > > Ethernet networking of today is changing very fast as the driving forces > are the "Hyperscale" datacenters. This big innovation is changing the world > and is happening right now. You must understand the conversation by > breaking down the differences between ASICs, FPGAs, and NPUs in modern > Ethernet networking. 
> > 1) Mellanox has a very good answer here based on the Spectrum-2 chip > http://www.mellanox.com/page/press_release_item?id=1933 > > 2) Broadcom's answer to this is the 12.8 Tb/s StrataXGS Tomahawk 3 > Ethernet Switch Series > > https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm56980-series > > https://globenewswire.com/news-release/2018/10/24/1626188/0/en/Broadcom-Achieves-Mass-Production-on-Industry-Leading-12-8-Tbps-Tomahawk-3-Ethernet-Switch-Family.html > > 3) Barefoots Tofinio2 is another valid answer to this problem as it's > programmable with the P4 language (important for Hyperscale Datacenters) > https://www.barefootnetworks.com/ > > The P4 language itself is open source. There?s details at p4.org, or you > can download code at GitHub: https://github.com/p4lang/ > > 4) The last newcomer to this party comes from Innovium named Teralynx > https://innovium.com/products/teralynx/ > > https://innovium.com/2018/03/20/innovium-releases-industrys-most-advanced-switch-software-platform-for-high-performance-data-center-networking-2-2-2-2-2-2/ > > (Most of the new Cisco switches are powered by the Teralynx silicon, as > Cisco seems to be late to this game with it's own development.) > > So back your question - iSCSI is not the future! NVMe and it's variants is > the way to go and these new ethernet swichting products does have this in > focus. > Due to the performance demands of NVMe, high performance and low latency > networking is required and Ethernet based RDMA ? RoCE, RoCEv2 or iWARP are > the leading choices. > > -frank- > > P.S. My Xmas wishlist to the IBM Spectrum Scale development team would be > a "2019 HighSpeed Ethernet Networking optimization for Spectrum Scale" to > make use of all these new things and options :-) > > Frank Kraemer > IBM Consulting IT Specialist / Client Technical Architect > Am Weiher 24, 65451 Kelsterbach, Germany > mailto:kraemerf at de.ibm.com > Mobile +49171-3043699 > IBM Germany > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Paul.Sanchez at deshaw.com Sun Dec 16 17:25:13 2018 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Sun, 16 Dec 2018 17:25:13 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? - In-Reply-To: References: Message-ID: Using iSCSI with Spectrum Scale is definitely do-able. As with running Scale in general, your networking needs to be very solid. For iSCSI the best practice I?m aware of is the dedicated/simple approach described by JF below: one subnet per switch (failure domain), nothing fancy like VRRP/HSRP/STP, and let multipathd do its job at ensuring that the available paths are the ones being used. We have also had some good experiences using routed iSCSI (which fits the rackscale/hyperscale style deployment model too, but this implies that you have a good QoS plan to assure that markings are correct and any link which can become congested can?t completely starve the dedicated queue you should be using for iSCSI. It?s also a good practice for the other TCP traffic in your non-iSCSI queue to use ECN in order to keep switch buffer utilization low. (As of today, I haven?t seen any iSCSI arrays which support ECN.) If you?re sharing arrays with multiple clusters/filesystems (i.e. 
not a single workload), then I would also recommend using iSCSI arrays which support per-volume/volume-group QOS limits to avoid noisy-neighbor problems in the iSCSI realm. As of today, there are even 100GbE capable all-flash solutions available which work well with Scale. Lastly, I?d say that iSCSI might not be the future? but NVMeOF hasn?t exactly given us many products ready to be the present. Most of the early offerings in this space are under-featured, over-priced, inflexible, proprietary, or fragile. We are successfully using non-standards based NVMe solutions today with Scale, but they have much more stringent and sensitive networking requirements (e.g. non-routed dedicated networking with PFC for RoCE) in order to provide reliable performance. So far, we?ve found these early offerings best-suited for single-workload use cases. I do expect this to continue to develop and improve on price, features, reliability/fragility. Thx Paul From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Jan-Frode Myklebust Sent: Sunday, December 16, 2018 8:46 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Anybody running GPFS over iSCSI? - I have been running GPFS over iSCSI, and know of customers who are also. Probably not in the most demanding environments, but from my experience iSCSI works perfectly fine as long as you have a stable network. Having a dedicated (simple) storage network for iSCSI is probably a good idea (just like for FC), otherwise iSCSI or GPFS is going to look bad when your network admins cause problems on the shared network. -jf s?n. 16. des. 2018 kl. 12:59 skrev Frank Kraemer >: Kevin, Ethernet networking of today is changing very fast as the driving forces are the "Hyperscale" datacenters. This big innovation is changing the world and is happening right now. You must understand the conversation by breaking down the differences between ASICs, FPGAs, and NPUs in modern Ethernet networking. 1) Mellanox has a very good answer here based on the Spectrum-2 chip http://www.mellanox.com/page/press_release_item?id=1933 2) Broadcom's answer to this is the 12.8 Tb/s StrataXGS Tomahawk 3 Ethernet Switch Series https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm56980-series https://globenewswire.com/news-release/2018/10/24/1626188/0/en/Broadcom-Achieves-Mass-Production-on-Industry-Leading-12-8-Tbps-Tomahawk-3-Ethernet-Switch-Family.html 3) Barefoots Tofinio2 is another valid answer to this problem as it's programmable with the P4 language (important for Hyperscale Datacenters) https://www.barefootnetworks.com/ The P4 language itself is open source. There?s details at p4.org, or you can download code at GitHub: https://github.com/p4lang/ 4) The last newcomer to this party comes from Innovium named Teralynx https://innovium.com/products/teralynx/ https://innovium.com/2018/03/20/innovium-releases-industrys-most-advanced-switch-software-platform-for-high-performance-data-center-networking-2-2-2-2-2-2/ (Most of the new Cisco switches are powered by the Teralynx silicon, as Cisco seems to be late to this game with it's own development.) So back your question - iSCSI is not the future! NVMe and it's variants is the way to go and these new ethernet swichting products does have this in focus. Due to the performance demands of NVMe, high performance and low latency networking is required and Ethernet based RDMA ? RoCE, RoCEv2 or iWARP are the leading choices. -frank- P.S. 
My Xmas wishlist to the IBM Spectrum Scale development team would be a "2019 HighSpeed Ethernet Networking optimization for Spectrum Scale" to make use of all these new things and options :-) Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Am Weiher 24, 65451 Kelsterbach, Germany mailto:kraemerf at de.ibm.com Mobile +49171-3043699 IBM Germany _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Mon Dec 17 00:21:57 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 00:21:57 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: On 13/12/2018 20:54, Buterbaugh, Kevin L wrote: [SNIP] > > Two things that I am already aware of are: ?1) use jumbo frames, and 2) > run iSCSI over it?s own private network. ?Other things I should be aware > of?!? > Yes, don't do it. Really do not do it unless you have datacenter Ethernet switches and adapters. Those are the ones required for FCoE. Basically unless you have per channel pause on your Ethernet fabric then performance will at some point all go to shit. So what happens is your NSD makes a whole bunch of requests to read blocks off the storage array. Requests are small, response is not. The response can overwhelms the Ethernet channel at which point performance falls through the floor. Now you might be lucky not to see this, especially if you have say have 10Gbps links from the storage and 40Gbps links to the NSD servers, but you are taking a gamble. Also the more storage arrays you have the more likely you are to see the problem. To fix this you have two options. The first is datacenter Ethernet with per channel pause. This option is expensive, probably in the same ball park as fibre channel. At least it was last time I looked, though this was some time ago now. The second option is dedicated links between the storage array and the NSD server. That is the cable goes directly between the storage array and the NSD server with no switches involved. This option is a maintenance nightmare. At he site where I did this, we had to go option two because I need to make it work, We ended up ripping it all out are replacing with FC. Personally I would see what price you can get DSS storage for, or use SAS arrays. Note iSCSI can in theory work, it's just the issue with GPFS scattering stuff to the winds over multiple storage arrays so your ethernet channel gets swamped and standard ethernet pauses all the upstream traffic. The vast majority of iSCSI use cases don't see this effect. There is a reason that to run FC over ethernet they had to turn ethernet lossless. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From janfrode at tanso.net Mon Dec 17 07:50:01 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Mon, 17 Dec 2018 08:50:01 +0100 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: I?d be curious to hear if all these arguments against iSCSI shouldn?t also apply to NSD protocol over TCP/IP? -jf man. 17. des. 2018 kl. 
01:22 skrev Jonathan Buzzard < jonathan.buzzard at strath.ac.uk>: > On 13/12/2018 20:54, Buterbaugh, Kevin L wrote: > > [SNIP] > > > > > Two things that I am already aware of are: 1) use jumbo frames, and 2) > > run iSCSI over it?s own private network. Other things I should be aware > > of?!? > > > > Yes, don't do it. Really do not do it unless you have datacenter > Ethernet switches and adapters. Those are the ones required for FCoE. > Basically unless you have per channel pause on your Ethernet fabric then > performance will at some point all go to shit. > > So what happens is your NSD makes a whole bunch of requests to read > blocks off the storage array. Requests are small, response is not. The > response can overwhelms the Ethernet channel at which point performance > falls through the floor. Now you might be lucky not to see this, > especially if you have say have 10Gbps links from the storage and 40Gbps > links to the NSD servers, but you are taking a gamble. Also the more > storage arrays you have the more likely you are to see the problem. > > To fix this you have two options. The first is datacenter Ethernet with > per channel pause. This option is expensive, probably in the same ball > park as fibre channel. At least it was last time I looked, though this > was some time ago now. > > The second option is dedicated links between the storage array and the > NSD server. That is the cable goes directly between the storage array > and the NSD server with no switches involved. This option is a > maintenance nightmare. > > At he site where I did this, we had to go option two because I need to > make it work, We ended up ripping it all out are replacing with FC. > > Personally I would see what price you can get DSS storage for, or use > SAS arrays. > > Note iSCSI can in theory work, it's just the issue with GPFS scattering > stuff to the winds over multiple storage arrays so your ethernet channel > gets swamped and standard ethernet pauses all the upstream traffic. The > vast majority of iSCSI use cases don't see this effect. > > There is a reason that to run FC over ethernet they had to turn ethernet > lossless. > > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Mon Dec 17 08:50:54 2018 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Mon, 17 Dec 2018 08:50:54 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: References: , <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Mon Dec 17 10:22:26 2018 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Mon, 17 Dec 2018 11:22:26 +0100 Subject: [gpfsug-discuss] Filesystem access issues via CES NFS In-Reply-To: References: <717f49aade0b439eb1b99fc620a21cac@maxiv.lu.se> <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se> Message-ID: <57c7a736-0774-2701-febc-3cdc57a50d86@science-computing.de> On 14.12.2018 15:45, Jeno Cram wrote: > Are you using Extended attributes on the directories in question? No. What's the background of your question? 
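Independent of the extended-attribute question, the netgroup mismatch shown in the FULL_DEBUG excerpt earlier in this thread can be narrowed down a little on the protocol node itself. The names below are the placeholders from that log, and it is an assumption (not confirmed anywhere above) that the Ganesha netgroup cache ultimately sits on top of innetgr(3)/NSS:

    getent netgroup netgroup1    # does NSS resolve the netgroup, and does it list client1?
    getent hosts 1.2.3.4         # does the client IP map to client1.domain as the cache assumes?

    # exercise innetgr(3) directly with the same (netgroup, host) pair;
    # 1 means "member", 0 means "not a member"
    python -c 'from ctypes import CDLL; libc = CDLL("libc.so.6"); print(libc.innetgr(b"netgroup1", b"client1.domain", None, None))'

If innetgr() itself already answers 0, the problem is below Ganesha (nsswitch/sssd/NIS); if it answers 1 while the cache says otherwise, that points back at the caching layer.
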
Kind regards, Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From S.J.Thompson at bham.ac.uk Mon Dec 17 15:07:39 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Mon, 17 Dec 2018 15:07:39 +0000 Subject: [gpfsug-discuss] GPFS API Message-ID: Hi, This is all probably perfectly clear to someone with the GPFS source code but ? we?re looking at writing some code using the API documented at: https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.1/com.ibm.spectrum.scale.v5r01.doc/pdf/scale_cpr.pdf Specifically gpfs_getacl() function, in the docs you pass acl, and the notes say ?Pointer to a buffer mapped by the structure gpfs_opaque_acl_t or gpfs_acl_t, depending on the value of flags. The first four bytes of the buffer must contain its total size.?. Reading the docs for gpfs_opaque_acl_t, this is a struct of which the first element is an int. Is this the same 4 bytes referred to as above containing the size, and is this the size of the struct, of of the acl_var_data entry? It strikes me is should probably be the length of acl_var_data, but it is not entirely clear? Thanks Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Mon Dec 17 15:47:18 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Mon, 17 Dec 2018 10:47:18 -0500 Subject: [gpfsug-discuss] GPFS API In-Reply-To: References: Message-ID: Look in gps.h.... I think the comment for acl_buffer_len is clear enough! /* Mapping of buffer for gpfs_getacl, gpfs_putacl. */ typedef struct gpfs_opaque_acl { int acl_buffer_len; /* INPUT: Total size of buffer (including this field). OUTPUT: Actual size of the ACL information. */ unsigned short acl_version; /* INPUT: Set to zero. OUTPUT: Current version of the returned ACL. */ unsigned char acl_type; /* INPUT: Type of ACL: access (1) or default (2). */ char acl_var_data[1]; /* OUTPUT: Remainder of the ACL information. */ } gpfs_opaque_acl_t; From: Simon Thompson To: "gpfsug-discuss at spectrumscale.org" Date: 12/17/2018 10:13 AM Subject: [gpfsug-discuss] GPFS API Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, This is all probably perfectly clear to someone with the GPFS source code but ? we?re looking at writing some code using the API documented at: https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.1/com.ibm.spectrum.scale.v5r01.doc/pdf/scale_cpr.pdf Specifically gpfs_getacl() function, in the docs you pass acl, and the notes say ?Pointer to a buffer mapped by the structure gpfs_opaque_acl_t or gpfs_acl_t, depending on the value of flags. The first four bytes of the buffer must contain its total size.?. Reading the docs for gpfs_opaque_acl_t, this is a struct of which the first element is an int. Is this the same 4 bytes referred to as above containing the size, and is this the size of the struct, of of the acl_var_data entry? It strikes me is should probably be the length of acl_var_data, but it is not entirely clear? 
Thanks Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Mon Dec 17 16:22:35 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 16:22:35 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: <7d438bada357989f56eeb6f347ba2ddb5661679a.camel@strath.ac.uk> On Mon, 2018-12-17 at 08:50 +0100, Jan-Frode Myklebust wrote: > I?d be curious to hear if all these arguments against iSCSI shouldn?t > also apply to NSD protocol over TCP/IP? > They don't is the simple answer. Either theoretically or practically. It won't necessarily be a problem for iSCSI either. It is the potential for wild over subscription of the links with the total naivety of the block based iSCSI protocol that is the issue. I suspect that with NSD traffic it is self limiting. That said looking about higher end switches seem to do DCE now, though FCoE seems to have gone nowhere. Anyway as I said if you want lower cost use SAS. Even if you don't want to do ESS/SSS you can architecture something very similar using SAS based arrays from your favourite vendor, and just skip the native RAID bit. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From jonathan.buzzard at strath.ac.uk Mon Dec 17 16:46:32 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 16:46:32 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: References: Message-ID: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> On Mon, 2018-12-17 at 10:47 -0500, Marc A Kaplan wrote: > Look in gps.h.... I think the comment for acl_buffer_len is clear > enough! > I guess everyone does not read header files by default looking for comments on the structure ;-) One thing to watch out for is to check the return from gpfs_getacl, and if you get an ENOSPC error then your buffer is not big enough to hold the ACL and the first four bytes are set to the size you need. SO you need to do something like the following to safely get the ACL. acl = malloc(1024); acl->acl_len = 1024; acl->acl_level = 0; acl->acl_version = 0; acl->acl_type = 0; acl->acl_nace = 0; err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl); if ((err==-1) && (errno=ENOSPC)) { int acl_size = acl->acl_len; free(acl); acl = malloc(acl_size); acl->acl_len = acl_size; acl->acl_level = 0; acl->acl_version = 0; acl->acl_type = 0; acl->acl_nace = 0; err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl); } Noting that unless you have some monster ACL 1KB is going to be more than enough. It is less than 16 bytes per ACL. What is not clear is the following in the gpfs_acl struct. v4Level1_t v4Level1; /* when GPFS_ACL_LEVEL_V4FLAGS */ What's that about, because there is zero documentation on it. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG From S.J.Thompson at bham.ac.uk Mon Dec 17 17:35:31 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Mon, 17 Dec 2018 17:35:31 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> References: , <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> Message-ID: Indeed. Actually this was exactly what we were trying to work out. We'd set the buf size to 0 hoping it would tell us how much we need, but we kept getting EINVAL back - which the docs states is invalid path, but actually it can be invalid bufsize as well apparently (the header file comments are different again to the docs). Anyway, we're looking at patching mpifileutils to support GPFS ACLs to help with migration of between old and new file-systems. I was actually using the opaque call on the assumption that it would be a binary blob of data I could poke to the new file. (I was too scared to use the attr functions as that copies DMAPI info as well and I'm not sure I want to "copy" my ILM files without recall!). Its not clear what DEFAULT and ACCESS ACLs are. I'm guessing something to do with inheritance maybe? Thanks Marc, Jonathan for the pointers! Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Jonathan Buzzard [jonathan.buzzard at strath.ac.uk] Sent: 17 December 2018 16:46 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS API On Mon, 2018-12-17 at 10:47 -0500, Marc A Kaplan wrote: > Look in gps.h.... I think the comment for acl_buffer_len is clear > enough! > I guess everyone does not read header files by default looking for comments on the structure ;-) One thing to watch out for is to check the return from gpfs_getacl, and if you get an ENOSPC error then your buffer is not big enough to hold the ACL and the first four bytes are set to the size you need. SO you need to do something like the following to safely get the ACL. acl = malloc(1024); acl->acl_len = 1024; acl->acl_level = 0; acl->acl_version = 0; acl->acl_type = 0; acl->acl_nace = 0; err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl); if ((err==-1) && (errno=ENOSPC)) { int acl_size = acl->acl_len; free(acl); acl = malloc(acl_size); acl->acl_len = acl_size; acl->acl_level = 0; acl->acl_version = 0; acl->acl_type = 0; acl->acl_nace = 0; err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl); } Noting that unless you have some monster ACL 1KB is going to be more than enough. It is less than 16 bytes per ACL. What is not clear is the following in the gpfs_acl struct. v4Level1_t v4Level1; /* when GPFS_ACL_LEVEL_V4FLAGS */ What's that about, because there is zero documentation on it. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathan.buzzard at strath.ac.uk Mon Dec 17 22:44:14 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 22:44:14 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: References: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> Message-ID: <69e9c4fa-4bd6-de23-add9-a0572d6d093a@strath.ac.uk> On 17/12/2018 17:35, Simon Thompson wrote: > Indeed. > > Actually this was exactly what we were trying to work out. 
We'd set > the buf size to 0 hoping it would tell us how much we need, but we > kept getting EINVAL back - which the docs states is invalid path, but > actually it can be invalid bufsize as well apparently (the header > file comments are different again to the docs). > Well duh, if you pass in a zero sized buffer, then there is no space to pass the size back because it's in the first four bytes of the returned buffer. Further it needs to be big enough to hold the main fields otherwise did you request POSIX or NFSv4 ACL's etc. > Anyway, we're looking at patching mpifileutils to support GPFS ACLs > to help wi th migration of between old and new file-systems. > > I was actually using the opaque call on the assumption that it would > be a binary blob of data I could poke to the new file. (I was too > scared to use the attr functions as that copies DMAPI info as well > and I'm not sure I want to "copy" my ILM files without recall!). > > Its not clear what DEFAULT and ACCESS ACLs are. I'm guessing > something to do with inheritance maybe? > A default ACL is the default ACL that would be given to the file. Access ACL's are all the other ones. That is pretty basic ACL stuff. I would really like some info on the v4Level1_t stuff, as I ma reluctant to release my mmsetfacl code until I do. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From jonathan.buzzard at strath.ac.uk Mon Dec 17 23:04:15 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 23:04:15 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: References: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> Message-ID: <6e34f47e-4fc0-9726-0c1a-38720f5c26db@strath.ac.uk> On 17/12/2018 17:35, Simon Thompson wrote: > Indeed. > > Actually this was exactly what we were trying to work out. We'd set > the buf size to 0 hoping it would tell us how much we need, but we > kept getting EINVAL back - which the docs states is invalid path, but > actually it can be invalid bufsize as well apparently (the header > file comments are different again to the docs). Forgot to say performance wise calling with effectively a zero buffer size and then calling again with precisely the right buffer size is not sensible plan. Basically you end up having to do two API calls instead of one to save less than 1KB of RAM in 99.999% of cases. Let's face it any machine running GPFS is going to have GB of RAM. That said I am not sure even allocating 1KB of RAM is sensible either. One suspects allocating less than a whole page might have performance implications. What I do know from testing is that, two API calls when iterating over millions of files has a significant impact on run time over just allocating a bunch of memory up front on only making the one call. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From S.J.Thompson at bham.ac.uk Tue Dec 18 07:05:18 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Tue, 18 Dec 2018 07:05:18 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: <69e9c4fa-4bd6-de23-add9-a0572d6d093a@strath.ac.uk> References: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> , <69e9c4fa-4bd6-de23-add9-a0572d6d093a@strath.ac.uk> Message-ID: No, it comes back as the first four bytes of the structure in the size field. 
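Putting the pieces of this exchange together, a minimal sketch of the opaque get-then-put round trip (the copy-ACLs-between-filesystems use case) could look like the code below. It is untested: a flags value of 0 is what selects the gpfs_opaque_acl_t mapping according to the documentation quoted earlier, the access/default type codes 1 and 2 come from the struct comment above, and it assumes the same ENOSPC-and-retry convention described for gpfs_acl_t also applies to the opaque variant.

    #include <errno.h>
    #include <stdlib.h>
    #include <gpfs.h>

    /* Copy one ACL (type 1 = access, 2 = default) from src to dst using the
       opaque mapping, so the blob is never interpreted by the caller. */
    static int copy_acl_opaque(const char *src, const char *dst, unsigned char type)
    {
        int size = 1024;                    /* big enough for almost every real ACL */
        gpfs_opaque_acl_t *acl = malloc(size);

        if (acl == NULL)
            return -1;
        acl->acl_buffer_len = size;         /* first four bytes: total buffer size */
        acl->acl_version = 0;               /* must be zero on input */
        acl->acl_type = type;

        if (gpfs_getacl(src, 0, acl) != 0) {
            if (errno != ENOSPC) { free(acl); return -1; }
            size = acl->acl_buffer_len;     /* output: size actually required */
            free(acl);
            if ((acl = malloc(size)) == NULL)
                return -1;
            acl->acl_buffer_len = size;
            acl->acl_version = 0;
            acl->acl_type = type;
            if (gpfs_getacl(src, 0, acl) != 0) { free(acl); return -1; }
        }

        /* hand the untouched blob straight to the destination file */
        int rc = gpfs_putacl(dst, 0, acl);
        free(acl);
        return rc;
    }

(Whether gpfs_getacl/gpfs_putacl take a const char * path or a plain char * depends on the gpfs.h shipped with the release, so the prototype above may need a cast.)
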
So if we set it the data field to zero, then it still has space in the size filed to return the required size. That was sort of the point.... The size value needs to be the size of the struct rather than the data field within it. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Jonathan Buzzard [jonathan.buzzard at strath.ac.uk] Sent: 17 December 2018 22:44 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GPFS API On 17/12/2018 17:35, Simon Thompson wrote: > Indeed. > > Actually this was exactly what we were trying to work out. We'd set > the buf size to 0 hoping it would tell us how much we need, but we > kept getting EINVAL back - which the docs states is invalid path, but > actually it can be invalid bufsize as well apparently (the header > file comments are different again to the docs). > Well duh, if you pass in a zero sized buffer, then there is no space to pass the size back because it's in the first four bytes of the returned buffer. Further it needs to be big enough to hold the main fields otherwise did you request POSIX or NFSv4 ACL's etc. > Anyway, we're looking at patching mpifileutils to support GPFS ACLs > to help wi th migration of between old and new file-systems. > > I was actually using the opaque call on the assumption that it would > be a binary blob of data I could poke to the new file. (I was too > scared to use the attr functions as that copies DMAPI info as well > and I'm not sure I want to "copy" my ILM files without recall!). > > Its not clear what DEFAULT and ACCESS ACLs are. I'm guessing > something to do with inheritance maybe? > A default ACL is the default ACL that would be given to the file. Access ACL's are all the other ones. That is pretty basic ACL stuff. I would really like some info on the v4Level1_t stuff, as I ma reluctant to release my mmsetfacl code until I do. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Kevin.Buterbaugh at Vanderbilt.Edu Mon Dec 17 22:01:41 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Mon, 17 Dec 2018 22:01:41 +0000 Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy Message-ID: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> Hi All, As those of you who suffered thru my talk at SC18 already know, we?re really short on space on one of our GPFS filesystems as the output of mmdf piped to grep pool shows: Disks in storage pool: system (Maximum disk size allowed is 24 TB) (pool total) 4.318T 1.078T ( 25%) 79.47G ( 2%) Disks in storage pool: data (Maximum disk size allowed is 262 TB) (pool total) 494.7T 38.15T ( 8%) 4.136T ( 1%) Disks in storage pool: capacity (Maximum disk size allowed is 519 TB) (pool total) 640.2T 14.56T ( 2%) 716.4G ( 0%) The system pool is metadata only. The data pool is the default pool. The capacity pool is where files with an atime (yes, atime) > 90 days get migrated. The capacity pool is comprised of NSDs that are 8+2P RAID 6 LUNs of 8 TB drives, so roughly 58.2 TB usable space per NSD. 
We have the new storage we purchased, but that?s still being tested and held in reserve for after the first of the year when we create a new GPFS 5 formatted filesystem and start migrating everything to the new filesystem. In the meantime, we have also purchased a 60-bay JBOD and 30 x 12 TB drives and will be hooking it up to one of our existing storage arrays on Wednesday. My plan is to create another 3 8+2P RAID 6 LUNs and present those to GPFS as NSDs. They will be about 88 TB usable space each (because ? beginning rant ? a 12 TB drive is < 11 TB is size ? and don?t get me started on so-called ?4K? TV?s ? end rant). A very wise man who used to work at IBM but now hangs out with people in red polos () once told me that it?s OK to mix NSDs of slightly different sizes in the same pool, but you don?t want to put NSDs of vastly different sizes in the same pool because the smaller ones will fill first and then the larger ones will have to take all the I/O. I consider 58 TB and 88 TB to be pretty significantly different and am therefore planning on creating yet another pool called ?oc? (over capacity if a user asks, old crap internally!) and migrating files with an atime greater than, say, 1 year to that pool. But since ALL of the files in the capacity pool haven?t even been looked at in at least 90 days already, does it really matter? I.e. should I just add the NSDs to the capacity pool and be done with it? If it?s a good idea to create another pool, then I have a question about mmapplypolicy and migrations. I believe I understand how things work, but after spending over an hour looking at the documentation I cannot find anything that explicitly confirms my understanding ? so if I have another pool called oc that?s ~264 TB in size and I write a policy file that looks like: define(access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) RULE 'ReallyOldStuff' MIGRATE FROM POOL 'capacity' TO POOL 'oc' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 365) AND (KB_ALLOCATED > 3584)) RULE 'OldStuff' MIGRATE FROM POOL 'data' TO POOL 'capacity' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 90) AND (KB_ALLOCATED > 3584)) Keeping in mind that my capacity pool is already 98% full, is mmapplypolicy smart enough to calculate how much space it?s going to free up in the capacity pool by the ?ReallyOldStuff? rule and therefore be able to potentially also move a ton of stuff from the data pool to the capacity pool via the 2nd rule with just one invocation of mmapplypolicy? That?s what I expect that it will do. I?m hoping I don?t have to run the mmapplypolicy twice ? the first to move stuff from capacity to oc and then a second time for it to realize, oh, I?ve got a much of space free in the capacity pool now. Thanks in advance... Kevin P.S. In case you?re scratching your head over the fact that we have files that people haven?t even looked at for months and months (more than a year in some cases) sitting out there ? we sell quota in 1 TB increments ? once they?ve bought the quota, it?s theirs. As long as they?re paying us the monthly fee if they want to keep files relating to research they did during the George Bush Presidency out there ? and I mean Bush 41, not Bush 43 ?.then that?s their choice. We do not purge files. ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From janfrode at tanso.net Tue Dec 18 09:13:05 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 18 Dec 2018 10:13:05 +0100 Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy In-Reply-To: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> References: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> Message-ID: I don't think it's going to figure this out automatically between the two rules.. I believe you will need to do something like the below (untested, and definitely not perfect!!) rebalancing: define( weight_junkedness, (CASE /* Create 3 classes of files */ /* ReallyOldStuff */ WHEN ((access_age > 365) AND (KB_ALLOCATED > 3584)) THEN 1000 /* OldStuff */ WHEN ((access_age > 90) AND (KB_ALLOCATED > 3584)) THEN 100 /* everything else */ ELSE 0 END) ) RULE 'defineTiers' GROUP POOL 'TIERS' IS 'data' LIMIT(98) THEN 'capacity' LIMIT(98) THEN 'oc' RULE 'Rebalance' MIGRATE FROM POOL 'TIERS' TO POOL 'TIERS' WEIGHT(weight_junkedness) Based on /usr/lpp/mmfs/samples/ilm/mmpolicy-fileheat.sample. -jf On Tue, Dec 18, 2018 at 9:10 AM Buterbaugh, Kevin L < Kevin.Buterbaugh at vanderbilt.edu> wrote: > Hi All, > > As those of you who suffered thru my talk at SC18 already know, we?re > really short on space on one of our GPFS filesystems as the output of mmdf > piped to grep pool shows: > > Disks in storage pool: system (Maximum disk size allowed is 24 TB) > (pool total) 4.318T 1.078T ( 25%) > 79.47G ( 2%) > Disks in storage pool: data (Maximum disk size allowed is 262 TB) > (pool total) 494.7T 38.15T ( 8%) > 4.136T ( 1%) > Disks in storage pool: capacity (Maximum disk size allowed is 519 TB) > (pool total) 640.2T 14.56T ( 2%) > 716.4G ( 0%) > > The system pool is metadata only. The data pool is the default pool. The > capacity pool is where files with an atime (yes, atime) > 90 days get > migrated. The capacity pool is comprised of NSDs that are 8+2P RAID 6 LUNs > of 8 TB drives, so roughly 58.2 TB usable space per NSD. > > We have the new storage we purchased, but that?s still being tested and > held in reserve for after the first of the year when we create a new GPFS 5 > formatted filesystem and start migrating everything to the new filesystem. > > In the meantime, we have also purchased a 60-bay JBOD and 30 x 12 TB > drives and will be hooking it up to one of our existing storage arrays on > Wednesday. My plan is to create another 3 8+2P RAID 6 LUNs and present > those to GPFS as NSDs. They will be about 88 TB usable space each (because > ? beginning rant ? a 12 TB drive is < 11 TB is size ? and don?t get me > started on so-called ?4K? TV?s ? end rant). > > A very wise man who used to work at IBM but now hangs out with people in > red polos () once told me that it?s OK to mix NSDs of slightly > different sizes in the same pool, but you don?t want to put NSDs of vastly > different sizes in the same pool because the smaller ones will fill first > and then the larger ones will have to take all the I/O. I consider 58 TB > and 88 TB to be pretty significantly different and am therefore planning on > creating yet another pool called ?oc? (over capacity if a user asks, old > crap internally!) and migrating files with an atime greater than, say, 1 > year to that pool. But since ALL of the files in the capacity pool haven?t > even been looked at in at least 90 days already, does it really matter? > I.e. should I just add the NSDs to the capacity pool and be done with it? 
> > If it?s a good idea to create another pool, then I have a question about > mmapplypolicy and migrations. I believe I understand how things work, but > after spending over an hour looking at the documentation I cannot find > anything that explicitly confirms my understanding ? so if I have another > pool called oc that?s ~264 TB in size and I write a policy file that looks > like: > > define(access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) > > RULE 'ReallyOldStuff' > MIGRATE FROM POOL 'capacity' > TO POOL 'oc' > LIMIT(98) > SIZE(KB_ALLOCATED/NLINK) > WHERE ((access_age > 365) AND (KB_ALLOCATED > 3584)) > > RULE 'OldStuff' > MIGRATE FROM POOL 'data' > TO POOL 'capacity' > LIMIT(98) > SIZE(KB_ALLOCATED/NLINK) > WHERE ((access_age > 90) AND (KB_ALLOCATED > 3584)) > > Keeping in mind that my capacity pool is already 98% full, is > mmapplypolicy smart enough to calculate how much space it?s going to free > up in the capacity pool by the ?ReallyOldStuff? rule and therefore be able > to potentially also move a ton of stuff from the data pool to the capacity > pool via the 2nd rule with just one invocation of mmapplypolicy? That?s > what I expect that it will do. I?m hoping I don?t have to run the > mmapplypolicy twice ? the first to move stuff from capacity to oc and then > a second time for it to realize, oh, I?ve got a much of space free in the > capacity pool now. > > Thanks in advance... > > Kevin > > P.S. In case you?re scratching your head over the fact that we have files > that people haven?t even looked at for months and months (more than a year > in some cases) sitting out there ? we sell quota in 1 TB increments ? once > they?ve bought the quota, it?s theirs. As long as they?re paying us the > monthly fee if they want to keep files relating to research they did during > the George Bush Presidency out there ? and I mean Bush 41, not Bush 43 > ?.then that?s their choice. We do not purge files. > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and > Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Wed Dec 19 08:49:42 2018 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Wed, 19 Dec 2018 08:49:42 +0000 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Message-ID: <83A6EEB0EC738F459A39439733AE8045267C22D4@MBX114.d.ethz.ch> Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? 
Thanks, Alvise

From S.J.Thompson at bham.ac.uk Wed Dec 19 09:29:08 2018
From: S.J.Thompson at bham.ac.uk (Simon Thompson)
Date: Wed, 19 Dec 2018 09:29:08 +0000
Subject: [gpfsug-discuss] verbs status not working in 5.0.2
Message-ID: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>

Hmm interesting …
# mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Wed Dec 19 09:51:58 2018 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Wed, 19 Dec 2018 09:51:58 +0000 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, Message-ID: <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch> Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? thanks, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hmm interesting ? 
# mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From TOMP at il.ibm.com Wed Dec 19 10:05:04 2018 From: TOMP at il.ibm.com (Tomer Perry) Date: Wed, 19 Dec 2018 12:05:04 +0200 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch> References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch> Message-ID: Changed means it provides some functions/information in a different way. So, I guess the question is what information do you need? ( and "officially" why isn't mmdiag good enough - what is missing. As you probably know, mmfsadm might cause crashes and deadlock from time to time, this is why we're trying to provide "safe ways" to get the required information). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 11:53 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? thanks, Alvise From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. 
And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hmm interesting ? # mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Wed Dec 19 10:22:32 2018 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Wed, 19 Dec 2018 10:22:32 +0000 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch>, Message-ID: <83A6EEB0EC738F459A39439733AE8045267C2354@MBX114.d.ethz.ch> I'd like just one line that says "RDMA ON" or "RMDA OFF" (as was reported more or less by mmfsadm). I can get info about RMDA using mmdiag, but is much more output to parse (e.g. by a nagios script or just a human eye). Ok, never mind, I understand your explanation and it is not definitely a big issue... it was, above all, a curiosity to understand if the command was modified to get the same behavior as before, but in a different way. 
Cheers, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 11:05 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Changed means it provides some functions/information in a different way. So, I guess the question is what information do you need? ( and "officially" why isn't mmdiag good enough - what is missing. As you probably know, mmfsadm might cause crashes and deadlock from time to time, this is why we're trying to provide "safe ways" to get the required information). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 11:53 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? thanks, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hmm interesting ? # mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". 
Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From TOMP at il.ibm.com Wed Dec 19 10:35:48 2018 From: TOMP at il.ibm.com (Tomer Perry) Date: Wed, 19 Dec 2018 12:35:48 +0200 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: <83A6EEB0EC738F459A39439733AE8045267C2354@MBX114.d.ethz.ch> References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch>, <83A6EEB0EC738F459A39439733AE8045267C2354@MBX114.d.ethz.ch> Message-ID: Hi, So, with all the usual disclaimers... mmfsadm saferdump verbs is not enough? or even mmfsadm saferdump verbs | grep VerbsRdmaStarted Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 12:22 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org I'd like just one line that says "RDMA ON" or "RMDA OFF" (as was reported more or less by mmfsadm). I can get info about RMDA using mmdiag, but is much more output to parse (e.g. by a nagios script or just a human eye). Ok, never mind, I understand your explanation and it is not definitely a big issue... it was, above all, a curiosity to understand if the command was modified to get the same behavior as before, but in a different way. Cheers, Alvise From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 11:05 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Changed means it provides some functions/information in a different way. So, I guess the question is what information do you need? ( and "officially" why isn't mmdiag good enough - what is missing. As you probably know, mmfsadm might cause crashes and deadlock from time to time, this is why we're trying to provide "safe ways" to get the required information). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 11:53 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? 
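For anyone who wants to turn Tomer's suggestion above into the one-line Nagios-style check Alvise describes, a minimal sketch in C follows. The /usr/lpp/mmfs/bin path and the literal "VerbsRdmaStarted" flag text are assumptions taken from this thread rather than documented behaviour, and since the flag's value format is not shown here, the probe simply echoes the matching line instead of interpreting it; it is a starting point, not a supported tool.

/*
 * check_verbs_rdma.c - sketch of a Nagios-style probe around
 * "mmfsadm saferdump verbs | grep VerbsRdmaStarted" as discussed in
 * this thread. The mmfsadm path and flag name are assumptions; the
 * probe only echoes the matching line, it does not interpret its value.
 *
 * Exit codes follow the Nagios convention: 0=OK, 2=CRITICAL, 3=UNKNOWN.
 */
#include <stdio.h>
#include <string.h>

#define CMD  "/usr/lpp/mmfs/bin/mmfsadm saferdump verbs 2>/dev/null"
#define FLAG "VerbsRdmaStarted"

int main(void)
{
    char line[1024];
    FILE *p = popen(CMD, "r");

    if (p == NULL) {
        printf("RDMA UNKNOWN - unable to run mmfsadm\n");
        return 3;
    }

    while (fgets(line, sizeof(line), p) != NULL) {
        if (strstr(line, FLAG) != NULL) {
            line[strcspn(line, "\r\n")] = '\0';   /* trim the newline */
            pclose(p);
            printf("RDMA OK - %s\n", line);
            return 0;
        }
    }

    pclose(p);
    printf("RDMA CRITICAL - %s not reported by saferdump\n", FLAG);
    return 2;
}

Compiled with something like "cc -o check_verbs_rdma check_verbs_rdma.c", exit code 0 can be treated as RDMA ON, 2 as RDMA OFF/not reported, and 3 as unknown; the parsing of the flag's actual value is left to the reader.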
thanks, Alvise From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hmm interesting ? # mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Dec 19 19:45:24 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 19 Dec 2018 14:45:24 -0500 Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy In-Reply-To: References: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> Message-ID: Regarding mixing different sized NSDs in the same pool... 
GPFS has gotten somewhat smarter about striping over the years and also offers some options about how blocks are allocated over NSDs.... And then there's mmrestripe and its several options/flavors... You probably do want to segregate NSDs into different pools if the performance varies significantly among them. SO old advice may not apply 100%. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Dec 19 19:13:16 2018 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 19 Dec 2018 19:13:16 +0000 Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy In-Reply-To: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> References: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> Message-ID: Hadn?t seen a response, but here?s one thing that might make your decision easier on this question: ?But since ALL of the files in the capacity pool haven?t even been looked at in at least 90 days already, does it really matter? I.e. should I just add the NSDs to the capacity pool and be done with it?? Does the performance matter for accessing files in this capacity pool? If not, then just add it in. If it does, then you?ll need to concern yourself with the performance you?ll get from the NSDs that still have free space to store new data once the smaller NSDs become full. If that?s enough then just add it in. Old data will still be spread across the current storage in the capacity pool, so you?ll get current read performance rates for that data. By creating a new pool, oc, and then migrating data that hasn?t been accessed in over 1 year to it from the capacity pool, you?re freeing up new space to store new data on the capacity pool. This seems to really only be a benefit if the performance of the capacity pool is a lot greater than the oc pool and your users need that performance to satisfy their application workloads. Of course moving data around on a regular basis also has an impact to overall performance during these operations too, but maybe there are times when the system is idle and these operations will not really cause any performance heartburn. I think Marc will have to answer your other question? ;o) Hope that helps! -Bryan From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Buterbaugh, Kevin L Sent: Monday, December 17, 2018 4:02 PM To: gpfsug main discussion list Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy [EXTERNAL EMAIL] Hi All, As those of you who suffered thru my talk at SC18 already know, we?re really short on space on one of our GPFS filesystems as the output of mmdf piped to grep pool shows: Disks in storage pool: system (Maximum disk size allowed is 24 TB) (pool total) 4.318T 1.078T ( 25%) 79.47G ( 2%) Disks in storage pool: data (Maximum disk size allowed is 262 TB) (pool total) 494.7T 38.15T ( 8%) 4.136T ( 1%) Disks in storage pool: capacity (Maximum disk size allowed is 519 TB) (pool total) 640.2T 14.56T ( 2%) 716.4G ( 0%) The system pool is metadata only. The data pool is the default pool. The capacity pool is where files with an atime (yes, atime) > 90 days get migrated. The capacity pool is comprised of NSDs that are 8+2P RAID 6 LUNs of 8 TB drives, so roughly 58.2 TB usable space per NSD. We have the new storage we purchased, but that?s still being tested and held in reserve for after the first of the year when we create a new GPFS 5 formatted filesystem and start migrating everything to the new filesystem. 
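For reference, the usable-capacity figures quoted here appear to be the usual decimal-TB versus binary-TiB conversion: an 8+2P LUN of 8 TB drives holds 8 x 8 x 10^12 bytes of data, and 64 x 10^12 / 2^40 is roughly 58.2 TiB, matching the "58.2 TB usable" above. The same arithmetic for 12 TB drives gives 96 x 10^12 / 2^40, roughly 87.3 TiB, which is the "about 88 TB" per NSD mentioned in the next paragraph, and a single 12 TB drive is 12 x 10^12 / 2^40, about 10.9 TiB, hence the rant below that it is "< 11 TB in size".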
In the meantime, we have also purchased a 60-bay JBOD and 30 x 12 TB drives and will be hooking it up to one of our existing storage arrays on Wednesday. My plan is to create another 3 8+2P RAID 6 LUNs and present those to GPFS as NSDs. They will be about 88 TB usable space each (because ? beginning rant ? a 12 TB drive is < 11 TB is size ? and don?t get me started on so-called ?4K? TV?s ? end rant). A very wise man who used to work at IBM but now hangs out with people in red polos () once told me that it?s OK to mix NSDs of slightly different sizes in the same pool, but you don?t want to put NSDs of vastly different sizes in the same pool because the smaller ones will fill first and then the larger ones will have to take all the I/O. I consider 58 TB and 88 TB to be pretty significantly different and am therefore planning on creating yet another pool called ?oc? (over capacity if a user asks, old crap internally!) and migrating files with an atime greater than, say, 1 year to that pool. But since ALL of the files in the capacity pool haven?t even been looked at in at least 90 days already, does it really matter? I.e. should I just add the NSDs to the capacity pool and be done with it? If it?s a good idea to create another pool, then I have a question about mmapplypolicy and migrations. I believe I understand how things work, but after spending over an hour looking at the documentation I cannot find anything that explicitly confirms my understanding ? so if I have another pool called oc that?s ~264 TB in size and I write a policy file that looks like: define(access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) RULE 'ReallyOldStuff' MIGRATE FROM POOL 'capacity' TO POOL 'oc' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 365) AND (KB_ALLOCATED > 3584)) RULE 'OldStuff' MIGRATE FROM POOL 'data' TO POOL 'capacity' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 90) AND (KB_ALLOCATED > 3584)) Keeping in mind that my capacity pool is already 98% full, is mmapplypolicy smart enough to calculate how much space it?s going to free up in the capacity pool by the ?ReallyOldStuff? rule and therefore be able to potentially also move a ton of stuff from the data pool to the capacity pool via the 2nd rule with just one invocation of mmapplypolicy? That?s what I expect that it will do. I?m hoping I don?t have to run the mmapplypolicy twice ? the first to move stuff from capacity to oc and then a second time for it to realize, oh, I?ve got a much of space free in the capacity pool now. Thanks in advance... Kevin P.S. In case you?re scratching your head over the fact that we have files that people haven?t even looked at for months and months (more than a year in some cases) sitting out there ? we sell quota in 1 TB increments ? once they?ve bought the quota, it?s theirs. As long as they?re paying us the monthly fee if they want to keep files relating to research they did during the George Bush Presidency out there ? and I mean Bush 41, not Bush 43 ?.then that?s their choice. We do not purge files. ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential, or privileged information and/or personal data. 
If you are not the intended recipient, you are hereby notified that any review, dissemination, or copying of this email is strictly prohibited, and requested to notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request, or solicitation of any kind to buy, sell, subscribe, redeem, or perform any type of transaction of a financial product. Personal data, as defined by applicable data privacy laws, contained in this email may be processed by the Company, and any of its affiliated or related companies, for potential ongoing compliance and/or business-related purposes. You may have rights regarding your personal data; for information on exercising these rights or the Company?s treatment of personal data, please email datarequests at jumptrading.com. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Sun Dec 23 13:29:42 2018 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Sun, 23 Dec 2018 14:29:42 +0100 Subject: [gpfsug-discuss] [Announcement] IBM Storage Enabler for Containers v2.0 has been released for general availability Message-ID: We're happy to announce the release of IBM Storage Enabler for Containers v2.0for general availability on Fix Central. IBM Storage Enabler for Containers v2.0extends IBM support for Kubernetes and IBM Cloud Private orchestrated container environments by supporting IBM Spectrum Scale(formerly IBM GPFS). IBM Storage Enabler for Containers v2.0introduces the following new functionalities IBM Spectrum Scale v5.0+support Support orchestration platforms IBM Cloud Private (ICP) v3.1.1 and Kubernetes v1.12 Support mixed deployment of Fibre Channel and iSCSI in the same cluster Kubernetes Service Accounts for more effective pod authorization procedure Required Components IBM Spectrum Connect v3.6 Installer for IBM Storage Enabler for Containers Fix Central publication link http://www.ibm.com/support/fixcentral/swg/quickorder?parent=Software%20defined%20storage&product=ibm/StorageSoftware/IBM +Storage+Enabler+for +Containers&release=All&platform=All&function=all&source=fc Cheers,Tal Mailto: talsha at il.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From roblogie at au1.ibm.com Thu Dec 27 22:38:29 2018 From: roblogie at au1.ibm.com (Rob Logie) Date: Thu, 27 Dec 2018 22:38:29 +0000 Subject: [gpfsug-discuss] Introduction Message-ID: Hi I am a new member to the list. I am an IBMer (GBS) based in Ballarat Australia. Part of my role is supporting a small Spectrum Scale (GPFS) cluster on behalf of an IBM customer in Australia. Cluster is hosted on Redhat AWS EC2 instances and is accessed via CES SMB shares . Cheers (And happy Holidays !) Regards, Rob Logie IT Specialist IBM A/NZ GBS -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From: "Dorigo Alvise (PSI)" To: "gpfsug-discuss at spectrumscale.org" Date: 12/06/2018 02:53 PM Subject: [gpfsug-discuss] PMSensors 5.0.2 vs PMCollector 4.2.3 Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Good morning, I wonder if pmsensors 5.x is supported with a 4.2.3 collector. Since I've upgraded a couple of nodes (afm gateways) to GPFS 5, while the rest of the cluster is still running 4.2.3-7 (including the collector), I haven't got anymore metrics from the upgraded nodes. Further, I do not see any error in /var/log/zimon/ZIMonSensors.log neither in /var/log/zimon/ZIMonCollector.log Does anybody has any idea ? thanks, Regards. Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdratlif at iu.edu Thu Dec 6 16:36:35 2018 From: jdratlif at iu.edu (Ratliff, John) Date: Thu, 6 Dec 2018 16:36:35 +0000 Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan Message-ID: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> We're trying to run a policy scan to get a list of all the files in one of our filesets. There are approximately 600 million inodes in this space. We're running GPFS 3.5. Every time we run the policy scan, the node that is running it ends up crashing. It makes it through a quarter of the inodes before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS logs shows anything. It just notes that the node rebooted. In the crash logs of all the systems we've tried this on, we see the same line. <1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8 <1>IP: [] _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0 x590 [mmfs26] Our policy scan rule is pretty simple: RULE 'list-homedirs' LIST 'list-homedirs' mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1 Has anyone experienced something like this or have any suggestions on what to do to avoid it? Thanks. John Ratliff | Pervasive Technology Institute | UITS | Research Storage - Indiana University | http://pti.iu.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5670 bytes Desc: not available URL: From Renar.Grunenberg at huk-coburg.de Thu Dec 6 17:03:28 2018 From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar) Date: Thu, 6 Dec 2018 17:03:28 +0000 Subject: [gpfsug-discuss] Spectrum Scale for Windows Domain -User Requirements Message-ID: <50178605f3f643e8a72c5f7f3a547bc7@SMXRF105.msg.hukrf.de> Hallo All, i had a question about the domain-user root account on Windows. We have some requirements to restrict these level of authorization and found no info what is possible to change here. Two questions: 1. It is possible to define a other Domain-Account other than as root for this. 2. If not, is it possible to define a local account as root on Windows-Clients? Any hints are appreciate. Thanks Renar Renar Grunenberg Abteilung Informatik ? 
Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Thu Dec 6 17:15:15 2018 From: stockf at us.ibm.com (Frederick Stock) Date: Thu, 6 Dec 2018 12:15:15 -0500 Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan In-Reply-To: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> References: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> Message-ID: Hopefully you are aware that GPFS 3.5 has been out of service since April 2017 unless you are on extended service. Might be a good time to consider upgrading. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com From: "Ratliff, John" To: "gpfsug-discuss at spectrumscale.org" Date: 12/06/2018 11:53 AM Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan Sent by: gpfsug-discuss-bounces at spectrumscale.org We?re trying to run a policy scan to get a list of all the files in one of our filesets. There are approximately 600 million inodes in this space. We?re running GPFS 3.5. Every time we run the policy scan, the node that is running it ends up crashing. It makes it through a quarter of the inodes before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS logs shows anything. It just notes that the node rebooted. In the crash logs of all the systems we?ve tried this on, we see the same line. <1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8 <1>IP: [] _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0x590 [mmfs26] Our policy scan rule is pretty simple: RULE 'list-homedirs' LIST 'list-homedirs' mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1 Has anyone experienced something like this or have any suggestions on what to do to avoid it? Thanks. John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? 
Indiana University | http://pti.iu.edu [attachment "smime.p7s" deleted by Frederick Stock/Pittsburgh/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.robinson02 at gmail.com Thu Dec 6 17:41:50 2018 From: matthew.robinson02 at gmail.com (Matthew Robinson) Date: Thu, 6 Dec 2018 12:41:50 -0500 Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan In-Reply-To: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> References: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu> Message-ID: call IBM you ass,. submit an SOS report a\nd fuckofff' On Thu, Dec 6, 2018 at 11:46 AM Ratliff, John wrote: > We?re trying to run a policy scan to get a list of all the files in one of > our filesets. There are approximately 600 million inodes in this space. > We?re running GPFS 3.5. Every time we run the policy scan, the node that is > running it ends up crashing. It makes it through a quarter of the inodes > before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS > logs shows anything. It just notes that the node rebooted. > > > > In the crash logs of all the systems we?ve tried this on, we see the same > line. > > > > <1>BUG: unable to handle kernel NULL pointer dereference at > 00000000000000d8 > > <1>IP: [] > _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0x590 > [mmfs26] > > > > Our policy scan rule is pretty simple: > > > > RULE 'list-homedirs' > > LIST 'list-homedirs' > > > > mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N > gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1 > > > > Has anyone experienced something like this or have any suggestions on what > to do to avoid it? > > > > Thanks. > > > > John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? > Indiana University | http://pti.iu.edu > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Thu Dec 6 18:47:05 2018 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 6 Dec 2018 13:47:05 -0500 Subject: [gpfsug-discuss] Bizarre fcntl locking behavior Message-ID: <7ec54f20-77c9-8534-c3f4-3cd270f1c2a3@nasa.gov> I've been trying to chase down an error one of our users periodically sees with Intel MPI. The body of the error is this: This requires fcntl(2) to be implemented. As of 8/25/2011 it is not. Generic MPICH Message: File locking failed in ADIOI_Set_lock(fd F,cmd F_SETLKW/7,type F_RDLCK/0,whence 0) with return value FFFFFFFF and errno 25. - If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running on all the machines, and mount the directory with the 'noac' option (no attribute caching). - If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option. 
ADIOI_Set_lock:: No locks available
ADIOI_Set_lock:offset 0, length 8

When this happens, a new job is reading back-in the checkpoint files a previous job wrote. Consistently it's the reading in of previously written files that triggers this although the occurrence is sporadic and if the job retries enough times the error will go away.

The really curious thing, is there is only one byte range lock per file per-node open at any time, so the error 37 (I know it says 25 but that's actually in hex even though it's not prefixed with 0x) of being out of byte range locks is a little odd to me. The default is 200 but we should be no way near that.

I've been trying to frantically chase this down with various MPI reproducers but alas I came up short, until this morning, when I gave up on the MPI approach and tried something a little more simple. I've discovered that when:

- A file is opened by node A (a key requirement to reproduce seems to be that node A is *also* the metanode for the file. I've not been able to reproduce if node A is *not* the metanode)
- Node A Acquires a bunch of write locks in the file
- Node B then also acquires a bunch of write locks in the file
- Node B then acquires a bunch of read locks in the file
- Node A then also acquires a bunch of read locks in the file

At that last step, Node A will experience the errno 37 attempting to acquire read locks.

Here are the actual commands to reproduce this (source code for fcntl_stress.c is attached):

Node A: rm /gpfs/aaronFS/testFile; dd if=/dev/zero of=/gpfs/aaronFS/testFile bs=1M count=4000
Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) 1
Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) 1
Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024))
Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024))

Now that I've typed this out, I realize this really should be a PMR not a post to the mailing list :) but I thought it was interesting and wanted to share.

-Aaron

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
-------------- next part --------------
/* Aaron Knister
   Program to acquire a bunch of byte range locks in a file */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <assert.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    char *filename;
    int fd;
    struct stat statBuf;
    int highRand;
    int lowRand;
    unsigned int l_start = 0;
    unsigned int l_len;
    int openMode;
    int lockType;
    struct flock lock;
    unsigned int stride;

    filename = argv[1];
    stride = atoi(argv[2]);
    l_len = atoi(argv[3]);

    /* Any extra argument switches from read locks to write locks. */
    if ( argc > 4 ) {
        openMode = O_WRONLY;
        lockType = F_WRLCK;
    } else {
        openMode = O_RDONLY;
        lockType = F_RDLCK;
    }

    printf("Opening file '%s' in %s mode. stride = %d. l_len = %d\n",
           filename, (openMode == O_WRONLY) ? "write" : "read", stride, l_len);

    assert( (fd = open(filename, openMode)) >= 0 );
    assert( fstat(fd, &statBuf) == 0 );

    /* Walk the file, taking and releasing one byte-range lock per stride. */
    while (1) {
        if ( l_start >= statBuf.st_size ) {
            break;
            l_start = 0;
        }

        highRand = rand();
        lowRand = rand();

        lock.l_type = lockType;
        lock.l_whence = 0;
        lock.l_start = l_start;
        lock.l_len = l_len;
        if (fcntl(fd, F_SETLKW, &lock) != 0) {
            fprintf(stderr, "Non-zero return from fcntl. errno = %d (%s)\n",
                    errno, strerror(errno));
            abort();
        }

        lock.l_type = F_UNLCK;
        lock.l_whence = 0;
        lock.l_start = l_start;
        lock.l_len = l_len;
        assert(fcntl(fd, F_SETLKW, &lock) != -1);

        l_start += stride;
    }
}

From aaron.s.knister at nasa.gov Thu Dec 6 18:56:44 2018 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 6 Dec 2018 13:56:44 -0500 Subject: [gpfsug-discuss] Bizarre fcntl locking behavior In-Reply-To: <7ec54f20-77c9-8534-c3f4-3cd270f1c2a3@nasa.gov> References: <7ec54f20-77c9-8534-c3f4-3cd270f1c2a3@nasa.gov> Message-ID: <5a5c70c6-c67f-1b5e-77ae-ef38553cb7e4@nasa.gov>

Just for the sake of completeness, when the test program fails in the expected fashion this is the message it prints:

Opening file 'read' in /gpfs/aaronFS/testFile mode. stride = 1048576 l_len = 262144
Non-zero return from fcntl. errno = 37 (No locks available)
Aborted

-Aaron

On 12/6/18 1:47 PM, Aaron Knister wrote:
> I've been trying to chase down an error one of our users periodically
> sees with Intel MPI. The body of the error is this:
>
> This requires fcntl(2) to be implemented. As of 8/25/2011 it is not.
> Generic MPICH Message: File locking failed in ADIOI_Set_lock(fd F,cmd
> F_SETLKW/7,type F_RDLCK/0,whence 0) with return value FFFFFFFF and errno
> 25.
> - If the file system is NFS, you need to use NFS version 3, ensure that
> the lockd daemon is running on all the machines, and mount the directory
> with the 'noac' option (no attribute caching).
> - If the file system is LUSTRE, ensure that the directory is mounted
> with the 'flock' option.
> ADIOI_Set_lock:: No locks available
> ADIOI_Set_lock:offset 0, length 8
>
> When this happens, a new job is reading back-in the checkpoint files a
> previous job wrote. Consistently it's the reading in of previously
> written files that triggers this although the occurrence is sporadic and
> if the job retries enough times the error will go away.
>
> The really curious thing, is there is only one byte range lock per file
> per-node open at any time, so the error 37 (I know it says 25 but that's
> actually in hex even though it's not prefixed with 0x) of being out of
> byte range locks is a little odd to me. The default is 200 but we should
> be no way near that.
>
> I've been trying to frantically chase this down with various MPI
> reproducers but alas I came up short, until this morning, when I gave up
> on the MPI approach and tried something a little more simple. I've
> discovered that when:
>
> - A file is opened by node A (a key requirement to reproduce seems to be
> that node A is *also* the metanode for the file. I've not been able to
> reproduce if node A is *not* the metanode)
> - Node A Acquires a bunch of write locks in the file
> - Node B then also acquires a bunch of write locks in the file
> - Node B then acquires a bunch of read locks in the file
> - Node A then also acquires a bunch of read locks in the file
>
> At that last step, Node A will experience the errno 37 attempting to
> acquire read locks.
> > Here are the actual commands to reproduce this (source code for > fcntl_stress.c is attached): > > Node A: rm /gpfs/aaronFS/testFile; dd if=/dev/zero > of=/gpfs/aaronFS/testFile bs=1M count=4000 > Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) > $((256*1024)) 1 > Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) > $((256*1024)) 1 > Node B: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) > Node A: ./fcntl_stress /gpfs/aaronFS/testFile $((1024*1024)) $((256*1024)) > > Now that I've typed this out, I realize this really should be a PMR not > a post to the mailing list :) but I thought it was interesting and > wanted to share. > > -Aaron > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From S.J.Thompson at bham.ac.uk Thu Dec 6 19:24:33 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Thu, 6 Dec 2018 19:24:33 +0000 Subject: [gpfsug-discuss] GPFS nodes crashing during policy scan In-Reply-To: References: <050dcbb86b6e443daeb7dfd10b6b4e10@IN-CCI-D1S14.ads.iu.edu>, Message-ID: Just a gentle reminder that this is a community based list and that we expect people to be respectful of each other on the list. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of matthew.robinson02 at gmail.com [matthew.robinson02 at gmail.com] Sent: 06 December 2018 17:41 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS nodes crashing during policy scan On Thu, Dec 6, 2018 at 11:46 AM Ratliff, John > wrote: We?re trying to run a policy scan to get a list of all the files in one of our filesets. There are approximately 600 million inodes in this space. We?re running GPFS 3.5. Every time we run the policy scan, the node that is running it ends up crashing. It makes it through a quarter of the inodes before crashing (i.e. kernel panic and system reboot). Nothing in the GPFS logs shows anything. It just notes that the node rebooted. In the crash logs of all the systems we?ve tried this on, we see the same line. <1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8 <1>IP: [] _ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0x590 [mmfs26] Our policy scan rule is pretty simple: RULE 'list-homedirs' LIST 'list-homedirs' mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1 Has anyone experienced something like this or have any suggestions on what to do to avoid it? Thanks. John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? Indiana University | http://pti.iu.edu _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Matthew Robinson Comptia A+, Net+ 919.909.0494 matthew.robinson02 at gmail.com The greatest discovery of my generation is that man can alter his life simply by altering his attitude of mind. - William James, Harvard Psychologist. From aaron.s.knister at nasa.gov Fri Dec 7 13:38:35 2018 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[InuTeq, LLC]) Date: Fri, 7 Dec 2018 13:38:35 +0000 Subject: [gpfsug-discuss] Test? Message-ID: <9272165F-DF81-4B01-91B5-D04B07E17AC7@nasa.gov> I sent a couple messages to the list earlier that made it to the archives online but seemingly never made it to anyone else I talked to. 
I?m curious to see if this message goes through. -Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Fri Dec 7 13:47:12 2018 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Fri, 7 Dec 2018 14:47:12 +0100 Subject: [gpfsug-discuss] SAVE the date - Spectrum Scale Strategy Days 2019 at IBM Ehningen, Germany 19., 20./21. March 2018 In-Reply-To: References: Message-ID: Spectrum Scale Strategy Days 2019 at IBM Ehningen, Germany 19., 20./21. March 2018 https://www.ibm.com/events/wwe/grp/grp308.nsf/Agenda.xsp?openform&seminar=Z94GKRES&locale=de_DE Save the date :-) -frank- Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Am Weiher 24, 65451 Kelsterbach, Germany mailto:kraemerf at de.ibm.com Mobile +49171-3043699 IBM Germany -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Fri Dec 7 22:05:53 2018 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Fri, 7 Dec 2018 22:05:53 +0000 Subject: [gpfsug-discuss] Test? In-Reply-To: <9272165F-DF81-4B01-91B5-D04B07E17AC7@nasa.gov> References: <9272165F-DF81-4B01-91B5-D04B07E17AC7@nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: From scale at us.ibm.com Fri Dec 7 22:55:06 2018 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Fri, 7 Dec 2018 14:55:06 -0800 Subject: [gpfsug-discuss] Spectrum Scale for Windows Domain -UserRequirements In-Reply-To: <50178605f3f643e8a72c5f7f3a547bc7@SMXRF105.msg.hukrf.de> References: <50178605f3f643e8a72c5f7f3a547bc7@SMXRF105.msg.hukrf.de> Message-ID: Hello, Unfortunately, to allow bidirectional passwordless ssh between Linux/Windows (for sole purpose of mm* commands), the literal username 'root' is a requirement. Here are a few variations. 1. Use domain account 'root', where 'root' belongs to "Domain Admins" group. This is the easiest 1-step and the recommended way. or 2. Use domain account 'root', where 'root' does NOT belong to "Domain Admins" group. In this case, on each and every GPFS Windows node, add this 'domain\root' account to local "Administrators" group. or 3. On each and every GPFS Windows node, create a local 'root' account as a member of local "Administrators" group. (1) and (2) work well reliably with Cygwin. I have seen inconsistent results with approach (3) wherein Cygwin passwordless ssh in incoming direction (linux->windows) sometimes breaks and prompts for password. Give it a try to see if you get better results. If you cannot get around the 'root' literal username requirement, the suggested alternative is to use GPFS multi-clustering. Create a separate cluster of all Windows-only nodes (using mmwinrsh/mmwinrcp instead of ssh/scp... so that 'root' requirement is eliminated). And then remote mount from the Linux cluster (all non-Windows nodes) via mmauth, mmremotecluster and mmremotefs et al. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. 
The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Grunenberg, Renar" To: "gpfsug-discuss at spectrumscale.org" Date: 12/06/2018 09:05 AM Subject: [gpfsug-discuss] Spectrum Scale for Windows Domain -User Requirements Sent by: gpfsug-discuss-bounces at spectrumscale.org Hallo All, i had a question about the domain-user root account on Windows. We have some requirements to restrict these level of authorization and found no info what is possible to change here. Two questions: 1. It is possible to define a other Domain-Account other than as root for this. 2. If not, is it possible to define a local account as root on Windows-Clients? Any hints are appreciate. Thanks Renar Renar Grunenberg Abteilung Informatik ? Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. J?rg Rheinl?nder (stv.), Sarah R?ssler, Daniel Thomas. Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Tue Dec 11 11:17:10 2018 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Tue, 11 Dec 2018 12:17:10 +0100 Subject: [gpfsug-discuss] Spectrum Scale *News* Dec 12th 2018 Message-ID: IBM SpectrumAI with NVIDIA DGX is the key to data science productivity https://www.ibm.com/blogs/systems/introducing-spectrumai-with-nvidia-dgx/ To drive AI development productivity and streamline the AI data pipeline, IBM is introducing IBM Spectrum AI with NVIDIA DGX. This converged solution combines the industry acclaimed software-defined storage scale-out file system, IBM Spectrum Scale on flash with NVIDIA DGX-1. It provides the highest performance in any tested converged system with the unique ability to support a growing data science practice. https://public.dhe.ibm.com/common/ssi/ecm/81/en/81022381usen/ibm-spectrumai-ref-arch-dec10-v6_81022381USEN.pdf Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Am Weiher 24, 65451 Kelsterbach, Germany mailto:kraemerf at de.ibm.com Mobile +49171-3043699 IBM Germany -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
From u.sibiller at science-computing.de Thu Dec 13 13:52:42 2018
From: u.sibiller at science-computing.de (Ulrich Sibiller)
Date: Thu, 13 Dec 2018 14:52:42 +0100
Subject: [gpfsug-discuss] Filesystem access issues via CES NFS
In-Reply-To: <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se>
References: <717f49aade0b439eb1b99fc620a21cac@maxiv.lu.se> <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se>
Message-ID: 

On 23.11.2018 14:41, Andreas Mattsson wrote:
> Yes, this is repeating.
>
> We've ascertained that it has nothing to do at all with file operations on the GPFS side.
>
> Randomly throughout the filesystem mounted via NFS, ls or file access will give
>
> ...
> ls: reading directory /gpfs/filessystem/test/testdir: Invalid argument
> ...
>
> Trying again later might work on that folder, but might fail somewhere else.
>
> We have tried exporting the same filesystem via a standard kernel NFS instead of the CES Ganesha-NFS, and then the problem doesn't exist.
>
> So it is definitely related to the Ganesha NFS server, or its interaction with the file system.
>
> Will see if I can get a tcpdump of the issue.

We see this, too. We cannot trigger it. Fortunately I have managed to capture some logs with debugging enabled. I have now dug into the ganesha 2.5.3 code and I think the netgroup caching is the culprit.
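(As an aside, a quick way to cross-check what the name service itself returns for a netgroup on a protocol node is getent; the command below reuses the netgroup and client names from the trace purely for illustration, and the output format varies with the NSS backend:

# getent netgroup netgroup1
netgroup1 (client1.domain,-,) ...

If the resolver reports the client as a member while Ganesha still rejects it, that points at the NFS server's own netgroup caching rather than at NSS.)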
gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :Final options (options=42102002root_squash , ----, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_rpc_execute :DISP :INFO :DISP: INFO: Client ::ffff:1.2.3.4 is not allowed to access Export_Id 1 /gpfsexport, vers=3, proc=18 The client "client1" is definitely a member of the "netgroup1". But the NETGROUP_CLIENT lookups for "netgroup2" and "netgroup3" can only happen if the netgroup caching code reports that "client1" is NOT a member of "netgroup1". I have also opened a support case at IBM for this. @Malahal: Looks like you have written the netgroup caching code, feel free to ask for further details if required. Kind regards, Ulrich Sibiller -- Dipl.-Inf. Ulrich Sibiller science + computing ag System Administration Hagellocher Weg 73 72070 Tuebingen, Germany https://atos.net/de/deutschland/sc -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From jcram at ddn.com Fri Dec 14 14:45:52 2018 From: jcram at ddn.com (Jeno Cram) Date: Fri, 14 Dec 2018 14:45:52 +0000 Subject: [gpfsug-discuss] Filesystem access issues via CES NFS In-Reply-To: References: <717f49aade0b439eb1b99fc620a21cac@maxiv.lu.se> <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se> Message-ID: Are you using Extended attributes on the directories in question? Jeno Cram | Systems Engineer Mobile: 517-980-0495 jcram at ddn.com DDN.com ?On 12/13/18, 9:02 AM, "Ulrich Sibiller" wrote: On 23.11.2018 14:41, Andreas Mattsson wrote: > Yes, this is repeating. > > We?ve ascertained that it has nothing to do at all with file operations on the GPFS side. > > Randomly throughout the filesystem mounted via NFS, ls or file access will give > > ? > > > ls: reading directory /gpfs/filessystem/test/testdir: Invalid argument > > ? > > Trying again later might work on that folder, but might fail somewhere else. > > We have tried exporting the same filesystem via a standard kernel NFS instead of the CES > Ganesha-NFS, and then the problem doesn?t exist. > > So it is definitely related to the Ganesha NFS server, or its interaction with the file system. > > Will see if I can get a tcpdump of the issue. We see this, too. We cannot trigger it. Fortunately I have managed to capture some logs with debugging enabled. I have now dug into the ganesha 2.5.3 code and I think the netgroup caching is the culprit. 
Here some FULL_DEBUG output: 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :Check for address 1.2.3.4 for export id 1 path /gpfsexport 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcf7fe0 NETGROUP_CLIENT: netgroup1 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcfe320 NETGROUP_CLIENT: netgroup2 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] client_match :EXPORT :M_DBG :Match V4: 0xcfe380 NETGROUP_CLIENT: netgroup3 (options=421021e2root_squash , RWrw, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_ip_name_get :DISP :F_DBG :Cache get hit for 1.2.3.4->client1.domain 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :EXPORT (options=03303002 , , , , , -- Deleg, , ) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :EXPORT_DEFAULTS (options=42102002root_squash , ----, 3--, ---, TCP, ----, Manage_Gids , , anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :default options (options=03303002root_squash , ----, 34-, UDP, TCP, ----, No Manage_Gids, -- Deleg, anon_uid= -2, anon_gid= -2, none, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] export_check_access :EXPORT :M_DBG :Final options (options=42102002root_squash , ----, 3--, ---, TCP, ----, Manage_Gids , -- Deleg, anon_uid= -2, anon_gid= -2, sys) 2018-12-13 11:53:41 : epoch 0009008d : server1 : gpfs.ganesha.nfsd-258762[work-250] nfs_rpc_execute :DISP :INFO :DISP: INFO: Client ::ffff:1.2.3.4 is not allowed to access Export_Id 1 /gpfsexport, vers=3, proc=18 The client "client1" is definitely a member of the "netgroup1". But the NETGROUP_CLIENT lookups for "netgroup2" and "netgroup3" can only happen if the netgroup caching code reports that "client1" is NOT a member of "netgroup1". I have also opened a support case at IBM for this. @Malahal: Looks like you have written the netgroup caching code, feel free to ask for further details if required. Kind regards, Ulrich Sibiller -- Dipl.-Inf. Ulrich Sibiller science + computing ag System Administration Hagellocher Weg 73 72070 Tuebingen, Germany https://atos.net/de/deutschland/sc -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. 
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Kevin.Buterbaugh at Vanderbilt.Edu Thu Dec 13 20:54:39 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Thu, 13 Dec 2018 20:54:39 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? Message-ID: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Hi All, Googling ?GPFS and iSCSI? doesn?t produce a ton of hits! But we are interested to know if anyone is actually using GPFS over iSCSI? The reason why I?m asking is that we currently use an 8 Gb FC SAN ? QLogic SANbox 5800?s, QLogic HBA?s in our NSD servers ? but we?re seeing signs that, especially when we start using beefier storage arrays with more disks behind the controllers, the 8 Gb FC could be a bottleneck. As many / most of you are already aware, I?m sure, while 16 Gb FC exists, there?s basically only one vendor in that game. And guess what happens to prices when there?s only one vendor??? We bought our 8 Gb FC switches for approximately $5K apiece. List price on a 16 Gb FC switch - $40K. Ouch. So the idea of being able to use commodity 10 or 40 Gb Ethernet switches and HBA?s is very appealing ? both from a cost and a performance perspective (last I checked 40 Gb was more than twice 16 Gb!). Anybody doing this already? As those of you who?ve been on this list for a while and don?t filter out e-mails from me () already know, we have a much beefier Infortrend storage array we?ve purchased that I?m currently using to test various metadata configurations (and I will report back results on that when done, I promise). That array also supports iSCSI, so I actually have our test cluster GPFS filesystem up and running over iSCSI. It was surprisingly easy to set up. But any tips, suggestions, warnings, etc. about running GPFS over iSCSI are appreciated! Two things that I am already aware of are: 1) use jumbo frames, and 2) run iSCSI over it?s own private network. Other things I should be aware of?!? Thanks all? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.knister at gmail.com Sat Dec 15 14:44:25 2018 From: aaron.knister at gmail.com (Aaron Knister) Date: Sat, 15 Dec 2018 09:44:25 -0500 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: <9955EFB8-B50B-4F8C-8803-468626BE23FB@gmail.com> Hi Kevin, I don?t have any experience running GPFS over iSCSI (although does iSER count?). For what it?s worth there are, I believe, 2 vendors in the FC space? Cisco and Brocade. You can get a fully licensed 16GB Cisco MDS edge switch for not a whole lot (https://m.cdw.com/product/cisco-mds-9148s-switch-48-ports-managed-rack-mountable/3520640). If you start with fewer ports licensed the cost goes down dramatically. 
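As a side note on the host-side setup being asked about: on Linux the open-iscsi and dm-multipath pieces for a test like Kevin's usually look roughly like the sketch below. The interface name and portal address are invented placeholders, not taken from any real configuration:

# jumbo frames on the dedicated storage interface
ip link set dev eth2 mtu 9000

# discover the array's targets and log in
iscsiadm -m discovery -t sendtargets -p 192.168.100.10
iscsiadm -m node -p 192.168.100.10 --login

# confirm the sessions are up and multipath sees the LUNs
iscsiadm -m session
multipath -ll

The resulting /dev/mapper devices are then what goes into the NSD stanza file, exactly as with FC-attached LUNs.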
I?ve also found that, if it?s an option, IB is stupidly cheap from a dollar per unit of bandwidth perspective and makes a great SAN backend. One other, other thought is that FCoE seems very attractive. Not sure if your arrays support that but I believe you get closer performance and behavior to FC with FCoE than with iSCSI and I don?t think there?s a huge cost difference. It?s even more fun if you have multi-fabric FC switches that can do FC and FCoE because you can in theory bridge the two fabrics (e.g. use FCoE on your NSD servers to some 40G switches that support DCB and connect the 40G eth switches to an FC/FCoE switch and then address your 8Gb FC storage and FCoE storage using the same fabric). -Aaron Sent from my iPhone > On Dec 13, 2018, at 15:54, Buterbaugh, Kevin L wrote: > > Hi All, > > Googling ?GPFS and iSCSI? doesn?t produce a ton of hits! But we are interested to know if anyone is actually using GPFS over iSCSI? > > The reason why I?m asking is that we currently use an 8 Gb FC SAN ? QLogic SANbox 5800?s, QLogic HBA?s in our NSD servers ? but we?re seeing signs that, especially when we start using beefier storage arrays with more disks behind the controllers, the 8 Gb FC could be a bottleneck. > > As many / most of you are already aware, I?m sure, while 16 Gb FC exists, there?s basically only one vendor in that game. And guess what happens to prices when there?s only one vendor??? We bought our 8 Gb FC switches for approximately $5K apiece. List price on a 16 Gb FC switch - $40K. Ouch. > > So the idea of being able to use commodity 10 or 40 Gb Ethernet switches and HBA?s is very appealing ? both from a cost and a performance perspective (last I checked 40 Gb was more than twice 16 Gb!). Anybody doing this already? > > As those of you who?ve been on this list for a while and don?t filter out e-mails from me () already know, we have a much beefier Infortrend storage array we?ve purchased that I?m currently using to test various metadata configurations (and I will report back results on that when done, I promise). That array also supports iSCSI, so I actually have our test cluster GPFS filesystem up and running over iSCSI. It was surprisingly easy to set up. But any tips, suggestions, warnings, etc. about running GPFS over iSCSI are appreciated! > > Two things that I am already aware of are: 1) use jumbo frames, and 2) run iSCSI over it?s own private network. Other things I should be aware of?!? > > Thanks all? > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Sun Dec 16 11:59:39 2018 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Sun, 16 Dec 2018 12:59:39 +0100 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? - In-Reply-To: References: Message-ID: Kevin, Ethernet networking of today is changing very fast as the driving forces are the "Hyperscale" datacenters. This big innovation is changing the world and is happening right now. You must understand the conversation by breaking down the differences between ASICs, FPGAs, and NPUs in modern Ethernet networking. 
1) Mellanox has a very good answer here based on the Spectrum-2 chip http://www.mellanox.com/page/press_release_item?id=1933 2) Broadcom's answer to this is the 12.8 Tb/s StrataXGS Tomahawk 3 Ethernet Switch Series https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm56980-series https://globenewswire.com/news-release/2018/10/24/1626188/0/en/Broadcom-Achieves-Mass-Production-on-Industry-Leading-12-8-Tbps-Tomahawk-3-Ethernet-Switch-Family.html 3) Barefoots Tofinio2 is another valid answer to this problem as it's programmable with the P4 language (important for Hyperscale Datacenters) https://www.barefootnetworks.com/ The P4 language itself is open source. There?s details at p4.org, or you can download code at GitHub: https://github.com/p4lang/ 4) The last newcomer to this party comes from Innovium named Teralynx https://innovium.com/products/teralynx/ https://innovium.com/2018/03/20/innovium-releases-industrys-most-advanced-switch-software-platform-for-high-performance-data-center-networking-2-2-2-2-2-2/ (Most of the new Cisco switches are powered by the Teralynx silicon, as Cisco seems to be late to this game with it's own development.) So back your question - iSCSI is not the future! NVMe and it's variants is the way to go and these new ethernet swichting products does have this in focus. Due to the performance demands of NVMe, high performance and low latency networking is required and Ethernet based RDMA ? RoCE, RoCEv2 or iWARP are the leading choices. -frank- P.S. My Xmas wishlist to the IBM Spectrum Scale development team would be a "2019 HighSpeed Ethernet Networking optimization for Spectrum Scale" to make use of all these new things and options :-) Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Am Weiher 24, 65451 Kelsterbach, Germany mailto:kraemerf at de.ibm.com Mobile +49171-3043699 IBM Germany -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Sun Dec 16 13:45:47 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Sun, 16 Dec 2018 14:45:47 +0100 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? - In-Reply-To: References: Message-ID: I have been running GPFS over iSCSI, and know of customers who are also. Probably not in the most demanding environments, but from my experience iSCSI works perfectly fine as long as you have a stable network. Having a dedicated (simple) storage network for iSCSI is probably a good idea (just like for FC), otherwise iSCSI or GPFS is going to look bad when your network admins cause problems on the shared network. -jf s?n. 16. des. 2018 kl. 12:59 skrev Frank Kraemer : > Kevin, > > Ethernet networking of today is changing very fast as the driving forces > are the "Hyperscale" datacenters. This big innovation is changing the world > and is happening right now. You must understand the conversation by > breaking down the differences between ASICs, FPGAs, and NPUs in modern > Ethernet networking. 
> > 1) Mellanox has a very good answer here based on the Spectrum-2 chip > http://www.mellanox.com/page/press_release_item?id=1933 > > 2) Broadcom's answer to this is the 12.8 Tb/s StrataXGS Tomahawk 3 > Ethernet Switch Series > > https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm56980-series > > https://globenewswire.com/news-release/2018/10/24/1626188/0/en/Broadcom-Achieves-Mass-Production-on-Industry-Leading-12-8-Tbps-Tomahawk-3-Ethernet-Switch-Family.html > > 3) Barefoots Tofinio2 is another valid answer to this problem as it's > programmable with the P4 language (important for Hyperscale Datacenters) > https://www.barefootnetworks.com/ > > The P4 language itself is open source. There?s details at p4.org, or you > can download code at GitHub: https://github.com/p4lang/ > > 4) The last newcomer to this party comes from Innovium named Teralynx > https://innovium.com/products/teralynx/ > > https://innovium.com/2018/03/20/innovium-releases-industrys-most-advanced-switch-software-platform-for-high-performance-data-center-networking-2-2-2-2-2-2/ > > (Most of the new Cisco switches are powered by the Teralynx silicon, as > Cisco seems to be late to this game with it's own development.) > > So back your question - iSCSI is not the future! NVMe and it's variants is > the way to go and these new ethernet swichting products does have this in > focus. > Due to the performance demands of NVMe, high performance and low latency > networking is required and Ethernet based RDMA ? RoCE, RoCEv2 or iWARP are > the leading choices. > > -frank- > > P.S. My Xmas wishlist to the IBM Spectrum Scale development team would be > a "2019 HighSpeed Ethernet Networking optimization for Spectrum Scale" to > make use of all these new things and options :-) > > Frank Kraemer > IBM Consulting IT Specialist / Client Technical Architect > Am Weiher 24, 65451 Kelsterbach, Germany > mailto:kraemerf at de.ibm.com > Mobile +49171-3043699 > IBM Germany > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Paul.Sanchez at deshaw.com Sun Dec 16 17:25:13 2018 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Sun, 16 Dec 2018 17:25:13 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? - In-Reply-To: References: Message-ID: Using iSCSI with Spectrum Scale is definitely do-able. As with running Scale in general, your networking needs to be very solid. For iSCSI the best practice I?m aware of is the dedicated/simple approach described by JF below: one subnet per switch (failure domain), nothing fancy like VRRP/HSRP/STP, and let multipathd do its job at ensuring that the available paths are the ones being used. We have also had some good experiences using routed iSCSI (which fits the rackscale/hyperscale style deployment model too, but this implies that you have a good QoS plan to assure that markings are correct and any link which can become congested can?t completely starve the dedicated queue you should be using for iSCSI. It?s also a good practice for the other TCP traffic in your non-iSCSI queue to use ECN in order to keep switch buffer utilization low. (As of today, I haven?t seen any iSCSI arrays which support ECN.) If you?re sharing arrays with multiple clusters/filesystems (i.e. 
not a single workload), then I would also recommend using iSCSI arrays which support per-volume/volume-group QOS limits to avoid noisy-neighbor problems in the iSCSI realm. As of today, there are even 100GbE capable all-flash solutions available which work well with Scale. Lastly, I?d say that iSCSI might not be the future? but NVMeOF hasn?t exactly given us many products ready to be the present. Most of the early offerings in this space are under-featured, over-priced, inflexible, proprietary, or fragile. We are successfully using non-standards based NVMe solutions today with Scale, but they have much more stringent and sensitive networking requirements (e.g. non-routed dedicated networking with PFC for RoCE) in order to provide reliable performance. So far, we?ve found these early offerings best-suited for single-workload use cases. I do expect this to continue to develop and improve on price, features, reliability/fragility. Thx Paul From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Jan-Frode Myklebust Sent: Sunday, December 16, 2018 8:46 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Anybody running GPFS over iSCSI? - I have been running GPFS over iSCSI, and know of customers who are also. Probably not in the most demanding environments, but from my experience iSCSI works perfectly fine as long as you have a stable network. Having a dedicated (simple) storage network for iSCSI is probably a good idea (just like for FC), otherwise iSCSI or GPFS is going to look bad when your network admins cause problems on the shared network. -jf s?n. 16. des. 2018 kl. 12:59 skrev Frank Kraemer >: Kevin, Ethernet networking of today is changing very fast as the driving forces are the "Hyperscale" datacenters. This big innovation is changing the world and is happening right now. You must understand the conversation by breaking down the differences between ASICs, FPGAs, and NPUs in modern Ethernet networking. 1) Mellanox has a very good answer here based on the Spectrum-2 chip http://www.mellanox.com/page/press_release_item?id=1933 2) Broadcom's answer to this is the 12.8 Tb/s StrataXGS Tomahawk 3 Ethernet Switch Series https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm56980-series https://globenewswire.com/news-release/2018/10/24/1626188/0/en/Broadcom-Achieves-Mass-Production-on-Industry-Leading-12-8-Tbps-Tomahawk-3-Ethernet-Switch-Family.html 3) Barefoots Tofinio2 is another valid answer to this problem as it's programmable with the P4 language (important for Hyperscale Datacenters) https://www.barefootnetworks.com/ The P4 language itself is open source. There?s details at p4.org, or you can download code at GitHub: https://github.com/p4lang/ 4) The last newcomer to this party comes from Innovium named Teralynx https://innovium.com/products/teralynx/ https://innovium.com/2018/03/20/innovium-releases-industrys-most-advanced-switch-software-platform-for-high-performance-data-center-networking-2-2-2-2-2-2/ (Most of the new Cisco switches are powered by the Teralynx silicon, as Cisco seems to be late to this game with it's own development.) So back your question - iSCSI is not the future! NVMe and it's variants is the way to go and these new ethernet swichting products does have this in focus. Due to the performance demands of NVMe, high performance and low latency networking is required and Ethernet based RDMA ? RoCE, RoCEv2 or iWARP are the leading choices. -frank- P.S. 
My Xmas wishlist to the IBM Spectrum Scale development team would be a "2019 HighSpeed Ethernet Networking optimization for Spectrum Scale" to make use of all these new things and options :-) Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Am Weiher 24, 65451 Kelsterbach, Germany mailto:kraemerf at de.ibm.com Mobile +49171-3043699 IBM Germany _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Mon Dec 17 00:21:57 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 00:21:57 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: On 13/12/2018 20:54, Buterbaugh, Kevin L wrote: [SNIP] > > Two things that I am already aware of are: ?1) use jumbo frames, and 2) > run iSCSI over it?s own private network. ?Other things I should be aware > of?!? > Yes, don't do it. Really do not do it unless you have datacenter Ethernet switches and adapters. Those are the ones required for FCoE. Basically unless you have per channel pause on your Ethernet fabric then performance will at some point all go to shit. So what happens is your NSD makes a whole bunch of requests to read blocks off the storage array. Requests are small, response is not. The response can overwhelms the Ethernet channel at which point performance falls through the floor. Now you might be lucky not to see this, especially if you have say have 10Gbps links from the storage and 40Gbps links to the NSD servers, but you are taking a gamble. Also the more storage arrays you have the more likely you are to see the problem. To fix this you have two options. The first is datacenter Ethernet with per channel pause. This option is expensive, probably in the same ball park as fibre channel. At least it was last time I looked, though this was some time ago now. The second option is dedicated links between the storage array and the NSD server. That is the cable goes directly between the storage array and the NSD server with no switches involved. This option is a maintenance nightmare. At he site where I did this, we had to go option two because I need to make it work, We ended up ripping it all out are replacing with FC. Personally I would see what price you can get DSS storage for, or use SAS arrays. Note iSCSI can in theory work, it's just the issue with GPFS scattering stuff to the winds over multiple storage arrays so your ethernet channel gets swamped and standard ethernet pauses all the upstream traffic. The vast majority of iSCSI use cases don't see this effect. There is a reason that to run FC over ethernet they had to turn ethernet lossless. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From janfrode at tanso.net Mon Dec 17 07:50:01 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Mon, 17 Dec 2018 08:50:01 +0100 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: I?d be curious to hear if all these arguments against iSCSI shouldn?t also apply to NSD protocol over TCP/IP? -jf man. 17. des. 2018 kl. 
01:22 skrev Jonathan Buzzard < jonathan.buzzard at strath.ac.uk>: > On 13/12/2018 20:54, Buterbaugh, Kevin L wrote: > > [SNIP] > > > > > Two things that I am already aware of are: 1) use jumbo frames, and 2) > > run iSCSI over it?s own private network. Other things I should be aware > > of?!? > > > > Yes, don't do it. Really do not do it unless you have datacenter > Ethernet switches and adapters. Those are the ones required for FCoE. > Basically unless you have per channel pause on your Ethernet fabric then > performance will at some point all go to shit. > > So what happens is your NSD makes a whole bunch of requests to read > blocks off the storage array. Requests are small, response is not. The > response can overwhelms the Ethernet channel at which point performance > falls through the floor. Now you might be lucky not to see this, > especially if you have say have 10Gbps links from the storage and 40Gbps > links to the NSD servers, but you are taking a gamble. Also the more > storage arrays you have the more likely you are to see the problem. > > To fix this you have two options. The first is datacenter Ethernet with > per channel pause. This option is expensive, probably in the same ball > park as fibre channel. At least it was last time I looked, though this > was some time ago now. > > The second option is dedicated links between the storage array and the > NSD server. That is the cable goes directly between the storage array > and the NSD server with no switches involved. This option is a > maintenance nightmare. > > At he site where I did this, we had to go option two because I need to > make it work, We ended up ripping it all out are replacing with FC. > > Personally I would see what price you can get DSS storage for, or use > SAS arrays. > > Note iSCSI can in theory work, it's just the issue with GPFS scattering > stuff to the winds over multiple storage arrays so your ethernet channel > gets swamped and standard ethernet pauses all the upstream traffic. The > vast majority of iSCSI use cases don't see this effect. > > There is a reason that to run FC over ethernet they had to turn ethernet > lossless. > > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Mon Dec 17 08:50:54 2018 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Mon, 17 Dec 2018 08:50:54 +0000 Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI? In-Reply-To: References: , <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu> Message-ID: An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Mon Dec 17 10:22:26 2018 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Mon, 17 Dec 2018 11:22:26 +0100 Subject: [gpfsug-discuss] Filesystem access issues via CES NFS In-Reply-To: References: <717f49aade0b439eb1b99fc620a21cac@maxiv.lu.se> <9456645b0a1f4b488b13874ea672b9b8@maxiv.lu.se> Message-ID: <57c7a736-0774-2701-febc-3cdc57a50d86@science-computing.de> On 14.12.2018 15:45, Jeno Cram wrote: > Are you using Extended attributes on the directories in question? No. What's the background of your question? 
Kind regards, Uli
-- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196

From S.J.Thompson at bham.ac.uk Mon Dec 17 15:07:39 2018
From: S.J.Thompson at bham.ac.uk (Simon Thompson)
Date: Mon, 17 Dec 2018 15:07:39 +0000
Subject: [gpfsug-discuss] GPFS API
Message-ID: 

Hi,

This is all probably perfectly clear to someone with the GPFS source code, but ... we're looking at writing some code using the API documented at:
https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.1/com.ibm.spectrum.scale.v5r01.doc/pdf/scale_cpr.pdf

Specifically the gpfs_getacl() function: in the docs you pass acl, and the notes say "Pointer to a buffer mapped by the structure gpfs_opaque_acl_t or gpfs_acl_t, depending on the value of flags. The first four bytes of the buffer must contain its total size.". Reading the docs for gpfs_opaque_acl_t, this is a struct of which the first element is an int. Is this the same 4 bytes referred to above as containing the size, and is this the size of the struct, or of the acl_var_data entry? It strikes me it should probably be the length of acl_var_data, but it is not entirely clear?

Thanks

Simon
-------------- next part -------------- An HTML attachment was scrubbed... URL: 

From makaplan at us.ibm.com Mon Dec 17 15:47:18 2018
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Mon, 17 Dec 2018 10:47:18 -0500
Subject: [gpfsug-discuss] GPFS API
In-Reply-To: 
References: 
Message-ID: 

Look in gpfs.h.... I think the comment for acl_buffer_len is clear enough!

/* Mapping of buffer for gpfs_getacl, gpfs_putacl. */
typedef struct gpfs_opaque_acl
{
  int acl_buffer_len;           /* INPUT:  Total size of buffer (including this field).
                                   OUTPUT: Actual size of the ACL information. */
  unsigned short acl_version;   /* INPUT:  Set to zero.
                                   OUTPUT: Current version of the returned ACL. */
  unsigned char acl_type;       /* INPUT:  Type of ACL: access (1) or default (2). */
  char acl_var_data[1];         /* OUTPUT: Remainder of the ACL information. */
} gpfs_opaque_acl_t;

From: Simon Thompson
To: "gpfsug-discuss at spectrumscale.org"
Date: 12/17/2018 10:13 AM
Subject: [gpfsug-discuss] GPFS API
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi,

This is all probably perfectly clear to someone with the GPFS source code, but ... we're looking at writing some code using the API documented at:
https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.1/com.ibm.spectrum.scale.v5r01.doc/pdf/scale_cpr.pdf

Specifically the gpfs_getacl() function: in the docs you pass acl, and the notes say "Pointer to a buffer mapped by the structure gpfs_opaque_acl_t or gpfs_acl_t, depending on the value of flags. The first four bytes of the buffer must contain its total size.". Reading the docs for gpfs_opaque_acl_t, this is a struct of which the first element is an int. Is this the same 4 bytes referred to above as containing the size, and is this the size of the struct, or of the acl_var_data entry? It strikes me it should probably be the length of acl_var_data, but it is not entirely clear?
Thanks

Simon
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part -------------- An HTML attachment was scrubbed... URL: 

From jonathan.buzzard at strath.ac.uk Mon Dec 17 16:22:35 2018
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Mon, 17 Dec 2018 16:22:35 +0000
Subject: [gpfsug-discuss] Anybody running GPFS over iSCSI?
In-Reply-To: 
References: <4C076792-7B70-425C-9AF9-A139FA577CBA@vanderbilt.edu>
Message-ID: <7d438bada357989f56eeb6f347ba2ddb5661679a.camel@strath.ac.uk>

On Mon, 2018-12-17 at 08:50 +0100, Jan-Frode Myklebust wrote:
> I'd be curious to hear if all these arguments against iSCSI shouldn't
> also apply to NSD protocol over TCP/IP?
>

They don't, is the simple answer, either theoretically or practically. It won't necessarily be a problem for iSCSI either. It is the potential for wild oversubscription of the links, combined with the total naivety of the block-based iSCSI protocol, that is the issue. I suspect that with NSD traffic it is self-limiting.

That said, looking about, higher-end switches seem to do DCE now, though FCoE seems to have gone nowhere.

Anyway, as I said, if you want lower cost use SAS. Even if you don't want to do ESS/SSS you can architect something very similar using SAS-based arrays from your favourite vendor, and just skip the native RAID bit.

JAB.

-- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From jonathan.buzzard at strath.ac.uk Mon Dec 17 16:46:32 2018
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Mon, 17 Dec 2018 16:46:32 +0000
Subject: [gpfsug-discuss] GPFS API
In-Reply-To: 
References: 
Message-ID: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk>

On Mon, 2018-12-17 at 10:47 -0500, Marc A Kaplan wrote:
> Look in gpfs.h.... I think the comment for acl_buffer_len is clear
> enough!
>

I guess everyone does not read header files by default looking for comments on the structure ;-)

One thing to watch out for is to check the return from gpfs_getacl, and if you get an ENOSPC error then your buffer is not big enough to hold the ACL and the first four bytes are set to the size you need. So you need to do something like the following to safely get the ACL.

/* first attempt with a 1KB buffer */
acl = malloc(1024);
acl->acl_len = 1024;
acl->acl_level = 0;
acl->acl_version = 0;
acl->acl_type = 0;
acl->acl_nace = 0;
err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl);
if ((err == -1) && (errno == ENOSPC)) {
    /* buffer too small: acl_len now holds the size GPFS needs */
    int acl_size = acl->acl_len;
    free(acl);
    acl = malloc(acl_size);
    acl->acl_len = acl_size;
    acl->acl_level = 0;
    acl->acl_version = 0;
    acl->acl_type = 0;
    acl->acl_nace = 0;
    err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl);
}

Noting that unless you have some monster ACL, 1KB is going to be more than enough. It is less than 16 bytes per ACL entry.

What is not clear is the following in the gpfs_acl struct.

v4Level1_t v4Level1; /* when GPFS_ACL_LEVEL_V4FLAGS */

What's that about, because there is zero documentation on it.

JAB.

-- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow.
G4 0NG From S.J.Thompson at bham.ac.uk Mon Dec 17 17:35:31 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Mon, 17 Dec 2018 17:35:31 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> References: , <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> Message-ID: Indeed. Actually this was exactly what we were trying to work out. We'd set the buf size to 0 hoping it would tell us how much we need, but we kept getting EINVAL back - which the docs states is invalid path, but actually it can be invalid bufsize as well apparently (the header file comments are different again to the docs). Anyway, we're looking at patching mpifileutils to support GPFS ACLs to help with migration of between old and new file-systems. I was actually using the opaque call on the assumption that it would be a binary blob of data I could poke to the new file. (I was too scared to use the attr functions as that copies DMAPI info as well and I'm not sure I want to "copy" my ILM files without recall!). Its not clear what DEFAULT and ACCESS ACLs are. I'm guessing something to do with inheritance maybe? Thanks Marc, Jonathan for the pointers! Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Jonathan Buzzard [jonathan.buzzard at strath.ac.uk] Sent: 17 December 2018 16:46 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS API On Mon, 2018-12-17 at 10:47 -0500, Marc A Kaplan wrote: > Look in gps.h.... I think the comment for acl_buffer_len is clear > enough! > I guess everyone does not read header files by default looking for comments on the structure ;-) One thing to watch out for is to check the return from gpfs_getacl, and if you get an ENOSPC error then your buffer is not big enough to hold the ACL and the first four bytes are set to the size you need. SO you need to do something like the following to safely get the ACL. acl = malloc(1024); acl->acl_len = 1024; acl->acl_level = 0; acl->acl_version = 0; acl->acl_type = 0; acl->acl_nace = 0; err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl); if ((err==-1) && (errno=ENOSPC)) { int acl_size = acl->acl_len; free(acl); acl = malloc(acl_size); acl->acl_len = acl_size; acl->acl_level = 0; acl->acl_version = 0; acl->acl_type = 0; acl->acl_nace = 0; err = gpfs_getacl(fpath, GPFS_GETACL_STRUCT, acl); } Noting that unless you have some monster ACL 1KB is going to be more than enough. It is less than 16 bytes per ACL. What is not clear is the following in the gpfs_acl struct. v4Level1_t v4Level1; /* when GPFS_ACL_LEVEL_V4FLAGS */ What's that about, because there is zero documentation on it. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathan.buzzard at strath.ac.uk Mon Dec 17 22:44:14 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 22:44:14 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: References: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> Message-ID: <69e9c4fa-4bd6-de23-add9-a0572d6d093a@strath.ac.uk> On 17/12/2018 17:35, Simon Thompson wrote: > Indeed. > > Actually this was exactly what we were trying to work out. 
We'd set > the buf size to 0 hoping it would tell us how much we need, but we > kept getting EINVAL back - which the docs states is invalid path, but > actually it can be invalid bufsize as well apparently (the header > file comments are different again to the docs). > Well duh, if you pass in a zero sized buffer, then there is no space to pass the size back because it's in the first four bytes of the returned buffer. Further it needs to be big enough to hold the main fields otherwise did you request POSIX or NFSv4 ACL's etc. > Anyway, we're looking at patching mpifileutils to support GPFS ACLs > to help wi th migration of between old and new file-systems. > > I was actually using the opaque call on the assumption that it would > be a binary blob of data I could poke to the new file. (I was too > scared to use the attr functions as that copies DMAPI info as well > and I'm not sure I want to "copy" my ILM files without recall!). > > Its not clear what DEFAULT and ACCESS ACLs are. I'm guessing > something to do with inheritance maybe? > A default ACL is the default ACL that would be given to the file. Access ACL's are all the other ones. That is pretty basic ACL stuff. I would really like some info on the v4Level1_t stuff, as I ma reluctant to release my mmsetfacl code until I do. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From jonathan.buzzard at strath.ac.uk Mon Dec 17 23:04:15 2018 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 17 Dec 2018 23:04:15 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: References: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> Message-ID: <6e34f47e-4fc0-9726-0c1a-38720f5c26db@strath.ac.uk> On 17/12/2018 17:35, Simon Thompson wrote: > Indeed. > > Actually this was exactly what we were trying to work out. We'd set > the buf size to 0 hoping it would tell us how much we need, but we > kept getting EINVAL back - which the docs states is invalid path, but > actually it can be invalid bufsize as well apparently (the header > file comments are different again to the docs). Forgot to say performance wise calling with effectively a zero buffer size and then calling again with precisely the right buffer size is not sensible plan. Basically you end up having to do two API calls instead of one to save less than 1KB of RAM in 99.999% of cases. Let's face it any machine running GPFS is going to have GB of RAM. That said I am not sure even allocating 1KB of RAM is sensible either. One suspects allocating less than a whole page might have performance implications. What I do know from testing is that, two API calls when iterating over millions of files has a significant impact on run time over just allocating a bunch of memory up front on only making the one call. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From S.J.Thompson at bham.ac.uk Tue Dec 18 07:05:18 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Tue, 18 Dec 2018 07:05:18 +0000 Subject: [gpfsug-discuss] GPFS API In-Reply-To: <69e9c4fa-4bd6-de23-add9-a0572d6d093a@strath.ac.uk> References: <88ce686c0257b4acd366c3603783f71ebd87bbc0.camel@strath.ac.uk> , <69e9c4fa-4bd6-de23-add9-a0572d6d093a@strath.ac.uk> Message-ID: No, it comes back as the first four bytes of the structure in the size field. 
So if we set it the data field to zero, then it still has space in the size filed to return the required size. That was sort of the point.... The size value needs to be the size of the struct rather than the data field within it. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Jonathan Buzzard [jonathan.buzzard at strath.ac.uk] Sent: 17 December 2018 22:44 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GPFS API On 17/12/2018 17:35, Simon Thompson wrote: > Indeed. > > Actually this was exactly what we were trying to work out. We'd set > the buf size to 0 hoping it would tell us how much we need, but we > kept getting EINVAL back - which the docs states is invalid path, but > actually it can be invalid bufsize as well apparently (the header > file comments are different again to the docs). > Well duh, if you pass in a zero sized buffer, then there is no space to pass the size back because it's in the first four bytes of the returned buffer. Further it needs to be big enough to hold the main fields otherwise did you request POSIX or NFSv4 ACL's etc. > Anyway, we're looking at patching mpifileutils to support GPFS ACLs > to help wi th migration of between old and new file-systems. > > I was actually using the opaque call on the assumption that it would > be a binary blob of data I could poke to the new file. (I was too > scared to use the attr functions as that copies DMAPI info as well > and I'm not sure I want to "copy" my ILM files without recall!). > > Its not clear what DEFAULT and ACCESS ACLs are. I'm guessing > something to do with inheritance maybe? > A default ACL is the default ACL that would be given to the file. Access ACL's are all the other ones. That is pretty basic ACL stuff. I would really like some info on the v4Level1_t stuff, as I ma reluctant to release my mmsetfacl code until I do. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Kevin.Buterbaugh at Vanderbilt.Edu Mon Dec 17 22:01:41 2018 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Mon, 17 Dec 2018 22:01:41 +0000 Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy Message-ID: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> Hi All, As those of you who suffered thru my talk at SC18 already know, we?re really short on space on one of our GPFS filesystems as the output of mmdf piped to grep pool shows: Disks in storage pool: system (Maximum disk size allowed is 24 TB) (pool total) 4.318T 1.078T ( 25%) 79.47G ( 2%) Disks in storage pool: data (Maximum disk size allowed is 262 TB) (pool total) 494.7T 38.15T ( 8%) 4.136T ( 1%) Disks in storage pool: capacity (Maximum disk size allowed is 519 TB) (pool total) 640.2T 14.56T ( 2%) 716.4G ( 0%) The system pool is metadata only. The data pool is the default pool. The capacity pool is where files with an atime (yes, atime) > 90 days get migrated. The capacity pool is comprised of NSDs that are 8+2P RAID 6 LUNs of 8 TB drives, so roughly 58.2 TB usable space per NSD. 
We have the new storage we purchased, but that?s still being tested and held in reserve for after the first of the year when we create a new GPFS 5 formatted filesystem and start migrating everything to the new filesystem. In the meantime, we have also purchased a 60-bay JBOD and 30 x 12 TB drives and will be hooking it up to one of our existing storage arrays on Wednesday. My plan is to create another 3 8+2P RAID 6 LUNs and present those to GPFS as NSDs. They will be about 88 TB usable space each (because ? beginning rant ? a 12 TB drive is < 11 TB is size ? and don?t get me started on so-called ?4K? TV?s ? end rant). A very wise man who used to work at IBM but now hangs out with people in red polos () once told me that it?s OK to mix NSDs of slightly different sizes in the same pool, but you don?t want to put NSDs of vastly different sizes in the same pool because the smaller ones will fill first and then the larger ones will have to take all the I/O. I consider 58 TB and 88 TB to be pretty significantly different and am therefore planning on creating yet another pool called ?oc? (over capacity if a user asks, old crap internally!) and migrating files with an atime greater than, say, 1 year to that pool. But since ALL of the files in the capacity pool haven?t even been looked at in at least 90 days already, does it really matter? I.e. should I just add the NSDs to the capacity pool and be done with it? If it?s a good idea to create another pool, then I have a question about mmapplypolicy and migrations. I believe I understand how things work, but after spending over an hour looking at the documentation I cannot find anything that explicitly confirms my understanding ? so if I have another pool called oc that?s ~264 TB in size and I write a policy file that looks like: define(access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) RULE 'ReallyOldStuff' MIGRATE FROM POOL 'capacity' TO POOL 'oc' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 365) AND (KB_ALLOCATED > 3584)) RULE 'OldStuff' MIGRATE FROM POOL 'data' TO POOL 'capacity' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 90) AND (KB_ALLOCATED > 3584)) Keeping in mind that my capacity pool is already 98% full, is mmapplypolicy smart enough to calculate how much space it?s going to free up in the capacity pool by the ?ReallyOldStuff? rule and therefore be able to potentially also move a ton of stuff from the data pool to the capacity pool via the 2nd rule with just one invocation of mmapplypolicy? That?s what I expect that it will do. I?m hoping I don?t have to run the mmapplypolicy twice ? the first to move stuff from capacity to oc and then a second time for it to realize, oh, I?ve got a much of space free in the capacity pool now. Thanks in advance... Kevin P.S. In case you?re scratching your head over the fact that we have files that people haven?t even looked at for months and months (more than a year in some cases) sitting out there ? we sell quota in 1 TB increments ? once they?ve bought the quota, it?s theirs. As long as they?re paying us the monthly fee if they want to keep files relating to research they did during the George Bush Presidency out there ? and I mean Bush 41, not Bush 43 ?.then that?s their choice. We do not purge files. ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
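One practical way to answer the "is mmapplypolicy smart enough?" question before touching any data is to let it do a trial run of the same policy file; the device name and policy path below are placeholders rather than the real filesystem:

# trial run: shows which files each rule would select and the predicted pool occupancy, without migrating anything
mmapplypolicy gpfsdev -P /tmp/migrate.pol -I test -L 3

# if the predicted occupancies look right, run it for real
mmapplypolicy gpfsdev -P /tmp/migrate.pol -I yes

The -I test summary reports the predicted occupancy of each pool after the chosen files move, which makes it visible whether a single pass plans around the space the first rule frees up in the capacity pool.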
URL: From janfrode at tanso.net Tue Dec 18 09:13:05 2018 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 18 Dec 2018 10:13:05 +0100 Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy In-Reply-To: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> References: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> Message-ID: I don't think it's going to figure this out automatically between the two rules.. I believe you will need to do something like the below (untested, and definitely not perfect!!) rebalancing: define( weight_junkedness, (CASE /* Create 3 classes of files */ /* ReallyOldStuff */ WHEN ((access_age > 365) AND (KB_ALLOCATED > 3584)) THEN 1000 /* OldStuff */ WHEN ((access_age > 90) AND (KB_ALLOCATED > 3584)) THEN 100 /* everything else */ ELSE 0 END) ) RULE 'defineTiers' GROUP POOL 'TIERS' IS 'data' LIMIT(98) THEN 'capacity' LIMIT(98) THEN 'oc' RULE 'Rebalance' MIGRATE FROM POOL 'TIERS' TO POOL 'TIERS' WEIGHT(weight_junkedness) Based on /usr/lpp/mmfs/samples/ilm/mmpolicy-fileheat.sample. -jf On Tue, Dec 18, 2018 at 9:10 AM Buterbaugh, Kevin L < Kevin.Buterbaugh at vanderbilt.edu> wrote: > Hi All, > > As those of you who suffered thru my talk at SC18 already know, we?re > really short on space on one of our GPFS filesystems as the output of mmdf > piped to grep pool shows: > > Disks in storage pool: system (Maximum disk size allowed is 24 TB) > (pool total) 4.318T 1.078T ( 25%) > 79.47G ( 2%) > Disks in storage pool: data (Maximum disk size allowed is 262 TB) > (pool total) 494.7T 38.15T ( 8%) > 4.136T ( 1%) > Disks in storage pool: capacity (Maximum disk size allowed is 519 TB) > (pool total) 640.2T 14.56T ( 2%) > 716.4G ( 0%) > > The system pool is metadata only. The data pool is the default pool. The > capacity pool is where files with an atime (yes, atime) > 90 days get > migrated. The capacity pool is comprised of NSDs that are 8+2P RAID 6 LUNs > of 8 TB drives, so roughly 58.2 TB usable space per NSD. > > We have the new storage we purchased, but that?s still being tested and > held in reserve for after the first of the year when we create a new GPFS 5 > formatted filesystem and start migrating everything to the new filesystem. > > In the meantime, we have also purchased a 60-bay JBOD and 30 x 12 TB > drives and will be hooking it up to one of our existing storage arrays on > Wednesday. My plan is to create another 3 8+2P RAID 6 LUNs and present > those to GPFS as NSDs. They will be about 88 TB usable space each (because > ? beginning rant ? a 12 TB drive is < 11 TB is size ? and don?t get me > started on so-called ?4K? TV?s ? end rant). > > A very wise man who used to work at IBM but now hangs out with people in > red polos () once told me that it?s OK to mix NSDs of slightly > different sizes in the same pool, but you don?t want to put NSDs of vastly > different sizes in the same pool because the smaller ones will fill first > and then the larger ones will have to take all the I/O. I consider 58 TB > and 88 TB to be pretty significantly different and am therefore planning on > creating yet another pool called ?oc? (over capacity if a user asks, old > crap internally!) and migrating files with an atime greater than, say, 1 > year to that pool. But since ALL of the files in the capacity pool haven?t > even been looked at in at least 90 days already, does it really matter? > I.e. should I just add the NSDs to the capacity pool and be done with it? 
> > If it?s a good idea to create another pool, then I have a question about > mmapplypolicy and migrations. I believe I understand how things work, but > after spending over an hour looking at the documentation I cannot find > anything that explicitly confirms my understanding ? so if I have another > pool called oc that?s ~264 TB in size and I write a policy file that looks > like: > > define(access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) > > RULE 'ReallyOldStuff' > MIGRATE FROM POOL 'capacity' > TO POOL 'oc' > LIMIT(98) > SIZE(KB_ALLOCATED/NLINK) > WHERE ((access_age > 365) AND (KB_ALLOCATED > 3584)) > > RULE 'OldStuff' > MIGRATE FROM POOL 'data' > TO POOL 'capacity' > LIMIT(98) > SIZE(KB_ALLOCATED/NLINK) > WHERE ((access_age > 90) AND (KB_ALLOCATED > 3584)) > > Keeping in mind that my capacity pool is already 98% full, is > mmapplypolicy smart enough to calculate how much space it?s going to free > up in the capacity pool by the ?ReallyOldStuff? rule and therefore be able > to potentially also move a ton of stuff from the data pool to the capacity > pool via the 2nd rule with just one invocation of mmapplypolicy? That?s > what I expect that it will do. I?m hoping I don?t have to run the > mmapplypolicy twice ? the first to move stuff from capacity to oc and then > a second time for it to realize, oh, I?ve got a much of space free in the > capacity pool now. > > Thanks in advance... > > Kevin > > P.S. In case you?re scratching your head over the fact that we have files > that people haven?t even looked at for months and months (more than a year > in some cases) sitting out there ? we sell quota in 1 TB increments ? once > they?ve bought the quota, it?s theirs. As long as they?re paying us the > monthly fee if they want to keep files relating to research they did during > the George Bush Presidency out there ? and I mean Bush 41, not Bush 43 > ?.then that?s their choice. We do not purge files. > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and > Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Wed Dec 19 08:49:42 2018 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Wed, 19 Dec 2018 08:49:42 +0000 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Message-ID: <83A6EEB0EC738F459A39439733AE8045267C22D4@MBX114.d.ethz.ch> Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? 
Thanks, Alvise -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Wed Dec 19 09:29:08 2018 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Wed, 19 Dec 2018 09:29:08 +0000 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Message-ID: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk> Hmm interesting ? # mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise -------------- next part -------------- An HTML attachment was scrubbed... URL: From TOMP at il.ibm.com Wed Dec 19 09:47:03 2018 From: TOMP at il.ibm.com (Tomer Perry) Date: Wed, 19 Dec 2018 11:47:03 +0200 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk> References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk> Message-ID: Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hmm interesting ? 
# mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Wed Dec 19 09:51:58 2018 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Wed, 19 Dec 2018 09:51:58 +0000 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, Message-ID: <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch> Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? thanks, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hmm interesting ? 
# mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From TOMP at il.ibm.com Wed Dec 19 10:05:04 2018 From: TOMP at il.ibm.com (Tomer Perry) Date: Wed, 19 Dec 2018 12:05:04 +0200 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch> References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch> Message-ID: Changed means it provides some functions/information in a different way. So, I guess the question is what information do you need? ( and "officially" why isn't mmdiag good enough - what is missing. As you probably know, mmfsadm might cause crashes and deadlock from time to time, this is why we're trying to provide "safe ways" to get the required information). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 11:53 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? thanks, Alvise From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. 
And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hmm interesting ? # mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Wed Dec 19 10:22:32 2018 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Wed, 19 Dec 2018 10:22:32 +0000 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch>, Message-ID: <83A6EEB0EC738F459A39439733AE8045267C2354@MBX114.d.ethz.ch> I'd like just one line that says "RDMA ON" or "RMDA OFF" (as was reported more or less by mmfsadm). I can get info about RMDA using mmdiag, but is much more output to parse (e.g. by a nagios script or just a human eye). Ok, never mind, I understand your explanation and it is not definitely a big issue... it was, above all, a curiosity to understand if the command was modified to get the same behavior as before, but in a different way. 
Cheers, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 11:05 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Changed means it provides some functions/information in a different way. So, I guess the question is what information do you need? ( and "officially" why isn't mmdiag good enough - what is missing. As you probably know, mmfsadm might cause crashes and deadlock from time to time, this is why we're trying to provide "safe ways" to get the required information). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 11:53 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? thanks, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hmm interesting ? # mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". 
Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From TOMP at il.ibm.com Wed Dec 19 10:35:48 2018 From: TOMP at il.ibm.com (Tomer Perry) Date: Wed, 19 Dec 2018 12:35:48 +0200 Subject: [gpfsug-discuss] verbs status not working in 5.0.2 In-Reply-To: <83A6EEB0EC738F459A39439733AE8045267C2354@MBX114.d.ethz.ch> References: <7E7FA160-345E-4580-8DFC-6E13AAACE9BD@bham.ac.uk>, <83A6EEB0EC738F459A39439733AE8045267C2330@MBX114.d.ethz.ch>, <83A6EEB0EC738F459A39439733AE8045267C2354@MBX114.d.ethz.ch> Message-ID: Hi, So, with all the usual disclaimers... mmfsadm saferdump verbs is not enough? or even mmfsadm saferdump verbs | grep VerbsRdmaStarted Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 12:22 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org I'd like just one line that says "RDMA ON" or "RMDA OFF" (as was reported more or less by mmfsadm). I can get info about RMDA using mmdiag, but is much more output to parse (e.g. by a nagios script or just a human eye). Ok, never mind, I understand your explanation and it is not definitely a big issue... it was, above all, a curiosity to understand if the command was modified to get the same behavior as before, but in a different way. Cheers, Alvise From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 11:05 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Changed means it provides some functions/information in a different way. So, I guess the question is what information do you need? ( and "officially" why isn't mmdiag good enough - what is missing. As you probably know, mmfsadm might cause crashes and deadlock from time to time, this is why we're trying to provide "safe ways" to get the required information). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: "Dorigo Alvise (PSI)" To: gpfsug main discussion list Date: 19/12/2018 11:53 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Tomer, "changed" makes me suppose that it is still possible, but in a different way... am I correct ? if yes, what it is ? 
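Following Tomer's saferdump suggestion above, a Nagios-style probe could be a few lines of shell. This is only a sketch: it assumes that "mmfsadm saferdump verbs" on the release in use prints a VerbsRdmaStarted flag whose value contains "yes" when RDMA is up, which is worth confirming on one node first, since mmfsadm output is undocumented and can change between releases.

#!/bin/bash
# check_gpfs_rdma.sh (hypothetical name) - report verbs RDMA state for Nagios.
# Assumption: the saferdump output contains a line with VerbsRdmaStarted.
line=$(/usr/lpp/mmfs/bin/mmfsadm saferdump verbs 2>/dev/null | grep -i 'VerbsRdmaStarted')
case "$line" in
    *[Yy]es*) echo "OK - verbs RDMA started ($line)"; exit 0 ;;
    "")       echo "UNKNOWN - no VerbsRdmaStarted line in saferdump output"; exit 3 ;;
    *)        echo "CRITICAL - verbs RDMA not started ($line)"; exit 2 ;;
esac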
thanks, Alvise From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, December 19, 2018 10:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, Yes, as part of the RDMA enhancements in 5.0.X much of the hidden test commands were changed. And since mmfsadm is not externalized none of them is documented ( and the help messages are not consistent as well). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 19/12/2018 11:29 Subject: Re: [gpfsug-discuss] verbs status not working in 5.0.2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Hmm interesting ? # mmfsadm test verbs usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut } # mmfsadm test verbs status usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} mmfsadm test verbs config still works though (which includes RdmaStarted flag) Simon From: on behalf of "alvise.dorigo at psi.ch" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 19 December 2018 at 08:51 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] verbs status not working in 5.0.2 Hi, in GPFS 5.0.2 I cannot run anymore "mmfsadm test verbs status": [root at sf-dss-1 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "4.2.3.7 ". Built on Feb 15 2018 at 11:38:38 Running 62 days 11 hours 24 minutes 35 secs, pid 7510 VERBS RDMA status: started [root at sf-export-2 ~]# mmdiag --version ; mmfsadm test verbs status === mmdiag: version === Current GPFS build: "5.0.2.1 ". Built on Oct 24 2018 at 21:23:46 Running 10 minutes 24 secs, pid 3570 usage: {udapl | verbs} { status | skipio | noskipio | dump | maxRpcsOut | maxReplysOut | maxRdmasOut | config | conn | conndetails | stats | resetstats | ibcntreset | ibcntr | ia | pz | psp | evd | lmr | break | qps | inject op cnt err | breakqperr | qperridx idx | breakidx idx} Is it a known problem or am I doing something wrong ? Thanks, Alvise_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Dec 19 19:45:24 2018 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 19 Dec 2018 14:45:24 -0500 Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy In-Reply-To: References: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu> Message-ID: Regarding mixing different sized NSDs in the same pool... 
GPFS has gotten somewhat smarter about striping over the years and also offers some options about how blocks are allocated over NSDs.... And then there's mmrestripe and its several options/flavors... You probably do want to segregate NSDs into different pools if the performance varies significantly among them. So old advice may not apply 100%.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bbanister at jumptrading.com Wed Dec 19 19:13:16 2018
From: bbanister at jumptrading.com (Bryan Banister)
Date: Wed, 19 Dec 2018 19:13:16 +0000
Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy
In-Reply-To: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu>
References: <86D68EB8-34F5-4182-A55D-7DE5B6492D3B@vanderbilt.edu>
Message-ID:

Hadn't seen a response, but here's one thing that might make your decision easier on this question:

"But since ALL of the files in the capacity pool haven't even been looked at in at least 90 days already, does it really matter? I.e. should I just add the NSDs to the capacity pool and be done with it?"

Does the performance matter for accessing files in this capacity pool? If not, then just add it in. If it does, then you'll need to concern yourself with the performance you'll get from the NSDs that still have free space to store new data once the smaller NSDs become full. If that's enough, then just add it in. Old data will still be spread across the current storage in the capacity pool, so you'll get current read performance rates for that data.

By creating a new pool, oc, and then migrating data that hasn't been accessed in over 1 year to it from the capacity pool, you're freeing up new space to store new data on the capacity pool. This seems to really only be a benefit if the performance of the capacity pool is a lot greater than the oc pool and your users need that performance to satisfy their application workloads. Of course moving data around on a regular basis also has an impact on overall performance during these operations too, but maybe there are times when the system is idle and these operations will not really cause any performance heartburn.

I think Marc will have to answer your other question... ;o)

Hope that helps!
-Bryan

From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Buterbaugh, Kevin L
Sent: Monday, December 17, 2018 4:02 PM
To: gpfsug main discussion list
Subject: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy

[EXTERNAL EMAIL]

Hi All,

As those of you who suffered thru my talk at SC18 already know, we're really short on space on one of our GPFS filesystems as the output of mmdf piped to grep pool shows:

Disks in storage pool: system (Maximum disk size allowed is 24 TB)
(pool total) 4.318T 1.078T ( 25%) 79.47G ( 2%)
Disks in storage pool: data (Maximum disk size allowed is 262 TB)
(pool total) 494.7T 38.15T ( 8%) 4.136T ( 1%)
Disks in storage pool: capacity (Maximum disk size allowed is 519 TB)
(pool total) 640.2T 14.56T ( 2%) 716.4G ( 0%)

The system pool is metadata only. The data pool is the default pool. The capacity pool is where files with an atime (yes, atime) > 90 days get migrated. The capacity pool is comprised of NSDs that are 8+2P RAID 6 LUNs of 8 TB drives, so roughly 58.2 TB usable space per NSD.

We have the new storage we purchased, but that's still being tested and held in reserve for after the first of the year when we create a new GPFS 5 formatted filesystem and start migrating everything to the new filesystem.
In the meantime, we have also purchased a 60-bay JBOD and 30 x 12 TB drives and will be hooking it up to one of our existing storage arrays on Wednesday. My plan is to create another 3 8+2P RAID 6 LUNs and present those to GPFS as NSDs. They will be about 88 TB usable space each (because ? beginning rant ? a 12 TB drive is < 11 TB is size ? and don?t get me started on so-called ?4K? TV?s ? end rant). A very wise man who used to work at IBM but now hangs out with people in red polos () once told me that it?s OK to mix NSDs of slightly different sizes in the same pool, but you don?t want to put NSDs of vastly different sizes in the same pool because the smaller ones will fill first and then the larger ones will have to take all the I/O. I consider 58 TB and 88 TB to be pretty significantly different and am therefore planning on creating yet another pool called ?oc? (over capacity if a user asks, old crap internally!) and migrating files with an atime greater than, say, 1 year to that pool. But since ALL of the files in the capacity pool haven?t even been looked at in at least 90 days already, does it really matter? I.e. should I just add the NSDs to the capacity pool and be done with it? If it?s a good idea to create another pool, then I have a question about mmapplypolicy and migrations. I believe I understand how things work, but after spending over an hour looking at the documentation I cannot find anything that explicitly confirms my understanding ? so if I have another pool called oc that?s ~264 TB in size and I write a policy file that looks like: define(access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) RULE 'ReallyOldStuff' MIGRATE FROM POOL 'capacity' TO POOL 'oc' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 365) AND (KB_ALLOCATED > 3584)) RULE 'OldStuff' MIGRATE FROM POOL 'data' TO POOL 'capacity' LIMIT(98) SIZE(KB_ALLOCATED/NLINK) WHERE ((access_age > 90) AND (KB_ALLOCATED > 3584)) Keeping in mind that my capacity pool is already 98% full, is mmapplypolicy smart enough to calculate how much space it?s going to free up in the capacity pool by the ?ReallyOldStuff? rule and therefore be able to potentially also move a ton of stuff from the data pool to the capacity pool via the 2nd rule with just one invocation of mmapplypolicy? That?s what I expect that it will do. I?m hoping I don?t have to run the mmapplypolicy twice ? the first to move stuff from capacity to oc and then a second time for it to realize, oh, I?ve got a much of space free in the capacity pool now. Thanks in advance... Kevin P.S. In case you?re scratching your head over the fact that we have files that people haven?t even looked at for months and months (more than a year in some cases) sitting out there ? we sell quota in 1 TB increments ? once they?ve bought the quota, it?s theirs. As long as they?re paying us the monthly fee if they want to keep files relating to research they did during the George Bush Presidency out there ? and I mean Bush 41, not Bush 43 ?.then that?s their choice. We do not purge files. ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential, or privileged information and/or personal data. 
If you are not the intended recipient, you are hereby notified that any review, dissemination, or copying of this email is strictly prohibited, and requested to notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request, or solicitation of any kind to buy, sell, subscribe, redeem, or perform any type of transaction of a financial product. Personal data, as defined by applicable data privacy laws, contained in this email may be processed by the Company, and any of its affiliated or related companies, for potential ongoing compliance and/or business-related purposes. You may have rights regarding your personal data; for information on exercising these rights or the Company's treatment of personal data, please email datarequests at jumptrading.com.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kraemerf at de.ibm.com Sun Dec 23 13:29:42 2018
From: kraemerf at de.ibm.com (Frank Kraemer)
Date: Sun, 23 Dec 2018 14:29:42 +0100
Subject: [gpfsug-discuss] [Announcement] IBM Storage Enabler for Containers v2.0 has been released for general availability
Message-ID:

We're happy to announce the release of IBM Storage Enabler for Containers v2.0 for general availability on Fix Central.

IBM Storage Enabler for Containers v2.0 extends IBM support for Kubernetes and IBM Cloud Private orchestrated container environments by supporting IBM Spectrum Scale (formerly IBM GPFS).

IBM Storage Enabler for Containers v2.0 introduces the following new functionalities:
- IBM Spectrum Scale v5.0+ support
- Support for the orchestration platforms IBM Cloud Private (ICP) v3.1.1 and Kubernetes v1.12
- Support for mixed deployment of Fibre Channel and iSCSI in the same cluster
- Kubernetes Service Accounts for a more effective pod authorization procedure

Required components:
- IBM Spectrum Connect v3.6
- Installer for IBM Storage Enabler for Containers

Fix Central publication link:
http://www.ibm.com/support/fixcentral/swg/quickorder?parent=Software%20defined%20storage&product=ibm/StorageSoftware/IBM+Storage+Enabler+for+Containers&release=All&platform=All&function=all&source=fc

Cheers, Tal
Mailto: talsha at il.ibm.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From roblogie at au1.ibm.com Thu Dec 27 22:38:29 2018
From: roblogie at au1.ibm.com (Rob Logie)
Date: Thu, 27 Dec 2018 22:38:29 +0000
Subject: [gpfsug-discuss] Introduction
Message-ID:

Hi
I am a new member of the list. I am an IBMer (GBS) based in Ballarat, Australia. Part of my role is supporting a small Spectrum Scale (GPFS) cluster on behalf of an IBM customer in Australia. The cluster is hosted on Red Hat AWS EC2 instances and is accessed via CES SMB shares.

Cheers (and happy holidays!)

Regards,
Rob Logie
IT Specialist
IBM A/NZ GBS

-------------- next part --------------
An HTML attachment was scrubbed...
URL: