From p.childs at qmul.ac.uk Tue May 10 20:17:41 2022
From: p.childs at qmul.ac.uk (Peter Childs)
Date: Tue, 10 May 2022 19:17:41 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
Message-ID:

We have a nearly new ESS5000 (we're still attempting to commission it...)

Anyway, does anyone know a quick way to check if all the paths to the trays are working?

We've got 5 trays, and one of the servers can only see one of the 5 trays via one path. It's not causing an alert until we run a full "gnrhealthcheck", which we only found when attempting to upgrade it. Looking closer in the GUI, all the disks in that tray are recording 3 paths, not 4, but it's not causing any health alerts.

Don't worry, we've got a ticket open and they are attempting to replace the cable. I'm just attempting to find a nice quick way to detect the problem and know when it's resolved, rather than waiting 5 minutes while the gnrhealthcheck runs.

Peter Childs

We are recruiting, please see https://www.qmul.ac.uk/jobs/vacancies/items/6949.html for more details

From kevindjo at us.ibm.com Tue May 10 20:28:08 2022
From: kevindjo at us.ibm.com (Kevin D Johnson)
Date: Tue, 10 May 2022 19:28:08 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

https://www.ibm.com/docs/en/ess/6.0.2?topic=commands-essfindmissingdisks-command

Kevin D. Johnson
Senior Managing Consultant
MBA, MAcc, MS
Global Technology and Development
kevindjo at us.ibm.com
720-349-6199 office
https://ibm.webex.com/meet/kevindjo

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

From ewahl at osc.edu Tue May 10 20:57:13 2022
From: ewahl at osc.edu (Wahl, Edward)
Date: Tue, 10 May 2022 19:57:13 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

There is much you can do with 'lsscsi' and "mmlspdisk" and whatnot. Or something like:

    mmgetpdisktopology > /tmp/mmgetpdisktopology.out; topsummary /tmp/mmgetpdisktopology.out > topsummary

mmlspdisk should show all 4 paths, with 2 being notEnabled. Look for missing paths? This may be the fastest.

    device = "//ibmgssio5-hs.ten/dev/sdo(notEnabled),//ibmgssio5-hs.ten/dev/sdfe(notEnabled),//ibmgssio6-hs.ten/dev/sdo,//ibmgssio6-hs.ten/dev/sdfe"

If you are using IBM ESS hardware you can make some guesses and then use the Seagate tools to look and see if each SG device has drives.

    for i in `lsscsi -g -v | grep encl | grep 5147 | awk '{print $7 }'`; do echo "*** $i ***"; /usr/lpp/mmfs/updates/latest/firmware/enclosure/wbcli $i ddump_hid | egrep "FLASH" -B8; echo; done

We've got 5147-106 drive enclosures so I can check each disk...

    i=0; while [ $i -lt 106 ]; do echo "+++ $i +++"; /usr/lpp/mmfs/updates/latest/firmware/enclosure/wbcli /dev/sg131 "getdrivestatus $i" | grep "LED : ON"; sleep 1; let i=i+1; done

Lots of options. Now that said, many of these are slower than the gnrhealthcheck for me. Your mileage may vary.

Ed Wahl
OSC

From ewahl at osc.edu Tue May 10 21:01:08 2022
From: ewahl at osc.edu (Wahl, Edward)
Date: Tue, 10 May 2022 20:01:08 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

Knew I had a better one around here somewhere:

    mmlsrecoverygroup -L --pdisk | grep "1, 2"

Ed Wahl
OSC

From ewahl at osc.edu Tue May 10 21:10:36 2022
From: ewahl at osc.edu (Wahl, Edward)
Date: Tue, 10 May 2022 20:10:36 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

So just look at your good paths and build a grep -v that matches what you need. This is what we see, so grep -v "2, 4":

                       n. active,   declustered               state,
    pdisk              total paths  array        free space   remarks
    -----------------  -----------  -----------  ----------   -------
    e1s001             2, 4         DA1          288 GiB      ok
    e1s002             2, 4         DA1          304 GiB      ok
    e1s003             2, 4         DA1          296 GiB      ok
    e1s004             2, 4         DA1          288 GiB      ok
    e1s005             2, 4         DA1          288 GiB      ok
    e1s006             2, 4         DA1          288 GiB      ok

Ed Wahl
OSC

From jonathan.buzzard at strath.ac.uk Tue May 10 23:08:38 2022
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Tue, 10 May 2022 23:08:38 +0100
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

On 10/05/2022 20:57, Wahl, Edward wrote:

> There is much you can do with 'lsscsi' and "mmlspdisk" and whatnot.
> Or something like "mmgetpdisktopology > /tmp/mmgetpdisktopology.out;
> topsummary /tmp/mmgetpdisktopology.out > topsummary"
>
> mmlspdisk should show all 4 paths, with 2 being notEnabled. Look for
> missing paths? This may be the fastest.
> " device =
> "//ibmgssio5-hs.ten/dev/sdo(notEnabled),//ibmgssio5-hs.ten/dev/sdfe(notEnabled),//ibmgssio6-hs.ten/dev/sdo,//ibmgssio6-hs.ten/dev/sdfe"
> "

On a DSS-G you can hop onto one of the servers and do

    lsscsi | grep disk | grep -v RAID | awk 'END {print NR/2}'

The number should be the number of disks that you have, including the log tip drives. If it's not, then there is a path issue. I would imagine it's the same on the ESS.

If I thought about it for a bit you could ditch the two greps and do it all in Awk. If the Awk is too complicated just replace it with 'wc -l', but you will end up with a number that should be twice the number of drives.

JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow.
G4 0NG

From Alec.Effrat at wellsfargo.com Tue May 10 23:18:18 2022
From: Alec.Effrat at wellsfargo.com (Alec.Effrat at wellsfargo.com)
Date: Tue, 10 May 2022 22:18:18 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

Fine.. challenge accepted

    awk '/disk/&&!/RAID/{print; l++} END {printf "%0.2f\n", l/2}'

    $ ( echo "disk123"; echo disk234; echo disk897; echo "RAID") | awk '/disk/&&!/RAID/{print; l++} END {printf "%0.2f\n",l/2}'
    disk123
    disk234
    disk897
    1.50

From p.childs at qmul.ac.uk Wed May 11 08:30:51 2022
From: p.childs at qmul.ac.uk (Peter Childs)
Date: Wed, 11 May 2022 07:30:51 +0000
Subject: [gpfsug-discuss] [EXTERNAL] Re: ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

    mmlsrecoverygroup ess5k_78AF89A -L --pdisk -Y | awk -F: '{if ($2=="pdisk" && $3!="HEADER" && $14!="4" && $9=="DA1" && $12=="ok" && $13=="normal"){print $0 }}'

is as far as I got from last night's posts. It returns nothing on success, and a list of disks with missing paths on the ESS otherwise.

The command probably needs improving and adding to mmhealth ;) but I'm too used to checking the multipath output on disk trays, so this looks like a very similar issue.

Peter Childs
ITS Research Storage
Phone: 020 7882 8393 or on Teams
Please contact Research Support via ticket by email: its-research-support at qmul.ac.uk or https://support.research.its.qmul.ac.uk/
Please check the Research blog at https://blog.hpc.qmul.ac.uk

From christian.petersson at isstech.io Wed May 11 08:33:13 2022
From: christian.petersson at isstech.io (Christian Petersson)
Date: Wed, 11 May 2022 09:33:13 +0200
Subject: [gpfsug-discuss] TCT Premigration Question
Message-ID:

Hi,
I have set up a new Spectrum Scale cluster where we are using TCT premigration to a Ceph S3 cluster. The premigration itself works fine and I got data to our S3 storage.
But from time to time I reach our S3 max sessions and get the following errors in our policy.

<1> MCSTG000130E: Command Failed with following reason: Request failed with error : com.ibm.gpfsconnector.messages.GpfsConnectorException: Unable to migrate data to the cloud, an unexpected exception occurred : com.amazonaws.ResetException: The request to the service failed with a retryable reason, but resetting the request input stream has failed.
See exception.getExtraInfo or debug-level logging for the original failure that caused this retry.; If the request involves an input stream, the maximum stream buffer size can be configured via request.getRequestClientOptions().setReadLimit(int).

But I wonder: if a file gets skipped, will Spectrum Scale automatically retry that file later, or will it stay skipped? And how can I catch those files for a new migration?

--
Kind regards
Christian Petersson

Email: Christian.Petersson at isstech.io
Mobile: 070-3251577

From TROPPENS at de.ibm.com Wed May 11 14:27:16 2022
From: TROPPENS at de.ibm.com (Ulf Troppens)
Date: Wed, 11 May 2022 13:27:16 +0000
Subject: [gpfsug-discuss] Spectrum Scale User Meeting at ISC 2022 - May 30th, 2022 - Hamburg, Germany
Message-ID:

IBM is organizing a Spectrum Scale User Meeting at ISC 2022. We have an exciting agenda covering user stories, roadmap update, the latest insights into data fabrics, data orchestration and data management architectures, plus access to IBM experts and your peers. We look forward to welcoming you to this event. NDAs have to be signed as roadmap items are discussed.

Key topics covered:
- What is new in Spectrum Scale and ESS?
- GPUDirect Storage in Spectrum Scale
- Performance Update
- Not enough money for an all flash HPC storage? A brief cheating guide. (Talk by SVA)
- New S3 access for AI and Analytics workloads
- Data orchestration across the global data platform

As always we are looking for more customer talks. Please send me a mail if you are a customer and interested in sharing your experience.

Detailed agenda: https://www.spectrumscaleug.org/event/spectrum-scale-user-meeting-at-isc-2022/

Date: Mon, 30. May 2022 - 14:00-17:00 CET Local time | Hall X Room 8 by SVA
Congress Center Hamburg (CCH)

Any and all people entering meeting rooms MUST be registered with an ISC 2022 exhibition pass or conference pass.
Please note that registration is only available online and in advance. On-site registration is NOT possible.

Ulf Troppens
Senior Technical Staff Member
Spectrum Scale Development
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Gregor Pillen / Management: David Faller
Registered office: Böblingen / Court of registration: Amtsgericht Stuttgart, HRB 243294

From scale at us.ibm.com Thu May 12 13:34:56 2022
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Thu, 12 May 2022 08:34:56 -0400
Subject: [gpfsug-discuss] TCT Premigration Question
In-Reply-To:
References:
Message-ID:

Some thoughts to consider regarding your question.

1. If this is a "new" deployment, it is recommended that you evaluate the AFM object support, since the TCT feature is stabilized and no future changes/enhancements will be made to it.
2. TCT doesn't support Ceph cloud object storage. So, though it is an S3 compliant interface, and things are working, in case of any issues in future, IBM would require that you reproduce any problem in a supported environment.
3. If a file is skipped for any errors, if it is policy driven (pre)migration, the next policy run would pick up the file again. Even within a single (pre)migrate, TCT does 5 retries on failure. So at both levels (within TCT as well as at policy level) the file will be retried for migration.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.

If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries.

The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From christian.petersson at isstech.io Thu May 12 13:47:20 2022
From: christian.petersson at isstech.io (Christian Petersson)
Date: Thu, 12 May 2022 14:47:20 +0200
Subject: [gpfsug-discuss] TCT Premigration Question
In-Reply-To:
References:
Message-ID:

Thanks for the answer.
We did look at AFM to Cloud Object Storage, but had some reservations about it.

The base design was to replicate a copy from our primary cluster to the backup cluster, from where we push that data directly to S3. Our second cluster has much less disk space than our primary cluster, and our primary cluster has an HSM function enabled to a tape library.

If I start using AFM to Object, will that delete the file from my disk and only keep a stub file or similar on the disk?
This is why we are looking at TCT instead of AFM Object: we don't have enough disk space to keep multiple copies of the data on disk, so we need to use other storage layers like S3 and tape.

/C

--
Kind regards
Christian Petersson

Email: Christian.Petersson at isstech.io
Mobile: 070-3251577

From peter.chase at metoffice.gov.uk Thu May 12 16:37:39 2022
From: peter.chase at metoffice.gov.uk (Chase, Peter)
Date: Thu, 12 May 2022 15:37:39 +0000
Subject: [gpfsug-discuss] RHEL7, Scale 5, GUI CSR
Message-ID:

Afternoon all,

I've hit a problem with our Spectrum Scale GUI. It's on 5.1.3.1, and the running OS is RHEL 7. When I try to generate a CSR in the GUI with a Subject Alternative Name set I get the following error:

Error: EFSSG1159A Openssl failed to run the specified command. At least openssl version 1.1.1 is needed on the GUI node to use Subject Alternative Names. Please upgrade it and try again.

I've installed the latest OpenSSL11 RPM from EPEL, but I'm not able to get the Scale GUI to make use of it.

RHEL 7 runs the GUI as a systemd service. The user in the unit file is scalemgmt. The environment variables get set from an EnvironmentFile directive pointing at /etc/sysconfig/gpfsgui. I've updated the PATH variable set in /etc/sysconfig/gpfsgui to have /home/scalemgmt/bin before /bin, and added a symlink to openssl11 as openssl in that directory. I've confirmed that it works by running interactively as the user scalemgmt, sourcing the file and running openssl, which results in the 1.1.1 version opening. But that hasn't worked for the GUI, and I get the same error.
So I've explicitly sourced the file in scalemgmt's .bash_profile, and that's made no difference either. I've trapped for the command running with ps and can see it really is running as the scalemgmt user, and GPFS isn't doing some magic to run as a different user. As a last ditch, I've created a /usr/lib/systemd/system/gpfsgui.service.d directory, and added a path.conf file to specify the PATH as I want it with an environment directive, and that has also made no difference. So, I'm wondering if anyone else has experienced such a problem, and how they managed to get it working (short of generating a CSR with SSL command line and working out which java keystore to stuff it into). Sincerely, Peter Chase -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at hpe.com Wed May 18 17:36:13 2022 From: daniel.kidger at hpe.com (Kidger, Daniel) Date: Wed, 18 May 2022 16:36:13 +0000 Subject: [gpfsug-discuss] TCT Premigration Question In-Reply-To: References:

Message-ID:

Christian,

Like any AFM deployment in Spectrum Scale, you define how much local working space you want to allocate in the Spectrum Scale filesystem. Files are cached on demand into this 'working set', which never exceeds that upper bound.

If you create a new file on this filesystem (in this fileset), it is automatically queued to be replicated onto the backend. If you make continuous changes to a file, setting afmAsyncDelay can reduce the number of writes to the backend object store.

The local copy, though, is kept and only stubbed as space runs out (or when some other explicit rule is applied, such as via mmafmctl evict).

Daniel

Daniel Kidger
HPC Storage Solutions Architect, EMEA
daniel.kidger at hpe.com
+44 (0)7818 522266
hpe.com

________________________________
From: gpfsug-discuss on behalf of Christian Petersson
Sent: 12 May 2022 13:47
To: IBM Spectrum Scale
Cc: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] TCT Premigration Question

Thanks for the answer. We did look at AFM to Cloud Object Storage, but had some reservations about it.

The base design was to replicate a copy from our primary cluster to the backup cluster, where we push that data directly to S3. Our second cluster has much less disk space than our primary cluster, and our primary cluster has an HSM function enabled to a tape library.

If I start using AFM to Object, will that delete the file from my disk and only keep a stub file or similar on the disk? This is why we are looking at TCT instead of AFM Object: we don't have enough disk space to keep multiple copies of the data on disk, so we need to use other storage layers like S3 and tape.

/C

On Thu, 12 May 2022 at 14:38, IBM Spectrum Scale wrote:

Some thoughts to consider regarding your question.

1. If this is a "new" deployment, it is recommended that you evaluate the AFM object support, since the TCT feature is stabilized and no future changes/enhancements will be made to it.
2. TCT doesn't support Ceph cloud object storage. So, though it is an S3 compliant interface and things are working, in case of any issues in future IBM would require that you reproduce any problem in a supported environment.
3. If a file is skipped because of an error and the (pre)migration is policy driven, the next policy run will pick up the file again. Even within a single (pre)migrate, TCT does 5 retries on failure. So at both levels (within TCT as well as at policy level) the file will be retried for migration.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.

If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries.

The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From: "Christian Petersson"
To: gpfsug-discuss at gpfsug.org
Date: 05/11/2022 03:34 AM
Subject: [EXTERNAL] [gpfsug-discuss] TCT Premigration Question
Sent by: "gpfsug-discuss"
________________________________

Hi,
I have set up a new Spectrum Scale Cluster where we are using TCT Premigration to a Ceph S3 Cluster, the premigration itself works fine and I got data to our S3 storage.
But from time to time I reach our S3 max sessions and get the following errors in our policy.

<1> MCSTG000130E: Command Failed with following reason: Request failed with error : com.ibm.gpfsconnector.messages.GpfsConnectorException: Unable to migrate data to the cloud, an unexpected exception occurred : com.amazonaws.ResetException: The request to the service failed with a retryable reason, but resetting the request input stream has failed. See exception.getExtraInfo or debug-level logging for the original failure that caused this retry.; If the request involves an input stream, the maximum stream buffer size can be configured via request.getRequestClientOptions().setReadLimit(int).

But I wonder: if a file gets skipped, will Spectrum Scale automatically retry that file later, or will it stay skipped? And how can I catch those files for a new migration?

--
Med Vänliga Hälsningar
Christian Petersson

E-Post: Christian.Petersson at isstech.io
Mobil: 070-3251577
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

--
Med Vänliga Hälsningar
Christian Petersson

E-Post: Christian.Petersson at isstech.io
Mobil: 070-3251577

From p.childs at qmul.ac.uk Tue May 10 20:17:41 2022
From: p.childs at qmul.ac.uk (Peter Childs)
Date: Tue, 10 May 2022 19:17:41 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
Message-ID:

We have a nearly new ESS5000 (we're still attempting to commission it...)

Anyway, does anyone know a quick way to check if all the paths to the trays are working?
We've got 5 trays and one of the servers can only see one of the 5 trays via one path, and it's not causing an alert until we run a full "gnrhealthcheck", which we only found when attempting to upgrade it.

Looking closer in the GUI, all the disks in that tray are recording 3 paths, not 4, but it's not causing any health alerts.

Don't worry, we've got a ticket open and they are attempting to replace the cable. I'm just attempting to find a nice quick way to detect the problem and know when it's resolved, rather than waiting 5 minutes while the gnrhealthcheck runs.

Peter Childs

We are recruiting please see https://www.qmul.ac.uk/jobs/vacancies/items/6949.html for more details

From kevindjo at us.ibm.com Tue May 10 20:28:08 2022
From: kevindjo at us.ibm.com (Kevin D Johnson)
Date: Tue, 10 May 2022 19:28:08 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

https://www.ibm.com/docs/en/ess/6.0.2?topic=commands-essfindmissingdisks-command

Kevin D. Johnson
Senior Managing Consultant
MBA, MAcc, MS
Global Technology and Development
kevindjo at us.ibm.com
720-349-6199 office
https://ibm.webex.com/meet/kevindjo

________________________________
From: gpfsug-discuss on behalf of Peter Childs
Sent: Tuesday, May 10, 2022 3:17 PM
To: gpfsug main discussion list
Subject: [EXTERNAL] [gpfsug-discuss] ESS and Disk Paths

We have a nearly new ESS5000 (we're still attempting to commission it...)

Anyway, does anyone know a quick way to check if all the paths to the trays are working?

We've got 5 trays and one of the servers can only see one of the 5 trays via one path, and it's not causing an alert until we run a full "gnrhealthcheck", which we only found when attempting to upgrade it.

Looking closer in the GUI, all the disks in that tray are recording 3 paths, not 4, but it's not causing any health alerts.

Don't worry, we've got a ticket open and they are attempting to replace the cable. I'm just attempting to find a nice quick way to detect the problem and know when it's resolved, rather than waiting 5 minutes while the gnrhealthcheck runs.

Peter Childs

We are recruiting please see https://www.qmul.ac.uk/jobs/vacancies/items/6949.html for more details
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

From ewahl at osc.edu Tue May 10 20:57:13 2022
From: ewahl at osc.edu (Wahl, Edward)
Date: Tue, 10 May 2022 19:57:13 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

There is much you can do with 'lsscsi' and "mmlspdisk" and whatnot. Or something like:

  mmgetpdisktopology > /tmp/mmgetpdisktopology.out; topsummary /tmp/mmgetpdisktopology.out > topsummary

mmlspdisk should show all 4 paths, with 2 being notEnabled. Look for missing paths? This may be the fastest.

  device = "//ibmgssio5-hs.ten/dev/sdo(notEnabled),//ibmgssio5-hs.ten/dev/sdfe(notEnabled),//ibmgssio6-hs.ten/dev/sdo,//ibmgssio6-hs.ten/dev/sdfe"

If you are using IBM ESS hardware you can make some guesses and then use the Seagate tools to look and see if each SG device has drives.

  for i in `lsscsi -g -v | grep encl | grep 5147 | awk '{print $7 }'`; do echo "*** $i ***"; /usr/lpp/mmfs/updates/latest/firmware/enclosure/wbcli $i ddump_hid | egrep "FLASH" -B8; echo; done

We've got 5147-106 drive enclosures so I can check each disk...

  i=0; while [ $i -lt 106 ]; do echo "+++ $i +++"; /usr/lpp/mmfs/updates/latest/firmware/enclosure/wbcli /dev/sg131 "getdrivestatus $i" | grep "LED : ON"; sleep 1; let i=i+1; done

Lots of options. Now that said, many of these are slower than the gnrhealthcheck for me. Your mileage may vary.
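
For the "look for missing paths" idea, a rough sketch of counting the paths in one of those device strings might look like the following. This is only an illustration under assumptions: count_paths is a made-up helper name, and it assumes the paths are comma-separated with disabled ones tagged "(notEnabled)", as in the sample line above. How you pull the device strings out of mmlspdisk output is left to you.

```shell
#!/usr/bin/env bash
# Sketch only: count total and enabled paths in a single mmlspdisk
# "device = ..." value. count_paths is a hypothetical helper, not a GPFS tool.
# Assumes comma-separated paths, disabled ones marked "(notEnabled)".

count_paths() {
    # $1: the device string, without the surrounding quotes
    local device=$1 total enabled
    # One path per line, then count the non-empty lines.
    total=$(printf '%s\n' "$device" | tr ',' '\n' | grep -c .)
    # Same again, but drop the paths marked notEnabled first.
    enabled=$(printf '%s\n' "$device" | tr ',' '\n' | grep -v 'notEnabled' | grep -c .)
    echo "$total $enabled"
}

# Sample value taken from the mmlspdisk output quoted above.
sample='//ibmgssio5-hs.ten/dev/sdo(notEnabled),//ibmgssio5-hs.ten/dev/sdfe(notEnabled),//ibmgssio6-hs.ten/dev/sdo,//ibmgssio6-hs.ten/dev/sdfe'
count_paths "$sample"
```

On the sample line this prints "4 2" (four paths, two enabled). A disk behind a bad cable would presumably show a lower total, so anything that differs from your known-good pair is worth a look.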

Ed Wahl
OSC

________________________________
From: gpfsug-discuss on behalf of Kevin D Johnson
Sent: Tuesday, May 10, 2022 3:28 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] ESS and Disk Paths

https://www.ibm.com/docs/en/ess/6.0.2?topic=commands-essfindmissingdisks-command

Kevin D. Johnson
Senior Managing Consultant
MBA, MAcc, MS
Global Technology and Development
kevindjo at us.ibm.com
720-349-6199 office
https://ibm.webex.com/meet/kevindjo

________________________________
From: gpfsug-discuss on behalf of Peter Childs
Sent: Tuesday, May 10, 2022 3:17 PM
To: gpfsug main discussion list
Subject: [EXTERNAL] [gpfsug-discuss] ESS and Disk Paths

We have a nearly new ESS5000 (we're still attempting to commission it...)

Anyway, does anyone know a quick way to check if all the paths to the trays are working?

We've got 5 trays and one of the servers can only see one of the 5 trays via one path, and it's not causing an alert until we run a full "gnrhealthcheck", which we only found when attempting to upgrade it.

Looking closer in the GUI, all the disks in that tray are recording 3 paths, not 4, but it's not causing any health alerts.

Don't worry, we've got a ticket open and they are attempting to replace the cable. I'm just attempting to find a nice quick way to detect the problem and know when it's resolved, rather than waiting 5 minutes while the gnrhealthcheck runs.

Peter Childs

We are recruiting please see https://www.qmul.ac.uk/jobs/vacancies/items/6949.html for more details
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

From ewahl at osc.edu Tue May 10 21:01:08 2022
From: ewahl at osc.edu (Wahl, Edward)
Date: Tue, 10 May 2022 20:01:08 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

Knew I had a better one around here somewhere:

  mmlsrecoverygroup -L --pdisk | grep "1, 2"

Ed Wahl
OSC

________________________________
From: Wahl, Edward
Sent: Tuesday, May 10, 2022 3:57 PM
To: gpfsug main discussion list
Subject: Re: ESS and Disk Paths

There is much you can do with 'lsscsi' and "mmlspdisk" and whatnot. Or something like:

  mmgetpdisktopology > /tmp/mmgetpdisktopology.out; topsummary /tmp/mmgetpdisktopology.out > topsummary

mmlspdisk should show all 4 paths, with 2 being notEnabled. Look for missing paths? This may be the fastest.

  device = "//ibmgssio5-hs.ten/dev/sdo(notEnabled),//ibmgssio5-hs.ten/dev/sdfe(notEnabled),//ibmgssio6-hs.ten/dev/sdo,//ibmgssio6-hs.ten/dev/sdfe"

If you are using IBM ESS hardware you can make some guesses and then use the Seagate tools to look and see if each SG device has drives.

  for i in `lsscsi -g -v | grep encl | grep 5147 | awk '{print $7 }'`; do echo "*** $i ***"; /usr/lpp/mmfs/updates/latest/firmware/enclosure/wbcli $i ddump_hid | egrep "FLASH" -B8; echo; done

We've got 5147-106 drive enclosures so I can check each disk...

  i=0; while [ $i -lt 106 ]; do echo "+++ $i +++"; /usr/lpp/mmfs/updates/latest/firmware/enclosure/wbcli /dev/sg131 "getdrivestatus $i" | grep "LED : ON"; sleep 1; let i=i+1; done

Lots of options. Now that said, many of these are slower than the gnrhealthcheck for me. Your mileage may vary.

Ed Wahl
OSC

________________________________
From: gpfsug-discuss on behalf of Kevin D Johnson
Sent: Tuesday, May 10, 2022 3:28 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] ESS and Disk Paths

https://www.ibm.com/docs/en/ess/6.0.2?topic=commands-essfindmissingdisks-command

Kevin D. Johnson
Senior Managing Consultant
MBA, MAcc, MS
Global Technology and Development
kevindjo at us.ibm.com
720-349-6199 office
https://ibm.webex.com/meet/kevindjo

________________________________
From: gpfsug-discuss on behalf of Peter Childs
Sent: Tuesday, May 10, 2022 3:17 PM
To: gpfsug main discussion list
Subject: [EXTERNAL] [gpfsug-discuss] ESS and Disk Paths

We have a nearly new ESS5000 (we're still attempting to commission it...)

Anyway, does anyone know a quick way to check if all the paths to the trays are working?

We've got 5 trays and one of the servers can only see one of the 5 trays via one path, and it's not causing an alert until we run a full "gnrhealthcheck", which we only found when attempting to upgrade it.

Looking closer in the GUI, all the disks in that tray are recording 3 paths, not 4, but it's not causing any health alerts.

Don't worry, we've got a ticket open and they are attempting to replace the cable. I'm just attempting to find a nice quick way to detect the problem and know when it's resolved, rather than waiting 5 minutes while the gnrhealthcheck runs.

Peter Childs

We are recruiting please see https://www.qmul.ac.uk/jobs/vacancies/items/6949.html for more details
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

From ewahl at osc.edu Tue May 10 21:10:36 2022
From: ewahl at osc.edu (Wahl, Edward)
Date: Tue, 10 May 2022 20:10:36 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

So just look at your good paths and build a grep -v that matches what you need. This is what we see, so: grep -v "2, 4"

                     n. active,    declustered                 state,
  pdisk              total paths   array         free space    remarks
  -----------------  -----------   -----------   ----------    -------
  e1s001               2,  4       DA1             288 GiB     ok
  e1s002               2,  4       DA1             304 GiB     ok
  e1s003               2,  4       DA1             296 GiB     ok
  e1s004               2,  4       DA1             288 GiB     ok
  e1s005               2,  4       DA1             288 GiB     ok
  e1s006               2,  4       DA1             288 GiB     ok

Ed Wahl
OSC

________________________________
From: Wahl, Edward
Sent: Tuesday, May 10, 2022 4:01 PM
To: gpfsug main discussion list
Subject: Re: ESS and Disk Paths

Knew I had a better one around here somewhere:

  mmlsrecoverygroup -L --pdisk | grep "1, 2"

Ed Wahl
OSC

________________________________
From: Wahl, Edward
Sent: Tuesday, May 10, 2022 3:57 PM
To: gpfsug main discussion list
Subject: Re: ESS and Disk Paths

There is much you can do with 'lsscsi' and "mmlspdisk" and whatnot. Or something like:

  mmgetpdisktopology > /tmp/mmgetpdisktopology.out; topsummary /tmp/mmgetpdisktopology.out > topsummary

mmlspdisk should show all 4 paths, with 2 being notEnabled. Look for missing paths? This may be the fastest.

  device = "//ibmgssio5-hs.ten/dev/sdo(notEnabled),//ibmgssio5-hs.ten/dev/sdfe(notEnabled),//ibmgssio6-hs.ten/dev/sdo,//ibmgssio6-hs.ten/dev/sdfe"

If you are using IBM ESS hardware you can make some guesses and then use the Seagate tools to look and see if each SG device has drives.

  for i in `lsscsi -g -v | grep encl | grep 5147 | awk '{print $7 }'`; do echo "*** $i ***"; /usr/lpp/mmfs/updates/latest/firmware/enclosure/wbcli $i ddump_hid | egrep "FLASH" -B8; echo; done

We've got 5147-106 drive enclosures so I can check each disk...

  i=0; while [ $i -lt 106 ]; do echo "+++ $i +++"; /usr/lpp/mmfs/updates/latest/firmware/enclosure/wbcli /dev/sg131 "getdrivestatus $i" | grep "LED : ON"; sleep 1; let i=i+1; done

Lots of options. Now that said, many of these are slower than the gnrhealthcheck for me. Your mileage may vary.

Ed Wahl
OSC

________________________________
From: gpfsug-discuss on behalf of Kevin D Johnson
Sent: Tuesday, May 10, 2022 3:28 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] ESS and Disk Paths

https://www.ibm.com/docs/en/ess/6.0.2?topic=commands-essfindmissingdisks-command

Kevin D. Johnson
Senior Managing Consultant
MBA, MAcc, MS
Global Technology and Development
kevindjo at us.ibm.com
720-349-6199 office
https://ibm.webex.com/meet/kevindjo

________________________________
From: gpfsug-discuss on behalf of Peter Childs
Sent: Tuesday, May 10, 2022 3:17 PM
To: gpfsug main discussion list
Subject: [EXTERNAL] [gpfsug-discuss] ESS and Disk Paths

We have a nearly new ESS5000 (we're still attempting to commission it...)

Anyway, does anyone know a quick way to check if all the paths to the trays are working?

We've got 5 trays and one of the servers can only see one of the 5 trays via one path, and it's not causing an alert until we run a full "gnrhealthcheck", which we only found when attempting to upgrade it.

Looking closer in the GUI, all the disks in that tray are recording 3 paths, not 4, but it's not causing any health alerts.

Don't worry, we've got a ticket open and they are attempting to replace the cable. I'm just attempting to find a nice quick way to detect the problem and know when it's resolved, rather than waiting 5 minutes while the gnrhealthcheck runs.

Peter Childs

We are recruiting please see https://www.qmul.ac.uk/jobs/vacancies/items/6949.html for more details
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

From jonathan.buzzard at strath.ac.uk Tue May 10 23:08:38 2022
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Tue, 10 May 2022 23:08:38 +0100
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References:
Message-ID:

On 10/05/2022 20:57, Wahl, Edward wrote:
> There is much you can do with 'lsscsi' and "mmlspdisk" and whatnot.
> Or something like "mmgetpdisktopology > /tmp/mmgetpdisktopology.out;
> topsummary /tmp/mmgetpdisktopology.out > topsummary"
>
> mmlspdisk should show all 4 paths, with 2 being notEnabled. Look for missing
> paths? This may be the fastest.
> " device =
> "//ibmgssio5-hs.ten/dev/sdo(notEnabled),//ibmgssio5-hs.ten/dev/sdfe(notEnabled),//ibmgssio6-hs.ten/dev/sdo,//ibmgssio6-hs.ten/dev/sdfe"
> "

On a DSS-G you can hop onto one of the servers and do

  lsscsi | grep disk | grep -v RAID | awk 'END {print NR/2}'

The number should be the number of disks that you have, including the log tip drives. If it's not, then there is a path issue. I would imagine it's the same on the ESS.

If I thought about it for a bit you could ditch the two greps and do it all in Awk. If the Awk is too complicated, just replace it with 'wc -l', but you will end up with a number that should be twice the number of drives.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From Alec.Effrat at wellsfargo.com Tue May 10 23:18:18 2022
From: Alec.Effrat at wellsfargo.com (Alec.Effrat at wellsfargo.com)
Date: Tue, 10 May 2022 22:18:18 +0000
Subject: [gpfsug-discuss] ESS and Disk Paths
In-Reply-To:
References: