From chair at gpfsug.org Mon Jun 8 10:06:56 2015 From: chair at gpfsug.org (Jez Tucker) Date: Mon, 08 Jun 2015 10:06:56 +0100 Subject: [gpfsug-discuss] Election news Message-ID: <55755B30.9070601@gpfsug.org> Hello all Simon Thompson (Research Computing, University of Birmingham) has put himself forward as sole candidate for the position of Chair. I firmly believe it is in the best interest of the Group that we do not have the same Chair indefinitely. The Group is on a fine footing, so now is an appropriate time for change. Having spoken with Simon, the UG Committee are more than happy to recommend him for the position of Chair for the next two years. Over the same period, the UG Committee has proposed that I move to represent Media (as per sector representatives) and continue to support the efforts of the Group where appropriate. The UG Committee would also like to recommend Ross Keeping for the IBM non-exec position. Some of you will have met Ross at the recent User Group. He understands the focus and needs of the Group and will act as the Group's plug-in to IBM as well as hosting the 'Meet the Devs' events (details on the next one soon). With respect to the above, we do not believe it is prudent to spend time and resource on election scaffolding to vote for a single candidate. We would suggest that if the majority of members are strongly against this move, it should be discussed openly on the mailing list. Discussion is good! Failing any overwhelming response to the contrary, Simon will assume the position of Chair on 19th June 2015 with the Committee's full support. Best regards, Jez (Chair) and Claire (Secretary) -------- Simon's response to the Election call follows verbatim: a) The post they wish to stand for Group chair b) A paragraph covering their credentials I have been working with GPFS for the past few years, initially in an HPC environment and more recently using it to deliver our research data and OpenStack platforms. The research storage platform was developed in conjunction with OCF, our IBM business partner, and spans both spinning disk and a TSM HSM layer. I have spoken at both the UK GPFS user group and at the GPFS user forum in the USA. In addition to this I've made a short customer video used by IBM marketing. LinkedIn profile: uk.linkedin.com/in/simonjthompson1 Blog: www.roamingzebra.co.uk c) A paragraph covering what they would bring to the group I already have a good working relationship with the GPFS developers, having spent the past few months building our OpenStack platform with IBM and documenting how to use some of the features, and would look to build on this relationship to develop the GPFS user group. I've also blogged about many of the bits I have experimented with and would like to see this develop, with the group contributing to a wiki-style information source with specific examples of technology and configs. In addition to this, I have support from my employer to attend meetings and conferences and would be happy to represent and promote the group at these, as well as bringing back feedback. d) A paragraph setting out their vision for the group for the next two years I would like to see the group engaging with more diverse users of GPFS, as many of those attending the meetings are from HPC-type environments, so I would look to work with both IBM and resellers to help engage with other industries using GPFS technology. Ideally this would see more customer talks at the user group in addition to a balanced view from IBM on road maps. 
I think it would also be good to focus on specific features of gpfs and how they work and can be applied as my suspicion is very few customers use lots of features to full advantage. Simon -------- ends -------------- next part -------------- An HTML attachment was scrubbed... URL: From Luke.Raimbach at crick.ac.uk Mon Jun 15 09:35:18 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Mon, 15 Jun 2015 08:35:18 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver Message-ID: Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at gpfsug.org Mon Jun 15 14:24:57 2015 From: chair at gpfsug.org (Jez Tucker (Chair)) Date: Mon, 15 Jun 2015 14:24:57 +0100 Subject: [gpfsug-discuss] Member locations for Dev meeting organisation Message-ID: <557ED229.50902@gpfsug.org> Hello all It would be very handy if all members could send me an email to: chair at gpfsug.org with the City and Country in which you are located. We're looking to place 'Meet the Devs' coffee-shops close to you, so this would make planning several orders of magnitude easier. I can infer from each member's email, but it's only 'mostly accurate'. Stateside members - we're actively organising a first meet up near you imminently, so please ping me your locations. All the best, Jez (Chair) From ewahl at osc.edu Mon Jun 15 14:59:44 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Mon, 15 Jun 2015 13:59:44 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? Ed Wahl OSC ________________________________ ++ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. 
However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Jun 15 15:10:25 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Mon, 15 Jun 2015 14:10:25 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver Message-ID: Manilla is one of the projects to provide ?shared? access to file-systems. I thought that at the moment, Manilla doesn?t support the GPFS protocol but is implemented on top of Ganesha so it provided as NFS access. So you wouldn?t get mmunlinkfileset. This sorta brings me back to one of the things I talked about at the GPFS UG, as in the GPFS security model is trusting, which in multi-tenant environments is a bad thing. I know I?ve spoken to a few people recently who?ve commented / agreed / had thoughts on it, so can I ask that if multi-tenancy security is something that you think is of concern with GPFS, can you drop me an email (directly is fine) which your use case and what sort of thing you?d like to see, then I?ll collate this and have a go at talking to IBM again about this. Thanks Simon From: , Edward > Reply-To: gpfsug main discussion list > Date: Monday, 15 June 2015 14:59 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? Ed Wahl OSC ________________________________ ++ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. 
someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Mon Jun 15 15:16:44 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 15 Jun 2015 15:16:44 +0100 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: Message-ID: <1434377805.15671.126.camel@buzzard.phy.strath.ac.uk> On Mon, 2015-06-15 at 08:35 +0000, Luke Raimbach wrote: > Dear All, > > We are looking forward to using the manila driver for > auto-provisioning of file shares using GPFS. However, I have some > concerns... > > > Manila presumably gives tenant users access to file system commands > like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset > quiesces the file system, there is potentially an impact from one > tenant on another - i.e. someone unlinking and deleting a lot of > filesets during a tenancy cleanup might cause a cluster pause long > enough to trigger other failure events or even start evicting nodes. > You can see why this would be bad in a cloud environment. Er as far as I can see in the documentation no you don't. My personal experience is mmunlinkfileset has a habit of locking the file system up; aka don't do while the file system is busy. On the other hand mmlinkfileset you can do with gay abandonment. Might have changed in more recent version of GPFS. On the other hand you do get access to creating/deleting snapshots which on the deleting side has in the past for me personally has caused file system lockups. Similarly creating a snapshot no problem. The difference between the two is things that require quiescence to take away from the file system can cause bad things happen. Quiescence to add things to the file system rarely if ever cause problems. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From chris.hunter at yale.edu Mon Jun 15 15:35:06 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Mon, 15 Jun 2015 10:35:06 -0400 Subject: [gpfsug-discuss] OpenStack Manila Driver Message-ID: <557EE29A.4070909@yale.edu> Although likely not the access model you are seeking, GPFS is mentioned for the swift-on-file project: * https://github.com/stackforge/swiftonfile Openstack Swift uses HTTP/REST protocol for file access (ala S3), not the best choice for data-intensive applications. regards, chris hunter yale hpc group --- Date: Mon, 15 Jun 2015 08:35:18 +0000 From: Luke Raimbach To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. 
Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE From ross.keeping at uk.ibm.com Mon Jun 15 17:43:07 2015 From: ross.keeping at uk.ibm.com (Ross Keeping3) Date: Mon, 15 Jun 2015 17:43:07 +0100 Subject: [gpfsug-discuss] 4.1.1 fix central location Message-ID: Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone: (+44 161) 8362381-Line: 37642381 E-mail: ross.keeping at uk.ibm.com 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 360 bytes Desc: not available URL: From ewahl at osc.edu Mon Jun 15 21:35:04 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Mon, 15 Jun 2015 20:35:04 +0000 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: References: Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> When I navigate using these instructions I can find the fixes, but attempting to get to them at the last step results in a loop back to the SDN screen. :( Not sure if this is the page, lack of the "proper" product in my supported products (still lists 3.5 as our product) or what. Ed ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ross.keeping at uk.ibm.com] Sent: Monday, June 15, 2015 12:43 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] 4.1.1 fix central location Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. 
You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone: (+44 161) 8362381-Line: 37642381 E-mail: ross.keeping at uk.ibm.com [IBM] 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: ATT00001.gif URL: From oester at gmail.com Mon Jun 15 21:38:39 2015 From: oester at gmail.com (Bob Oesterlin) Date: Mon, 15 Jun 2015 15:38:39 -0500 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: It took me a while to find it too - Key is to search on "Spectrum Scale". Try this URL: http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%2Bdefined%2Bstorage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=4.1.1&platform=Linux+64-bit,x86_64&function=all If you don't want X86, just select the appropriate platform. Bob Oesterlin Nuance Communications On Mon, Jun 15, 2015 at 3:35 PM, Wahl, Edward wrote: > When I navigate using these instructions I can find the fixes, but > attempting to get to them at the last step results in a loop back to the > SDN screen. :( > > Not sure if this is the page, lack of the "proper" product in my supported > products (still lists 3.5 as our product) > or what. > > Ed > > ------------------------------ > *From:* gpfsug-discuss-bounces at gpfsug.org [ > gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ > ross.keeping at uk.ibm.com] > *Sent:* Monday, June 15, 2015 12:43 PM > *To:* gpfsug-discuss at gpfsug.org > *Subject:* [gpfsug-discuss] 4.1.1 fix central location > > Hi > > IBM successfully released 4.1.1 on Friday with the Spectrum Scale > re-branding and introduction of protocols etc. > > However, I initially had trouble finding the PTF - rest assured it does > exist. 
> > You can find the 4.1.1 main download here: > http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 > > There is a new area in Fix Central that you will need to navigate to get > the PTF for upgrade: > https://scwebtestt.rchland.ibm.com/support/fixcentral/ > 1) Set Product Group --> Systems Storage > 2) Set Systems Storage --> Storage software > 3) Set Storage Software --> Software defined storage > 4) Installed Version defaults to 4.1.1 > 5) Select your platform > > If you try and find the PTF via other links or sections of Fix Central you > will likely be disappointed. Work is ongoing to ensure this becomes more > intuitive - any thoughts for improvements always welcome. > > Best regards, > > *Ross Keeping* > IBM Spectrum Scale - Development Manager, People Manager > IBM Systems UK - Manchester Development Lab *Phone:* (+44 161) 8362381 > *-Line:* 37642381 > * E-mail: *ross.keeping at uk.ibm.com > [image: IBM] > 3rd Floor, Maybrook House > Manchester, M3 2EG > United Kingdom > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: not available URL: From Luke.Raimbach at crick.ac.uk Tue Jun 16 08:36:56 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Tue, 16 Jun 2015 07:36:56 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances which would be accessed over NFS. Perhaps I?ve not done enough research in to this though ? I?m also not an OpenStack expert. The tenants don?t have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. The unlinking of the fileset worries me for the reasons stated previously. From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward Sent: 15 June 2015 15:00 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? 
Ed Wahl OSC ________________________________ ++ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Jun 16 09:11:23 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 16 Jun 2015 08:11:23 +0000 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: References: Message-ID: The docs also now seem to be in Spectrum Scale section at: http://www-01.ibm.com/support/knowledgecenter/#!/STXKQY/411/ibmspectrumscale411_welcome.html Simon From: Ross Keeping3 > Reply-To: gpfsug main discussion list > Date: Monday, 15 June 2015 17:43 To: "gpfsug-discuss at gpfsug.org" > Subject: [gpfsug-discuss] 4.1.1 fix central location Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone:(+44 161) 8362381-Line:37642381 E-mail: ross.keeping at uk.ibm.com [IBM] 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: ATT00001.gif URL: From jonathan at buzzard.me.uk Tue Jun 16 09:40:30 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 16 Jun 2015 09:40:30 +0100 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> On Tue, 2015-06-16 at 07:36 +0000, Luke Raimbach wrote: [SNIP] > The tenants don?t have root access to the file system, but the Manila > component must act as a wrapper to file system administrative > equivalents like mmcrfileset, mmdelfileset, link and unlink. The > shares are created as GPFS filesets which are then presented over NFS. > What makes you think the it creates filesets as opposed to just sharing out a normal directory? I had a quick peruse over the documentation and source code and saw no mention of filesets, though I could have missed it. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From adam.huffman at crick.ac.uk Tue Jun 16 09:41:56 2015 From: adam.huffman at crick.ac.uk (Adam Huffman) Date: Tue, 16 Jun 2015 08:41:56 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: The presentation of shared storage by Manila isn?t necessarily via NFS. Some of the drivers, I believe the GPFS one amongst them, allow some form of native connection either via the guest or via a VirtFS connection to the client on the hypervisor. Best Wishes, Adam ? > On 16 Jun 2015, at 08:36, Luke Raimbach wrote: > > So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances which would be accessed over NFS. Perhaps I?ve not done enough research in to this though ? I?m also not an OpenStack expert. > > The tenants don?t have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. > > The unlinking of the fileset worries me for the reasons stated previously. > > From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward > Sent: 15 June 2015 15:00 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] OpenStack Manila Driver > > Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. > > I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? 
> > Ed Wahl > OSC > > > > ++ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] > Sent: Monday, June 15, 2015 4:35 AM > To: gpfsug-discuss at gpfsug.org > Subject: [gpfsug-discuss] OpenStack Manila Driver > > Dear All, > > We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... > > Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. > > > Has this scenario been addressed at all? > > Cheers, > Luke. > > > Luke Raimbach? > Senior HPC Data and Storage Systems Engineer > The Francis Crick Institute > Gibbs Building > 215 Euston Road > London NW1 2BE > > E: luke.raimbach at crick.ac.uk > W: www.crick.ac.uk > > The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. > The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From S.J.Thompson at bham.ac.uk Tue Jun 16 09:46:52 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 16 Jun 2015 08:46:52 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: I didn;t think that the *current* Manilla driver user GPFS protocol, but sat on top of Ganesha server. Simon On 16/06/2015 09:41, "Adam Huffman" wrote: > >The presentation of shared storage by Manila isn?t necessarily via NFS. >Some of the drivers, I believe the GPFS one amongst them, allow some form >of native connection either via the guest or via a VirtFS connection to >the client on the hypervisor. > >Best Wishes, >Adam > > >? > > > > > >> On 16 Jun 2015, at 08:36, Luke Raimbach >>wrote: >> >> So as I understand things, Manila is an OpenStack component which >>allows tenants to create and destroy shares for their instances which >>would be accessed over NFS. Perhaps I?ve not done enough research in to >>this though ? I?m also not an OpenStack expert. >> >> The tenants don?t have root access to the file system, but the Manila >>component must act as a wrapper to file system administrative >>equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares >>are created as GPFS filesets which are then presented over NFS. >> >> The unlinking of the fileset worries me for the reasons stated >>previously. 
>> >> From: gpfsug-discuss-bounces at gpfsug.org >>[mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward >> Sent: 15 June 2015 15:00 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] OpenStack Manila Driver >> >> Perhaps I misunderstand here, but if the tenants have administrative >>(ie:root) privileges to the underlying file system management commands I >>think mmunlinkfileset might be a minor concern here. There are FAR more >>destructive things that could occur. >> >> I am not an OpenStack expert and I've not even looked at anything past >>Kilo, but my understanding was that these commands were not necessary >>for tenants. They access a virtual block device that backs to GPFS, >>correct? >> >> Ed Wahl >> OSC >> >> >> >> ++ >> From: gpfsug-discuss-bounces at gpfsug.org >>[gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach >>[Luke.Raimbach at crick.ac.uk] >> Sent: Monday, June 15, 2015 4:35 AM >> To: gpfsug-discuss at gpfsug.org >> Subject: [gpfsug-discuss] OpenStack Manila Driver >> >> Dear All, >> >> We are looking forward to using the manila driver for auto-provisioning >>of file shares using GPFS. However, I have some concerns... >> >> Manila presumably gives tenant users access to file system commands >>like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset >>quiesces the file system, there is potentially an impact from one tenant >>on another - i.e. someone unlinking and deleting a lot of filesets >>during a tenancy cleanup might cause a cluster pause long enough to >>trigger other failure events or even start evicting nodes. You can see >>why this would be bad in a cloud environment. >> >> >> Has this scenario been addressed at all? >> >> Cheers, >> Luke. >> >> >> Luke Raimbach? >> Senior HPC Data and Storage Systems Engineer >> The Francis Crick Institute >> Gibbs Building >> 215 Euston Road >> London NW1 2BE >> >> E: luke.raimbach at crick.ac.uk >> W: www.crick.ac.uk >> >> The Francis Crick Institute Limited is a registered charity in England >>and Wales no. 1140062 and a company registered in England and Wales no. >>06885462, with its registered office at 215 Euston Road, London NW1 2BE. >> The Francis Crick Institute Limited is a registered charity in England >>and Wales no. 1140062 and a company registered in England and Wales no. >>06885462, with its registered office at 215 Euston Road, London NW1 2BE. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > >The Francis Crick Institute Limited is a registered charity in England >and Wales no. 1140062 and a company registered in England and Wales no. >06885462, with its registered office at 215 Euston Road, London NW1 2BE. 
>_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at gpfsug.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Luke.Raimbach at crick.ac.uk Tue Jun 16 09:48:50 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Tue, 16 Jun 2015 08:48:50 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> Message-ID: [SNIP] >> The tenants don?t have root access to the file system, but the Manila >> component must act as a wrapper to file system administrative >> equivalents like mmcrfileset, mmdelfileset, link and unlink. The >> shares are created as GPFS filesets which are then presented over NFS. >> > What makes you think the it creates filesets as opposed to just sharing out a normal directory? I had a quick peruse over the documentation and source code and saw no mention of filesets, though I could have missed it. I think you are right. Looking over the various resources I have available, the creation, deletion, linking and unlinking of filesets is not implemented, but commented on as needing to be done. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From jonathan at buzzard.me.uk Tue Jun 16 10:25:45 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 16 Jun 2015 10:25:45 +0100 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> Message-ID: <1434446745.15671.134.camel@buzzard.phy.strath.ac.uk> On Tue, 2015-06-16 at 08:48 +0000, Luke Raimbach wrote: [SNIP] > I think you are right. Looking over the various resources I have > available, the creation, deletion, linking and unlinking of filesets is > not implemented, but commented on as needing to be done. That's going to be a right barrel of laughs as reliability goes out the window if they do implement it. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From billowen at us.ibm.com Thu Jun 18 05:22:25 2015 From: billowen at us.ibm.com (Bill Owen) Date: Wed, 17 Jun 2015 22:22:25 -0600 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Hi Luke, Your explanation below is correct, with some minor clarifications Manila is an OpenStack project which allows storage admins to create and destroy filesystem shares and make those available to vm instances and bare metal servers which would be accessed over NFS. The Manila driver runs in the control plane and creates a new gpfs independent fileset for each new share. It provides automation for giving vm's (and also bare metal servers) acces to the shares so that they can mount and use the share. There is work being done to allow automating the mount process when the vm instance boots. The tenants don?t have root access to the file system, but the Manila component acts as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. 
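For illustration, a hedged sketch of what that per-share lifecycle looks like at the GPFS command line. The file system name, fileset name, junction path and quota value below are all invented, and exact option syntax may differ between releases, so treat this as a sketch rather than what the driver literally runs:

# Hypothetical names only: fs1, share_0ab34906 and the 100G quota are examples
FS=fs1
SHARE=share_0ab34906

# Share creation: one independent fileset per share, linked into the namespace
# and size-limited with a fileset quota
mmcrfileset "$FS" "$SHARE" --inode-space new
mmlinkfileset "$FS" "$SHARE" -J "/gpfs/$FS/$SHARE"
mmsetquota "$FS:$SHARE" --block 100G:100G

# ...the junction path is then exported over NFS to the tenant's instances...

# Share deletion: unlink first (the quiescing step discussed earlier in the
# thread), then delete the fileset and its contents
mmunlinkfileset "$FS" "$SHARE" -f
mmdelfileset "$FS" "$SHARE" -f
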
The manila driver uses the following gpfs commands: When a share is created: mmcrfileset mmlinkfileset mmsetquota When a share is deleted: mmunlinkfileset mmdelfileset Snapshots of shares can be created and deleted: mmcrsnapshot mmdelsnapshot Today, the GPFS Manila driver supports creating NFS exports to VMs. We are considering adding native GPFS client support in the VM, but not sure if the benefit justifies the extra complexity of having gpfs client in vm image, and also the impact to cluster as vm's come up and down in a more dynamic way than physical nodes. For multi-tenant deployments, we recommend using a different filesystem per tenant to provide better separation of data, and to minimize the "noisy neighbor" effect for operations like mmunlinkfileset. Here is a presentation that shows an overview of the GPFS Manila driver: (See attached file: OpenStack_Storage_Manila_with_GPFS.pdf) Perhaps this, and other GPFS & OpenStack topics could be the subject of a future user group session. Regards, Bill Owen billowen at us.ibm.com GPFS and OpenStack 520-799-4829 From: Luke Raimbach To: gpfsug main discussion list Date: 06/16/2015 12:37 AM Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Sent by: gpfsug-discuss-bounces at gpfsug.org So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances which would be accessed over NFS. Perhaps I?ve not done enough research in to this though ? I?m also not an OpenStack expert. The tenants don?t have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. The unlinking of the fileset worries me for the reasons stated previously. From: gpfsug-discuss-bounces at gpfsug.org [ mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward Sent: 15 June 2015 15:00 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? Ed Wahl OSC ++ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? 
Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenStack_Storage_Manila_with_GPFS.pdf Type: application/pdf Size: 354887 bytes Desc: not available URL: From Luke.Raimbach at crick.ac.uk Thu Jun 18 13:30:40 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Thu, 18 Jun 2015 12:30:40 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations Message-ID: Hi All, Something I am thinking about doing is utilising the placement policy engine to insert custom metadata tags upon file creation, based on which fileset the creation occurs in. This might be to facilitate Research Data Management tasks that could happen later in the data lifecycle. I am also thinking about allowing users to specify additional custom metadata tags (maybe through a fancy web interface) and also potentially give users control over creating new filesets (e.g. for scientists running new experiments). So? pretend this is a placement policy on my GPFS driven data-ingest platform: RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' The fileset name can be meaningless (as far as the user is concerned), but would be linked somewhere nice that they recognise ? say /gpfs/incoming/instrument1. The fileset, when it is created, would also be an AFM cache for its ?home? counterpart which exists on a much larger (also GPFS driven) pool of storage? so that my metadata tags are preserved, you see. This potentially user driven activity might look a bit like this: - User logs in to web interface and creates new experiment - Filesets (system-generated names) are created on ?home? and ?ingest? file systems and linked into the directory namespace wherever the user specifies - AFM relationships are set up and established for the ingest (cache) fileset to write back to the AFM home fileset (probably Independent Writer mode) - A set of ?default? 
policies are defined and installed on the cache file system to tag data for that experiment (the user can?t change these) - The user now specifies additional metadata tags they want added to their experiment data (some of this might be captured through additional mandatory fields in the web form for instance) - A policy for later execution by mmapplypolicy on the AFM home file system is created which looks for the tags generated at ingest-time and applies the extra user-defined tags There?s much more that would go on later in the lifecycle to take care of automated HSM tiering, data publishing, movement and cataloguing of data onto external non GPFS file systems, etc. but I won?t go in to it here. My GPFS related questions are: When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. What is the specific limitation for having a policy placement file no larger than 1MB? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Thu Jun 18 14:18:34 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 09:18:34 -0400 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: Yes, you can do this. In release 4.1.1 you can write SET POOL 'x' ACTION(setXattr(...)) FOR FILESET(...) WHERE ... which looks nicer to some people than WHERE ( ... ) AND setXattr(...) Answers: (1) No need to quiesce. As the new policy propagates, nodes begin using it. So there can be a transition time when node A may be using the new policy but Node B has not started using it yet. If that is undesirable, you can quiesce. (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. --marc of GPFS From: Luke Raimbach ... RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' ... (1) When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. ... (2) What is the specific limitation for having a policy placement file no larger than 1MB? ... -------------- next part -------------- An HTML attachment was scrubbed... 
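As a reference point, here is a hedged, untested sketch of Luke's example rewritten in the 4.1.1 ACTION() form Marc describes, wrapped in the mmchpolicy call used to install it. The file system name is a placeholder, and chaining two SetXattr() calls inside one ACTION() with AND is an assumption carried over from the WHERE form above:

# fs1 is a placeholder file system name
cat > /tmp/rdm-placement.pol <<'EOF'
RULE 'RDMTEST' SET POOL 'instruments'
  ACTION(SetXattr('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a')
         AND SetXattr('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0'))
  FOR FILESET('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0')
RULE 'DEFAULT' SET POOL 'data'
EOF

mmchpolicy fs1 /tmp/rdm-placement.pol -I test   # validate the rules without installing
mmchpolicy fs1 /tmp/rdm-placement.pol -I yes    # install; no unmount or quiesce required
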
URL: From S.J.Thompson at bham.ac.uk Thu Jun 18 14:27:52 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Thu, 18 Jun 2015 13:27:52 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: I can see exactly where Luke?s suggestion would be applicable. We might have several hundred active research projects which would have some sort of internal identifier, so I can see why you?d want to do this sort of tagging as it would allow a policy scan to find files related to specific projects (for example). Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Thursday, 18 June 2015 14:18 To: gpfsug main discussion list >, "luke.raimbach at crick.ac.uk" > Subject: Re: [gpfsug-discuss] Placement Policy Installation and RDM Considerations (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Luke.Raimbach at crick.ac.uk Thu Jun 18 14:35:32 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Thu, 18 Jun 2015 13:35:32 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: Hi Marc, Thanks for the pointer to the updated syntax. That indeed looks nicer. (1) Asynchronous policy propagation sounds good in our scenario. We don?t want to potentially interrupt other running experiments by having to quiesce the filesystem for a new one coming online. It is useful to know that you could quiesce if desired. Presumably this is a secret flag one might pass to mmchpolicy? (2) I was concerned about the evaluation time if I tried to set all extended attributes at creation time. That?s why I thought about adding a few ?system? defined tags which could later be used to link the files to an asynchronously applied policy on the home cluster. I think I calculated around 4,000 rules (dependent on the size of the attribute names and values), which might limit the number of experiments supported on a single ingest file system. However, I can?t envisage we will ever have 4,000 experiments running at once! I was really interested in why the limitation existed from a file-system architecture point of view. Thanks for the responses. Luke. From: Marc A Kaplan [mailto:makaplan at us.ibm.com] Sent: 18 June 2015 14:19 To: gpfsug main discussion list; Luke Raimbach Subject: Re: [gpfsug-discuss] Placement Policy Installation and RDM Considerations Yes, you can do this. In release 4.1.1 you can write SET POOL 'x' ACTION(setXattr(...)) FOR FILESET(...) WHERE ... which looks nicer to some people than WHERE ( ... ) AND setXattr(...) Answers: (1) No need to quiesce. As the new policy propagates, nodes begin using it. So there can be a transition time when node A may be using the new policy but Node B has not started using it yet. If that is undesirable, you can quiesce. (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... 
So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. --marc of GPFS From: Luke Raimbach > ... RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' ... (1) When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. ... (2) What is the specific limitation for having a policy placement file no larger than 1MB? ... The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo_fumagalli at it.ibm.com Thu Jun 18 14:38:00 2015 From: massimo_fumagalli at it.ibm.com (Massimo Fumagalli) Date: Thu, 18 Jun 2015 15:38:00 +0200 Subject: [gpfsug-discuss] ILM question Message-ID: Please, I need to know a simple question. Using Spectrum Scale 4.1.1, supposing to set ILM policy for migrating files from Filesystem Tier0 to TIer 1 or Tier2 (example using LTFS to library). Then we need to read a file that has been moved to library (or Tier1). Will be file copied back to Tier 0? Or read will be executed directly from Library or Tier1 ? since there can be performance issue Regards Max IBM Italia S.p.A. Sede Legale: Circonvallazione Idroscalo - 20090 Segrate (MI) Cap. Soc. euro 347.256.998,80 C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153 Societ? con unico azionista Societ? soggetta all?attivit? di direzione e coordinamento di International Business Machines Corporation (Salvo che sia diversamente indicato sopra / Unless stated otherwise above) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 10661 bytes Desc: not available URL: From ewahl at osc.edu Thu Jun 18 15:08:29 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Thu, 18 Jun 2015 14:08:29 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> We had a unique situation with one of our many storage arrays occur in the past couple of days and it brought up a question I've had before. Is there a better way to disable a Storage Pool by itself rather than 'mmchdisk stop' the entire list of disks from that pool or mmfsctl and exclude things, etc? Thoughts? In our case our array lost all raid protection in a certain pool (8+2) due to a hardware failure, and started showing drive checkcondition errors on other drives in the array. Yikes! This pool itself is only about 720T and is backed by tape, but who wants to restore that? Even with SOBAR/HSM that would be a loooong week. ^_^ We made the decision to take the entire file system offline during the repair/rebuild, but I would like to have run all other pools save this one in a simpler manner than we took to get there. I'm interested in people's experiences here for future planning and disaster recovery. 
GPFS itself worked exactly as we had planned and expected but I think there is room to improve the file system if I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. I may not be expressing myself here in the best manner possible. Bit of sleep deprivation after the last couple of days. ;) Ed Wahl OSC From Paul.Sanchez at deshaw.com Thu Jun 18 15:52:07 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Thu, 18 Jun 2015 14:52:07 +0000 Subject: [gpfsug-discuss] Member locations for Dev meeting organisation In-Reply-To: <557ED229.50902@gpfsug.org> References: <557ED229.50902@gpfsug.org> Message-ID: <201D6001C896B846A9CFC2E841986AC1454124B2@mailnycmb2a.winmail.deshaw.com> Thanks Jez, D. E. Shaw is based in New York, NY. We have 3-4 engineers/architects who would attend. Additionally, if you haven't heard from D. E. Shaw Research, they're next-door and have another 2. -Paul Sanchez Sent with Good (www.good.com) ________________________________ From: gpfsug-discuss-bounces at gpfsug.org on behalf of Jez Tucker (Chair) Sent: Monday, June 15, 2015 9:24:57 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Member locations for Dev meeting organisation Hello all It would be very handy if all members could send me an email to: chair at gpfsug.org with the City and Country in which you are located. We're looking to place 'Meet the Devs' coffee-shops close to you, so this would make planning several orders of magnitude easier. I can infer from each member's email, but it's only 'mostly accurate'. Stateside members - we're actively organising a first meet up near you imminently, so please ping me your locations. All the best, Jez (Chair) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Thu Jun 18 16:36:49 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 11:36:49 -0400 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: (1) There is no secret flag. I assume that the existing policy is okay but the new one is better. So start using the better one ASAP, but why stop the system if you don't have to? The not secret way to quiesce/resume a filesystem without unmounting is fsctl {suspend | suspend-write | resume}; (2) The policy rules text is passed as a string through a GPFS rpc protocol (not a standard RPC) and the designer/coder chose 1MB as a safety-limit. I think it could be increased, but suppose you did have 4000 rules, each 200 bytes - you'd be at 800KB, still short of the 1MB limit. (x) Personally, I wouldn't worry much about setting, say 10 extended attribute values in each rule. I'd worry more about the impact of having 100s of rules. (y) When designing/deploying a new GPFS filesystem, consider explicitly setting the inode size so that all anticipated extended attributes will be stored in the inode, rather than spilling into other disk blocks. See mmcrfs ... -i InodeSize. You can build a test filesystem with just one NSD/LUN and test your anticipated usage. Use tsdbfs ... xattr ... to see how EAs are stored. Caution: tsdbfs display commands are harmless, BUT there are some patch and patch-like subcommands that could foul up your filesystem. From: Luke Raimbach Hi Marc, Thanks for the pointer to the updated syntax. 
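For anyone wanting to try the 4.1.1 ACTION(SetXattr(...)) form of the placement rule, a minimal sketch; 'gpfs1' and the path in the last line are placeholders, the fileset and attribute values are simply the ones from Luke's RDMTEST example, the command options are worth double-checking against your release, and only one SetXattr is shown inside ACTION() -- whether several can be chained there the way they can be AND-ed in a WHERE clause is exactly the sort of thing the test option will confirm:

cat > /tmp/rdm.pol <<'EOF'
RULE 'RDMTEST'
  SET POOL 'instruments'
  ACTION(SetXattr('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a'))
  FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0')
RULE 'DEFAULT' SET POOL 'data'
EOF
mmchpolicy gpfs1 /tmp/rdm.pol -I test    # parse and validate the rules without installing them
mmchpolicy gpfs1 /tmp/rdm.pol            # install for real once -I test is clean
mmlsattr -d -L /gpfs1/ingest/newfile     # spot-check a freshly created file (getfattr -d works for user.* attributes too)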
That indeed looks nicer. (1) Asynchronous policy propagation sounds good in our scenario. We don?t want to potentially interrupt other running experiments by having to quiesce the filesystem for a new one coming online. It is useful to know that you could quiesce if desired. Presumably this is a secret flag one might pass to mmchpolicy? (2) I was concerned about the evaluation time if I tried to set all extended attributes at creation time. That?s why I thought about adding a few ?system? defined tags which could later be used to link the files to an asynchronously applied policy on the home cluster. I think I calculated around 4,000 rules (dependent on the size of the attribute names and values), which might limit the number of experiments supported on a single ingest file system. However, I can?t envisage we will ever have 4,000 experiments running at once! I was really interested in why the limitation existed from a file-system architecture point of view. Thanks for the responses. Luke. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgiles at gmail.com Thu Jun 18 17:02:54 2015 From: zgiles at gmail.com (Zachary Giles) Date: Thu, 18 Jun 2015 12:02:54 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Sorry to hear about the problems you've had recently. It's frustrating when that happens. I didn't have the exactly the same situation, but we had something similar which may bring some light to a missing disks situation: We had a dataOnly storage pool backed by a few building blocks each consisting of several RAID controllers that were each direct attached to a few servers. We had several of these sets all in one pool. Thus, if a server failed it was fine, if a single link failed, it was fine. Potentially we could do copies=2 and have multiple failure groups in a single pool. If anything in the RAID arrays themselves failed, it was OK, but a single whole RAID controller going down would take that section of disks down. The number of copies was set to 1 on this pool. One RAID controller went down, but the file system as a whole stayed online. Our user experience was that Some users got IO errors of a "file inaccessible" type (I don't remember the exact code). Other users, and especially those mostly in other tiers continued to work as normal. As we had mostly small files across this tier ( much smaller than the GPFS block size ), most of the files were in one of the RAID controllers or another, thus not striping really, so even the files in other controllers on the same tier were also fine and accessible. Bottom line is: Only the files that were missing gave errors, the others were fine. Additionally, for missing files errors were reported which apps could capture and do something about, wait, or retry later -- not a D state process waiting forever or stale file handles. I'm not saying this is the best way. We didn't intend for this to happen. I suspect that stopping the disk would result in a similar experience but more safely. We asked GPFS devs if we needed to fsck after this since the tier just went offline directly and we continued to use the rest of the system while it was gone.. they said no it should be fine and missing blocks will be taken care of. 
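On the fsck question, a couple of read-only checks give a bit more than anecdote; 'gpfs1' and 'capacity' are placeholders and the options are worth double-checking against the mmdf/mmfsck man pages for your release:

mmdf gpfs1 -P capacity     # per-pool capacity and usage, to confirm the pool's NSDs are back and being counted
mmfsck gpfs1 -o -n         # online, no-change mode: reports lost blocks on a mounted filesystem without repairing anything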
I assume this is true, but I have no explicit proof, except that it's still working and nothing seemed to be missing. I guess some questions for the dev's would be: * Is this safe / advisable to do the above either directly or via a stop and then down the array? * Given that there is some client-side write caching in GPFS, if a file is being written and an expected final destination goes offline mid-write, where does the block go? + If a whole pool goes offline, will it pick another pool or error? + If it's a disk in a pool, will it reevaluate and round-robin to the next disk, or just fail since it had already decided where to write? Hope this helps a little. On Thu, Jun 18, 2015 at 10:08 AM, Wahl, Edward wrote: > We had a unique situation with one of our many storage arrays occur in the past couple of days and it brought up a question I've had before. Is there a better way to disable a Storage Pool by itself rather than 'mmchdisk stop' the entire list of disks from that pool or mmfsctl and exclude things, etc? Thoughts? > > In our case our array lost all raid protection in a certain pool (8+2) due to a hardware failure, and started showing drive checkcondition errors on other drives in the array. Yikes! This pool itself is only about 720T and is backed by tape, but who wants to restore that? Even with SOBAR/HSM that would be a loooong week. ^_^ We made the decision to take the entire file system offline during the repair/rebuild, but I would like to have run all other pools save this one in a simpler manner than we took to get there. > > I'm interested in people's experiences here for future planning and disaster recovery. GPFS itself worked exactly as we had planned and expected but I think there is room to improve the file system if I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. > > I may not be expressing myself here in the best manner possible. Bit of sleep deprivation after the last couple of days. ;) > > Ed Wahl > OSC > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From zgiles at gmail.com Thu Jun 18 17:06:33 2015 From: zgiles at gmail.com (Zachary Giles) Date: Thu, 18 Jun 2015 12:06:33 -0400 Subject: [gpfsug-discuss] ILM question In-Reply-To: References: Message-ID: I would expect it to return to one of your online tiers. If you tier between two storage pools, you can directly read and write those files. Think of how LTFS works -- it's an external storage pool, so you need to run an operation via an external command to give the file back to GPFS from which you can read it. This is controlled via the policies and I assume you would need to make a policy to specify where the file would be placed when it comes back. It would be fancy for someone to allow reading directly from an external pool, but as far as I know, it has to hit a disk first. What I don't know is: Will it begin streaming the files back to the user as the blocks hit the disk, while other blocks are still coming in, or must the whole file be recalled first? On Thu, Jun 18, 2015 at 9:38 AM, Massimo Fumagalli < massimo_fumagalli at it.ibm.com> wrote: > Please, I need to know a simple question. > > Using Spectrum Scale 4.1.1, supposing to set ILM policy for migrating > files from Filesystem Tier0 to TIer 1 or Tier2 (example using LTFS to > library). 
> Then we need to read a file that has been moved to library (or Tier1). > Will be file copied back to Tier 0? Or read will be executed directly from > Library or Tier1 ? since there can be performance issue > > Regards > Max > > > IBM Italia S.p.A. > Sede Legale: Circonvallazione Idroscalo - 20090 Segrate (MI) > Cap. Soc. euro 347.256.998,80 > C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153 > Societ? con unico azionista > Societ? soggetta all?attivit? di direzione e coordinamento di > International Business Machines Corporation > > (Salvo che sia diversamente indicato sopra / Unless stated otherwise above) > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- Zach Giles zgiles at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 10661 bytes Desc: not available URL: From chekh at stanford.edu Thu Jun 18 21:26:17 2015 From: chekh at stanford.edu (Alex Chekholko) Date: Thu, 18 Jun 2015 13:26:17 -0700 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <55832969.4050901@stanford.edu> mmlsdisk fs | grep pool | awk '{print $1} | tr '\n' ';'| xargs mmchdisk suspend # seems pretty simple to me Then I guess you also have to modify your policy rules which relate to that pool. You're asking for a convenience wrapper script for a super-uncommon situation? On 06/18/2015 09:02 AM, Zachary Giles wrote: > I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. -- Alex Chekholko chekh at stanford.edu From ewahl at osc.edu Thu Jun 18 21:36:48 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Thu, 18 Jun 2015 20:36:48 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <55832969.4050901@stanford.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <55832969.4050901@stanford.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A58A6A@CIO-KRC-D1MBX02.osuad.osu.edu> I'm not sure it's so uncommon, but yes. (and your line looks suspiciously like mine did) I've had other situations where it would have been nice to do maintenance on a single storage pool. Maybe this is a "scale" issue when you get too large and should maybe have multiple file systems instead? Single name space is nice for users though. Plus I was curious what others had done in similar situations. I guess I could do what IBM does and just write the stupid script, name it "ts-something" and put a happy wrapper up front with a mm-something name. ;) Just FYI: 'suspend' does NOT stop I/O. Only stops new block creation,so 'stop' was what I did. >From the man page: "...Existing data on a suspended disk may still be read or updated." Ed Wahl OSC ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Alex Chekholko [chekh at stanford.edu] Sent: Thursday, June 18, 2015 4:26 PM To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? 
mmlsdisk fs | grep pool | awk '{print $1} | tr '\n' ';'| xargs mmchdisk suspend # seems pretty simple to me Then I guess you also have to modify your policy rules which relate to that pool. You're asking for a convenience wrapper script for a super-uncommon situation? On 06/18/2015 09:02 AM, Zachary Giles wrote: > I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. -- Alex Chekholko chekh at stanford.edu _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From makaplan at us.ibm.com Thu Jun 18 22:01:01 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 17:01:01 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: What do you see as the pros and cons of using GPFS Native Raid and configuring your disk arrays as JBODs instead of using RAID in a box. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 21994 bytes Desc: not available URL: From stijn.deweirdt at ugent.be Fri Jun 19 08:18:31 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 09:18:31 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <5583C247.1090609@ugent.be> > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. just this week we had an issue with bad disk (of the non-failing but disrupting everything kind) and issues with the raid controller (db of both controllers corrupted due to the one disk, controller reboot loops etc etc). but tech support pulled it through, although it took a while. i'm amased what can be done with the hardware controllers (and i've seen my share of recoveries ;) my question to ibm wrt gss would be: can we have a "demo" of gss recovering from eg a drawer failure (eg pull both sas connectors to the drawer itself). i like the gss we have and the data recovery for single disk failures, but i'm not sure how well it does with major component failures. the demo could be the steps support would take to get it running again (e.g. can gss recover from a drawer failure, assuming the disks are still ok ofcourse). stijn > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From Luke.Raimbach at crick.ac.uk Fri Jun 19 08:47:10 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Fri, 19 Jun 2015 07:47:10 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <5583C247.1090609@ugent.be> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <5583C247.1090609@ugent.be> Message-ID: my question to ibm wrt gss would be: can we have a "demo" of gss recovering from eg a drawer failure (eg pull both sas connectors to the drawer itself). 
i like the gss we have and the data recovery for single disk failures, but i'm not sure how well it does with major component failures. the demo could be the steps support would take to get it running again (e.g. can gss recover from a drawer failure, assuming the disks are still ok ofcourse). Ooh, we have a new one that's not in production yet. IBM say the latest GSS code should allow for a whole enclosure failure. I might try it before going in to production. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From S.J.Thompson at bham.ac.uk Fri Jun 19 14:31:17 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 13:31:17 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Do you mean in GNR compared to using (san/IB) based hardware RAIDs? If so, then GNR isn?t a scale-out solution - you buy a ?unit? and can add another ?unit? to the namespace, but I can?t add another 30TB of storage (say a researcher with a grant), where as with SAN based RAID controllers, I can go off and buy another storage shelf. Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Thursday, 18 June 2015 22:01 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? What do you see as the pros and cons of using GPFS Native Raid and configuring your disk arrays as JBODs instead of using RAID in a box. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Fri Jun 19 14:51:32 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 19 Jun 2015 09:51:32 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: 1. YES, Native Raid can recover from various failures: drawers, cabling, controllers, power supplies, etc, etc. Of course it must be configured properly so that there is no possible single point of failure. But yes, you should get your hands on a test rig and try out (simulate) various failure scenarios and see how well it works. 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. 3. If you'd like to know and/or explore more, read the pubs, do the experiments, and/or contact the IBM sales and support people. IF by some chance you do not get satisfactory answers, come back here perhaps we can get your inquiries addressed by the GPFS design team. Like other complex products, there are bound to be some questions that the sales and marketing people can't quite address. -------------- next part -------------- An HTML attachment was scrubbed... 
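For anyone who does get hold of a test rig to pull drawers on, most of the observable state comes from a handful of GNR commands; 'BB1RGL' is a made-up recovery group name and the exact columns vary between GSS releases:

mmlsrecoverygroup                 # list recovery groups and which server is actively serving each
mmlsrecoverygroup BB1RGL -L       # declustered arrays, vdisks and remaining spare space in one recovery group
mmlspdisk all --not-ok            # pdisks that are missing, failing or being drained by the disk hospital
mmlsenclosure all -L              # enclosure and environmental status as GNR sees it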
URL: From S.J.Thompson at bham.ac.uk Fri Jun 19 14:56:33 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 13:56:33 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group). So under (3), I?m posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewahl at osc.edu Fri Jun 19 15:37:13 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Fri, 19 Jun 2015 14:37:13 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> They seem to have killed dead the idea of just moving the software side of that by itself, wayyy.... back now. late 2013?/early 2014 Even though the components are fairly standard units(Engenio before, not sure now). Ironic as GPFS/Spectrum Scale is storage software... Even more ironic for our site as we have the Controller based units these JBODs are from for our metadata and one of our many storage pools. (DDN on others) Ed ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Simon Thompson (Research Computing - IT Services) [S.J.Thompson at bham.ac.uk] Sent: Friday, June 19, 2015 9:56 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group). So under (3), I?m posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2. 
I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at us.ibm.com Fri Jun 19 15:49:32 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Fri, 19 Jun 2015 14:49:32 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <201506191450.t5JEovUS018695@d01av01.pok.ibm.com> GNR today is only sold as a packaged solution e.g. ESS. The reason its not sold as SW only today is technical and its not true that this is not been pursued, its just not there yet and we cant discuss plans on a mailinglist. Sven Sent from IBM Verse Wahl, Edward --- Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? --- From:"Wahl, Edward" To:"gpfsug main discussion list" Date:Fri, Jun 19, 2015 9:41 AMSubject:Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? They seem to have killed dead the idea of just moving the software side of that by itself, wayyy.... back now. late 2013?/early 2014 Even though the components are fairly standard units(Engenio before, not sure now). Ironic as GPFS/Spectrum Scale is storage software... Even more ironic for our site as we have the Controller based units these JBODs are from for our metadata and one of our many storage pools. (DDN on others) Ed From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Simon Thompson (Research Computing - IT Services) [S.J.Thompson at bham.ac.uk] Sent: Friday, June 19, 2015 9:56 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group). So under (3), I?m posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan Reply-To: gpfsug main discussion list Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgiles at gmail.com Fri Jun 19 15:56:19 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 10:56:19 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: I think it's technically possible to run GNR on unsupported trays. You may have to do some fiddling with some of the scripts, and/or you wont get proper reporting. Of course it probably violates 100 licenses etc etc etc. I don't know of anyone who's done it yet. I'd like to do it.. I think it would be great to learn it deeper by doing this. On Fri, Jun 19, 2015 at 9:56 AM, Simon Thompson (Research Computing - IT Services) wrote: > Er. Everytime I?ve ever asked about GNR,, the response has been that its > only available as packaged products as it has to understand things like the > shelf controllers, disk drives etc, in order for things like the disk > hospital to work. (And the last time I asked talked about GNR was in May at > the User group). > > So under (3), I?m posting here asking if anyone from IBM knows anything > different? > > Thanks > > Simon > > From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > 2. I don't know the details of the packaged products, but I believe you can > license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire or > need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com From jtucker at pixitmedia.com Fri Jun 19 16:05:24 2015 From: jtucker at pixitmedia.com (Jez Tucker (Chair)) Date: Fri, 19 Jun 2015 16:05:24 +0100 Subject: [gpfsug-discuss] Handing over chair@ Message-ID: <55842FB4.9030705@gpfsug.org> Hello all This is my last post as Chair for the foreseeable future. The next will come from Simon Thompson who assumes the post today for the next two years. I'm looking forward to Simon's tenure and wish him all the best with his endeavours. Myself, I'm moving over to UG Media Rep and will continue to support the User Group and committee in its efforts. My new email is jez.tucker at gpfsug.org Please keep sending through your City and Country locations, they're most helpful. Have a great weekend. All the best, Jez -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. From oehmes at us.ibm.com Fri Jun 19 16:09:41 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Fri, 19 Jun 2015 15:09:41 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: Message-ID: <201506191510.t5JFAdW2021605@d03av04.boulder.ibm.com> Reporting is not the issue, one of the main issue is that we can't talk to the enclosure, which results in loosing the capability to replace disk drive or turn any fault indicators on. It also prevents us to 'read' the position of a drive within a tray or fault domain within a enclosure, without that information we can't properly determine where we need to place strips of a track to prevent data access loss in case a enclosure or component fails. Sven Sent from IBM Verse Zachary Giles --- Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? --- From:"Zachary Giles" To:"gpfsug main discussion list" Date:Fri, Jun 19, 2015 9:56 AMSubject:Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? I think it's technically possible to run GNR on unsupported trays. You may have to do some fiddling with some of the scripts, and/or you wont get proper reporting. Of course it probably violates 100 licenses etc etc etc. I don't know of anyone who's done it yet. I'd like to do it.. I think it would be great to learn it deeper by doing this. On Fri, Jun 19, 2015 at 9:56 AM, Simon Thompson (Research Computing - IT Services) wrote: > Er. Everytime I?ve ever asked about GNR,, the response has been that its > only available as packaged products as it has to understand things like the > shelf controllers, disk drives etc, in order for things like the disk > hospital to work. (And the last time I asked talked about GNR was in May at > the User group). > > So under (3), I?m posting here asking if anyone from IBM knows anything > different? > > Thanks > > Simon > > From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > 2. I don't know the details of the packaged products, but I believe you can > license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire or > need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Fri Jun 19 16:15:44 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Fri, 19 Jun 2015 16:15:44 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: > I think it's technically possible to run GNR on unsupported trays. You > may have to do some fiddling with some of the scripts, and/or you wont > get proper reporting. > Of course it probably violates 100 licenses etc etc etc. > I don't know of anyone who's done it yet. I'd like to do it.. 
I think > it would be great to learn it deeper by doing this. > One imagines that GNR uses the SCSI enclosure services to talk to the shelves. https://en.wikipedia.org/wiki/SCSI_Enclosure_Services https://en.wikipedia.org/wiki/SES-2_Enclosure_Management Which would suggest that anything that supported these would work. I did some experimentation with a spare EXP810 shelf a few years ago on a FC-AL on Linux. Kind all worked out the box. The other experiment with an EXP100 didn't work so well; with the EXP100 it would only work with the 250GB and 400GB drives that came with the dam thing. With the EXP810 I could screw random SATA drives into it and it all worked. My investigations concluded that the firmware on the EXP100 shelf determined if the drive was supported, but I could not work out how to upload modified firmware to the shelf. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From stijn.deweirdt at ugent.be Fri Jun 19 16:23:18 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 17:23:18 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <558433E6.8030708@ugent.be> hi marc, > 1. YES, Native Raid can recover from various failures: drawers, cabling, > controllers, power supplies, etc, etc. > Of course it must be configured properly so that there is no possible > single point of failure. hmmm, this is not really what i was asking about. but maybe it's easier in gss to do this properly (eg for 8+3 data protection, you only need 11 drawers if you can make sure the data+parity blocks are send to different drawers (sort of per drawer failure group, but internal to the vdisks), and the smallest setup is a gss24 which has 20 drawers). but i can't rememeber any manual suggestion the admin can control this (or is it the default?). anyway, i'm certainly interested in any config whitepapers or guides to see what is required for such setup. are these public somewhere? (have really searched for them). > > But yes, you should get your hands on a test rig and try out (simulate) > various failure scenarios and see how well it works. is there a way besides presales to get access to such setup? stijn > > 2. I don't know the details of the packaged products, but I believe you > can license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire > or need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > 3. If you'd like to know and/or explore more, read the pubs, do the > experiments, and/or contact the IBM sales and support people. > IF by some chance you do not get satisfactory answers, come back here > perhaps we can get your inquiries addressed by the > GPFS design team. Like other complex products, there are bound to be some > questions that the sales and marketing people > can't quite address. > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From jonathan at buzzard.me.uk Fri Jun 19 16:35:32 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Fri, 19 Jun 2015 16:35:32 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? 
How about GPFS Native Raid? In-Reply-To: <558433E6.8030708@ugent.be> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <558433E6.8030708@ugent.be> Message-ID: <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> On Fri, 2015-06-19 at 17:23 +0200, Stijn De Weirdt wrote: > hi marc, > > > > 1. YES, Native Raid can recover from various failures: drawers, cabling, > > controllers, power supplies, etc, etc. > > Of course it must be configured properly so that there is no possible > > single point of failure. > hmmm, this is not really what i was asking about. but maybe it's easier > in gss to do this properly (eg for 8+3 data protection, you only need 11 > drawers if you can make sure the data+parity blocks are send to > different drawers (sort of per drawer failure group, but internal to the > vdisks), and the smallest setup is a gss24 which has 20 drawers). > but i can't rememeber any manual suggestion the admin can control this > (or is it the default?). > I got the impression that GNR was more in line with the Engenio dynamic disk pools http://www.netapp.com/uk/technology/dynamic-disk-pools.aspx http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dynamic_disk_pooling_technical_report.pdf That is traditional RAID sucks with large numbers of big drives. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From bsallen at alcf.anl.gov Fri Jun 19 17:05:15 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Fri, 19 Jun 2015 16:05:15 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: > One imagines that GNR uses the SCSI enclosure services to talk to the > shelves. It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. Ben > On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: > > On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >> I think it's technically possible to run GNR on unsupported trays. You >> may have to do some fiddling with some of the scripts, and/or you wont >> get proper reporting. >> Of course it probably violates 100 licenses etc etc etc. >> I don't know of anyone who's done it yet. I'd like to do it.. I think >> it would be great to learn it deeper by doing this. >> > > One imagines that GNR uses the SCSI enclosure services to talk to the > shelves. > > https://en.wikipedia.org/wiki/SCSI_Enclosure_Services > https://en.wikipedia.org/wiki/SES-2_Enclosure_Management > > Which would suggest that anything that supported these would work. > > I did some experimentation with a spare EXP810 shelf a few years ago on > a FC-AL on Linux. Kind all worked out the box. The other experiment with > an EXP100 didn't work so well; with the EXP100 it would only work with > the 250GB and 400GB drives that came with the dam thing. With the EXP810 > I could screw random SATA drives into it and it all worked. 
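The SES pages being talked about here are easy to poke at from Linux with sg3_utils; /dev/sg5 below is illustrative and the page numbers/acronyms are the ones documented in the sg_ses man page:

lsscsi -g                     # find the enclosure's sg node (it shows up with device type 'enclosu')
sg_ses --page=2 /dev/sg5      # enclosure status page: slot, fan, PSU and temperature elements
sg_ses --page=aes /dev/sg5    # additional element status page, which maps array slots to SAS addresses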
My > investigations concluded that the firmware on the EXP100 shelf > determined if the drive was supported, but I could not work out how to > upload modified firmware to the shelf. > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From peserocka at gmail.com Fri Jun 19 17:09:44 2015 From: peserocka at gmail.com (Pete Sero) Date: Sat, 20 Jun 2015 00:09:44 +0800 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: vi my_enclosures.conf fwiw Peter On 2015 Jun 20 Sat, at 24:05, Allen, Benjamin S. wrote: >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. > > It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. > > So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. > > Ben > >> On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: >> >> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >>> I think it's technically possible to run GNR on unsupported trays. You >>> may have to do some fiddling with some of the scripts, and/or you wont >>> get proper reporting. >>> Of course it probably violates 100 licenses etc etc etc. >>> I don't know of anyone who's done it yet. I'd like to do it.. I think >>> it would be great to learn it deeper by doing this. >>> >> >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. >> >> https://en.wikipedia.org/wiki/SCSI_Enclosure_Services >> https://en.wikipedia.org/wiki/SES-2_Enclosure_Management >> >> Which would suggest that anything that supported these would work. >> >> I did some experimentation with a spare EXP810 shelf a few years ago on >> a FC-AL on Linux. Kind all worked out the box. The other experiment with >> an EXP100 didn't work so well; with the EXP100 it would only work with >> the 250GB and 400GB drives that came with the dam thing. With the EXP810 >> I could screw random SATA drives into it and it all worked. My >> investigations concluded that the firmware on the EXP100 shelf >> determined if the drive was supported, but I could not work out how to >> upload modified firmware to the shelf. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk >> Fife, United Kingdom. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Fri Jun 19 17:12:53 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 12:12:53 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: Ya, that's why I mentioned you'd probably have to fiddle with some scripts or something to help GNR figure out where disks are. Is definitely known that you can't just use any random enclosure given that GNR depends highly on the topology. Maybe in the future there would be a way to specify the topology or that a drive is at a specific position. On Fri, Jun 19, 2015 at 12:05 PM, Allen, Benjamin S. wrote: >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. > > It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. > > So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. > > Ben > >> On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: >> >> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >>> I think it's technically possible to run GNR on unsupported trays. You >>> may have to do some fiddling with some of the scripts, and/or you wont >>> get proper reporting. >>> Of course it probably violates 100 licenses etc etc etc. >>> I don't know of anyone who's done it yet. I'd like to do it.. I think >>> it would be great to learn it deeper by doing this. >>> >> >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. >> >> https://en.wikipedia.org/wiki/SCSI_Enclosure_Services >> https://en.wikipedia.org/wiki/SES-2_Enclosure_Management >> >> Which would suggest that anything that supported these would work. >> >> I did some experimentation with a spare EXP810 shelf a few years ago on >> a FC-AL on Linux. Kind all worked out the box. The other experiment with >> an EXP100 didn't work so well; with the EXP100 it would only work with >> the 250GB and 400GB drives that came with the dam thing. With the EXP810 >> I could screw random SATA drives into it and it all worked. My >> investigations concluded that the firmware on the EXP100 shelf >> determined if the drive was supported, but I could not work out how to >> upload modified firmware to the shelf. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk >> Fife, United Kingdom. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From makaplan at us.ibm.com Fri Jun 19 19:45:19 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 19 Jun 2015 14:45:19 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: OOps... here is the official statement: GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). 
GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? -------------- next part -------------- An HTML attachment was scrubbed... URL: From stijn.deweirdt at ugent.be Fri Jun 19 20:01:04 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 21:01:04 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <558433E6.8030708@ugent.be> <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> Message-ID: <558466F0.8000300@ugent.be> >>> 1. YES, Native Raid can recover from various failures: drawers, cabling, >>> controllers, power supplies, etc, etc. >>> Of course it must be configured properly so that there is no possible >>> single point of failure. >> hmmm, this is not really what i was asking about. but maybe it's easier >> in gss to do this properly (eg for 8+3 data protection, you only need 11 >> drawers if you can make sure the data+parity blocks are send to >> different drawers (sort of per drawer failure group, but internal to the >> vdisks), and the smallest setup is a gss24 which has 20 drawers). >> but i can't rememeber any manual suggestion the admin can control this >> (or is it the default?). >> > > I got the impression that GNR was more in line with the Engenio dynamic > disk pools well, it's uses some crush-like placement and some parity encoding scheme (regular raid6 for the DDP, some flavour of EC for GNR), but other then that, not much resemblence. DDP does not give you any control over where the data blocks are stored. i'm not sure about GNR, (but DDP does not state anywhere they are drawer failure proof ;). but GNR is more like a DDP then e.g. a ceph EC pool, in the sense that the hosts needs to see all disks (similar to the controller that needs access to the disks). > > http://www.netapp.com/uk/technology/dynamic-disk-pools.aspx > > http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dynamic_disk_pooling_technical_report.pdf > > That is traditional RAID sucks with large numbers of big drives. (btw it's one of those that we saw fail (and get recovered by tech support!) this week. tip of the week: turn on the SMmonitor service on at least one host, it's actually useful for something). stijn > > > JAB. > From S.J.Thompson at bham.ac.uk Fri Jun 19 20:17:32 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 19:17:32 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu>, Message-ID: My understanding I that GSS and IBM ESS are sold as pre configured systems. So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. 
So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] Sent: 19 June 2015 19:45 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? OOps... here is the official statement: GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? From zgiles at gmail.com Fri Jun 19 21:08:14 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 16:08:14 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: It's comparable to other "large" controller systems. Take the DDN 10K/12K for example: You don't just buy one more shelf of disks, or 5 disks at a time from Walmart. You buy 5, 10, or 20 trays and populate enough disks to either hit your bandwidth or storage size requirement. Generally changing from 5 to 10 to 20 requires support to come on-site and recable it, and generally you either buy half or all the disks slots worth of disks. The whole system is a building block and you buy N of them to get up to 10-20PB of storage. GSS is the same way, there are a few models and you just buy a packaged one. Technically, you can violate the above constraints, but then it may not work well and you probably can't buy it that way. I'm pretty sure DDN's going to look at you funny if you try to buy a 12K with 30 drives.. :) For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with buildin RAID, a pair of servers, and forget GNR. Or maybe GSS22? :) >From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 " Current high-density storage Models 24 and 26 remain available Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available 200 GB and 800 GB SSDs are also available The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, 26s is comprised of SSD drives or 1.2 TB hard SAS drives " On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - IT Services) wrote: > > My understanding I that GSS and IBM ESS are sold as pre configured systems. 
> > So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. > > So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. > > So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] > Sent: 19 June 2015 19:45 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > OOps... here is the official statement: > > GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. > > I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From S.J.Thompson at bham.ac.uk Fri Jun 19 22:08:25 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 21:08:25 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> , Message-ID: I'm not disputing that gnr is a cool technology. Just that as scale out, it doesn't work for our funding model. If we go back to the original question, if was pros and cons of gnr vs raid type storage. My point was really that I have research groups who come along and want to by xTb at a time. And that's relatively easy with a raid/san based approach. And at times that needs to be a direct purchase from our supplier based on the grant rather than an internal recharge. And the overhead of a smaller gss (twin servers) is much higher cost compared to a storewise tray. I'm also not really advocating that its arbitrary storage. Just saying id really like to see shelf at a time upgrades for it (and supports shelf only). Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Zachary Giles [zgiles at gmail.com] Sent: 19 June 2015 21:08 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? It's comparable to other "large" controller systems. Take the DDN 10K/12K for example: You don't just buy one more shelf of disks, or 5 disks at a time from Walmart. You buy 5, 10, or 20 trays and populate enough disks to either hit your bandwidth or storage size requirement. 
Generally changing from 5 to 10 to 20 requires support to come on-site and recable it, and generally you either buy half or all the disks slots worth of disks. The whole system is a building block and you buy N of them to get up to 10-20PB of storage. GSS is the same way, there are a few models and you just buy a packaged one. Technically, you can violate the above constraints, but then it may not work well and you probably can't buy it that way. I'm pretty sure DDN's going to look at you funny if you try to buy a 12K with 30 drives.. :) For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with buildin RAID, a pair of servers, and forget GNR. Or maybe GSS22? :) >From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 " Current high-density storage Models 24 and 26 remain available Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available 200 GB and 800 GB SSDs are also available The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, 26s is comprised of SSD drives or 1.2 TB hard SAS drives " On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - IT Services) wrote: > > My understanding I that GSS and IBM ESS are sold as pre configured systems. > > So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. > > So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. > > So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] > Sent: 19 June 2015 19:45 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > OOps... here is the official statement: > > GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. > > I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chris.hunter at yale.edu Fri Jun 19 22:18:51 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Fri, 19 Jun 2015 17:18:51 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? 
In-Reply-To: References: Message-ID: <5584873B.1080109@yale.edu> Smells of troll bait but I'll bite. "Declustered RAID" certainly has benefits for recovery of failed disks but I don't think it claims performance benefits over traditional RAID. GNR certainly has a large memory footprint. Object RAID is a close cousin that has flexbile expansion capability, depending on product packaging GNR could likely match these features. Argonne labs (Illinois USA) has done a lot with both GNR and RAID GPFS, I would be interested in their experiences. > Date: Thu, 18 Jun 2015 17:01:01 -0400 > From: Marc A Kaplan > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? > > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. chris hunter yale hpc group From zgiles at gmail.com Fri Jun 19 22:35:59 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 17:35:59 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: OK, back on topic: Honestly, I'm really glad you said that. I have that exact problem also -- a researcher will be funded for xTB of space, and we are told by the grants office that if something is purchased on a grant it belongs to them and it should have a sticker put on it that says "property of the govt' etc etc. We decided to (as an institution) put the money forward to purchase a large system ahead of time, and as grants come in, recover the cost back into the system by paying off our internal "negative balance". In this way we can get the benefit of a large storage system like performance and purchasing price, but provision storage into quotas as needed. We can even put stickers on a handful of drives in the GSS tray if that makes them feel happy. Could they request us to hand over their drives and take them out of our system? Maybe. if the Grants Office made us do it, sure, I'd drain some pools off and go hand them over.. but that will never happen because it's more valuable to them in our cluster than sitting on their table, and I'm not going to deliver the drives full of their data. That's their responsibility. Is it working? Yeah, but, I'm not a grants admin nor an accountant, so I'll let them figure that out, and they seem to be OK with this model. And yes, it's not going to work for all institutions unless you can put the money forward upfront, or do a group purchase at the end of a year. So I 100% agree, GNR doesn't really fit the model of purchasing a few drives at a time, and the grants things is still a problem. On Fri, Jun 19, 2015 at 5:08 PM, Simon Thompson (Research Computing - IT Services) wrote: > I'm not disputing that gnr is a cool technology. > > Just that as scale out, it doesn't work for our funding model. > > If we go back to the original question, if was pros and cons of gnr vs raid type storage. > > My point was really that I have research groups who come along and want to by xTb at a time. And that's relatively easy with a raid/san based approach. And at times that needs to be a direct purchase from our supplier based on the grant rather than an internal recharge. > > And the overhead of a smaller gss (twin servers) is much higher cost compared to a storewise tray. 
I'm also not really advocating that its arbitrary storage. Just saying id really like to see shelf at a time upgrades for it (and supports shelf only). > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Zachary Giles [zgiles at gmail.com] > Sent: 19 June 2015 21:08 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > It's comparable to other "large" controller systems. Take the DDN > 10K/12K for example: You don't just buy one more shelf of disks, or 5 > disks at a time from Walmart. You buy 5, 10, or 20 trays and populate > enough disks to either hit your bandwidth or storage size requirement. > Generally changing from 5 to 10 to 20 requires support to come on-site > and recable it, and generally you either buy half or all the disks > slots worth of disks. The whole system is a building block and you buy > N of them to get up to 10-20PB of storage. > GSS is the same way, there are a few models and you just buy a packaged one. > > Technically, you can violate the above constraints, but then it may > not work well and you probably can't buy it that way. > I'm pretty sure DDN's going to look at you funny if you try to buy a > 12K with 30 drives.. :) > > For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save > money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with > buildin RAID, a pair of servers, and forget GNR. > Or maybe GSS22? :) > > From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 > " > Current high-density storage Models 24 and 26 remain available > Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u > JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) > 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available > 200 GB and 800 GB SSDs are also available > The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, > 26s is comprised of SSD drives or 1.2 TB hard SAS drives > " > > > On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - > IT Services) wrote: >> >> My understanding I that GSS and IBM ESS are sold as pre configured systems. >> >> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >> >> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >> >> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >> >> Simon >> ________________________________________ >> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >> Sent: 19 June 2015 19:45 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >> >> OOps... here is the official statement: >> >> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). 
GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >> >> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Zach Giles > zgiles at gmail.com > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From chris.hunter at yale.edu Fri Jun 19 22:57:14 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Fri, 19 Jun 2015 17:57:14 -0400 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: References: Message-ID: <5584903A.3020203@yale.edu> I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. A JBOD solution that allows incremental drive expansion is desirable. chris hunter yale hpc group > From: Zachary Giles > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > OK, back on topic: > Honestly, I'm really glad you said that. I have that exact problem > also -- a researcher will be funded for xTB of space, and we are told > by the grants office that if something is purchased on a grant it > belongs to them and it should have a sticker put on it that says > "property of the govt' etc etc. > We decided to (as an institution) put the money forward to purchase a > large system ahead of time, and as grants come in, recover the cost > back into the system by paying off our internal "negative balance". In > this way we can get the benefit of a large storage system like > performance and purchasing price, but provision storage into quotas as > needed. We can even put stickers on a handful of drives in the GSS > tray if that makes them feel happy. > Could they request us to hand over their drives and take them out of > our system? Maybe. if the Grants Office made us do it, sure, I'd drain > some pools off and go hand them over.. but that will never happen > because it's more valuable to them in our cluster than sitting on > their table, and I'm not going to deliver the drives full of their > data. That's their responsibility. > > Is it working? Yeah, but, I'm not a grants admin nor an accountant, so > I'll let them figure that out, and they seem to be OK with this model. 
> And yes, it's not going to work for all institutions unless you can > put the money forward upfront, or do a group purchase at the end of a > year. > > So I 100% agree, GNR doesn't really fit the model of purchasing a few > drives at a time, and the grants things is still a problem. From jhick at lbl.gov Fri Jun 19 23:18:56 2015 From: jhick at lbl.gov (Jason Hick) Date: Fri, 19 Jun 2015 15:18:56 -0700 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: <5584903A.3020203@yale.edu> References: <5584903A.3020203@yale.edu> Message-ID: <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> For the same reason (storage expansions that follow funding needs), I want a 4 or 5U embedded server/JBOD with GNR. That would allow us to simply plugin the host interfaces (2-4 of them), configure an IP addr/host name and add it as NSDs to an existing GPFS file system. As opposed to dealing with racks of storage and architectural details. Jason > On Jun 19, 2015, at 2:57 PM, Chris Hunter wrote: > > I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. > > I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. > > We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. > > A JBOD solution that allows incremental drive expansion is desirable. > > chris hunter > yale hpc group > >> From: Zachary Giles >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by >> themselves? How about GPFS Native Raid? >> >> OK, back on topic: >> Honestly, I'm really glad you said that. I have that exact problem >> also -- a researcher will be funded for xTB of space, and we are told >> by the grants office that if something is purchased on a grant it >> belongs to them and it should have a sticker put on it that says >> "property of the govt' etc etc. >> We decided to (as an institution) put the money forward to purchase a >> large system ahead of time, and as grants come in, recover the cost >> back into the system by paying off our internal "negative balance". In >> this way we can get the benefit of a large storage system like >> performance and purchasing price, but provision storage into quotas as >> needed. We can even put stickers on a handful of drives in the GSS >> tray if that makes them feel happy. >> Could they request us to hand over their drives and take them out of >> our system? Maybe. if the Grants Office made us do it, sure, I'd drain >> some pools off and go hand them over.. but that will never happen >> because it's more valuable to them in our cluster than sitting on >> their table, and I'm not going to deliver the drives full of their >> data. That's their responsibility. >> >> Is it working? Yeah, but, I'm not a grants admin nor an accountant, so >> I'll let them figure that out, and they seem to be OK with this model. >> And yes, it's not going to work for all institutions unless you can >> put the money forward upfront, or do a group purchase at the end of a >> year. >> >> So I 100% agree, GNR doesn't really fit the model of purchasing a few >> drives at a time, and the grants things is still a problem. 
> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Fri Jun 19 23:54:39 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 18:54:39 -0400 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> References: <5584903A.3020203@yale.edu> <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> Message-ID: Starting to sound like Seagate/Xyratex there. :) On Fri, Jun 19, 2015 at 6:18 PM, Jason Hick wrote: > For the same reason (storage expansions that follow funding needs), I want a 4 or 5U embedded server/JBOD with GNR. That would allow us to simply plugin the host interfaces (2-4 of them), configure an IP addr/host name and add it as NSDs to an existing GPFS file system. > > As opposed to dealing with racks of storage and architectural details. > > Jason > >> On Jun 19, 2015, at 2:57 PM, Chris Hunter wrote: >> >> I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. >> >> I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. >> >> We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. >> >> A JBOD solution that allows incremental drive expansion is desirable. >> >> chris hunter >> yale hpc group >> >>> From: Zachary Giles >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by >>> themselves? How about GPFS Native Raid? >>> >>> OK, back on topic: >>> Honestly, I'm really glad you said that. I have that exact problem >>> also -- a researcher will be funded for xTB of space, and we are told >>> by the grants office that if something is purchased on a grant it >>> belongs to them and it should have a sticker put on it that says >>> "property of the govt' etc etc. >>> We decided to (as an institution) put the money forward to purchase a >>> large system ahead of time, and as grants come in, recover the cost >>> back into the system by paying off our internal "negative balance". In >>> this way we can get the benefit of a large storage system like >>> performance and purchasing price, but provision storage into quotas as >>> needed. We can even put stickers on a handful of drives in the GSS >>> tray if that makes them feel happy. >>> Could they request us to hand over their drives and take them out of >>> our system? Maybe. if the Grants Office made us do it, sure, I'd drain >>> some pools off and go hand them over.. but that will never happen >>> because it's more valuable to them in our cluster than sitting on >>> their table, and I'm not going to deliver the drives full of their >>> data. That's their responsibility. >>> >>> Is it working? Yeah, but, I'm not a grants admin nor an accountant, so >>> I'll let them figure that out, and they seem to be OK with this model. >>> And yes, it's not going to work for all institutions unless you can >>> put the money forward upfront, or do a group purchase at the end of a >>> year. >>> >>> So I 100% agree, GNR doesn't really fit the model of purchasing a few >>> drives at a time, and the grants things is still a problem. 
>> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From bsallen at alcf.anl.gov Sat Jun 20 00:12:53 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Fri, 19 Jun 2015 23:12:53 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? In-Reply-To: <5584873B.1080109@yale.edu> References: , <5584873B.1080109@yale.edu> Message-ID: <3a261dc3-e8a4-4550-bab2-db4cc0ffbaea@alcf.anl.gov> Let me know what specific questions you have. Ben From: Chris Hunter Sent: Jun 19, 2015 4:18 PM To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? Smells of troll bait but I'll bite. "Declustered RAID" certainly has benefits for recovery of failed disks but I don't think it claims performance benefits over traditional RAID. GNR certainly has a large memory footprint. Object RAID is a close cousin that has flexbile expansion capability, depending on product packaging GNR could likely match these features. Argonne labs (Illinois USA) has done a lot with both GNR and RAID GPFS, I would be interested in their experiences. > Date: Thu, 18 Jun 2015 17:01:01 -0400 > From: Marc A Kaplan > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? > > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. chris hunter yale hpc group _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From viccornell at gmail.com Sat Jun 20 22:12:53 2015 From: viccornell at gmail.com (Vic Cornell) Date: Sat, 20 Jun 2015 22:12:53 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> Just to make sure everybody is up to date on this, (I work for DDN BTW): > On 19 Jun 2015, at 21:08, Zachary Giles wrote: > > It's comparable to other "large" controller systems. Take the DDN > 10K/12K for example: You don't just buy one more shelf of disks, or 5 > disks at a time from Walmart. You buy 5, 10, or 20 trays and populate > enough disks to either hit your bandwidth or storage size requirement. With the 12K you can buy 1,2,3,4,5,,10 or 20. With the 7700/Gs7K you can buy 1 ,2 ,3,4 or 5. GS7K comes with 2 controllers and 60 disk slots all in 4U, it saturates (with GPFS scatter) at about 160- 180 NL- SAS disks and you can concatenate as many of them together as you like. I guess the thing with GPFS is that you can pick your ideal building block and then scale with it as far as you like. > Generally changing from 5 to 10 to 20 requires support to come on-site > and recable it, and generally you either buy half or all the disks > slots worth of disks. You can start off with as few as 2 disks in a system . 
We have lots of people who buy partially populated systems and then sell on capacity to users, buying disks in groups of 10, 20 or more - thats what the flexibility of GPFS is all about, yes? > The whole system is a building block and you buy > N of them to get up to 10-20PB of storage. > GSS is the same way, there are a few models and you just buy a packaged one. > > Technically, you can violate the above constraints, but then it may > not work well and you probably can't buy it that way. > I'm pretty sure DDN's going to look at you funny if you try to buy a > 12K with 30 drives.. :) Nobody at DDN is going to look at you funny if you say you want to buy something :-). We have as many different procurement strategies as we have customers. If all you can afford with your infrastructure money is 30 drives to get you off the ground and you know that researchers/users will come to you with money for capacity down the line then a 30 drive 12K makes perfect sense. Most configs with external servers can be made to work. The embedded (12KXE, GS7K ) are a bit more limited in how you can arrange disks and put services on NSD servers but thats the tradeoff for the smaller footprint. Happy to expand on any of this on or offline. Vic > > For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save > money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with > buildin RAID, a pair of servers, and forget GNR. > Or maybe GSS22? :) > > From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 > " > Current high-density storage Models 24 and 26 remain available > Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u > JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) > 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available > 200 GB and 800 GB SSDs are also available > The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, > 26s is comprised of SSD drives or 1.2 TB hard SAS drives > " > > > On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - > IT Services) wrote: >> >> My understanding I that GSS and IBM ESS are sold as pre configured systems. >> >> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >> >> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >> >> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >> >> Simon >> ________________________________________ >> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >> Sent: 19 June 2015 19:45 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >> >> OOps... here is the official statement: >> >> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. 
GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >> >> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Zach Giles > zgiles at gmail.com > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Sat Jun 20 23:40:58 2015 From: zgiles at gmail.com (Zachary Giles) Date: Sat, 20 Jun 2015 18:40:58 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> Message-ID: All true. I wasn't trying to knock DDN or say "it can't be done", it's just (probably) not very efficient or cost effective to buy a 12K with 30 drives (as an example). The new 7700 looks like a really nice base a small building block. I had forgot about them. There is a good box for adding 4U at a time, and with 60 drives per enclosure, if you saturated it out at ~3 enclosure / 180 drives, you'd have 1PB, which is also a nice round building block size. :thumb up: On Sat, Jun 20, 2015 at 5:12 PM, Vic Cornell wrote: > Just to make sure everybody is up to date on this, (I work for DDN BTW): > >> On 19 Jun 2015, at 21:08, Zachary Giles wrote: >> >> It's comparable to other "large" controller systems. Take the DDN >> 10K/12K for example: You don't just buy one more shelf of disks, or 5 >> disks at a time from Walmart. You buy 5, 10, or 20 trays and populate >> enough disks to either hit your bandwidth or storage size requirement. > > With the 12K you can buy 1,2,3,4,5,,10 or 20. > > With the 7700/Gs7K you can buy 1 ,2 ,3,4 or 5. > > GS7K comes with 2 controllers and 60 disk slots all in 4U, it saturates (with GPFS scatter) at about 160- 180 NL- SAS disks and you can concatenate as many of them together as you like. I guess the thing with GPFS is that you can pick your ideal building block and then scale with it as far as you like. > >> Generally changing from 5 to 10 to 20 requires support to come on-site >> and recable it, and generally you either buy half or all the disks >> slots worth of disks. > > You can start off with as few as 2 disks in a system . We have lots of people who buy partially populated systems and then sell on capacity to users, buying disks in groups of 10, 20 or more - thats what the flexibility of GPFS is all about, yes? > >> The whole system is a building block and you buy >> N of them to get up to 10-20PB of storage. >> GSS is the same way, there are a few models and you just buy a packaged one. >> >> Technically, you can violate the above constraints, but then it may >> not work well and you probably can't buy it that way. >> I'm pretty sure DDN's going to look at you funny if you try to buy a >> 12K with 30 drives.. :) > > Nobody at DDN is going to look at you funny if you say you want to buy something :-). We have as many different procurement strategies as we have customers. 
If all you can afford with your infrastructure money is 30 drives to get you off the ground and you know that researchers/users will come to you with money for capacity down the line then a 30 drive 12K makes perfect sense. > > Most configs with external servers can be made to work. The embedded (12KXE, GS7K ) are a bit more limited in how you can arrange disks and put services on NSD servers but thats the tradeoff for the smaller footprint. > > Happy to expand on any of this on or offline. > > Vic > > >> >> For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save >> money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with >> buildin RAID, a pair of servers, and forget GNR. >> Or maybe GSS22? :) >> >> From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 >> " >> Current high-density storage Models 24 and 26 remain available >> Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u >> JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) >> 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available >> 200 GB and 800 GB SSDs are also available >> The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, >> 26s is comprised of SSD drives or 1.2 TB hard SAS drives >> " >> >> >> On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - >> IT Services) wrote: >>> >>> My understanding I that GSS and IBM ESS are sold as pre configured systems. >>> >>> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >>> >>> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >>> >>> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >>> >>> Simon >>> ________________________________________ >>> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >>> Sent: 19 June 2015 19:45 >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >>> >>> OOps... here is the official statement: >>> >>> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >>> >>> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? 
>>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at gpfsug.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> -- >> Zach Giles >> zgiles at gmail.com >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From chair at gpfsug.org Mon Jun 22 08:57:49 2015 From: chair at gpfsug.org (GPFS UG Chair) Date: Mon, 22 Jun 2015 08:57:49 +0100 Subject: [gpfsug-discuss] chair@GPFS UG Message-ID: Hi all, Just to follow up from Jez's email last week I'm now taking over as chair of the group. I'd like to thank Jez for his work with the group over the past couple of years in developing it to where it is now (as well as Claire who is staying on as secretary!). We're still interested in sector reps for the group, so if you are a GPFS user in a specific sector and would be interested in this, please let me know. As there haven't really been any sector reps before, we'll see how that works out, but I can't see it being a lot of work! On the US side of things, I need to catch up with Jez and Claire to see where things are up to. And finally, just as a quick head's up, we're pencilled in to have a user group mini (2hr) meeting in the UK in December as one of the breakout groups at the annual MEW event, once the dates for this are published I'll send out a save the date. If you are a user and interested in speaking, also let me know as well as anything else you might like to see there. Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From jamiedavis at us.ibm.com Mon Jun 22 14:04:23 2015 From: jamiedavis at us.ibm.com (James Davis) Date: Mon, 22 Jun 2015 13:04:23 +0000 Subject: [gpfsug-discuss] Placement Policy Installation andRDMConsiderations In-Reply-To: References: , Message-ID: <201506221305.t5MD5Owv014072@d01av05.pok.ibm.com> An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Mon Jun 22 16:28:22 2015 From: bevans at pixitmedia.com (Barry Evans) Date: Mon, 22 Jun 2015 16:28:22 +0100 Subject: [gpfsug-discuss] LROC Express Message-ID: <55882996.6050903@pixitmedia.com> Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. 
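For readers less familiar with LROC itself: the read cache is defined as an ordinary NSD on the client's local flash device, flagged as usage=localCache, and that node then uses it as an extension of its pagepool. A minimal sketch, assuming GPFS 4.1.x, a client node called client01 and a spare SSD at /dev/sdb (node name, device path and NSD name are illustrative assumptions, not details from this thread):

  # lroc.stanza
  %nsd: device=/dev/sdb nsd=client01_lroc servers=client01 usage=localCache

  mmcrnsd -F lroc.stanza
  mmdiag --lroc    # run on client01 to confirm the cache device is active and view hit statistics

The NSD is not added to any file system; usage=localCache marks it as a node-local read cache only.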
From oester at gmail.com Mon Jun 22 16:36:08 2015 From: oester at gmail.com (Bob Oesterlin) Date: Mon, 22 Jun 2015 10:36:08 -0500 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882996.6050903@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> Message-ID: It works with Standard edition, just make sure you have the right license for the nodes using LROC. Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans wrote: > Hi All, > > Very quick question for those in the know - does LROC require a standard > license, or will it work with Express? I can't find anything in the FAQ > regarding this so I presume Express is ok, but wanted to make sure. > > Regards, > Barry Evans > Technical Director > Pixit Media/ArcaStream > > > > -- > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Mon Jun 22 16:39:49 2015 From: bevans at pixitmedia.com (Barry Evans) Date: Mon, 22 Jun 2015 16:39:49 +0100 Subject: [gpfsug-discuss] LROC Express In-Reply-To: References: <55882996.6050903@pixitmedia.com> Message-ID: <55882C45.6090501@pixitmedia.com> Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: > It works with Standard edition, just make sure you have the right > license for the nodes using LROC. > > Bob Oesterlin > Nuance COmmunications > > > Bob Oesterlin > > > On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: > > Hi All, > > Very quick question for those in the know - does LROC require a > standard license, or will it work with Express? I can't find > anything in the FAQ regarding this so I presume Express is ok, but > wanted to make sure. > > Regards, > Barry Evans > Technical Director > Pixit Media/ArcaStream > > > > -- > > This email is confidential in that it is intended for the > exclusive attention of the addressee(s) indicated. If you are not > the intended recipient, this email should not be read or disclosed > to any other person. Please notify the sender immediately and > delete this email from your computer system. Any opinions > expressed are not necessarily those of the company from which this > email was sent and, whilst to the best of our knowledge no viruses > or defects exist, no responsibility can be accepted for any loss > or damage arising from its receipt or subsequent use of this email. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oester at gmail.com Mon Jun 22 16:45:33 2015 From: oester at gmail.com (Bob Oesterlin) Date: Mon, 22 Jun 2015 10:45:33 -0500 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882C45.6090501@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> <55882C45.6090501@pixitmedia.com> Message-ID: I only have a Standard Edition, so I can't say for sure. I do know it's Linux x86 only. This doesn't seem to say directly either: http://www-01.ibm.com/support/knowledgecenter/SSFKCN/gpfs4104/gpfsclustersfaq.html%23lic41?lang=en Bob Oesterlin On Mon, Jun 22, 2015 at 10:39 AM, Barry Evans wrote: > Hi Bob, > > Thanks for this, just to confirm does this mean that it *does not* work > with express? > > Cheers, > Barry > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Paul.Sanchez at deshaw.com Mon Jun 22 23:57:10 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Mon, 22 Jun 2015 22:57:10 +0000 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882C45.6090501@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> <55882C45.6090501@pixitmedia.com> Message-ID: <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. 
For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss [http://www.pixitmedia.com/sig/sig-cio.jpg] This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... 
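The controls that do exist today are scoped per node rather than per file system or pool, which is the gap described above. A rough sketch of the node-wide knobs, assuming the 4.1-era option names lrocData, lrocInodes and lrocDirectories (verify against the mmchconfig documentation for your release before relying on them):

  mmchconfig lrocData=yes,lrocInodes=yes,lrocDirectories=yes -N lrocClients
  mmdiag --lroc    # per-node view of cache occupancy and hit rates

Here lrocClients is a hypothetical node class; none of these options take a file system, pool or fileset qualifier, so today any selectivity has to come from choosing which nodes carry an LROC device at all.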
URL: From oehmes at us.ibm.com Tue Jun 23 00:14:09 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Mon, 22 Jun 2015 16:14:09 -0700 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> References: <55882996.6050903@pixitmedia.com><55882C45.6090501@pixitmedia.com> <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> Message-ID: <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> Hi Paul, just out of curiosity, not that i promise anything, but would it be enough to support include/exclude per fileset level or would we need path and/or extension or even more things like owner of files as well ? Sven ------------------------------------------ Sven Oehme Scalable Storage Research email: oehmes at us.ibm.com Phone: +1 (408) 824-8904 IBM Almaden Research Lab ------------------------------------------ From: "Sanchez, Paul" To: gpfsug main discussion list Date: 06/22/2015 03:57 PM Subject: Re: [gpfsug-discuss] LROC Express Sent by: gpfsug-discuss-bounces at gpfsug.org I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [ mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. 
Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Paul.Sanchez at deshaw.com Tue Jun 23 15:10:31 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Tue, 23 Jun 2015 14:10:31 +0000 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> References: <55882996.6050903@pixitmedia.com><55882C45.6090501@pixitmedia.com> <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> Message-ID: <201D6001C896B846A9CFC2E841986AC145418BC0@mailnycmb2a.winmail.deshaw.com> Hi Sven, Yes, I think that fileset level include/exclude would be sufficient for us. It also begs the question about the same for write caching. We haven?t experimented with it yet, but are looking forward to employing HAWC for scratch-like workloads. Do you imagine providing the same sort of HAWC bypass include/exclude to be part of this? That might be useful for excluding datasets where the write ingest rate isn?t massive and the degree of risk we?re comfortable with potential data recovery issues in the face of complex outages may be much lower. 
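The fileset-level scoping being discussed here already has a precedent in the placement-policy language, which is roughly the shape such an LROC or HAWC control could take if it were ever added. A sketch using the existing SET POOL syntax, with file system, pool and fileset names that are purely hypothetical:

  RULE 'scratchData' SET POOL 'fastdata' FOR FILESET ('scratch01','scratch02')
  RULE 'default' SET POOL 'data'

  mmchpolicy gpfs01 placement.pol
  mmlspolicy gpfs01 -L

Nothing above changes LROC or HAWC behaviour today; it only illustrates how fileset-granular rules are already expressed for data placement.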
Thanks, Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Sven Oehme Sent: Monday, June 22, 2015 7:14 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Paul, just out of curiosity, not that i promise anything, but would it be enough to support include/exclude per fileset level or would we need path and/or extension or even more things like owner of files as well ? Sven ------------------------------------------ Sven Oehme Scalable Storage Research email: oehmes at us.ibm.com Phone: +1 (408) 824-8904 IBM Almaden Research Lab ------------------------------------------ [Inactive hide details for "Sanchez, Paul" ---06/22/2015 03:57:29 PM---I can?t confirm whether it works with Express, since we]"Sanchez, Paul" ---06/22/2015 03:57:29 PM---I can?t confirm whether it works with Express, since we?re also running standard. But as a simple t From: "Sanchez, Paul" > To: gpfsug main discussion list > Date: 06/22/2015 03:57 PM Subject: Re: [gpfsug-discuss] LROC Express Sent by: gpfsug-discuss-bounces at gpfsug.org ________________________________ I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. 
Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 105 bytes Desc: image001.gif URL: From ewahl at osc.edu Tue Jun 23 15:11:11 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Tue, 23 Jun 2015 14:11:11 +0000 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> References: , <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> FYI this page causes problems with various versions of Chrome and Firefox (too lazy to test other browsers, sorry) Seems to be a javascript issue. Huge surprise, right? I've filed bugs on the browser sides for FF, don't care about chrome sorry. Ed Wahl OSC ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Wahl, Edward [ewahl at osc.edu] Sent: Monday, June 15, 2015 4:35 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 4.1.1 fix central location When I navigate using these instructions I can find the fixes, but attempting to get to them at the last step results in a loop back to the SDN screen. :( Not sure if this is the page, lack of the "proper" product in my supported products (still lists 3.5 as our product) or what. 
Ed ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ross.keeping at uk.ibm.com] Sent: Monday, June 15, 2015 12:43 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] 4.1.1 fix central location Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone: (+44 161) 8362381-Line: 37642381 E-mail: ross.keeping at uk.ibm.com [IBM] 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: ATT00001.gif URL: From oester at gmail.com Tue Jun 23 15:16:10 2015 From: oester at gmail.com (Bob Oesterlin) Date: Tue, 23 Jun 2015 09:16:10 -0500 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Try here: http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%2Bdefined%2Bstorage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=4.1.1&platform=Linux+64-bit,x86_64&function=all Bob Oesterlin On Tue, Jun 23, 2015 at 9:11 AM, Wahl, Edward wrote: > FYI this page causes problems with various versions of Chrome and > Firefox (too lazy to test other browsers, sorry) Seems to be a javascript > issue. Huge surprise, right? > > I've filed bugs on the browser sides for FF, don't care about chrome > sorry. > > Ed Wahl > OSC > > > ------------------------------ > *From:* gpfsug-discuss-bounces at gpfsug.org [ > gpfsug-discuss-bounces at gpfsug.org] on behalf of Wahl, Edward [ > ewahl at osc.edu] > *Sent:* Monday, June 15, 2015 4:35 PM > *To:* gpfsug main discussion list > *Subject:* Re: [gpfsug-discuss] 4.1.1 fix central location > > When I navigate using these instructions I can find the fixes, but > attempting to get to them at the last step results in a loop back to the > SDN screen. :( > > Not sure if this is the page, lack of the "proper" product in my supported > products (still lists 3.5 as our product) > or what. 
> > Ed > > ------------------------------ > *From:* gpfsug-discuss-bounces at gpfsug.org [ > gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ > ross.keeping at uk.ibm.com] > *Sent:* Monday, June 15, 2015 12:43 PM > *To:* gpfsug-discuss at gpfsug.org > *Subject:* [gpfsug-discuss] 4.1.1 fix central location > > Hi > > IBM successfully released 4.1.1 on Friday with the Spectrum Scale > re-branding and introduction of protocols etc. > > However, I initially had trouble finding the PTF - rest assured it does > exist. > > You can find the 4.1.1 main download here: > http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 > > There is a new area in Fix Central that you will need to navigate to get > the PTF for upgrade: > https://scwebtestt.rchland.ibm.com/support/fixcentral/ > 1) Set Product Group --> Systems Storage > 2) Set Systems Storage --> Storage software > 3) Set Storage Software --> Software defined storage > 4) Installed Version defaults to 4.1.1 > 5) Select your platform > > If you try and find the PTF via other links or sections of Fix Central you > will likely be disappointed. Work is ongoing to ensure this becomes more > intuitive - any thoughts for improvements always welcome. > > Best regards, > > *Ross Keeping* > IBM Spectrum Scale - Development Manager, People Manager > IBM Systems UK - Manchester Development Lab *Phone:* (+44 161) 8362381 > *-Line:* 37642381 > * E-mail: *ross.keeping at uk.ibm.com > [image: IBM] > 3rd Floor, Maybrook House > Manchester, M3 2EG > United Kingdom > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: not available URL: From orlando.richards at ed.ac.uk Wed Jun 24 12:27:25 2015 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Wed, 24 Jun 2015 12:27:25 +0100 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: References: Message-ID: <558A941D.2010206@ed.ac.uk> Hi all, I'm looking to deploy to RedHat 7.1, but from the GPFS FAQ only versions 4.1.1 and 3.5.0-26 are supported. I can't see a release of 3.5.0-26 on the fix central website - does anyone know if this is available? Will 3.5.0-25 work okay on RH7.1? How about 4.1.0-x - any plans to support that on RH7.1? ------- Orlando. On 16/06/15 09:11, Simon Thompson (Research Computing - IT Services) wrote: > The docs also now seem to be in Spectrum Scale section at: > > http://www-01.ibm.com/support/knowledgecenter/#!/STXKQY/411/ibmspectrumscale411_welcome.html > > Simon > > From: Ross Keeping3 > > Reply-To: gpfsug main discussion list > > Date: Monday, 15 June 2015 17:43 > To: "gpfsug-discuss at gpfsug.org " > > > Subject: [gpfsug-discuss] 4.1.1 fix central location > > Hi > > IBM successfully released 4.1.1 on Friday with the Spectrum Scale > re-branding and introduction of protocols etc. > > However, I initially had trouble finding the PTF - rest assured it does > exist. 
> > You can find the 4.1.1 main download here: > http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 > > There is a new area in Fix Central that you will need to navigate to get > the PTF for upgrade: > https://scwebtestt.rchland.ibm.com/support/fixcentral/ > 1) Set Product Group --> Systems Storage > 2) Set Systems Storage --> Storage software > 3) Set Storage Software --> Software defined storage > 4) Installed Version defaults to 4.1.1 > 5) Select your platform > > If you try and find the PTF via other links or sections of Fix Central > you will likely be disappointed. Work is ongoing to ensure this becomes > more intuitive - any thoughts for improvements always welcome. > > Best regards, > > *Ross Keeping* > IBM Spectrum Scale - Development Manager, People Manager > IBM Systems UK - Manchester Development Lab > *Phone:*(+44 161) 8362381*-Line:*37642381* > E-mail: *ross.keeping at uk.ibm.com > IBM > > 3rd Floor, Maybrook House > > Manchester, M3 2EG > > United Kingdom > > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Research Services Manager Information Services IT Infrastructure Division Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From chris.hunter at yale.edu Wed Jun 24 18:26:11 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Wed, 24 Jun 2015 13:26:11 -0400 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs Message-ID: <558AE833.6070803@yale.edu> Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. regards, chris hunter yale hpc group From ewahl at osc.edu Wed Jun 24 18:47:19 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Wed, 24 Jun 2015 17:47:19 +0000 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: <558AE833.6070803@yale.edu> References: <558AE833.6070803@yale.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5A174@CIO-KRC-D1MBX02.osuad.osu.edu> Both are available to you directly. in Linux anyway. My AIX knowledge is decades old. And yes, the HBAs have much more availability/data of course. What kind of monitoring are you looking to do? Fault? Take the data and ?? nagios/cactii/ganglia/etc? Mine it with Splunk? Expand the GPFS Monitor suite? sourceforge.net/projects/gpfsmonitorsuite (though with sourceforge lately, perhaps we should ask Pam et al. to move them?) Ed Wahl OSC ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Chris Hunter [chris.hunter at yale.edu] Sent: Wednesday, June 24, 2015 1:26 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. 
regards, chris hunter yale hpc group _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From bsallen at alcf.anl.gov Wed Jun 24 18:48:47 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Wed, 24 Jun 2015 17:48:47 +0000 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: <558AE833.6070803@yale.edu> References: <558AE833.6070803@yale.edu> Message-ID: <64F35432-4DFC-452C-8965-455BCF7E2F09@alcf.anl.gov> Checkout https://github.com/leibler/check_mk-sas2ircu. This is obviously check_mk specific, but a reasonable example. Ben > On Jun 24, 2015, at 12:26 PM, Chris Hunter wrote: > > Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? > > We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. > > regards, > chris hunter > yale hpc group > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chris.hunter at yale.edu Mon Jun 29 17:07:12 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Mon, 29 Jun 2015 12:07:12 -0400 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: References: Message-ID: <55916D30.3050709@yale.edu> Thanks for the info. We settled on a simpler perl wrapper around sas2ircu form nagios exchange. chris hunter yale hpc group > Checkout https://github.com/leibler/check_mk-sas2ircu This is obviously check_mk specific, but a reasonable example. Ben >> On Jun 24, 2015, at 12:26 PM, Chris Hunter wrote: >> >> Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? >> >> We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. >> >> regards, >> chris hunter >> yale hpc group From st.graf at fz-juelich.de Tue Jun 30 07:54:18 2015 From: st.graf at fz-juelich.de (Graf, Stephan) Date: Tue, 30 Jun 2015 06:54:18 +0000 Subject: [gpfsug-discuss] ESS/GSS GUI (Monitoring) Message-ID: <38A0607912A90F4880BDE29022E093054087CF1A@MBX2010-E01.ad.fz-juelich.de> Hi! If anyone is interested in a simple GUI for GSS/ESS we have one developed for our own (in the time when there was no GUI available). It is java based and the only requirement is to have passwordless access to the GSS nodes. (We start the GUI on our xCAT server). I have uploaded some screenshots: https://www.dropbox.com/sh/44kln4h7wgp18uu/AADsllhSxOdIeWtkNSaftu8Sa?dl=0 Stephan ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... 
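Returning to the sas2ircu suggestion earlier in this thread, below is a rough sketch of the kind of nagios-style wrapper Chris describes. It is untested here: the controller listing and the exact State strings that sas2ircu prints vary with adapter model and firmware level, so the awk and grep patterns are assumptions to adjust against your own `sas2ircu LIST` and `sas2ircu <n> DISPLAY` output.

```bash
#!/bin/bash
# Rough Nagios-style hardware check wrapping the LSI sas2ircu utility.
# Assumes sas2ircu is in $PATH; the output patterns below are placeholders.

SAS2IRCU=${SAS2IRCU:-sas2ircu}
rc=0

# Enumerate controller indices from "sas2ircu LIST" (index is the first column).
controllers=$($SAS2IRCU LIST | awk '/^[[:space:]]*[0-9]+[[:space:]]/ {print $1}')

for c in $controllers; do
    out=$($SAS2IRCU "$c" DISPLAY 2>&1)
    # Count devices/volumes that report an unhealthy state.
    bad=$(echo "$out" | grep -c -E 'State.*(Failed|Missing|Degraded)')
    if [ "$bad" -gt 0 ]; then
        echo "CRITICAL: controller $c reports $bad device(s) in a bad state"
        rc=2
    fi
done

[ "$rc" -eq 0 ] && echo "OK: all sas2ircu devices report healthy states"
exit $rc
```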
URL: From Daniel.Vogel at abcsystems.ch Tue Jun 30 08:49:10 2015 From: Daniel.Vogel at abcsystems.ch (Daniel Vogel) Date: Tue, 30 Jun 2015 07:49:10 +0000 Subject: [gpfsug-discuss] GPFS 4.1.1 without QoS for mmrestripefs? Message-ID: <2CDF270206A255459AC4FA6B08E52AF90114634DD0@ABCSYSEXC1.abcsystems.ch> Hi Years ago, IBM made plans to implement "QoS for mmrestripefs, mmdeldisk...". If an "mmrestripefs" is running, NFS access suffers very poor performance. I opened a PMR to ask for QoS in version 4.1.1 (Spectrum Scale). PMR 61309,113,848: I discussed the question of QoS with the development team. These command changes that were noticed are not meant to be used as GA code, which is why they are not documented. I cannot provide any further information from the support perspective. Does anybody know more about QoS? The last hope was at "GPFS Workshop Stuttgart März 2015" with Sven Oehme as speaker. Daniel Vogel IT Consultant ABC SYSTEMS AG Hauptsitz Zürich R?tistrasse 28 CH - 8952 Schlieren T +41 43 433 6 433 D +41 43 433 6 467 http://www.abcsystems.ch ABC - Always Better Concepts. Approved By Customers since 1981. -------------- next part -------------- An HTML attachment was scrubbed...
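On the restripe question itself: QoS throttling was not generally available at that level, but one stop-gap many sites used was to restrict the restripe to a small set of helper nodes and run it in a quiet window, for example (the file system and node names below are placeholders):

```bash
# Limit the rebalance to two lightly loaded NSD servers instead of letting
# every node in the cluster participate (names are examples only).
mmrestripefs gpfs01 -b -N nsdserver03,nsdserver04
```

This does not remove the contention entirely; it only caps how many nodes generate rebalancing I/O at once.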
Linked in profile: uk.linkedin.com/in/simonjthompson1 Blog: www.roamingzebra.co.uk c) A paragraph covering what they would bring to the group I already have a good working relationship with GPFS developers having spent the past few months building our OpenStack platform working with IBM and documenting how to use some of the features, and would look to build on this relationship to develop the GPFS user group. I've also blogged many of the bits I have experimented with and would like to see this develop with the group contributing to a wiki style information source with specific examples of technology and configs. In addition to this, I have support from my employer to attend meetings and conferences and would be happy to represent and promote the group at these as well as bringing feedback. d) A paragraph setting out their vision for the group for the next two years I would like to see the group engaging with more diverse users of GPFS as many of those attending the meetings are from HPC type environments, so I would loom to work with both IBM and resellers to help engage with other industries using GPFS technology. Ideally this would see more customer talks at the user group in addition to a balanced view from IBM on road maps. I think it would also be good to focus on specific features of gpfs and how they work and can be applied as my suspicion is very few customers use lots of features to full advantage. Simon -------- ends -------------- next part -------------- An HTML attachment was scrubbed... URL: From Luke.Raimbach at crick.ac.uk Mon Jun 15 09:35:18 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Mon, 15 Jun 2015 08:35:18 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver Message-ID: Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at gpfsug.org Mon Jun 15 14:24:57 2015 From: chair at gpfsug.org (Jez Tucker (Chair)) Date: Mon, 15 Jun 2015 14:24:57 +0100 Subject: [gpfsug-discuss] Member locations for Dev meeting organisation Message-ID: <557ED229.50902@gpfsug.org> Hello all It would be very handy if all members could send me an email to: chair at gpfsug.org with the City and Country in which you are located. We're looking to place 'Meet the Devs' coffee-shops close to you, so this would make planning several orders of magnitude easier. I can infer from each member's email, but it's only 'mostly accurate'. 
Stateside members - we're actively organising a first meet up near you imminently, so please ping me your locations. All the best, Jez (Chair) From ewahl at osc.edu Mon Jun 15 14:59:44 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Mon, 15 Jun 2015 13:59:44 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? Ed Wahl OSC ________________________________ ++ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Jun 15 15:10:25 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Mon, 15 Jun 2015 14:10:25 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver Message-ID: Manilla is one of the projects to provide ?shared? access to file-systems. I thought that at the moment, Manilla doesn?t support the GPFS protocol but is implemented on top of Ganesha so it provided as NFS access. So you wouldn?t get mmunlinkfileset. This sorta brings me back to one of the things I talked about at the GPFS UG, as in the GPFS security model is trusting, which in multi-tenant environments is a bad thing. I know I?ve spoken to a few people recently who?ve commented / agreed / had thoughts on it, so can I ask that if multi-tenancy security is something that you think is of concern with GPFS, can you drop me an email (directly is fine) which your use case and what sort of thing you?d like to see, then I?ll collate this and have a go at talking to IBM again about this. 
Thanks Simon From: , Edward > Reply-To: gpfsug main discussion list > Date: Monday, 15 June 2015 14:59 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? Ed Wahl OSC ________________________________ ++ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Mon Jun 15 15:16:44 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 15 Jun 2015 15:16:44 +0100 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: Message-ID: <1434377805.15671.126.camel@buzzard.phy.strath.ac.uk> On Mon, 2015-06-15 at 08:35 +0000, Luke Raimbach wrote: > Dear All, > > We are looking forward to using the manila driver for > auto-provisioning of file shares using GPFS. However, I have some > concerns... > > > Manila presumably gives tenant users access to file system commands > like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset > quiesces the file system, there is potentially an impact from one > tenant on another - i.e. someone unlinking and deleting a lot of > filesets during a tenancy cleanup might cause a cluster pause long > enough to trigger other failure events or even start evicting nodes. > You can see why this would be bad in a cloud environment. Er as far as I can see in the documentation no you don't. My personal experience is mmunlinkfileset has a habit of locking the file system up; aka don't do while the file system is busy. On the other hand mmlinkfileset you can do with gay abandonment. Might have changed in more recent version of GPFS. On the other hand you do get access to creating/deleting snapshots which on the deleting side has in the past for me personally has caused file system lockups. 
Similarly creating a snapshot no problem. The difference between the two is things that require quiescence to take away from the file system can cause bad things happen. Quiescence to add things to the file system rarely if ever cause problems. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From chris.hunter at yale.edu Mon Jun 15 15:35:06 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Mon, 15 Jun 2015 10:35:06 -0400 Subject: [gpfsug-discuss] OpenStack Manila Driver Message-ID: <557EE29A.4070909@yale.edu> Although likely not the access model you are seeking, GPFS is mentioned for the swift-on-file project: * https://github.com/stackforge/swiftonfile Openstack Swift uses HTTP/REST protocol for file access (ala S3), not the best choice for data-intensive applications. regards, chris hunter yale hpc group --- Date: Mon, 15 Jun 2015 08:35:18 +0000 From: Luke Raimbach To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE From ross.keeping at uk.ibm.com Mon Jun 15 17:43:07 2015 From: ross.keeping at uk.ibm.com (Ross Keeping3) Date: Mon, 15 Jun 2015 17:43:07 +0100 Subject: [gpfsug-discuss] 4.1.1 fix central location Message-ID: Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone: (+44 161) 8362381-Line: 37642381 E-mail: ross.keeping at uk.ibm.com 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/gif Size: 360 bytes Desc: not available URL: From ewahl at osc.edu Mon Jun 15 21:35:04 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Mon, 15 Jun 2015 20:35:04 +0000 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: References: Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> When I navigate using these instructions I can find the fixes, but attempting to get to them at the last step results in a loop back to the SDN screen. :( Not sure if this is the page, lack of the "proper" product in my supported products (still lists 3.5 as our product) or what. Ed ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ross.keeping at uk.ibm.com] Sent: Monday, June 15, 2015 12:43 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] 4.1.1 fix central location Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone: (+44 161) 8362381-Line: 37642381 E-mail: ross.keeping at uk.ibm.com [IBM] 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: ATT00001.gif URL: From oester at gmail.com Mon Jun 15 21:38:39 2015 From: oester at gmail.com (Bob Oesterlin) Date: Mon, 15 Jun 2015 15:38:39 -0500 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: It took me a while to find it too - Key is to search on "Spectrum Scale". Try this URL: http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%2Bdefined%2Bstorage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=4.1.1&platform=Linux+64-bit,x86_64&function=all If you don't want X86, just select the appropriate platform. Bob Oesterlin Nuance Communications On Mon, Jun 15, 2015 at 3:35 PM, Wahl, Edward wrote: > When I navigate using these instructions I can find the fixes, but > attempting to get to them at the last step results in a loop back to the > SDN screen. 
:( > > Not sure if this is the page, lack of the "proper" product in my supported > products (still lists 3.5 as our product) > or what. > > Ed > > ------------------------------ > *From:* gpfsug-discuss-bounces at gpfsug.org [ > gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ > ross.keeping at uk.ibm.com] > *Sent:* Monday, June 15, 2015 12:43 PM > *To:* gpfsug-discuss at gpfsug.org > *Subject:* [gpfsug-discuss] 4.1.1 fix central location > > Hi > > IBM successfully released 4.1.1 on Friday with the Spectrum Scale > re-branding and introduction of protocols etc. > > However, I initially had trouble finding the PTF - rest assured it does > exist. > > You can find the 4.1.1 main download here: > http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 > > There is a new area in Fix Central that you will need to navigate to get > the PTF for upgrade: > https://scwebtestt.rchland.ibm.com/support/fixcentral/ > 1) Set Product Group --> Systems Storage > 2) Set Systems Storage --> Storage software > 3) Set Storage Software --> Software defined storage > 4) Installed Version defaults to 4.1.1 > 5) Select your platform > > If you try and find the PTF via other links or sections of Fix Central you > will likely be disappointed. Work is ongoing to ensure this becomes more > intuitive - any thoughts for improvements always welcome. > > Best regards, > > *Ross Keeping* > IBM Spectrum Scale - Development Manager, People Manager > IBM Systems UK - Manchester Development Lab *Phone:* (+44 161) 8362381 > *-Line:* 37642381 > * E-mail: *ross.keeping at uk.ibm.com > [image: IBM] > 3rd Floor, Maybrook House > Manchester, M3 2EG > United Kingdom > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: not available URL: From Luke.Raimbach at crick.ac.uk Tue Jun 16 08:36:56 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Tue, 16 Jun 2015 07:36:56 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances which would be accessed over NFS. Perhaps I?ve not done enough research in to this though ? I?m also not an OpenStack expert. The tenants don?t have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. The unlinking of the fileset worries me for the reasons stated previously. 
From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward Sent: 15 June 2015 15:00 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? Ed Wahl OSC ________________________________ ++ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Jun 16 09:11:23 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 16 Jun 2015 08:11:23 +0000 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: References: Message-ID: The docs also now seem to be in Spectrum Scale section at: http://www-01.ibm.com/support/knowledgecenter/#!/STXKQY/411/ibmspectrumscale411_welcome.html Simon From: Ross Keeping3 > Reply-To: gpfsug main discussion list > Date: Monday, 15 June 2015 17:43 To: "gpfsug-discuss at gpfsug.org" > Subject: [gpfsug-discuss] 4.1.1 fix central location Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. 
You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone:(+44 161) 8362381-Line:37642381 E-mail: ross.keeping at uk.ibm.com [IBM] 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: ATT00001.gif URL: From jonathan at buzzard.me.uk Tue Jun 16 09:40:30 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 16 Jun 2015 09:40:30 +0100 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> On Tue, 2015-06-16 at 07:36 +0000, Luke Raimbach wrote: [SNIP] > The tenants don?t have root access to the file system, but the Manila > component must act as a wrapper to file system administrative > equivalents like mmcrfileset, mmdelfileset, link and unlink. The > shares are created as GPFS filesets which are then presented over NFS. > What makes you think the it creates filesets as opposed to just sharing out a normal directory? I had a quick peruse over the documentation and source code and saw no mention of filesets, though I could have missed it. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From adam.huffman at crick.ac.uk Tue Jun 16 09:41:56 2015 From: adam.huffman at crick.ac.uk (Adam Huffman) Date: Tue, 16 Jun 2015 08:41:56 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: The presentation of shared storage by Manila isn?t necessarily via NFS. Some of the drivers, I believe the GPFS one amongst them, allow some form of native connection either via the guest or via a VirtFS connection to the client on the hypervisor. Best Wishes, Adam ? > On 16 Jun 2015, at 08:36, Luke Raimbach wrote: > > So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances which would be accessed over NFS. Perhaps I?ve not done enough research in to this though ? I?m also not an OpenStack expert. > > The tenants don?t have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. 
> > The unlinking of the fileset worries me for the reasons stated previously. > > From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward > Sent: 15 June 2015 15:00 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] OpenStack Manila Driver > > Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. > > I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? > > Ed Wahl > OSC > > > > ++ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] > Sent: Monday, June 15, 2015 4:35 AM > To: gpfsug-discuss at gpfsug.org > Subject: [gpfsug-discuss] OpenStack Manila Driver > > Dear All, > > We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... > > Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. > > > Has this scenario been addressed at all? > > Cheers, > Luke. > > > Luke Raimbach? > Senior HPC Data and Storage Systems Engineer > The Francis Crick Institute > Gibbs Building > 215 Euston Road > London NW1 2BE > > E: luke.raimbach at crick.ac.uk > W: www.crick.ac.uk > > The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. > The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From S.J.Thompson at bham.ac.uk Tue Jun 16 09:46:52 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 16 Jun 2015 08:46:52 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: I didn;t think that the *current* Manilla driver user GPFS protocol, but sat on top of Ganesha server. Simon On 16/06/2015 09:41, "Adam Huffman" wrote: > >The presentation of shared storage by Manila isn?t necessarily via NFS. >Some of the drivers, I believe the GPFS one amongst them, allow some form >of native connection either via the guest or via a VirtFS connection to >the client on the hypervisor. 
> >Best Wishes, >Adam > > >? > > > > > >> On 16 Jun 2015, at 08:36, Luke Raimbach >>wrote: >> >> So as I understand things, Manila is an OpenStack component which >>allows tenants to create and destroy shares for their instances which >>would be accessed over NFS. Perhaps I?ve not done enough research in to >>this though ? I?m also not an OpenStack expert. >> >> The tenants don?t have root access to the file system, but the Manila >>component must act as a wrapper to file system administrative >>equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares >>are created as GPFS filesets which are then presented over NFS. >> >> The unlinking of the fileset worries me for the reasons stated >>previously. >> >> From: gpfsug-discuss-bounces at gpfsug.org >>[mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward >> Sent: 15 June 2015 15:00 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] OpenStack Manila Driver >> >> Perhaps I misunderstand here, but if the tenants have administrative >>(ie:root) privileges to the underlying file system management commands I >>think mmunlinkfileset might be a minor concern here. There are FAR more >>destructive things that could occur. >> >> I am not an OpenStack expert and I've not even looked at anything past >>Kilo, but my understanding was that these commands were not necessary >>for tenants. They access a virtual block device that backs to GPFS, >>correct? >> >> Ed Wahl >> OSC >> >> >> >> ++ >> From: gpfsug-discuss-bounces at gpfsug.org >>[gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach >>[Luke.Raimbach at crick.ac.uk] >> Sent: Monday, June 15, 2015 4:35 AM >> To: gpfsug-discuss at gpfsug.org >> Subject: [gpfsug-discuss] OpenStack Manila Driver >> >> Dear All, >> >> We are looking forward to using the manila driver for auto-provisioning >>of file shares using GPFS. However, I have some concerns... >> >> Manila presumably gives tenant users access to file system commands >>like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset >>quiesces the file system, there is potentially an impact from one tenant >>on another - i.e. someone unlinking and deleting a lot of filesets >>during a tenancy cleanup might cause a cluster pause long enough to >>trigger other failure events or even start evicting nodes. You can see >>why this would be bad in a cloud environment. >> >> >> Has this scenario been addressed at all? >> >> Cheers, >> Luke. >> >> >> Luke Raimbach? >> Senior HPC Data and Storage Systems Engineer >> The Francis Crick Institute >> Gibbs Building >> 215 Euston Road >> London NW1 2BE >> >> E: luke.raimbach at crick.ac.uk >> W: www.crick.ac.uk >> >> The Francis Crick Institute Limited is a registered charity in England >>and Wales no. 1140062 and a company registered in England and Wales no. >>06885462, with its registered office at 215 Euston Road, London NW1 2BE. >> The Francis Crick Institute Limited is a registered charity in England >>and Wales no. 1140062 and a company registered in England and Wales no. >>06885462, with its registered office at 215 Euston Road, London NW1 2BE. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > >The Francis Crick Institute Limited is a registered charity in England >and Wales no. 1140062 and a company registered in England and Wales no. >06885462, with its registered office at 215 Euston Road, London NW1 2BE. 
>_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at gpfsug.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Luke.Raimbach at crick.ac.uk Tue Jun 16 09:48:50 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Tue, 16 Jun 2015 08:48:50 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> Message-ID: [SNIP] >> The tenants don?t have root access to the file system, but the Manila >> component must act as a wrapper to file system administrative >> equivalents like mmcrfileset, mmdelfileset, link and unlink. The >> shares are created as GPFS filesets which are then presented over NFS. >> > What makes you think the it creates filesets as opposed to just sharing out a normal directory? I had a quick peruse over the documentation and source code and saw no mention of filesets, though I could have missed it. I think you are right. Looking over the various resources I have available, the creation, deletion, linking and unlinking of filesets is not implemented, but commented on as needing to be done. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From jonathan at buzzard.me.uk Tue Jun 16 10:25:45 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 16 Jun 2015 10:25:45 +0100 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> Message-ID: <1434446745.15671.134.camel@buzzard.phy.strath.ac.uk> On Tue, 2015-06-16 at 08:48 +0000, Luke Raimbach wrote: [SNIP] > I think you are right. Looking over the various resources I have > available, the creation, deletion, linking and unlinking of filesets is > not implemented, but commented on as needing to be done. That's going to be a right barrel of laughs as reliability goes out the window if they do implement it. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From billowen at us.ibm.com Thu Jun 18 05:22:25 2015 From: billowen at us.ibm.com (Bill Owen) Date: Wed, 17 Jun 2015 22:22:25 -0600 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Hi Luke, Your explanation below is correct, with some minor clarifications Manila is an OpenStack project which allows storage admins to create and destroy filesystem shares and make those available to vm instances and bare metal servers which would be accessed over NFS. The Manila driver runs in the control plane and creates a new gpfs independent fileset for each new share. It provides automation for giving vm's (and also bare metal servers) acces to the shares so that they can mount and use the share. There is work being done to allow automating the mount process when the vm instance boots. The tenants don?t have root access to the file system, but the Manila component acts as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. 
The manila driver uses the following gpfs commands:

When a share is created:
mmcrfileset
mmlinkfileset
mmsetquota

When a share is deleted:
mmunlinkfileset
mmdelfileset

Snapshots of shares can be created and deleted:
mmcrsnapshot
mmdelsnapshot

Today, the GPFS Manila driver supports creating NFS exports to VMs. We are considering adding native GPFS client support in the VM, but not sure if the benefit justifies the extra complexity of having gpfs client in vm image, and also the impact to cluster as vm's come up and down in a more dynamic way than physical nodes. For multi-tenant deployments, we recommend using a different filesystem per tenant to provide better separation of data, and to minimize the "noisy neighbor" effect for operations like mmunlinkfileset. Here is a presentation that shows an overview of the GPFS Manila driver: (See attached file: OpenStack_Storage_Manila_with_GPFS.pdf) Perhaps this, and other GPFS & OpenStack topics could be the subject of a future user group session.

Regards,
Bill Owen
billowen at us.ibm.com
GPFS and OpenStack
520-799-4829

From: Luke Raimbach To: gpfsug main discussion list Date: 06/16/2015 12:37 AM Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Sent by: gpfsug-discuss-bounces at gpfsug.org

So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances which would be accessed over NFS. Perhaps I've not done enough research in to this though - I'm also not an OpenStack expert. The tenants don't have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. The unlinking of the fileset worries me for the reasons stated previously.

From: gpfsug-discuss-bounces at gpfsug.org [ mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward Sent: 15 June 2015 15:00 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] OpenStack Manila Driver

Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct?

Ed Wahl
OSC

++
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver

Dear All,

We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns...

Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment.

Has this scenario been addressed at all?

Cheers,
Luke.

Luke Raimbach
Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenStack_Storage_Manila_with_GPFS.pdf Type: application/pdf Size: 354887 bytes Desc: not available URL: From Luke.Raimbach at crick.ac.uk Thu Jun 18 13:30:40 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Thu, 18 Jun 2015 12:30:40 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations Message-ID: Hi All, Something I am thinking about doing is utilising the placement policy engine to insert custom metadata tags upon file creation, based on which fileset the creation occurs in. This might be to facilitate Research Data Management tasks that could happen later in the data lifecycle. I am also thinking about allowing users to specify additional custom metadata tags (maybe through a fancy web interface) and also potentially give users control over creating new filesets (e.g. for scientists running new experiments). So? pretend this is a placement policy on my GPFS driven data-ingest platform: RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' The fileset name can be meaningless (as far as the user is concerned), but would be linked somewhere nice that they recognise ? say /gpfs/incoming/instrument1. The fileset, when it is created, would also be an AFM cache for its ?home? counterpart which exists on a much larger (also GPFS driven) pool of storage? so that my metadata tags are preserved, you see. This potentially user driven activity might look a bit like this: - User logs in to web interface and creates new experiment - Filesets (system-generated names) are created on ?home? and ?ingest? file systems and linked into the directory namespace wherever the user specifies - AFM relationships are set up and established for the ingest (cache) fileset to write back to the AFM home fileset (probably Independent Writer mode) - A set of ?default? 
policies are defined and installed on the cache file system to tag data for that experiment (the user can?t change these) - The user now specifies additional metadata tags they want added to their experiment data (some of this might be captured through additional mandatory fields in the web form for instance) - A policy for later execution by mmapplypolicy on the AFM home file system is created which looks for the tags generated at ingest-time and applies the extra user-defined tags There?s much more that would go on later in the lifecycle to take care of automated HSM tiering, data publishing, movement and cataloguing of data onto external non GPFS file systems, etc. but I won?t go in to it here. My GPFS related questions are: When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. What is the specific limitation for having a policy placement file no larger than 1MB? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Thu Jun 18 14:18:34 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 09:18:34 -0400 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: Yes, you can do this. In release 4.1.1 you can write SET POOL 'x' ACTION(setXattr(...)) FOR FILESET(...) WHERE ... which looks nicer to some people than WHERE ( ... ) AND setXattr(...) Answers: (1) No need to quiesce. As the new policy propagates, nodes begin using it. So there can be a transition time when node A may be using the new policy but Node B has not started using it yet. If that is undesirable, you can quiesce. (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. --marc of GPFS From: Luke Raimbach ... RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' ... (1) When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. ... (2) What is the specific limitation for having a policy placement file no larger than 1MB? ... -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From S.J.Thompson at bham.ac.uk Thu Jun 18 14:27:52 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Thu, 18 Jun 2015 13:27:52 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: I can see exactly where Luke?s suggestion would be applicable. We might have several hundred active research projects which would have some sort of internal identifier, so I can see why you?d want to do this sort of tagging as it would allow a policy scan to find files related to specific projects (for example). Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Thursday, 18 June 2015 14:18 To: gpfsug main discussion list >, "luke.raimbach at crick.ac.uk" > Subject: Re: [gpfsug-discuss] Placement Policy Installation and RDM Considerations (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Luke.Raimbach at crick.ac.uk Thu Jun 18 14:35:32 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Thu, 18 Jun 2015 13:35:32 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: Hi Marc, Thanks for the pointer to the updated syntax. That indeed looks nicer. (1) Asynchronous policy propagation sounds good in our scenario. We don?t want to potentially interrupt other running experiments by having to quiesce the filesystem for a new one coming online. It is useful to know that you could quiesce if desired. Presumably this is a secret flag one might pass to mmchpolicy? (2) I was concerned about the evaluation time if I tried to set all extended attributes at creation time. That?s why I thought about adding a few ?system? defined tags which could later be used to link the files to an asynchronously applied policy on the home cluster. I think I calculated around 4,000 rules (dependent on the size of the attribute names and values), which might limit the number of experiments supported on a single ingest file system. However, I can?t envisage we will ever have 4,000 experiments running at once! I was really interested in why the limitation existed from a file-system architecture point of view. Thanks for the responses. Luke. From: Marc A Kaplan [mailto:makaplan at us.ibm.com] Sent: 18 June 2015 14:19 To: gpfsug main discussion list; Luke Raimbach Subject: Re: [gpfsug-discuss] Placement Policy Installation and RDM Considerations Yes, you can do this. In release 4.1.1 you can write SET POOL 'x' ACTION(setXattr(...)) FOR FILESET(...) WHERE ... which looks nicer to some people than WHERE ( ... ) AND setXattr(...) Answers: (1) No need to quiesce. As the new policy propagates, nodes begin using it. So there can be a transition time when node A may be using the new policy but Node B has not started using it yet. If that is undesirable, you can quiesce. (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... 
So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. --marc of GPFS From: Luke Raimbach > ... RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' ... (1) When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. ... (2) What is the specific limitation for having a policy placement file no larger than 1MB? ... The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo_fumagalli at it.ibm.com Thu Jun 18 14:38:00 2015 From: massimo_fumagalli at it.ibm.com (Massimo Fumagalli) Date: Thu, 18 Jun 2015 15:38:00 +0200 Subject: [gpfsug-discuss] ILM question Message-ID: Please, I need to know a simple question. Using Spectrum Scale 4.1.1, supposing to set ILM policy for migrating files from Filesystem Tier0 to TIer 1 or Tier2 (example using LTFS to library). Then we need to read a file that has been moved to library (or Tier1). Will be file copied back to Tier 0? Or read will be executed directly from Library or Tier1 ? since there can be performance issue Regards Max IBM Italia S.p.A. Sede Legale: Circonvallazione Idroscalo - 20090 Segrate (MI) Cap. Soc. euro 347.256.998,80 C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153 Societ? con unico azionista Societ? soggetta all?attivit? di direzione e coordinamento di International Business Machines Corporation (Salvo che sia diversamente indicato sopra / Unless stated otherwise above) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 10661 bytes Desc: not available URL: From ewahl at osc.edu Thu Jun 18 15:08:29 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Thu, 18 Jun 2015 14:08:29 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> We had a unique situation with one of our many storage arrays occur in the past couple of days and it brought up a question I've had before. Is there a better way to disable a Storage Pool by itself rather than 'mmchdisk stop' the entire list of disks from that pool or mmfsctl and exclude things, etc? Thoughts? In our case our array lost all raid protection in a certain pool (8+2) due to a hardware failure, and started showing drive checkcondition errors on other drives in the array. Yikes! This pool itself is only about 720T and is backed by tape, but who wants to restore that? Even with SOBAR/HSM that would be a loooong week. ^_^ We made the decision to take the entire file system offline during the repair/rebuild, but I would like to have run all other pools save this one in a simpler manner than we took to get there. I'm interested in people's experiences here for future planning and disaster recovery. 
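For reference, the long-hand approach I'm describing boils down to something like this (a sketch only - the device, pool and NSD names are invented, and you would want to eyeball the disk list before stopping anything):

# list the NSDs whose storage pool (the last column of plain mmlsdisk output) is the damaged one
mmlsdisk gpfs0 | awk '$NF == "saspool2" {print $1}'
# then stop just those disks, leaving the other pools serving I/O
mmchdisk gpfs0 stop -d "nsd101;nsd102;nsd103"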
GPFS itself worked exactly as we had planned and expected but I think there is room to improve the file system if I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. I may not be expressing myself here in the best manner possible. Bit of sleep deprivation after the last couple of days. ;) Ed Wahl OSC From Paul.Sanchez at deshaw.com Thu Jun 18 15:52:07 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Thu, 18 Jun 2015 14:52:07 +0000 Subject: [gpfsug-discuss] Member locations for Dev meeting organisation In-Reply-To: <557ED229.50902@gpfsug.org> References: <557ED229.50902@gpfsug.org> Message-ID: <201D6001C896B846A9CFC2E841986AC1454124B2@mailnycmb2a.winmail.deshaw.com> Thanks Jez, D. E. Shaw is based in New York, NY. We have 3-4 engineers/architects who would attend. Additionally, if you haven't heard from D. E. Shaw Research, they're next-door and have another 2. -Paul Sanchez Sent with Good (www.good.com) ________________________________ From: gpfsug-discuss-bounces at gpfsug.org on behalf of Jez Tucker (Chair) Sent: Monday, June 15, 2015 9:24:57 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Member locations for Dev meeting organisation Hello all It would be very handy if all members could send me an email to: chair at gpfsug.org with the City and Country in which you are located. We're looking to place 'Meet the Devs' coffee-shops close to you, so this would make planning several orders of magnitude easier. I can infer from each member's email, but it's only 'mostly accurate'. Stateside members - we're actively organising a first meet up near you imminently, so please ping me your locations. All the best, Jez (Chair) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Thu Jun 18 16:36:49 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 11:36:49 -0400 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: (1) There is no secret flag. I assume that the existing policy is okay but the new one is better. So start using the better one ASAP, but why stop the system if you don't have to? The not secret way to quiesce/resume a filesystem without unmounting is fsctl {suspend | suspend-write | resume}; (2) The policy rules text is passed as a string through a GPFS rpc protocol (not a standard RPC) and the designer/coder chose 1MB as a safety-limit. I think it could be increased, but suppose you did have 4000 rules, each 200 bytes - you'd be at 800KB, still short of the 1MB limit. (x) Personally, I wouldn't worry much about setting, say 10 extended attribute values in each rule. I'd worry more about the impact of having 100s of rules. (y) When designing/deploying a new GPFS filesystem, consider explicitly setting the inode size so that all anticipated extended attributes will be stored in the inode, rather than spilling into other disk blocks. See mmcrfs ... -i InodeSize. You can build a test filesystem with just one NSD/LUN and test your anticipated usage. Use tsdbfs ... xattr ... to see how EAs are stored. Caution: tsdbfs display commands are harmless, BUT there are some patch and patch-like subcommands that could foul up your filesystem. From: Luke Raimbach Hi Marc, Thanks for the pointer to the updated syntax. 
That indeed looks nicer. (1) Asynchronous policy propagation sounds good in our scenario. We don?t want to potentially interrupt other running experiments by having to quiesce the filesystem for a new one coming online. It is useful to know that you could quiesce if desired. Presumably this is a secret flag one might pass to mmchpolicy? (2) I was concerned about the evaluation time if I tried to set all extended attributes at creation time. That?s why I thought about adding a few ?system? defined tags which could later be used to link the files to an asynchronously applied policy on the home cluster. I think I calculated around 4,000 rules (dependent on the size of the attribute names and values), which might limit the number of experiments supported on a single ingest file system. However, I can?t envisage we will ever have 4,000 experiments running at once! I was really interested in why the limitation existed from a file-system architecture point of view. Thanks for the responses. Luke. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgiles at gmail.com Thu Jun 18 17:02:54 2015 From: zgiles at gmail.com (Zachary Giles) Date: Thu, 18 Jun 2015 12:02:54 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Sorry to hear about the problems you've had recently. It's frustrating when that happens. I didn't have the exactly the same situation, but we had something similar which may bring some light to a missing disks situation: We had a dataOnly storage pool backed by a few building blocks each consisting of several RAID controllers that were each direct attached to a few servers. We had several of these sets all in one pool. Thus, if a server failed it was fine, if a single link failed, it was fine. Potentially we could do copies=2 and have multiple failure groups in a single pool. If anything in the RAID arrays themselves failed, it was OK, but a single whole RAID controller going down would take that section of disks down. The number of copies was set to 1 on this pool. One RAID controller went down, but the file system as a whole stayed online. Our user experience was that Some users got IO errors of a "file inaccessible" type (I don't remember the exact code). Other users, and especially those mostly in other tiers continued to work as normal. As we had mostly small files across this tier ( much smaller than the GPFS block size ), most of the files were in one of the RAID controllers or another, thus not striping really, so even the files in other controllers on the same tier were also fine and accessible. Bottom line is: Only the files that were missing gave errors, the others were fine. Additionally, for missing files errors were reported which apps could capture and do something about, wait, or retry later -- not a D state process waiting forever or stale file handles. I'm not saying this is the best way. We didn't intend for this to happen. I suspect that stopping the disk would result in a similar experience but more safely. We asked GPFS devs if we needed to fsck after this since the tier just went offline directly and we continued to use the rest of the system while it was gone.. they said no it should be fine and missing blocks will be taken care of. 
I assume this is true, but I have no explicit proof, except that it's still working and nothing seemed to be missing. I guess some questions for the dev's would be: * Is this safe / advisable to do the above either directly or via a stop and then down the array? * Given that there is some client-side write caching in GPFS, if a file is being written and an expected final destination goes offline mid-write, where does the block go? + If a whole pool goes offline, will it pick another pool or error? + If it's a disk in a pool, will it reevaluate and round-robin to the next disk, or just fail since it had already decided where to write? Hope this helps a little. On Thu, Jun 18, 2015 at 10:08 AM, Wahl, Edward wrote: > We had a unique situation with one of our many storage arrays occur in the past couple of days and it brought up a question I've had before. Is there a better way to disable a Storage Pool by itself rather than 'mmchdisk stop' the entire list of disks from that pool or mmfsctl and exclude things, etc? Thoughts? > > In our case our array lost all raid protection in a certain pool (8+2) due to a hardware failure, and started showing drive checkcondition errors on other drives in the array. Yikes! This pool itself is only about 720T and is backed by tape, but who wants to restore that? Even with SOBAR/HSM that would be a loooong week. ^_^ We made the decision to take the entire file system offline during the repair/rebuild, but I would like to have run all other pools save this one in a simpler manner than we took to get there. > > I'm interested in people's experiences here for future planning and disaster recovery. GPFS itself worked exactly as we had planned and expected but I think there is room to improve the file system if I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. > > I may not be expressing myself here in the best manner possible. Bit of sleep deprivation after the last couple of days. ;) > > Ed Wahl > OSC > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From zgiles at gmail.com Thu Jun 18 17:06:33 2015 From: zgiles at gmail.com (Zachary Giles) Date: Thu, 18 Jun 2015 12:06:33 -0400 Subject: [gpfsug-discuss] ILM question In-Reply-To: References: Message-ID: I would expect it to return to one of your online tiers. If you tier between two storage pools, you can directly read and write those files. Think of how LTFS works -- it's an external storage pool, so you need to run an operation via an external command to give the file back to GPFS from which you can read it. This is controlled via the policies and I assume you would need to make a policy to specify where the file would be placed when it comes back. It would be fancy for someone to allow reading directly from an external pool, but as far as I know, it has to hit a disk first. What I don't know is: Will it begin streaming the files back to the user as the blocks hit the disk, while other blocks are still coming in, or must the whole file be recalled first? On Thu, Jun 18, 2015 at 9:38 AM, Massimo Fumagalli < massimo_fumagalli at it.ibm.com> wrote: > Please, I need to know a simple question. > > Using Spectrum Scale 4.1.1, supposing to set ILM policy for migrating > files from Filesystem Tier0 to TIer 1 or Tier2 (example using LTFS to > library). 
> Then we need to read a file that has been moved to library (or Tier1). > Will be file copied back to Tier 0? Or read will be executed directly from > Library or Tier1 ? since there can be performance issue > > Regards > Max > > > IBM Italia S.p.A. > Sede Legale: Circonvallazione Idroscalo - 20090 Segrate (MI) > Cap. Soc. euro 347.256.998,80 > C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153 > Società con unico azionista > Società soggetta all'attività di direzione e coordinamento di > International Business Machines Corporation > > (Salvo che sia diversamente indicato sopra / Unless stated otherwise above) > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- Zach Giles zgiles at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 10661 bytes Desc: not available URL: From chekh at stanford.edu Thu Jun 18 21:26:17 2015 From: chekh at stanford.edu (Alex Chekholko) Date: Thu, 18 Jun 2015 13:26:17 -0700 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <55832969.4050901@stanford.edu> mmlsdisk fs | grep pool | awk '{print $1}' | tr '\n' ';'| xargs mmchdisk suspend # seems pretty simple to me Then I guess you also have to modify your policy rules which relate to that pool. You're asking for a convenience wrapper script for a super-uncommon situation? On 06/18/2015 09:02 AM, Zachary Giles wrote: > I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. -- Alex Chekholko chekh at stanford.edu From ewahl at osc.edu Thu Jun 18 21:36:48 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Thu, 18 Jun 2015 20:36:48 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <55832969.4050901@stanford.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <55832969.4050901@stanford.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A58A6A@CIO-KRC-D1MBX02.osuad.osu.edu> I'm not sure it's so uncommon, but yes. (and your line looks suspiciously like mine did) I've had other situations where it would have been nice to do maintenance on a single storage pool. Maybe this is a "scale" issue when you get too large and should maybe have multiple file systems instead? Single name space is nice for users though. Plus I was curious what others had done in similar situations. I guess I could do what IBM does and just write the stupid script, name it "ts-something" and put a happy wrapper up front with a mm-something name. ;) Just FYI: 'suspend' does NOT stop I/O. Only stops new block creation, so 'stop' was what I did. From the man page: "...Existing data on a suspended disk may still be read or updated." Ed Wahl OSC ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Alex Chekholko [chekh at stanford.edu] Sent: Thursday, June 18, 2015 4:26 PM To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves?
mmlsdisk fs | grep pool | awk '{print $1} | tr '\n' ';'| xargs mmchdisk suspend # seems pretty simple to me Then I guess you also have to modify your policy rules which relate to that pool. You're asking for a convenience wrapper script for a super-uncommon situation? On 06/18/2015 09:02 AM, Zachary Giles wrote: > I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. -- Alex Chekholko chekh at stanford.edu _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From makaplan at us.ibm.com Thu Jun 18 22:01:01 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 17:01:01 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: What do you see as the pros and cons of using GPFS Native Raid and configuring your disk arrays as JBODs instead of using RAID in a box. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 21994 bytes Desc: not available URL: From stijn.deweirdt at ugent.be Fri Jun 19 08:18:31 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 09:18:31 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <5583C247.1090609@ugent.be> > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. just this week we had an issue with bad disk (of the non-failing but disrupting everything kind) and issues with the raid controller (db of both controllers corrupted due to the one disk, controller reboot loops etc etc). but tech support pulled it through, although it took a while. i'm amased what can be done with the hardware controllers (and i've seen my share of recoveries ;) my question to ibm wrt gss would be: can we have a "demo" of gss recovering from eg a drawer failure (eg pull both sas connectors to the drawer itself). i like the gss we have and the data recovery for single disk failures, but i'm not sure how well it does with major component failures. the demo could be the steps support would take to get it running again (e.g. can gss recover from a drawer failure, assuming the disks are still ok ofcourse). stijn > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From Luke.Raimbach at crick.ac.uk Fri Jun 19 08:47:10 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Fri, 19 Jun 2015 07:47:10 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <5583C247.1090609@ugent.be> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <5583C247.1090609@ugent.be> Message-ID: my question to ibm wrt gss would be: can we have a "demo" of gss recovering from eg a drawer failure (eg pull both sas connectors to the drawer itself). 
i like the gss we have and the data recovery for single disk failures, but i'm not sure how well it does with major component failures. the demo could be the steps support would take to get it running again (e.g. can gss recover from a drawer failure, assuming the disks are still ok ofcourse). Ooh, we have a new one that's not in production yet. IBM say the latest GSS code should allow for a whole enclosure failure. I might try it before going in to production. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From S.J.Thompson at bham.ac.uk Fri Jun 19 14:31:17 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 13:31:17 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Do you mean in GNR compared to using (san/IB) based hardware RAIDs? If so, then GNR isn?t a scale-out solution - you buy a ?unit? and can add another ?unit? to the namespace, but I can?t add another 30TB of storage (say a researcher with a grant), where as with SAN based RAID controllers, I can go off and buy another storage shelf. Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Thursday, 18 June 2015 22:01 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? What do you see as the pros and cons of using GPFS Native Raid and configuring your disk arrays as JBODs instead of using RAID in a box. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Fri Jun 19 14:51:32 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 19 Jun 2015 09:51:32 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: 1. YES, Native Raid can recover from various failures: drawers, cabling, controllers, power supplies, etc, etc. Of course it must be configured properly so that there is no possible single point of failure. But yes, you should get your hands on a test rig and try out (simulate) various failure scenarios and see how well it works. 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. 3. If you'd like to know and/or explore more, read the pubs, do the experiments, and/or contact the IBM sales and support people. IF by some chance you do not get satisfactory answers, come back here perhaps we can get your inquiries addressed by the GPFS design team. Like other complex products, there are bound to be some questions that the sales and marketing people can't quite address. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From S.J.Thompson at bham.ac.uk Fri Jun 19 14:56:33 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 13:56:33 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group). So under (3), I?m posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewahl at osc.edu Fri Jun 19 15:37:13 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Fri, 19 Jun 2015 14:37:13 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> They seem to have killed dead the idea of just moving the software side of that by itself, wayyy.... back now. late 2013?/early 2014 Even though the components are fairly standard units(Engenio before, not sure now). Ironic as GPFS/Spectrum Scale is storage software... Even more ironic for our site as we have the Controller based units these JBODs are from for our metadata and one of our many storage pools. (DDN on others) Ed ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Simon Thompson (Research Computing - IT Services) [S.J.Thompson at bham.ac.uk] Sent: Friday, June 19, 2015 9:56 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group). So under (3), I?m posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2. 
I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at us.ibm.com Fri Jun 19 15:49:32 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Fri, 19 Jun 2015 14:49:32 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <201506191450.t5JEovUS018695@d01av01.pok.ibm.com> GNR today is only sold as a packaged solution e.g. ESS. The reason its not sold as SW only today is technical and its not true that this is not been pursued, its just not there yet and we cant discuss plans on a mailinglist. Sven Sent from IBM Verse Wahl, Edward --- Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? --- From:"Wahl, Edward" To:"gpfsug main discussion list" Date:Fri, Jun 19, 2015 9:41 AMSubject:Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? They seem to have killed dead the idea of just moving the software side of that by itself, wayyy.... back now. late 2013?/early 2014 Even though the components are fairly standard units(Engenio before, not sure now). Ironic as GPFS/Spectrum Scale is storage software... Even more ironic for our site as we have the Controller based units these JBODs are from for our metadata and one of our many storage pools. (DDN on others) Ed From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Simon Thompson (Research Computing - IT Services) [S.J.Thompson at bham.ac.uk] Sent: Friday, June 19, 2015 9:56 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group). So under (3), I?m posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan Reply-To: gpfsug main discussion list Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgiles at gmail.com Fri Jun 19 15:56:19 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 10:56:19 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: I think it's technically possible to run GNR on unsupported trays. You may have to do some fiddling with some of the scripts, and/or you wont get proper reporting. Of course it probably violates 100 licenses etc etc etc. I don't know of anyone who's done it yet. I'd like to do it.. I think it would be great to learn it deeper by doing this. On Fri, Jun 19, 2015 at 9:56 AM, Simon Thompson (Research Computing - IT Services) wrote: > Er. Everytime I?ve ever asked about GNR,, the response has been that its > only available as packaged products as it has to understand things like the > shelf controllers, disk drives etc, in order for things like the disk > hospital to work. (And the last time I asked talked about GNR was in May at > the User group). > > So under (3), I?m posting here asking if anyone from IBM knows anything > different? > > Thanks > > Simon > > From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > 2. I don't know the details of the packaged products, but I believe you can > license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire or > need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com From jtucker at pixitmedia.com Fri Jun 19 16:05:24 2015 From: jtucker at pixitmedia.com (Jez Tucker (Chair)) Date: Fri, 19 Jun 2015 16:05:24 +0100 Subject: [gpfsug-discuss] Handing over chair@ Message-ID: <55842FB4.9030705@gpfsug.org> Hello all This is my last post as Chair for the foreseeable future. The next will come from Simon Thompson who assumes the post today for the next two years. I'm looking forward to Simon's tenure and wish him all the best with his endeavours. Myself, I'm moving over to UG Media Rep and will continue to support the User Group and committee in its efforts. My new email is jez.tucker at gpfsug.org Please keep sending through your City and Country locations, they're most helpful. Have a great weekend. All the best, Jez -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. From oehmes at us.ibm.com Fri Jun 19 16:09:41 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Fri, 19 Jun 2015 15:09:41 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: Message-ID: <201506191510.t5JFAdW2021605@d03av04.boulder.ibm.com> Reporting is not the issue, one of the main issue is that we can't talk to the enclosure, which results in loosing the capability to replace disk drive or turn any fault indicators on. It also prevents us to 'read' the position of a drive within a tray or fault domain within a enclosure, without that information we can't properly determine where we need to place strips of a track to prevent data access loss in case a enclosure or component fails. Sven Sent from IBM Verse Zachary Giles --- Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? --- From:"Zachary Giles" To:"gpfsug main discussion list" Date:Fri, Jun 19, 2015 9:56 AMSubject:Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? I think it's technically possible to run GNR on unsupported trays. You may have to do some fiddling with some of the scripts, and/or you wont get proper reporting. Of course it probably violates 100 licenses etc etc etc. I don't know of anyone who's done it yet. I'd like to do it.. I think it would be great to learn it deeper by doing this. On Fri, Jun 19, 2015 at 9:56 AM, Simon Thompson (Research Computing - IT Services) wrote: > Er. Everytime I?ve ever asked about GNR,, the response has been that its > only available as packaged products as it has to understand things like the > shelf controllers, disk drives etc, in order for things like the disk > hospital to work. (And the last time I asked talked about GNR was in May at > the User group). > > So under (3), I?m posting here asking if anyone from IBM knows anything > different? > > Thanks > > Simon > > From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > 2. I don't know the details of the packaged products, but I believe you can > license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire or > need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Fri Jun 19 16:15:44 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Fri, 19 Jun 2015 16:15:44 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: > I think it's technically possible to run GNR on unsupported trays. You > may have to do some fiddling with some of the scripts, and/or you wont > get proper reporting. > Of course it probably violates 100 licenses etc etc etc. > I don't know of anyone who's done it yet. I'd like to do it.. 
I think > it would be great to learn it deeper by doing this. > One imagines that GNR uses the SCSI enclosure services to talk to the shelves. https://en.wikipedia.org/wiki/SCSI_Enclosure_Services https://en.wikipedia.org/wiki/SES-2_Enclosure_Management Which would suggest that anything that supported these would work. I did some experimentation with a spare EXP810 shelf a few years ago on a FC-AL on Linux. Kind all worked out the box. The other experiment with an EXP100 didn't work so well; with the EXP100 it would only work with the 250GB and 400GB drives that came with the dam thing. With the EXP810 I could screw random SATA drives into it and it all worked. My investigations concluded that the firmware on the EXP100 shelf determined if the drive was supported, but I could not work out how to upload modified firmware to the shelf. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From stijn.deweirdt at ugent.be Fri Jun 19 16:23:18 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 17:23:18 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <558433E6.8030708@ugent.be> hi marc, > 1. YES, Native Raid can recover from various failures: drawers, cabling, > controllers, power supplies, etc, etc. > Of course it must be configured properly so that there is no possible > single point of failure. hmmm, this is not really what i was asking about. but maybe it's easier in gss to do this properly (eg for 8+3 data protection, you only need 11 drawers if you can make sure the data+parity blocks are send to different drawers (sort of per drawer failure group, but internal to the vdisks), and the smallest setup is a gss24 which has 20 drawers). but i can't rememeber any manual suggestion the admin can control this (or is it the default?). anyway, i'm certainly interested in any config whitepapers or guides to see what is required for such setup. are these public somewhere? (have really searched for them). > > But yes, you should get your hands on a test rig and try out (simulate) > various failure scenarios and see how well it works. is there a way besides presales to get access to such setup? stijn > > 2. I don't know the details of the packaged products, but I believe you > can license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire > or need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > 3. If you'd like to know and/or explore more, read the pubs, do the > experiments, and/or contact the IBM sales and support people. > IF by some chance you do not get satisfactory answers, come back here > perhaps we can get your inquiries addressed by the > GPFS design team. Like other complex products, there are bound to be some > questions that the sales and marketing people > can't quite address. > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From jonathan at buzzard.me.uk Fri Jun 19 16:35:32 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Fri, 19 Jun 2015 16:35:32 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? 
How about GPFS Native Raid? In-Reply-To: <558433E6.8030708@ugent.be> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <558433E6.8030708@ugent.be> Message-ID: <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> On Fri, 2015-06-19 at 17:23 +0200, Stijn De Weirdt wrote: > hi marc, > > > > 1. YES, Native Raid can recover from various failures: drawers, cabling, > > controllers, power supplies, etc, etc. > > Of course it must be configured properly so that there is no possible > > single point of failure. > hmmm, this is not really what i was asking about. but maybe it's easier > in gss to do this properly (eg for 8+3 data protection, you only need 11 > drawers if you can make sure the data+parity blocks are send to > different drawers (sort of per drawer failure group, but internal to the > vdisks), and the smallest setup is a gss24 which has 20 drawers). > but i can't rememeber any manual suggestion the admin can control this > (or is it the default?). > I got the impression that GNR was more in line with the Engenio dynamic disk pools http://www.netapp.com/uk/technology/dynamic-disk-pools.aspx http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dynamic_disk_pooling_technical_report.pdf That is traditional RAID sucks with large numbers of big drives. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From bsallen at alcf.anl.gov Fri Jun 19 17:05:15 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Fri, 19 Jun 2015 16:05:15 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: > One imagines that GNR uses the SCSI enclosure services to talk to the > shelves. It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. Ben > On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: > > On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >> I think it's technically possible to run GNR on unsupported trays. You >> may have to do some fiddling with some of the scripts, and/or you wont >> get proper reporting. >> Of course it probably violates 100 licenses etc etc etc. >> I don't know of anyone who's done it yet. I'd like to do it.. I think >> it would be great to learn it deeper by doing this. >> > > One imagines that GNR uses the SCSI enclosure services to talk to the > shelves. > > https://en.wikipedia.org/wiki/SCSI_Enclosure_Services > https://en.wikipedia.org/wiki/SES-2_Enclosure_Management > > Which would suggest that anything that supported these would work. > > I did some experimentation with a spare EXP810 shelf a few years ago on > a FC-AL on Linux. Kind all worked out the box. The other experiment with > an EXP100 didn't work so well; with the EXP100 it would only work with > the 250GB and 400GB drives that came with the dam thing. With the EXP810 > I could screw random SATA drives into it and it all worked. 
My > investigations concluded that the firmware on the EXP100 shelf > determined if the drive was supported, but I could not work out how to > upload modified firmware to the shelf. > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From peserocka at gmail.com Fri Jun 19 17:09:44 2015 From: peserocka at gmail.com (Pete Sero) Date: Sat, 20 Jun 2015 00:09:44 +0800 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: vi my_enclosures.conf fwiw Peter On 2015 Jun 20 Sat, at 24:05, Allen, Benjamin S. wrote: >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. > > It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. > > So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. > > Ben > >> On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: >> >> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >>> I think it's technically possible to run GNR on unsupported trays. You >>> may have to do some fiddling with some of the scripts, and/or you wont >>> get proper reporting. >>> Of course it probably violates 100 licenses etc etc etc. >>> I don't know of anyone who's done it yet. I'd like to do it.. I think >>> it would be great to learn it deeper by doing this. >>> >> >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. >> >> https://en.wikipedia.org/wiki/SCSI_Enclosure_Services >> https://en.wikipedia.org/wiki/SES-2_Enclosure_Management >> >> Which would suggest that anything that supported these would work. >> >> I did some experimentation with a spare EXP810 shelf a few years ago on >> a FC-AL on Linux. Kind all worked out the box. The other experiment with >> an EXP100 didn't work so well; with the EXP100 it would only work with >> the 250GB and 400GB drives that came with the dam thing. With the EXP810 >> I could screw random SATA drives into it and it all worked. My >> investigations concluded that the firmware on the EXP100 shelf >> determined if the drive was supported, but I could not work out how to >> upload modified firmware to the shelf. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk >> Fife, United Kingdom. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Fri Jun 19 17:12:53 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 12:12:53 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: Ya, that's why I mentioned you'd probably have to fiddle with some scripts or something to help GNR figure out where disks are. Is definitely known that you can't just use any random enclosure given that GNR depends highly on the topology. Maybe in the future there would be a way to specify the topology or that a drive is at a specific position. On Fri, Jun 19, 2015 at 12:05 PM, Allen, Benjamin S. wrote: >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. > > It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. > > So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. > > Ben > >> On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: >> >> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >>> I think it's technically possible to run GNR on unsupported trays. You >>> may have to do some fiddling with some of the scripts, and/or you wont >>> get proper reporting. >>> Of course it probably violates 100 licenses etc etc etc. >>> I don't know of anyone who's done it yet. I'd like to do it.. I think >>> it would be great to learn it deeper by doing this. >>> >> >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. >> >> https://en.wikipedia.org/wiki/SCSI_Enclosure_Services >> https://en.wikipedia.org/wiki/SES-2_Enclosure_Management >> >> Which would suggest that anything that supported these would work. >> >> I did some experimentation with a spare EXP810 shelf a few years ago on >> a FC-AL on Linux. Kind all worked out the box. The other experiment with >> an EXP100 didn't work so well; with the EXP100 it would only work with >> the 250GB and 400GB drives that came with the dam thing. With the EXP810 >> I could screw random SATA drives into it and it all worked. My >> investigations concluded that the firmware on the EXP100 shelf >> determined if the drive was supported, but I could not work out how to >> upload modified firmware to the shelf. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk >> Fife, United Kingdom. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From makaplan at us.ibm.com Fri Jun 19 19:45:19 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 19 Jun 2015 14:45:19 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: OOps... here is the official statement: GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). 
GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? -------------- next part -------------- An HTML attachment was scrubbed... URL: From stijn.deweirdt at ugent.be Fri Jun 19 20:01:04 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 21:01:04 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <558433E6.8030708@ugent.be> <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> Message-ID: <558466F0.8000300@ugent.be> >>> 1. YES, Native Raid can recover from various failures: drawers, cabling, >>> controllers, power supplies, etc, etc. >>> Of course it must be configured properly so that there is no possible >>> single point of failure. >> hmmm, this is not really what i was asking about. but maybe it's easier >> in gss to do this properly (eg for 8+3 data protection, you only need 11 >> drawers if you can make sure the data+parity blocks are send to >> different drawers (sort of per drawer failure group, but internal to the >> vdisks), and the smallest setup is a gss24 which has 20 drawers). >> but i can't rememeber any manual suggestion the admin can control this >> (or is it the default?). >> > > I got the impression that GNR was more in line with the Engenio dynamic > disk pools well, it uses some crush-like placement and some parity encoding scheme (regular raid6 for the DDP, some flavour of EC for GNR), but other than that, not much resemblance. DDP does not give you any control over where the data blocks are stored. i'm not sure about GNR, (but DDP does not state anywhere they are drawer failure proof ;). but GNR is more like a DDP than e.g. a ceph EC pool, in the sense that the hosts need to see all disks (similar to the controller that needs access to the disks). > > http://www.netapp.com/uk/technology/dynamic-disk-pools.aspx > > http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dynamic_disk_pooling_technical_report.pdf > > That is traditional RAID sucks with large numbers of big drives. (btw it's one of those that we saw fail (and get recovered by tech support!) this week. tip of the week: turn on the SMmonitor service on at least one host, it's actually useful for something). stijn > > > JAB. > From S.J.Thompson at bham.ac.uk Fri Jun 19 20:17:32 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 19:17:32 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu>, Message-ID: My understanding is that GSS and IBM ESS are sold as pre-configured systems. So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. 
So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] Sent: 19 June 2015 19:45 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? OOps... here is the official statement: GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? From zgiles at gmail.com Fri Jun 19 21:08:14 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 16:08:14 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: It's comparable to other "large" controller systems. Take the DDN 10K/12K for example: You don't just buy one more shelf of disks, or 5 disks at a time from Walmart. You buy 5, 10, or 20 trays and populate enough disks to either hit your bandwidth or storage size requirement. Generally changing from 5 to 10 to 20 requires support to come on-site and recable it, and generally you either buy half or all the disks slots worth of disks. The whole system is a building block and you buy N of them to get up to 10-20PB of storage. GSS is the same way, there are a few models and you just buy a packaged one. Technically, you can violate the above constraints, but then it may not work well and you probably can't buy it that way. I'm pretty sure DDN's going to look at you funny if you try to buy a 12K with 30 drives.. :) For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with buildin RAID, a pair of servers, and forget GNR. Or maybe GSS22? :) >From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 " Current high-density storage Models 24 and 26 remain available Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available 200 GB and 800 GB SSDs are also available The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, 26s is comprised of SSD drives or 1.2 TB hard SAS drives " On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - IT Services) wrote: > > My understanding I that GSS and IBM ESS are sold as pre configured systems. 
> > So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. > > So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. > > So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] > Sent: 19 June 2015 19:45 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > OOps... here is the official statement: > > GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. > > I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From S.J.Thompson at bham.ac.uk Fri Jun 19 22:08:25 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 21:08:25 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> , Message-ID: I'm not disputing that gnr is a cool technology. Just that as scale out, it doesn't work for our funding model. If we go back to the original question, if was pros and cons of gnr vs raid type storage. My point was really that I have research groups who come along and want to by xTb at a time. And that's relatively easy with a raid/san based approach. And at times that needs to be a direct purchase from our supplier based on the grant rather than an internal recharge. And the overhead of a smaller gss (twin servers) is much higher cost compared to a storewise tray. I'm also not really advocating that its arbitrary storage. Just saying id really like to see shelf at a time upgrades for it (and supports shelf only). Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Zachary Giles [zgiles at gmail.com] Sent: 19 June 2015 21:08 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? It's comparable to other "large" controller systems. Take the DDN 10K/12K for example: You don't just buy one more shelf of disks, or 5 disks at a time from Walmart. You buy 5, 10, or 20 trays and populate enough disks to either hit your bandwidth or storage size requirement. 
Generally changing from 5 to 10 to 20 requires support to come on-site and recable it, and generally you either buy half or all the disks slots worth of disks. The whole system is a building block and you buy N of them to get up to 10-20PB of storage. GSS is the same way, there are a few models and you just buy a packaged one. Technically, you can violate the above constraints, but then it may not work well and you probably can't buy it that way. I'm pretty sure DDN's going to look at you funny if you try to buy a 12K with 30 drives.. :) For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with buildin RAID, a pair of servers, and forget GNR. Or maybe GSS22? :) >From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 " Current high-density storage Models 24 and 26 remain available Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available 200 GB and 800 GB SSDs are also available The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, 26s is comprised of SSD drives or 1.2 TB hard SAS drives " On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - IT Services) wrote: > > My understanding I that GSS and IBM ESS are sold as pre configured systems. > > So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. > > So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. > > So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] > Sent: 19 June 2015 19:45 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > OOps... here is the official statement: > > GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. > > I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chris.hunter at yale.edu Fri Jun 19 22:18:51 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Fri, 19 Jun 2015 17:18:51 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? 
In-Reply-To: References: Message-ID: <5584873B.1080109@yale.edu> Smells of troll bait but I'll bite. "Declustered RAID" certainly has benefits for recovery of failed disks but I don't think it claims performance benefits over traditional RAID. GNR certainly has a large memory footprint. Object RAID is a close cousin that has flexbile expansion capability, depending on product packaging GNR could likely match these features. Argonne labs (Illinois USA) has done a lot with both GNR and RAID GPFS, I would be interested in their experiences. > Date: Thu, 18 Jun 2015 17:01:01 -0400 > From: Marc A Kaplan > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? > > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. chris hunter yale hpc group From zgiles at gmail.com Fri Jun 19 22:35:59 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 17:35:59 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: OK, back on topic: Honestly, I'm really glad you said that. I have that exact problem also -- a researcher will be funded for xTB of space, and we are told by the grants office that if something is purchased on a grant it belongs to them and it should have a sticker put on it that says "property of the govt' etc etc. We decided to (as an institution) put the money forward to purchase a large system ahead of time, and as grants come in, recover the cost back into the system by paying off our internal "negative balance". In this way we can get the benefit of a large storage system like performance and purchasing price, but provision storage into quotas as needed. We can even put stickers on a handful of drives in the GSS tray if that makes them feel happy. Could they request us to hand over their drives and take them out of our system? Maybe. if the Grants Office made us do it, sure, I'd drain some pools off and go hand them over.. but that will never happen because it's more valuable to them in our cluster than sitting on their table, and I'm not going to deliver the drives full of their data. That's their responsibility. Is it working? Yeah, but, I'm not a grants admin nor an accountant, so I'll let them figure that out, and they seem to be OK with this model. And yes, it's not going to work for all institutions unless you can put the money forward upfront, or do a group purchase at the end of a year. So I 100% agree, GNR doesn't really fit the model of purchasing a few drives at a time, and the grants things is still a problem. On Fri, Jun 19, 2015 at 5:08 PM, Simon Thompson (Research Computing - IT Services) wrote: > I'm not disputing that gnr is a cool technology. > > Just that as scale out, it doesn't work for our funding model. > > If we go back to the original question, if was pros and cons of gnr vs raid type storage. > > My point was really that I have research groups who come along and want to by xTb at a time. And that's relatively easy with a raid/san based approach. And at times that needs to be a direct purchase from our supplier based on the grant rather than an internal recharge. > > And the overhead of a smaller gss (twin servers) is much higher cost compared to a storewise tray. 
I'm also not really advocating that its arbitrary storage. Just saying id really like to see shelf at a time upgrades for it (and supports shelf only). > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Zachary Giles [zgiles at gmail.com] > Sent: 19 June 2015 21:08 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > It's comparable to other "large" controller systems. Take the DDN > 10K/12K for example: You don't just buy one more shelf of disks, or 5 > disks at a time from Walmart. You buy 5, 10, or 20 trays and populate > enough disks to either hit your bandwidth or storage size requirement. > Generally changing from 5 to 10 to 20 requires support to come on-site > and recable it, and generally you either buy half or all the disks > slots worth of disks. The whole system is a building block and you buy > N of them to get up to 10-20PB of storage. > GSS is the same way, there are a few models and you just buy a packaged one. > > Technically, you can violate the above constraints, but then it may > not work well and you probably can't buy it that way. > I'm pretty sure DDN's going to look at you funny if you try to buy a > 12K with 30 drives.. :) > > For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save > money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with > buildin RAID, a pair of servers, and forget GNR. > Or maybe GSS22? :) > > From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 > " > Current high-density storage Models 24 and 26 remain available > Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u > JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) > 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available > 200 GB and 800 GB SSDs are also available > The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, > 26s is comprised of SSD drives or 1.2 TB hard SAS drives > " > > > On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - > IT Services) wrote: >> >> My understanding I that GSS and IBM ESS are sold as pre configured systems. >> >> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >> >> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >> >> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >> >> Simon >> ________________________________________ >> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >> Sent: 19 June 2015 19:45 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >> >> OOps... here is the official statement: >> >> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). 
GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >> >> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Zach Giles > zgiles at gmail.com > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From chris.hunter at yale.edu Fri Jun 19 22:57:14 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Fri, 19 Jun 2015 17:57:14 -0400 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: References: Message-ID: <5584903A.3020203@yale.edu> I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. A JBOD solution that allows incremental drive expansion is desirable. chris hunter yale hpc group > From: Zachary Giles > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > OK, back on topic: > Honestly, I'm really glad you said that. I have that exact problem > also -- a researcher will be funded for xTB of space, and we are told > by the grants office that if something is purchased on a grant it > belongs to them and it should have a sticker put on it that says > "property of the govt' etc etc. > We decided to (as an institution) put the money forward to purchase a > large system ahead of time, and as grants come in, recover the cost > back into the system by paying off our internal "negative balance". In > this way we can get the benefit of a large storage system like > performance and purchasing price, but provision storage into quotas as > needed. We can even put stickers on a handful of drives in the GSS > tray if that makes them feel happy. > Could they request us to hand over their drives and take them out of > our system? Maybe. if the Grants Office made us do it, sure, I'd drain > some pools off and go hand them over.. but that will never happen > because it's more valuable to them in our cluster than sitting on > their table, and I'm not going to deliver the drives full of their > data. That's their responsibility. > > Is it working? Yeah, but, I'm not a grants admin nor an accountant, so > I'll let them figure that out, and they seem to be OK with this model. 
> And yes, it's not going to work for all institutions unless you can > put the money forward upfront, or do a group purchase at the end of a > year. > > So I 100% agree, GNR doesn't really fit the model of purchasing a few > drives at a time, and the grants things is still a problem. From jhick at lbl.gov Fri Jun 19 23:18:56 2015 From: jhick at lbl.gov (Jason Hick) Date: Fri, 19 Jun 2015 15:18:56 -0700 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: <5584903A.3020203@yale.edu> References: <5584903A.3020203@yale.edu> Message-ID: <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> For the same reason (storage expansions that follow funding needs), I want a 4 or 5U embedded server/JBOD with GNR. That would allow us to simply plugin the host interfaces (2-4 of them), configure an IP addr/host name and add it as NSDs to an existing GPFS file system. As opposed to dealing with racks of storage and architectural details. Jason > On Jun 19, 2015, at 2:57 PM, Chris Hunter wrote: > > I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. > > I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. > > We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. > > A JBOD solution that allows incremental drive expansion is desirable. > > chris hunter > yale hpc group > >> From: Zachary Giles >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by >> themselves? How about GPFS Native Raid? >> >> OK, back on topic: >> Honestly, I'm really glad you said that. I have that exact problem >> also -- a researcher will be funded for xTB of space, and we are told >> by the grants office that if something is purchased on a grant it >> belongs to them and it should have a sticker put on it that says >> "property of the govt' etc etc. >> We decided to (as an institution) put the money forward to purchase a >> large system ahead of time, and as grants come in, recover the cost >> back into the system by paying off our internal "negative balance". In >> this way we can get the benefit of a large storage system like >> performance and purchasing price, but provision storage into quotas as >> needed. We can even put stickers on a handful of drives in the GSS >> tray if that makes them feel happy. >> Could they request us to hand over their drives and take them out of >> our system? Maybe. if the Grants Office made us do it, sure, I'd drain >> some pools off and go hand them over.. but that will never happen >> because it's more valuable to them in our cluster than sitting on >> their table, and I'm not going to deliver the drives full of their >> data. That's their responsibility. >> >> Is it working? Yeah, but, I'm not a grants admin nor an accountant, so >> I'll let them figure that out, and they seem to be OK with this model. >> And yes, it's not going to work for all institutions unless you can >> put the money forward upfront, or do a group purchase at the end of a >> year. >> >> So I 100% agree, GNR doesn't really fit the model of purchasing a few >> drives at a time, and the grants things is still a problem. 
> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Fri Jun 19 23:54:39 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 18:54:39 -0400 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> References: <5584903A.3020203@yale.edu> <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> Message-ID: Starting to sound like Seagate/Xyratex there. :) On Fri, Jun 19, 2015 at 6:18 PM, Jason Hick wrote: > For the same reason (storage expansions that follow funding needs), I want a 4 or 5U embedded server/JBOD with GNR. That would allow us to simply plugin the host interfaces (2-4 of them), configure an IP addr/host name and add it as NSDs to an existing GPFS file system. > > As opposed to dealing with racks of storage and architectural details. > > Jason > >> On Jun 19, 2015, at 2:57 PM, Chris Hunter wrote: >> >> I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. >> >> I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. >> >> We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. >> >> A JBOD solution that allows incremental drive expansion is desirable. >> >> chris hunter >> yale hpc group >> >>> From: Zachary Giles >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by >>> themselves? How about GPFS Native Raid? >>> >>> OK, back on topic: >>> Honestly, I'm really glad you said that. I have that exact problem >>> also -- a researcher will be funded for xTB of space, and we are told >>> by the grants office that if something is purchased on a grant it >>> belongs to them and it should have a sticker put on it that says >>> "property of the govt' etc etc. >>> We decided to (as an institution) put the money forward to purchase a >>> large system ahead of time, and as grants come in, recover the cost >>> back into the system by paying off our internal "negative balance". In >>> this way we can get the benefit of a large storage system like >>> performance and purchasing price, but provision storage into quotas as >>> needed. We can even put stickers on a handful of drives in the GSS >>> tray if that makes them feel happy. >>> Could they request us to hand over their drives and take them out of >>> our system? Maybe. if the Grants Office made us do it, sure, I'd drain >>> some pools off and go hand them over.. but that will never happen >>> because it's more valuable to them in our cluster than sitting on >>> their table, and I'm not going to deliver the drives full of their >>> data. That's their responsibility. >>> >>> Is it working? Yeah, but, I'm not a grants admin nor an accountant, so >>> I'll let them figure that out, and they seem to be OK with this model. >>> And yes, it's not going to work for all institutions unless you can >>> put the money forward upfront, or do a group purchase at the end of a >>> year. >>> >>> So I 100% agree, GNR doesn't really fit the model of purchasing a few >>> drives at a time, and the grants things is still a problem. 
>> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From bsallen at alcf.anl.gov Sat Jun 20 00:12:53 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Fri, 19 Jun 2015 23:12:53 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? In-Reply-To: <5584873B.1080109@yale.edu> References: , <5584873B.1080109@yale.edu> Message-ID: <3a261dc3-e8a4-4550-bab2-db4cc0ffbaea@alcf.anl.gov> Let me know what specific questions you have. Ben From: Chris Hunter Sent: Jun 19, 2015 4:18 PM To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? Smells of troll bait but I'll bite. "Declustered RAID" certainly has benefits for recovery of failed disks but I don't think it claims performance benefits over traditional RAID. GNR certainly has a large memory footprint. Object RAID is a close cousin that has flexbile expansion capability, depending on product packaging GNR could likely match these features. Argonne labs (Illinois USA) has done a lot with both GNR and RAID GPFS, I would be interested in their experiences. > Date: Thu, 18 Jun 2015 17:01:01 -0400 > From: Marc A Kaplan > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? > > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. chris hunter yale hpc group _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From viccornell at gmail.com Sat Jun 20 22:12:53 2015 From: viccornell at gmail.com (Vic Cornell) Date: Sat, 20 Jun 2015 22:12:53 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> Just to make sure everybody is up to date on this, (I work for DDN BTW): > On 19 Jun 2015, at 21:08, Zachary Giles wrote: > > It's comparable to other "large" controller systems. Take the DDN > 10K/12K for example: You don't just buy one more shelf of disks, or 5 > disks at a time from Walmart. You buy 5, 10, or 20 trays and populate > enough disks to either hit your bandwidth or storage size requirement. With the 12K you can buy 1,2,3,4,5,,10 or 20. With the 7700/Gs7K you can buy 1 ,2 ,3,4 or 5. GS7K comes with 2 controllers and 60 disk slots all in 4U, it saturates (with GPFS scatter) at about 160- 180 NL- SAS disks and you can concatenate as many of them together as you like. I guess the thing with GPFS is that you can pick your ideal building block and then scale with it as far as you like. > Generally changing from 5 to 10 to 20 requires support to come on-site > and recable it, and generally you either buy half or all the disks > slots worth of disks. You can start off with as few as 2 disks in a system . 
We have lots of people who buy partially populated systems and then sell on capacity to users, buying disks in groups of 10, 20 or more - thats what the flexibility of GPFS is all about, yes? > The whole system is a building block and you buy > N of them to get up to 10-20PB of storage. > GSS is the same way, there are a few models and you just buy a packaged one. > > Technically, you can violate the above constraints, but then it may > not work well and you probably can't buy it that way. > I'm pretty sure DDN's going to look at you funny if you try to buy a > 12K with 30 drives.. :) Nobody at DDN is going to look at you funny if you say you want to buy something :-). We have as many different procurement strategies as we have customers. If all you can afford with your infrastructure money is 30 drives to get you off the ground and you know that researchers/users will come to you with money for capacity down the line then a 30 drive 12K makes perfect sense. Most configs with external servers can be made to work. The embedded (12KXE, GS7K ) are a bit more limited in how you can arrange disks and put services on NSD servers but thats the tradeoff for the smaller footprint. Happy to expand on any of this on or offline. Vic > > For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save > money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with > buildin RAID, a pair of servers, and forget GNR. > Or maybe GSS22? :) > > From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 > " > Current high-density storage Models 24 and 26 remain available > Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u > JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) > 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available > 200 GB and 800 GB SSDs are also available > The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, > 26s is comprised of SSD drives or 1.2 TB hard SAS drives > " > > > On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - > IT Services) wrote: >> >> My understanding I that GSS and IBM ESS are sold as pre configured systems. >> >> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >> >> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >> >> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >> >> Simon >> ________________________________________ >> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >> Sent: 19 June 2015 19:45 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >> >> OOps... here is the official statement: >> >> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. 
GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >> >> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Zach Giles > zgiles at gmail.com > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Sat Jun 20 23:40:58 2015 From: zgiles at gmail.com (Zachary Giles) Date: Sat, 20 Jun 2015 18:40:58 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> Message-ID: All true. I wasn't trying to knock DDN or say "it can't be done", it's just (probably) not very efficient or cost effective to buy a 12K with 30 drives (as an example). The new 7700 looks like a really nice base a small building block. I had forgot about them. There is a good box for adding 4U at a time, and with 60 drives per enclosure, if you saturated it out at ~3 enclosure / 180 drives, you'd have 1PB, which is also a nice round building block size. :thumb up: On Sat, Jun 20, 2015 at 5:12 PM, Vic Cornell wrote: > Just to make sure everybody is up to date on this, (I work for DDN BTW): > >> On 19 Jun 2015, at 21:08, Zachary Giles wrote: >> >> It's comparable to other "large" controller systems. Take the DDN >> 10K/12K for example: You don't just buy one more shelf of disks, or 5 >> disks at a time from Walmart. You buy 5, 10, or 20 trays and populate >> enough disks to either hit your bandwidth or storage size requirement. > > With the 12K you can buy 1,2,3,4,5,,10 or 20. > > With the 7700/Gs7K you can buy 1 ,2 ,3,4 or 5. > > GS7K comes with 2 controllers and 60 disk slots all in 4U, it saturates (with GPFS scatter) at about 160- 180 NL- SAS disks and you can concatenate as many of them together as you like. I guess the thing with GPFS is that you can pick your ideal building block and then scale with it as far as you like. > >> Generally changing from 5 to 10 to 20 requires support to come on-site >> and recable it, and generally you either buy half or all the disks >> slots worth of disks. > > You can start off with as few as 2 disks in a system . We have lots of people who buy partially populated systems and then sell on capacity to users, buying disks in groups of 10, 20 or more - thats what the flexibility of GPFS is all about, yes? > >> The whole system is a building block and you buy >> N of them to get up to 10-20PB of storage. >> GSS is the same way, there are a few models and you just buy a packaged one. >> >> Technically, you can violate the above constraints, but then it may >> not work well and you probably can't buy it that way. >> I'm pretty sure DDN's going to look at you funny if you try to buy a >> 12K with 30 drives.. :) > > Nobody at DDN is going to look at you funny if you say you want to buy something :-). We have as many different procurement strategies as we have customers. 
If all you can afford with your infrastructure money is 30 drives to get you off the ground and you know that researchers/users will come to you with money for capacity down the line then a 30 drive 12K makes perfect sense. > > Most configs with external servers can be made to work. The embedded (12KXE, GS7K ) are a bit more limited in how you can arrange disks and put services on NSD servers but thats the tradeoff for the smaller footprint. > > Happy to expand on any of this on or offline. > > Vic > > >> >> For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save >> money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with >> buildin RAID, a pair of servers, and forget GNR. >> Or maybe GSS22? :) >> >> From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 >> " >> Current high-density storage Models 24 and 26 remain available >> Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u >> JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) >> 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available >> 200 GB and 800 GB SSDs are also available >> The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, >> 26s is comprised of SSD drives or 1.2 TB hard SAS drives >> " >> >> >> On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - >> IT Services) wrote: >>> >>> My understanding I that GSS and IBM ESS are sold as pre configured systems. >>> >>> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >>> >>> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >>> >>> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >>> >>> Simon >>> ________________________________________ >>> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >>> Sent: 19 June 2015 19:45 >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >>> >>> OOps... here is the official statement: >>> >>> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >>> >>> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? 
>>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at gpfsug.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> -- >> Zach Giles >> zgiles at gmail.com >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From chair at gpfsug.org Mon Jun 22 08:57:49 2015 From: chair at gpfsug.org (GPFS UG Chair) Date: Mon, 22 Jun 2015 08:57:49 +0100 Subject: [gpfsug-discuss] chair@GPFS UG Message-ID: Hi all, Just to follow up from Jez's email last week I'm now taking over as chair of the group. I'd like to thank Jez for his work with the group over the past couple of years in developing it to where it is now (as well as Claire who is staying on as secretary!). We're still interested in sector reps for the group, so if you are a GPFS user in a specific sector and would be interested in this, please let me know. As there haven't really been any sector reps before, we'll see how that works out, but I can't see it being a lot of work! On the US side of things, I need to catch up with Jez and Claire to see where things are up to. And finally, just as a quick head's up, we're pencilled in to have a user group mini (2hr) meeting in the UK in December as one of the breakout groups at the annual MEW event, once the dates for this are published I'll send out a save the date. If you are a user and interested in speaking, also let me know as well as anything else you might like to see there. Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From jamiedavis at us.ibm.com Mon Jun 22 14:04:23 2015 From: jamiedavis at us.ibm.com (James Davis) Date: Mon, 22 Jun 2015 13:04:23 +0000 Subject: [gpfsug-discuss] Placement Policy Installation andRDMConsiderations In-Reply-To: References: , Message-ID: <201506221305.t5MD5Owv014072@d01av05.pok.ibm.com> An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Mon Jun 22 16:28:22 2015 From: bevans at pixitmedia.com (Barry Evans) Date: Mon, 22 Jun 2015 16:28:22 +0100 Subject: [gpfsug-discuss] LROC Express Message-ID: <55882996.6050903@pixitmedia.com> Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. 
From oester at gmail.com Mon Jun 22 16:36:08 2015 From: oester at gmail.com (Bob Oesterlin) Date: Mon, 22 Jun 2015 10:36:08 -0500 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882996.6050903@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> Message-ID: It works with Standard edition, just make sure you have the right license for the nodes using LROC. Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans wrote: > Hi All, > > Very quick question for those in the know - does LROC require a standard > license, or will it work with Express? I can't find anything in the FAQ > regarding this so I presume Express is ok, but wanted to make sure. > > Regards, > Barry Evans > Technical Director > Pixit Media/ArcaStream > > > > -- > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Mon Jun 22 16:39:49 2015 From: bevans at pixitmedia.com (Barry Evans) Date: Mon, 22 Jun 2015 16:39:49 +0100 Subject: [gpfsug-discuss] LROC Express In-Reply-To: References: <55882996.6050903@pixitmedia.com> Message-ID: <55882C45.6090501@pixitmedia.com> Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: > It works with Standard edition, just make sure you have the right > license for the nodes using LROC. > > Bob Oesterlin > Nuance COmmunications > > > Bob Oesterlin > > > On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: > > Hi All, > > Very quick question for those in the know - does LROC require a > standard license, or will it work with Express? I can't find > anything in the FAQ regarding this so I presume Express is ok, but > wanted to make sure. > > Regards, > Barry Evans > Technical Director > Pixit Media/ArcaStream > > > > -- > > This email is confidential in that it is intended for the > exclusive attention of the addressee(s) indicated. If you are not > the intended recipient, this email should not be read or disclosed > to any other person. Please notify the sender immediately and > delete this email from your computer system. Any opinions > expressed are not necessarily those of the company from which this > email was sent and, whilst to the best of our knowledge no viruses > or defects exist, no responsibility can be accepted for any loss > or damage arising from its receipt or subsequent use of this email. 
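On the Express versus Standard question, one low-tech way to see what a node is actually running is to look at the installed packages, since gpfs.ext is the Standard Edition package (as Paul notes further down the thread). A rough sketch, assuming an RPM-based install:

rpm -qa | grep '^gpfs'      # gpfs.base alone suggests Express; gpfs.ext on top of it indicates Standard
mmlslicense -L              # separate question: lists the per-node client/server license designation

This only answers which edition is installed, not whether LROC is entitled on it, which is the licensing question above.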
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oester at gmail.com Mon Jun 22 16:45:33 2015 From: oester at gmail.com (Bob Oesterlin) Date: Mon, 22 Jun 2015 10:45:33 -0500 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882C45.6090501@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> <55882C45.6090501@pixitmedia.com> Message-ID: I only have a Standard Edition, so I can't say for sure. I do know it's Linux x86 only. This doesn't seem to say directly either: http://www-01.ibm.com/support/knowledgecenter/SSFKCN/gpfs4104/gpfsclustersfaq.html%23lic41?lang=en Bob Oesterlin On Mon, Jun 22, 2015 at 10:39 AM, Barry Evans wrote: > Hi Bob, > > Thanks for this, just to confirm does this mean that it *does not* work > with express? > > Cheers, > Barry > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Paul.Sanchez at deshaw.com Mon Jun 22 23:57:10 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Mon, 22 Jun 2015 22:57:10 +0000 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882C45.6090501@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> <55882C45.6090501@pixitmedia.com> Message-ID: <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. 
For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss [http://www.pixitmedia.com/sig/sig-cio.jpg] This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... 
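As an aside on the readReplicaPolicy point above, that behaviour is a cluster configuration option rather than anything per file system; a minimal sketch follows (whether the change takes effect without a daemon restart should be checked for your release):

    # prefer the replica with the lowest measured read latency
    # (the other documented values are "default" and "local")
    mmchconfig readReplicaPolicy=fastest

    # confirm the current setting
    mmlsconfig readReplicaPolicy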
URL: From oehmes at us.ibm.com Tue Jun 23 00:14:09 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Mon, 22 Jun 2015 16:14:09 -0700 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> References: <55882996.6050903@pixitmedia.com><55882C45.6090501@pixitmedia.com> <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> Message-ID: <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> Hi Paul, just out of curiosity, not that i promise anything, but would it be enough to support include/exclude per fileset level or would we need path and/or extension or even more things like owner of files as well ? Sven ------------------------------------------ Sven Oehme Scalable Storage Research email: oehmes at us.ibm.com Phone: +1 (408) 824-8904 IBM Almaden Research Lab ------------------------------------------ From: "Sanchez, Paul" To: gpfsug main discussion list Date: 06/22/2015 03:57 PM Subject: Re: [gpfsug-discuss] LROC Express Sent by: gpfsug-discuss-bounces at gpfsug.org I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [ mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. 
Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Paul.Sanchez at deshaw.com Tue Jun 23 15:10:31 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Tue, 23 Jun 2015 14:10:31 +0000 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> References: <55882996.6050903@pixitmedia.com><55882C45.6090501@pixitmedia.com> <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> Message-ID: <201D6001C896B846A9CFC2E841986AC145418BC0@mailnycmb2a.winmail.deshaw.com> Hi Sven, Yes, I think that fileset level include/exclude would be sufficient for us. It also begs the question about the same for write caching. We haven?t experimented with it yet, but are looking forward to employing HAWC for scratch-like workloads. Do you imagine providing the same sort of HAWC bypass include/exclude to be part of this? That might be useful for excluding datasets where the write ingest rate isn?t massive and the degree of risk we?re comfortable with potential data recovery issues in the face of complex outages may be much lower. 
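For reference, and only as a sketch of how the 4.1.1 documentation describes HAWC (the file system name gpfs01 is a placeholder, the 64K threshold is illustrative, and the option name should be verified against the 4.1.1 manuals before use), write caching is currently switched on per file system rather than per fileset:

    # allow synchronous writes up to 64K to be hardened in the recovery log
    # (the log itself should live on fast storage, e.g. a system.log pool)
    mmchfs gpfs01 --write-cache-threshold 64K

    # review the file system attributes afterwards
    mmlsfs gpfs01

That per-file-system granularity is exactly why the fileset-level include/exclude being discussed here would be useful.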
Thanks, Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Sven Oehme Sent: Monday, June 22, 2015 7:14 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Paul, just out of curiosity, not that i promise anything, but would it be enough to support include/exclude per fileset level or would we need path and/or extension or even more things like owner of files as well ? Sven ------------------------------------------ Sven Oehme Scalable Storage Research email: oehmes at us.ibm.com Phone: +1 (408) 824-8904 IBM Almaden Research Lab ------------------------------------------ [Inactive hide details for "Sanchez, Paul" ---06/22/2015 03:57:29 PM---I can?t confirm whether it works with Express, since we]"Sanchez, Paul" ---06/22/2015 03:57:29 PM---I can?t confirm whether it works with Express, since we?re also running standard. But as a simple t From: "Sanchez, Paul" > To: gpfsug main discussion list > Date: 06/22/2015 03:57 PM Subject: Re: [gpfsug-discuss] LROC Express Sent by: gpfsug-discuss-bounces at gpfsug.org ________________________________ I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. 
Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 105 bytes Desc: image001.gif URL: From ewahl at osc.edu Tue Jun 23 15:11:11 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Tue, 23 Jun 2015 14:11:11 +0000 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> References: , <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> FYI this page causes problems with various versions of Chrome and Firefox (too lazy to test other browsers, sorry) Seems to be a javascript issue. Huge surprise, right? I've filed bugs on the browser sides for FF, don't care about chrome sorry. Ed Wahl OSC ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Wahl, Edward [ewahl at osc.edu] Sent: Monday, June 15, 2015 4:35 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 4.1.1 fix central location When I navigate using these instructions I can find the fixes, but attempting to get to them at the last step results in a loop back to the SDN screen. :( Not sure if this is the page, lack of the "proper" product in my supported products (still lists 3.5 as our product) or what. 
Ed ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ross.keeping at uk.ibm.com] Sent: Monday, June 15, 2015 12:43 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] 4.1.1 fix central location Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone: (+44 161) 8362381-Line: 37642381 E-mail: ross.keeping at uk.ibm.com [IBM] 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: ATT00001.gif URL: From oester at gmail.com Tue Jun 23 15:16:10 2015 From: oester at gmail.com (Bob Oesterlin) Date: Tue, 23 Jun 2015 09:16:10 -0500 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Try here: http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%2Bdefined%2Bstorage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=4.1.1&platform=Linux+64-bit,x86_64&function=all Bob Oesterlin On Tue, Jun 23, 2015 at 9:11 AM, Wahl, Edward wrote: > FYI this page causes problems with various versions of Chrome and > Firefox (too lazy to test other browsers, sorry) Seems to be a javascript > issue. Huge surprise, right? > > I've filed bugs on the browser sides for FF, don't care about chrome > sorry. > > Ed Wahl > OSC > > > ------------------------------ > *From:* gpfsug-discuss-bounces at gpfsug.org [ > gpfsug-discuss-bounces at gpfsug.org] on behalf of Wahl, Edward [ > ewahl at osc.edu] > *Sent:* Monday, June 15, 2015 4:35 PM > *To:* gpfsug main discussion list > *Subject:* Re: [gpfsug-discuss] 4.1.1 fix central location > > When I navigate using these instructions I can find the fixes, but > attempting to get to them at the last step results in a loop back to the > SDN screen. :( > > Not sure if this is the page, lack of the "proper" product in my supported > products (still lists 3.5 as our product) > or what. 
> > Ed > > ------------------------------ > *From:* gpfsug-discuss-bounces at gpfsug.org [ > gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ > ross.keeping at uk.ibm.com] > *Sent:* Monday, June 15, 2015 12:43 PM > *To:* gpfsug-discuss at gpfsug.org > *Subject:* [gpfsug-discuss] 4.1.1 fix central location > > Hi > > IBM successfully released 4.1.1 on Friday with the Spectrum Scale > re-branding and introduction of protocols etc. > > However, I initially had trouble finding the PTF - rest assured it does > exist. > > You can find the 4.1.1 main download here: > http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 > > There is a new area in Fix Central that you will need to navigate to get > the PTF for upgrade: > https://scwebtestt.rchland.ibm.com/support/fixcentral/ > 1) Set Product Group --> Systems Storage > 2) Set Systems Storage --> Storage software > 3) Set Storage Software --> Software defined storage > 4) Installed Version defaults to 4.1.1 > 5) Select your platform > > If you try and find the PTF via other links or sections of Fix Central you > will likely be disappointed. Work is ongoing to ensure this becomes more > intuitive - any thoughts for improvements always welcome. > > Best regards, > > *Ross Keeping* > IBM Spectrum Scale - Development Manager, People Manager > IBM Systems UK - Manchester Development Lab *Phone:* (+44 161) 8362381 > *-Line:* 37642381 > * E-mail: *ross.keeping at uk.ibm.com > [image: IBM] > 3rd Floor, Maybrook House > Manchester, M3 2EG > United Kingdom > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: not available URL: From orlando.richards at ed.ac.uk Wed Jun 24 12:27:25 2015 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Wed, 24 Jun 2015 12:27:25 +0100 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: References: Message-ID: <558A941D.2010206@ed.ac.uk> Hi all, I'm looking to deploy to RedHat 7.1, but from the GPFS FAQ only versions 4.1.1 and 3.5.0-26 are supported. I can't see a release of 3.5.0-26 on the fix central website - does anyone know if this is available? Will 3.5.0-25 work okay on RH7.1? How about 4.1.0-x - any plans to support that on RH7.1? ------- Orlando. On 16/06/15 09:11, Simon Thompson (Research Computing - IT Services) wrote: > The docs also now seem to be in Spectrum Scale section at: > > http://www-01.ibm.com/support/knowledgecenter/#!/STXKQY/411/ibmspectrumscale411_welcome.html > > Simon > > From: Ross Keeping3 > > Reply-To: gpfsug main discussion list > > Date: Monday, 15 June 2015 17:43 > To: "gpfsug-discuss at gpfsug.org " > > > Subject: [gpfsug-discuss] 4.1.1 fix central location > > Hi > > IBM successfully released 4.1.1 on Friday with the Spectrum Scale > re-branding and introduction of protocols etc. > > However, I initially had trouble finding the PTF - rest assured it does > exist. 
> > You can find the 4.1.1 main download here: > http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 > > There is a new area in Fix Central that you will need to navigate to get > the PTF for upgrade: > https://scwebtestt.rchland.ibm.com/support/fixcentral/ > 1) Set Product Group --> Systems Storage > 2) Set Systems Storage --> Storage software > 3) Set Storage Software --> Software defined storage > 4) Installed Version defaults to 4.1.1 > 5) Select your platform > > If you try and find the PTF via other links or sections of Fix Central > you will likely be disappointed. Work is ongoing to ensure this becomes > more intuitive - any thoughts for improvements always welcome. > > Best regards, > > *Ross Keeping* > IBM Spectrum Scale - Development Manager, People Manager > IBM Systems UK - Manchester Development Lab > *Phone:*(+44 161) 8362381*-Line:*37642381* > E-mail: *ross.keeping at uk.ibm.com > IBM > > 3rd Floor, Maybrook House > > Manchester, M3 2EG > > United Kingdom > > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Research Services Manager Information Services IT Infrastructure Division Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From chris.hunter at yale.edu Wed Jun 24 18:26:11 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Wed, 24 Jun 2015 13:26:11 -0400 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs Message-ID: <558AE833.6070803@yale.edu> Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. regards, chris hunter yale hpc group From ewahl at osc.edu Wed Jun 24 18:47:19 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Wed, 24 Jun 2015 17:47:19 +0000 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: <558AE833.6070803@yale.edu> References: <558AE833.6070803@yale.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5A174@CIO-KRC-D1MBX02.osuad.osu.edu> Both are available to you directly. in Linux anyway. My AIX knowledge is decades old. And yes, the HBAs have much more availability/data of course. What kind of monitoring are you looking to do? Fault? Take the data and ?? nagios/cactii/ganglia/etc? Mine it with Splunk? Expand the GPFS Monitor suite? sourceforge.net/projects/gpfsmonitorsuite (though with sourceforge lately, perhaps we should ask Pam et al. to move them?) Ed Wahl OSC ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Chris Hunter [chris.hunter at yale.edu] Sent: Wednesday, June 24, 2015 1:26 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. 
regards, chris hunter yale hpc group _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From bsallen at alcf.anl.gov Wed Jun 24 18:48:47 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Wed, 24 Jun 2015 17:48:47 +0000 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: <558AE833.6070803@yale.edu> References: <558AE833.6070803@yale.edu> Message-ID: <64F35432-4DFC-452C-8965-455BCF7E2F09@alcf.anl.gov> Checkout https://github.com/leibler/check_mk-sas2ircu. This is obviously check_mk specific, but a reasonable example. Ben > On Jun 24, 2015, at 12:26 PM, Chris Hunter wrote: > > Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? > > We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. > > regards, > chris hunter > yale hpc group > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chris.hunter at yale.edu Mon Jun 29 17:07:12 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Mon, 29 Jun 2015 12:07:12 -0400 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: References: Message-ID: <55916D30.3050709@yale.edu> Thanks for the info. We settled on a simpler perl wrapper around sas2ircu form nagios exchange. chris hunter yale hpc group > Checkout https://github.com/leibler/check_mk-sas2ircu This is obviously check_mk specific, but a reasonable example. Ben >> On Jun 24, 2015, at 12:26 PM, Chris Hunter wrote: >> >> Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? >> >> We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. >> >> regards, >> chris hunter >> yale hpc group From st.graf at fz-juelich.de Tue Jun 30 07:54:18 2015 From: st.graf at fz-juelich.de (Graf, Stephan) Date: Tue, 30 Jun 2015 06:54:18 +0000 Subject: [gpfsug-discuss] ESS/GSS GUI (Monitoring) Message-ID: <38A0607912A90F4880BDE29022E093054087CF1A@MBX2010-E01.ad.fz-juelich.de> Hi! If anyone is interested in a simple GUI for GSS/ESS we have one developed for our own (in the time when there was no GUI available). It is java based and the only requirement is to have passwordless access to the GSS nodes. (We start the GUI on our xCAT server). I have uploaded some screenshots: https://www.dropbox.com/sh/44kln4h7wgp18uu/AADsllhSxOdIeWtkNSaftu8Sa?dl=0 Stephan ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... 
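Along the lines of the perl wrapper mentioned above, a minimal shell sketch of the same idea is below. It assumes the LSI sas2ircu utility is installed, that controller 0 sits in front of the JBODs, and that "Ready (RDY)" and "Optimal (OPT)" are the only healthy drive states you expect; adjust the controller number and the state list for your enclosures.

    #!/bin/bash
    # crude drive-state check around sas2ircu for Nagios-style monitoring
    CTRL=0
    OUT=$(sas2ircu "$CTRL" DISPLAY) || { echo "CRITICAL: sas2ircu failed"; exit 2; }

    # count physical devices reporting a state other than Ready/Optimal
    BAD=$(echo "$OUT" | grep "State" \
          | grep -v -c -e "Ready (RDY)" -e "Optimal (OPT)")

    if [ "$BAD" -gt 0 ]; then
        echo "WARNING: $BAD device(s) not in RDY/OPT state"
        exit 1
    fi
    echo "OK: all devices RDY/OPT"
    exit 0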
From Daniel.Vogel at abcsystems.ch Tue Jun 30 08:49:10 2015 From: Daniel.Vogel at abcsystems.ch (Daniel Vogel) Date: Tue, 30 Jun 2015 07:49:10 +0000 Subject: [gpfsug-discuss] GPFS 4.1.1 without QoS for mmrestripefs? Message-ID: <2CDF270206A255459AC4FA6B08E52AF90114634DD0@ABCSYSEXC1.abcsystems.ch> 

Hi, Years ago IBM made plans to implement QoS for mmrestripefs, mmdeldisk and similar commands. If an mmrestripefs is running, NFS access performance becomes very poor. I opened a PMR to ask for QoS in version 4.1.1 (Spectrum Scale). PMR 61309,113,848: "I discussed the question of QOS with the development team. These command changes that were noticed are not meant to be used as GA code, which is why they are not documented. I cannot provide any further information from the support perspective." Does anybody know more about QoS? The last hope was the "GPFS Workshop Stuttgart März 2015" with Sven Oehme as speaker. 

Daniel Vogel IT Consultant ABC SYSTEMS AG Hauptsitz Zürich Rütistrasse 28 CH - 8952 Schlieren T +41 43 433 6 433 D +41 43 433 6 467 http://www.abcsystems.ch ABC - Always Better Concepts. Approved By Customers since 1981.
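On the mmrestripefs impact itself, until a real QoS feature appears the usual mitigations are to limit which nodes take part in the restripe and to reduce the per-node parallelism. A sketch is below; the file system name gpfs01 and the node class restripeNodes are placeholders, and the pitWorkerThreadsPerNode value is purely illustrative and should be checked against your release before changing it.

    # run the rebalance only from a small set of otherwise quiet nodes
    mmrestripefs gpfs01 -b -N restripeNodes

    # optionally reduce the parallel worker threads on those nodes first
    mmchconfig pitWorkerThreadsPerNode=2 -N restripeNodes

Keeping the restripe off the NSD servers that carry the NFS traffic is usually the single biggest help.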
> > The unlinking of the fileset worries me for the reasons stated previously. > > From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward > Sent: 15 June 2015 15:00 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] OpenStack Manila Driver > > Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. > > I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? > > Ed Wahl > OSC > > > > ++ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] > Sent: Monday, June 15, 2015 4:35 AM > To: gpfsug-discuss at gpfsug.org > Subject: [gpfsug-discuss] OpenStack Manila Driver > > Dear All, > > We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... > > Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. > > > Has this scenario been addressed at all? > > Cheers, > Luke. > > > Luke Raimbach? > Senior HPC Data and Storage Systems Engineer > The Francis Crick Institute > Gibbs Building > 215 Euston Road > London NW1 2BE > > E: luke.raimbach at crick.ac.uk > W: www.crick.ac.uk > > The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. > The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From S.J.Thompson at bham.ac.uk Tue Jun 16 09:46:52 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 16 Jun 2015 08:46:52 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: I didn;t think that the *current* Manilla driver user GPFS protocol, but sat on top of Ganesha server. Simon On 16/06/2015 09:41, "Adam Huffman" wrote: > >The presentation of shared storage by Manila isn?t necessarily via NFS. >Some of the drivers, I believe the GPFS one amongst them, allow some form >of native connection either via the guest or via a VirtFS connection to >the client on the hypervisor. 
> >Best Wishes, >Adam > > >? > > > > > >> On 16 Jun 2015, at 08:36, Luke Raimbach >>wrote: >> >> So as I understand things, Manila is an OpenStack component which >>allows tenants to create and destroy shares for their instances which >>would be accessed over NFS. Perhaps I?ve not done enough research in to >>this though ? I?m also not an OpenStack expert. >> >> The tenants don?t have root access to the file system, but the Manila >>component must act as a wrapper to file system administrative >>equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares >>are created as GPFS filesets which are then presented over NFS. >> >> The unlinking of the fileset worries me for the reasons stated >>previously. >> >> From: gpfsug-discuss-bounces at gpfsug.org >>[mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward >> Sent: 15 June 2015 15:00 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] OpenStack Manila Driver >> >> Perhaps I misunderstand here, but if the tenants have administrative >>(ie:root) privileges to the underlying file system management commands I >>think mmunlinkfileset might be a minor concern here. There are FAR more >>destructive things that could occur. >> >> I am not an OpenStack expert and I've not even looked at anything past >>Kilo, but my understanding was that these commands were not necessary >>for tenants. They access a virtual block device that backs to GPFS, >>correct? >> >> Ed Wahl >> OSC >> >> >> >> ++ >> From: gpfsug-discuss-bounces at gpfsug.org >>[gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach >>[Luke.Raimbach at crick.ac.uk] >> Sent: Monday, June 15, 2015 4:35 AM >> To: gpfsug-discuss at gpfsug.org >> Subject: [gpfsug-discuss] OpenStack Manila Driver >> >> Dear All, >> >> We are looking forward to using the manila driver for auto-provisioning >>of file shares using GPFS. However, I have some concerns... >> >> Manila presumably gives tenant users access to file system commands >>like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset >>quiesces the file system, there is potentially an impact from one tenant >>on another - i.e. someone unlinking and deleting a lot of filesets >>during a tenancy cleanup might cause a cluster pause long enough to >>trigger other failure events or even start evicting nodes. You can see >>why this would be bad in a cloud environment. >> >> >> Has this scenario been addressed at all? >> >> Cheers, >> Luke. >> >> >> Luke Raimbach? >> Senior HPC Data and Storage Systems Engineer >> The Francis Crick Institute >> Gibbs Building >> 215 Euston Road >> London NW1 2BE >> >> E: luke.raimbach at crick.ac.uk >> W: www.crick.ac.uk >> >> The Francis Crick Institute Limited is a registered charity in England >>and Wales no. 1140062 and a company registered in England and Wales no. >>06885462, with its registered office at 215 Euston Road, London NW1 2BE. >> The Francis Crick Institute Limited is a registered charity in England >>and Wales no. 1140062 and a company registered in England and Wales no. >>06885462, with its registered office at 215 Euston Road, London NW1 2BE. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > >The Francis Crick Institute Limited is a registered charity in England >and Wales no. 1140062 and a company registered in England and Wales no. >06885462, with its registered office at 215 Euston Road, London NW1 2BE. 
>_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at gpfsug.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Luke.Raimbach at crick.ac.uk Tue Jun 16 09:48:50 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Tue, 16 Jun 2015 08:48:50 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> Message-ID: [SNIP] >> The tenants don?t have root access to the file system, but the Manila >> component must act as a wrapper to file system administrative >> equivalents like mmcrfileset, mmdelfileset, link and unlink. The >> shares are created as GPFS filesets which are then presented over NFS. >> > What makes you think the it creates filesets as opposed to just sharing out a normal directory? I had a quick peruse over the documentation and source code and saw no mention of filesets, though I could have missed it. I think you are right. Looking over the various resources I have available, the creation, deletion, linking and unlinking of filesets is not implemented, but commented on as needing to be done. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From jonathan at buzzard.me.uk Tue Jun 16 10:25:45 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 16 Jun 2015 10:25:45 +0100 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> Message-ID: <1434446745.15671.134.camel@buzzard.phy.strath.ac.uk> On Tue, 2015-06-16 at 08:48 +0000, Luke Raimbach wrote: [SNIP] > I think you are right. Looking over the various resources I have > available, the creation, deletion, linking and unlinking of filesets is > not implemented, but commented on as needing to be done. That's going to be a right barrel of laughs as reliability goes out the window if they do implement it. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From billowen at us.ibm.com Thu Jun 18 05:22:25 2015 From: billowen at us.ibm.com (Bill Owen) Date: Wed, 17 Jun 2015 22:22:25 -0600 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Hi Luke, Your explanation below is correct, with some minor clarifications Manila is an OpenStack project which allows storage admins to create and destroy filesystem shares and make those available to vm instances and bare metal servers which would be accessed over NFS. The Manila driver runs in the control plane and creates a new gpfs independent fileset for each new share. It provides automation for giving vm's (and also bare metal servers) acces to the shares so that they can mount and use the share. There is work being done to allow automating the mount process when the vm instance boots. The tenants don?t have root access to the file system, but the Manila component acts as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. 
The manila driver uses the following gpfs commands: When a share is created: mmcrfileset mmlinkfileset mmsetquota When a share is deleted: mmunlinkfileset mmdelfileset Snapshots of shares can be created and deleted: mmcrsnapshot mmdelsnapshot Today, the GPFS Manila driver supports creating NFS exports to VMs. We are considering adding native GPFS client support in the VM, but not sure if the benefit justifies the extra complexity of having gpfs client in vm image, and also the impact to cluster as vm's come up and down in a more dynamic way than physical nodes. For multi-tenant deployments, we recommend using a different filesystem per tenant to provide better separation of data, and to minimize the "noisy neighbor" effect for operations like mmunlinkfileset. Here is a presentation that shows an overview of the GPFS Manila driver: (See attached file: OpenStack_Storage_Manila_with_GPFS.pdf) Perhaps this, and other GPFS & OpenStack topics could be the subject of a future user group session. Regards, Bill Owen billowen at us.ibm.com GPFS and OpenStack 520-799-4829 From: Luke Raimbach To: gpfsug main discussion list Date: 06/16/2015 12:37 AM Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Sent by: gpfsug-discuss-bounces at gpfsug.org So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances which would be accessed over NFS. Perhaps I?ve not done enough research in to this though ? I?m also not an OpenStack expert. The tenants don?t have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. The unlinking of the fileset worries me for the reasons stated previously. From: gpfsug-discuss-bounces at gpfsug.org [ mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward Sent: 15 June 2015 15:00 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? Ed Wahl OSC ++ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? 
Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenStack_Storage_Manila_with_GPFS.pdf Type: application/pdf Size: 354887 bytes Desc: not available URL: From Luke.Raimbach at crick.ac.uk Thu Jun 18 13:30:40 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Thu, 18 Jun 2015 12:30:40 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations Message-ID: Hi All, Something I am thinking about doing is utilising the placement policy engine to insert custom metadata tags upon file creation, based on which fileset the creation occurs in. This might be to facilitate Research Data Management tasks that could happen later in the data lifecycle. I am also thinking about allowing users to specify additional custom metadata tags (maybe through a fancy web interface) and also potentially give users control over creating new filesets (e.g. for scientists running new experiments). So? pretend this is a placement policy on my GPFS driven data-ingest platform: RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' The fileset name can be meaningless (as far as the user is concerned), but would be linked somewhere nice that they recognise ? say /gpfs/incoming/instrument1. The fileset, when it is created, would also be an AFM cache for its ?home? counterpart which exists on a much larger (also GPFS driven) pool of storage? so that my metadata tags are preserved, you see. This potentially user driven activity might look a bit like this: - User logs in to web interface and creates new experiment - Filesets (system-generated names) are created on ?home? and ?ingest? file systems and linked into the directory namespace wherever the user specifies - AFM relationships are set up and established for the ingest (cache) fileset to write back to the AFM home fileset (probably Independent Writer mode) - A set of ?default? 
policies are defined and installed on the cache file system to tag data for that experiment (the user can?t change these) - The user now specifies additional metadata tags they want added to their experiment data (some of this might be captured through additional mandatory fields in the web form for instance) - A policy for later execution by mmapplypolicy on the AFM home file system is created which looks for the tags generated at ingest-time and applies the extra user-defined tags There?s much more that would go on later in the lifecycle to take care of automated HSM tiering, data publishing, movement and cataloguing of data onto external non GPFS file systems, etc. but I won?t go in to it here. My GPFS related questions are: When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. What is the specific limitation for having a policy placement file no larger than 1MB? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Thu Jun 18 14:18:34 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 09:18:34 -0400 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: Yes, you can do this. In release 4.1.1 you can write SET POOL 'x' ACTION(setXattr(...)) FOR FILESET(...) WHERE ... which looks nicer to some people than WHERE ( ... ) AND setXattr(...) Answers: (1) No need to quiesce. As the new policy propagates, nodes begin using it. So there can be a transition time when node A may be using the new policy but Node B has not started using it yet. If that is undesirable, you can quiesce. (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. --marc of GPFS From: Luke Raimbach ... RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' ... (1) When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. ... (2) What is the specific limitation for having a policy placement file no larger than 1MB? ... -------------- next part -------------- An HTML attachment was scrubbed... 
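To make the ACTION() form concrete, a sketch of the ingest rule rewritten that way, together with the validate and install steps, might look like the following. The file system name, policy file path and attribute values are placeholders, and AND-ing two setXattr() calls inside ACTION() is assumed to behave the same as chaining them in the WHERE clause of the original rule:

cat > /var/mmfs/etc/ingest.pol <<'EOF'
RULE 'RDMTEST'
  SET POOL 'instruments'
  ACTION(setXattr('user.rdm.parent','<parent-uuid>') AND
         setXattr('user.rdm.ingestor','<ingestor-uuid>'))
  FOR FILESET('<ingest-fileset>')
RULE 'DEFAULT' SET POOL 'data'
EOF

# Validate the rules without installing them, then install; mmchpolicy distributes them to the cluster
mmchpolicy ingestfs /var/mmfs/etc/ingest.pol -I test
mmchpolicy ingestfs /var/mmfs/etc/ingest.pol
mmlspolicy ingestfs -L

# A file subsequently created in the fileset should carry the tags as user.* extended attributes
getfattr -d -m 'user.rdm' /gpfs/ingestfs/instrument1/newfile.dat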
URL: From S.J.Thompson at bham.ac.uk Thu Jun 18 14:27:52 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Thu, 18 Jun 2015 13:27:52 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: I can see exactly where Luke?s suggestion would be applicable. We might have several hundred active research projects which would have some sort of internal identifier, so I can see why you?d want to do this sort of tagging as it would allow a policy scan to find files related to specific projects (for example). Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Thursday, 18 June 2015 14:18 To: gpfsug main discussion list >, "luke.raimbach at crick.ac.uk" > Subject: Re: [gpfsug-discuss] Placement Policy Installation and RDM Considerations (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Luke.Raimbach at crick.ac.uk Thu Jun 18 14:35:32 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Thu, 18 Jun 2015 13:35:32 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: Hi Marc, Thanks for the pointer to the updated syntax. That indeed looks nicer. (1) Asynchronous policy propagation sounds good in our scenario. We don?t want to potentially interrupt other running experiments by having to quiesce the filesystem for a new one coming online. It is useful to know that you could quiesce if desired. Presumably this is a secret flag one might pass to mmchpolicy? (2) I was concerned about the evaluation time if I tried to set all extended attributes at creation time. That?s why I thought about adding a few ?system? defined tags which could later be used to link the files to an asynchronously applied policy on the home cluster. I think I calculated around 4,000 rules (dependent on the size of the attribute names and values), which might limit the number of experiments supported on a single ingest file system. However, I can?t envisage we will ever have 4,000 experiments running at once! I was really interested in why the limitation existed from a file-system architecture point of view. Thanks for the responses. Luke. From: Marc A Kaplan [mailto:makaplan at us.ibm.com] Sent: 18 June 2015 14:19 To: gpfsug main discussion list; Luke Raimbach Subject: Re: [gpfsug-discuss] Placement Policy Installation and RDM Considerations Yes, you can do this. In release 4.1.1 you can write SET POOL 'x' ACTION(setXattr(...)) FOR FILESET(...) WHERE ... which looks nicer to some people than WHERE ( ... ) AND setXattr(...) Answers: (1) No need to quiesce. As the new policy propagates, nodes begin using it. So there can be a transition time when node A may be using the new policy but Node B has not started using it yet. If that is undesirable, you can quiesce. (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... 
So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. --marc of GPFS From: Luke Raimbach > ... RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' ... (1) When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. ... (2) What is the specific limitation for having a policy placement file no larger than 1MB? ... The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo_fumagalli at it.ibm.com Thu Jun 18 14:38:00 2015 From: massimo_fumagalli at it.ibm.com (Massimo Fumagalli) Date: Thu, 18 Jun 2015 15:38:00 +0200 Subject: [gpfsug-discuss] ILM question Message-ID: Please, I need to know a simple question. Using Spectrum Scale 4.1.1, supposing to set ILM policy for migrating files from Filesystem Tier0 to TIer 1 or Tier2 (example using LTFS to library). Then we need to read a file that has been moved to library (or Tier1). Will be file copied back to Tier 0? Or read will be executed directly from Library or Tier1 ? since there can be performance issue Regards Max IBM Italia S.p.A. Sede Legale: Circonvallazione Idroscalo - 20090 Segrate (MI) Cap. Soc. euro 347.256.998,80 C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153 Societ? con unico azionista Societ? soggetta all?attivit? di direzione e coordinamento di International Business Machines Corporation (Salvo che sia diversamente indicato sopra / Unless stated otherwise above) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 10661 bytes Desc: not available URL: From ewahl at osc.edu Thu Jun 18 15:08:29 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Thu, 18 Jun 2015 14:08:29 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> We had a unique situation with one of our many storage arrays occur in the past couple of days and it brought up a question I've had before. Is there a better way to disable a Storage Pool by itself rather than 'mmchdisk stop' the entire list of disks from that pool or mmfsctl and exclude things, etc? Thoughts? In our case our array lost all raid protection in a certain pool (8+2) due to a hardware failure, and started showing drive checkcondition errors on other drives in the array. Yikes! This pool itself is only about 720T and is backed by tape, but who wants to restore that? Even with SOBAR/HSM that would be a loooong week. ^_^ We made the decision to take the entire file system offline during the repair/rebuild, but I would like to have run all other pools save this one in a simpler manner than we took to get there. I'm interested in people's experiences here for future planning and disaster recovery. 
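For reference, the "stop the whole list of disks from that pool" route mentioned above comes down to something like the sketch below. The file system and pool names are invented, and it assumes the storage pool shows up as the last column of mmlsdisk output and that the pool holds no metadata for other pools:

# Collect the NSDs belonging to one data-only pool and stop I/O to just those disks
# ('stop' makes the disks unavailable; 'suspend' only prevents new block allocation)
disks=$(mmlsdisk gpfs01 | awk '$NF == "badpool" {print $1}' | paste -s -d';')
mmchdisk gpfs01 stop -d "$disks"

# Once the array is healthy again, bring the disks in that pool back
mmchdisk gpfs01 start -d "$disks"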
GPFS itself worked exactly as we had planned and expected but I think there is room to improve the file system if I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. I may not be expressing myself here in the best manner possible. Bit of sleep deprivation after the last couple of days. ;) Ed Wahl OSC From Paul.Sanchez at deshaw.com Thu Jun 18 15:52:07 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Thu, 18 Jun 2015 14:52:07 +0000 Subject: [gpfsug-discuss] Member locations for Dev meeting organisation In-Reply-To: <557ED229.50902@gpfsug.org> References: <557ED229.50902@gpfsug.org> Message-ID: <201D6001C896B846A9CFC2E841986AC1454124B2@mailnycmb2a.winmail.deshaw.com> Thanks Jez, D. E. Shaw is based in New York, NY. We have 3-4 engineers/architects who would attend. Additionally, if you haven't heard from D. E. Shaw Research, they're next-door and have another 2. -Paul Sanchez Sent with Good (www.good.com) ________________________________ From: gpfsug-discuss-bounces at gpfsug.org on behalf of Jez Tucker (Chair) Sent: Monday, June 15, 2015 9:24:57 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Member locations for Dev meeting organisation Hello all It would be very handy if all members could send me an email to: chair at gpfsug.org with the City and Country in which you are located. We're looking to place 'Meet the Devs' coffee-shops close to you, so this would make planning several orders of magnitude easier. I can infer from each member's email, but it's only 'mostly accurate'. Stateside members - we're actively organising a first meet up near you imminently, so please ping me your locations. All the best, Jez (Chair) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Thu Jun 18 16:36:49 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 11:36:49 -0400 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: (1) There is no secret flag. I assume that the existing policy is okay but the new one is better. So start using the better one ASAP, but why stop the system if you don't have to? The not secret way to quiesce/resume a filesystem without unmounting is fsctl {suspend | suspend-write | resume}; (2) The policy rules text is passed as a string through a GPFS rpc protocol (not a standard RPC) and the designer/coder chose 1MB as a safety-limit. I think it could be increased, but suppose you did have 4000 rules, each 200 bytes - you'd be at 800KB, still short of the 1MB limit. (x) Personally, I wouldn't worry much about setting, say 10 extended attribute values in each rule. I'd worry more about the impact of having 100s of rules. (y) When designing/deploying a new GPFS filesystem, consider explicitly setting the inode size so that all anticipated extended attributes will be stored in the inode, rather than spilling into other disk blocks. See mmcrfs ... -i InodeSize. You can build a test filesystem with just one NSD/LUN and test your anticipated usage. Use tsdbfs ... xattr ... to see how EAs are stored. Caution: tsdbfs display commands are harmless, BUT there are some patch and patch-like subcommands that could foul up your filesystem. From: Luke Raimbach Hi Marc, Thanks for the pointer to the updated syntax. 
That indeed looks nicer. (1) Asynchronous policy propagation sounds good in our scenario. We don?t want to potentially interrupt other running experiments by having to quiesce the filesystem for a new one coming online. It is useful to know that you could quiesce if desired. Presumably this is a secret flag one might pass to mmchpolicy? (2) I was concerned about the evaluation time if I tried to set all extended attributes at creation time. That?s why I thought about adding a few ?system? defined tags which could later be used to link the files to an asynchronously applied policy on the home cluster. I think I calculated around 4,000 rules (dependent on the size of the attribute names and values), which might limit the number of experiments supported on a single ingest file system. However, I can?t envisage we will ever have 4,000 experiments running at once! I was really interested in why the limitation existed from a file-system architecture point of view. Thanks for the responses. Luke. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgiles at gmail.com Thu Jun 18 17:02:54 2015 From: zgiles at gmail.com (Zachary Giles) Date: Thu, 18 Jun 2015 12:02:54 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Sorry to hear about the problems you've had recently. It's frustrating when that happens. I didn't have the exactly the same situation, but we had something similar which may bring some light to a missing disks situation: We had a dataOnly storage pool backed by a few building blocks each consisting of several RAID controllers that were each direct attached to a few servers. We had several of these sets all in one pool. Thus, if a server failed it was fine, if a single link failed, it was fine. Potentially we could do copies=2 and have multiple failure groups in a single pool. If anything in the RAID arrays themselves failed, it was OK, but a single whole RAID controller going down would take that section of disks down. The number of copies was set to 1 on this pool. One RAID controller went down, but the file system as a whole stayed online. Our user experience was that Some users got IO errors of a "file inaccessible" type (I don't remember the exact code). Other users, and especially those mostly in other tiers continued to work as normal. As we had mostly small files across this tier ( much smaller than the GPFS block size ), most of the files were in one of the RAID controllers or another, thus not striping really, so even the files in other controllers on the same tier were also fine and accessible. Bottom line is: Only the files that were missing gave errors, the others were fine. Additionally, for missing files errors were reported which apps could capture and do something about, wait, or retry later -- not a D state process waiting forever or stale file handles. I'm not saying this is the best way. We didn't intend for this to happen. I suspect that stopping the disk would result in a similar experience but more safely. We asked GPFS devs if we needed to fsck after this since the tier just went offline directly and we continued to use the rest of the system while it was gone.. they said no it should be fine and missing blocks will be taken care of. 
I assume this is true, but I have no explicit proof, except that it's still working and nothing seemed to be missing. I guess some questions for the dev's would be: * Is this safe / advisable to do the above either directly or via a stop and then down the array? * Given that there is some client-side write caching in GPFS, if a file is being written and an expected final destination goes offline mid-write, where does the block go? + If a whole pool goes offline, will it pick another pool or error? + If it's a disk in a pool, will it reevaluate and round-robin to the next disk, or just fail since it had already decided where to write? Hope this helps a little. On Thu, Jun 18, 2015 at 10:08 AM, Wahl, Edward wrote: > We had a unique situation with one of our many storage arrays occur in the past couple of days and it brought up a question I've had before. Is there a better way to disable a Storage Pool by itself rather than 'mmchdisk stop' the entire list of disks from that pool or mmfsctl and exclude things, etc? Thoughts? > > In our case our array lost all raid protection in a certain pool (8+2) due to a hardware failure, and started showing drive checkcondition errors on other drives in the array. Yikes! This pool itself is only about 720T and is backed by tape, but who wants to restore that? Even with SOBAR/HSM that would be a loooong week. ^_^ We made the decision to take the entire file system offline during the repair/rebuild, but I would like to have run all other pools save this one in a simpler manner than we took to get there. > > I'm interested in people's experiences here for future planning and disaster recovery. GPFS itself worked exactly as we had planned and expected but I think there is room to improve the file system if I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. > > I may not be expressing myself here in the best manner possible. Bit of sleep deprivation after the last couple of days. ;) > > Ed Wahl > OSC > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From zgiles at gmail.com Thu Jun 18 17:06:33 2015 From: zgiles at gmail.com (Zachary Giles) Date: Thu, 18 Jun 2015 12:06:33 -0400 Subject: [gpfsug-discuss] ILM question In-Reply-To: References: Message-ID: I would expect it to return to one of your online tiers. If you tier between two storage pools, you can directly read and write those files. Think of how LTFS works -- it's an external storage pool, so you need to run an operation via an external command to give the file back to GPFS from which you can read it. This is controlled via the policies and I assume you would need to make a policy to specify where the file would be placed when it comes back. It would be fancy for someone to allow reading directly from an external pool, but as far as I know, it has to hit a disk first. What I don't know is: Will it begin streaming the files back to the user as the blocks hit the disk, while other blocks are still coming in, or must the whole file be recalled first? On Thu, Jun 18, 2015 at 9:38 AM, Massimo Fumagalli < massimo_fumagalli at it.ibm.com> wrote: > Please, I need to know a simple question. > > Using Spectrum Scale 4.1.1, supposing to set ILM policy for migrating > files from Filesystem Tier0 to TIer 1 or Tier2 (example using LTFS to > library). 
> Then we need to read a file that has been moved to library (or Tier1). > Will be file copied back to Tier 0? Or read will be executed directly from > Library or Tier1 ? since there can be performance issue > > Regards > Max > > > IBM Italia S.p.A. > Sede Legale: Circonvallazione Idroscalo - 20090 Segrate (MI) > Cap. Soc. euro 347.256.998,80 > C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153 > Societ? con unico azionista > Societ? soggetta all?attivit? di direzione e coordinamento di > International Business Machines Corporation > > (Salvo che sia diversamente indicato sopra / Unless stated otherwise above) > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- Zach Giles zgiles at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 10661 bytes Desc: not available URL: From chekh at stanford.edu Thu Jun 18 21:26:17 2015 From: chekh at stanford.edu (Alex Chekholko) Date: Thu, 18 Jun 2015 13:26:17 -0700 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <55832969.4050901@stanford.edu> mmlsdisk fs | grep pool | awk '{print $1} | tr '\n' ';'| xargs mmchdisk suspend # seems pretty simple to me Then I guess you also have to modify your policy rules which relate to that pool. You're asking for a convenience wrapper script for a super-uncommon situation? On 06/18/2015 09:02 AM, Zachary Giles wrote: > I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. -- Alex Chekholko chekh at stanford.edu From ewahl at osc.edu Thu Jun 18 21:36:48 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Thu, 18 Jun 2015 20:36:48 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <55832969.4050901@stanford.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <55832969.4050901@stanford.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A58A6A@CIO-KRC-D1MBX02.osuad.osu.edu> I'm not sure it's so uncommon, but yes. (and your line looks suspiciously like mine did) I've had other situations where it would have been nice to do maintenance on a single storage pool. Maybe this is a "scale" issue when you get too large and should maybe have multiple file systems instead? Single name space is nice for users though. Plus I was curious what others had done in similar situations. I guess I could do what IBM does and just write the stupid script, name it "ts-something" and put a happy wrapper up front with a mm-something name. ;) Just FYI: 'suspend' does NOT stop I/O. Only stops new block creation,so 'stop' was what I did. >From the man page: "...Existing data on a suspended disk may still be read or updated." Ed Wahl OSC ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Alex Chekholko [chekh at stanford.edu] Sent: Thursday, June 18, 2015 4:26 PM To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? 
mmlsdisk fs | grep pool | awk '{print $1} | tr '\n' ';'| xargs mmchdisk suspend # seems pretty simple to me Then I guess you also have to modify your policy rules which relate to that pool. You're asking for a convenience wrapper script for a super-uncommon situation? On 06/18/2015 09:02 AM, Zachary Giles wrote: > I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. -- Alex Chekholko chekh at stanford.edu _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From makaplan at us.ibm.com Thu Jun 18 22:01:01 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 17:01:01 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: What do you see as the pros and cons of using GPFS Native Raid and configuring your disk arrays as JBODs instead of using RAID in a box. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 21994 bytes Desc: not available URL: From stijn.deweirdt at ugent.be Fri Jun 19 08:18:31 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 09:18:31 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <5583C247.1090609@ugent.be> > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. just this week we had an issue with bad disk (of the non-failing but disrupting everything kind) and issues with the raid controller (db of both controllers corrupted due to the one disk, controller reboot loops etc etc). but tech support pulled it through, although it took a while. i'm amased what can be done with the hardware controllers (and i've seen my share of recoveries ;) my question to ibm wrt gss would be: can we have a "demo" of gss recovering from eg a drawer failure (eg pull both sas connectors to the drawer itself). i like the gss we have and the data recovery for single disk failures, but i'm not sure how well it does with major component failures. the demo could be the steps support would take to get it running again (e.g. can gss recover from a drawer failure, assuming the disks are still ok ofcourse). stijn > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From Luke.Raimbach at crick.ac.uk Fri Jun 19 08:47:10 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Fri, 19 Jun 2015 07:47:10 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <5583C247.1090609@ugent.be> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <5583C247.1090609@ugent.be> Message-ID: my question to ibm wrt gss would be: can we have a "demo" of gss recovering from eg a drawer failure (eg pull both sas connectors to the drawer itself). 
i like the gss we have and the data recovery for single disk failures, but i'm not sure how well it does with major component failures. the demo could be the steps support would take to get it running again (e.g. can gss recover from a drawer failure, assuming the disks are still ok ofcourse). Ooh, we have a new one that's not in production yet. IBM say the latest GSS code should allow for a whole enclosure failure. I might try it before going in to production. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From S.J.Thompson at bham.ac.uk Fri Jun 19 14:31:17 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 13:31:17 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Do you mean in GNR compared to using (san/IB) based hardware RAIDs? If so, then GNR isn?t a scale-out solution - you buy a ?unit? and can add another ?unit? to the namespace, but I can?t add another 30TB of storage (say a researcher with a grant), where as with SAN based RAID controllers, I can go off and buy another storage shelf. Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Thursday, 18 June 2015 22:01 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? What do you see as the pros and cons of using GPFS Native Raid and configuring your disk arrays as JBODs instead of using RAID in a box. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Fri Jun 19 14:51:32 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 19 Jun 2015 09:51:32 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: 1. YES, Native Raid can recover from various failures: drawers, cabling, controllers, power supplies, etc, etc. Of course it must be configured properly so that there is no possible single point of failure. But yes, you should get your hands on a test rig and try out (simulate) various failure scenarios and see how well it works. 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. 3. If you'd like to know and/or explore more, read the pubs, do the experiments, and/or contact the IBM sales and support people. IF by some chance you do not get satisfactory answers, come back here perhaps we can get your inquiries addressed by the GPFS design team. Like other complex products, there are bound to be some questions that the sales and marketing people can't quite address. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From S.J.Thompson at bham.ac.uk Fri Jun 19 14:56:33 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 13:56:33 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group). So under (3), I?m posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewahl at osc.edu Fri Jun 19 15:37:13 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Fri, 19 Jun 2015 14:37:13 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> They seem to have killed dead the idea of just moving the software side of that by itself, wayyy.... back now. late 2013?/early 2014 Even though the components are fairly standard units(Engenio before, not sure now). Ironic as GPFS/Spectrum Scale is storage software... Even more ironic for our site as we have the Controller based units these JBODs are from for our metadata and one of our many storage pools. (DDN on others) Ed ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Simon Thompson (Research Computing - IT Services) [S.J.Thompson at bham.ac.uk] Sent: Friday, June 19, 2015 9:56 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group). So under (3), I?m posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2. 
I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at us.ibm.com Fri Jun 19 15:49:32 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Fri, 19 Jun 2015 14:49:32 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <201506191450.t5JEovUS018695@d01av01.pok.ibm.com> GNR today is only sold as a packaged solution e.g. ESS. The reason its not sold as SW only today is technical and its not true that this is not been pursued, its just not there yet and we cant discuss plans on a mailinglist. Sven Sent from IBM Verse Wahl, Edward --- Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? --- From:"Wahl, Edward" To:"gpfsug main discussion list" Date:Fri, Jun 19, 2015 9:41 AMSubject:Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? They seem to have killed dead the idea of just moving the software side of that by itself, wayyy.... back now. late 2013?/early 2014 Even though the components are fairly standard units(Engenio before, not sure now). Ironic as GPFS/Spectrum Scale is storage software... Even more ironic for our site as we have the Controller based units these JBODs are from for our metadata and one of our many storage pools. (DDN on others) Ed From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Simon Thompson (Research Computing - IT Services) [S.J.Thompson at bham.ac.uk] Sent: Friday, June 19, 2015 9:56 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group). So under (3), I?m posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan Reply-To: gpfsug main discussion list Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgiles at gmail.com Fri Jun 19 15:56:19 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 10:56:19 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: I think it's technically possible to run GNR on unsupported trays. You may have to do some fiddling with some of the scripts, and/or you wont get proper reporting. Of course it probably violates 100 licenses etc etc etc. I don't know of anyone who's done it yet. I'd like to do it.. I think it would be great to learn it deeper by doing this. On Fri, Jun 19, 2015 at 9:56 AM, Simon Thompson (Research Computing - IT Services) wrote: > Er. Everytime I?ve ever asked about GNR,, the response has been that its > only available as packaged products as it has to understand things like the > shelf controllers, disk drives etc, in order for things like the disk > hospital to work. (And the last time I asked talked about GNR was in May at > the User group). > > So under (3), I?m posting here asking if anyone from IBM knows anything > different? > > Thanks > > Simon > > From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > 2. I don't know the details of the packaged products, but I believe you can > license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire or > need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com From jtucker at pixitmedia.com Fri Jun 19 16:05:24 2015 From: jtucker at pixitmedia.com (Jez Tucker (Chair)) Date: Fri, 19 Jun 2015 16:05:24 +0100 Subject: [gpfsug-discuss] Handing over chair@ Message-ID: <55842FB4.9030705@gpfsug.org> Hello all This is my last post as Chair for the foreseeable future. The next will come from Simon Thompson who assumes the post today for the next two years. I'm looking forward to Simon's tenure and wish him all the best with his endeavours. Myself, I'm moving over to UG Media Rep and will continue to support the User Group and committee in its efforts. My new email is jez.tucker at gpfsug.org Please keep sending through your City and Country locations, they're most helpful. Have a great weekend. All the best, Jez -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. From oehmes at us.ibm.com Fri Jun 19 16:09:41 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Fri, 19 Jun 2015 15:09:41 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: Message-ID: <201506191510.t5JFAdW2021605@d03av04.boulder.ibm.com> Reporting is not the issue, one of the main issue is that we can't talk to the enclosure, which results in loosing the capability to replace disk drive or turn any fault indicators on. It also prevents us to 'read' the position of a drive within a tray or fault domain within a enclosure, without that information we can't properly determine where we need to place strips of a track to prevent data access loss in case a enclosure or component fails. Sven Sent from IBM Verse Zachary Giles --- Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? --- From:"Zachary Giles" To:"gpfsug main discussion list" Date:Fri, Jun 19, 2015 9:56 AMSubject:Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? I think it's technically possible to run GNR on unsupported trays. You may have to do some fiddling with some of the scripts, and/or you wont get proper reporting. Of course it probably violates 100 licenses etc etc etc. I don't know of anyone who's done it yet. I'd like to do it.. I think it would be great to learn it deeper by doing this. On Fri, Jun 19, 2015 at 9:56 AM, Simon Thompson (Research Computing - IT Services) wrote: > Er. Everytime I?ve ever asked about GNR,, the response has been that its > only available as packaged products as it has to understand things like the > shelf controllers, disk drives etc, in order for things like the disk > hospital to work. (And the last time I asked talked about GNR was in May at > the User group). > > So under (3), I?m posting here asking if anyone from IBM knows anything > different? > > Thanks > > Simon > > From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > 2. I don't know the details of the packaged products, but I believe you can > license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire or > need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Fri Jun 19 16:15:44 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Fri, 19 Jun 2015 16:15:44 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: > I think it's technically possible to run GNR on unsupported trays. You > may have to do some fiddling with some of the scripts, and/or you wont > get proper reporting. > Of course it probably violates 100 licenses etc etc etc. > I don't know of anyone who's done it yet. I'd like to do it.. 
I think > it would be great to learn it deeper by doing this. > One imagines that GNR uses the SCSI enclosure services to talk to the shelves. https://en.wikipedia.org/wiki/SCSI_Enclosure_Services https://en.wikipedia.org/wiki/SES-2_Enclosure_Management Which would suggest that anything that supported these would work. I did some experimentation with a spare EXP810 shelf a few years ago on a FC-AL on Linux. Kind all worked out the box. The other experiment with an EXP100 didn't work so well; with the EXP100 it would only work with the 250GB and 400GB drives that came with the dam thing. With the EXP810 I could screw random SATA drives into it and it all worked. My investigations concluded that the firmware on the EXP100 shelf determined if the drive was supported, but I could not work out how to upload modified firmware to the shelf. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From stijn.deweirdt at ugent.be Fri Jun 19 16:23:18 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 17:23:18 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <558433E6.8030708@ugent.be> hi marc, > 1. YES, Native Raid can recover from various failures: drawers, cabling, > controllers, power supplies, etc, etc. > Of course it must be configured properly so that there is no possible > single point of failure. hmmm, this is not really what i was asking about. but maybe it's easier in gss to do this properly (eg for 8+3 data protection, you only need 11 drawers if you can make sure the data+parity blocks are send to different drawers (sort of per drawer failure group, but internal to the vdisks), and the smallest setup is a gss24 which has 20 drawers). but i can't rememeber any manual suggestion the admin can control this (or is it the default?). anyway, i'm certainly interested in any config whitepapers or guides to see what is required for such setup. are these public somewhere? (have really searched for them). > > But yes, you should get your hands on a test rig and try out (simulate) > various failure scenarios and see how well it works. is there a way besides presales to get access to such setup? stijn > > 2. I don't know the details of the packaged products, but I believe you > can license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire > or need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > 3. If you'd like to know and/or explore more, read the pubs, do the > experiments, and/or contact the IBM sales and support people. > IF by some chance you do not get satisfactory answers, come back here > perhaps we can get your inquiries addressed by the > GPFS design team. Like other complex products, there are bound to be some > questions that the sales and marketing people > can't quite address. > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From jonathan at buzzard.me.uk Fri Jun 19 16:35:32 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Fri, 19 Jun 2015 16:35:32 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? 
How about GPFS Native Raid? In-Reply-To: <558433E6.8030708@ugent.be> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <558433E6.8030708@ugent.be> Message-ID: <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> On Fri, 2015-06-19 at 17:23 +0200, Stijn De Weirdt wrote: > hi marc, > > > > 1. YES, Native Raid can recover from various failures: drawers, cabling, > > controllers, power supplies, etc, etc. > > Of course it must be configured properly so that there is no possible > > single point of failure. > hmmm, this is not really what i was asking about. but maybe it's easier > in gss to do this properly (eg for 8+3 data protection, you only need 11 > drawers if you can make sure the data+parity blocks are send to > different drawers (sort of per drawer failure group, but internal to the > vdisks), and the smallest setup is a gss24 which has 20 drawers). > but i can't rememeber any manual suggestion the admin can control this > (or is it the default?). > I got the impression that GNR was more in line with the Engenio dynamic disk pools http://www.netapp.com/uk/technology/dynamic-disk-pools.aspx http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dynamic_disk_pooling_technical_report.pdf That is traditional RAID sucks with large numbers of big drives. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From bsallen at alcf.anl.gov Fri Jun 19 17:05:15 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Fri, 19 Jun 2015 16:05:15 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: > One imagines that GNR uses the SCSI enclosure services to talk to the > shelves. It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. Ben > On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: > > On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >> I think it's technically possible to run GNR on unsupported trays. You >> may have to do some fiddling with some of the scripts, and/or you wont >> get proper reporting. >> Of course it probably violates 100 licenses etc etc etc. >> I don't know of anyone who's done it yet. I'd like to do it.. I think >> it would be great to learn it deeper by doing this. >> > > One imagines that GNR uses the SCSI enclosure services to talk to the > shelves. > > https://en.wikipedia.org/wiki/SCSI_Enclosure_Services > https://en.wikipedia.org/wiki/SES-2_Enclosure_Management > > Which would suggest that anything that supported these would work. > > I did some experimentation with a spare EXP810 shelf a few years ago on > a FC-AL on Linux. Kind all worked out the box. The other experiment with > an EXP100 didn't work so well; with the EXP100 it would only work with > the 250GB and 400GB drives that came with the dam thing. With the EXP810 > I could screw random SATA drives into it and it all worked. 
My > investigations concluded that the firmware on the EXP100 shelf > determined if the drive was supported, but I could not work out how to > upload modified firmware to the shelf. > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From peserocka at gmail.com Fri Jun 19 17:09:44 2015 From: peserocka at gmail.com (Pete Sero) Date: Sat, 20 Jun 2015 00:09:44 +0800 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: vi my_enclosures.conf fwiw Peter On 2015 Jun 20 Sat, at 24:05, Allen, Benjamin S. wrote: >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. > > It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. > > So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. > > Ben > >> On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: >> >> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >>> I think it's technically possible to run GNR on unsupported trays. You >>> may have to do some fiddling with some of the scripts, and/or you wont >>> get proper reporting. >>> Of course it probably violates 100 licenses etc etc etc. >>> I don't know of anyone who's done it yet. I'd like to do it.. I think >>> it would be great to learn it deeper by doing this. >>> >> >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. >> >> https://en.wikipedia.org/wiki/SCSI_Enclosure_Services >> https://en.wikipedia.org/wiki/SES-2_Enclosure_Management >> >> Which would suggest that anything that supported these would work. >> >> I did some experimentation with a spare EXP810 shelf a few years ago on >> a FC-AL on Linux. Kind all worked out the box. The other experiment with >> an EXP100 didn't work so well; with the EXP100 it would only work with >> the 250GB and 400GB drives that came with the dam thing. With the EXP810 >> I could screw random SATA drives into it and it all worked. My >> investigations concluded that the firmware on the EXP100 shelf >> determined if the drive was supported, but I could not work out how to >> upload modified firmware to the shelf. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk >> Fife, United Kingdom. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Fri Jun 19 17:12:53 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 12:12:53 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: Ya, that's why I mentioned you'd probably have to fiddle with some scripts or something to help GNR figure out where disks are. Is definitely known that you can't just use any random enclosure given that GNR depends highly on the topology. Maybe in the future there would be a way to specify the topology or that a drive is at a specific position. On Fri, Jun 19, 2015 at 12:05 PM, Allen, Benjamin S. wrote: >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. > > It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. > > So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. > > Ben > >> On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: >> >> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >>> I think it's technically possible to run GNR on unsupported trays. You >>> may have to do some fiddling with some of the scripts, and/or you wont >>> get proper reporting. >>> Of course it probably violates 100 licenses etc etc etc. >>> I don't know of anyone who's done it yet. I'd like to do it.. I think >>> it would be great to learn it deeper by doing this. >>> >> >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. >> >> https://en.wikipedia.org/wiki/SCSI_Enclosure_Services >> https://en.wikipedia.org/wiki/SES-2_Enclosure_Management >> >> Which would suggest that anything that supported these would work. >> >> I did some experimentation with a spare EXP810 shelf a few years ago on >> a FC-AL on Linux. Kind all worked out the box. The other experiment with >> an EXP100 didn't work so well; with the EXP100 it would only work with >> the 250GB and 400GB drives that came with the dam thing. With the EXP810 >> I could screw random SATA drives into it and it all worked. My >> investigations concluded that the firmware on the EXP100 shelf >> determined if the drive was supported, but I could not work out how to >> upload modified firmware to the shelf. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk >> Fife, United Kingdom. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From makaplan at us.ibm.com Fri Jun 19 19:45:19 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 19 Jun 2015 14:45:19 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: OOps... here is the official statement: GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). 
GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? -------------- next part -------------- An HTML attachment was scrubbed... URL: From stijn.deweirdt at ugent.be Fri Jun 19 20:01:04 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 21:01:04 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <558433E6.8030708@ugent.be> <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> Message-ID: <558466F0.8000300@ugent.be> >>> 1. YES, Native Raid can recover from various failures: drawers, cabling, >>> controllers, power supplies, etc, etc. >>> Of course it must be configured properly so that there is no possible >>> single point of failure. >> hmmm, this is not really what i was asking about. but maybe it's easier >> in gss to do this properly (eg for 8+3 data protection, you only need 11 >> drawers if you can make sure the data+parity blocks are send to >> different drawers (sort of per drawer failure group, but internal to the >> vdisks), and the smallest setup is a gss24 which has 20 drawers). >> but i can't rememeber any manual suggestion the admin can control this >> (or is it the default?). >> > > I got the impression that GNR was more in line with the Engenio dynamic > disk pools well, it's uses some crush-like placement and some parity encoding scheme (regular raid6 for the DDP, some flavour of EC for GNR), but other then that, not much resemblence. DDP does not give you any control over where the data blocks are stored. i'm not sure about GNR, (but DDP does not state anywhere they are drawer failure proof ;). but GNR is more like a DDP then e.g. a ceph EC pool, in the sense that the hosts needs to see all disks (similar to the controller that needs access to the disks). > > http://www.netapp.com/uk/technology/dynamic-disk-pools.aspx > > http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dynamic_disk_pooling_technical_report.pdf > > That is traditional RAID sucks with large numbers of big drives. (btw it's one of those that we saw fail (and get recovered by tech support!) this week. tip of the week: turn on the SMmonitor service on at least one host, it's actually useful for something). stijn > > > JAB. > From S.J.Thompson at bham.ac.uk Fri Jun 19 20:17:32 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 19:17:32 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu>, Message-ID: My understanding I that GSS and IBM ESS are sold as pre configured systems. So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. 
So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] Sent: 19 June 2015 19:45 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? OOps... here is the official statement: GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? From zgiles at gmail.com Fri Jun 19 21:08:14 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 16:08:14 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: It's comparable to other "large" controller systems. Take the DDN 10K/12K for example: You don't just buy one more shelf of disks, or 5 disks at a time from Walmart. You buy 5, 10, or 20 trays and populate enough disks to either hit your bandwidth or storage size requirement. Generally changing from 5 to 10 to 20 requires support to come on-site and recable it, and generally you either buy half or all the disks slots worth of disks. The whole system is a building block and you buy N of them to get up to 10-20PB of storage. GSS is the same way, there are a few models and you just buy a packaged one. Technically, you can violate the above constraints, but then it may not work well and you probably can't buy it that way. I'm pretty sure DDN's going to look at you funny if you try to buy a 12K with 30 drives.. :) For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with buildin RAID, a pair of servers, and forget GNR. Or maybe GSS22? :) >From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 " Current high-density storage Models 24 and 26 remain available Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available 200 GB and 800 GB SSDs are also available The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, 26s is comprised of SSD drives or 1.2 TB hard SAS drives " On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - IT Services) wrote: > > My understanding I that GSS and IBM ESS are sold as pre configured systems. 
> > So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. > > So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. > > So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] > Sent: 19 June 2015 19:45 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > OOps... here is the official statement: > > GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. > > I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From S.J.Thompson at bham.ac.uk Fri Jun 19 22:08:25 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 21:08:25 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> , Message-ID: I'm not disputing that gnr is a cool technology. Just that as scale out, it doesn't work for our funding model. If we go back to the original question, if was pros and cons of gnr vs raid type storage. My point was really that I have research groups who come along and want to by xTb at a time. And that's relatively easy with a raid/san based approach. And at times that needs to be a direct purchase from our supplier based on the grant rather than an internal recharge. And the overhead of a smaller gss (twin servers) is much higher cost compared to a storewise tray. I'm also not really advocating that its arbitrary storage. Just saying id really like to see shelf at a time upgrades for it (and supports shelf only). Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Zachary Giles [zgiles at gmail.com] Sent: 19 June 2015 21:08 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? It's comparable to other "large" controller systems. Take the DDN 10K/12K for example: You don't just buy one more shelf of disks, or 5 disks at a time from Walmart. You buy 5, 10, or 20 trays and populate enough disks to either hit your bandwidth or storage size requirement. 
Generally changing from 5 to 10 to 20 requires support to come on-site and recable it, and generally you either buy half or all the disks slots worth of disks. The whole system is a building block and you buy N of them to get up to 10-20PB of storage. GSS is the same way, there are a few models and you just buy a packaged one. Technically, you can violate the above constraints, but then it may not work well and you probably can't buy it that way. I'm pretty sure DDN's going to look at you funny if you try to buy a 12K with 30 drives.. :) For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with buildin RAID, a pair of servers, and forget GNR. Or maybe GSS22? :) >From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 " Current high-density storage Models 24 and 26 remain available Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available 200 GB and 800 GB SSDs are also available The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, 26s is comprised of SSD drives or 1.2 TB hard SAS drives " On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - IT Services) wrote: > > My understanding I that GSS and IBM ESS are sold as pre configured systems. > > So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. > > So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. > > So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] > Sent: 19 June 2015 19:45 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > OOps... here is the official statement: > > GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. > > I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chris.hunter at yale.edu Fri Jun 19 22:18:51 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Fri, 19 Jun 2015 17:18:51 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? 
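
(Purely to illustrate the shape of that overhead argument, here is a toy amortisation sketch; the prices are invented placeholders, not quotes, but they show why a fixed per-building-block cost only washes out on large purchases.)

    # Toy amortisation of a fixed per-building-block overhead (servers, enclosures,
    # licences) against the marginal cost of disk. Prices are invented placeholders.
    FIXED_OVERHEAD = 40000.0    # assumed fixed cost per building block
    COST_PER_TB = 100.0         # assumed marginal cost per TB of disk

    def effective_cost_per_tb(purchase_tb):
        return (FIXED_OVERHEAD + COST_PER_TB * purchase_tb) / purchase_tb

    for tb in (30, 100, 500, 1000):
        print(f"buy {tb:5d} TB in one go -> ~{effective_cost_per_tb(tb):7.1f} per TB")
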
In-Reply-To: References: Message-ID: <5584873B.1080109@yale.edu> Smells of troll bait but I'll bite. "Declustered RAID" certainly has benefits for recovery of failed disks but I don't think it claims performance benefits over traditional RAID. GNR certainly has a large memory footprint. Object RAID is a close cousin that has flexbile expansion capability, depending on product packaging GNR could likely match these features. Argonne labs (Illinois USA) has done a lot with both GNR and RAID GPFS, I would be interested in their experiences. > Date: Thu, 18 Jun 2015 17:01:01 -0400 > From: Marc A Kaplan > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? > > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. chris hunter yale hpc group From zgiles at gmail.com Fri Jun 19 22:35:59 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 17:35:59 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: OK, back on topic: Honestly, I'm really glad you said that. I have that exact problem also -- a researcher will be funded for xTB of space, and we are told by the grants office that if something is purchased on a grant it belongs to them and it should have a sticker put on it that says "property of the govt' etc etc. We decided to (as an institution) put the money forward to purchase a large system ahead of time, and as grants come in, recover the cost back into the system by paying off our internal "negative balance". In this way we can get the benefit of a large storage system like performance and purchasing price, but provision storage into quotas as needed. We can even put stickers on a handful of drives in the GSS tray if that makes them feel happy. Could they request us to hand over their drives and take them out of our system? Maybe. if the Grants Office made us do it, sure, I'd drain some pools off and go hand them over.. but that will never happen because it's more valuable to them in our cluster than sitting on their table, and I'm not going to deliver the drives full of their data. That's their responsibility. Is it working? Yeah, but, I'm not a grants admin nor an accountant, so I'll let them figure that out, and they seem to be OK with this model. And yes, it's not going to work for all institutions unless you can put the money forward upfront, or do a group purchase at the end of a year. So I 100% agree, GNR doesn't really fit the model of purchasing a few drives at a time, and the grants things is still a problem. On Fri, Jun 19, 2015 at 5:08 PM, Simon Thompson (Research Computing - IT Services) wrote: > I'm not disputing that gnr is a cool technology. > > Just that as scale out, it doesn't work for our funding model. > > If we go back to the original question, if was pros and cons of gnr vs raid type storage. > > My point was really that I have research groups who come along and want to by xTb at a time. And that's relatively easy with a raid/san based approach. And at times that needs to be a direct purchase from our supplier based on the grant rather than an internal recharge. > > And the overhead of a smaller gss (twin servers) is much higher cost compared to a storewise tray. 
I'm also not really advocating that its arbitrary storage. Just saying id really like to see shelf at a time upgrades for it (and supports shelf only). > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Zachary Giles [zgiles at gmail.com] > Sent: 19 June 2015 21:08 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > It's comparable to other "large" controller systems. Take the DDN > 10K/12K for example: You don't just buy one more shelf of disks, or 5 > disks at a time from Walmart. You buy 5, 10, or 20 trays and populate > enough disks to either hit your bandwidth or storage size requirement. > Generally changing from 5 to 10 to 20 requires support to come on-site > and recable it, and generally you either buy half or all the disks > slots worth of disks. The whole system is a building block and you buy > N of them to get up to 10-20PB of storage. > GSS is the same way, there are a few models and you just buy a packaged one. > > Technically, you can violate the above constraints, but then it may > not work well and you probably can't buy it that way. > I'm pretty sure DDN's going to look at you funny if you try to buy a > 12K with 30 drives.. :) > > For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save > money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with > buildin RAID, a pair of servers, and forget GNR. > Or maybe GSS22? :) > > From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 > " > Current high-density storage Models 24 and 26 remain available > Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u > JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) > 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available > 200 GB and 800 GB SSDs are also available > The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, > 26s is comprised of SSD drives or 1.2 TB hard SAS drives > " > > > On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - > IT Services) wrote: >> >> My understanding I that GSS and IBM ESS are sold as pre configured systems. >> >> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >> >> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >> >> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >> >> Simon >> ________________________________________ >> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >> Sent: 19 June 2015 19:45 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >> >> OOps... here is the official statement: >> >> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). 
GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >> >> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Zach Giles > zgiles at gmail.com > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From chris.hunter at yale.edu Fri Jun 19 22:57:14 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Fri, 19 Jun 2015 17:57:14 -0400 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: References: Message-ID: <5584903A.3020203@yale.edu> I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. A JBOD solution that allows incremental drive expansion is desirable. chris hunter yale hpc group > From: Zachary Giles > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > OK, back on topic: > Honestly, I'm really glad you said that. I have that exact problem > also -- a researcher will be funded for xTB of space, and we are told > by the grants office that if something is purchased on a grant it > belongs to them and it should have a sticker put on it that says > "property of the govt' etc etc. > We decided to (as an institution) put the money forward to purchase a > large system ahead of time, and as grants come in, recover the cost > back into the system by paying off our internal "negative balance". In > this way we can get the benefit of a large storage system like > performance and purchasing price, but provision storage into quotas as > needed. We can even put stickers on a handful of drives in the GSS > tray if that makes them feel happy. > Could they request us to hand over their drives and take them out of > our system? Maybe. if the Grants Office made us do it, sure, I'd drain > some pools off and go hand them over.. but that will never happen > because it's more valuable to them in our cluster than sitting on > their table, and I'm not going to deliver the drives full of their > data. That's their responsibility. > > Is it working? Yeah, but, I'm not a grants admin nor an accountant, so > I'll let them figure that out, and they seem to be OK with this model. 
> And yes, it's not going to work for all institutions unless you can > put the money forward upfront, or do a group purchase at the end of a > year. > > So I 100% agree, GNR doesn't really fit the model of purchasing a few > drives at a time, and the grants things is still a problem. From jhick at lbl.gov Fri Jun 19 23:18:56 2015 From: jhick at lbl.gov (Jason Hick) Date: Fri, 19 Jun 2015 15:18:56 -0700 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: <5584903A.3020203@yale.edu> References: <5584903A.3020203@yale.edu> Message-ID: <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> For the same reason (storage expansions that follow funding needs), I want a 4 or 5U embedded server/JBOD with GNR. That would allow us to simply plugin the host interfaces (2-4 of them), configure an IP addr/host name and add it as NSDs to an existing GPFS file system. As opposed to dealing with racks of storage and architectural details. Jason > On Jun 19, 2015, at 2:57 PM, Chris Hunter wrote: > > I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. > > I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. > > We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. > > A JBOD solution that allows incremental drive expansion is desirable. > > chris hunter > yale hpc group > >> From: Zachary Giles >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by >> themselves? How about GPFS Native Raid? >> >> OK, back on topic: >> Honestly, I'm really glad you said that. I have that exact problem >> also -- a researcher will be funded for xTB of space, and we are told >> by the grants office that if something is purchased on a grant it >> belongs to them and it should have a sticker put on it that says >> "property of the govt' etc etc. >> We decided to (as an institution) put the money forward to purchase a >> large system ahead of time, and as grants come in, recover the cost >> back into the system by paying off our internal "negative balance". In >> this way we can get the benefit of a large storage system like >> performance and purchasing price, but provision storage into quotas as >> needed. We can even put stickers on a handful of drives in the GSS >> tray if that makes them feel happy. >> Could they request us to hand over their drives and take them out of >> our system? Maybe. if the Grants Office made us do it, sure, I'd drain >> some pools off and go hand them over.. but that will never happen >> because it's more valuable to them in our cluster than sitting on >> their table, and I'm not going to deliver the drives full of their >> data. That's their responsibility. >> >> Is it working? Yeah, but, I'm not a grants admin nor an accountant, so >> I'll let them figure that out, and they seem to be OK with this model. >> And yes, it's not going to work for all institutions unless you can >> put the money forward upfront, or do a group purchase at the end of a >> year. >> >> So I 100% agree, GNR doesn't really fit the model of purchasing a few >> drives at a time, and the grants things is still a problem. 
> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Fri Jun 19 23:54:39 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 18:54:39 -0400 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> References: <5584903A.3020203@yale.edu> <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> Message-ID: Starting to sound like Seagate/Xyratex there. :) On Fri, Jun 19, 2015 at 6:18 PM, Jason Hick wrote: > For the same reason (storage expansions that follow funding needs), I want a 4 or 5U embedded server/JBOD with GNR. That would allow us to simply plugin the host interfaces (2-4 of them), configure an IP addr/host name and add it as NSDs to an existing GPFS file system. > > As opposed to dealing with racks of storage and architectural details. > > Jason > >> On Jun 19, 2015, at 2:57 PM, Chris Hunter wrote: >> >> I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. >> >> I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. >> >> We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. >> >> A JBOD solution that allows incremental drive expansion is desirable. >> >> chris hunter >> yale hpc group >> >>> From: Zachary Giles >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by >>> themselves? How about GPFS Native Raid? >>> >>> OK, back on topic: >>> Honestly, I'm really glad you said that. I have that exact problem >>> also -- a researcher will be funded for xTB of space, and we are told >>> by the grants office that if something is purchased on a grant it >>> belongs to them and it should have a sticker put on it that says >>> "property of the govt' etc etc. >>> We decided to (as an institution) put the money forward to purchase a >>> large system ahead of time, and as grants come in, recover the cost >>> back into the system by paying off our internal "negative balance". In >>> this way we can get the benefit of a large storage system like >>> performance and purchasing price, but provision storage into quotas as >>> needed. We can even put stickers on a handful of drives in the GSS >>> tray if that makes them feel happy. >>> Could they request us to hand over their drives and take them out of >>> our system? Maybe. if the Grants Office made us do it, sure, I'd drain >>> some pools off and go hand them over.. but that will never happen >>> because it's more valuable to them in our cluster than sitting on >>> their table, and I'm not going to deliver the drives full of their >>> data. That's their responsibility. >>> >>> Is it working? Yeah, but, I'm not a grants admin nor an accountant, so >>> I'll let them figure that out, and they seem to be OK with this model. >>> And yes, it's not going to work for all institutions unless you can >>> put the money forward upfront, or do a group purchase at the end of a >>> year. >>> >>> So I 100% agree, GNR doesn't really fit the model of purchasing a few >>> drives at a time, and the grants things is still a problem. 
>> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From bsallen at alcf.anl.gov Sat Jun 20 00:12:53 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Fri, 19 Jun 2015 23:12:53 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? In-Reply-To: <5584873B.1080109@yale.edu> References: , <5584873B.1080109@yale.edu> Message-ID: <3a261dc3-e8a4-4550-bab2-db4cc0ffbaea@alcf.anl.gov> Let me know what specific questions you have. Ben From: Chris Hunter Sent: Jun 19, 2015 4:18 PM To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? Smells of troll bait but I'll bite. "Declustered RAID" certainly has benefits for recovery of failed disks but I don't think it claims performance benefits over traditional RAID. GNR certainly has a large memory footprint. Object RAID is a close cousin that has flexbile expansion capability, depending on product packaging GNR could likely match these features. Argonne labs (Illinois USA) has done a lot with both GNR and RAID GPFS, I would be interested in their experiences. > Date: Thu, 18 Jun 2015 17:01:01 -0400 > From: Marc A Kaplan > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? > > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. chris hunter yale hpc group _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From viccornell at gmail.com Sat Jun 20 22:12:53 2015 From: viccornell at gmail.com (Vic Cornell) Date: Sat, 20 Jun 2015 22:12:53 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> Just to make sure everybody is up to date on this, (I work for DDN BTW): > On 19 Jun 2015, at 21:08, Zachary Giles wrote: > > It's comparable to other "large" controller systems. Take the DDN > 10K/12K for example: You don't just buy one more shelf of disks, or 5 > disks at a time from Walmart. You buy 5, 10, or 20 trays and populate > enough disks to either hit your bandwidth or storage size requirement. With the 12K you can buy 1,2,3,4,5,,10 or 20. With the 7700/Gs7K you can buy 1 ,2 ,3,4 or 5. GS7K comes with 2 controllers and 60 disk slots all in 4U, it saturates (with GPFS scatter) at about 160- 180 NL- SAS disks and you can concatenate as many of them together as you like. I guess the thing with GPFS is that you can pick your ideal building block and then scale with it as far as you like. > Generally changing from 5 to 10 to 20 requires support to come on-site > and recable it, and generally you either buy half or all the disks > slots worth of disks. You can start off with as few as 2 disks in a system . 
We have lots of people who buy partially populated systems and then sell on capacity to users, buying disks in groups of 10, 20 or more - thats what the flexibility of GPFS is all about, yes? > The whole system is a building block and you buy > N of them to get up to 10-20PB of storage. > GSS is the same way, there are a few models and you just buy a packaged one. > > Technically, you can violate the above constraints, but then it may > not work well and you probably can't buy it that way. > I'm pretty sure DDN's going to look at you funny if you try to buy a > 12K with 30 drives.. :) Nobody at DDN is going to look at you funny if you say you want to buy something :-). We have as many different procurement strategies as we have customers. If all you can afford with your infrastructure money is 30 drives to get you off the ground and you know that researchers/users will come to you with money for capacity down the line then a 30 drive 12K makes perfect sense. Most configs with external servers can be made to work. The embedded (12KXE, GS7K ) are a bit more limited in how you can arrange disks and put services on NSD servers but thats the tradeoff for the smaller footprint. Happy to expand on any of this on or offline. Vic > > For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save > money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with > buildin RAID, a pair of servers, and forget GNR. > Or maybe GSS22? :) > > From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 > " > Current high-density storage Models 24 and 26 remain available > Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u > JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) > 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available > 200 GB and 800 GB SSDs are also available > The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, > 26s is comprised of SSD drives or 1.2 TB hard SAS drives > " > > > On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - > IT Services) wrote: >> >> My understanding I that GSS and IBM ESS are sold as pre configured systems. >> >> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >> >> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >> >> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >> >> Simon >> ________________________________________ >> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >> Sent: 19 June 2015 19:45 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >> >> OOps... here is the official statement: >> >> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. 
GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >> >> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Zach Giles > zgiles at gmail.com > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Sat Jun 20 23:40:58 2015 From: zgiles at gmail.com (Zachary Giles) Date: Sat, 20 Jun 2015 18:40:58 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> Message-ID: All true. I wasn't trying to knock DDN or say "it can't be done", it's just (probably) not very efficient or cost effective to buy a 12K with 30 drives (as an example). The new 7700 looks like a really nice base a small building block. I had forgot about them. There is a good box for adding 4U at a time, and with 60 drives per enclosure, if you saturated it out at ~3 enclosure / 180 drives, you'd have 1PB, which is also a nice round building block size. :thumb up: On Sat, Jun 20, 2015 at 5:12 PM, Vic Cornell wrote: > Just to make sure everybody is up to date on this, (I work for DDN BTW): > >> On 19 Jun 2015, at 21:08, Zachary Giles wrote: >> >> It's comparable to other "large" controller systems. Take the DDN >> 10K/12K for example: You don't just buy one more shelf of disks, or 5 >> disks at a time from Walmart. You buy 5, 10, or 20 trays and populate >> enough disks to either hit your bandwidth or storage size requirement. > > With the 12K you can buy 1,2,3,4,5,,10 or 20. > > With the 7700/Gs7K you can buy 1 ,2 ,3,4 or 5. > > GS7K comes with 2 controllers and 60 disk slots all in 4U, it saturates (with GPFS scatter) at about 160- 180 NL- SAS disks and you can concatenate as many of them together as you like. I guess the thing with GPFS is that you can pick your ideal building block and then scale with it as far as you like. > >> Generally changing from 5 to 10 to 20 requires support to come on-site >> and recable it, and generally you either buy half or all the disks >> slots worth of disks. > > You can start off with as few as 2 disks in a system . We have lots of people who buy partially populated systems and then sell on capacity to users, buying disks in groups of 10, 20 or more - thats what the flexibility of GPFS is all about, yes? > >> The whole system is a building block and you buy >> N of them to get up to 10-20PB of storage. >> GSS is the same way, there are a few models and you just buy a packaged one. >> >> Technically, you can violate the above constraints, but then it may >> not work well and you probably can't buy it that way. >> I'm pretty sure DDN's going to look at you funny if you try to buy a >> 12K with 30 drives.. :) > > Nobody at DDN is going to look at you funny if you say you want to buy something :-). We have as many different procurement strategies as we have customers. 
If all you can afford with your infrastructure money is 30 drives to get you off the ground and you know that researchers/users will come to you with money for capacity down the line then a 30 drive 12K makes perfect sense. > > Most configs with external servers can be made to work. The embedded (12KXE, GS7K ) are a bit more limited in how you can arrange disks and put services on NSD servers but thats the tradeoff for the smaller footprint. > > Happy to expand on any of this on or offline. > > Vic > > >> >> For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save >> money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with >> buildin RAID, a pair of servers, and forget GNR. >> Or maybe GSS22? :) >> >> From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 >> " >> Current high-density storage Models 24 and 26 remain available >> Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u >> JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) >> 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available >> 200 GB and 800 GB SSDs are also available >> The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, >> 26s is comprised of SSD drives or 1.2 TB hard SAS drives >> " >> >> >> On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - >> IT Services) wrote: >>> >>> My understanding I that GSS and IBM ESS are sold as pre configured systems. >>> >>> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >>> >>> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >>> >>> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >>> >>> Simon >>> ________________________________________ >>> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >>> Sent: 19 June 2015 19:45 >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >>> >>> OOps... here is the official statement: >>> >>> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >>> >>> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? 
>>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at gpfsug.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> -- >> Zach Giles >> zgiles at gmail.com >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From chair at gpfsug.org Mon Jun 22 08:57:49 2015 From: chair at gpfsug.org (GPFS UG Chair) Date: Mon, 22 Jun 2015 08:57:49 +0100 Subject: [gpfsug-discuss] chair@GPFS UG Message-ID: Hi all, Just to follow up from Jez's email last week I'm now taking over as chair of the group. I'd like to thank Jez for his work with the group over the past couple of years in developing it to where it is now (as well as Claire who is staying on as secretary!). We're still interested in sector reps for the group, so if you are a GPFS user in a specific sector and would be interested in this, please let me know. As there haven't really been any sector reps before, we'll see how that works out, but I can't see it being a lot of work! On the US side of things, I need to catch up with Jez and Claire to see where things are up to. And finally, just as a quick head's up, we're pencilled in to have a user group mini (2hr) meeting in the UK in December as one of the breakout groups at the annual MEW event, once the dates for this are published I'll send out a save the date. If you are a user and interested in speaking, also let me know as well as anything else you might like to see there. Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From jamiedavis at us.ibm.com Mon Jun 22 14:04:23 2015 From: jamiedavis at us.ibm.com (James Davis) Date: Mon, 22 Jun 2015 13:04:23 +0000 Subject: [gpfsug-discuss] Placement Policy Installation andRDMConsiderations In-Reply-To: References: , Message-ID: <201506221305.t5MD5Owv014072@d01av05.pok.ibm.com> An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Mon Jun 22 16:28:22 2015 From: bevans at pixitmedia.com (Barry Evans) Date: Mon, 22 Jun 2015 16:28:22 +0100 Subject: [gpfsug-discuss] LROC Express Message-ID: <55882996.6050903@pixitmedia.com> Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. 
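The licensing question itself is best answered from the FAQ and edition tables, but for context it may help to show what actually gets configured for LROC. The sketch below assumes a client node called "client01" with a spare local SSD/NVMe device; the NSD name, device path and the lroc* settings are illustrative, and the defaults for those settings should be checked against the documentation for your release.

  # Sketch only: define a local SSD on client01 as an LROC (local read-only cache) device.
  #
  # /tmp/lroc.stanza might contain:
  #   %nsd: device=/dev/nvme0n1 nsd=lroc_client01 servers=client01 usage=localCache

  mmcrnsd -F /tmp/lroc.stanza
  # Optional knobs controlling what gets cached (verify defaults for your release):
  #   mmchconfig lrocData=yes,lrocDirectories=yes,lrocInodes=yes -N client01
  # The daemon on client01 may need a restart before it starts using the device.
  mmdiag --lroc    # run on client01 to watch the cache fill and serve reads

Because LROC only ever holds clean copies of data already on the NSDs, losing the device costs read performance rather than data.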
From oester at gmail.com Mon Jun 22 16:36:08 2015 From: oester at gmail.com (Bob Oesterlin) Date: Mon, 22 Jun 2015 10:36:08 -0500 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882996.6050903@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> Message-ID: It works with Standard edition, just make sure you have the right license for the nodes using LROC. Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans wrote: > Hi All, > > Very quick question for those in the know - does LROC require a standard > license, or will it work with Express? I can't find anything in the FAQ > regarding this so I presume Express is ok, but wanted to make sure. > > Regards, > Barry Evans > Technical Director > Pixit Media/ArcaStream > > > > -- > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Mon Jun 22 16:39:49 2015 From: bevans at pixitmedia.com (Barry Evans) Date: Mon, 22 Jun 2015 16:39:49 +0100 Subject: [gpfsug-discuss] LROC Express In-Reply-To: References: <55882996.6050903@pixitmedia.com> Message-ID: <55882C45.6090501@pixitmedia.com> Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: > It works with Standard edition, just make sure you have the right > license for the nodes using LROC. > > Bob Oesterlin > Nuance COmmunications > > > Bob Oesterlin > > > On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: > > Hi All, > > Very quick question for those in the know - does LROC require a > standard license, or will it work with Express? I can't find > anything in the FAQ regarding this so I presume Express is ok, but > wanted to make sure. > > Regards, > Barry Evans > Technical Director > Pixit Media/ArcaStream > > > > -- > > This email is confidential in that it is intended for the > exclusive attention of the addressee(s) indicated. If you are not > the intended recipient, this email should not be read or disclosed > to any other person. Please notify the sender immediately and > delete this email from your computer system. Any opinions > expressed are not necessarily those of the company from which this > email was sent and, whilst to the best of our knowledge no viruses > or defects exist, no responsibility can be accepted for any loss > or damage arising from its receipt or subsequent use of this email. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oester at gmail.com Mon Jun 22 16:45:33 2015 From: oester at gmail.com (Bob Oesterlin) Date: Mon, 22 Jun 2015 10:45:33 -0500 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882C45.6090501@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> <55882C45.6090501@pixitmedia.com> Message-ID: I only have a Standard Edition, so I can't say for sure. I do know it's Linux x86 only. This doesn't seem to say directly either: http://www-01.ibm.com/support/knowledgecenter/SSFKCN/gpfs4104/gpfsclustersfaq.html%23lic41?lang=en Bob Oesterlin On Mon, Jun 22, 2015 at 10:39 AM, Barry Evans wrote: > Hi Bob, > > Thanks for this, just to confirm does this mean that it *does not* work > with express? > > Cheers, > Barry > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Paul.Sanchez at deshaw.com Mon Jun 22 23:57:10 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Mon, 22 Jun 2015 22:57:10 +0000 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882C45.6090501@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> <55882C45.6090501@pixitmedia.com> Message-ID: <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. 
For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss [http://www.pixitmedia.com/sig/sig-cio.jpg] This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From oehmes at us.ibm.com Tue Jun 23 00:14:09 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Mon, 22 Jun 2015 16:14:09 -0700 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> References: <55882996.6050903@pixitmedia.com><55882C45.6090501@pixitmedia.com> <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> Message-ID: <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> Hi Paul, just out of curiosity, not that i promise anything, but would it be enough to support include/exclude per fileset level or would we need path and/or extension or even more things like owner of files as well ? Sven ------------------------------------------ Sven Oehme Scalable Storage Research email: oehmes at us.ibm.com Phone: +1 (408) 824-8904 IBM Almaden Research Lab ------------------------------------------ From: "Sanchez, Paul" To: gpfsug main discussion list Date: 06/22/2015 03:57 PM Subject: Re: [gpfsug-discuss] LROC Express Sent by: gpfsug-discuss-bounces at gpfsug.org I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [ mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. 
Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Paul.Sanchez at deshaw.com Tue Jun 23 15:10:31 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Tue, 23 Jun 2015 14:10:31 +0000 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> References: <55882996.6050903@pixitmedia.com><55882C45.6090501@pixitmedia.com> <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> Message-ID: <201D6001C896B846A9CFC2E841986AC145418BC0@mailnycmb2a.winmail.deshaw.com> Hi Sven, Yes, I think that fileset level include/exclude would be sufficient for us. It also begs the question about the same for write caching. We haven?t experimented with it yet, but are looking forward to employing HAWC for scratch-like workloads. Do you imagine providing the same sort of HAWC bypass include/exclude to be part of this? That might be useful for excluding datasets where the write ingest rate isn?t massive and the degree of risk we?re comfortable with potential data recovery issues in the face of complex outages may be much lower. 
Thanks, Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Sven Oehme Sent: Monday, June 22, 2015 7:14 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Paul, just out of curiosity, not that i promise anything, but would it be enough to support include/exclude per fileset level or would we need path and/or extension or even more things like owner of files as well ? Sven ------------------------------------------ Sven Oehme Scalable Storage Research email: oehmes at us.ibm.com Phone: +1 (408) 824-8904 IBM Almaden Research Lab ------------------------------------------ [Inactive hide details for "Sanchez, Paul" ---06/22/2015 03:57:29 PM---I can?t confirm whether it works with Express, since we]"Sanchez, Paul" ---06/22/2015 03:57:29 PM---I can?t confirm whether it works with Express, since we?re also running standard. But as a simple t From: "Sanchez, Paul" > To: gpfsug main discussion list > Date: 06/22/2015 03:57 PM Subject: Re: [gpfsug-discuss] LROC Express Sent by: gpfsug-discuss-bounces at gpfsug.org ________________________________ I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. 
Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 105 bytes Desc: image001.gif URL: From ewahl at osc.edu Tue Jun 23 15:11:11 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Tue, 23 Jun 2015 14:11:11 +0000 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> References: , <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> FYI this page causes problems with various versions of Chrome and Firefox (too lazy to test other browsers, sorry) Seems to be a javascript issue. Huge surprise, right? I've filed bugs on the browser sides for FF, don't care about chrome sorry. Ed Wahl OSC ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Wahl, Edward [ewahl at osc.edu] Sent: Monday, June 15, 2015 4:35 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 4.1.1 fix central location When I navigate using these instructions I can find the fixes, but attempting to get to them at the last step results in a loop back to the SDN screen. :( Not sure if this is the page, lack of the "proper" product in my supported products (still lists 3.5 as our product) or what. 
Ed ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ross.keeping at uk.ibm.com] Sent: Monday, June 15, 2015 12:43 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] 4.1.1 fix central location Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone: (+44 161) 8362381-Line: 37642381 E-mail: ross.keeping at uk.ibm.com [IBM] 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: ATT00001.gif URL: From oester at gmail.com Tue Jun 23 15:16:10 2015 From: oester at gmail.com (Bob Oesterlin) Date: Tue, 23 Jun 2015 09:16:10 -0500 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Try here: http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%2Bdefined%2Bstorage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=4.1.1&platform=Linux+64-bit,x86_64&function=all Bob Oesterlin On Tue, Jun 23, 2015 at 9:11 AM, Wahl, Edward wrote: > FYI this page causes problems with various versions of Chrome and > Firefox (too lazy to test other browsers, sorry) Seems to be a javascript > issue. Huge surprise, right? > > I've filed bugs on the browser sides for FF, don't care about chrome > sorry. > > Ed Wahl > OSC > > > ------------------------------ > *From:* gpfsug-discuss-bounces at gpfsug.org [ > gpfsug-discuss-bounces at gpfsug.org] on behalf of Wahl, Edward [ > ewahl at osc.edu] > *Sent:* Monday, June 15, 2015 4:35 PM > *To:* gpfsug main discussion list > *Subject:* Re: [gpfsug-discuss] 4.1.1 fix central location > > When I navigate using these instructions I can find the fixes, but > attempting to get to them at the last step results in a loop back to the > SDN screen. :( > > Not sure if this is the page, lack of the "proper" product in my supported > products (still lists 3.5 as our product) > or what. 
> > Ed > > ------------------------------ > *From:* gpfsug-discuss-bounces at gpfsug.org [ > gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ > ross.keeping at uk.ibm.com] > *Sent:* Monday, June 15, 2015 12:43 PM > *To:* gpfsug-discuss at gpfsug.org > *Subject:* [gpfsug-discuss] 4.1.1 fix central location > > Hi > > IBM successfully released 4.1.1 on Friday with the Spectrum Scale > re-branding and introduction of protocols etc. > > However, I initially had trouble finding the PTF - rest assured it does > exist. > > You can find the 4.1.1 main download here: > http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 > > There is a new area in Fix Central that you will need to navigate to get > the PTF for upgrade: > https://scwebtestt.rchland.ibm.com/support/fixcentral/ > 1) Set Product Group --> Systems Storage > 2) Set Systems Storage --> Storage software > 3) Set Storage Software --> Software defined storage > 4) Installed Version defaults to 4.1.1 > 5) Select your platform > > If you try and find the PTF via other links or sections of Fix Central you > will likely be disappointed. Work is ongoing to ensure this becomes more > intuitive - any thoughts for improvements always welcome. > > Best regards, > > *Ross Keeping* > IBM Spectrum Scale - Development Manager, People Manager > IBM Systems UK - Manchester Development Lab *Phone:* (+44 161) 8362381 > *-Line:* 37642381 > * E-mail: *ross.keeping at uk.ibm.com > [image: IBM] > 3rd Floor, Maybrook House > Manchester, M3 2EG > United Kingdom > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: not available URL: From orlando.richards at ed.ac.uk Wed Jun 24 12:27:25 2015 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Wed, 24 Jun 2015 12:27:25 +0100 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: References: Message-ID: <558A941D.2010206@ed.ac.uk> Hi all, I'm looking to deploy to RedHat 7.1, but from the GPFS FAQ only versions 4.1.1 and 3.5.0-26 are supported. I can't see a release of 3.5.0-26 on the fix central website - does anyone know if this is available? Will 3.5.0-25 work okay on RH7.1? How about 4.1.0-x - any plans to support that on RH7.1? ------- Orlando. On 16/06/15 09:11, Simon Thompson (Research Computing - IT Services) wrote: > The docs also now seem to be in Spectrum Scale section at: > > http://www-01.ibm.com/support/knowledgecenter/#!/STXKQY/411/ibmspectrumscale411_welcome.html > > Simon > > From: Ross Keeping3 > > Reply-To: gpfsug main discussion list > > Date: Monday, 15 June 2015 17:43 > To: "gpfsug-discuss at gpfsug.org " > > > Subject: [gpfsug-discuss] 4.1.1 fix central location > > Hi > > IBM successfully released 4.1.1 on Friday with the Spectrum Scale > re-branding and introduction of protocols etc. > > However, I initially had trouble finding the PTF - rest assured it does > exist. 
> > You can find the 4.1.1 main download here: > http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 > > There is a new area in Fix Central that you will need to navigate to get > the PTF for upgrade: > https://scwebtestt.rchland.ibm.com/support/fixcentral/ > 1) Set Product Group --> Systems Storage > 2) Set Systems Storage --> Storage software > 3) Set Storage Software --> Software defined storage > 4) Installed Version defaults to 4.1.1 > 5) Select your platform > > If you try and find the PTF via other links or sections of Fix Central > you will likely be disappointed. Work is ongoing to ensure this becomes > more intuitive - any thoughts for improvements always welcome. > > Best regards, > > *Ross Keeping* > IBM Spectrum Scale - Development Manager, People Manager > IBM Systems UK - Manchester Development Lab > *Phone:*(+44 161) 8362381*-Line:*37642381* > E-mail: *ross.keeping at uk.ibm.com > IBM > > 3rd Floor, Maybrook House > > Manchester, M3 2EG > > United Kingdom > > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Research Services Manager Information Services IT Infrastructure Division Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From chris.hunter at yale.edu Wed Jun 24 18:26:11 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Wed, 24 Jun 2015 13:26:11 -0400 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs Message-ID: <558AE833.6070803@yale.edu> Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. regards, chris hunter yale hpc group From ewahl at osc.edu Wed Jun 24 18:47:19 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Wed, 24 Jun 2015 17:47:19 +0000 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: <558AE833.6070803@yale.edu> References: <558AE833.6070803@yale.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5A174@CIO-KRC-D1MBX02.osuad.osu.edu> Both are available to you directly. in Linux anyway. My AIX knowledge is decades old. And yes, the HBAs have much more availability/data of course. What kind of monitoring are you looking to do? Fault? Take the data and ?? nagios/cactii/ganglia/etc? Mine it with Splunk? Expand the GPFS Monitor suite? sourceforge.net/projects/gpfsmonitorsuite (though with sourceforge lately, perhaps we should ask Pam et al. to move them?) Ed Wahl OSC ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Chris Hunter [chris.hunter at yale.edu] Sent: Wednesday, June 24, 2015 1:26 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. 
regards, chris hunter yale hpc group _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From bsallen at alcf.anl.gov Wed Jun 24 18:48:47 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Wed, 24 Jun 2015 17:48:47 +0000 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: <558AE833.6070803@yale.edu> References: <558AE833.6070803@yale.edu> Message-ID: <64F35432-4DFC-452C-8965-455BCF7E2F09@alcf.anl.gov> Checkout https://github.com/leibler/check_mk-sas2ircu. This is obviously check_mk specific, but a reasonable example. Ben > On Jun 24, 2015, at 12:26 PM, Chris Hunter wrote: > > Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? > > We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. > > regards, > chris hunter > yale hpc group > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chris.hunter at yale.edu Mon Jun 29 17:07:12 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Mon, 29 Jun 2015 12:07:12 -0400 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: References: Message-ID: <55916D30.3050709@yale.edu> Thanks for the info. We settled on a simpler perl wrapper around sas2ircu form nagios exchange. chris hunter yale hpc group > Checkout https://github.com/leibler/check_mk-sas2ircu This is obviously check_mk specific, but a reasonable example. Ben >> On Jun 24, 2015, at 12:26 PM, Chris Hunter wrote: >> >> Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures ? >> >> We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. >> >> regards, >> chris hunter >> yale hpc group From st.graf at fz-juelich.de Tue Jun 30 07:54:18 2015 From: st.graf at fz-juelich.de (Graf, Stephan) Date: Tue, 30 Jun 2015 06:54:18 +0000 Subject: [gpfsug-discuss] ESS/GSS GUI (Monitoring) Message-ID: <38A0607912A90F4880BDE29022E093054087CF1A@MBX2010-E01.ad.fz-juelich.de> Hi! If anyone is interested in a simple GUI for GSS/ESS we have one developed for our own (in the time when there was no GUI available). It is java based and the only requirement is to have passwordless access to the GSS nodes. (We start the GUI on our xCAT server). I have uploaded some screenshots: https://www.dropbox.com/sh/44kln4h7wgp18uu/AADsllhSxOdIeWtkNSaftu8Sa?dl=0 Stephan ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... 
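For anyone wanting a starting point for the sas2ircu approach, a rough bash equivalent of such a Nagios-style check is sketched below (this is not the plugin referred to above). It assumes sas2ircu is installed on the I/O servers and that failed drives show a "Failed" state string in the DISPLAY output; verify the exact strings against your controller and firmware before trusting it.

  #!/bin/bash
  # Sketch of a Nagios-style health check wrapped around sas2ircu.
  STATE_OK=0; STATE_CRITICAL=2; STATE_UNKNOWN=3

  # Controller indexes appear in the first column of 'sas2ircu LIST' output.
  controllers=$(sas2ircu LIST | awk '/^[[:space:]]*[0-9]+[[:space:]]/ {print $1}')
  if [ -z "$controllers" ]; then
      echo "UNKNOWN: no SAS controllers reported by sas2ircu"
      exit $STATE_UNKNOWN
  fi

  bad=0
  for c in $controllers; do
      # Count physical devices whose state line reports "Failed".
      n=$(sas2ircu "$c" DISPLAY | grep -c 'State.*Failed')
      bad=$((bad + n))
  done

  if [ "$bad" -gt 0 ]; then
      echo "CRITICAL: $bad device(s) reported as failed by sas2ircu"
      exit $STATE_CRITICAL
  fi
  echo "OK: no failed devices reported by sas2ircu"
  exit $STATE_OK

On the GPFS side of a GSS/ESS, commands such as mmlspdisk and mmlsenclosure give a similar view, but the point of the thread was monitoring that works independently of GPFS.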
From Daniel.Vogel at abcsystems.ch Tue Jun 30 08:49:10 2015 From: Daniel.Vogel at abcsystems.ch (Daniel Vogel) Date: Tue, 30 Jun 2015 07:49:10 +0000 Subject: [gpfsug-discuss] GPFS 4.1.1 without QoS for mmrestripefs? Message-ID: <2CDF270206A255459AC4FA6B08E52AF90114634DD0@ABCSYSEXC1.abcsystems.ch>
Hi
Years ago, IBM made plans to implement QoS for mmrestripefs, mmdeldisk and similar commands. While an mmrestripefs is running, NFS access performance is very poor. I opened a PMR to ask about QoS in version 4.1.1 (Spectrum Scale).
PMR 61309,113,848: I discussed the question of QoS with the development team. These command changes that were noticed are not meant to be used as GA code, which is why they are not documented. I cannot provide any further information from the support perspective.
Does anybody know more about QoS? My last hope was the "GPFS Workshop Stuttgart März 2015" with Sven Oehme as speaker.
Daniel Vogel IT Consultant ABC SYSTEMS AG Hauptsitz Zürich Rütistrasse 28 CH - 8952 Schlieren T +41 43 433 6 433 D +41 43 433 6 467 http://www.abcsystems.ch ABC - Always Better Concepts. Approved By Customers since 1981. -------------- next part -------------- An HTML attachment was scrubbed...
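On the QoS question: at the 4.1.x level there is indeed no supported throttle for maintenance commands, so the usual mitigations are to restrict which nodes take part in the restripe and to lower the number of parallel workers on those nodes. A rough sketch follows; the node names and the thread count are assumptions, and pitWorkerThreadsPerNode should be verified against the documentation for your release before use.

  # Sketch only: soften the impact of a rebalance on a cluster that is also serving NFS.
  # 1. Lower the parallel inode-traversal workers on the nodes that will do the work
  #    (depending on release this may need '-i' or a daemon restart to take effect):
  mmchconfig pitWorkerThreadsPerNode=2 -N nsd01,nsd02
  # 2. Run the restripe on that small set of nodes rather than the whole cluster:
  mmrestripefs fs1 -b -N nsd01,nsd02
  # 3. Put the setting back afterwards:
  mmchconfig pitWorkerThreadsPerNode=DEFAULT -N nsd01,nsd02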
From S.J.Thompson at bham.ac.uk Mon Jun 15 15:10:25 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Mon, 15 Jun 2015 14:10:25 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver Message-ID: Manila is one of the projects to provide 'shared' access to file-systems. I thought that at the moment the Manila driver doesn't support the GPFS protocol but is implemented on top of Ganesha, so access is provided over NFS. So you wouldn't get mmunlinkfileset. This brings me back to one of the things I talked about at the GPFS UG: the GPFS security model is trusting, which in multi-tenant environments is a bad thing. I know I've spoken to a few people recently who've commented / agreed / had thoughts on it, so can I ask that if multi-tenancy security is something that you think is of concern with GPFS, you drop me an email (directly is fine) with your use case and what sort of thing you'd like to see, then I'll collate this and have a go at talking to IBM again about this.
Thanks Simon From: , Edward > Reply-To: gpfsug main discussion list > Date: Monday, 15 June 2015 14:59 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? Ed Wahl OSC ________________________________ ++ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Mon Jun 15 15:16:44 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 15 Jun 2015 15:16:44 +0100 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: Message-ID: <1434377805.15671.126.camel@buzzard.phy.strath.ac.uk> On Mon, 2015-06-15 at 08:35 +0000, Luke Raimbach wrote: > Dear All, > > We are looking forward to using the manila driver for > auto-provisioning of file shares using GPFS. However, I have some > concerns... > > > Manila presumably gives tenant users access to file system commands > like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset > quiesces the file system, there is potentially an impact from one > tenant on another - i.e. someone unlinking and deleting a lot of > filesets during a tenancy cleanup might cause a cluster pause long > enough to trigger other failure events or even start evicting nodes. > You can see why this would be bad in a cloud environment. Er as far as I can see in the documentation no you don't. My personal experience is mmunlinkfileset has a habit of locking the file system up; aka don't do while the file system is busy. On the other hand mmlinkfileset you can do with gay abandonment. Might have changed in more recent version of GPFS. On the other hand you do get access to creating/deleting snapshots which on the deleting side has in the past for me personally has caused file system lockups. 
Similarly creating a snapshot no problem. The difference between the two is things that require quiescence to take away from the file system can cause bad things happen. Quiescence to add things to the file system rarely if ever cause problems. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From chris.hunter at yale.edu Mon Jun 15 15:35:06 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Mon, 15 Jun 2015 10:35:06 -0400 Subject: [gpfsug-discuss] OpenStack Manila Driver Message-ID: <557EE29A.4070909@yale.edu> Although likely not the access model you are seeking, GPFS is mentioned for the swift-on-file project: * https://github.com/stackforge/swiftonfile Openstack Swift uses HTTP/REST protocol for file access (ala S3), not the best choice for data-intensive applications. regards, chris hunter yale hpc group --- Date: Mon, 15 Jun 2015 08:35:18 +0000 From: Luke Raimbach To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE From ross.keeping at uk.ibm.com Mon Jun 15 17:43:07 2015 From: ross.keeping at uk.ibm.com (Ross Keeping3) Date: Mon, 15 Jun 2015 17:43:07 +0100 Subject: [gpfsug-discuss] 4.1.1 fix central location Message-ID: Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone: (+44 161) 8362381-Line: 37642381 E-mail: ross.keeping at uk.ibm.com 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/gif Size: 360 bytes Desc: not available URL: From ewahl at osc.edu Mon Jun 15 21:35:04 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Mon, 15 Jun 2015 20:35:04 +0000 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: References: Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> When I navigate using these instructions I can find the fixes, but attempting to get to them at the last step results in a loop back to the SDN screen. :( Not sure if this is the page, lack of the "proper" product in my supported products (still lists 3.5 as our product) or what. Ed ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ross.keeping at uk.ibm.com] Sent: Monday, June 15, 2015 12:43 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] 4.1.1 fix central location Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone: (+44 161) 8362381-Line: 37642381 E-mail: ross.keeping at uk.ibm.com [IBM] 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: ATT00001.gif URL: From oester at gmail.com Mon Jun 15 21:38:39 2015 From: oester at gmail.com (Bob Oesterlin) Date: Mon, 15 Jun 2015 15:38:39 -0500 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: It took me a while to find it too - Key is to search on "Spectrum Scale". Try this URL: http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%2Bdefined%2Bstorage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=4.1.1&platform=Linux+64-bit,x86_64&function=all If you don't want X86, just select the appropriate platform. Bob Oesterlin Nuance Communications On Mon, Jun 15, 2015 at 3:35 PM, Wahl, Edward wrote: > When I navigate using these instructions I can find the fixes, but > attempting to get to them at the last step results in a loop back to the > SDN screen. 
:( > > Not sure if this is the page, lack of the "proper" product in my supported > products (still lists 3.5 as our product) > or what. > > Ed > > ------------------------------ > *From:* gpfsug-discuss-bounces at gpfsug.org [ > gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ > ross.keeping at uk.ibm.com] > *Sent:* Monday, June 15, 2015 12:43 PM > *To:* gpfsug-discuss at gpfsug.org > *Subject:* [gpfsug-discuss] 4.1.1 fix central location > > Hi > > IBM successfully released 4.1.1 on Friday with the Spectrum Scale > re-branding and introduction of protocols etc. > > However, I initially had trouble finding the PTF - rest assured it does > exist. > > You can find the 4.1.1 main download here: > http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 > > There is a new area in Fix Central that you will need to navigate to get > the PTF for upgrade: > https://scwebtestt.rchland.ibm.com/support/fixcentral/ > 1) Set Product Group --> Systems Storage > 2) Set Systems Storage --> Storage software > 3) Set Storage Software --> Software defined storage > 4) Installed Version defaults to 4.1.1 > 5) Select your platform > > If you try and find the PTF via other links or sections of Fix Central you > will likely be disappointed. Work is ongoing to ensure this becomes more > intuitive - any thoughts for improvements always welcome. > > Best regards, > > *Ross Keeping* > IBM Spectrum Scale - Development Manager, People Manager > IBM Systems UK - Manchester Development Lab *Phone:* (+44 161) 8362381 > *-Line:* 37642381 > * E-mail: *ross.keeping at uk.ibm.com > [image: IBM] > 3rd Floor, Maybrook House > Manchester, M3 2EG > United Kingdom > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: not available URL: From Luke.Raimbach at crick.ac.uk Tue Jun 16 08:36:56 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Tue, 16 Jun 2015 07:36:56 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances which would be accessed over NFS. Perhaps I?ve not done enough research in to this though ? I?m also not an OpenStack expert. The tenants don?t have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. The unlinking of the fileset worries me for the reasons stated previously. 
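For reference, a rough sketch of the administrative sequence such a wrapper would drive for each share; the commands are the ones named later in this thread, while the file system name fs1, the fileset name share01, the junction path and the quota value are illustrative, and the exact mmsetquota syntax should be checked against your release:

# create an independent fileset to back the new share and link it into the namespace
mmcrfileset fs1 share01 --inode-space new
mmlinkfileset fs1 share01 -J /gpfs/fs1/shares/share01
mmsetquota fs1:share01 --block 1T:1T    # size the share; quota syntax varies by release

# tenancy cleanup -- it is the unlink step that briefly quiesces the file system
mmunlinkfileset fs1 share01
mmdelfileset fs1 share01 -f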
From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward Sent: 15 June 2015 15:00 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? Ed Wahl OSC ________________________________ ++ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Jun 16 09:11:23 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 16 Jun 2015 08:11:23 +0000 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: References: Message-ID: The docs also now seem to be in Spectrum Scale section at: http://www-01.ibm.com/support/knowledgecenter/#!/STXKQY/411/ibmspectrumscale411_welcome.html Simon From: Ross Keeping3 > Reply-To: gpfsug main discussion list > Date: Monday, 15 June 2015 17:43 To: "gpfsug-discuss at gpfsug.org" > Subject: [gpfsug-discuss] 4.1.1 fix central location Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. 
You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone:(+44 161) 8362381-Line:37642381 E-mail: ross.keeping at uk.ibm.com [IBM] 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: ATT00001.gif URL: From jonathan at buzzard.me.uk Tue Jun 16 09:40:30 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 16 Jun 2015 09:40:30 +0100 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> On Tue, 2015-06-16 at 07:36 +0000, Luke Raimbach wrote: [SNIP] > The tenants don?t have root access to the file system, but the Manila > component must act as a wrapper to file system administrative > equivalents like mmcrfileset, mmdelfileset, link and unlink. The > shares are created as GPFS filesets which are then presented over NFS. > What makes you think the it creates filesets as opposed to just sharing out a normal directory? I had a quick peruse over the documentation and source code and saw no mention of filesets, though I could have missed it. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From adam.huffman at crick.ac.uk Tue Jun 16 09:41:56 2015 From: adam.huffman at crick.ac.uk (Adam Huffman) Date: Tue, 16 Jun 2015 08:41:56 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: The presentation of shared storage by Manila isn?t necessarily via NFS. Some of the drivers, I believe the GPFS one amongst them, allow some form of native connection either via the guest or via a VirtFS connection to the client on the hypervisor. Best Wishes, Adam ? > On 16 Jun 2015, at 08:36, Luke Raimbach wrote: > > So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances which would be accessed over NFS. Perhaps I?ve not done enough research in to this though ? I?m also not an OpenStack expert. > > The tenants don?t have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. 
> > The unlinking of the fileset worries me for the reasons stated previously. > > From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward > Sent: 15 June 2015 15:00 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] OpenStack Manila Driver > > Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. > > I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? > > Ed Wahl > OSC > > > > ++ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] > Sent: Monday, June 15, 2015 4:35 AM > To: gpfsug-discuss at gpfsug.org > Subject: [gpfsug-discuss] OpenStack Manila Driver > > Dear All, > > We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... > > Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. > > > Has this scenario been addressed at all? > > Cheers, > Luke. > > > Luke Raimbach? > Senior HPC Data and Storage Systems Engineer > The Francis Crick Institute > Gibbs Building > 215 Euston Road > London NW1 2BE > > E: luke.raimbach at crick.ac.uk > W: www.crick.ac.uk > > The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. > The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From S.J.Thompson at bham.ac.uk Tue Jun 16 09:46:52 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 16 Jun 2015 08:46:52 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: I didn;t think that the *current* Manilla driver user GPFS protocol, but sat on top of Ganesha server. Simon On 16/06/2015 09:41, "Adam Huffman" wrote: > >The presentation of shared storage by Manila isn?t necessarily via NFS. >Some of the drivers, I believe the GPFS one amongst them, allow some form >of native connection either via the guest or via a VirtFS connection to >the client on the hypervisor. 
> >Best Wishes, >Adam > > >? > > > > > >> On 16 Jun 2015, at 08:36, Luke Raimbach >>wrote: >> >> So as I understand things, Manila is an OpenStack component which >>allows tenants to create and destroy shares for their instances which >>would be accessed over NFS. Perhaps I?ve not done enough research in to >>this though ? I?m also not an OpenStack expert. >> >> The tenants don?t have root access to the file system, but the Manila >>component must act as a wrapper to file system administrative >>equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares >>are created as GPFS filesets which are then presented over NFS. >> >> The unlinking of the fileset worries me for the reasons stated >>previously. >> >> From: gpfsug-discuss-bounces at gpfsug.org >>[mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward >> Sent: 15 June 2015 15:00 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] OpenStack Manila Driver >> >> Perhaps I misunderstand here, but if the tenants have administrative >>(ie:root) privileges to the underlying file system management commands I >>think mmunlinkfileset might be a minor concern here. There are FAR more >>destructive things that could occur. >> >> I am not an OpenStack expert and I've not even looked at anything past >>Kilo, but my understanding was that these commands were not necessary >>for tenants. They access a virtual block device that backs to GPFS, >>correct? >> >> Ed Wahl >> OSC >> >> >> >> ++ >> From: gpfsug-discuss-bounces at gpfsug.org >>[gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach >>[Luke.Raimbach at crick.ac.uk] >> Sent: Monday, June 15, 2015 4:35 AM >> To: gpfsug-discuss at gpfsug.org >> Subject: [gpfsug-discuss] OpenStack Manila Driver >> >> Dear All, >> >> We are looking forward to using the manila driver for auto-provisioning >>of file shares using GPFS. However, I have some concerns... >> >> Manila presumably gives tenant users access to file system commands >>like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset >>quiesces the file system, there is potentially an impact from one tenant >>on another - i.e. someone unlinking and deleting a lot of filesets >>during a tenancy cleanup might cause a cluster pause long enough to >>trigger other failure events or even start evicting nodes. You can see >>why this would be bad in a cloud environment. >> >> >> Has this scenario been addressed at all? >> >> Cheers, >> Luke. >> >> >> Luke Raimbach? >> Senior HPC Data and Storage Systems Engineer >> The Francis Crick Institute >> Gibbs Building >> 215 Euston Road >> London NW1 2BE >> >> E: luke.raimbach at crick.ac.uk >> W: www.crick.ac.uk >> >> The Francis Crick Institute Limited is a registered charity in England >>and Wales no. 1140062 and a company registered in England and Wales no. >>06885462, with its registered office at 215 Euston Road, London NW1 2BE. >> The Francis Crick Institute Limited is a registered charity in England >>and Wales no. 1140062 and a company registered in England and Wales no. >>06885462, with its registered office at 215 Euston Road, London NW1 2BE. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > >The Francis Crick Institute Limited is a registered charity in England >and Wales no. 1140062 and a company registered in England and Wales no. >06885462, with its registered office at 215 Euston Road, London NW1 2BE. 
>_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at gpfsug.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Luke.Raimbach at crick.ac.uk Tue Jun 16 09:48:50 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Tue, 16 Jun 2015 08:48:50 +0000 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> Message-ID: [SNIP] >> The tenants don?t have root access to the file system, but the Manila >> component must act as a wrapper to file system administrative >> equivalents like mmcrfileset, mmdelfileset, link and unlink. The >> shares are created as GPFS filesets which are then presented over NFS. >> > What makes you think the it creates filesets as opposed to just sharing out a normal directory? I had a quick peruse over the documentation and source code and saw no mention of filesets, though I could have missed it. I think you are right. Looking over the various resources I have available, the creation, deletion, linking and unlinking of filesets is not implemented, but commented on as needing to be done. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From jonathan at buzzard.me.uk Tue Jun 16 10:25:45 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 16 Jun 2015 10:25:45 +0100 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk> Message-ID: <1434446745.15671.134.camel@buzzard.phy.strath.ac.uk> On Tue, 2015-06-16 at 08:48 +0000, Luke Raimbach wrote: [SNIP] > I think you are right. Looking over the various resources I have > available, the creation, deletion, linking and unlinking of filesets is > not implemented, but commented on as needing to be done. That's going to be a right barrel of laughs as reliability goes out the window if they do implement it. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From billowen at us.ibm.com Thu Jun 18 05:22:25 2015 From: billowen at us.ibm.com (Bill Owen) Date: Wed, 17 Jun 2015 22:22:25 -0600 Subject: [gpfsug-discuss] OpenStack Manila Driver In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Hi Luke, Your explanation below is correct, with some minor clarifications Manila is an OpenStack project which allows storage admins to create and destroy filesystem shares and make those available to vm instances and bare metal servers which would be accessed over NFS. The Manila driver runs in the control plane and creates a new gpfs independent fileset for each new share. It provides automation for giving vm's (and also bare metal servers) acces to the shares so that they can mount and use the share. There is work being done to allow automating the mount process when the vm instance boots. The tenants don?t have root access to the file system, but the Manila component acts as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. 
The manila driver uses the following gpfs commands: When a share is created: mmcrfileset mmlinkfileset mmsetquota When a share is deleted: mmunlinkfileset mmdelfileset Snapshots of shares can be created and deleted: mmcrsnapshot mmdelsnapshot Today, the GPFS Manila driver supports creating NFS exports to VMs. We are considering adding native GPFS client support in the VM, but not sure if the benefit justifies the extra complexity of having gpfs client in vm image, and also the impact to cluster as vm's come up and down in a more dynamic way than physical nodes. For multi-tenant deployments, we recommend using a different filesystem per tenant to provide better separation of data, and to minimize the "noisy neighbor" effect for operations like mmunlinkfileset. Here is a presentation that shows an overview of the GPFS Manila driver: (See attached file: OpenStack_Storage_Manila_with_GPFS.pdf) Perhaps this, and other GPFS & OpenStack topics could be the subject of a future user group session. Regards, Bill Owen billowen at us.ibm.com GPFS and OpenStack 520-799-4829 From: Luke Raimbach To: gpfsug main discussion list Date: 06/16/2015 12:37 AM Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Sent by: gpfsug-discuss-bounces at gpfsug.org So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances which would be accessed over NFS. Perhaps I?ve not done enough research in to this though ? I?m also not an OpenStack expert. The tenants don?t have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS. The unlinking of the fileset worries me for the reasons stated previously. From: gpfsug-discuss-bounces at gpfsug.org [ mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward Sent: 15 June 2015 15:00 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] OpenStack Manila Driver Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur. I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct? Ed Wahl OSC ++ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk] Sent: Monday, June 15, 2015 4:35 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] OpenStack Manila Driver Dear All, We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns... Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment. Has this scenario been addressed at all? Cheers, Luke. Luke Raimbach? 
Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenStack_Storage_Manila_with_GPFS.pdf Type: application/pdf Size: 354887 bytes Desc: not available URL: From Luke.Raimbach at crick.ac.uk Thu Jun 18 13:30:40 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Thu, 18 Jun 2015 12:30:40 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations Message-ID: Hi All, Something I am thinking about doing is utilising the placement policy engine to insert custom metadata tags upon file creation, based on which fileset the creation occurs in. This might be to facilitate Research Data Management tasks that could happen later in the data lifecycle. I am also thinking about allowing users to specify additional custom metadata tags (maybe through a fancy web interface) and also potentially give users control over creating new filesets (e.g. for scientists running new experiments). So? pretend this is a placement policy on my GPFS driven data-ingest platform: RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' The fileset name can be meaningless (as far as the user is concerned), but would be linked somewhere nice that they recognise ? say /gpfs/incoming/instrument1. The fileset, when it is created, would also be an AFM cache for its ?home? counterpart which exists on a much larger (also GPFS driven) pool of storage? so that my metadata tags are preserved, you see. This potentially user driven activity might look a bit like this: - User logs in to web interface and creates new experiment - Filesets (system-generated names) are created on ?home? and ?ingest? file systems and linked into the directory namespace wherever the user specifies - AFM relationships are set up and established for the ingest (cache) fileset to write back to the AFM home fileset (probably Independent Writer mode) - A set of ?default? 
policies are defined and installed on the cache file system to tag data for that experiment (the user can?t change these) - The user now specifies additional metadata tags they want added to their experiment data (some of this might be captured through additional mandatory fields in the web form for instance) - A policy for later execution by mmapplypolicy on the AFM home file system is created which looks for the tags generated at ingest-time and applies the extra user-defined tags There?s much more that would go on later in the lifecycle to take care of automated HSM tiering, data publishing, movement and cataloguing of data onto external non GPFS file systems, etc. but I won?t go in to it here. My GPFS related questions are: When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. What is the specific limitation for having a policy placement file no larger than 1MB? Cheers, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer The Francis Crick Institute Gibbs Building 215 Euston Road London NW1 2BE E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Thu Jun 18 14:18:34 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 09:18:34 -0400 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: Yes, you can do this. In release 4.1.1 you can write SET POOL 'x' ACTION(setXattr(...)) FOR FILESET(...) WHERE ... which looks nicer to some people than WHERE ( ... ) AND setXattr(...) Answers: (1) No need to quiesce. As the new policy propagates, nodes begin using it. So there can be a transition time when node A may be using the new policy but Node B has not started using it yet. If that is undesirable, you can quiesce. (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. --marc of GPFS From: Luke Raimbach ... RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' ... (1) When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. ... (2) What is the specific limitation for having a policy placement file no larger than 1MB? ... -------------- next part -------------- An HTML attachment was scrubbed... 
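To make the mechanics concrete, here is a minimal sketch of installing such placement rules, assuming a file system named fs1, a fileset named rdm_ingest standing in for the generated fileset names above, and placeholder attribute values; the ACTION() form follows the 4.1.1 syntax described in the reply above:

cat > /tmp/placement.pol <<'EOF'
RULE 'RDMTEST' SET POOL 'instruments'
  ACTION(SetXattr('user.rdm.parent','<parent-uuid>'))
  FOR FILESET ('rdm_ingest')
RULE 'DEFAULT' SET POOL 'data'
EOF

# validate the rules without installing them, then install and confirm
mmchpolicy fs1 /tmp/placement.pol -I test
mmchpolicy fs1 /tmp/placement.pol
mmlspolicy fs1 -L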
URL: From S.J.Thompson at bham.ac.uk Thu Jun 18 14:27:52 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Thu, 18 Jun 2015 13:27:52 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: I can see exactly where Luke?s suggestion would be applicable. We might have several hundred active research projects which would have some sort of internal identifier, so I can see why you?d want to do this sort of tagging as it would allow a policy scan to find files related to specific projects (for example). Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Thursday, 18 June 2015 14:18 To: gpfsug main discussion list >, "luke.raimbach at crick.ac.uk" > Subject: Re: [gpfsug-discuss] Placement Policy Installation and RDM Considerations (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Luke.Raimbach at crick.ac.uk Thu Jun 18 14:35:32 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Thu, 18 Jun 2015 13:35:32 +0000 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: Hi Marc, Thanks for the pointer to the updated syntax. That indeed looks nicer. (1) Asynchronous policy propagation sounds good in our scenario. We don?t want to potentially interrupt other running experiments by having to quiesce the filesystem for a new one coming online. It is useful to know that you could quiesce if desired. Presumably this is a secret flag one might pass to mmchpolicy? (2) I was concerned about the evaluation time if I tried to set all extended attributes at creation time. That?s why I thought about adding a few ?system? defined tags which could later be used to link the files to an asynchronously applied policy on the home cluster. I think I calculated around 4,000 rules (dependent on the size of the attribute names and values), which might limit the number of experiments supported on a single ingest file system. However, I can?t envisage we will ever have 4,000 experiments running at once! I was really interested in why the limitation existed from a file-system architecture point of view. Thanks for the responses. Luke. From: Marc A Kaplan [mailto:makaplan at us.ibm.com] Sent: 18 June 2015 14:19 To: gpfsug main discussion list; Luke Raimbach Subject: Re: [gpfsug-discuss] Placement Policy Installation and RDM Considerations Yes, you can do this. In release 4.1.1 you can write SET POOL 'x' ACTION(setXattr(...)) FOR FILESET(...) WHERE ... which looks nicer to some people than WHERE ( ... ) AND setXattr(...) Answers: (1) No need to quiesce. As the new policy propagates, nodes begin using it. So there can be a transition time when node A may be using the new policy but Node B has not started using it yet. If that is undesirable, you can quiesce. (2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... 
So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time. --marc of GPFS From: Luke Raimbach > ... RULE 'RDMTEST' SET POOL 'instruments? FOR FILESET ('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') WHERE SetXattr ('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND SetXattr ('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0') RULE 'DEFAULT' SET POOL 'data' ... (1) When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong. ... (2) What is the specific limitation for having a policy placement file no larger than 1MB? ... The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo_fumagalli at it.ibm.com Thu Jun 18 14:38:00 2015 From: massimo_fumagalli at it.ibm.com (Massimo Fumagalli) Date: Thu, 18 Jun 2015 15:38:00 +0200 Subject: [gpfsug-discuss] ILM question Message-ID: Please, I need to know a simple question. Using Spectrum Scale 4.1.1, supposing to set ILM policy for migrating files from Filesystem Tier0 to TIer 1 or Tier2 (example using LTFS to library). Then we need to read a file that has been moved to library (or Tier1). Will be file copied back to Tier 0? Or read will be executed directly from Library or Tier1 ? since there can be performance issue Regards Max IBM Italia S.p.A. Sede Legale: Circonvallazione Idroscalo - 20090 Segrate (MI) Cap. Soc. euro 347.256.998,80 C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153 Societ? con unico azionista Societ? soggetta all?attivit? di direzione e coordinamento di International Business Machines Corporation (Salvo che sia diversamente indicato sopra / Unless stated otherwise above) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 10661 bytes Desc: not available URL: From ewahl at osc.edu Thu Jun 18 15:08:29 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Thu, 18 Jun 2015 14:08:29 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> We had a unique situation with one of our many storage arrays occur in the past couple of days and it brought up a question I've had before. Is there a better way to disable a Storage Pool by itself rather than 'mmchdisk stop' the entire list of disks from that pool or mmfsctl and exclude things, etc? Thoughts? In our case our array lost all raid protection in a certain pool (8+2) due to a hardware failure, and started showing drive checkcondition errors on other drives in the array. Yikes! This pool itself is only about 720T and is backed by tape, but who wants to restore that? Even with SOBAR/HSM that would be a loooong week. ^_^ We made the decision to take the entire file system offline during the repair/rebuild, but I would like to have run all other pools save this one in a simpler manner than we took to get there. I'm interested in people's experiences here for future planning and disaster recovery. 
GPFS itself worked exactly as we had planned and expected but I think there is room to improve the file system if I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. I may not be expressing myself here in the best manner possible. Bit of sleep deprivation after the last couple of days. ;) Ed Wahl OSC From Paul.Sanchez at deshaw.com Thu Jun 18 15:52:07 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Thu, 18 Jun 2015 14:52:07 +0000 Subject: [gpfsug-discuss] Member locations for Dev meeting organisation In-Reply-To: <557ED229.50902@gpfsug.org> References: <557ED229.50902@gpfsug.org> Message-ID: <201D6001C896B846A9CFC2E841986AC1454124B2@mailnycmb2a.winmail.deshaw.com> Thanks Jez, D. E. Shaw is based in New York, NY. We have 3-4 engineers/architects who would attend. Additionally, if you haven't heard from D. E. Shaw Research, they're next-door and have another 2. -Paul Sanchez Sent with Good (www.good.com) ________________________________ From: gpfsug-discuss-bounces at gpfsug.org on behalf of Jez Tucker (Chair) Sent: Monday, June 15, 2015 9:24:57 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Member locations for Dev meeting organisation Hello all It would be very handy if all members could send me an email to: chair at gpfsug.org with the City and Country in which you are located. We're looking to place 'Meet the Devs' coffee-shops close to you, so this would make planning several orders of magnitude easier. I can infer from each member's email, but it's only 'mostly accurate'. Stateside members - we're actively organising a first meet up near you imminently, so please ping me your locations. All the best, Jez (Chair) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Thu Jun 18 16:36:49 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 11:36:49 -0400 Subject: [gpfsug-discuss] Placement Policy Installation and RDM Considerations In-Reply-To: References: Message-ID: (1) There is no secret flag. I assume that the existing policy is okay but the new one is better. So start using the better one ASAP, but why stop the system if you don't have to? The not secret way to quiesce/resume a filesystem without unmounting is mmfsctl {suspend | suspend-write | resume}; (2) The policy rules text is passed as a string through a GPFS rpc protocol (not a standard RPC) and the designer/coder chose 1MB as a safety-limit. I think it could be increased, but suppose you did have 4000 rules, each 200 bytes - you'd be at 800KB, still short of the 1MB limit. (x) Personally, I wouldn't worry much about setting, say 10 extended attribute values in each rule. I'd worry more about the impact of having 100s of rules. (y) When designing/deploying a new GPFS filesystem, consider explicitly setting the inode size so that all anticipated extended attributes will be stored in the inode, rather than spilling into other disk blocks. See mmcrfs ... -i InodeSize. You can build a test filesystem with just one NSD/LUN and test your anticipated usage. Use tsdbfs ... xattr ... to see how EAs are stored. Caution: tsdbfs display commands are harmless, BUT there are some patch and patch-like subcommands that could foul up your filesystem. From: Luke Raimbach Hi Marc, Thanks for the pointer to the updated syntax. 
That indeed looks nicer. (1) Asynchronous policy propagation sounds good in our scenario. We don?t want to potentially interrupt other running experiments by having to quiesce the filesystem for a new one coming online. It is useful to know that you could quiesce if desired. Presumably this is a secret flag one might pass to mmchpolicy? (2) I was concerned about the evaluation time if I tried to set all extended attributes at creation time. That?s why I thought about adding a few ?system? defined tags which could later be used to link the files to an asynchronously applied policy on the home cluster. I think I calculated around 4,000 rules (dependent on the size of the attribute names and values), which might limit the number of experiments supported on a single ingest file system. However, I can?t envisage we will ever have 4,000 experiments running at once! I was really interested in why the limitation existed from a file-system architecture point of view. Thanks for the responses. Luke. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgiles at gmail.com Thu Jun 18 17:02:54 2015 From: zgiles at gmail.com (Zachary Giles) Date: Thu, 18 Jun 2015 12:02:54 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Sorry to hear about the problems you've had recently. It's frustrating when that happens. I didn't have the exactly the same situation, but we had something similar which may bring some light to a missing disks situation: We had a dataOnly storage pool backed by a few building blocks each consisting of several RAID controllers that were each direct attached to a few servers. We had several of these sets all in one pool. Thus, if a server failed it was fine, if a single link failed, it was fine. Potentially we could do copies=2 and have multiple failure groups in a single pool. If anything in the RAID arrays themselves failed, it was OK, but a single whole RAID controller going down would take that section of disks down. The number of copies was set to 1 on this pool. One RAID controller went down, but the file system as a whole stayed online. Our user experience was that Some users got IO errors of a "file inaccessible" type (I don't remember the exact code). Other users, and especially those mostly in other tiers continued to work as normal. As we had mostly small files across this tier ( much smaller than the GPFS block size ), most of the files were in one of the RAID controllers or another, thus not striping really, so even the files in other controllers on the same tier were also fine and accessible. Bottom line is: Only the files that were missing gave errors, the others were fine. Additionally, for missing files errors were reported which apps could capture and do something about, wait, or retry later -- not a D state process waiting forever or stale file handles. I'm not saying this is the best way. We didn't intend for this to happen. I suspect that stopping the disk would result in a similar experience but more safely. We asked GPFS devs if we needed to fsck after this since the tier just went offline directly and we continued to use the rest of the system while it was gone.. they said no it should be fine and missing blocks will be taken care of. 
I assume this is true, but I have no explicit proof, except that it's still working and nothing seemed to be missing. I guess some questions for the dev's would be: * Is this safe / advisable to do the above either directly or via a stop and then down the array? * Given that there is some client-side write caching in GPFS, if a file is being written and an expected final destination goes offline mid-write, where does the block go? + If a whole pool goes offline, will it pick another pool or error? + If it's a disk in a pool, will it reevaluate and round-robin to the next disk, or just fail since it had already decided where to write? Hope this helps a little. On Thu, Jun 18, 2015 at 10:08 AM, Wahl, Edward wrote: > We had a unique situation with one of our many storage arrays occur in the past couple of days and it brought up a question I've had before. Is there a better way to disable a Storage Pool by itself rather than 'mmchdisk stop' the entire list of disks from that pool or mmfsctl and exclude things, etc? Thoughts? > > In our case our array lost all raid protection in a certain pool (8+2) due to a hardware failure, and started showing drive checkcondition errors on other drives in the array. Yikes! This pool itself is only about 720T and is backed by tape, but who wants to restore that? Even with SOBAR/HSM that would be a loooong week. ^_^ We made the decision to take the entire file system offline during the repair/rebuild, but I would like to have run all other pools save this one in a simpler manner than we took to get there. > > I'm interested in people's experiences here for future planning and disaster recovery. GPFS itself worked exactly as we had planned and expected but I think there is room to improve the file system if I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. > > I may not be expressing myself here in the best manner possible. Bit of sleep deprivation after the last couple of days. ;) > > Ed Wahl > OSC > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From zgiles at gmail.com Thu Jun 18 17:06:33 2015 From: zgiles at gmail.com (Zachary Giles) Date: Thu, 18 Jun 2015 12:06:33 -0400 Subject: [gpfsug-discuss] ILM question In-Reply-To: References: Message-ID: I would expect it to return to one of your online tiers. If you tier between two storage pools, you can directly read and write those files. Think of how LTFS works -- it's an external storage pool, so you need to run an operation via an external command to give the file back to GPFS from which you can read it. This is controlled via the policies and I assume you would need to make a policy to specify where the file would be placed when it comes back. It would be fancy for someone to allow reading directly from an external pool, but as far as I know, it has to hit a disk first. What I don't know is: Will it begin streaming the files back to the user as the blocks hit the disk, while other blocks are still coming in, or must the whole file be recalled first? On Thu, Jun 18, 2015 at 9:38 AM, Massimo Fumagalli < massimo_fumagalli at it.ibm.com> wrote: > Please, I need to know a simple question. > > Using Spectrum Scale 4.1.1, supposing to set ILM policy for migrating > files from Filesystem Tier0 to TIer 1 or Tier2 (example using LTFS to > library). 
> Then we need to read a file that has been moved to library (or Tier1). > Will be file copied back to Tier 0? Or read will be executed directly from > Library or Tier1 ? since there can be performance issue > > Regards > Max > > > IBM Italia S.p.A. > Sede Legale: Circonvallazione Idroscalo - 20090 Segrate (MI) > Cap. Soc. euro 347.256.998,80 > C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153 > Societ? con unico azionista > Societ? soggetta all?attivit? di direzione e coordinamento di > International Business Machines Corporation > > (Salvo che sia diversamente indicato sopra / Unless stated otherwise above) > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- Zach Giles zgiles at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 10661 bytes Desc: not available URL: From chekh at stanford.edu Thu Jun 18 21:26:17 2015 From: chekh at stanford.edu (Alex Chekholko) Date: Thu, 18 Jun 2015 13:26:17 -0700 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <55832969.4050901@stanford.edu> mmlsdisk fs | grep pool | awk '{print $1} | tr '\n' ';'| xargs mmchdisk suspend # seems pretty simple to me Then I guess you also have to modify your policy rules which relate to that pool. You're asking for a convenience wrapper script for a super-uncommon situation? On 06/18/2015 09:02 AM, Zachary Giles wrote: > I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. -- Alex Chekholko chekh at stanford.edu From ewahl at osc.edu Thu Jun 18 21:36:48 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Thu, 18 Jun 2015 20:36:48 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <55832969.4050901@stanford.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <55832969.4050901@stanford.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A58A6A@CIO-KRC-D1MBX02.osuad.osu.edu> I'm not sure it's so uncommon, but yes. (and your line looks suspiciously like mine did) I've had other situations where it would have been nice to do maintenance on a single storage pool. Maybe this is a "scale" issue when you get too large and should maybe have multiple file systems instead? Single name space is nice for users though. Plus I was curious what others had done in similar situations. I guess I could do what IBM does and just write the stupid script, name it "ts-something" and put a happy wrapper up front with a mm-something name. ;) Just FYI: 'suspend' does NOT stop I/O. Only stops new block creation,so 'stop' was what I did. >From the man page: "...Existing data on a suspended disk may still be read or updated." Ed Wahl OSC ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Alex Chekholko [chekh at stanford.edu] Sent: Thursday, June 18, 2015 4:26 PM To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? 
mmlsdisk fs | grep pool | awk '{print $1} | tr '\n' ';'| xargs mmchdisk suspend # seems pretty simple to me Then I guess you also have to modify your policy rules which relate to that pool. You're asking for a convenience wrapper script for a super-uncommon situation? On 06/18/2015 09:02 AM, Zachary Giles wrote: > I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner. -- Alex Chekholko chekh at stanford.edu _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From makaplan at us.ibm.com Thu Jun 18 22:01:01 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 18 Jun 2015 17:01:01 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: What do you see as the pros and cons of using GPFS Native Raid and configuring your disk arrays as JBODs instead of using RAID in a box. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 21994 bytes Desc: not available URL: From stijn.deweirdt at ugent.be Fri Jun 19 08:18:31 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 09:18:31 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <5583C247.1090609@ugent.be> > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. just this week we had an issue with bad disk (of the non-failing but disrupting everything kind) and issues with the raid controller (db of both controllers corrupted due to the one disk, controller reboot loops etc etc). but tech support pulled it through, although it took a while. i'm amased what can be done with the hardware controllers (and i've seen my share of recoveries ;) my question to ibm wrt gss would be: can we have a "demo" of gss recovering from eg a drawer failure (eg pull both sas connectors to the drawer itself). i like the gss we have and the data recovery for single disk failures, but i'm not sure how well it does with major component failures. the demo could be the steps support would take to get it running again (e.g. can gss recover from a drawer failure, assuming the disks are still ok ofcourse). stijn > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From Luke.Raimbach at crick.ac.uk Fri Jun 19 08:47:10 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Fri, 19 Jun 2015 07:47:10 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: <5583C247.1090609@ugent.be> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <5583C247.1090609@ugent.be> Message-ID: my question to ibm wrt gss would be: can we have a "demo" of gss recovering from eg a drawer failure (eg pull both sas connectors to the drawer itself). 
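if such a demo ever happens, the sort of thing i'd want to watch while the drawer is out is roughly the following (these are the usual GNR commands; the recovery group name is invented):

    mmlsrecoverygroup                  # which recovery groups this server pair serves
    mmlsrecoverygroup rgL -L           # declustered arrays, vdisks and spare space
    mmlspdisk all --not-ok             # should grow by a drawer's worth of pdisks
    mmlsrecoverygroupevents rgL        # rebuild/rebalance events while it recovers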
i like the gss we have and the data recovery for single disk failures, but i'm not sure how well it does with major component failures. the demo could be the steps support would take to get it running again (e.g. can gss recover from a drawer failure, assuming the disks are still ok ofcourse). Ooh, we have a new one that's not in production yet. IBM say the latest GSS code should allow for a whole enclosure failure. I might try it before going in to production. The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From S.J.Thompson at bham.ac.uk Fri Jun 19 14:31:17 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 13:31:17 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Do you mean in GNR compared to using (san/IB) based hardware RAIDs? If so, then GNR isn?t a scale-out solution - you buy a ?unit? and can add another ?unit? to the namespace, but I can?t add another 30TB of storage (say a researcher with a grant), where as with SAN based RAID controllers, I can go off and buy another storage shelf. Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Thursday, 18 June 2015 22:01 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? What do you see as the pros and cons of using GPFS Native Raid and configuring your disk arrays as JBODs instead of using RAID in a box. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Fri Jun 19 14:51:32 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 19 Jun 2015 09:51:32 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: 1. YES, Native Raid can recover from various failures: drawers, cabling, controllers, power supplies, etc, etc. Of course it must be configured properly so that there is no possible single point of failure. But yes, you should get your hands on a test rig and try out (simulate) various failure scenarios and see how well it works. 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. 3. If you'd like to know and/or explore more, read the pubs, do the experiments, and/or contact the IBM sales and support people. IF by some chance you do not get satisfactory answers, come back here perhaps we can get your inquiries addressed by the GPFS design team. Like other complex products, there are bound to be some questions that the sales and marketing people can't quite address. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From S.J.Thompson at bham.ac.uk Fri Jun 19 14:56:33 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 13:56:33 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Er. Every time I've ever asked about GNR, the response has been that it's only available as packaged products, as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked about GNR was in May at the User group). So under (3), I'm posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewahl at osc.edu Fri Jun 19 15:37:13 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Fri, 19 Jun 2015 14:37:13 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> They seem to have killed dead the idea of just moving the software side of that by itself, wayyy.... back now - late 2013?/early 2014. Even though the components are fairly standard units (Engenio before, not sure now). Ironic, as GPFS/Spectrum Scale is storage software... Even more ironic for our site, as we have the controller-based units these JBODs are from for our metadata and one of our many storage pools. (DDN on others) Ed ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Simon Thompson (Research Computing - IT Services) [S.J.Thompson at bham.ac.uk] Sent: Friday, June 19, 2015 9:56 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? Er. Every time I've ever asked about GNR, the response has been that it's only available as packaged products, as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked about GNR was in May at the User group). So under (3), I'm posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2.
I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at us.ibm.com Fri Jun 19 15:49:32 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Fri, 19 Jun 2015 14:49:32 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <201506191450.t5JEovUS018695@d01av01.pok.ibm.com> GNR today is only sold as a packaged solution e.g. ESS. The reason its not sold as SW only today is technical and its not true that this is not been pursued, its just not there yet and we cant discuss plans on a mailinglist. Sven Sent from IBM Verse Wahl, Edward --- Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? --- From:"Wahl, Edward" To:"gpfsug main discussion list" Date:Fri, Jun 19, 2015 9:41 AMSubject:Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? They seem to have killed dead the idea of just moving the software side of that by itself, wayyy.... back now. late 2013?/early 2014 Even though the components are fairly standard units(Engenio before, not sure now). Ironic as GPFS/Spectrum Scale is storage software... Even more ironic for our site as we have the Controller based units these JBODs are from for our metadata and one of our many storage pools. (DDN on others) Ed From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Simon Thompson (Research Computing - IT Services) [S.J.Thompson at bham.ac.uk] Sent: Friday, June 19, 2015 9:56 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group). So under (3), I?m posting here asking if anyone from IBM knows anything different? Thanks Simon From: Marc A Kaplan Reply-To: gpfsug main discussion list Date: Friday, 19 June 2015 14:51 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations, comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgiles at gmail.com Fri Jun 19 15:56:19 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 10:56:19 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: I think it's technically possible to run GNR on unsupported trays. You may have to do some fiddling with some of the scripts, and/or you wont get proper reporting. Of course it probably violates 100 licenses etc etc etc. I don't know of anyone who's done it yet. I'd like to do it.. I think it would be great to learn it deeper by doing this. On Fri, Jun 19, 2015 at 9:56 AM, Simon Thompson (Research Computing - IT Services) wrote: > Er. Everytime I?ve ever asked about GNR,, the response has been that its > only available as packaged products as it has to understand things like the > shelf controllers, disk drives etc, in order for things like the disk > hospital to work. (And the last time I asked talked about GNR was in May at > the User group). > > So under (3), I?m posting here asking if anyone from IBM knows anything > different? > > Thanks > > Simon > > From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > 2. I don't know the details of the packaged products, but I believe you can > license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire or > need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com From jtucker at pixitmedia.com Fri Jun 19 16:05:24 2015 From: jtucker at pixitmedia.com (Jez Tucker (Chair)) Date: Fri, 19 Jun 2015 16:05:24 +0100 Subject: [gpfsug-discuss] Handing over chair@ Message-ID: <55842FB4.9030705@gpfsug.org> Hello all This is my last post as Chair for the foreseeable future. The next will come from Simon Thompson who assumes the post today for the next two years. I'm looking forward to Simon's tenure and wish him all the best with his endeavours. Myself, I'm moving over to UG Media Rep and will continue to support the User Group and committee in its efforts. My new email is jez.tucker at gpfsug.org Please keep sending through your City and Country locations, they're most helpful. Have a great weekend. All the best, Jez -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. From oehmes at us.ibm.com Fri Jun 19 16:09:41 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Fri, 19 Jun 2015 15:09:41 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: Message-ID: <201506191510.t5JFAdW2021605@d03av04.boulder.ibm.com> Reporting is not the issue; one of the main issues is that we can't talk to the enclosure, which results in losing the capability to replace a disk drive or turn any fault indicators on. It also prevents us from 'reading' the position of a drive within a tray or fault domain within an enclosure. Without that information we can't properly determine where we need to place the strips of a track to prevent data access loss in case an enclosure or component fails. Sven Sent from IBM Verse Zachary Giles --- Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? --- From: "Zachary Giles" To: "gpfsug main discussion list" Date: Fri, Jun 19, 2015 9:56 AM Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? I think it's technically possible to run GNR on unsupported trays. You may have to do some fiddling with some of the scripts, and/or you wont get proper reporting. Of course it probably violates 100 licenses etc etc etc. I don't know of anyone who's done it yet. I'd like to do it.. I think it would be great to learn it deeper by doing this. On Fri, Jun 19, 2015 at 9:56 AM, Simon Thompson (Research Computing - IT Services) wrote: > Er. Everytime I?ve ever asked about GNR,, the response has been that its > only available as packaged products as it has to understand things like the > shelf controllers, disk drives etc, in order for things like the disk > hospital to work. (And the last time I asked talked about GNR was in May at > the User group). > > So under (3), I?m posting here asking if anyone from IBM knows anything > different? > > Thanks > > Simon > > From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > 2. I don't know the details of the packaged products, but I believe you can > license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire or > need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Fri Jun 19 16:15:44 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Fri, 19 Jun 2015 16:15:44 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: > I think it's technically possible to run GNR on unsupported trays. You > may have to do some fiddling with some of the scripts, and/or you wont > get proper reporting. > Of course it probably violates 100 licenses etc etc etc. > I don't know of anyone who's done it yet. I'd like to do it..
I think > it would be great to learn it deeper by doing this. > One imagines that GNR uses the SCSI enclosure services to talk to the shelves. https://en.wikipedia.org/wiki/SCSI_Enclosure_Services https://en.wikipedia.org/wiki/SES-2_Enclosure_Management Which would suggest that anything that supported these would work. I did some experimentation with a spare EXP810 shelf a few years ago on a FC-AL on Linux. Kind all worked out the box. The other experiment with an EXP100 didn't work so well; with the EXP100 it would only work with the 250GB and 400GB drives that came with the dam thing. With the EXP810 I could screw random SATA drives into it and it all worked. My investigations concluded that the firmware on the EXP100 shelf determined if the drive was supported, but I could not work out how to upload modified firmware to the shelf. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From stijn.deweirdt at ugent.be Fri Jun 19 16:23:18 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 17:23:18 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <558433E6.8030708@ugent.be> hi marc, > 1. YES, Native Raid can recover from various failures: drawers, cabling, > controllers, power supplies, etc, etc. > Of course it must be configured properly so that there is no possible > single point of failure. hmmm, this is not really what i was asking about. but maybe it's easier in gss to do this properly (eg for 8+3 data protection, you only need 11 drawers if you can make sure the data+parity blocks are send to different drawers (sort of per drawer failure group, but internal to the vdisks), and the smallest setup is a gss24 which has 20 drawers). but i can't rememeber any manual suggestion the admin can control this (or is it the default?). anyway, i'm certainly interested in any config whitepapers or guides to see what is required for such setup. are these public somewhere? (have really searched for them). > > But yes, you should get your hands on a test rig and try out (simulate) > various failure scenarios and see how well it works. is there a way besides presales to get access to such setup? stijn > > 2. I don't know the details of the packaged products, but I believe you > can license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire > or need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > 3. If you'd like to know and/or explore more, read the pubs, do the > experiments, and/or contact the IBM sales and support people. > IF by some chance you do not get satisfactory answers, come back here > perhaps we can get your inquiries addressed by the > GPFS design team. Like other complex products, there are bound to be some > questions that the sales and marketing people > can't quite address. > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From jonathan at buzzard.me.uk Fri Jun 19 16:35:32 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Fri, 19 Jun 2015 16:35:32 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? 
How about GPFS Native Raid? In-Reply-To: <558433E6.8030708@ugent.be> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <558433E6.8030708@ugent.be> Message-ID: <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> On Fri, 2015-06-19 at 17:23 +0200, Stijn De Weirdt wrote: > hi marc, > > > > 1. YES, Native Raid can recover from various failures: drawers, cabling, > > controllers, power supplies, etc, etc. > > Of course it must be configured properly so that there is no possible > > single point of failure. > hmmm, this is not really what i was asking about. but maybe it's easier > in gss to do this properly (eg for 8+3 data protection, you only need 11 > drawers if you can make sure the data+parity blocks are send to > different drawers (sort of per drawer failure group, but internal to the > vdisks), and the smallest setup is a gss24 which has 20 drawers). > but i can't rememeber any manual suggestion the admin can control this > (or is it the default?). > I got the impression that GNR was more in line with the Engenio dynamic disk pools http://www.netapp.com/uk/technology/dynamic-disk-pools.aspx http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dynamic_disk_pooling_technical_report.pdf That is traditional RAID sucks with large numbers of big drives. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From bsallen at alcf.anl.gov Fri Jun 19 17:05:15 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Fri, 19 Jun 2015 16:05:15 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: > One imagines that GNR uses the SCSI enclosure services to talk to the > shelves. It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. Ben > On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: > > On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >> I think it's technically possible to run GNR on unsupported trays. You >> may have to do some fiddling with some of the scripts, and/or you wont >> get proper reporting. >> Of course it probably violates 100 licenses etc etc etc. >> I don't know of anyone who's done it yet. I'd like to do it.. I think >> it would be great to learn it deeper by doing this. >> > > One imagines that GNR uses the SCSI enclosure services to talk to the > shelves. > > https://en.wikipedia.org/wiki/SCSI_Enclosure_Services > https://en.wikipedia.org/wiki/SES-2_Enclosure_Management > > Which would suggest that anything that supported these would work. > > I did some experimentation with a spare EXP810 shelf a few years ago on > a FC-AL on Linux. Kind all worked out the box. The other experiment with > an EXP100 didn't work so well; with the EXP100 it would only work with > the 250GB and 400GB drives that came with the dam thing. With the EXP810 > I could screw random SATA drives into it and it all worked. 
My > investigations concluded that the firmware on the EXP100 shelf > determined if the drive was supported, but I could not work out how to > upload modified firmware to the shelf. > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From peserocka at gmail.com Fri Jun 19 17:09:44 2015 From: peserocka at gmail.com (Pete Sero) Date: Sat, 20 Jun 2015 00:09:44 +0800 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: vi my_enclosures.conf fwiw Peter On 2015 Jun 20 Sat, at 24:05, Allen, Benjamin S. wrote: >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. > > It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. > > So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. > > Ben > >> On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: >> >> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >>> I think it's technically possible to run GNR on unsupported trays. You >>> may have to do some fiddling with some of the scripts, and/or you wont >>> get proper reporting. >>> Of course it probably violates 100 licenses etc etc etc. >>> I don't know of anyone who's done it yet. I'd like to do it.. I think >>> it would be great to learn it deeper by doing this. >>> >> >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. >> >> https://en.wikipedia.org/wiki/SCSI_Enclosure_Services >> https://en.wikipedia.org/wiki/SES-2_Enclosure_Management >> >> Which would suggest that anything that supported these would work. >> >> I did some experimentation with a spare EXP810 shelf a few years ago on >> a FC-AL on Linux. Kind all worked out the box. The other experiment with >> an EXP100 didn't work so well; with the EXP100 it would only work with >> the 250GB and 400GB drives that came with the dam thing. With the EXP810 >> I could screw random SATA drives into it and it all worked. My >> investigations concluded that the firmware on the EXP100 shelf >> determined if the drive was supported, but I could not work out how to >> upload modified firmware to the shelf. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk >> Fife, United Kingdom. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Fri Jun 19 17:12:53 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 12:12:53 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? 
In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk> Message-ID: Ya, that's why I mentioned you'd probably have to fiddle with some scripts or something to help GNR figure out where disks are. Is definitely known that you can't just use any random enclosure given that GNR depends highly on the topology. Maybe in the future there would be a way to specify the topology or that a drive is at a specific position. On Fri, Jun 19, 2015 at 12:05 PM, Allen, Benjamin S. wrote: >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. > > It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way. > > So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology. > > Ben > >> On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote: >> >> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote: >>> I think it's technically possible to run GNR on unsupported trays. You >>> may have to do some fiddling with some of the scripts, and/or you wont >>> get proper reporting. >>> Of course it probably violates 100 licenses etc etc etc. >>> I don't know of anyone who's done it yet. I'd like to do it.. I think >>> it would be great to learn it deeper by doing this. >>> >> >> One imagines that GNR uses the SCSI enclosure services to talk to the >> shelves. >> >> https://en.wikipedia.org/wiki/SCSI_Enclosure_Services >> https://en.wikipedia.org/wiki/SES-2_Enclosure_Management >> >> Which would suggest that anything that supported these would work. >> >> I did some experimentation with a spare EXP810 shelf a few years ago on >> a FC-AL on Linux. Kind all worked out the box. The other experiment with >> an EXP100 didn't work so well; with the EXP100 it would only work with >> the 250GB and 400GB drives that came with the dam thing. With the EXP810 >> I could screw random SATA drives into it and it all worked. My >> investigations concluded that the firmware on the EXP100 shelf >> determined if the drive was supported, but I could not work out how to >> upload modified firmware to the shelf. >> >> JAB. >> >> -- >> Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk >> Fife, United Kingdom. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From makaplan at us.ibm.com Fri Jun 19 19:45:19 2015 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 19 Jun 2015 14:45:19 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: OOps... here is the official statement: GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). 
GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? -------------- next part -------------- An HTML attachment was scrubbed... URL: From stijn.deweirdt at ugent.be Fri Jun 19 20:01:04 2015 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Fri, 19 Jun 2015 21:01:04 +0200 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <558433E6.8030708@ugent.be> <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk> Message-ID: <558466F0.8000300@ugent.be> >>> 1. YES, Native Raid can recover from various failures: drawers, cabling, >>> controllers, power supplies, etc, etc. >>> Of course it must be configured properly so that there is no possible >>> single point of failure. >> hmmm, this is not really what i was asking about. but maybe it's easier >> in gss to do this properly (eg for 8+3 data protection, you only need 11 >> drawers if you can make sure the data+parity blocks are send to >> different drawers (sort of per drawer failure group, but internal to the >> vdisks), and the smallest setup is a gss24 which has 20 drawers). >> but i can't rememeber any manual suggestion the admin can control this >> (or is it the default?). >> > > I got the impression that GNR was more in line with the Engenio dynamic > disk pools well, it's uses some crush-like placement and some parity encoding scheme (regular raid6 for the DDP, some flavour of EC for GNR), but other then that, not much resemblence. DDP does not give you any control over where the data blocks are stored. i'm not sure about GNR, (but DDP does not state anywhere they are drawer failure proof ;). but GNR is more like a DDP then e.g. a ceph EC pool, in the sense that the hosts needs to see all disks (similar to the controller that needs access to the disks). > > http://www.netapp.com/uk/technology/dynamic-disk-pools.aspx > > http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dynamic_disk_pooling_technical_report.pdf > > That is traditional RAID sucks with large numbers of big drives. (btw it's one of those that we saw fail (and get recovered by tech support!) this week. tip of the week: turn on the SMmonitor service on at least one host, it's actually useful for something). stijn > > > JAB. > From S.J.Thompson at bham.ac.uk Fri Jun 19 20:17:32 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 19:17:32 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> , <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu>, Message-ID: My understanding I that GSS and IBM ESS are sold as pre configured systems. So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. 
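Compare that with what a tray-at-a-time purchase looks like on a plain SAN/RAID setup, which is the sort of incremental growth I mean. Roughly the below (device, server and pool names are made up): an NSD stanza file plus mmadddisk against the existing file system:

    # new_tray.stanza, one line per new LUN from the extra shelf:
    #   %nsd: device=/dev/mapper/mpatha nsd=nsd101 servers=nsd01,nsd02 usage=dataOnly pool=data failureGroup=10
    #   %nsd: device=/dev/mapper/mpathb nsd=nsd102 servers=nsd01,nsd02 usage=dataOnly pool=data failureGroup=10
    mmcrnsd -F new_tray.stanza            # define the new LUNs as NSDs
    mmadddisk gpfs0 -F new_tray.stanza    # grow the existing pool in place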
So whilst the GSS24 might be a 1PB system (large scale), it's essentially an appliance-type approach and not scalable in the sense that adding another shelf of storage to it isn't supported. So maybe it's the way it has been productised, and perhaps GNR is technically capable of having more shelves added, but if that isn't a supported route for the product then it's not something that, as a customer, I'd be able to buy. Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] Sent: 19 June 2015 19:45 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? OOps... here is the official statement: GPFS Native RAID (GNR) is available on the following: v IBM Power 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Fri Jun 19 21:08:14 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 16:08:14 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: It's comparable to other "large" controller systems. Take the DDN 10K/12K for example: You don't just buy one more shelf of disks, or 5 disks at a time from Walmart. You buy 5, 10, or 20 trays and populate enough disks to either hit your bandwidth or storage size requirement. Generally changing from 5 to 10 to 20 requires support to come on-site and recable it, and generally you either buy half or all the disks
> > So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. > > So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. > > So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] > Sent: 19 June 2015 19:45 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > OOps... here is the official statement: > > GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. > > I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From S.J.Thompson at bham.ac.uk Fri Jun 19 22:08:25 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 19 Jun 2015 21:08:25 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> , Message-ID: I'm not disputing that gnr is a cool technology. Just that as scale out, it doesn't work for our funding model. If we go back to the original question, if was pros and cons of gnr vs raid type storage. My point was really that I have research groups who come along and want to by xTb at a time. And that's relatively easy with a raid/san based approach. And at times that needs to be a direct purchase from our supplier based on the grant rather than an internal recharge. And the overhead of a smaller gss (twin servers) is much higher cost compared to a storewise tray. I'm also not really advocating that its arbitrary storage. Just saying id really like to see shelf at a time upgrades for it (and supports shelf only). Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Zachary Giles [zgiles at gmail.com] Sent: 19 June 2015 21:08 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? It's comparable to other "large" controller systems. Take the DDN 10K/12K for example: You don't just buy one more shelf of disks, or 5 disks at a time from Walmart. You buy 5, 10, or 20 trays and populate enough disks to either hit your bandwidth or storage size requirement. 
Generally changing from 5 to 10 to 20 requires support to come on-site and recable it, and generally you either buy half or all the disks slots worth of disks. The whole system is a building block and you buy N of them to get up to 10-20PB of storage. GSS is the same way, there are a few models and you just buy a packaged one. Technically, you can violate the above constraints, but then it may not work well and you probably can't buy it that way. I'm pretty sure DDN's going to look at you funny if you try to buy a 12K with 30 drives.. :) For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with buildin RAID, a pair of servers, and forget GNR. Or maybe GSS22? :) >From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 " Current high-density storage Models 24 and 26 remain available Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available 200 GB and 800 GB SSDs are also available The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, 26s is comprised of SSD drives or 1.2 TB hard SAS drives " On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - IT Services) wrote: > > My understanding I that GSS and IBM ESS are sold as pre configured systems. > > So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. > > So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. > > So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] > Sent: 19 June 2015 19:45 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > OOps... here is the official statement: > > GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. > > I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chris.hunter at yale.edu Fri Jun 19 22:18:51 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Fri, 19 Jun 2015 17:18:51 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? 
In-Reply-To: References: Message-ID: <5584873B.1080109@yale.edu> Smells of troll bait but I'll bite. "Declustered RAID" certainly has benefits for recovery of failed disks but I don't think it claims performance benefits over traditional RAID. GNR certainly has a large memory footprint. Object RAID is a close cousin that has flexbile expansion capability, depending on product packaging GNR could likely match these features. Argonne labs (Illinois USA) has done a lot with both GNR and RAID GPFS, I would be interested in their experiences. > Date: Thu, 18 Jun 2015 17:01:01 -0400 > From: Marc A Kaplan > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? > > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. chris hunter yale hpc group From zgiles at gmail.com Fri Jun 19 22:35:59 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 17:35:59 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: OK, back on topic: Honestly, I'm really glad you said that. I have that exact problem also -- a researcher will be funded for xTB of space, and we are told by the grants office that if something is purchased on a grant it belongs to them and it should have a sticker put on it that says "property of the govt' etc etc. We decided to (as an institution) put the money forward to purchase a large system ahead of time, and as grants come in, recover the cost back into the system by paying off our internal "negative balance". In this way we can get the benefit of a large storage system like performance and purchasing price, but provision storage into quotas as needed. We can even put stickers on a handful of drives in the GSS tray if that makes them feel happy. Could they request us to hand over their drives and take them out of our system? Maybe. if the Grants Office made us do it, sure, I'd drain some pools off and go hand them over.. but that will never happen because it's more valuable to them in our cluster than sitting on their table, and I'm not going to deliver the drives full of their data. That's their responsibility. Is it working? Yeah, but, I'm not a grants admin nor an accountant, so I'll let them figure that out, and they seem to be OK with this model. And yes, it's not going to work for all institutions unless you can put the money forward upfront, or do a group purchase at the end of a year. So I 100% agree, GNR doesn't really fit the model of purchasing a few drives at a time, and the grants things is still a problem. On Fri, Jun 19, 2015 at 5:08 PM, Simon Thompson (Research Computing - IT Services) wrote: > I'm not disputing that gnr is a cool technology. > > Just that as scale out, it doesn't work for our funding model. > > If we go back to the original question, if was pros and cons of gnr vs raid type storage. > > My point was really that I have research groups who come along and want to by xTb at a time. And that's relatively easy with a raid/san based approach. And at times that needs to be a direct purchase from our supplier based on the grant rather than an internal recharge. > > And the overhead of a smaller gss (twin servers) is much higher cost compared to a storewise tray. 
I'm also not really advocating that its arbitrary storage. Just saying id really like to see shelf at a time upgrades for it (and supports shelf only). > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Zachary Giles [zgiles at gmail.com] > Sent: 19 June 2015 21:08 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? > > It's comparable to other "large" controller systems. Take the DDN > 10K/12K for example: You don't just buy one more shelf of disks, or 5 > disks at a time from Walmart. You buy 5, 10, or 20 trays and populate > enough disks to either hit your bandwidth or storage size requirement. > Generally changing from 5 to 10 to 20 requires support to come on-site > and recable it, and generally you either buy half or all the disks > slots worth of disks. The whole system is a building block and you buy > N of them to get up to 10-20PB of storage. > GSS is the same way, there are a few models and you just buy a packaged one. > > Technically, you can violate the above constraints, but then it may > not work well and you probably can't buy it that way. > I'm pretty sure DDN's going to look at you funny if you try to buy a > 12K with 30 drives.. :) > > For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save > money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with > buildin RAID, a pair of servers, and forget GNR. > Or maybe GSS22? :) > > From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 > " > Current high-density storage Models 24 and 26 remain available > Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u > JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) > 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available > 200 GB and 800 GB SSDs are also available > The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, > 26s is comprised of SSD drives or 1.2 TB hard SAS drives > " > > > On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - > IT Services) wrote: >> >> My understanding I that GSS and IBM ESS are sold as pre configured systems. >> >> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >> >> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >> >> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >> >> Simon >> ________________________________________ >> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >> Sent: 19 June 2015 19:45 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >> >> OOps... here is the official statement: >> >> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). 
GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >> >> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Zach Giles > zgiles at gmail.com > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From chris.hunter at yale.edu Fri Jun 19 22:57:14 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Fri, 19 Jun 2015 17:57:14 -0400 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: References: Message-ID: <5584903A.3020203@yale.edu> I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. A JBOD solution that allows incremental drive expansion is desirable. chris hunter yale hpc group > From: Zachary Giles > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > OK, back on topic: > Honestly, I'm really glad you said that. I have that exact problem > also -- a researcher will be funded for xTB of space, and we are told > by the grants office that if something is purchased on a grant it > belongs to them and it should have a sticker put on it that says > "property of the govt' etc etc. > We decided to (as an institution) put the money forward to purchase a > large system ahead of time, and as grants come in, recover the cost > back into the system by paying off our internal "negative balance". In > this way we can get the benefit of a large storage system like > performance and purchasing price, but provision storage into quotas as > needed. We can even put stickers on a handful of drives in the GSS > tray if that makes them feel happy. > Could they request us to hand over their drives and take them out of > our system? Maybe. if the Grants Office made us do it, sure, I'd drain > some pools off and go hand them over.. but that will never happen > because it's more valuable to them in our cluster than sitting on > their table, and I'm not going to deliver the drives full of their > data. That's their responsibility. > > Is it working? Yeah, but, I'm not a grants admin nor an accountant, so > I'll let them figure that out, and they seem to be OK with this model. 
> And yes, it's not going to work for all institutions unless you can > put the money forward upfront, or do a group purchase at the end of a > year. > > So I 100% agree, GNR doesn't really fit the model of purchasing a few > drives at a time, and the grants things is still a problem. From jhick at lbl.gov Fri Jun 19 23:18:56 2015 From: jhick at lbl.gov (Jason Hick) Date: Fri, 19 Jun 2015 15:18:56 -0700 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: <5584903A.3020203@yale.edu> References: <5584903A.3020203@yale.edu> Message-ID: <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> For the same reason (storage expansions that follow funding needs), I want a 4 or 5U embedded server/JBOD with GNR. That would allow us to simply plugin the host interfaces (2-4 of them), configure an IP addr/host name and add it as NSDs to an existing GPFS file system. As opposed to dealing with racks of storage and architectural details. Jason > On Jun 19, 2015, at 2:57 PM, Chris Hunter wrote: > > I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. > > I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. > > We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. > > A JBOD solution that allows incremental drive expansion is desirable. > > chris hunter > yale hpc group > >> From: Zachary Giles >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by >> themselves? How about GPFS Native Raid? >> >> OK, back on topic: >> Honestly, I'm really glad you said that. I have that exact problem >> also -- a researcher will be funded for xTB of space, and we are told >> by the grants office that if something is purchased on a grant it >> belongs to them and it should have a sticker put on it that says >> "property of the govt' etc etc. >> We decided to (as an institution) put the money forward to purchase a >> large system ahead of time, and as grants come in, recover the cost >> back into the system by paying off our internal "negative balance". In >> this way we can get the benefit of a large storage system like >> performance and purchasing price, but provision storage into quotas as >> needed. We can even put stickers on a handful of drives in the GSS >> tray if that makes them feel happy. >> Could they request us to hand over their drives and take them out of >> our system? Maybe. if the Grants Office made us do it, sure, I'd drain >> some pools off and go hand them over.. but that will never happen >> because it's more valuable to them in our cluster than sitting on >> their table, and I'm not going to deliver the drives full of their >> data. That's their responsibility. >> >> Is it working? Yeah, but, I'm not a grants admin nor an accountant, so >> I'll let them figure that out, and they seem to be OK with this model. >> And yes, it's not going to work for all institutions unless you can >> put the money forward upfront, or do a group purchase at the end of a >> year. >> >> So I 100% agree, GNR doesn't really fit the model of purchasing a few >> drives at a time, and the grants things is still a problem. 
> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Fri Jun 19 23:54:39 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 19 Jun 2015 18:54:39 -0400 Subject: [gpfsug-discuss] How about GPFS Native Raid? In-Reply-To: <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> References: <5584903A.3020203@yale.edu> <685C1753-EB0F-4B45-9D8B-EF28CDB6FD39@lbl.gov> Message-ID: Starting to sound like Seagate/Xyratex there. :) On Fri, Jun 19, 2015 at 6:18 PM, Jason Hick wrote: > For the same reason (storage expansions that follow funding needs), I want a 4 or 5U embedded server/JBOD with GNR. That would allow us to simply plugin the host interfaces (2-4 of them), configure an IP addr/host name and add it as NSDs to an existing GPFS file system. > > As opposed to dealing with racks of storage and architectural details. > > Jason > >> On Jun 19, 2015, at 2:57 PM, Chris Hunter wrote: >> >> I'll 2nd Zach on this. The storage funding model vs the storage purchase model are a challenge. >> >> I should also mention often research grant funding can't be used to buy a storage "service" without additional penalties. So S3 or private storage cloud are not financially attractive. >> >> We used to have a "pay it forward" model where an investigator would buy ~10 drive batches, which sat on a shelf until we accumulated sufficient drives to fill a new enclosure. Interim, we would allocate storage from existing infrastructure to fulfill the order. >> >> A JBOD solution that allows incremental drive expansion is desirable. >> >> chris hunter >> yale hpc group >> >>> From: Zachary Giles >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by >>> themselves? How about GPFS Native Raid? >>> >>> OK, back on topic: >>> Honestly, I'm really glad you said that. I have that exact problem >>> also -- a researcher will be funded for xTB of space, and we are told >>> by the grants office that if something is purchased on a grant it >>> belongs to them and it should have a sticker put on it that says >>> "property of the govt' etc etc. >>> We decided to (as an institution) put the money forward to purchase a >>> large system ahead of time, and as grants come in, recover the cost >>> back into the system by paying off our internal "negative balance". In >>> this way we can get the benefit of a large storage system like >>> performance and purchasing price, but provision storage into quotas as >>> needed. We can even put stickers on a handful of drives in the GSS >>> tray if that makes them feel happy. >>> Could they request us to hand over their drives and take them out of >>> our system? Maybe. if the Grants Office made us do it, sure, I'd drain >>> some pools off and go hand them over.. but that will never happen >>> because it's more valuable to them in our cluster than sitting on >>> their table, and I'm not going to deliver the drives full of their >>> data. That's their responsibility. >>> >>> Is it working? Yeah, but, I'm not a grants admin nor an accountant, so >>> I'll let them figure that out, and they seem to be OK with this model. >>> And yes, it's not going to work for all institutions unless you can >>> put the money forward upfront, or do a group purchase at the end of a >>> year. >>> >>> So I 100% agree, GNR doesn't really fit the model of purchasing a few >>> drives at a time, and the grants things is still a problem. 
>> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From bsallen at alcf.anl.gov Sat Jun 20 00:12:53 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Fri, 19 Jun 2015 23:12:53 +0000 Subject: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? In-Reply-To: <5584873B.1080109@yale.edu> References: , <5584873B.1080109@yale.edu> Message-ID: <3a261dc3-e8a4-4550-bab2-db4cc0ffbaea@alcf.anl.gov> Let me know what specific questions you have. Ben From: Chris Hunter Sent: Jun 19, 2015 4:18 PM To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by, themselves? Smells of troll bait but I'll bite. "Declustered RAID" certainly has benefits for recovery of failed disks but I don't think it claims performance benefits over traditional RAID. GNR certainly has a large memory footprint. Object RAID is a close cousin that has flexbile expansion capability, depending on product packaging GNR could likely match these features. Argonne labs (Illinois USA) has done a lot with both GNR and RAID GPFS, I would be interested in their experiences. > Date: Thu, 18 Jun 2015 17:01:01 -0400 > From: Marc A Kaplan > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? > > What do you see as the pros and cons of using GPFS Native Raid and > configuring your disk arrays as JBODs instead of using RAID in a box. chris hunter yale hpc group _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From viccornell at gmail.com Sat Jun 20 22:12:53 2015 From: viccornell at gmail.com (Vic Cornell) Date: Sat, 20 Jun 2015 22:12:53 +0100 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> Just to make sure everybody is up to date on this, (I work for DDN BTW): > On 19 Jun 2015, at 21:08, Zachary Giles wrote: > > It's comparable to other "large" controller systems. Take the DDN > 10K/12K for example: You don't just buy one more shelf of disks, or 5 > disks at a time from Walmart. You buy 5, 10, or 20 trays and populate > enough disks to either hit your bandwidth or storage size requirement. With the 12K you can buy 1,2,3,4,5,,10 or 20. With the 7700/Gs7K you can buy 1 ,2 ,3,4 or 5. GS7K comes with 2 controllers and 60 disk slots all in 4U, it saturates (with GPFS scatter) at about 160- 180 NL- SAS disks and you can concatenate as many of them together as you like. I guess the thing with GPFS is that you can pick your ideal building block and then scale with it as far as you like. > Generally changing from 5 to 10 to 20 requires support to come on-site > and recable it, and generally you either buy half or all the disks > slots worth of disks. You can start off with as few as 2 disks in a system . 
We have lots of people who buy partially populated systems and then sell on capacity to users, buying disks in groups of 10, 20 or more - thats what the flexibility of GPFS is all about, yes? > The whole system is a building block and you buy > N of them to get up to 10-20PB of storage. > GSS is the same way, there are a few models and you just buy a packaged one. > > Technically, you can violate the above constraints, but then it may > not work well and you probably can't buy it that way. > I'm pretty sure DDN's going to look at you funny if you try to buy a > 12K with 30 drives.. :) Nobody at DDN is going to look at you funny if you say you want to buy something :-). We have as many different procurement strategies as we have customers. If all you can afford with your infrastructure money is 30 drives to get you off the ground and you know that researchers/users will come to you with money for capacity down the line then a 30 drive 12K makes perfect sense. Most configs with external servers can be made to work. The embedded (12KXE, GS7K ) are a bit more limited in how you can arrange disks and put services on NSD servers but thats the tradeoff for the smaller footprint. Happy to expand on any of this on or offline. Vic > > For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save > money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with > buildin RAID, a pair of servers, and forget GNR. > Or maybe GSS22? :) > > From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 > " > Current high-density storage Models 24 and 26 remain available > Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u > JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) > 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available > 200 GB and 800 GB SSDs are also available > The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, > 26s is comprised of SSD drives or 1.2 TB hard SAS drives > " > > > On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - > IT Services) wrote: >> >> My understanding I that GSS and IBM ESS are sold as pre configured systems. >> >> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >> >> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >> >> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >> >> Simon >> ________________________________________ >> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >> Sent: 19 June 2015 19:45 >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >> >> OOps... here is the official statement: >> >> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. 
GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >> >> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Zach Giles > zgiles at gmail.com > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From zgiles at gmail.com Sat Jun 20 23:40:58 2015 From: zgiles at gmail.com (Zachary Giles) Date: Sat, 20 Jun 2015 18:40:58 -0400 Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? In-Reply-To: <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu> <2AF9D73B-FFE8-4EDD-A7A1-E8F8037C6A15@gmail.com> Message-ID: All true. I wasn't trying to knock DDN or say "it can't be done", it's just (probably) not very efficient or cost effective to buy a 12K with 30 drives (as an example). The new 7700 looks like a really nice base a small building block. I had forgot about them. There is a good box for adding 4U at a time, and with 60 drives per enclosure, if you saturated it out at ~3 enclosure / 180 drives, you'd have 1PB, which is also a nice round building block size. :thumb up: On Sat, Jun 20, 2015 at 5:12 PM, Vic Cornell wrote: > Just to make sure everybody is up to date on this, (I work for DDN BTW): > >> On 19 Jun 2015, at 21:08, Zachary Giles wrote: >> >> It's comparable to other "large" controller systems. Take the DDN >> 10K/12K for example: You don't just buy one more shelf of disks, or 5 >> disks at a time from Walmart. You buy 5, 10, or 20 trays and populate >> enough disks to either hit your bandwidth or storage size requirement. > > With the 12K you can buy 1,2,3,4,5,,10 or 20. > > With the 7700/Gs7K you can buy 1 ,2 ,3,4 or 5. > > GS7K comes with 2 controllers and 60 disk slots all in 4U, it saturates (with GPFS scatter) at about 160- 180 NL- SAS disks and you can concatenate as many of them together as you like. I guess the thing with GPFS is that you can pick your ideal building block and then scale with it as far as you like. > >> Generally changing from 5 to 10 to 20 requires support to come on-site >> and recable it, and generally you either buy half or all the disks >> slots worth of disks. > > You can start off with as few as 2 disks in a system . We have lots of people who buy partially populated systems and then sell on capacity to users, buying disks in groups of 10, 20 or more - thats what the flexibility of GPFS is all about, yes? > >> The whole system is a building block and you buy >> N of them to get up to 10-20PB of storage. >> GSS is the same way, there are a few models and you just buy a packaged one. >> >> Technically, you can violate the above constraints, but then it may >> not work well and you probably can't buy it that way. >> I'm pretty sure DDN's going to look at you funny if you try to buy a >> 12K with 30 drives.. :) > > Nobody at DDN is going to look at you funny if you say you want to buy something :-). We have as many different procurement strategies as we have customers. 
If all you can afford with your infrastructure money is 30 drives to get you off the ground and you know that researchers/users will come to you with money for capacity down the line then a 30 drive 12K makes perfect sense. > > Most configs with external servers can be made to work. The embedded (12KXE, GS7K ) are a bit more limited in how you can arrange disks and put services on NSD servers but thats the tradeoff for the smaller footprint. > > Happy to expand on any of this on or offline. > > Vic > > >> >> For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save >> money. Or, buy maybe just 2 NetAPP / LSI / Engenio enclosure with >> buildin RAID, a pair of servers, and forget GNR. >> Or maybe GSS22? :) >> >> From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098 >> " >> Current high-density storage Models 24 and 26 remain available >> Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u >> JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs) >> 1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available >> 200 GB and 800 GB SSDs are also available >> The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s, >> 26s is comprised of SSD drives or 1.2 TB hard SAS drives >> " >> >> >> On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing - >> IT Services) wrote: >>> >>> My understanding I that GSS and IBM ESS are sold as pre configured systems. >>> >>> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives. >>> >>> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system. >>> >>> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy. >>> >>> Simon >>> ________________________________________ >>> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com] >>> Sent: 19 June 2015 19:45 >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? >>> >>> OOps... here is the official statement: >>> >>> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments. >>> >>> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...? 
>>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at gpfsug.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> -- >> Zach Giles >> zgiles at gmail.com >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Zach Giles zgiles at gmail.com From chair at gpfsug.org Mon Jun 22 08:57:49 2015 From: chair at gpfsug.org (GPFS UG Chair) Date: Mon, 22 Jun 2015 08:57:49 +0100 Subject: [gpfsug-discuss] chair@GPFS UG Message-ID: Hi all, Just to follow up from Jez's email last week I'm now taking over as chair of the group. I'd like to thank Jez for his work with the group over the past couple of years in developing it to where it is now (as well as Claire who is staying on as secretary!). We're still interested in sector reps for the group, so if you are a GPFS user in a specific sector and would be interested in this, please let me know. As there haven't really been any sector reps before, we'll see how that works out, but I can't see it being a lot of work! On the US side of things, I need to catch up with Jez and Claire to see where things are up to. And finally, just as a quick head's up, we're pencilled in to have a user group mini (2hr) meeting in the UK in December as one of the breakout groups at the annual MEW event, once the dates for this are published I'll send out a save the date. If you are a user and interested in speaking, also let me know as well as anything else you might like to see there. Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From jamiedavis at us.ibm.com Mon Jun 22 14:04:23 2015 From: jamiedavis at us.ibm.com (James Davis) Date: Mon, 22 Jun 2015 13:04:23 +0000 Subject: [gpfsug-discuss] Placement Policy Installation andRDMConsiderations In-Reply-To: References: , Message-ID: <201506221305.t5MD5Owv014072@d01av05.pok.ibm.com> An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Mon Jun 22 16:28:22 2015 From: bevans at pixitmedia.com (Barry Evans) Date: Mon, 22 Jun 2015 16:28:22 +0100 Subject: [gpfsug-discuss] LROC Express Message-ID: <55882996.6050903@pixitmedia.com> Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. 
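(A minimal sketch of what enabling LROC on a client node typically involves, for context on the licensing question above. It assumes the 4.1.x NSD stanza keyword usage=localCache and the mmchconfig options lrocData, lrocDirectories and lrocInodes; treat those option names, the device path, the NSD name and the stanza layout as illustrative and verify them against the mmcrnsd and mmchconfig documentation for your release. The script below is a dry run: it only prints the stanza and commands rather than executing them.)

#!/usr/bin/env python
# Sketch: declare a local SSD as an LROC (local read-only cache) device
# on a GPFS client node. All GPFS option names here are assumptions to
# be checked against the 4.1.x documentation; nothing is executed.

import socket

def lroc_setup_commands(device, nsd_name, node=None):
    node = node or socket.gethostname()
    # NSD stanza marking the local SSD as a cache device.
    # servers= names the node that owns the SSD; check whether your
    # release expects it for usage=localCache NSDs.
    stanza = (
        "%nsd:\n"
        "  device=" + device + "\n"
        "  nsd=" + nsd_name + "\n"
        "  servers=" + node + "\n"
        "  usage=localCache\n"
    )
    commands = [
        "mmcrnsd -F /tmp/lroc_stanza",                    # register the cache NSD
        "mmchconfig lrocData=yes -N " + node,             # cache file data blocks
        "mmchconfig lrocDirectories=yes -N " + node,      # cache directory blocks
        "mmchconfig lrocInodes=yes -N " + node,           # cache inodes
        "mmdiag --lroc",                                  # inspect cache statistics afterwards
    ]
    return stanza, commands

if __name__ == "__main__":
    stanza, commands = lroc_setup_commands("/dev/sdx", "client01_lroc")
    print("Write this stanza to /tmp/lroc_stanza:\n")
    print(stanza)
    print("Then run, as root, on a node licensed for LROC:")
    for cmd in commands:
        print("  " + cmd)

Once the cache NSD is active, mmdiag --lroc on that node reports the cache statistics referred to later in this thread.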
From oester at gmail.com Mon Jun 22 16:36:08 2015 From: oester at gmail.com (Bob Oesterlin) Date: Mon, 22 Jun 2015 10:36:08 -0500 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882996.6050903@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> Message-ID: It works with Standard edition, just make sure you have the right license for the nodes using LROC. Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans wrote: > Hi All, > > Very quick question for those in the know - does LROC require a standard > license, or will it work with Express? I can't find anything in the FAQ > regarding this so I presume Express is ok, but wanted to make sure. > > Regards, > Barry Evans > Technical Director > Pixit Media/ArcaStream > > > > -- > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Mon Jun 22 16:39:49 2015 From: bevans at pixitmedia.com (Barry Evans) Date: Mon, 22 Jun 2015 16:39:49 +0100 Subject: [gpfsug-discuss] LROC Express In-Reply-To: References: <55882996.6050903@pixitmedia.com> Message-ID: <55882C45.6090501@pixitmedia.com> Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: > It works with Standard edition, just make sure you have the right > license for the nodes using LROC. > > Bob Oesterlin > Nuance COmmunications > > > Bob Oesterlin > > > On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: > > Hi All, > > Very quick question for those in the know - does LROC require a > standard license, or will it work with Express? I can't find > anything in the FAQ regarding this so I presume Express is ok, but > wanted to make sure. > > Regards, > Barry Evans > Technical Director > Pixit Media/ArcaStream > > > > -- > > This email is confidential in that it is intended for the > exclusive attention of the addressee(s) indicated. If you are not > the intended recipient, this email should not be read or disclosed > to any other person. Please notify the sender immediately and > delete this email from your computer system. Any opinions > expressed are not necessarily those of the company from which this > email was sent and, whilst to the best of our knowledge no viruses > or defects exist, no responsibility can be accepted for any loss > or damage arising from its receipt or subsequent use of this email. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oester at gmail.com Mon Jun 22 16:45:33 2015 From: oester at gmail.com (Bob Oesterlin) Date: Mon, 22 Jun 2015 10:45:33 -0500 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882C45.6090501@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> <55882C45.6090501@pixitmedia.com> Message-ID: I only have a Standard Edition, so I can't say for sure. I do know it's Linux x86 only. This doesn't seem to say directly either: http://www-01.ibm.com/support/knowledgecenter/SSFKCN/gpfs4104/gpfsclustersfaq.html%23lic41?lang=en Bob Oesterlin On Mon, Jun 22, 2015 at 10:39 AM, Barry Evans wrote: > Hi Bob, > > Thanks for this, just to confirm does this mean that it *does not* work > with express? > > Cheers, > Barry > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Paul.Sanchez at deshaw.com Mon Jun 22 23:57:10 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Mon, 22 Jun 2015 22:57:10 +0000 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <55882C45.6090501@pixitmedia.com> References: <55882996.6050903@pixitmedia.com> <55882C45.6090501@pixitmedia.com> Message-ID: <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. 
For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss [http://www.pixitmedia.com/sig/sig-cio.jpg] This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From oehmes at us.ibm.com Tue Jun 23 00:14:09 2015 From: oehmes at us.ibm.com (Sven Oehme) Date: Mon, 22 Jun 2015 16:14:09 -0700 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> References: <55882996.6050903@pixitmedia.com><55882C45.6090501@pixitmedia.com> <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> Message-ID: <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> Hi Paul, just out of curiosity, not that i promise anything, but would it be enough to support include/exclude per fileset level or would we need path and/or extension or even more things like owner of files as well ? Sven ------------------------------------------ Sven Oehme Scalable Storage Research email: oehmes at us.ibm.com Phone: +1 (408) 824-8904 IBM Almaden Research Lab ------------------------------------------ From: "Sanchez, Paul" To: gpfsug main discussion list Date: 06/22/2015 03:57 PM Subject: Re: [gpfsug-discuss] LROC Express Sent by: gpfsug-discuss-bounces at gpfsug.org I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [ mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. 
Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Paul.Sanchez at deshaw.com Tue Jun 23 15:10:31 2015 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Tue, 23 Jun 2015 14:10:31 +0000 Subject: [gpfsug-discuss] LROC Express In-Reply-To: <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> References: <55882996.6050903@pixitmedia.com><55882C45.6090501@pixitmedia.com> <201D6001C896B846A9CFC2E841986AC145418312@mailnycmb2a.winmail.deshaw.com> <201506222315.t5MNF137006166@d03av02.boulder.ibm.com> Message-ID: <201D6001C896B846A9CFC2E841986AC145418BC0@mailnycmb2a.winmail.deshaw.com> Hi Sven, Yes, I think that fileset level include/exclude would be sufficient for us. It also begs the question about the same for write caching. We haven?t experimented with it yet, but are looking forward to employing HAWC for scratch-like workloads. Do you imagine providing the same sort of HAWC bypass include/exclude to be part of this? That might be useful for excluding datasets where the write ingest rate isn?t massive and the degree of risk we?re comfortable with potential data recovery issues in the face of complex outages may be much lower. 
Thanks, Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Sven Oehme Sent: Monday, June 22, 2015 7:14 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Paul, just out of curiosity, not that i promise anything, but would it be enough to support include/exclude per fileset level or would we need path and/or extension or even more things like owner of files as well ? Sven ------------------------------------------ Sven Oehme Scalable Storage Research email: oehmes at us.ibm.com Phone: +1 (408) 824-8904 IBM Almaden Research Lab ------------------------------------------ [Inactive hide details for "Sanchez, Paul" ---06/22/2015 03:57:29 PM---I can?t confirm whether it works with Express, since we]"Sanchez, Paul" ---06/22/2015 03:57:29 PM---I can?t confirm whether it works with Express, since we?re also running standard. But as a simple t From: "Sanchez, Paul" > To: gpfsug main discussion list > Date: 06/22/2015 03:57 PM Subject: Re: [gpfsug-discuss] LROC Express Sent by: gpfsug-discuss-bounces at gpfsug.org ________________________________ I can?t confirm whether it works with Express, since we?re also running standard. But as a simple test, I can confirm that deleting the gpfs.ext package doesn?t seem to throw any errors w.r.t. LROC in the logs at startup, and ?mmdiag --lroc? looks normal when running without gpfs.ext (Standard Edition package). Since we have other standard edition features enabled, I couldn?t get far enough to actually test whether the LROC was still functional though. In the earliest 4.1.0.x releases the use of LROC was confused with ?serving NSDs? and so the use of the feature required a server license, and it did throw errors at startup about that. We?ve confirmed that in recent releases that this is no longer a limitation, and that it was indeed erroneous since the goal of LROC was pretty clearly to extend the capabilities of client-side pagepool caching. LROC also appears to have some rate-limiting (queue depth mgmt?) so you end up in many cases getting partial file-caching after a first read, and a subsequent read can have a mix of blocks served from local cache and from NSD. Further reads can result in more complete local block caching of the file. One missing improvement would be to allow its use on a per-filesystem (or per-pool) basis. For instance, when backing a filesystem with a huge array of Flash or even a GSS/ESS then the performance benefit of LROC may be negligible or even negative, depending on the performance characteristics of the local disk. But against filesystems on archival media, LROC will almost always be a win. Since we?ve started to see features using tracked I/O access time to individual NSDs (e.g. readReplicaPolicy=fastest), there?s potential here to do something adaptive based on heuristics as well. Anyone else using this at scale and seeing a need for additional knobs? Thx Paul From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Barry Evans Sent: Monday, June 22, 2015 11:40 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] LROC Express Hi Bob, Thanks for this, just to confirm does this mean that it *does not* work with express? Cheers, Barry Bob Oesterlin wrote: It works with Standard edition, just make sure you have the right license for the nodes using LROC. 
Bob Oesterlin Nuance COmmunications Bob Oesterlin On Mon, Jun 22, 2015 at 10:28 AM, Barry Evans > wrote: Hi All, Very quick question for those in the know - does LROC require a standard license, or will it work with Express? I can't find anything in the FAQ regarding this so I presume Express is ok, but wanted to make sure. Regards, Barry Evans Technical Director Pixit Media/ArcaStream -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 105 bytes Desc: image001.gif URL: From ewahl at osc.edu Tue Jun 23 15:11:11 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Tue, 23 Jun 2015 14:11:11 +0000 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> References: , <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> FYI this page causes problems with various versions of Chrome and Firefox (too lazy to test other browsers, sorry) Seems to be a javascript issue. Huge surprise, right? I've filed bugs on the browser sides for FF, don't care about chrome sorry. Ed Wahl OSC ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Wahl, Edward [ewahl at osc.edu] Sent: Monday, June 15, 2015 4:35 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] 4.1.1 fix central location When I navigate using these instructions I can find the fixes, but attempting to get to them at the last step results in a loop back to the SDN screen. :( Not sure if this is the page, lack of the "proper" product in my supported products (still lists 3.5 as our product) or what. 
Ed ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ross.keeping at uk.ibm.com] Sent: Monday, June 15, 2015 12:43 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] 4.1.1 fix central location Hi IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc. However, I initially had trouble finding the PTF - rest assured it does exist. You can find the 4.1.1 main download here: http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade: https://scwebtestt.rchland.ibm.com/support/fixcentral/ 1) Set Product Group --> Systems Storage 2) Set Systems Storage --> Storage software 3) Set Storage Software --> Software defined storage 4) Installed Version defaults to 4.1.1 5) Select your platform If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome. Best regards, Ross Keeping IBM Spectrum Scale - Development Manager, People Manager IBM Systems UK - Manchester Development Lab Phone: (+44 161) 8362381-Line: 37642381 E-mail: ross.keeping at uk.ibm.com [IBM] 3rd Floor, Maybrook House Manchester, M3 2EG United Kingdom Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: ATT00001.gif URL: From oester at gmail.com Tue Jun 23 15:16:10 2015 From: oester at gmail.com (Bob Oesterlin) Date: Tue, 23 Jun 2015 09:16:10 -0500 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu> <9DA9EC7A281AC7428A9618AFDC49049955A59AF2@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: Try here: http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%2Bdefined%2Bstorage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=4.1.1&platform=Linux+64-bit,x86_64&function=all Bob Oesterlin On Tue, Jun 23, 2015 at 9:11 AM, Wahl, Edward wrote: > FYI this page causes problems with various versions of Chrome and > Firefox (too lazy to test other browsers, sorry) Seems to be a javascript > issue. Huge surprise, right? > > I've filed bugs on the browser sides for FF, don't care about chrome > sorry. > > Ed Wahl > OSC > > > ------------------------------ > *From:* gpfsug-discuss-bounces at gpfsug.org [ > gpfsug-discuss-bounces at gpfsug.org] on behalf of Wahl, Edward [ > ewahl at osc.edu] > *Sent:* Monday, June 15, 2015 4:35 PM > *To:* gpfsug main discussion list > *Subject:* Re: [gpfsug-discuss] 4.1.1 fix central location > > When I navigate using these instructions I can find the fixes, but > attempting to get to them at the last step results in a loop back to the > SDN screen. :( > > Not sure if this is the page, lack of the "proper" product in my supported > products (still lists 3.5 as our product) > or what. 
> > Ed > > ------------------------------ > *From:* gpfsug-discuss-bounces at gpfsug.org [ > gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ > ross.keeping at uk.ibm.com] > *Sent:* Monday, June 15, 2015 12:43 PM > *To:* gpfsug-discuss at gpfsug.org > *Subject:* [gpfsug-discuss] 4.1.1 fix central location > > Hi > > IBM successfully released 4.1.1 on Friday with the Spectrum Scale > re-branding and introduction of protocols etc. > > However, I initially had trouble finding the PTF - rest assured it does > exist. > > You can find the 4.1.1 main download here: > http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048 > > There is a new area in Fix Central that you will need to navigate to get > the PTF for upgrade: > https://scwebtestt.rchland.ibm.com/support/fixcentral/ > 1) Set Product Group --> Systems Storage > 2) Set Systems Storage --> Storage software > 3) Set Storage Software --> Software defined storage > 4) Installed Version defaults to 4.1.1 > 5) Select your platform > > If you try and find the PTF via other links or sections of Fix Central you > will likely be disappointed. Work is ongoing to ensure this becomes more > intuitive - any thoughts for improvements always welcome. > > Best regards, > > *Ross Keeping* > IBM Spectrum Scale - Development Manager, People Manager > IBM Systems UK - Manchester Development Lab *Phone:* (+44 161) 8362381 > *-Line:* 37642381 > * E-mail: *ross.keeping at uk.ibm.com > [image: IBM] > 3rd Floor, Maybrook House > Manchester, M3 2EG > United Kingdom > > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 360 bytes Desc: not available URL: From orlando.richards at ed.ac.uk Wed Jun 24 12:27:25 2015 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Wed, 24 Jun 2015 12:27:25 +0100 Subject: [gpfsug-discuss] 4.1.1 fix central location In-Reply-To: References: Message-ID: <558A941D.2010206@ed.ac.uk> Hi all, I'm looking to deploy to RedHat 7.1, but from the GPFS FAQ only versions 4.1.1 and 3.5.0-26 are supported. I can't see a release of 3.5.0-26 on the fix central website - does anyone know if this is available? Will 3.5.0-25 work okay on RH7.1? How about 4.1.0-x - any plans to support that on RH7.1? ------- Orlando. On 16/06/15 09:11, Simon Thompson (Research Computing - IT Services) wrote: > The docs also now seem to be in Spectrum Scale section at: > > http://www-01.ibm.com/support/knowledgecenter/#!/STXKQY/411/ibmspectrumscale411_welcome.html > > Simon > > From: Ross Keeping3 > > Reply-To: gpfsug main discussion list > > Date: Monday, 15 June 2015 17:43 > To: "gpfsug-discuss at gpfsug.org " > > > Subject: [gpfsug-discuss] 4.1.1 fix central location > > Hi > > IBM successfully released 4.1.1 on Friday with the Spectrum Scale > re-branding and introduction of protocols etc. > > However, I initially had trouble finding the PTF - rest assured it does > exist. 
> > You can find the 4.1.1 main download here:
> http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048
>
> There is a new area in Fix Central that you will need to navigate to get
> the PTF for upgrade:
> https://scwebtestt.rchland.ibm.com/support/fixcentral/
> 1) Set Product Group --> Systems Storage
> 2) Set Systems Storage --> Storage software
> 3) Set Storage Software --> Software defined storage
> 4) Installed Version defaults to 4.1.1
> 5) Select your platform
>
> If you try and find the PTF via other links or sections of Fix Central
> you will likely be disappointed. Work is ongoing to ensure this becomes
> more intuitive - any thoughts for improvements always welcome.
>
> Best regards,
>
> *Ross Keeping*
> IBM Spectrum Scale - Development Manager, People Manager
> IBM Systems UK - Manchester Development Lab
> *Phone:*(+44 161) 8362381*-Line:*37642381*
> E-mail: *ross.keeping at uk.ibm.com
> IBM
>
> 3rd Floor, Maybrook House
>
> Manchester, M3 2EG
>
> United Kingdom
>
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
-- -- Dr Orlando Richards Research Services Manager Information Services IT Infrastructure Division Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
From chris.hunter at yale.edu Wed Jun 24 18:26:11 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Wed, 24 Jun 2015 13:26:11 -0400 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs Message-ID: <558AE833.6070803@yale.edu> Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures? We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs. regards, chris hunter yale hpc group
From ewahl at osc.edu Wed Jun 24 18:47:19 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Wed, 24 Jun 2015 17:47:19 +0000 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: <558AE833.6070803@yale.edu> References: <558AE833.6070803@yale.edu> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5A174@CIO-KRC-D1MBX02.osuad.osu.edu> Both are available to you directly, in Linux anyway. My AIX knowledge is decades old. And yes, the HBAs have much more availability/data of course. What kind of monitoring are you looking to do? Fault? Take the data and ?? nagios/cacti/ganglia/etc? Mine it with Splunk? Expand the GPFS Monitor suite? sourceforge.net/projects/gpfsmonitorsuite (though with sourceforge lately, perhaps we should ask Pam et al. to move them?) Ed Wahl OSC
________________________________________
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Chris Hunter [chris.hunter at yale.edu] Sent: Wednesday, June 24, 2015 1:26 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures? We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs.
regards, chris hunter yale hpc group
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From bsallen at alcf.anl.gov Wed Jun 24 18:48:47 2015 From: bsallen at alcf.anl.gov (Allen, Benjamin S.) Date: Wed, 24 Jun 2015 17:48:47 +0000 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: <558AE833.6070803@yale.edu> References: <558AE833.6070803@yale.edu> Message-ID: <64F35432-4DFC-452C-8965-455BCF7E2F09@alcf.anl.gov> Check out https://github.com/leibler/check_mk-sas2ircu. This is obviously check_mk specific, but a reasonable example. Ben
> On Jun 24, 2015, at 12:26 PM, Chris Hunter wrote:
>
> Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures?
>
> We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs.
>
> regards,
> chris hunter
> yale hpc group
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From chris.hunter at yale.edu Mon Jun 29 17:07:12 2015 From: chris.hunter at yale.edu (Chris Hunter) Date: Mon, 29 Jun 2015 12:07:12 -0400 Subject: [gpfsug-discuss] monitoring GSS enclosure and HBAs In-Reply-To: References: Message-ID: <55916D30.3050709@yale.edu> Thanks for the info. We settled on a simpler perl wrapper around sas2ircu from Nagios Exchange. chris hunter yale hpc group
> Check out https://github.com/leibler/check_mk-sas2ircu This is obviously check_mk specific, but a reasonable example. Ben
>> On Jun 24, 2015, at 12:26 PM, Chris Hunter wrote:
>>
>> Can anyone offer suggestions for monitoring GSS hardware? Particularly the storage HBAs and JBOD enclosures?
>>
>> We have monitoring tools via gpfs but we are seeking an alternative approach to monitor hardware independent of gpfs.
>>
>> regards,
>> chris hunter
>> yale hpc group
From st.graf at fz-juelich.de Tue Jun 30 07:54:18 2015 From: st.graf at fz-juelich.de (Graf, Stephan) Date: Tue, 30 Jun 2015 06:54:18 +0000 Subject: [gpfsug-discuss] ESS/GSS GUI (Monitoring) Message-ID: <38A0607912A90F4880BDE29022E093054087CF1A@MBX2010-E01.ad.fz-juelich.de> Hi! If anyone is interested in a simple GUI for GSS/ESS, we have one that we developed ourselves (from the time when there was no GUI available). It is Java based and the only requirement is to have passwordless access to the GSS nodes. (We start the GUI on our xCAT server). I have uploaded some screenshots: https://www.dropbox.com/sh/44kln4h7wgp18uu/AADsllhSxOdIeWtkNSaftu8Sa?dl=0 Stephan
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH 52425 Juelich Registered office: Juelich Registered in the commercial register of the Amtsgericht Dueren, No. HR B 3498 Chairman of the Supervisory Board: MinDir Dr. Karl Eugen Huthmacher Executive Board: Prof. Dr.-Ing. Wolfgang Marquardt (Chairman), Karsten Beneke (Deputy Chairman), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-------------- next part -------------- An HTML attachment was scrubbed... URL:
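For anyone hunting for something similar to the sas2ircu wrappers discussed in the monitoring thread above, a minimal sketch of such a check is shown below in Python. Only the sas2ircu utility name comes from the thread; the rest is assumption - that "sas2ircu LIST" prints one row per controller starting with its index, that "sas2ircu <index> DISPLAY" reports health on "State" / "Status of volume" lines, and that "Optimal", "Okay" and "Ready" are the healthy keywords on your firmware - so treat the parsing as illustrative, not authoritative.

#!/usr/bin/env python
# Nagios-style check: shell out to LSI's sas2ircu and flag any volume or
# physical device whose reported state does not look healthy.
import re
import subprocess
import sys

HEALTHY = ("Optimal", "Okay", "Ready")   # assumed healthy state keywords

def run(cmd):
    # Run a command and return stdout as text, or None on failure.
    try:
        return subprocess.check_output(cmd, stderr=subprocess.STDOUT).decode("utf-8", "replace")
    except (OSError, subprocess.CalledProcessError):
        return None

def controller_indexes():
    # Assumes "sas2ircu LIST" prints one row per controller, index first.
    out = run(["sas2ircu", "LIST"])
    if out is None:
        return []
    return [m.group(1) for m in re.finditer(r"^\s*(\d+)\s+\S+", out, re.MULTILINE)]

def problems_for(idx):
    # Assumes "sas2ircu <idx> DISPLAY" reports "State :" / "Status of volume :" lines.
    out = run(["sas2ircu", idx, "DISPLAY"])
    if out is None:
        return ["controller %s: DISPLAY failed" % idx]
    bad = []
    for line in out.splitlines():
        if re.match(r"\s*(State|Status of volume)\s*:", line):
            if not any(good in line for good in HEALTHY):
                bad.append("controller %s: %s" % (idx, line.strip()))
    return bad

def main():
    ctrls = controller_indexes()
    if not ctrls:
        print("UNKNOWN - no SAS controllers found (is sas2ircu installed?)")
        return 3
    bad = []
    for idx in ctrls:
        bad.extend(problems_for(idx))
    if bad:
        print("CRITICAL - %d problem(s): %s" % (len(bad), "; ".join(bad)))
        return 2
    print("OK - %d controller(s), all volumes/devices healthy" % len(ctrls))
    return 0

if __name__ == "__main__":
    sys.exit(main())

The check_mk plugin Ben linked takes the same basic approach for the HBAs; enclosure environmentals (fans, PSUs, temperatures) would need a separate probe, for example something like sg_ses against the JBOD's SES processor, which is firmware specific.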
From Daniel.Vogel at abcsystems.ch Tue Jun 30 08:49:10 2015 From: Daniel.Vogel at abcsystems.ch (Daniel Vogel) Date: Tue, 30 Jun 2015 07:49:10 +0000 Subject: [gpfsug-discuss] GPFS 4.1.1 without QoS for mmrestripefs? Message-ID: <2CDF270206A255459AC4FA6B08E52AF90114634DD0@ABCSYSEXC1.abcsystems.ch> Hi Years ago, IBM made some plans to implement "QoS for mmrestripefs, mmdeldisk...". If a "mmrestripefs" is running, NFS access performance is very poor. I opened a PMR to ask for QoS in version 4.1.1 (Spectrum Scale). PMR 61309,113,848: I discussed the question of QOS with the development team. These command changes that were noticed are not meant to be used as GA code which is why they are not documented. I cannot provide any further information from the support perspective. Does anybody know more about QoS? The last hope was at the "GPFS Workshop Stuttgart März 2015" with Sven Oehme as speaker. Daniel Vogel IT Consultant ABC SYSTEMS AG Head Office Zürich Rütistrasse 28 CH - 8952 Schlieren T +41 43 433 6 433 D +41 43 433 6 467 http://www.abcsystems.ch ABC - Always Better Concepts. Approved By Customers since 1981.
-------------- next part -------------- An HTML attachment was scrubbed... URL:
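Until QoS for maintenance commands arrives, a common way to soften the impact Daniel describes is to restrict which nodes take part in the restripe with the -N option of mmrestripefs, so that only a couple of NSD servers generate rebalance I/O while the rest of the cluster keeps serving NFS. A minimal sketch follows; the filesystem name and node list are hypothetical placeholders, not values from this list.

#!/usr/bin/env python
# Sketch: run a rebalancing restripe from a restricted set of NSD servers.
# "-b" rebalances the filesystem; "-N" limits the work to the listed nodes.
# FILESYSTEM and RESTRIPE_NODES are hypothetical - adjust for your cluster.
import subprocess
import sys

FILESYSTEM = "gpfs01"
RESTRIPE_NODES = "nsd01,nsd02"

def main():
    cmd = ["mmrestripefs", FILESYSTEM, "-b", "-N", RESTRIPE_NODES]
    sys.stdout.write("running: %s\n" % " ".join(cmd))
    rc = subprocess.call(cmd)   # stream mmrestripefs progress to the terminal/cron mail
    if rc != 0:
        sys.stderr.write("mmrestripefs exited with rc=%d\n" % rc)
    return rc

if __name__ == "__main__":
    sys.exit(main())

Restricting -N reduces, but does not eliminate, the impact on client I/O; scheduling the run outside peak NFS hours helps further.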