From chair at gpfsug.org Mon Jun 8 10:06:56 2015
From: chair at gpfsug.org (Jez Tucker)
Date: Mon, 08 Jun 2015 10:06:56 +0100
Subject: [gpfsug-discuss] Election news
Message-ID: <55755B30.9070601@gpfsug.org>
Hello all
Simon Thompson (Research Computing, University of Birmingham) has put
himself forward as the sole candidate for the position of Chair.
I firmly believe it is in the best interest of the Group that we do not
have the same Chair indefinitely. The Group is on a fine footing, so now
is an appropriate time for a change.
Having spoken with Simon, the UG Committee are more than happy to
recommend him for the position of Chair for the next two years.
Over the same period, the UG Committee has proposed that I move to
represent Media (as per sector representatives) and continue to support
the efforts of the Group where appropriate.
The UG Committee would also like to recommend Ross Keeping for the IBM
non-exec position. Some of you will have met Ross at the recent User
Group. He understands the focus and needs of the Group and will act as
the group's plug-in to IBM as well as hosting the 'Meet the Devs' events
(details on the next one soon).
With respect to the above, we do not believe it is prudent to spend time
and resource on election scaffolding to vote for a single candidate.
We would suggest that if the majority of members are strongly against
this move, it should be discussed openly on the mailing list. Discussion
is good!
Failing any overwhelming response to the contrary, Simon will assume the
position of Chair on 19th June 2015 with the Committee's full support.
Best regards,
Jez (Chair) and Claire (Secretary)
--------
Simon's response to the Election call follows verbatim:
a) The post they wish to stand for
Group chair
b) A paragraph covering their credentials
I have been working with GPFS for the past few years, initially in an
HPC environment and more recently using it to deliver our research data
and OpenStack platforms. The research storage platform, which spans both
spinning disk and a TSM HSM layer, was developed in conjunction with OCF,
our IBM business partner.
I have spoken at both the UK GPFS user group and the GPFS user forum in
the USA. In addition to this I've made a short customer video used by
IBM marketing.
LinkedIn profile: uk.linkedin.com/in/simonjthompson1
Blog: www.roamingzebra.co.uk
c) A paragraph covering what they would bring to the group
I already have a good working relationship with the GPFS developers,
having spent the past few months building our OpenStack platform with IBM
and documenting how to use some of the features, and I would look to
build on this relationship to develop the GPFS user group.
I've also blogged about many of the bits I have experimented with and
would like to see this develop, with the group contributing to a
wiki-style information source with specific examples of technology and
configs.
In addition to this, I have support from my employer to attend meetings
and conferences and would be happy to represent and promote
the group at these as well as bringing feedback.
d) A paragraph setting out their vision for the group for the next two years
I would like to see the group engaging with more diverse users of GPFS,
as many of those attending the meetings are from HPC-type environments,
so I would look to work with both IBM and resellers to help engage with
other industries using GPFS technology. Ideally this would see more
customer talks at the user group, in addition to a balanced view from IBM
on road maps. I think it would also be good to focus on specific features
of GPFS and how they work and can be applied, as my suspicion is that
very few customers use lots of features to full advantage.
Simon
-------- ends
From Luke.Raimbach at crick.ac.uk Mon Jun 15 09:35:18 2015
From: Luke.Raimbach at crick.ac.uk (Luke Raimbach)
Date: Mon, 15 Jun 2015 08:35:18 +0000
Subject: [gpfsug-discuss] OpenStack Manila Driver
Message-ID:
Dear All,
We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns...
Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment.
Has this scenario been addressed at all?
Cheers,
Luke.
Luke Raimbach
Senior HPC Data and Storage Systems Engineer
The Francis Crick Institute
Gibbs Building
215 Euston Road
London NW1 2BE
E: luke.raimbach at crick.ac.uk
W: www.crick.ac.uk
The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
From chair at gpfsug.org Mon Jun 15 14:24:57 2015
From: chair at gpfsug.org (Jez Tucker (Chair))
Date: Mon, 15 Jun 2015 14:24:57 +0100
Subject: [gpfsug-discuss] Member locations for Dev meeting organisation
Message-ID: <557ED229.50902@gpfsug.org>
Hello all
It would be very handy if all members could send me an email to:
chair at gpfsug.org with the City and Country in which you are located.
We're looking to place 'Meet the Devs' coffee-shops close to you, so
this would make planning several orders of magnitude easier.
I can infer from each member's email, but it's only 'mostly accurate'.
Stateside members - we're actively organising a first meet up near you
imminently, so please ping me your locations.
All the best,
Jez (Chair)
From ewahl at osc.edu Mon Jun 15 14:59:44 2015
From: ewahl at osc.edu (Wahl, Edward)
Date: Mon, 15 Jun 2015 13:59:44 +0000
Subject: [gpfsug-discuss] OpenStack Manila Driver
In-Reply-To:
References:
Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu>
Perhaps I misunderstand here, but if the tenants have administrative (i.e. root) privileges to the underlying file system management commands, I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur.
I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct?
Ed Wahl
OSC
________________________________
++
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk]
Sent: Monday, June 15, 2015 4:35 AM
To: gpfsug-discuss at gpfsug.org
Subject: [gpfsug-discuss] OpenStack Manila Driver
Dear All,
We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns...
Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment.
Has this scenario been addressed at all?
Cheers,
Luke.
Luke Raimbach
Senior HPC Data and Storage Systems Engineer
The Francis Crick Institute
Gibbs Building
215 Euston Road
London NW1 2BE
E: luke.raimbach at crick.ac.uk
W: www.crick.ac.uk
The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
From S.J.Thompson at bham.ac.uk Mon Jun 15 15:10:25 2015
From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services))
Date: Mon, 15 Jun 2015 14:10:25 +0000
Subject: [gpfsug-discuss] OpenStack Manila Driver
Message-ID:
Manila is one of the projects to provide 'shared' access to file systems.
I thought that at the moment, Manila doesn't support the GPFS protocol but is implemented on top of Ganesha, so it is provided as NFS access. So you wouldn't get mmunlinkfileset.
This sorta brings me back to one of the things I talked about at the GPFS UG, namely that the GPFS security model is trusting, which in multi-tenant environments is a bad thing. I know I've spoken to a few people recently who've commented / agreed / had thoughts on it, so can I ask that, if multi-tenancy security is something you think is of concern with GPFS, you drop me an email (directly is fine) with your use case and what sort of thing you'd like to see? I'll then collate this and have a go at talking to IBM again about it.
Thanks
Simon
From: Wahl, Edward
Reply-To: gpfsug main discussion list
Date: Monday, 15 June 2015 14:59
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] OpenStack Manila Driver
Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur.
I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct?
Ed Wahl
OSC
________________________________
++
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk]
Sent: Monday, June 15, 2015 4:35 AM
To: gpfsug-discuss at gpfsug.org
Subject: [gpfsug-discuss] OpenStack Manila Driver
Dear All,
We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns...
Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment.
Has this scenario been addressed at all?
Cheers,
Luke.
Luke Raimbach
Senior HPC Data and Storage Systems Engineer
The Francis Crick Institute
Gibbs Building
215 Euston Road
London NW1 2BE
E: luke.raimbach at crick.ac.uk
W: www.crick.ac.uk
The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
From jonathan at buzzard.me.uk Mon Jun 15 15:16:44 2015
From: jonathan at buzzard.me.uk (Jonathan Buzzard)
Date: Mon, 15 Jun 2015 15:16:44 +0100
Subject: [gpfsug-discuss] OpenStack Manila Driver
In-Reply-To:
References:
Message-ID: <1434377805.15671.126.camel@buzzard.phy.strath.ac.uk>
On Mon, 2015-06-15 at 08:35 +0000, Luke Raimbach wrote:
> Dear All,
>
> We are looking forward to using the manila driver for
> auto-provisioning of file shares using GPFS. However, I have some
> concerns...
>
>
> Manila presumably gives tenant users access to file system commands
> like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset
> quiesces the file system, there is potentially an impact from one
> tenant on another - i.e. someone unlinking and deleting a lot of
> filesets during a tenancy cleanup might cause a cluster pause long
> enough to trigger other failure events or even start evicting nodes.
> You can see why this would be bad in a cloud environment.
Er, as far as I can see in the documentation, no you don't. My personal
experience is that mmunlinkfileset has a habit of locking the file system
up; i.e. don't do it while the file system is busy. On the other hand,
mmlinkfileset you can do with gay abandon. This might have changed in
more recent versions of GPFS.
On the other hand you do get access to creating/deleting snapshots, and
the deleting side has in the past caused file system lockups for me.
Similarly, creating a snapshot is no problem.
The difference between the two is that operations which require
quiescence to take something away from the file system can cause bad
things to happen. Quiescing to add things to the file system rarely, if
ever, causes problems.
JAB.
--
Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.
From chris.hunter at yale.edu Mon Jun 15 15:35:06 2015
From: chris.hunter at yale.edu (Chris Hunter)
Date: Mon, 15 Jun 2015 10:35:06 -0400
Subject: [gpfsug-discuss] OpenStack Manila Driver
Message-ID: <557EE29A.4070909@yale.edu>
Although likely not the access model you are seeking, GPFS is mentioned
for the swift-on-file project:
* https://github.com/stackforge/swiftonfile
OpenStack Swift uses an HTTP/REST protocol for file access (a la S3),
which is not the best choice for data-intensive applications.
regards,
chris hunter
yale hpc group
---
Date: Mon, 15 Jun 2015 08:35:18 +0000
From: Luke Raimbach
To: "gpfsug-discuss at gpfsug.org"
Subject: [gpfsug-discuss] OpenStack Manila Driver
Dear All,
We are looking forward to using the manila driver for auto-provisioning
of file shares using GPFS. However, I have some concerns...
Manila presumably gives tenant users access to file system commands
like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset
quiesces the file system, there is potentially an impact from one tenant
on another - i.e. someone unlinking and deleting a lot of filesets
during a tenancy cleanup might cause a cluster pause long enough to
trigger other failure events or even start evicting nodes. You can see
why this would be bad in a cloud environment.
Has this scenario been addressed at all?
Cheers,
Luke.
Luke Raimbach
Senior HPC Data and Storage Systems Engineer
The Francis Crick Institute
Gibbs Building
215 Euston Road
London NW1 2BE
From ross.keeping at uk.ibm.com Mon Jun 15 17:43:07 2015
From: ross.keeping at uk.ibm.com (Ross Keeping3)
Date: Mon, 15 Jun 2015 17:43:07 +0100
Subject: [gpfsug-discuss] 4.1.1 fix central location
Message-ID:
Hi
IBM successfully released 4.1.1 on Friday with the Spectrum Scale
re-branding and introduction of protocols etc.
However, I initially had trouble finding the PTF - rest assured it does
exist.
You can find the 4.1.1 main download here:
http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048
There is a new area in Fix Central that you will need to navigate to get
the PTF for upgrade:
https://scwebtestt.rchland.ibm.com/support/fixcentral/
1) Set Product Group --> Systems Storage
2) Set Systems Storage --> Storage software
3) Set Storage Software --> Software defined storage
4) Installed Version defaults to 4.1.1
5) Select your platform
If you try and find the PTF via other links or sections of Fix Central you
will likely be disappointed. Work is ongoing to ensure this becomes more
intuitive - any thoughts for improvements always welcome.
Best regards,
Ross Keeping
IBM Spectrum Scale - Development Manager, People Manager
IBM Systems UK - Manchester Development Lab
Phone: (+44 161) 8362381-Line: 37642381
E-mail: ross.keeping at uk.ibm.com
3rd Floor, Maybrook House
Manchester, M3 2EG
United Kingdom
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
From ewahl at osc.edu Mon Jun 15 21:35:04 2015
From: ewahl at osc.edu (Wahl, Edward)
Date: Mon, 15 Jun 2015 20:35:04 +0000
Subject: [gpfsug-discuss] 4.1.1 fix central location
In-Reply-To:
References:
Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu>
When I navigate using these instructions I can find the fixes, but attempting to get to them at the last step results in a loop back to the SDN screen. :(
Not sure if this is the page, a lack of the "proper" product in my
supported products (it still lists 3.5 as our product), or what.
Ed
________________________________
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [ross.keeping at uk.ibm.com]
Sent: Monday, June 15, 2015 12:43 PM
To: gpfsug-discuss at gpfsug.org
Subject: [gpfsug-discuss] 4.1.1 fix central location
Hi
IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc.
However, I initially had trouble finding the PTF - rest assured it does exist.
You can find the 4.1.1 main download here:
http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048
There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade:
https://scwebtestt.rchland.ibm.com/support/fixcentral/
1) Set Product Group --> Systems Storage
2) Set Systems Storage --> Storage software
3) Set Storage Software --> Software defined storage
4) Installed Version defaults to 4.1.1
5) Select your platform
If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome.
Best regards,
Ross Keeping
IBM Spectrum Scale - Development Manager, People Manager
IBM Systems UK - Manchester Development Lab
Phone: (+44 161) 8362381-Line: 37642381
E-mail: ross.keeping at uk.ibm.com
[IBM]
3rd Floor, Maybrook House
Manchester, M3 2EG
United Kingdom
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
From oester at gmail.com Mon Jun 15 21:38:39 2015
From: oester at gmail.com (Bob Oesterlin)
Date: Mon, 15 Jun 2015 15:38:39 -0500
Subject: [gpfsug-discuss] 4.1.1 fix central location
In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu>
References:
<9DA9EC7A281AC7428A9618AFDC49049955A5753D@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
It took me a while to find it too - the key is to search on "Spectrum Scale".
Try this URL:
http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%2Bdefined%2Bstorage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=4.1.1&platform=Linux+64-bit,x86_64&function=all
If you don't want X86, just select the appropriate platform.
Bob Oesterlin
Nuance Communications
On Mon, Jun 15, 2015 at 3:35 PM, Wahl, Edward wrote:
> When I navigate using these instructions I can find the fixes, but
> attempting to get to them at the last step results in a loop back to the
> SDN screen. :(
>
> Not sure if this is the page, lack of the "proper" product in my supported
> products (still lists 3.5 as our product)
> or what.
>
> Ed
>
> ------------------------------
> *From:* gpfsug-discuss-bounces at gpfsug.org [
> gpfsug-discuss-bounces at gpfsug.org] on behalf of Ross Keeping3 [
> ross.keeping at uk.ibm.com]
> *Sent:* Monday, June 15, 2015 12:43 PM
> *To:* gpfsug-discuss at gpfsug.org
> *Subject:* [gpfsug-discuss] 4.1.1 fix central location
>
> Hi
>
> IBM successfully released 4.1.1 on Friday with the Spectrum Scale
> re-branding and introduction of protocols etc.
>
> However, I initially had trouble finding the PTF - rest assured it does
> exist.
>
> You can find the 4.1.1 main download here:
> http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048
>
> There is a new area in Fix Central that you will need to navigate to get
> the PTF for upgrade:
> https://scwebtestt.rchland.ibm.com/support/fixcentral/
> 1) Set Product Group --> Systems Storage
> 2) Set Systems Storage --> Storage software
> 3) Set Storage Software --> Software defined storage
> 4) Installed Version defaults to 4.1.1
> 5) Select your platform
>
> If you try and find the PTF via other links or sections of Fix Central you
> will likely be disappointed. Work is ongoing to ensure this becomes more
> intuitive - any thoughts for improvements always welcome.
>
> Best regards,
>
> *Ross Keeping*
> IBM Spectrum Scale - Development Manager, People Manager
> IBM Systems UK - Manchester Development Lab *Phone:* (+44 161) 8362381
> *-Line:* 37642381
> * E-mail: *ross.keeping at uk.ibm.com
> [image: IBM]
> 3rd Floor, Maybrook House
> Manchester, M3 2EG
> United Kingdom
>
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
From Luke.Raimbach at crick.ac.uk Tue Jun 16 08:36:56 2015
From: Luke.Raimbach at crick.ac.uk (Luke Raimbach)
Date: Tue, 16 Jun 2015 07:36:56 +0000
Subject: [gpfsug-discuss] OpenStack Manila Driver
In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu>
References:
<9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances, which would be accessed over NFS. Perhaps I've not done enough research into this though - I'm also not an OpenStack expert.
The tenants don't have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS.
The unlinking of the fileset worries me for the reasons stated previously.
From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward
Sent: 15 June 2015 15:00
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] OpenStack Manila Driver
Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur.
I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct?
Ed Wahl
OSC
________________________________
++
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk]
Sent: Monday, June 15, 2015 4:35 AM
To: gpfsug-discuss at gpfsug.org
Subject: [gpfsug-discuss] OpenStack Manila Driver
Dear All,
We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns...
Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment.
Has this scenario been addressed at all?
Cheers,
Luke.
Luke Raimbach
Senior HPC Data and Storage Systems Engineer
The Francis Crick Institute
Gibbs Building
215 Euston Road
London NW1 2BE
E: luke.raimbach at crick.ac.uk
W: www.crick.ac.uk
The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
From S.J.Thompson at bham.ac.uk Tue Jun 16 09:11:23 2015
From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services))
Date: Tue, 16 Jun 2015 08:11:23 +0000
Subject: [gpfsug-discuss] 4.1.1 fix central location
In-Reply-To:
References:
Message-ID:
The docs also now seem to be in the Spectrum Scale section at:
http://www-01.ibm.com/support/knowledgecenter/#!/STXKQY/411/ibmspectrumscale411_welcome.html
Simon
From: Ross Keeping3
Reply-To: gpfsug main discussion list
Date: Monday, 15 June 2015 17:43
To: "gpfsug-discuss at gpfsug.org"
Subject: [gpfsug-discuss] 4.1.1 fix central location
Hi
IBM successfully released 4.1.1 on Friday with the Spectrum Scale re-branding and introduction of protocols etc.
However, I initially had trouble finding the PTF - rest assured it does exist.
You can find the 4.1.1 main download here:
http://www-01.ibm.com/support/docview.wss?uid=isg3T4000048
There is a new area in Fix Central that you will need to navigate to get the PTF for upgrade:
https://scwebtestt.rchland.ibm.com/support/fixcentral/
1) Set Product Group --> Systems Storage
2) Set Systems Storage --> Storage software
3) Set Storage Software --> Software defined storage
4) Installed Version defaults to 4.1.1
5) Select your platform
If you try and find the PTF via other links or sections of Fix Central you will likely be disappointed. Work is ongoing to ensure this becomes more intuitive - any thoughts for improvements always welcome.
Best regards,
Ross Keeping
IBM Spectrum Scale - Development Manager, People Manager
IBM Systems UK - Manchester Development Lab
Phone:(+44 161) 8362381-Line:37642381
E-mail: ross.keeping at uk.ibm.com
[IBM]
3rd Floor, Maybrook House
Manchester, M3 2EG
United Kingdom
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
From jonathan at buzzard.me.uk Tue Jun 16 09:40:30 2015
From: jonathan at buzzard.me.uk (Jonathan Buzzard)
Date: Tue, 16 Jun 2015 09:40:30 +0100
Subject: [gpfsug-discuss] OpenStack Manila Driver
In-Reply-To:
References:
<9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID: <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk>
On Tue, 2015-06-16 at 07:36 +0000, Luke Raimbach wrote:
[SNIP]
> The tenants don?t have root access to the file system, but the Manila
> component must act as a wrapper to file system administrative
> equivalents like mmcrfileset, mmdelfileset, link and unlink. The
> shares are created as GPFS filesets which are then presented over NFS.
>
What makes you think that it creates filesets as opposed to just sharing
out a normal directory? I had a quick peruse of the documentation and
source code and saw no mention of filesets, though I could have missed
it.
JAB.
--
Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.
From adam.huffman at crick.ac.uk Tue Jun 16 09:41:56 2015
From: adam.huffman at crick.ac.uk (Adam Huffman)
Date: Tue, 16 Jun 2015 08:41:56 +0000
Subject: [gpfsug-discuss] OpenStack Manila Driver
In-Reply-To:
References:
<9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
The presentation of shared storage by Manila isn't necessarily via NFS. Some of the drivers, I believe the GPFS one amongst them, allow some form of native connection either via the guest or via a VirtFS connection to the client on the hypervisor.
Best Wishes,
Adam
> On 16 Jun 2015, at 08:36, Luke Raimbach wrote:
>
> So as I understand things, Manila is an OpenStack component which allows tenants to create and destroy shares for their instances which would be accessed over NFS. Perhaps I've not done enough research into this though - I'm also not an OpenStack expert.
>
> The tenants don't have root access to the file system, but the Manila component must act as a wrapper to file system administrative equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS filesets which are then presented over NFS.
>
> The unlinking of the fileset worries me for the reasons stated previously.
>
> From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward
> Sent: 15 June 2015 15:00
> To: gpfsug main discussion list
> Subject: Re: [gpfsug-discuss] OpenStack Manila Driver
>
> Perhaps I misunderstand here, but if the tenants have administrative (ie:root) privileges to the underlying file system management commands I think mmunlinkfileset might be a minor concern here. There are FAR more destructive things that could occur.
>
> I am not an OpenStack expert and I've not even looked at anything past Kilo, but my understanding was that these commands were not necessary for tenants. They access a virtual block device that backs to GPFS, correct?
>
> Ed Wahl
> OSC
>
>
>
> ++
> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk]
> Sent: Monday, June 15, 2015 4:35 AM
> To: gpfsug-discuss at gpfsug.org
> Subject: [gpfsug-discuss] OpenStack Manila Driver
>
> Dear All,
>
> We are looking forward to using the manila driver for auto-provisioning of file shares using GPFS. However, I have some concerns...
>
> Manila presumably gives tenant users access to file system commands like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the file system, there is potentially an impact from one tenant on another - i.e. someone unlinking and deleting a lot of filesets during a tenancy cleanup might cause a cluster pause long enough to trigger other failure events or even start evicting nodes. You can see why this would be bad in a cloud environment.
>
>
> Has this scenario been addressed at all?
>
> Cheers,
> Luke.
>
>
> Luke Raimbach
> Senior HPC Data and Storage Systems Engineer
> The Francis Crick Institute
> Gibbs Building
> 215 Euston Road
> London NW1 2BE
>
> E: luke.raimbach at crick.ac.uk
> W: www.crick.ac.uk
>
> The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
> The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
From S.J.Thompson at bham.ac.uk Tue Jun 16 09:46:52 2015
From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services))
Date: Tue, 16 Jun 2015 08:46:52 +0000
Subject: [gpfsug-discuss] OpenStack Manila Driver
In-Reply-To:
References:
<9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
I didn't think that the *current* Manila driver used the GPFS protocol,
but sat on top of the Ganesha server.
Simon
On 16/06/2015 09:41, "Adam Huffman" wrote:
>
>The presentation of shared storage by Manila isn't necessarily via NFS.
>Some of the drivers, I believe the GPFS one amongst them, allow some form
>of native connection either via the guest or via a VirtFS connection to
>the client on the hypervisor.
>
>Best Wishes,
>Adam
>
>
>?
>
>
>
>
>
>> On 16 Jun 2015, at 08:36, Luke Raimbach
>>wrote:
>>
>> So as I understand things, Manila is an OpenStack component which
>>allows tenants to create and destroy shares for their instances which
>>would be accessed over NFS. Perhaps I've not done enough research in to
>>this though - I'm also not an OpenStack expert.
>>
>> The tenants don't have root access to the file system, but the Manila
>>component must act as a wrapper to file system administrative
>>equivalents like mmcrfileset, mmdelfileset, link and unlink. The shares
>>are created as GPFS filesets which are then presented over NFS.
>>
>> The unlinking of the fileset worries me for the reasons stated
>>previously.
>>
>> From: gpfsug-discuss-bounces at gpfsug.org
>>[mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward
>> Sent: 15 June 2015 15:00
>> To: gpfsug main discussion list
>> Subject: Re: [gpfsug-discuss] OpenStack Manila Driver
>>
>> Perhaps I misunderstand here, but if the tenants have administrative
>>(ie:root) privileges to the underlying file system management commands I
>>think mmunlinkfileset might be a minor concern here. There are FAR more
>>destructive things that could occur.
>>
>> I am not an OpenStack expert and I've not even looked at anything past
>>Kilo, but my understanding was that these commands were not necessary
>>for tenants. They access a virtual block device that backs to GPFS,
>>correct?
>>
>> Ed Wahl
>> OSC
>>
>>
>>
>> ++
>> From: gpfsug-discuss-bounces at gpfsug.org
>>[gpfsug-discuss-bounces at gpfsug.org] on behalf of Luke Raimbach
>>[Luke.Raimbach at crick.ac.uk]
>> Sent: Monday, June 15, 2015 4:35 AM
>> To: gpfsug-discuss at gpfsug.org
>> Subject: [gpfsug-discuss] OpenStack Manila Driver
>>
>> Dear All,
>>
>> We are looking forward to using the manila driver for auto-provisioning
>>of file shares using GPFS. However, I have some concerns...
>>
>> Manila presumably gives tenant users access to file system commands
>>like mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset
>>quiesces the file system, there is potentially an impact from one tenant
>>on another - i.e. someone unlinking and deleting a lot of filesets
>>during a tenancy cleanup might cause a cluster pause long enough to
>>trigger other failure events or even start evicting nodes. You can see
>>why this would be bad in a cloud environment.
>>
>>
>> Has this scenario been addressed at all?
>>
>> Cheers,
>> Luke.
>>
>>
>> Luke Raimbach
>> Senior HPC Data and Storage Systems Engineer
>> The Francis Crick Institute
>> Gibbs Building
>> 215 Euston Road
>> London NW1 2BE
>>
>> E: luke.raimbach at crick.ac.uk
>> W: www.crick.ac.uk
>>
>> The Francis Crick Institute Limited is a registered charity in England
>>and Wales no. 1140062 and a company registered in England and Wales no.
>>06885462, with its registered office at 215 Euston Road, London NW1 2BE.
>> The Francis Crick Institute Limited is a registered charity in England
>>and Wales no. 1140062 and a company registered in England and Wales no.
>>06885462, with its registered office at 215 Euston Road, London NW1 2BE.
>> _______________________________________________
>> gpfsug-discuss mailing list
>> gpfsug-discuss at gpfsug.org
>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>The Francis Crick Institute Limited is a registered charity in England
>and Wales no. 1140062 and a company registered in England and Wales no.
>06885462, with its registered office at 215 Euston Road, London NW1 2BE.
>_______________________________________________
>gpfsug-discuss mailing list
>gpfsug-discuss at gpfsug.org
>http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From Luke.Raimbach at crick.ac.uk Tue Jun 16 09:48:50 2015
From: Luke.Raimbach at crick.ac.uk (Luke Raimbach)
Date: Tue, 16 Jun 2015 08:48:50 +0000
Subject: [gpfsug-discuss] OpenStack Manila Driver
In-Reply-To: <1434444030.15671.128.camel@buzzard.phy.strath.ac.uk>
References:
<9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu>
<1434444030.15671.128.camel@buzzard.phy.strath.ac.uk>
Message-ID:
[SNIP]
>> The tenants don't have root access to the file system, but the Manila
>> component must act as a wrapper to file system administrative
>> equivalents like mmcrfileset, mmdelfileset, link and unlink. The
>> shares are created as GPFS filesets which are then presented over NFS.
>>
> What makes you think the it creates filesets as opposed to just sharing out a normal directory? I had a quick peruse over the documentation and source code and saw no mention of filesets, though I could have missed it.
I think you are right. Looking over the various resources I have available, the creation, deletion, linking and unlinking of filesets is not implemented, but commented on as needing to be done.
The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
From jonathan at buzzard.me.uk Tue Jun 16 10:25:45 2015
From: jonathan at buzzard.me.uk (Jonathan Buzzard)
Date: Tue, 16 Jun 2015 10:25:45 +0100
Subject: [gpfsug-discuss] OpenStack Manila Driver
In-Reply-To:
References:
<9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu>
<1434444030.15671.128.camel@buzzard.phy.strath.ac.uk>
Message-ID: <1434446745.15671.134.camel@buzzard.phy.strath.ac.uk>
On Tue, 2015-06-16 at 08:48 +0000, Luke Raimbach wrote:
[SNIP]
> I think you are right. Looking over the various resources I have
> available, the creation, deletion, linking and unlinking of filesets is
> not implemented, but commented on as needing to be done.
That's going to be a right barrel of laughs as reliability goes out the
window if they do implement it.
JAB.
--
Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.
From billowen at us.ibm.com Thu Jun 18 05:22:25 2015
From: billowen at us.ibm.com (Bill Owen)
Date: Wed, 17 Jun 2015 22:22:25 -0600
Subject: [gpfsug-discuss] OpenStack Manila Driver
In-Reply-To:
References:
<9DA9EC7A281AC7428A9618AFDC49049955A572AA@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
Hi Luke,
Your explanation below is correct, with some minor clarifications
Manila is an OpenStack project which allows storage admins to create and
destroy filesystem shares and make those available to VM instances and
bare metal servers, accessed over NFS. The Manila driver runs in the
control plane and creates a new GPFS independent fileset for each new
share. It provides automation for giving VMs (and also bare metal
servers) access to the shares so that they can mount and use the share.
There is work being done to allow automating the mount process when the
VM instance boots.
The tenants don't have root access to the file system, but the Manila
component acts as a wrapper to file system administrative equivalents like
mmcrfileset, mmdelfileset, link and unlink. The shares are created as GPFS
filesets which are then presented over NFS.
The Manila driver uses the following GPFS commands (a rough CLI sketch follows the list):
When a share is created:
mmcrfileset
mmlinkfileset
mmsetquota
When a share is deleted:
mmunlinkfileset
mmdelfileset
Snapshots of shares can be created and deleted:
mmcrsnapshot
mmdelsnapshot
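To make the above concrete, here is a rough sketch of the equivalent CLI
sequence (the file system name, fileset name, junction path and quota are
illustrative placeholders, not literally what the driver runs; snapshot
syntax varies a little by release):

# Create a share backed by a new independent fileset, link it, set a quota
mmcrfileset gpfs0 share_projA --inode-space new
mmlinkfileset gpfs0 share_projA -J /gpfs0/shares/share_projA
mmsetquota gpfs0:share_projA --block 1T:1T

# Delete the share
mmunlinkfileset gpfs0 share_projA
mmdelfileset gpfs0 share_projA -f

# Snapshot the share's fileset, then remove the snapshot
mmcrsnapshot gpfs0 snap1 -j share_projA
mmdelsnapshot gpfs0 snap1 -j share_projA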
Today, the GPFS Manila driver supports creating NFS exports to VMs. We are
considering adding native GPFS client support in the VM, but we are not
sure whether the benefit justifies the extra complexity of having the GPFS
client in the VM image, and also the impact on the cluster as VMs come up
and down in a more dynamic way than physical nodes.
For multi-tenant deployments, we recommend using a different filesystem per
tenant to provide better separation of data, and to minimize the "noisy
neighbor" effect for operations like mmunlinkfileset.
Here is a presentation that shows an overview of the GPFS Manila driver:
(See attached file: OpenStack_Storage_Manila_with_GPFS.pdf)
Perhaps this, and other GPFS & OpenStack topics could be the subject of a
future user group session.
Regards,
Bill Owen
billowen at us.ibm.com
GPFS and OpenStack
520-799-4829
From: Luke Raimbach
To: gpfsug main discussion list
Date: 06/16/2015 12:37 AM
Subject: Re: [gpfsug-discuss] OpenStack Manila Driver
Sent by: gpfsug-discuss-bounces at gpfsug.org
So as I understand things, Manila is an OpenStack component which allows
tenants to create and destroy shares for their instances which would be
accessed over NFS. Perhaps I've not done enough research into this though
- I'm also not an OpenStack expert.
The tenants don't have root access to the file system, but the Manila
component must act as a wrapper to file system administrative equivalents
like mmcrfileset, mmdelfileset, link and unlink. The shares are created as
GPFS filesets which are then presented over NFS.
The unlinking of the fileset worries me for the reasons stated previously.
From: gpfsug-discuss-bounces at gpfsug.org [
mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Wahl, Edward
Sent: 15 June 2015 15:00
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] OpenStack Manila Driver
Perhaps I misunderstand here, but if the tenants have administrative
(ie:root) privileges to the underlying file system management commands I
think mmunlinkfileset might be a minor concern here. There are FAR more
destructive things that could occur.
I am not an OpenStack expert and I've not even looked at anything past
Kilo, but my understanding was that these commands were not necessary for
tenants. They access a virtual block device that backs to GPFS, correct?
Ed Wahl
OSC
++
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org]
on behalf of Luke Raimbach [Luke.Raimbach at crick.ac.uk]
Sent: Monday, June 15, 2015 4:35 AM
To: gpfsug-discuss at gpfsug.org
Subject: [gpfsug-discuss] OpenStack Manila Driver
Dear All,
We are looking forward to using the manila driver for auto-provisioning of
file shares using GPFS. However, I have some concerns...
Manila presumably gives tenant users access to file system commands like
mmlinkfileset and mmunlinkfileset. Given that mmunlinkfileset quiesces the
file system, there is potentially an impact from one tenant on another -
i.e. someone unlinking and deleting a lot of filesets during a tenancy
cleanup might cause a cluster pause long enough to trigger other failure
events or even start evicting nodes. You can see why this would be bad in a
cloud environment.
Has this scenario been addressed at all?
Cheers,
Luke.
Luke Raimbach
Senior HPC Data and Storage Systems Engineer
The Francis Crick Institute
Gibbs Building
215 Euston Road
London NW1 2BE
E: luke.raimbach at crick.ac.uk
W: www.crick.ac.uk
The Francis Crick Institute Limited is a registered charity in England and
Wales no. 1140062 and a company registered in England and Wales no.
06885462, with its registered office at 215 Euston Road, London NW1 2BE.
The Francis Crick Institute Limited is a registered charity in England and
Wales no. 1140062 and a company registered in England and Wales no.
06885462, with its registered office at 215 Euston Road, London NW1 2BE.
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenStack_Storage_Manila_with_GPFS.pdf
Type: application/pdf
Size: 354887 bytes
Desc: not available
URL:
From Luke.Raimbach at crick.ac.uk Thu Jun 18 13:30:40 2015
From: Luke.Raimbach at crick.ac.uk (Luke Raimbach)
Date: Thu, 18 Jun 2015 12:30:40 +0000
Subject: [gpfsug-discuss] Placement Policy Installation and RDM
Considerations
Message-ID:
Hi All,
Something I am thinking about doing is utilising the placement policy engine to insert custom metadata tags upon file creation, based on which fileset the creation occurs in. This might be to facilitate Research Data Management tasks that could happen later in the data lifecycle.
I am also thinking about allowing users to specify additional custom metadata tags (maybe through a fancy web interface) and also potentially giving users control over creating new filesets (e.g. for scientists running new experiments). So... pretend this is a placement policy on my GPFS-driven data-ingest platform:
RULE 'RDMTEST'
SET POOL 'instruments'
FOR FILESET
('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0')
WHERE SetXattr
('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a')
AND SetXattr
('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0')
RULE 'DEFAULT' SET POOL 'data'
The fileset name can be meaningless (as far as the user is concerned), but would be linked somewhere nice that they recognise - say /gpfs/incoming/instrument1. The fileset, when it is created, would also be an AFM cache for its 'home' counterpart which exists on a much larger (also GPFS driven) pool of storage... so that my metadata tags are preserved, you see.
This potentially user driven activity might look a bit like this:
- User logs in to web interface and creates new experiment
- Filesets (system-generated names) are created on 'home' and 'ingest' file systems and linked into the directory namespace wherever the user specifies
- AFM relationships are set up and established for the ingest (cache) fileset to write back to the AFM home fileset (probably Independent Writer mode)
- A set of 'default' policies are defined and installed on the cache file system to tag data for that experiment (the user can't change these)
- The user now specifies additional metadata tags they want added to their experiment data (some of this might be captured through additional mandatory fields in the web form for instance)
- A policy for later execution by mmapplypolicy on the AFM home file system is created which looks for the tags generated at ingest-time and applies the extra user-defined tags
There's much more that would go on later in the lifecycle to take care of automated HSM tiering, data publishing, movement and cataloguing of data onto external non-GPFS file systems, etc., but I won't go into it here. My GPFS-related questions are:
When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong.
What is the specific reason for the limitation that a placement policy file can be no larger than 1MB?
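For reference, this is roughly how I would expect to install and check such
a placement policy (the policy file name and device are made up):

# Validate the rules without installing them
mmchpolicy gpfs0 /tmp/ingest-placement.pol -I test

# Install the placement policy, then list what is active
mmchpolicy gpfs0 /tmp/ingest-placement.pol
mmlspolicy gpfs0 -L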
Cheers,
Luke.
Luke Raimbach
Senior HPC Data and Storage Systems Engineer
The Francis Crick Institute
Gibbs Building
215 Euston Road
London NW1 2BE
E: luke.raimbach at crick.ac.uk
W: www.crick.ac.uk
The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
From makaplan at us.ibm.com Thu Jun 18 14:18:34 2015
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Thu, 18 Jun 2015 09:18:34 -0400
Subject: [gpfsug-discuss] Placement Policy Installation and
RDM Considerations
In-Reply-To:
References:
Message-ID:
Yes, you can do this. In release 4.1.1 you can write SET POOL 'x'
ACTION(setXattr(...)) FOR FILESET(...) WHERE ...
which looks nicer to some people than WHERE ( ... ) AND setXattr(...)
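For example, a sketch only, reusing Luke's fileset name and attribute values
(check the 4.1.1 documentation for the exact clause ordering; combining the
two SetXattr calls with AND inside ACTION simply mirrors the WHERE trick):

RULE 'RDMTEST'
SET POOL 'instruments'
ACTION(SetXattr('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a') AND
       SetXattr('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0'))
FOR FILESET
('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0')
RULE 'DEFAULT' SET POOL 'data'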
Answers:
(1) No need to quiesce. As the new policy propagates, nodes begin using
it. So there can be a transition time when node A may be using the new
policy but Node B has not started
using it yet. If that is undesirable, you can quiesce.
(2) Yes, 1MB is a limit on the total size in bytes of your policy rules.
Do you have a real need for more? Would you please show us such a
scenario? Beware that policy rules take some CPU cycles to evaluate... So
if, for example, you had several thousand SET POOL rules, you might
notice some impact on file creation time.
--marc of GPFS
From: Luke Raimbach
...
RULE 'RDMTEST'
SET POOL 'instruments'
FOR FILESET
('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0')
WHERE SetXattr
('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a')
AND SetXattr
('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0')
RULE 'DEFAULT' SET POOL 'data'
...
(1) When I install a placement policy into the file system, does the file
system need to quiesce? My suspicion is yes, because the policy needs to
be consistent on all nodes performing I/O, but I may be wrong.
...
(2) What is the specific limitation for having a policy placement file no
larger than 1MB?
...
From S.J.Thompson at bham.ac.uk Thu Jun 18 14:27:52 2015
From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services))
Date: Thu, 18 Jun 2015 13:27:52 +0000
Subject: [gpfsug-discuss] Placement Policy Installation and RDM
Considerations
In-Reply-To:
References:
Message-ID:
I can see exactly where Luke's suggestion would be applicable.
We might have several hundred active research projects which would have some sort of internal identifier, so I can see why you'd want to do this sort of tagging, as it would allow a policy scan to find files related to specific projects (for example).
Simon
From: Marc A Kaplan
Reply-To: gpfsug main discussion list
Date: Thursday, 18 June 2015 14:18
To: gpfsug main discussion list, "luke.raimbach at crick.ac.uk"
Subject: Re: [gpfsug-discuss] Placement Policy Installation and RDM Considerations
(2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time.
From Luke.Raimbach at crick.ac.uk Thu Jun 18 14:35:32 2015
From: Luke.Raimbach at crick.ac.uk (Luke Raimbach)
Date: Thu, 18 Jun 2015 13:35:32 +0000
Subject: [gpfsug-discuss] Placement Policy Installation and
RDM Considerations
In-Reply-To:
References:
Message-ID:
Hi Marc,
Thanks for the pointer to the updated syntax. That indeed looks nicer.
(1) Asynchronous policy propagation sounds good in our scenario. We don't want to potentially interrupt other running experiments by having to quiesce the filesystem for a new one coming online. It is useful to know that you could quiesce if desired. Presumably this is a secret flag one might pass to mmchpolicy?
(2) I was concerned about the evaluation time if I tried to set all extended attributes at creation time. That's why I thought about adding a few 'system'-defined tags which could later be used to link the files to an asynchronously applied policy on the home cluster. I think I calculated around 4,000 rules (dependent on the size of the attribute names and values), which might limit the number of experiments supported on a single ingest file system. However, I can't envisage we will ever have 4,000 experiments running at once! I was really interested in why the limitation existed from a file-system architecture point of view.
Thanks for the responses.
Luke.
From: Marc A Kaplan [mailto:makaplan at us.ibm.com]
Sent: 18 June 2015 14:19
To: gpfsug main discussion list; Luke Raimbach
Subject: Re: [gpfsug-discuss] Placement Policy Installation and RDM Considerations
Yes, you can do this. In release 4.1.1 you can write SET POOL 'x' ACTION(setXattr(...)) FOR FILESET(...) WHERE ...
which looks nicer to some people than WHERE ( ... ) AND setXattr(...)
Answers:
(1) No need to quiesce. As the new policy propagates, nodes begin using it. So there can be a transition time when node A may be using the new policy but Node B has not started
using it yet. If that is undesirable, you can quiesce.
(2) Yes, 1MB is a limit on the total size in bytes of your policy rules. Do you have a real need for more? Would you please show us such a scenario? Beware that policy rules take some cpu cycles to evaluate... So if for example, if you had several thousand SET POOL rules, you might notice some impact to file creation time.
--marc of GPFS
From: Luke Raimbach
...
RULE 'RDMTEST'
SET POOL 'instruments'
FOR FILESET
('%GPFSRDM%10.01013%RDM%0ab34906-5357-4ca0-9d19-a470943db30a%RDM%8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0')
WHERE SetXattr
('user.rdm.parent','0ab34906-5357-4ca0-9d19-a470943db30a')
AND SetXattr
('user.rdm.ingestor','8fc2395d-64c0-4ebd-8c71-0d2d34b3c1c0')
RULE 'DEFAULT' SET POOL 'data'
...
(1) When I install a placement policy into the file system, does the file system need to quiesce? My suspicion is yes, because the policy needs to be consistent on all nodes performing I/O, but I may be wrong.
...
(2) What is the specific limitation for having a policy placement file no larger than 1MB?
...
The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
From massimo_fumagalli at it.ibm.com Thu Jun 18 14:38:00 2015
From: massimo_fumagalli at it.ibm.com (Massimo Fumagalli)
Date: Thu, 18 Jun 2015 15:38:00 +0200
Subject: [gpfsug-discuss] ILM question
Message-ID:
Please, I have a simple question.
Using Spectrum Scale 4.1.1, suppose we set an ILM policy for migrating
files from file system Tier 0 to Tier 1 or Tier 2 (for example, using
LTFS to a tape library).
Then we need to read a file that has been moved to the library (or Tier 1).
Will the file be copied back to Tier 0, or will the read be executed
directly from the library or Tier 1? There could be a performance issue.
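For illustration, the kind of rules in question might look roughly like this
(the pool names, the external-pool script path and the thresholds are
made-up placeholders):

/* Tier 1 is an internal disk pool: migrate cold files when Tier 0 fills */
RULE 'toTier1' MIGRATE FROM POOL 'tier0' THRESHOLD(80,60)
  TO POOL 'tier1'
  WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30

/* Tier 2 is an external pool (e.g. LTFS/tape) driven by an HSM script */
RULE EXTERNAL POOL 'ltfs' EXEC '/usr/local/bin/hsm-migrate.sh'
RULE 'toTier2' MIGRATE FROM POOL 'tier1' TO POOL 'ltfs'
  WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 180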
Regards
Max
IBM Italia S.p.A.
Registered office: Circonvallazione Idroscalo - 20090 Segrate (MI)
Share capital: EUR 347,256,998.80
C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153
Company with a sole shareholder
Company subject to the direction and coordination of
International Business Machines Corporation
(Salvo che sia diversamente indicato sopra / Unless stated otherwise
above)
From ewahl at osc.edu Thu Jun 18 15:08:29 2015
From: ewahl at osc.edu (Wahl, Edward)
Date: Thu, 18 Jun 2015 14:08:29 +0000
Subject: [gpfsug-discuss] Disabling individual Storage Pools by themselves?
Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
We had a unique situation occur with one of our many storage arrays in the past couple of days and it brought up a question I've had before. Is there a better way to disable a storage pool by itself, rather than 'mmchdisk stop' on the entire list of disks from that pool, or mmfsctl and excluding things, etc.? Thoughts?
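(For reference, the workaround I mean is roughly the following - the file
system and pool names here are made up:)

# NSDs in the affected pool (the storage pool is the last column of mmlsdisk)
DISKS=$(mmlsdisk gpfs0 | awk '$NF == "sas_pool_3" {print $1}' | paste -sd';')

# Stop I/O to all of those disks in one go
mmchdisk gpfs0 stop -d "$DISKS"

# Later, bring them back and let GPFS recover
mmchdisk gpfs0 start -d "$DISKS"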
In our case our array lost all RAID protection in a certain pool (8+2) due to a hardware failure, and started showing drive check-condition errors on other drives in the array. Yikes! This pool itself is only about 720T and is backed by tape, but who wants to restore that? Even with SOBAR/HSM that would be a loooong week. ^_^ We made the decision to take the entire file system offline during the repair/rebuild, but I would like to have been able to keep all other pools running, save this one, in a simpler manner than the route we took.
I'm interested in people's experiences here for future planning and disaster recovery. GPFS itself worked exactly as we had planned and expected but I think there is room to improve the file system if I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner.
I may not be expressing myself here in the best manner possible. Bit of sleep deprivation after the last couple of days. ;)
Ed Wahl
OSC
From Paul.Sanchez at deshaw.com Thu Jun 18 15:52:07 2015
From: Paul.Sanchez at deshaw.com (Sanchez, Paul)
Date: Thu, 18 Jun 2015 14:52:07 +0000
Subject: [gpfsug-discuss] Member locations for Dev meeting organisation
In-Reply-To: <557ED229.50902@gpfsug.org>
References: <557ED229.50902@gpfsug.org>
Message-ID: <201D6001C896B846A9CFC2E841986AC1454124B2@mailnycmb2a.winmail.deshaw.com>
Thanks Jez,
D. E. Shaw is based in New York, NY. We have 3-4 engineers/architects who would attend.
Additionally, if you haven't heard from D. E. Shaw Research, they're next-door and have another 2.
-Paul Sanchez
Sent with Good (www.good.com)
________________________________
From: gpfsug-discuss-bounces at gpfsug.org on behalf of Jez Tucker (Chair)
Sent: Monday, June 15, 2015 9:24:57 AM
To: gpfsug main discussion list
Subject: [gpfsug-discuss] Member locations for Dev meeting organisation
Hello all
It would be very handy if all members could send me an email to:
chair at gpfsug.org with the City and Country in which you are located.
We're looking to place 'Meet the Devs' coffee-shops close to you, so
this would make planning several orders of magnitude easier.
I can infer from each member's email, but it's only 'mostly accurate'.
Stateside members - we're actively organising a first meet up near you
imminently, so please ping me your locations.
All the best,
Jez (Chair)
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From makaplan at us.ibm.com Thu Jun 18 16:36:49 2015
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Thu, 18 Jun 2015 11:36:49 -0400
Subject: [gpfsug-discuss] Placement Policy Installation and
RDM Considerations
In-Reply-To:
References:
Message-ID:
(1) There is no secret flag. I assume that the existing policy is okay
but the new one is better. So start using the better one ASAP, but why
stop the system if you don't have to?
The not-so-secret way to quiesce/resume a filesystem without unmounting it is:
mmfsctl Device {suspend | suspend-write | resume}
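If you did want to quiesce around the change anyway, a minimal sketch (the filesystem name fs0 and policy file name are placeholders; as noted, mmchpolicy itself does not require this):
mmfsctl fs0 suspend-write     # block new writes; reads carry on
mmchpolicy fs0 placement.pol
mmfsctl fs0 resume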
(2) The policy rules text is passed as a string through a GPFS rpc
protocol (not a standard RPC) and the designer/coder chose 1MB as a
safety-limit. I think it could be increased, but suppose you did have 4000
rules, each 200 bytes - you'd be at 800KB, still short of the 1MB limit.
(x) Personally, I wouldn't worry much about setting, say 10 extended
attribute values in each rule. I'd worry more about the impact of having
100s of rules.
(y) When designing/deploying a new GPFS filesystem, consider explicitly
setting the inode size so that all anticipated extended attributes will be
stored in the inode, rather than spilling into other disk blocks. See
mmcrfs ... -i InodeSize. You can build a test filesystem with just one
NSD/LUN and test your anticipated usage. Use tsdbfs ... xattr ... to
see how EAs are stored. Caution: tsdbfs display commands are harmless,
BUT there are some patch and patch-like subcommands that could foul up
your filesystem.
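As a rough illustration of (y) - the filesystem name, stanza file and the 4K inode size are only an example:
mmcrfs testfs -F test_nsd.stanza -i 4096   # larger inodes leave room for small EAs in the inode itself
mmlsfs testfs -i                           # confirm the inode size actually in effect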
From: Luke Raimbach
Hi Marc,
Thanks for the pointer to the updated syntax. That indeed looks nicer.
(1) Asynchronous policy propagation sounds good in our scenario. We
don't want to potentially interrupt other running experiments by having to
quiesce the filesystem for a new one coming online. It is useful to know
that you could quiesce if desired. Presumably this is a secret flag one
might pass to mmchpolicy?
(2) I was concerned about the evaluation time if I tried to set all
extended attributes at creation time. That's why I thought about adding a
few 'system' defined tags which could later be used to link the files to
an asynchronously applied policy on the home cluster. I think I calculated
around 4,000 rules (dependent on the size of the attribute names and
values), which might limit the number of experiments supported on a single
ingest file system. However, I can't envisage we will ever have 4,000
experiments running at once! I was really interested in why the limitation
existed from a file-system architecture point of view.
Thanks for the responses.
Luke.
From zgiles at gmail.com Thu Jun 18 17:02:54 2015
From: zgiles at gmail.com (Zachary Giles)
Date: Thu, 18 Jun 2015 12:02:54 -0400
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves?
In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
Sorry to hear about the problems you've had recently. It's frustrating
when that happens.
I didn't have exactly the same situation, but we had something
similar which may bring some light to a missing disks situation:
We had a dataOnly storage pool backed by a few building blocks each
consisting of several RAID controllers that were each direct attached
to a few servers. We had several of these sets all in one pool. Thus,
if a server failed it was fine, if a single link failed, it was fine.
Potentially we could do copies=2 and have multiple failure groups in a
single pool. If anything in the RAID arrays themselves failed, it was
OK, but a single whole RAID controller going down would take that
section of disks down. The number of copies was set to 1 on this pool.
One RAID controller went down, but the file system as a whole stayed
online. Our user experience was that some users got IO errors of a
"file inaccessible" type (I don't remember the exact code). Other
users, and especially those mostly in other tiers continued to work as
normal. As we had mostly small files across this tier ( much smaller
than the GPFS block size ), most of the files were in one of the RAID
controllers or another, thus not striping really, so even the files in
other controllers on the same tier were also fine and accessible.
Bottom line is: Only the files that were missing gave errors, the
others were fine. Additionally, for missing files errors were reported
which apps could capture and do something about, wait, or retry later
-- not a D state process waiting forever or stale file handles.
I'm not saying this is the best way. We didn't intend for this to
happen. I suspect that stopping the disk would result in a similar
experience but more safely.
We asked GPFS devs if we needed to fsck after this since the tier just
went offline directly and we continued to use the rest of the system
while it was gone. They said no, it should be fine and missing blocks
will be taken care of. I assume this is true, but I have no explicit
proof, except that it's still working and nothing seemed to be
missing.
I guess some questions for the dev's would be:
* Is this safe / advisable to do the above either directly or via a
stop and then down the array?
* Given that there is some client-side write caching in GPFS, if a
file is being written and an expected final destination goes offline
mid-write, where does the block go?
+ If a whole pool goes offline, will it pick another pool or error?
+ If it's a disk in a pool, will it reevaluate and round-robin to
the next disk, or just fail since it had already decided where to
write?
Hope this helps a little.
On Thu, Jun 18, 2015 at 10:08 AM, Wahl, Edward wrote:
> We had a unique situation with one of our many storage arrays occur in the past couple of days and it brought up a question I've had before. Is there a better way to disable a Storage Pool by itself rather than 'mmchdisk stop' the entire list of disks from that pool or mmfsctl and exclude things, etc? Thoughts?
>
> In our case our array lost all raid protection in a certain pool (8+2) due to a hardware failure, and started showing drive checkcondition errors on other drives in the array. Yikes! This pool itself is only about 720T and is backed by tape, but who wants to restore that? Even with SOBAR/HSM that would be a loooong week. ^_^ We made the decision to take the entire file system offline during the repair/rebuild, but I would like to have run all other pools save this one in a simpler manner than we took to get there.
>
> I'm interested in people's experiences here for future planning and disaster recovery. GPFS itself worked exactly as we had planned and expected but I think there is room to improve the file system if I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner.
>
> I may not be expressing myself here in the best manner possible. Bit of sleep deprivation after the last couple of days. ;)
>
> Ed Wahl
> OSC
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Zach Giles
zgiles at gmail.com
From zgiles at gmail.com Thu Jun 18 17:06:33 2015
From: zgiles at gmail.com (Zachary Giles)
Date: Thu, 18 Jun 2015 12:06:33 -0400
Subject: [gpfsug-discuss] ILM question
In-Reply-To:
References:
Message-ID:
I would expect it to return to one of your online tiers. If you tier
between two storage pools, you can directly read and write those files.
Think of how LTFS works -- it's an external storage pool, so you need to
run an operation via an external command to give the file back to GPFS from
which you can read it. This is controlled via the policies and I assume you
would need to make a policy to specify where the file would be placed when
it comes back.
It would be fancy for someone to allow reading directly from an external
pool, but as far as I know, it has to hit a disk first.
What I don't know is: Will it begin streaming the files back to the user as
the blocks hit the disk, while other blocks are still coming in, or must
the whole file be recalled first?
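For reference, the external-pool plumbing being described takes roughly this shape in the policy language (pool names, the threshold and the interface script path are invented; the real script is supplied by your LTFS/HSM product):
RULE 'ext_ltfs' EXTERNAL POOL 'ltfs' EXEC '/opt/ltfs/bin/interface.sh'  /* hypothetical script */
RULE 'to_tape' MIGRATE FROM POOL 'data' THRESHOLD(90,80) TO POOL 'ltfs'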
On Thu, Jun 18, 2015 at 9:38 AM, Massimo Fumagalli <
massimo_fumagalli at it.ibm.com> wrote:
> Please, I need to know a simple question.
>
> Using Spectrum Scale 4.1.1, supposing to set ILM policy for migrating
> files from Filesystem Tier0 to TIer 1 or Tier2 (example using LTFS to
> library).
> Then we need to read a file that has been moved to library (or Tier1).
> Will be file copied back to Tier 0? Or read will be executed directly from
> Library or Tier1 ? since there can be performance issue
>
> Regards
> Max
>
>
> IBM Italia S.p.A.
> Sede Legale: Circonvallazione Idroscalo - 20090 Segrate (MI)
> Cap. Soc. euro 347.256.998,80
> C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153
> Societ? con unico azionista
> Societ? soggetta all?attivit? di direzione e coordinamento di
> International Business Machines Corporation
>
> (Salvo che sia diversamente indicato sopra / Unless stated otherwise above)
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
--
Zach Giles
zgiles at gmail.com
From chekh at stanford.edu Thu Jun 18 21:26:17 2015
From: chekh at stanford.edu (Alex Chekholko)
Date: Thu, 18 Jun 2015 13:26:17 -0700
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID: <55832969.4050901@stanford.edu>
mmlsdisk fs | grep pool | awk '{print $1}' | tr '\n' ';' | xargs -I{} mmchdisk fs suspend -d "{}"
# seems pretty simple to me
Then I guess you also have to modify your policy rules which relate to
that pool.
You're asking for a convenience wrapper script for a super-uncommon
situation?
On 06/18/2015 09:02 AM, Zachary Giles wrote:
> I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner.
--
Alex Chekholko chekh at stanford.edu
From ewahl at osc.edu Thu Jun 18 21:36:48 2015
From: ewahl at osc.edu (Wahl, Edward)
Date: Thu, 18 Jun 2015 20:36:48 +0000
Subject: [gpfsug-discuss] Disabling individual Storage Pools
by themselves?
In-Reply-To: <55832969.4050901@stanford.edu>
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
,
<55832969.4050901@stanford.edu>
Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A58A6A@CIO-KRC-D1MBX02.osuad.osu.edu>
I'm not sure it's so uncommon, but yes. (and your line looks suspiciously like mine did) I've had other situations where it would have been nice to do maintenance on a single storage pool. Maybe this is a "scale" issue when you get too large and should maybe have multiple file systems instead? Single name space is nice for users though.
Plus I was curious what others had done in similar situations.
I guess I could do what IBM does and just write the stupid script, name it "ts-something" and put a happy wrapper up front with a mm-something name. ;)
Just FYI: 'suspend' does NOT stop I/O. It only stops new block creation, so 'stop' was what I did.
From the man page: "...Existing data on a suspended disk may still be read or updated."
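So the 'stop' variant of Alex's one-liner would look roughly like this, assuming a filesystem gpfs0 and a data-only pool mypool (both names made up) and that grepping the pool name out of mmlsdisk is precise enough for your disk naming:
mmlsdisk gpfs0 | grep mypool | awk '{print $1}' | paste -sd';' - | xargs -I{} mmchdisk gpfs0 stop -d "{}"
# ...and after the rebuild:
mmlsdisk gpfs0 | grep mypool | awk '{print $1}' | paste -sd';' - | xargs -I{} mmchdisk gpfs0 start -d "{}"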
Ed Wahl
OSC
________________________________________
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Alex Chekholko [chekh at stanford.edu]
Sent: Thursday, June 18, 2015 4:26 PM
To: gpfsug-discuss at gpfsug.org
Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves?
mmlsdisk fs | grep pool | awk '{print $1}' | tr '\n' ';' | xargs -I{} mmchdisk fs suspend -d "{}"
# seems pretty simple to me
Then I guess you also have to modify your policy rules which relate to
that pool.
You're asking for a convenience wrapper script for a super-uncommon
situation?
On 06/18/2015 09:02 AM, Zachary Giles wrote:
> I could "turn down" an entire Storage Pool that did not have metadata for other pools on it, in a simpler manner.
--
Alex Chekholko chekh at stanford.edu
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From makaplan at us.ibm.com Thu Jun 18 22:01:01 2015
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Thu, 18 Jun 2015 17:01:01 -0400
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves?
In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
What do you see as the pros and cons of using GPFS Native Raid and
configuring your disk arrays as JBODs instead of using RAID in a box?
From stijn.deweirdt at ugent.be Fri Jun 19 08:18:31 2015
From: stijn.deweirdt at ugent.be (Stijn De Weirdt)
Date: Fri, 19 Jun 2015 09:18:31 +0200
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID: <5583C247.1090609@ugent.be>
> What do you see as the pros and cons of using GPFS Native Raid and
> configuring your disk arrays as JBODs instead of using RAID in a box.
just this week we had an issue with a bad disk (of the non-failing but
disrupting everything kind) and issues with the raid controller (db of
both controllers corrupted due to the one disk, controller reboot loops
etc etc).
but tech support pulled it through, although it took a while. i'm amazed
what can be done with the hardware controllers (and i've seen my share
of recoveries ;)
my question to ibm wrt gss would be: can we have a "demo" of gss
recovering from eg a drawer failure (eg pull both sas connectors to the
drawer itself). i like the gss we have and the data recovery for single
disk failures, but i'm not sure how well it does with major component
failures. the demo could be the steps support would take to get it
running again (e.g. can gss recover from a drawer failure, assuming the
disks are still ok of course).
stijn
>
>
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
From Luke.Raimbach at crick.ac.uk Fri Jun 19 08:47:10 2015
From: Luke.Raimbach at crick.ac.uk (Luke Raimbach)
Date: Fri, 19 Jun 2015 07:47:10 +0000
Subject: [gpfsug-discuss] Disabling individual Storage Pools
by themselves?
In-Reply-To: <5583C247.1090609@ugent.be>
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
<5583C247.1090609@ugent.be>
Message-ID:
my question to ibm wrt gss would be: can we have a "demo" of gss recovering from eg a drawer failure (eg pull both sas connectors to the drawer itself). i like the gss we have and the data recovery for single disk failures, but i'm not sure how well it does with major component failures. the demo could be the steps support would take to get it running again (e.g. can gss recover from a drawer failure, assuming the disks are still ok ofcourse).
Ooh, we have a new one that's not in production yet. IBM say the latest GSS code should allow for a whole enclosure failure. I might try it before going into production.
The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.
From S.J.Thompson at bham.ac.uk Fri Jun 19 14:31:17 2015
From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services))
Date: Fri, 19 Jun 2015 13:31:17 +0000
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
Do you mean in GNR compared to using (san/IB) based hardware RAIDs?
If so, then GNR isn't a scale-out solution - you buy a 'unit' and can add another 'unit' to the namespace, but I can't add another 30TB of storage (say a researcher with a grant), whereas with SAN based RAID controllers, I can go off and buy another storage shelf.
Simon
From: Marc A Kaplan >
Reply-To: gpfsug main discussion list >
Date: Thursday, 18 June 2015 22:01
To: gpfsug main discussion list >
Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves?
What do you see as the pros and cons of using GPFS Native Raid and configuring your disk arrays as JBODs instead of using RAID in a box.
From makaplan at us.ibm.com Fri Jun 19 14:51:32 2015
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Fri, 19 Jun 2015 09:51:32 -0400
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
1. YES, Native Raid can recover from various failures: drawers, cabling,
controllers, power supplies, etc, etc.
Of course it must be configured properly so that there is no possible
single point of failure.
But yes, you should get your hands on a test rig and try out (simulate)
various failure scenarios and see how well it works.
2. I don't know the details of the packaged products, but I believe you
can license the software and configure huge installations,
comprising as many racks of disks, and associated hardware as you desire
or need. The software was originally designed to be used
in the huge HPC computing laboratories of certain governmental and
quasi-governmental institutions.
3. If you'd like to know and/or explore more, read the pubs, do the
experiments, and/or contact the IBM sales and support people.
IF by some chance you do not get satisfactory answers, come back here
perhaps we can get your inquiries addressed by the
GPFS design team. Like other complex products, there are bound to be some
questions that the sales and marketing people
can't quite address.
From S.J.Thompson at bham.ac.uk Fri Jun 19 14:56:33 2015
From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services))
Date: Fri, 19 Jun 2015 13:56:33 +0000
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
Er. Every time I've ever asked about GNR, the response has been that it's only available as packaged products, as it has to understand things like the shelf controllers, disk drives etc., in order for things like the disk hospital to work. (And the last time I asked about GNR was in May at the User Group).
So under (3), I'm posting here asking if anyone from IBM knows anything different?
Thanks
Simon
From: Marc A Kaplan >
Reply-To: gpfsug main discussion list >
Date: Friday, 19 June 2015 14:51
To: gpfsug main discussion list >
Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid?
2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations,
comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used
in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions.
From ewahl at osc.edu Fri Jun 19 15:37:13 2015
From: ewahl at osc.edu (Wahl, Edward)
Date: Fri, 19 Jun 2015 14:37:13 +0000
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
,
Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu>
They seem to have killed dead the idea of just moving the software side of that by itself, wayyy.... back now. late 2013?/early 2014 Even though the components are fairly standard units(Engenio before, not sure now).
Ironic as GPFS/Spectrum Scale is storage software... Even more ironic for our site as we have the Controller based units these JBODs are from for our metadata and one of our many storage pools. (DDN on others)
Ed
________________________________
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Simon Thompson (Research Computing - IT Services) [S.J.Thompson at bham.ac.uk]
Sent: Friday, June 19, 2015 9:56 AM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid?
Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group).
So under (3), I?m posting here asking if anyone from IBM knows anything different?
Thanks
Simon
From: Marc A Kaplan >
Reply-To: gpfsug main discussion list >
Date: Friday, 19 June 2015 14:51
To: gpfsug main discussion list >
Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid?
2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations,
comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used
in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions.
From oehmes at us.ibm.com Fri Jun 19 15:49:32 2015
From: oehmes at us.ibm.com (Sven Oehme)
Date: Fri, 19 Jun 2015 14:49:32 +0000
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID: <201506191450.t5JEovUS018695@d01av01.pok.ibm.com>
GNR today is only sold as a packaged solution e.g. ESS.
The reason it's not sold as SW-only today is technical, and it's not true that this is not being pursued; it's just not there yet and we can't discuss plans on a mailing list.
Sven
Sent from IBM Verse
Wahl, Edward --- Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? ---
From:"Wahl, Edward" To:"gpfsug main discussion list" Date:Fri, Jun 19, 2015 9:41 AMSubject:Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid?
They seem to have killed dead the idea of just moving the software side of that by itself, wayyy.... back now. late 2013?/early 2014 Even though the components are fairly standard units(Engenio before, not sure now).
Ironic as GPFS/Spectrum Scale is storage software... Even more ironic for our site as we have the Controller based units these JBODs are from for our metadata and one of our many storage pools. (DDN on others)
Ed
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Simon Thompson (Research Computing - IT Services) [S.J.Thompson at bham.ac.uk]
Sent: Friday, June 19, 2015 9:56 AM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid?
Er. Everytime I?ve ever asked about GNR,, the response has been that its only available as packaged products as it has to understand things like the shelf controllers, disk drives etc, in order for things like the disk hospital to work. (And the last time I asked talked about GNR was in May at the User group).
So under (3), I?m posting here asking if anyone from IBM knows anything different?
Thanks
Simon
From: Marc A Kaplan
Reply-To: gpfsug main discussion list
Date: Friday, 19 June 2015 14:51
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid?
2. I don't know the details of the packaged products, but I believe you can license the software and configure huge installations,
comprising as many racks of disks, and associated hardware as you desire or need. The software was originally designed to be used
in the huge HPC computing laboratories of certain governmental and quasi-governmental institutions.
From zgiles at gmail.com Fri Jun 19 15:56:19 2015
From: zgiles at gmail.com (Zachary Giles)
Date: Fri, 19 Jun 2015 10:56:19 -0400
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
I think it's technically possible to run GNR on unsupported trays. You
may have to do some fiddling with some of the scripts, and/or you won't
get proper reporting.
Of course it probably violates 100 licenses etc etc etc.
I don't know of anyone who's done it yet. I'd like to do it. I think
it would be great to learn it deeper by doing this.
On Fri, Jun 19, 2015 at 9:56 AM, Simon Thompson (Research Computing -
IT Services) wrote:
> Er. Everytime I?ve ever asked about GNR,, the response has been that its
> only available as packaged products as it has to understand things like the
> shelf controllers, disk drives etc, in order for things like the disk
> hospital to work. (And the last time I asked talked about GNR was in May at
> the User group).
>
> So under (3), I?m posting here asking if anyone from IBM knows anything
> different?
>
> Thanks
>
> Simon
>
> From: Marc A Kaplan
> Reply-To: gpfsug main discussion list
> Date: Friday, 19 June 2015 14:51
> To: gpfsug main discussion list
> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by
> themselves? How about GPFS Native Raid?
>
> 2. I don't know the details of the packaged products, but I believe you can
> license the software and configure huge installations,
> comprising as many racks of disks, and associated hardware as you desire or
> need. The software was originally designed to be used
> in the huge HPC computing laboratories of certain governmental and
> quasi-governmental institutions.
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
--
Zach Giles
zgiles at gmail.com
From jtucker at pixitmedia.com Fri Jun 19 16:05:24 2015
From: jtucker at pixitmedia.com (Jez Tucker (Chair))
Date: Fri, 19 Jun 2015 16:05:24 +0100
Subject: [gpfsug-discuss] Handing over chair@
Message-ID: <55842FB4.9030705@gpfsug.org>
Hello all
This is my last post as Chair for the foreseeable future.
The next will come from Simon Thompson who assumes the post today for
the next two years.
I'm looking forward to Simon's tenure and wish him all the best with his
endeavours.
Myself, I'm moving over to UG Media Rep and will continue to support the
User Group and committee in its efforts. My new email is
jez.tucker at gpfsug.org
Please keep sending through your City and Country locations, they're
most helpful.
Have a great weekend.
All the best,
Jez
--
This email is confidential in that it is intended for the exclusive
attention of the addressee(s) indicated. If you are not the intended
recipient, this email should not be read or disclosed to any other person.
Please notify the sender immediately and delete this email from your
computer system. Any opinions expressed are not necessarily those of the
company from which this email was sent and, whilst to the best of our
knowledge no viruses or defects exist, no responsibility can be accepted
for any loss or damage arising from its receipt or subsequent use of this
email.
From oehmes at us.ibm.com Fri Jun 19 16:09:41 2015
From: oehmes at us.ibm.com (Sven Oehme)
Date: Fri, 19 Jun 2015 15:09:41 +0000
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
Message-ID: <201506191510.t5JFAdW2021605@d03av04.boulder.ibm.com>
Reporting is not the issue; one of the main issues is that we can't talk to the enclosure, which results in losing the capability to replace disk drives or turn any fault indicators on.
It also prevents us from 'reading' the position of a drive within a tray or fault domain within an enclosure; without that information we can't properly determine where we need to place strips of a track to prevent data access loss in case an enclosure or component fails.
Sven
Sent from IBM Verse
Zachary Giles --- Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid? ---
From:"Zachary Giles" To:"gpfsug main discussion list" Date:Fri, Jun 19, 2015 9:56 AMSubject:Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid?
I think it's technically possible to run GNR on unsupported trays. You may have to do some fiddling with some of the scripts, and/or you wont get proper reporting. Of course it probably violates 100 licenses etc etc etc. I don't know of anyone who's done it yet. I'd like to do it.. I think it would be great to learn it deeper by doing this. On Fri, Jun 19, 2015 at 9:56 AM, Simon Thompson (Research Computing - IT Services) wrote: > Er. Everytime I?ve ever asked about GNR,, the response has been that its > only available as packaged products as it has to understand things like the > shelf controllers, disk drives etc, in order for things like the disk > hospital to work. (And the last time I asked talked about GNR was in May at > the User group). > > So under (3), I?m posting here asking if anyone from IBM knows anything > different? > > Thanks > > Simon > > From: Marc A Kaplan > Reply-To: gpfsug main discussion list > Date: Friday, 19 June 2015 14:51 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by > themselves? How about GPFS Native Raid? > > 2. I don't know the details of the packaged products, but I believe you can > license the software and configure huge installations, > comprising as many racks of disks, and associated hardware as you desire or > need. The software was originally designed to be used > in the huge HPC computing laboratories of certain governmental and > quasi-governmental institutions. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From jonathan at buzzard.me.uk Fri Jun 19 16:15:44 2015
From: jonathan at buzzard.me.uk (Jonathan Buzzard)
Date: Fri, 19 Jun 2015 16:15:44 +0100
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID: <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk>
On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote:
> I think it's technically possible to run GNR on unsupported trays. You
> may have to do some fiddling with some of the scripts, and/or you wont
> get proper reporting.
> Of course it probably violates 100 licenses etc etc etc.
> I don't know of anyone who's done it yet. I'd like to do it.. I think
> it would be great to learn it deeper by doing this.
>
One imagines that GNR uses the SCSI enclosure services to talk to the
shelves.
https://en.wikipedia.org/wiki/SCSI_Enclosure_Services
https://en.wikipedia.org/wiki/SES-2_Enclosure_Management
Which would suggest that anything that supported these would work.
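For the curious, the stock sg3_utils tooling will show what a shelf exposes over SES (the sg device name below is just an example):
lsscsi -g                   # find the enclosure's generic device (type "enclosu")
sg_ses /dev/sg24            # list the diagnostic pages the shelf supports
sg_ses --page=2 /dev/sg24   # enclosure status: slots, fans, PSUs, temperatures
sg_ses --join /dev/sg24     # element descriptors joined with their status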
I did some experimentation with a spare EXP810 shelf a few years ago on
a FC-AL on Linux. Kind all worked out the box. The other experiment with
an EXP100 didn't work so well; with the EXP100 it would only work with
the 250GB and 400GB drives that came with the dam thing. With the EXP810
I could screw random SATA drives into it and it all worked. My
investigations concluded that the firmware on the EXP100 shelf
determined if the drive was supported, but I could not work out how to
upload modified firmware to the shelf.
JAB.
--
Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.
From stijn.deweirdt at ugent.be Fri Jun 19 16:23:18 2015
From: stijn.deweirdt at ugent.be (Stijn De Weirdt)
Date: Fri, 19 Jun 2015 17:23:18 +0200
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID: <558433E6.8030708@ugent.be>
hi marc,
> 1. YES, Native Raid can recover from various failures: drawers, cabling,
> controllers, power supplies, etc, etc.
> Of course it must be configured properly so that there is no possible
> single point of failure.
hmmm, this is not really what i was asking about. but maybe it's easier
in gss to do this properly (eg for 8+3 data protection, you only need 11
drawers if you can make sure the data+parity blocks are sent to
different drawers (sort of per drawer failure group, but internal to the
vdisks), and the smallest setup is a gss24 which has 20 drawers).
but i can't remember any manual suggesting the admin can control this
(or is it the default?).
anyway, i'm certainly interested in any config whitepapers or guides to
see what is required for such setup. are these public somewhere? (have
really searched for them).
>
> But yes, you should get your hands on a test rig and try out (simulate)
> various failure scenarios and see how well it works.
is there a way besides presales to get access to such setup?
stijn
>
> 2. I don't know the details of the packaged products, but I believe you
> can license the software and configure huge installations,
> comprising as many racks of disks, and associated hardware as you desire
> or need. The software was originally designed to be used
> in the huge HPC computing laboratories of certain governmental and
> quasi-governmental institutions.
>
> 3. If you'd like to know and/or explore more, read the pubs, do the
> experiments, and/or contact the IBM sales and support people.
> IF by some chance you do not get satisfactory answers, come back here
> perhaps we can get your inquiries addressed by the
> GPFS design team. Like other complex products, there are bound to be some
> questions that the sales and marketing people
> can't quite address.
>
>
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
From jonathan at buzzard.me.uk Fri Jun 19 16:35:32 2015
From: jonathan at buzzard.me.uk (Jonathan Buzzard)
Date: Fri, 19 Jun 2015 16:35:32 +0100
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To: <558433E6.8030708@ugent.be>
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
<558433E6.8030708@ugent.be>
Message-ID: <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk>
On Fri, 2015-06-19 at 17:23 +0200, Stijn De Weirdt wrote:
> hi marc,
>
>
> > 1. YES, Native Raid can recover from various failures: drawers, cabling,
> > controllers, power supplies, etc, etc.
> > Of course it must be configured properly so that there is no possible
> > single point of failure.
> hmmm, this is not really what i was asking about. but maybe it's easier
> in gss to do this properly (eg for 8+3 data protection, you only need 11
> drawers if you can make sure the data+parity blocks are send to
> different drawers (sort of per drawer failure group, but internal to the
> vdisks), and the smallest setup is a gss24 which has 20 drawers).
> but i can't rememeber any manual suggestion the admin can control this
> (or is it the default?).
>
I got the impression that GNR was more in line with the Engenio dynamic
disk pools
http://www.netapp.com/uk/technology/dynamic-disk-pools.aspx
http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dynamic_disk_pooling_technical_report.pdf
That is, traditional RAID sucks with large numbers of big drives.
JAB.
--
Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.
From bsallen at alcf.anl.gov Fri Jun 19 17:05:15 2015
From: bsallen at alcf.anl.gov (Allen, Benjamin S.)
Date: Fri, 19 Jun 2015 16:05:15 +0000
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To: <1434726944.9504.7.camel@buzzard.phy.strath.ac.uk>
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
<1434726944.9504.7.camel@buzzard.phy.strath.ac.uk>
Message-ID:
> One imagines that GNR uses the SCSI enclosure services to talk to the
> shelves.
It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned, the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial to do in a completely dynamic way.
So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology.
Ben
> On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote:
>
> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote:
>> I think it's technically possible to run GNR on unsupported trays. You
>> may have to do some fiddling with some of the scripts, and/or you wont
>> get proper reporting.
>> Of course it probably violates 100 licenses etc etc etc.
>> I don't know of anyone who's done it yet. I'd like to do it.. I think
>> it would be great to learn it deeper by doing this.
>>
>
> One imagines that GNR uses the SCSI enclosure services to talk to the
> shelves.
>
> https://en.wikipedia.org/wiki/SCSI_Enclosure_Services
> https://en.wikipedia.org/wiki/SES-2_Enclosure_Management
>
> Which would suggest that anything that supported these would work.
>
> I did some experimentation with a spare EXP810 shelf a few years ago on
> a FC-AL on Linux. Kind all worked out the box. The other experiment with
> an EXP100 didn't work so well; with the EXP100 it would only work with
> the 250GB and 400GB drives that came with the dam thing. With the EXP810
> I could screw random SATA drives into it and it all worked. My
> investigations concluded that the firmware on the EXP100 shelf
> determined if the drive was supported, but I could not work out how to
> upload modified firmware to the shelf.
>
> JAB.
>
> --
> Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
> Fife, United Kingdom.
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From peserocka at gmail.com Fri Jun 19 17:09:44 2015
From: peserocka at gmail.com (Pete Sero)
Date: Sat, 20 Jun 2015 00:09:44 +0800
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
<1434726944.9504.7.camel@buzzard.phy.strath.ac.uk>
Message-ID:
vi my_enclosures.conf
fwiw
Peter
On 2015 Jun 20 Sat, at 24:05, Allen, Benjamin S. wrote:
>> One imagines that GNR uses the SCSI enclosure services to talk to the
>> shelves.
>
> It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way.
>
> So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology.
>
> Ben
>
>> On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote:
>>
>> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote:
>>> I think it's technically possible to run GNR on unsupported trays. You
>>> may have to do some fiddling with some of the scripts, and/or you wont
>>> get proper reporting.
>>> Of course it probably violates 100 licenses etc etc etc.
>>> I don't know of anyone who's done it yet. I'd like to do it.. I think
>>> it would be great to learn it deeper by doing this.
>>>
>>
>> One imagines that GNR uses the SCSI enclosure services to talk to the
>> shelves.
>>
>> https://en.wikipedia.org/wiki/SCSI_Enclosure_Services
>> https://en.wikipedia.org/wiki/SES-2_Enclosure_Management
>>
>> Which would suggest that anything that supported these would work.
>>
>> I did some experimentation with a spare EXP810 shelf a few years ago on
>> a FC-AL on Linux. Kind all worked out the box. The other experiment with
>> an EXP100 didn't work so well; with the EXP100 it would only work with
>> the 250GB and 400GB drives that came with the dam thing. With the EXP810
>> I could screw random SATA drives into it and it all worked. My
>> investigations concluded that the firmware on the EXP100 shelf
>> determined if the drive was supported, but I could not work out how to
>> upload modified firmware to the shelf.
>>
>> JAB.
>>
>> --
>> Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
>> Fife, United Kingdom.
>>
>>
>> _______________________________________________
>> gpfsug-discuss mailing list
>> gpfsug-discuss at gpfsug.org
>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
From zgiles at gmail.com Fri Jun 19 17:12:53 2015
From: zgiles at gmail.com (Zachary Giles)
Date: Fri, 19 Jun 2015 12:12:53 -0400
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
<1434726944.9504.7.camel@buzzard.phy.strath.ac.uk>
Message-ID:
Ya, that's why I mentioned you'd probably have to fiddle with some
scripts or something to help GNR figure out where disks are. It's
definitely known that you can't just use any random enclosure given
that GNR depends highly on the topology. Maybe in the future there
would be a way to specify the topology or that a drive is at a
specific position.
On Fri, Jun 19, 2015 at 12:05 PM, Allen, Benjamin S.
wrote:
>> One imagines that GNR uses the SCSI enclosure services to talk to the
>> shelves.
>
> It does. It just isn't aware of SAS topologies outside of the supported hardware configuration. As Sven mentioned the issue is mapping out which drive is which, which enclosure is which, etc. That's not exactly trivial todo in a completely dynamic way.
>
> So while GNR can "talk" to the enclosures and disks, it just doesn't have any built-in logic today to be aware of every SAS enclosure and topology.
>
> Ben
>
>> On Jun 19, 2015, at 10:15 AM, Jonathan Buzzard wrote:
>>
>> On Fri, 2015-06-19 at 10:56 -0400, Zachary Giles wrote:
>>> I think it's technically possible to run GNR on unsupported trays. You
>>> may have to do some fiddling with some of the scripts, and/or you wont
>>> get proper reporting.
>>> Of course it probably violates 100 licenses etc etc etc.
>>> I don't know of anyone who's done it yet. I'd like to do it.. I think
>>> it would be great to learn it deeper by doing this.
>>>
>>
>> One imagines that GNR uses the SCSI enclosure services to talk to the
>> shelves.
>>
>> https://en.wikipedia.org/wiki/SCSI_Enclosure_Services
>> https://en.wikipedia.org/wiki/SES-2_Enclosure_Management
>>
>> Which would suggest that anything that supported these would work.
>>
>> I did some experimentation with a spare EXP810 shelf a few years ago on
>> a FC-AL on Linux. Kind all worked out the box. The other experiment with
>> an EXP100 didn't work so well; with the EXP100 it would only work with
>> the 250GB and 400GB drives that came with the dam thing. With the EXP810
>> I could screw random SATA drives into it and it all worked. My
>> investigations concluded that the firmware on the EXP100 shelf
>> determined if the drive was supported, but I could not work out how to
>> upload modified firmware to the shelf.
>>
>> JAB.
>>
>> --
>> Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
>> Fife, United Kingdom.
>>
>>
>> _______________________________________________
>> gpfsug-discuss mailing list
>> gpfsug-discuss at gpfsug.org
>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Zach Giles
zgiles at gmail.com
From makaplan at us.ibm.com Fri Jun 19 19:45:19 2015
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Fri, 19 Jun 2015 14:45:19 -0400
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu>
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
,
<9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
Oops... here is the official statement:
GPFS Native RAID (GNR) is available on the following:
- IBM Power 775 Disk Enclosure.
- IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments.
I wonder what specifically are the problems you guys see with the "GSS
building-block" approach to ... highly-scalable...?
From stijn.deweirdt at ugent.be Fri Jun 19 20:01:04 2015
From: stijn.deweirdt at ugent.be (Stijn De Weirdt)
Date: Fri, 19 Jun 2015 21:01:04 +0200
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To: <1434728132.9504.10.camel@buzzard.phy.strath.ac.uk>
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu> <558433E6.8030708@ugent.be>
<1434728132.9504.10.camel@buzzard.phy.strath.ac.uk>
Message-ID: <558466F0.8000300@ugent.be>
>>> 1. YES, Native Raid can recover from various failures: drawers, cabling,
>>> controllers, power supplies, etc, etc.
>>> Of course it must be configured properly so that there is no possible
>>> single point of failure.
>> hmmm, this is not really what i was asking about. but maybe it's easier
>> in gss to do this properly (eg for 8+3 data protection, you only need 11
>> drawers if you can make sure the data+parity blocks are send to
>> different drawers (sort of per drawer failure group, but internal to the
>> vdisks), and the smallest setup is a gss24 which has 20 drawers).
>> but i can't rememeber any manual suggestion the admin can control this
>> (or is it the default?).
>>
>
> I got the impression that GNR was more in line with the Engenio dynamic
> disk pools
well, it uses some crush-like placement and some parity encoding
scheme (regular raid6 for the DDP, some flavour of EC for GNR), but
other than that, not much resemblance.
DDP does not give you any control over where the data blocks are stored.
i'm not sure about GNR (but DDP does not state anywhere they are drawer
failure proof ;).
but GNR is more like a DDP than e.g. a ceph EC pool, in the sense that
the hosts need to see all disks (similar to the controller that needs
access to the disks).
>
> http://www.netapp.com/uk/technology/dynamic-disk-pools.aspx
>
> http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dynamic_disk_pooling_technical_report.pdf
>
> That is traditional RAID sucks with large numbers of big drives.
(btw it's one of those that we saw fail (and get recovered by tech
support!) this week. tip of the week: turn on the SMmonitor service on
at least one host, it's actually useful for something).
stijn
>
>
> JAB.
>
From S.J.Thompson at bham.ac.uk Fri Jun 19 20:17:32 2015
From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services))
Date: Fri, 19 Jun 2015 19:17:32 +0000
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
,
<9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu>,
Message-ID:
My understanding is that GSS and IBM ESS are sold as pre-configured systems.
So something like 2x servers with a fixed number of shelves, e.g. a GSS 24 comes with 232 drives.
So whilst that might be a 1PB system (large scale), it's essentially an appliance-type approach and not scalable in the sense that adding another storage shelf isn't a supported option.
So maybe it's the way it has been productised, and perhaps GNR is technically capable of having more shelves added, but if that isn't a supported route for the product then it's not something that as a customer I'd be able to buy.
Simon
________________________________________
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com]
Sent: 19 June 2015 19:45
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid?
OOps... here is the official statement:
GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments.
I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...?
From zgiles at gmail.com Fri Jun 19 21:08:14 2015
From: zgiles at gmail.com (Zachary Giles)
Date: Fri, 19 Jun 2015 16:08:14 -0400
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>
<9DA9EC7A281AC7428A9618AFDC49049955A58CD2@CIO-KRC-D1MBX02.osuad.osu.edu>
Message-ID:
It's comparable to other "large" controller systems. Take the DDN
10K/12K for example: You don't just buy one more shelf of disks, or 5
disks at a time from Walmart. You buy 5, 10, or 20 trays and populate
enough disks to either hit your bandwidth or storage size requirement.
Generally changing from 5 to 10 to 20 requires support to come on-site
and recable it, and generally you either buy half or all the disks
slots worth of disks. The whole system is a building block and you buy
N of them to get up to 10-20PB of storage.
GSS is the same way, there are a few models and you just buy a packaged one.
Technically, you can violate the above constraints, but then it may
not work well and you probably can't buy it that way.
I'm pretty sure DDN's going to look at you funny if you try to buy a
12K with 30 drives.. :)
For 1PB (small), I guess just buy 1 GSS24 with smaller drives to save
money. Or maybe just buy 2 NetApp / LSI / Engenio enclosures with
built-in RAID, a pair of servers, and forget GNR.
Or maybe GSS22? :)
>From http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&appname=gpateam&supplier=897&letternum=ENUS114-098
"
Current high-density storage Models 24 and 26 remain available
Four new base configurations: Model 21s (1 2u JBOD), Model 22s (2 2u
JBODs), Model 24 (4 2u JBODs), and Model 26 (6 2u JBODs)
1.2 TB, 2 TB, 3 TB, and 4 TB hard drives available
200 GB and 800 GB SSDs are also available
The Model 21s is comprised of 24 SSD drives, and the Model 22s, 24s,
26s is comprised of SSD drives or 1.2 TB hard SAS drives
"
On Fri, Jun 19, 2015 at 3:17 PM, Simon Thompson (Research Computing -
IT Services) wrote:
>
> My understanding I that GSS and IBM ESS are sold as pre configured systems.
>
> So something like 2x servers with a fixed number of shelves. E.g. A GSS 24 comes with 232 drives.
>
> So whilst that might be 1Pb system (large scale), its essentially an appliance type approach and not scalable in the sense that it isn't supported add another storage system.
>
> So maybe its the way it has been productised, and perhaps gnr is technically capable of having more shelves added, but if that isn't a supports route for the product then its not something that as a customer I'd be able to buy.
>
> Simon
> ________________________________________
> From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Marc A Kaplan [makaplan at us.ibm.com]
> Sent: 19 June 2015 19:45
> To: gpfsug main discussion list
> Subject: Re: [gpfsug-discuss] Disabling individual Storage Pools by themselves? How about GPFS Native Raid?
>
> OOps... here is the official statement:
>
> GPFS Native RAID (GNR) is available on the following: v IBM Power? 775 Disk Enclosure. v IBM System x GPFS Storage Server (GSS). GSS is a high-capacity, high-performance storage solution that combines IBM System x servers, storage enclosures, and drives, software (including GPFS Native RAID), and networking components. GSS uses a building-block approach to create highly-scalable storage for use in a broad range of application environments.
>
> I wonder what specifically are the problems you guys see with the "GSS building-block" approach to ... highly-scalable...?
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Zach Giles
zgiles at gmail.com
From S.J.Thompson at bham.ac.uk Fri Jun 19 22:08:25 2015
From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services))
Date: Fri, 19 Jun 2015 21:08:25 +0000
Subject: [gpfsug-discuss] Disabling individual Storage Pools by
themselves? How about GPFS Native Raid?
In-Reply-To:
References: <9DA9EC7A281AC7428A9618AFDC49049955A5876A@CIO-KRC-D1MBX02.osuad.osu.edu>