From PATBYRNE at uk.ibm.com Thu Oct 1 11:09:29 2015 From: PATBYRNE at uk.ibm.com (Patrick Byrne) Date: Thu, 1 Oct 2015 10:09:29 +0000 Subject: [gpfsug-discuss] Problem Determination Message-ID: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Oct 1 13:39:25 2015 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 1 Oct 2015 12:39:25 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> Message-ID: Hi Patrick I was going to mail you directly - but this may help spark some discussion in this area. GPFS (pardon the use of the "old school" term - you need something easier to type than Spectrum Scale) problem determination is one of those areas that is (sometimes) more of an art than a science. IBM publishes a PD guide, and it's a good start but doesn't cover all the bases. - In the GPFS log (/var/mmfs/gen/mmfslog) there are a lot of messages generated. I continue to come across ones that are not documented - or documented poorly. EVERYTHING that ends up in ANY log needs to be documented. - The PD guide gives some basic things to look at for many of the error messages, but doesn't go into alternative explanations for many errors. Example: When a node gets expelled, the PD guide tells you it's a communication issue, when in fact it may be related to other things like Linux network tuning. Covering all the possible causes is hard, but you can improve this. - GPFS waiter information - understanding and analyzing this is key to getting to the bottom of many problems. The waiter information is not well documented. You should include at least a basic guide on how to use waiter information in determining cluster problems. Related: Undocumented config options. You can come across some by doing 'mmdiag --config'. Using some of these can help you - or get you in trouble in the long run. If I can see the option, document it. - Make sure that all information I might come across online is accurate, especially on those sites managed by IBM. The Developerworks wiki has great information, but there is a lot of information out there that's out of date or inaccurate. This leads to confusion. - The automatic deadlock detection implemented in 4.1 can be useful, but it also can be problematic in a large cluster when you get into problems. Firing off traces and taking dumps in an automated manner can cause more problems if you have a large cluster. I ended up turning it off. - GPFS doesn't have anything set up to alert you when conditions occur that may require your attention. There are some alerting capabilities that you can customize, but something out of the box might be useful. I know there is work going on in this area. mmces - I did some early testing on this but haven't had a chance to upgrade my protocol nodes to the new level. Upgrading 1000s of nodes across many clusters is ... challenging :-) The newer commands are a great start. I like the ability to list out events related to a particular protocol. I could go on...
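As a rough illustration of the checks being discussed here (waiters, undocumented configuration options, and the newer protocol commands), a minimal first-pass triage sketch follows. The mmdiag and mmces invocations are based on their documented usage, but output formats and exact mmces subcommand syntax vary between releases, and the grep pattern is only an example.

```bash
#!/bin/bash
# First-pass triage sketch for a suspect node (run as root; illustrative only).

# Long-lived waiters usually point at the layer worth investigating
# (disk I/O, network/RPC, token management).
mmdiag --waiters

# Effective configuration values, including options the documentation does
# not cover -- change undocumented ones only with support guidance.
mmdiag --config | grep -i -E 'pagepool|maxmbps|worker1threads'

# On CES protocol nodes (4.1.1 and later): current state and recent events.
mmces state show -a
mmces events list
```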
Feel free to contact me directly for a more detailed discussion: robert.oesterlin @ nuance.com Bob Oesterlin Sr Storage Engineer, Nuance Communications From: > on behalf of Patrick Byrne Reply-To: gpfsug main discussion list Date: Thursday, October 1, 2015 at 5:09 AM To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Fri Oct 2 17:44:24 2015 From: bbanister at jumptrading.com (Bryan Banister) Date: Fri, 2 Oct 2015 16:44:24 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB05C8CE44@CHI-EXCHANGEW1.w2k.jumptrading.com> I would like to strongly echo what Bob has stated, especially the documentation or wrong documentation, and I have in-lining some comments below. I liken GPFS to a critical care patient at the hospital. You have to check on the state regularly, know the running heart rate (e.g. waiters), the response of every component from disk, to networks, to server load, etc. When a problem occurs, running tests (such as nsdperf) to help isolate the problem quickly is crucial. Capturing GPFS trace data is also very important if the problem isn?t obvious. But then you have to wait for IBM support to parse the information and give you their analysis of the situation. It would be great to get an advanced troubleshooting document that describes how to read the output of `mmfsadm dump` commands and the GPFS trace report that is generated. Cheers, -Bryan From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Oesterlin, Robert Sent: Thursday, October 01, 2015 7:39 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Problem Determination Hi Patrick I was going to mail you directly ? but this may help spark some discussion in this area. GPFS (pardon the use of the ?old school" term ? 
You need something easier to type that Spectrum Scale) problem determination is one of those areas that is (sometimes) more of an art than a science. IBM publishes a PD guide, and it?s a good start but doesn?t cover all the bases. - In the GPFS log (/var/mmfs/gen/mmfslog) there are a lot of messages generated. I continue to come across ones that are not documented ? or documented poorly. EVERYTHING that ends up in ANY log needs to be documented. - The PD guide gives some basic things to look at for many of the error messages, but doesn?t go into alternative explanation for many errors. Example: When a node gets expelled, the PD guide tells you it?s a communication issue, when it fact in may be related to other things like Linux network tuning. Covering all the possible causes is hard, but you can improve this. - GPFS waiter information ? understanding and analyzing this is key to getting to the bottom of many problems. The waiter information is not well documented. You should include at least a basic guide on how to use waiter information in determining cluster problems. Related: Undocumented config options. You can come across some by doing ?mmdiag ?config?. Using some of these can help you ? or get you in trouble in the long run. If I can see the option, document it. [Bryan: Also please, please provide a way to check whether or not the configuration parameters need to be changed. I assume that there is a `mmfsadm dump` command that can tell you whether the config parameter needs to be changed, if not make one! Just stating something like ?This could be increased to XX value for very large clusters? is not very helpful. - Make sure that all information I might come across online is accurate, especially on those sites managed by IBM. The Developerworks wiki has great information, but there is a lot of information out there that?s out of date or inaccurate. This leads to confusion. [Bryan: I know that Scott Fadden is a busy man, so I would recommend helping distribute the workload of maintaining the wiki documentation. This data should be reviewed on a more regular basis, at least once for each major release I would hope, and updated or deleted if found to be out of date.] - The automatic deadlock detection implemented in 4.1 can be useful, but it also can be problematic in a large cluster when you get into problems. Firing off traces and taking dumps in an automated manner can cause more problems if you have a large cluster. I ended up turning it off. [Bryan: From what I?ve heard, IBM is actively working to make the deadlock amelioration logic better. I agree that firing off traces can cause more problems, and we have turned off the automated collection as well. We are going to work on enabling the collection of some data during these events to help ensure we get enough data for IBM to analyze the problem.] - GPFS doesn?t have anything setup to alert you when conditions occur that may require your attention. There are some alerting capabilities that you can customize, but something out of the box might be useful. I know there is work going on in this area. [Bryan: The GPFS callback facilities are very useful for setting up alerts, but not well documented or advertised by the GPFS manuals. I hope to see more callback capabilities added to help monitor all aspects of the GPFS cluster and file systems] mmces ? I did some early testing on this but haven?t had a chance to upgrade my protocol nodes to the new level. Upgrading 1000?s of node across many cluster is ? 
challenging :-) The newer commands are a great start. I like the ability to list out events related to a particular protocol. I could go on? Feel free to contact me directly for a more detailed discussion: robert.oesterlin @ nuance.com Bob Oesterlin Sr Storage Engineer, Nuance Communications From: > on behalf of Patrick Byrne Reply-To: gpfsug main discussion list Date: Thursday, October 1, 2015 at 5:09 AM To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Oct 2 17:58:41 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 2 Oct 2015 16:58:41 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com>, Message-ID: I agree on docs, particularly on mmdiag, I think things like --lroc are not documented. I'm also not sure that --network always gives accurate network stats. (we were doing some ha failure testing where we have split site in and fabrics, yet the network counters didn't change even when the local ib nsd servers were shut down). 
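On the callback and alerting points above, a minimal sketch of wiring the existing callback facility to a notification script is below. The script path is illustrative, and the chosen event and parameter variables should be checked against the mmaddcallback documentation for the release in use.

```bash
# Run a local alerting script whenever a node leaves the cluster, passing the
# event name and the node that observed it (the script itself is assumed to
# send mail or page someone and is not shown here).
mmaddcallback nodeLeaveAlert \
  --command /usr/local/sbin/gpfs_alert.sh \
  --event nodeLeave \
  --parms "%eventName %myNode"

# Confirm the callback is registered.
mmlscallback
```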
It would be nice also to have a set of Icinga/Nagios plugins from IBM, maybe in samples whcich are updated on each release with new feature checks. And not problem determination, but id really like to see an inflight non disruptive upgrade path. Particularly as we run vms off gpfs, its bot always practical or possible to move vms, so would be nice to have upgrade in flight (not suggesting this would be a quick thing to implement). Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Oesterlin, Robert [Robert.Oesterlin at nuance.com] Sent: 01 October 2015 13:39 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Problem Determination Hi Patrick I was going to mail you directly ? but this may help spark some discussion in this area. GPFS (pardon the use of the ?old school" term ? You need something easier to type that Spectrum Scale) problem determination is one of those areas that is (sometimes) more of an art than a science. IBM publishes a PD guide, and it?s a good start but doesn?t cover all the bases. - In the GPFS log (/var/mmfs/gen/mmfslog) there are a lot of messages generated. I continue to come across ones that are not documented ? or documented poorly. EVERYTHING that ends up in ANY log needs to be documented. - The PD guide gives some basic things to look at for many of the error messages, but doesn?t go into alternative explanation for many errors. Example: When a node gets expelled, the PD guide tells you it?s a communication issue, when it fact in may be related to other things like Linux network tuning. Covering all the possible causes is hard, but you can improve this. - GPFS waiter information ? understanding and analyzing this is key to getting to the bottom of many problems. The waiter information is not well documented. You should include at least a basic guide on how to use waiter information in determining cluster problems. Related: Undocumented config options. You can come across some by doing ?mmdiag ?config?. Using some of these can help you ? or get you in trouble in the long run. If I can see the option, document it. - Make sure that all information I might come across online is accurate, especially on those sites managed by IBM. The Developerworks wiki has great information, but there is a lot of information out there that?s out of date or inaccurate. This leads to confusion. - The automatic deadlock detection implemented in 4.1 can be useful, but it also can be problematic in a large cluster when you get into problems. Firing off traces and taking dumps in an automated manner can cause more problems if you have a large cluster. I ended up turning it off. - GPFS doesn?t have anything setup to alert you when conditions occur that may require your attention. There are some alerting capabilities that you can customize, but something out of the box might be useful. I know there is work going on in this area. mmces ? I did some early testing on this but haven?t had a chance to upgrade my protocol nodes to the new level. Upgrading 1000?s of node across many cluster is ? challenging :-) The newer commands are a great start. I like the ability to list out events related to a particular protocol. I could go on? 
Feel free to contact me directly for a more detailed discussion: robert.oesterlin @ nuance.com Bob Oesterlin Sr Storage Engineer, Nuance Communications From: > on behalf of Patrick Byrne Reply-To: gpfsug main discussion list Date: Thursday, October 1, 2015 at 5:09 AM To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited From ewahl at osc.edu Fri Oct 2 19:00:46 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Fri, 2 Oct 2015 18:00:46 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955AEB4DF@CIO-KRC-D1MBX02.osuad.osu.edu> I'm not yet in the 4.x release stream so this may be taken with a grain (or more) of salt as we say. PLEASE keep the ability of commands to set -x or dump debug when the env DEBUG=1 is set. This has been extremely useful over the years. Granted I've never worked out why sometimes we see odd little things like machines deciding they suddenly need an FPO license or one nsd server suddenly decides it's name is part of the FQDN instead of just it's hostname and only for certain commands, but it's DAMN useful. Minor issues especially can be tracked down with it. Undocumented features and logged items abound. I'd say start there. This is one area where it is definitely more art than science with Spectrum Scale (meh GPFS still sounds better. So does Shark. Can we go back to calling it the Shark Server Project?) Complete failure of the verbs layer and fallback to other defined networks would be nice to know about during operation. It's excellent about telling you at startup but not so much during operation, at least in 3.5. I imagine with the 'automated compatibility layer building' I'll be looking for some serious amounts of PD for the issues we _will_ see there. We frequently build against kernels we are not yet running at this site, so this needs well documented PD and resolution. 
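As a small illustration of the DEBUG behaviour described above, the trace from an administration command can be captured for later inspection; the output path is illustrative.

```bash
# Most mm* administration commands are shell scripts that emit 'set -x'
# style tracing on stderr when DEBUG=1 is set in the environment.
DEBUG=1 mmgetstate -a 2> /tmp/mmgetstate.trace.$$
less /tmp/mmgetstate.trace.$$
```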
Ed Wahl OSC ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Patrick Byrne [PATBYRNE at uk.ibm.com] Sent: Thursday, October 01, 2015 6:09 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgiles at gmail.com Fri Oct 2 21:27:17 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 2 Oct 2015 16:27:17 -0400 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955AEB4DF@CIO-KRC-D1MBX02.osuad.osu.edu> References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> <9DA9EC7A281AC7428A9618AFDC49049955AEB4DF@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: I would like to see better performance metrics / counters from GPFS. I know we already have mmpmon, which is generally really good -- I've done some fun things with it and it has been a great tool. And, I realize that there is supposedly a new monitoring framework in 4.x.. which I haven't played with yet. But, Generally it would be extremely helpful to get synchronized (across all nodes) high accuracy counters of data flow, number of waiters, page pool stats, distribution of data from one layer to another down to NSDs.. etc etc etc. I believe many of these counters already exist, but they're hidden in some mmfsadm xx command that one needs to troll through with possible performance implications. mmpmon can do some of this, but it's only a handful of counters, it's hard to say how synchronized the counters are across nodes, and I've personally seen an mmpmon run go bad and take down a cluster. It would be nice if it were pushed out, or provided in a safe manner with the design and expectation of "log-everything forever continuously". As GSS/ESS systems start popping up, I realize they have this other monitoring framework to watch the VD throughputs.. which is great. But, that doesn't allow us to monitor more traditional types. 
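For reference, the kind of mmpmon counters mentioned here can be pulled in machine-readable form as sketched below; the request names follow the mmpmon documentation, while the scheduling and cross-node aggregation a real monitoring agent would need are left out.

```bash
# Parseable (-p), prompt-suppressed (-s) mmpmon requests: per-filesystem and
# node-wide I/O counters. A monitoring agent would run this periodically on
# every node and ship the numbers to a central time-series store.
echo -e "fs_io_s\nio_s" | mmpmon -p -s
```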
Would be nice to monitor it all together the same way so we don't miss-out on monitoring half the infrastructure or buying a cluster with some fancy GUI that can't do what we want.. -Zach On Fri, Oct 2, 2015 at 2:00 PM, Wahl, Edward wrote: > I'm not yet in the 4.x release stream so this may be taken with a grain (or > more) of salt as we say. > > PLEASE keep the ability of commands to set -x or dump debug when the env > DEBUG=1 is set. This has been extremely useful over the years. Granted > I've never worked out why sometimes we see odd little things like machines > deciding they suddenly need an FPO license or one nsd server suddenly > decides it's name is part of the FQDN instead of just it's hostname and only > for certain commands, but it's DAMN useful. Minor issues especially can be > tracked down with it. > > Undocumented features and logged items abound. I'd say start there. This > is one area where it is definitely more art than science with Spectrum Scale > (meh GPFS still sounds better. So does Shark. Can we go back to calling it > the Shark Server Project?) > > Complete failure of the verbs layer and fallback to other defined networks > would be nice to know about during operation. It's excellent about telling > you at startup but not so much during operation, at least in 3.5. > > I imagine with the 'automated compatibility layer building' I'll be looking > for some serious amounts of PD for the issues we _will_ see there. We > frequently build against kernels we are not yet running at this site, so > this needs well documented PD and resolution. > > Ed Wahl > OSC > > > ________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] > on behalf of Patrick Byrne [PATBYRNE at uk.ibm.com] > Sent: Thursday, October 01, 2015 6:09 AM > To: gpfsug-discuss at gpfsug.org > Subject: [gpfsug-discuss] Problem Determination > > Hi all, > > As I'm sure some of you aware, problem determination is an area that we are > looking to try and make significant improvements to over the coming releases > of Spectrum Scale. To help us target the areas we work to improve and make > it as useful as possible I am trying to get as much feedback as I can about > different problems users have, and how people go about solving them. > > I am interested in hearing everything from day to day annoyances to problems > that have caused major frustration in trying to track down the root cause. > Where possible it would be great to hear how the problems were dealt with as > well, so that others can benefit from your experience. Feel free to reply to > the mailing list - maybe others have seen similar problems and could provide > tips for the future - or to me directly if you'd prefer > (patbyrne at uk.ibm.com). > > On a related note, in 4.1.1 there was a component added that monitors the > state of the various protocols that are now supported (NFS, SMB, Object). > The output from this is available with the 'mmces state' and 'mmces events' > CLIs and I would like to get feedback from anyone who has had the chance > make use of this. Is it useful? How could it be improved? We are looking at > the possibility of extending this component to cover more than just > protocols, so any feedback would be greatly appreciated. 
> > Thanks in advance, > > Patrick Byrne > IBM Spectrum Scale - Development Engineer > IBM Systems - Manchester Lab > IBM UK Limited > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com From Luke.Raimbach at crick.ac.uk Mon Oct 5 13:57:14 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Mon, 5 Oct 2015 12:57:14 +0000 Subject: [gpfsug-discuss] Independent Inode Space Limit Message-ID: Hi All, When creating an independent inode space, I see the valid range for the number of inodes is between 1024 and 4294967294. Is the ~4.2billion upper limit something that can be increased in the future? I also see that the first 1024 inodes are immediately allocated upon creation. I assume these are allocated to internal data structures and are a copy of a subset of the first 4038 inodes allocated for new file systems? It would be useful to know if these internal structures are fixed for independent filesets and if they are not, what factors determine their layout (for performance purposes). Many Thanks, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer, The Francis Crick Institute, Gibbs Building, 215 Euston Road, London NW1 2BE. E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From usa-principal at gpfsug.org Mon Oct 5 14:55:15 2015 From: usa-principal at gpfsug.org (usa-principal-gpfsug.org) Date: Mon, 05 Oct 2015 09:55:15 -0400 Subject: [gpfsug-discuss] Final Reminder: Inaugural US "Meet the Developers" Message-ID: <9656d0110c2be4b339ec5ce662409b8e@webmail.gpfsug.org> A last reminder to check in with Janet if you have not done so already. Looking forward to this event on Wednesday this week. Best, Kristy --- Hello Everyone, Here is a reminder about our inaugural US "Meet the Developers" session. Details are below, and please send an e-mail to Janet Ellsworth (janetell at us.ibm.com) by next Friday September 18th if you wish to attend. Janet is on the product management team for Spectrum Scale and is helping with the logistics for this first event. Date: Wednesday, October 7th Place: IBM building at 590 Madison Avenue, New York City Time: 12:30 to 5 PM (Lunch will be served at 12:30, and sessions will start between 1 and 1:30 PM. Afternoon snacks will be served as well :-) Agenda IBM development architect to present the new protocols support that was released with Spectrum Scale 4.1.1 in June. IBM developer to demo future Graphical User Interface ***Member of user community to present an experience with using Spectrum Scale (still seeking volunteers for this !)*** Open Q&A with the development team We are happy to have heard from many of you so far who would like to attend. We still have room however, so please get in touch by the 9/18 date if you would like to attend. ***We also need someone to share an experience or use case scenario with Spectrum Scale for this event, so please let Janet know if you are willing to do that too.*** As you have likely seen, we are also working on the agenda and timing for day-long GPFS US UG event in Austin during November aligned with SC15 and there will be more details on that coming soon. 
From secretary at gpfsug.org Wed Oct 7 12:50:51 2015 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Wed, 07 Oct 2015 12:50:51 +0100 Subject: [gpfsug-discuss] Places available: Meet the Devs Message-ID: <813d82bd5074b90c3a67acc85a03995b@webmail.gpfsug.org> Hi All, There are still places available for the next 'Meet the Devs' event in Edinburgh on Friday 23rd October from 10:30/11am until 3/3:30pm. It's a great opportunity for you to meet with developers and talk through specific issues as well as learn more from the experts. Location: Room 2009a, Information Services, James Clerk Maxwell Building, Peter Guthrie Tait Road, Edinburgh EH9 3FD Google maps link: https://goo.gl/maps/Ta7DQ Agenda: - GUI - 4.2 Updates/show and tell - Open conversation on any areas of interest attendees may have Lunch and refreshments will be provided. Please email me (secretary at gpfsug.org) if you would like to attend including any particular topics of interest you would like to discuss. Best wishes, -- Claire O'Toole GPFS User Group Secretary +44 (0)7508 033896 www.gpfsug.org From service at metamodul.com Wed Oct 7 16:06:56 2015 From: service at metamodul.com (service at metamodul.com) Date: Wed, 07 Oct 2015 17:06:56 +0200 Subject: [gpfsug-discuss] Places available: Meet the Devs Message-ID: Hi Claire, I will attend the meeting. Hans-Joachim Ehlers MetaModul GmbH Germany Cheers Hajo Von Samsung Mobile gesendet
-------- Original message --------
From: Secretary GPFS UG
Date: 2015.10.07 13:50 (GMT+01:00)
To: gpfsug main discussion list
Subject: [gpfsug-discuss] Places available: Meet the Devs
Hi All, There are still places available for the next 'Meet the Devs' event in Edinburgh on Friday 23rd October from 10:30/11am until 3/3:30pm. It's a great opportunity for you to meet with developers and talk through specific issues as well as learn more from the experts. Location: Room 2009a, Information Services, James Clerk Maxwell Building, Peter Guthrie Tait Road, Edinburgh EH9 3FD Google maps link: https://goo.gl/maps/Ta7DQ Agenda: - GUI - 4.2 Updates/show and tell - Open conversation on any areas of interest attendees may have Lunch and refreshments will be provided. Please email me (secretary at gpfsug.org) if you would like to attend including any particular topics of interest you would like to discuss. Best wishes, -- Claire O'Toole GPFS User Group Secretary +44 (0)7508 033896 www.gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Douglas.Hughes at DEShawResearch.com Wed Oct 7 19:59:26 2015 From: Douglas.Hughes at DEShawResearch.com (Hughes, Doug) Date: Wed, 7 Oct 2015 18:59:26 +0000 Subject: [gpfsug-discuss] new member, first post Message-ID: sitting here in the US GPFS UG meeting in NYC and just found out about this list. We've been a GPFS user for many years, first with integrated DDN support, but now also with a GSS system. we have about 4PB of raw GPFS storage and 1 billion inodes. We keep our metadata on TMS ramsan for very fast policy execution for tiering and migration. We use GPFS to hold the primary source data from our custom supercomputers. We have many policies executed periodically for managing the data, including writing certain files to dedicated fast pools and then migrating the data off to wide swaths of disk for read access from cluster clients. One pain point, which I'm sure many of the rest of you have seen, restripe operations for just metadata are unnecessarily slow. If we experience a flash module failure and need to restripe, it also has to check all of the data. I have a feature request open to make metadata restripes only look at metadata (since it is on RamSan/FlashCache, this should be very fast) instead of scanning everything, which can and does take months with performance impacts. Doug Hughes D. E. Shaw Research, LLC. Sent from my android device. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at gpfsug.org Thu Oct 8 20:37:05 2015 From: chair at gpfsug.org (GPFS UG Chair (Simon Thompson)) Date: Thu, 08 Oct 2015 20:37:05 +0100 Subject: [gpfsug-discuss] User group update Message-ID: Hi, I thought I'd drop an update to the group on various admin things which have been going on behind the scenes. The first US meet the devs event was held yesterday, and I'm hoping someone who went will be preparing a blog post to cover the event a little. I know a bunch of people have joined the mailing list since then, so welcome to the group to all of those! ** User Group Engagement with IBM ** I also met with Akhtar yesterday who is the IBM VP for Technical Computing Developments (which includes Spectrum Scale). He was in the UK for a few days at the IBM Manchester Labs, so we managed to squeeze a meeting to talk a bit about the UG. I'm very pleased that Akhtar confirmed IBMs commitment to help the user group in both the UK and USA with developer support for the meet the devs and annual group meetings. 
I'd like to extend my thanks to those at IBM who are actively supporting the group in so many ways. One idea we have been mulling over is filming the talks at next year's events and then putting those on Youtube for people who can't get there. IBM have given us tentative agreement to do this, subject to a few conditions. Most importantly that the UG and IBM ensure we don't publish customer or IBM items which are NDA/not for general public consumption. I'm hopeful we can get this all approved and if we do, we'll be looking to the community to help us out (anyone got digital camera equipment we might be able to borrow, or some help with editing down afterwards?) Whilst in Manchester I also met with Patrick to talk over the various emails people have sent in about problem determination, which Patrick will be taking to the dev meeting in a few weeks. It sounds like there are some interesting ideas kicking about, so hopefully we'll get some value from the user group input. Some of the new features in 4.2 were also demo'd and for those who might not have been to a meet the devs session and are interested in the upcoming GUI, it is now in public beta, head over to developer works for more details: https://www.ibm.com/developerworks/community/forums/html/topic?id=4dc34bf1- 17d1-4dc0-af72-6dc5a3f93e82&ps=25 ** User Group Feedback ** Over the past few months, I've also been collecting feedback from people, either comments on the mailing list, or those who I've spoken to, which was all collated and sent in to IBM, we'll hopefully be getting some feedback on that in the next few weeks - there's a bunch of preliminary answers now, but a few places we still need a bit of clarification. There's also some longer term discussion going on about GPFS and cloud (in particular to those of us in scientific areas). We'll feed that back as and when we get responses we can share. We'd like to ensure that we gather as much feedback from users so that we can collectively take it to IBM, so please do continue to post comments etc to the mailing list. ** Diary Dates ** A few dates for diaries: * Meet the Devs in Edinburgh - Friday 23rd October 2015 * GPFS UG Meeting @ SC15 in Austin, USA - Sunday 15th November 2015 * GPFS UG Meeting @ Computing Insight UK, Coventry, UK - Tuesday 8th December 2015 (Note you must be registered also for CIUK) * GPFS UG Meeting May 2015 - IBM South Bank, London, UK- 17th/18th May 2016 ** User Group Admin ** Within the committee, we've been talking about how we can extend the reach of the group, so we may be reaching out to a few group members to take this forward. Of course if anyone has suggestions on how we can ensure we reach as many people as possible, please let me know, either via the mailing list of directly by email. I know there are lot of people on the mailing list who don't post (regularly), so I'd be interested to hear if you find the group mailing list discussion useful, if you feel there are barriers to asking questions, or what you'd like to see coming out of the user group - please feel free to email me directly if you'd like to comment on any of this! We've also registered spectrumscale.org to point to the user group, so you may start to see the group marketed as the Spectrum Scale User Group, but rest assured, its still the same old GPFS User Group ;-) Just a reminder that we made the mailing list so that only members can post. This was to reduce the amount of spam coming in and being held for moderation (and a few legit posts got lost this way). 
If you do want to post, but not receive the emails, you can set this as an option in the mailing list software. Finally, I've also fixed the mailing list archives, so these are now available at: http://www.gpfsug.org/pipermail/gpfsug-discuss/ Simon GPFS UG, UK Chair From L.A.Hurst at bham.ac.uk Fri Oct 9 09:25:52 2015 From: L.A.Hurst at bham.ac.uk (Laurence Alexander Hurst (IT Services)) Date: Fri, 9 Oct 2015 08:25:52 +0000 Subject: [gpfsug-discuss] User group update Message-ID: On 08/10/2015 20:37, "gpfsug-discuss-bounces at gpfsug.org on behalf of GPFS UG Chair (Simon Thompson)" wrote: >GPFS UG Meeting May 2015 - IBM South Bank, London, UK- 17th/18th May >2016 Daft question: is that 17th *and* 18th or 17th *or* 18th (presumably TBC)? Thanks, Laurence -- Laurence Hurst Research Support, IT Services, University of Birmingham From S.J.Thompson at bham.ac.uk Fri Oct 9 10:00:11 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 9 Oct 2015 09:00:11 +0000 Subject: [gpfsug-discuss] User group update In-Reply-To: References: Message-ID: Both days. May 2016 is a two day event. Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Laurence Alexander Hurst (IT Services) [L.A.Hurst at bham.ac.uk] Sent: 09 October 2015 09:25 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] User group update On 08/10/2015 20:37, "gpfsug-discuss-bounces at gpfsug.org on behalf of GPFS UG Chair (Simon Thompson)" wrote: >GPFS UG Meeting May 2015 - IBM South Bank, London, UK- 17th/18th May >2016 Daft question: is that 17th *and* 18th or 17th *or* 18th (presumably TBC)? Thanks, Laurence -- Laurence Hurst Research Support, IT Services, University of Birmingham _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Sat Oct 10 14:54:22 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Sat, 10 Oct 2015 13:54:22 +0000 Subject: [gpfsug-discuss] User group update Message-ID: > >We've also registered spectrumscale.org to point to the user group, so you >may start to see the group marketed as the Spectrum Scale User Group, but >rest assured, its still the same old GPFS User Group ;-) And this is just a test mail to ensure that mail to gpfsug-discuss at spectrumscale.org gets through OK. The old address should also still work. Simon From S.J.Thompson at bham.ac.uk Sat Oct 10 14:55:55 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Sat, 10 Oct 2015 13:55:55 +0000 Subject: [gpfsug-discuss] User group update In-Reply-To: References: Message-ID: On 10/10/2015 14:54, "Simon Thompson (Research Computing - IT Services)" wrote: >> >>We've also registered spectrumscale.org to point to the user group, so >>you >>may start to see the group marketed as the Spectrum Scale User Group, but >>rest assured, its still the same old GPFS User Group ;-) > >And this is just a test mail to ensure that mail to >gpfsug-discuss at spectrumscale.org gets through OK. The old address should >also still work. And checking the old address still works fine as well. 
Simon From Robert.Oesterlin at nuance.com Tue Oct 13 03:03:45 2015 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 13 Oct 2015 02:03:45 +0000 Subject: [gpfsug-discuss] User group Meeting at SC15 - Registration Message-ID: We?d like to have all those attending the user group meeting at SC15 to register ? details are below. Thanks to IBM for getting the space and arranging all the details. I?ll post a more detailed agenda soon. Looking forward to meeting everyone! Location: JW Marriott 110 E 2nd Street Austin, Texas United States Date and Time: Sunday Nov 15, 1:00 PM?5:30 PM Agenda: - Latest IBM Spectrum Scale enhancements - Future directions and roadmap* (NDA required) - Newer usecases and User presentations Registration: Please register at the below link to book your seat. https://www-950.ibm.com/events/wwe/grp/grp017.nsf/v17_agenda?openform&seminar=99QNTNES&locale=en_US&S_TACT=sales Bob Oesterlin Sr Storage Engineer, Nuance Communications 507-269-0413 -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at spectrumscale.org Sat Oct 17 20:51:50 2015 From: chair at spectrumscale.org (GPFS UG Chair (Simon Thompson)) Date: Sat, 17 Oct 2015 20:51:50 +0100 Subject: [gpfsug-discuss] Blog on USA Meet the Devs Message-ID: Hi All, Kirsty wrote a blog post on the inaugural meet the devs in the USA. You can find it here: http://www.spectrumscale.org/inaugural-usa-meet-the-devs/ Thanks to Kristy, Bob and Pallavi for organising, the IBM devs and the group members giving talks. Simon From Tomasz.Wolski at ts.fujitsu.com Wed Oct 21 15:23:54 2015 From: Tomasz.Wolski at ts.fujitsu.com (Wolski, Tomasz) Date: Wed, 21 Oct 2015 16:23:54 +0200 Subject: [gpfsug-discuss] Intro Message-ID: Hi All, My name is Tomasz Wolski and I?m development engineer at Fujitsu Technology Solutions in Lodz, Poland. We?ve been using GPFS in our main product, which is ETERNUS CS8000, for many years now. GPFS helps us to build a consolidation of backup and archiving solutions for our end customers. We make use of GPFS snapshots, NIFS/CIFS services, GPFS API for our internal components and many many more .. :) My main responsibility, except developing new features for our system, is integration new GPFS versions into our system and bug tracking GPFS issues. Best regards, Tomasz Wolski -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Oct 23 15:04:49 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 23 Oct 2015 14:04:49 +0000 Subject: [gpfsug-discuss] Independent Inode Space Limit Message-ID: >When creating an independent inode space, I see the valid range for the >number of inodes is between 1024 and 4294967294. > >Is the ~4.2billion upper limit something that can be increased in the >future? > >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Anyone have any thoughts on this? Anyone from IBM know? 
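For anyone following along, a minimal sketch of the commands under discussion is below; the device name, fileset name, junction path and inode limits are all illustrative.

```bash
# Create an independent fileset with its own inode space, capping it at
# 1M inodes and preallocating 100K of them up front.
mmcrfileset gpfs01 projects --inode-space new --inode-limit 1000000:100000
mmlinkfileset gpfs01 projects -J /gpfs/gpfs01/projects

# -L shows the inode space plus maximum and allocated inodes per fileset;
# the limit can be raised later with mmchfileset --inode-limit.
mmlsfileset gpfs01 -L
```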
Thanks Simon From sfadden at us.ibm.com Fri Oct 23 13:42:14 2015 From: sfadden at us.ibm.com (Scott Fadden) Date: Fri, 23 Oct 2015 07:42:14 -0500 Subject: [gpfsug-discuss] Independent Inode Space Limit In-Reply-To: References: Message-ID: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> GPFS limits the max inodes based on metadata space. Add more metadata space and you should be able to add more inodes. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale From: "Simon Thompson (Research Computing - IT Services)" To: gpfsug main discussion list Date: 10/23/2015 09:05 AM Subject: Re: [gpfsug-discuss] Independent Inode Space Limit Sent by: gpfsug-discuss-bounces at spectrumscale.org >When creating an independent inode space, I see the valid range for the >number of inodes is between 1024 and 4294967294. > >Is the ~4.2billion upper limit something that can be increased in the >future? > >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Anyone have any thoughts on this? Anyone from IBM know? Thanks Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From sfadden at us.ibm.com Fri Oct 23 13:42:14 2015 From: sfadden at us.ibm.com (Scott Fadden) Date: Fri, 23 Oct 2015 07:42:14 -0500 Subject: [gpfsug-discuss] Independent Inode Space Limit In-Reply-To: References: Message-ID: <201510231442.t9NEgQ0M024262@d01av05.pok.ibm.com> GPFS limits the max inodes based on metadata space. Add more metadata space and you should be able to add more inodes. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale From: "Simon Thompson (Research Computing - IT Services)" To: gpfsug main discussion list Date: 10/23/2015 09:05 AM Subject: Re: [gpfsug-discuss] Independent Inode Space Limit Sent by: gpfsug-discuss-bounces at spectrumscale.org >When creating an independent inode space, I see the valid range for the >number of inodes is between 1024 and 4294967294. > >Is the ~4.2billion upper limit something that can be increased in the >future? > >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Anyone have any thoughts on this? Anyone from IBM know? Thanks Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From wsawdon at us.ibm.com Fri Oct 23 16:25:33 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Fri, 23 Oct 2015 08:25:33 -0700 Subject: [gpfsug-discuss] Independent Inode Space Limit In-Reply-To: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> References: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> Message-ID: <201510231525.t9NFPr1G010768@d03av04.boulder.ibm.com> >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Independent filesets don't have the internal structures that the file system has. Other than the fileset's root directory all of the remaining inodes can be allocated to user files. Inodes are always allocated in full metadata blocks. The inodes for an independent fileset are allocated in their own blocks. This makes fileset snapshots more efficient, since a copy-on-write of the block of inodes will only copy inodes in the fileset. The inode blocks for all filesets are in the same inode file, but the blocks for each independent fileset are strided, making them easy to prefetch for policy scans. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From wsawdon at us.ibm.com Fri Oct 23 16:25:33 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Fri, 23 Oct 2015 08:25:33 -0700 Subject: [gpfsug-discuss] Independent Inode Space Limit In-Reply-To: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> References: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> Message-ID: <201510231525.t9NFPv9P004320@d01av03.pok.ibm.com> >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Independent filesets don't have the internal structures that the file system has. Other than the fileset's root directory all of the remaining inodes can be allocated to user files. Inodes are always allocated in full metadata blocks. The inodes for an independent fileset are allocated in their own blocks. This makes fileset snapshots more efficient, since a copy-on-write of the block of inodes will only copy inodes in the fileset. The inode blocks for all filesets are in the same inode file, but the blocks for each independent fileset are strided, making them easy to prefetch for policy scans. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From kallbac at iu.edu Mon Oct 26 02:38:52 2015 From: kallbac at iu.edu (Kallback-Rose, Kristy A) Date: Sun, 25 Oct 2015 22:38:52 -0400 Subject: [gpfsug-discuss] ILM and Backup Question Message-ID: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think its amazing how everyone does these things in their own slightly different way. 
I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!? Yes, please! I?m *very* interested in what others are doing. We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS Integration?we have had HPSS for a very long time), but I?m interested what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and with what regularity, what resources it takes, etc. If you have anything going on at your site that?s relevant, can you please share? Thanks, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 495 bytes Desc: Message signed with OpenPGP using GPGMail URL: From st.graf at fz-juelich.de Mon Oct 26 08:43:33 2015 From: st.graf at fz-juelich.de (Stephan Graf) Date: Mon, 26 Oct 2015 09:43:33 +0100 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: <562DE7B5.7080303@fz-juelich.de> Hi! We at J?lich Supercomputing Centre have two ILM managed file systems (GPFS and HSM from TSM). #50 mio files + 10 PB data on tape #30 mio files + 8 PB data on tape For backup we use mmbackup (dsmc) for the user HOME directory (no ILM) #120 mio files => 3 hours get candidate list + x hour backup We use also mmbackup for the ILM managed filesystem. Policy: the file must be backed up first before migrated to tape 2-3 hour for candidate list + x hours/days/weeks backups (!!!) -> a metadata change (e.g. renaming a directory by the user) enforces a new backup of the files which causes a very expensive tape inline copy! Greetings from J?lich, Germany Stephan On 10/26/15 03:38, Kallback-Rose, Kristy A wrote: Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think its amazing how everyone does these things in their own slightly different way. I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!? Yes, please! I?m *very* interested in what others are doing. We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS Integration?we have had HPSS for a very long time), but I?m interested what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and with what regularity, what resources it takes, etc. If you have anything going on at your site that?s relevant, can you please share? Thanks, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. 
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Douglas.Hughes at DEShawResearch.com Mon Oct 26 13:42:47 2015
From: Douglas.Hughes at DEShawResearch.com (Hughes, Doug)
Date: Mon, 26 Oct 2015 13:42:47 +0000
Subject: [gpfsug-discuss] ILM and Backup Question
In-Reply-To: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu>
References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu>
Message-ID:

We have all of our GPFS metadata on FlashCache devices (nee Ramsan) and that helps a lot. We also have our data going into monotonically increasing buckets of about 30TB that we call lockers (e.g. locker100, locker101, locker102), with 1 primary active at a time. We have an hourly job that scans the most recent 2 lockers (takes about 45 seconds each) to generate a file list, using the ILM 'LIST' policy, of all files that have been modified or created in the last hour. That goes to a file that has all of the names, which then trickles to a custom backup daemon that has up to 10 threads for rsyncing these over to our HSM server (running GPFS/TSM space management). From there things automatically get backed up and archived. Not all hourlies are necessarily complete (we can't guarantee that nobody is still hanging on to $lockernum-2 for instance), so we have a daily that scans the entire 3PB to find anything created/updated in the last 24 hours and does an rsync on that. There's no harm in duplication of hourlies from the rsync perspective because rsync takes care of that (already exists on destination). The daily job takes about 45 minutes. Needless to say it would be impossible without metadata on a fast flash device.

Sent from my android device.

-----Original Message-----
From: "Kallback-Rose, Kristy A"
To: gpfsug main discussion list
Sent: Sun, 25 Oct 2015 22:39
Subject: [gpfsug-discuss] ILM and Backup Question

Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think it's amazing how everyone does these things in their own slightly different way. I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!"
Yes, please! I'm *very* interested in what others are doing.
We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS Integration - we have had HPSS for a very long time), but I'm interested what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and with what regularity, what resources it takes, etc.
If you have anything going on at your site that's relevant, can you please share?
Thanks,
Kristy
Kristy Kallback-Rose
Manager, Research Storage
Indiana University
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From S.J.Thompson at bham.ac.uk Mon Oct 26 20:15:26 2015
From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services))
Date: Mon, 26 Oct 2015 20:15:26 +0000
Subject: [gpfsug-discuss] ILM and Backup Question
In-Reply-To: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu>
References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu>
Message-ID:

Hi Kristy,

Yes thanks for picking this up. So we (UoB) have 3 GPFS environments, each with different approaches.

1. OpenStack (GPFS as infrastructure) - we don't back this up at all. Partly this is because we are still in pilot phase, and partly because we also have ~7PB CEPH over 4 sites for this project, and the longer term aim is for us to ensure data sets and important VM images are copied into the CEPH store (and then replicated to at least 1 other site). We have some challenges with this, how should we do this? We're sorta thinking about maybe going down the irods route for this, policy scan the FS maybe, add xattr onto important data, and use that to get irods to send copies into CEPH (somehow). So this would be a bit of a hybrid home-grown solution going on here. Anyone got suggestions about how to approach this? I know IBM are now an irods consortium member, so any magic coming from IBM to integrate GPFS and irods?

2. HPC.
We differentiate on our HPC file-system between backed up and non backed up space. Mostly it's non backed up, where we encourage users to keep scratch data sets. We provide a small(ish) home directory which is backed up with TSM to tape, and we also back up applications and system configs of the system. We use a bunch of jobs to sync some configs into local git, which is also stored in the backed up part of the FS, so things like switch configs and icinga config can be backed up sanely.

3. Research Data Storage. This is a large bulk data storage solution. So far it's not actually that large (few hundred TB), so we take the traditional TSM back to tape approach (it's also sync replicated between data centres). We're already starting to see some possible slowness on this with data ingest and we've only just launched the service. Maybe it is just because we have launched that we suddenly see high data ingest. We are also experimenting with HSM to tape, but other than that we have no other ILM policies - only two tiers of disk, SAS for metadata and NL-SAS for bulk data. I'd like to see a flash tier in there for metadata, which would free SAS drives and so we might be more into ILM policies. We have some more testing with snapshots to do, and have some questions about recovery of HSM files if the FS is snapshotted. Anyone any experience with this with 4.1 upwards versions of GPFS?

Straight TSM backup for us means we can end up with 6 copies of data - one per data centre, backup + offsite backup tape set, HSM pool + offsite copy of HSM pool. (If an HSM tape fails, how do we know what to restore from backup? Hence we make copies of the HSM tapes as well). As our backups run on TSM, it uses the policy engine and mmbackup, so we only back up changes and new files, and never back up twice from the FS.

Does anyone know how TSM backups handle XATTRs? This is one of the questions that was raised at meet the devs. Or even other attributes like immutability, as unless you are in compliant mode, it's possible for immutable files to be deleted in some cases. In fact this is an interesting topic, it just occurred to me, what happens if your HSM tape fails and it contained immutable files. Would it be possible to recover these files if you don't have a copy of the HSM tape? - can you do a synthetic recreate of the TSM HSM tape from backups? We typically tell users that backups are for DR purposes, but that we'll make efforts to try and restore files subject to resource availability.

Is anyone using SOBAR? What is your rationale for this? I can see that at scale, there are a lot of benefits to this. But how do you handle users corrupting/deleting files etc? My understanding of SOBAR is that it doesn't give you the same ability to recover versions of files, deletions etc that straight TSM backup does. (This is something I've been meaning to raise for a while here.)

So what do others do? Do you have similar approaches to not backing up some types of data/areas? Do you use TSM or home-grown solutions? Or even other commercial backup solutions? What are your rationales for making decisions on backup approaches? Has anyone built their own DMAPI type interface for doing these sorts of things? Snapshots only? Do you allow users to restore themselves? If you are using ILM, are you doing it with straight policy, or is TSM playing part of the game?

(If people want to comment anonymously on this without committing their company on list, happy to take email to the chair@ address and forward on anonymously to the group).
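As a rough illustration of the "straight policy" approach Doug describes above (an hourly LIST scan whose output feeds an external copy tool), a minimal sketch might look like the following. The file system name, paths and list name are invented placeholders, and the interval syntax and list-record format should be checked against the mmapplypolicy documentation for your release:

  # changed-files.pol: select everything modified in the last hour
  RULE EXTERNAL LIST 'changed' EXEC ''   /* empty EXEC: just write the list file */
  RULE 'recent' LIST 'changed'
       WHERE (CURRENT_TIMESTAMP - MODIFICATION_TIME) < INTERVAL '1' HOURS

  # run the scan, defer execution, and write the list to /tmp/hourly.list.changed
  mmapplypolicy /gpfs/locker102 -P changed-files.pol -I defer -f /tmp/hourly

  # strip the "inode gen snapid --" prefix and hand the paths to a copy tool
  # (a real backup daemon would add retries and multiple rsync threads)
  sed 's/^.* -- //' /tmp/hourly.list.changed > /tmp/hourly.paths
  rsync -a --files-from=/tmp/hourly.paths / hsmserver:/backup/staging/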
Simon On 26/10/2015, 02:38, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Kallback-Rose, Kristy A" wrote: >Simon wrote recently in the GPFS UG Blog: "We also got into discussion on >backup and ILM, and I think its amazing how everyone does these things in >their own slightly different way. I think this might be an interesting >area for discussion over on the group mailing list. There's a lot of >options and different ways to do things!? > >Yes, please! I?m *very* interested in what others are doing. > >We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS >Integration?we have had HPSS for a very long time), but I?m interested >what others are doing with either ILM or other methods to brew their own >backup solutions, how much they are backing up and with what regularity, >what resources it takes, etc. > >If you have anything going on at your site that?s relevant, can you >please share? > >Thanks, >Kristy > >Kristy Kallback-Rose >Manager, Research Storage >Indiana University From wsawdon at us.ibm.com Mon Oct 26 21:12:55 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Mon, 26 Oct 2015 13:12:55 -0800 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <562DE7B5.7080303@fz-juelich.de> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> Message-ID: <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> > From: Stephan Graf > > For backup we use mmbackup (dsmc) > for the user HOME directory (no ILM) > #120 mio files => 3 hours get candidate list + x hour backup That seems rather slow. What version of GPFS are you running? How many nodes are you using? Are you using a "-g global shared directory"? The original mmapplypolicy code was targeted to a single node, so by default it still runs on a single node and you have to specify -N to run it in parallel. When you run multi-node there is a "-g" option that defines a global shared directory that must be visible to all nodes specified in the -N list. Using "-g" with "-N" enables a scale-out parallel algorithm that substantially reduces the time for candidate selection. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From wsawdon at us.ibm.com Mon Oct 26 22:22:58 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Mon, 26 Oct 2015 14:22:58 -0800 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: <201510262224.t9QMOwRO006986@d03av03.boulder.ibm.com> > From: "Simon Thompson (Research Computing - IT Services)" > > Does anyone know how TSM backups handle XATTRs? TSM capture XATTRs and ACLs in an opaque "blob" using gpfs_fgetattrs. Unfortunately, TSM stores the opaque blob with the file data. Changes to the blob require the data to be backed up again. > Or even other attributes like immutability, Immutable files may be backed up and restored as immutable files. Immutability is restored after the data has been restored. > can you do a synthetic recreate of the TSM HSM tape from backups? TSM stores data from backups and data from HSM in different pools. A file that is both HSM'ed and backed up will have at least two copies of data off-line. I suspect that losing a tape from the HSM pool will have no effect on the backup pool, but you should verify that with someone from TSM. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... 
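Picking up Wayne's earlier point about -g and -N for the candidate scan: the difference is easiest to see on the command line. A sketch only - node names, file system and paths are placeholders, and the global work directory must be somewhere all the listed nodes can see (e.g. inside GPFS itself):

  # single node, classic behaviour
  mmapplypolicy gpfs1 -P backup-candidates.pol -I test

  # scale-out candidate selection across two NSD servers
  mmapplypolicy gpfs1 -P backup-candidates.pol -I test \
      -N nsd01,nsd02 -g /gpfs/gpfs1/.policytmp

  # mmbackup at the 4.1 level accepts the same style of options
  # (check the man page for your exact release)
  mmbackup /gpfs/home -t incremental -N nsd01,nsd02 -g /gpfs/home/.mmbackuptmp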
URL: From st.graf at fz-juelich.de Tue Oct 27 07:03:19 2015 From: st.graf at fz-juelich.de (Stephan Graf) Date: Tue, 27 Oct 2015 08:03:19 +0100 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> Message-ID: <562F21B7.8040007@fz-juelich.de> We are running the mmbackup on an AIX system oslevel -s 6100-07-10-1415 Current GPFS build: "4.1.0.8 ". So we only use one node for the policy run. Stephan On 10/26/15 22:12, Wayne Sawdon wrote: > From: Stephan Graf > > For backup we use mmbackup (dsmc) > for the user HOME directory (no ILM) > #120 mio files => 3 hours get candidate list + x hour backup That seems rather slow. What version of GPFS are you running? How many nodes are you using? Are you using a "-g global shared directory"? The original mmapplypolicy code was targeted to a single node, so by default it still runs on a single node and you have to specify -N to run it in parallel. When you run multi-node there is a "-g" option that defines a global shared directory that must be visible to all nodes specified in the -N list. Using "-g" with "-N" enables a scale-out parallel algorithm that substantially reduces the time for candidate selection. -Wayne _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Tue Oct 27 09:02:52 2015 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Tue, 27 Oct 2015 10:02:52 +0100 Subject: [gpfsug-discuss] Spectrum Scale v4.2 In-Reply-To: References: Message-ID: <201510270904.t9R940k4019623@d06av11.portsmouth.uk.ibm.com> see "IBM Spectrum Scale V4.2 delivers simple, efficient,and intelligent data management for highperformance,scale-out storage" http://www.ibm.com/common/ssi/rep_ca/8/897/ENUS215-398/ENUS215-398.PDF Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Hechtsheimer Str. 2, 55131 Mainz mailto:kraemerf at de.ibm.com voice: +49171-3043699 IBM Germany -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jonathan at buzzard.me.uk Tue Oct 27 10:47:43 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 27 Oct 2015 10:47:43 +0000 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <201510262224.t9QMOwRO006986@d03av03.boulder.ibm.com> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <201510262224.t9QMOwRO006986@d03av03.boulder.ibm.com> Message-ID: <1445942863.17909.89.camel@buzzard.phy.strath.ac.uk> On Mon, 2015-10-26 at 14:22 -0800, Wayne Sawdon wrote: [SNIP] > > > > can you do a synthetic recreate of the TSM HSM tape from backups? > > TSM stores data from backups and data from HSM in different pools. A > file that is both HSM'ed and backed up will have at least two copies > of data off-line. I suspect that losing a tape from the HSM pool will > have no effect on the backup pool, but you should verify that with > someone from TSM. > I am pretty sure that you have to restore the files first from backup, and it is a manual process. Least it was for me when a HSM tape went bad in the past. Had to use TSM to generate a list of the files on the HSM tape, and then feed that in to a dsmc restore, before doing a reconcile and removing the tape from the library for destruction. Finally all the files where punted back to tape. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From wsawdon at us.ibm.com Tue Oct 27 15:25:02 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Tue, 27 Oct 2015 07:25:02 -0800 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <562F21B7.8040007@fz-juelich.de> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> Message-ID: <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> > From: Stephan Graf > We are running the mmbackup on an AIX system > oslevel -s > 6100-07-10-1415 > Current GPFS build: "4.1.0.8 ". > > So we only use one node for the policy run. > Even on one node you should see a speedup using -g and -N. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Oct 27 17:28:00 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 27 Oct 2015 17:28:00 +0000 Subject: [gpfsug-discuss] Quotas, replication and hsm Message-ID: Hi, If we have replication enabled on a file, does the size from ls -l or du return the actual file size, or the replicated file size (I.e. Twice the actual size)?. >From experimentation, it appears to be double the actual size, I.e. Taking into account replication of 2. This appears to mean that quotas have to be double what we actually want to take account of the replication factor. Is this correct? Second part of the question. If a file is transferred to tape (or compressed maybe as well), does the file still count against quota, and how much for? As on hsm tape its no longer copies=2. Same for a compressed file, does the compressed file count as the original or compressed size against quota? I.e. Could a user accessing a compressed file suddenly go over quota by accessing the file? 
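A quick way to see what is actually being charged for a replicated file, and which figure each tool reports (file system, fileset and paths below are invented examples on a file system with data replication 2):

  dd if=/dev/zero of=/gpfs/gpfs1/proj/bigfile bs=1M count=100
  ls -l /gpfs/gpfs1/proj/bigfile                    # logical size: 100 MB
  du -h /gpfs/gpfs1/proj/bigfile                    # allocated: ~200 MB with two copies
  du -h --apparent-size /gpfs/gpfs1/proj/bigfile    # back to the logical 100 MB
  mmlsattr -L /gpfs/gpfs1/proj/bigfile              # shows the data replication factor
  mmlsquota -j proj gpfs1                           # block usage counts the replicated size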
Thanks Simon From Robert.Oesterlin at nuance.com Tue Oct 27 19:48:04 2015 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 27 Oct 2015 19:48:04 +0000 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs Message-ID: <4E539EE4-596B-441C-9E60-46072E567765@nuance.com> With Spectrum Scale 4.2 announced, can anyone from IBM comment on what the outlook/process is for fixes and PTFs? When 4.1.1 came out, 4.1.0.X more or less dies, with 4.1.0.8 being the last level ? yes? Then move to 4.1.1 With 4.1.1 ? we are now at 4.1.1-2 and 4.2 is going to GA on 11/20/2015 Is the plan to ?encourage? the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future? Bob Oesterlin Sr Storage Engineer, Nuance Communications 507-269-0413 -------------- next part -------------- An HTML attachment was scrubbed... URL: From st.graf at fz-juelich.de Wed Oct 28 08:06:01 2015 From: st.graf at fz-juelich.de (Stephan Graf) Date: Wed, 28 Oct 2015 09:06:01 +0100 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> Message-ID: <563081E9.2090605@fz-juelich.de> Hi Wayne! We are using -g, and we only want to run it on one node, so we don't use the -N option. Stephan On 10/27/15 16:25, Wayne Sawdon wrote: > From: Stephan Graf > We are running the mmbackup on an AIX system > oslevel -s > 6100-07-10-1415 > Current GPFS build: "4.1.0.8 ". > > So we only use one node for the policy run. > Even on one node you should see a speedup using -g and -N. -Wayne _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dan.Foster at bristol.ac.uk Wed Oct 28 10:06:10 2015 From: Dan.Foster at bristol.ac.uk (Dan Foster) Date: Wed, 28 Oct 2015 10:06:10 +0000 Subject: [gpfsug-discuss] Quotas, replication and hsm In-Reply-To: References: Message-ID: On 27 October 2015 at 17:28, Simon Thompson (Research Computing - IT Services) wrote: > Hi, > > If we have replication enabled on a file, does the size from ls -l or du return the actual file size, or the replicated file size (I.e. Twice the actual size)?. > > From experimentation, it appears to be double the actual size, I.e. Taking into account replication of 2. 
> > This appears to mean that quotas have to be double what we actually want to take account of the replication factor.
> > Is this correct?

This is what we observe here by default and currently have to double our fileset quotas to take this into account on replicated filesystems.

You've reminded me that I was going to ask this list if it's possible to report the un-replicated sizes? While the quota management is only a slight pain, what's reported to the user is more of a problem for us (e.g. via SMB share / df). We're considering replicating a lot more of our filesystems and it would be useful if it didn't appear that everyone's quotas had just doubled overnight.

Thanks, Dan.
--
Dan Foster | Senior Storage Systems Administrator | IT Services

From duersch at us.ibm.com Wed Oct 28 12:47:52 2015
From: duersch at us.ibm.com (Steve Duersch)
Date: Wed, 28 Oct 2015 08:47:52 -0400
Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs
Message-ID:

>>Is the plan to "encourage" the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future?
IBM will continue to create PTFs for the 4.1.1 stream.

Steve Duersch
Spectrum Scale (GPFS) FVTest
IBM Poughkeepsie, New York
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Robert.Oesterlin at nuance.com Wed Oct 28 13:06:56 2015
From: Robert.Oesterlin at nuance.com (Oesterlin, Robert)
Date: Wed, 28 Oct 2015 13:06:56 +0000
Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs
In-Reply-To: References: Message-ID:

Hi Steve

Thanks - that's puzzling (surprising?) given that 4.1.1 hasn't really been out that long (less than 6 months). I'm in a position of deciding what my upgrade path and timeline should be. If I'm at 4.1.0.X and want to upgrade all my clusters, the "safer" bet is probably 4.1.1-X, but all the new features are going to end up on the 4.2.X. If 4.2 is going to GA in November, perhaps it's better to wait for the first 4.2 PTF package.

Bob Oesterlin
Sr Storage Engineer, Nuance Communications
507-269-0413

From: > on behalf of Steve Duersch >
Reply-To: gpfsug main discussion list >
Date: Wednesday, October 28, 2015 at 7:47 AM
To: "gpfsug-discuss at spectrumscale.org" >
Subject: Re: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs

IBM will continue to create PTFs for the 4.1.1 stream.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Kevin.Buterbaugh at Vanderbilt.Edu Wed Oct 28 13:09:52 2015
From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L)
Date: Wed, 28 Oct 2015 13:09:52 +0000
Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs
In-Reply-To: References: Message-ID: <6AB4198E-DE7C-4F5D-9C3A-0067C85D1AE0@vanderbilt.edu>

All,

What about the 4.1.0-x stream? We're on 4.1.0-8 and will soon be applying an efix to it to take care of the snapshot deletion and "quotas are wrong" bugs. We've also got no immediate plans to go to either 4.1.1-x or 4.2 until they've had a chance to ... mature. It's not that big of a deal - I don't mind running on the efix for a while. Just curious. Thanks...

Kevin

On Oct 28, 2015, at 7:47 AM, Steve Duersch > wrote:
>>Is the plan to "encourage" the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future?
IBM will continue to create PTFs for the 4.1.1 stream.
Steve Duersch Spectrum Scale (GPFS) FVTest IBM Poughkeepsie, New York _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Oct 28 13:15:30 2015 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 28 Oct 2015 13:15:30 +0000 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs In-Reply-To: <6AB4198E-DE7C-4F5D-9C3A-0067C85D1AE0@vanderbilt.edu> References: <6AB4198E-DE7C-4F5D-9C3A-0067C85D1AE0@vanderbilt.edu> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB05CF6CF0@CHI-EXCHANGEW1.w2k.jumptrading.com> IBM has stated that there will no longer be PTF releases for 4.1.0, and that 4.1.0-8 is the last PTF release. Thus you?ll have to choose between upgrading to 4.1.1 (which has the latest GPFS Protocols feature, hence the numbering change), or wait and go with the 4.2 release. I heard rumor from somebody at IBM (honestly can?t remember who) that the first 3 releases of any major release has some additional debugging turned up, which is turned off after on the fourth PTF release and those going forward. Does anybody at IBM want to confirm or deny this rumor? I?m also leery of going with the first major release of GPFS (or any software, like RHEL 7.0 for instance). Thanks, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Buterbaugh, Kevin L Sent: Wednesday, October 28, 2015 8:10 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs All, What about the 4.1.0-x stream? We?re on 4.1.0-8 and will soon be applying an efix to it to take care of the snapshot deletion and ?quotas are wrong? bugs. We?ve also go no immediate plans to go to either 4.1.1-x or 4.2 until they?ve had a chance to ? mature. It?s not that big of a deal - I don?t mind running on the efix for a while. Just curious. Thanks? Kevin On Oct 28, 2015, at 7:47 AM, Steve Duersch > wrote: >>Is the plan to ?encourage? the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future? IBM will continue to create PTFs for the 4.1.1 stream. Steve Duersch Spectrum Scale (GPFS) FVTest IBM Poughkeepsie, New York _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. 
The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Oct 28 13:25:27 2015 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 28 Oct 2015 13:25:27 +0000 Subject: [gpfsug-discuss] Quotas, replication and hsm In-Reply-To: References: Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB05CF6E05@CHI-EXCHANGEW1.w2k.jumptrading.com> I'm not sure what kind of report you're looking for, but the `du` command has a "--apparent-size" option that has this description: print apparent sizes, rather than disk usage; although the apparent size is usually smaller, it may be larger due to holes in (?sparse?) files, internal fragmentation, indirect blocks, and the like This can be used to get the actual amount of space that files are using. I think that mmrepquota and mmlsquota show twice the amount of space of the actual file due to the replication, but somebody correct me if I'm mistaken. I also would like to know what the output of the ILM "LIST" policy reports for KB_ALLOCATED for replicated files. Is it the replicated amount of data? Thanks, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Dan Foster Sent: Wednesday, October 28, 2015 5:06 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quotas, replication and hsm On 27 October 2015 at 17:28, Simon Thompson (Research Computing - IT Services) wrote: > Hi, > > If we have replication enabled on a file, does the size from ls -l or du return the actual file size, or the replicated file size (I.e. Twice the actual size)?. > > From experimentation, it appears to be double the actual size, I.e. Taking into account replication of 2. > > This appears to mean that quotas have to be double what we actually want to take account of the replication factor. > > Is this correct? This is what we obverse here by default and currently have to double our fileset quotas to take this is to account on replicated filesystems. You've reminded me that I was going to ask this list if it's possible to report the un-replicated sizes? While the quota management is only a slight pain, what's reported to the user is more of a problem for us(e.g. via SMB share / df ). We're considering replicating a lot more of our filesystems and it would be useful if it didn't appear that everyones quotas had just doubled overnight. Thanks, Dan. -- Dan Foster | Senior Storage Systems Administrator | IT Services _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. 
The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From wsawdon at us.ibm.com Wed Oct 28 13:36:27 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Wed, 28 Oct 2015 05:36:27 -0800 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <563081E9.2090605@fz-juelich.de> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> <563081E9.2090605@fz-juelich.de> Message-ID: <201510281336.t9SDaiNa015723@d01av01.pok.ibm.com> You have to use both options even if -N is only the local node. Sorry, -Wayne From: Stephan Graf To: Date: 10/28/2015 01:06 AM Subject: Re: [gpfsug-discuss] ILM and Backup Question Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Wayne! We are using -g, and we only want to run it on one node, so we don't use the -N option. Stephan On 10/27/15 16:25, Wayne Sawdon wrote: > From: Stephan Graf > We are running the mmbackup on an AIX system > oslevel -s > 6100-07-10-1415 > Current GPFS build: "4.1.0.8 ". > > So we only use one node for the policy run. > Even on one node you should see a speedup using -g and -N. -Wayne _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From wsawdon at us.ibm.com Wed Oct 28 14:11:25 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Wed, 28 Oct 2015 06:11:25 -0800 Subject: [gpfsug-discuss] Quotas, replication and hsm In-Reply-To: References: Message-ID: <201510281412.t9SEChQo030691@d01av03.pok.ibm.com> > From: "Simon Thompson (Research Computing - IT Services)" > > > Second part of the question. If a file is transferred to tape (or > compressed maybe as well), does the file still count against quota, > and how much for? As on hsm tape its no longer copies=2. Same for a > compressed file, does the compressed file count as the original or > compressed size against quota? I.e. 
Could a user accessing a > compressed file suddenly go over quota by accessing the file? > Quotas account for space in the file system. If you migrate a user's file to tape, then that user is credited for the space saved. If a later access recalls the file then the user is again charged for the space. Note that HSM recall is done as "root" which bypasses the quota check -- this allows the file to be recalled even if it pushes the user past his quota limit. Compression (which is currently in beta) has the same properties. If you compress a file, then the user is credited with the space saved. When the file is uncompressed the user is again charged. Since uncompression is done by the "user" the quota check is enforced and uncompression can fail. This includes writes to a compressed file. > From: Bryan Banister > > I also would like to know what the output of the ILM "LIST" policy > reports for KB_ALLOCATED for replicated files. Is it the replicated > amount of data? > KB_ALLOCATED shows the same value that stat shows, So yes it shows the replicated amount of data actually used by the file. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Oct 28 14:48:11 2015 From: makaplan at us.ibm.com (makaplan at us.ibm.com) Date: Wed, 28 Oct 2015 09:48:11 -0500 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <201510281336.t9SDaiNa015723@d01av01.pok.ibm.com> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> <563081E9.2090605@fz-juelich.de> <201510281336.t9SDaiNa015723@d01av01.pok.ibm.com> Message-ID: <201510281448.t9SEmFsr030044@d01av02.pok.ibm.com> IF you see one or more status messages like this: [I] %2$s Parallel-piped sort and policy evaluation. %1$llu files scanned. %3$s Then you are getting the (potentially) fastest version of the GPFS inode and policy scanning algorithm. You may also want to adjust the -a and -A options of the mmapplypolicy command, as mentioned in the command documentation. Oh I see the documentation for -A is wrong in many versions of the manual. There is an attempt to automagically estimate the proper number of buckets, based on the inodes allocated count. If you want to investigate performance more I recommend you use our debug option -d 7 or set environment variable MM_POLICY_DEBUG_BITS=7 - this will show you how the work is divided among the nodes and threads. --marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From knop at us.ibm.com Thu Oct 29 14:14:58 2015 From: knop at us.ibm.com (Felipe Knop) Date: Thu, 29 Oct 2015 09:14:58 -0500 Subject: [gpfsug-discuss] Intro (new member) Message-ID: Hi, I have just joined the GPFS (Spectrum Scale) UG list. I work in the GPFS development team. I had the chance of attending the "Inaugural USA Meet the Devs" session in New York City on Oct 7, which was a valuable opportunity to hear from customers using the product. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 (845) 433-9314 T/L 293-9314 -------------- next part -------------- An HTML attachment was scrubbed... 
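Marc's debug suggestion above, spelled out as a command line (file system, policy file and node names here are placeholders):

  # show how mmapplypolicy divides the inode/policy scan across nodes and threads
  MM_POLICY_DEBUG_BITS=7 mmapplypolicy gpfs1 -P mypolicy.pol \
      -N nsd01,nsd02 -g /gpfs/gpfs1/.policytmp \
      -I test -L 2 2> /tmp/policy-debug.log
  # the -d 7 option Marc mentions has the same effect as the environment variable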
URL: From carlz at us.ibm.com Fri Oct 30 15:14:50 2015 From: carlz at us.ibm.com (Carl Zetie) Date: Fri, 30 Oct 2015 10:14:50 -0500 Subject: [gpfsug-discuss] Making an RFE Public (and an intro) Message-ID: <201510301520.t9UFKTUP032501@d01av04.pok.ibm.com> First the intro: I am the new Product Manager joining the Spectrum Scale team, taking the place of Janet Ellsworth. I'm looking forward to meeting with you all. I also have some news about RFEs: we are working to enable you to choose whether your RFEs for Scale are private or public. I know that many of you have requested public RFEs so that other people can see and vote on RFEs. We'd like to see that too as it's very valuable information for us (as well as reducing duplicates). So here's what we're doing: Short term: If you have an existing RFE that you would like to see made Public, please email me with the ID of the RFE. You can find my email address at the foot of this message. PLEASE don't email the entire list! Medium term: We are working to allow you to choose at the time of submission whether a request will be Private or Public. Unfortunately for technical internal reasons we can't simply make the Public / Private field selectable at submission time (don't ask!), so instead we are creating two submission queues, one for Private RFEs and another for public RFEs. So when you submit an RFE in future you'll start by selecting the appropriate queue. Inside IBM, they all go into the same evaluation process. As soon as I have an update on the availability of this fix, I will share with the group. Note that even for Public requests, some fields remain Private and hidden from other viewers, e.g. Business Case (look for the "key" icon next to the field to confirm). regards, Carl Carl Zetie Product Manager for Spectrum Scale, IBM (540) 882 9353 ][ 15750 Brookhill Ct, Waterford VA 20197 carlz at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfhamano at us.ibm.com Fri Oct 30 15:29:58 2015 From: jfhamano at us.ibm.com (John Hamano) Date: Fri, 30 Oct 2015 07:29:58 -0800 Subject: [gpfsug-discuss] Making an RFE Public (and an intro) In-Reply-To: <201510301520.t9UFKTUP032501@d01av04.pok.ibm.com> References: <201510301520.t9UFKTUP032501@d01av04.pok.ibm.com> Message-ID: <201510301530.t9UFUM0M004729@d03av05.boulder.ibm.com> Hi Carl, welcome and congratulations on your new role. I am North America Brand Sales for ESS and Spectrum Scale. Let me know when you have some time next weekg to talk. From: Carl Zetie/Fairfax/IBM at IBMUS To: gpfsug-discuss at spectrumscale.org, Date: 10/30/2015 08:20 AM Subject: [gpfsug-discuss] Making an RFE Public (and an intro) Sent by: gpfsug-discuss-bounces at spectrumscale.org First the intro: I am the new Product Manager joining the Spectrum Scale team, taking the place of Janet Ellsworth. I'm looking forward to meeting with you all. I also have some news about RFEs: we are working to enable you to choose whether your RFEs for Scale are private or public. I know that many of you have requested public RFEs so that other people can see and vote on RFEs. We'd like to see that too as it's very valuable information for us (as well as reducing duplicates). So here's what we're doing: Short term: If you have an existing RFE that you would like to see made Public, please email me with the ID of the RFE. You can find my email address at the foot of this message. PLEASE don't email the entire list! 
Medium term: We are working to allow you to choose at the time of submission whether a request will be Private or Public. Unfortunately for technical internal reasons we can't simply make the Public / Private field selectable at submission time (don't ask!), so instead we are creating two submission queues, one for Private RFEs and another for public RFEs. So when you submit an RFE in future you'll start by selecting the appropriate queue. Inside IBM, they all go into the same evaluation process. As soon as I have an update on the availability of this fix, I will share with the group. Note that even for Public requests, some fields remain Private and hidden from other viewers, e.g. Business Case (look for the "key" icon next to the field to confirm). regards, Carl Carl Zetie Product Manager for Spectrum Scale, IBM (540) 882 9353 ][ 15750 Brookhill Ct, Waterford VA 20197 carlz at us.ibm.com_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From PATBYRNE at uk.ibm.com Thu Oct 1 11:09:29 2015 From: PATBYRNE at uk.ibm.com (Patrick Byrne) Date: Thu, 1 Oct 2015 10:09:29 +0000 Subject: [gpfsug-discuss] Problem Determination Message-ID: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Oct 1 13:39:25 2015 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 1 Oct 2015 12:39:25 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> Message-ID: Hi Patrick I was going to mail you directly ? but this may help spark some discussion in this area. GPFS (pardon the use of the ?old school" term ? You need something easier to type that Spectrum Scale) problem determination is one of those areas that is (sometimes) more of an art than a science. IBM publishes a PD guide, and it?s a good start but doesn?t cover all the bases. - In the GPFS log (/var/mmfs/gen/mmfslog) there are a lot of messages generated. I continue to come across ones that are not documented ? or documented poorly. EVERYTHING that ends up in ANY log needs to be documented. - The PD guide gives some basic things to look at for many of the error messages, but doesn?t go into alternative explanation for many errors. Example: When a node gets expelled, the PD guide tells you it?s a communication issue, when it fact in may be related to other things like Linux network tuning. Covering all the possible causes is hard, but you can improve this. - GPFS waiter information ? understanding and analyzing this is key to getting to the bottom of many problems. The waiter information is not well documented. You should include at least a basic guide on how to use waiter information in determining cluster problems. Related: Undocumented config options. You can come across some by doing ?mmdiag ?config?. Using some of these can help you ? or get you in trouble in the long run. If I can see the option, document it. - Make sure that all information I might come across online is accurate, especially on those sites managed by IBM. The Developerworks wiki has great information, but there is a lot of information out there that?s out of date or inaccurate. This leads to confusion. 
- The automatic deadlock detection implemented in 4.1 can be useful, but it also can be problematic in a large cluster when you get into problems. Firing off traces and taking dumps in an automated manner can cause more problems if you have a large cluster. I ended up turning it off. - GPFS doesn?t have anything setup to alert you when conditions occur that may require your attention. There are some alerting capabilities that you can customize, but something out of the box might be useful. I know there is work going on in this area. mmces ? I did some early testing on this but haven?t had a chance to upgrade my protocol nodes to the new level. Upgrading 1000?s of node across many cluster is ? challenging :-) The newer commands are a great start. I like the ability to list out events related to a particular protocol. I could go on? Feel free to contact me directly for a more detailed discussion: robert.oesterlin @ nuance.com Bob Oesterlin Sr Storage Engineer, Nuance Communications From: > on behalf of Patrick Byrne Reply-To: gpfsug main discussion list Date: Thursday, October 1, 2015 at 5:09 AM To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Fri Oct 2 17:44:24 2015 From: bbanister at jumptrading.com (Bryan Banister) Date: Fri, 2 Oct 2015 16:44:24 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB05C8CE44@CHI-EXCHANGEW1.w2k.jumptrading.com> I would like to strongly echo what Bob has stated, especially the documentation or wrong documentation, and I have in-lining some comments below. I liken GPFS to a critical care patient at the hospital. You have to check on the state regularly, know the running heart rate (e.g. waiters), the response of every component from disk, to networks, to server load, etc. 
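The kind of "heart rate" check Bryan describes can be scripted crudely with mmdiag; a rough sketch only (node names are placeholders, and it assumes passwordless root ssh to the cluster members):

  #!/bin/bash
  # show the five longest-running waiters on each node
  for node in nsd01 nsd02 client01; do
      echo "== $node =="
      ssh "$node" /usr/lpp/mmfs/bin/mmdiag --waiters 2>/dev/null | \
          grep 'waiting' | sort -k3 -nr | head -5
  done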
When a problem occurs, running tests (such as nsdperf) to help isolate the problem quickly is crucial. Capturing GPFS trace data is also very important if the problem isn?t obvious. But then you have to wait for IBM support to parse the information and give you their analysis of the situation. It would be great to get an advanced troubleshooting document that describes how to read the output of `mmfsadm dump` commands and the GPFS trace report that is generated. Cheers, -Bryan From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Oesterlin, Robert Sent: Thursday, October 01, 2015 7:39 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Problem Determination Hi Patrick I was going to mail you directly ? but this may help spark some discussion in this area. GPFS (pardon the use of the ?old school" term ? You need something easier to type that Spectrum Scale) problem determination is one of those areas that is (sometimes) more of an art than a science. IBM publishes a PD guide, and it?s a good start but doesn?t cover all the bases. - In the GPFS log (/var/mmfs/gen/mmfslog) there are a lot of messages generated. I continue to come across ones that are not documented ? or documented poorly. EVERYTHING that ends up in ANY log needs to be documented. - The PD guide gives some basic things to look at for many of the error messages, but doesn?t go into alternative explanation for many errors. Example: When a node gets expelled, the PD guide tells you it?s a communication issue, when it fact in may be related to other things like Linux network tuning. Covering all the possible causes is hard, but you can improve this. - GPFS waiter information ? understanding and analyzing this is key to getting to the bottom of many problems. The waiter information is not well documented. You should include at least a basic guide on how to use waiter information in determining cluster problems. Related: Undocumented config options. You can come across some by doing ?mmdiag ?config?. Using some of these can help you ? or get you in trouble in the long run. If I can see the option, document it. [Bryan: Also please, please provide a way to check whether or not the configuration parameters need to be changed. I assume that there is a `mmfsadm dump` command that can tell you whether the config parameter needs to be changed, if not make one! Just stating something like ?This could be increased to XX value for very large clusters? is not very helpful. - Make sure that all information I might come across online is accurate, especially on those sites managed by IBM. The Developerworks wiki has great information, but there is a lot of information out there that?s out of date or inaccurate. This leads to confusion. [Bryan: I know that Scott Fadden is a busy man, so I would recommend helping distribute the workload of maintaining the wiki documentation. This data should be reviewed on a more regular basis, at least once for each major release I would hope, and updated or deleted if found to be out of date.] - The automatic deadlock detection implemented in 4.1 can be useful, but it also can be problematic in a large cluster when you get into problems. Firing off traces and taking dumps in an automated manner can cause more problems if you have a large cluster. I ended up turning it off. [Bryan: From what I?ve heard, IBM is actively working to make the deadlock amelioration logic better. 
I agree that firing off traces can cause more problems, and we have turned off the automated collection as well. We are going to work on enabling the collection of some data during these events to help ensure we get enough data for IBM to analyze the problem.] - GPFS doesn?t have anything setup to alert you when conditions occur that may require your attention. There are some alerting capabilities that you can customize, but something out of the box might be useful. I know there is work going on in this area. [Bryan: The GPFS callback facilities are very useful for setting up alerts, but not well documented or advertised by the GPFS manuals. I hope to see more callback capabilities added to help monitor all aspects of the GPFS cluster and file systems] mmces ? I did some early testing on this but haven?t had a chance to upgrade my protocol nodes to the new level. Upgrading 1000?s of node across many cluster is ? challenging :-) The newer commands are a great start. I like the ability to list out events related to a particular protocol. I could go on? Feel free to contact me directly for a more detailed discussion: robert.oesterlin @ nuance.com Bob Oesterlin Sr Storage Engineer, Nuance Communications From: > on behalf of Patrick Byrne Reply-To: gpfsug main discussion list Date: Thursday, October 1, 2015 at 5:09 AM To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Oct 2 17:58:41 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 2 Oct 2015 16:58:41 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com>, Message-ID: I agree on docs, particularly on mmdiag, I think things like --lroc are not documented. I'm also not sure that --network always gives accurate network stats. (we were doing some ha failure testing where we have split site in and fabrics, yet the network counters didn't change even when the local ib nsd servers were shut down). It would be nice also to have a set of Icinga/Nagios plugins from IBM, maybe in samples whcich are updated on each release with new feature checks. And not problem determination, but id really like to see an inflight non disruptive upgrade path. Particularly as we run vms off gpfs, its bot always practical or possible to move vms, so would be nice to have upgrade in flight (not suggesting this would be a quick thing to implement). Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Oesterlin, Robert [Robert.Oesterlin at nuance.com] Sent: 01 October 2015 13:39 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Problem Determination Hi Patrick I was going to mail you directly ? but this may help spark some discussion in this area. GPFS (pardon the use of the ?old school" term ? You need something easier to type that Spectrum Scale) problem determination is one of those areas that is (sometimes) more of an art than a science. IBM publishes a PD guide, and it?s a good start but doesn?t cover all the bases. - In the GPFS log (/var/mmfs/gen/mmfslog) there are a lot of messages generated. I continue to come across ones that are not documented ? or documented poorly. EVERYTHING that ends up in ANY log needs to be documented. - The PD guide gives some basic things to look at for many of the error messages, but doesn?t go into alternative explanation for many errors. Example: When a node gets expelled, the PD guide tells you it?s a communication issue, when it fact in may be related to other things like Linux network tuning. Covering all the possible causes is hard, but you can improve this. - GPFS waiter information ? understanding and analyzing this is key to getting to the bottom of many problems. The waiter information is not well documented. You should include at least a basic guide on how to use waiter information in determining cluster problems. Related: Undocumented config options. You can come across some by doing ?mmdiag ?config?. Using some of these can help you ? or get you in trouble in the long run. If I can see the option, document it. - Make sure that all information I might come across online is accurate, especially on those sites managed by IBM. The Developerworks wiki has great information, but there is a lot of information out there that?s out of date or inaccurate. This leads to confusion. - The automatic deadlock detection implemented in 4.1 can be useful, but it also can be problematic in a large cluster when you get into problems. 
Firing off traces and taking dumps in an automated manner can cause more problems if you have a large cluster. I ended up turning it off. - GPFS doesn?t have anything setup to alert you when conditions occur that may require your attention. There are some alerting capabilities that you can customize, but something out of the box might be useful. I know there is work going on in this area. mmces ? I did some early testing on this but haven?t had a chance to upgrade my protocol nodes to the new level. Upgrading 1000?s of node across many cluster is ? challenging :-) The newer commands are a great start. I like the ability to list out events related to a particular protocol. I could go on? Feel free to contact me directly for a more detailed discussion: robert.oesterlin @ nuance.com Bob Oesterlin Sr Storage Engineer, Nuance Communications From: > on behalf of Patrick Byrne Reply-To: gpfsug main discussion list Date: Thursday, October 1, 2015 at 5:09 AM To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited From ewahl at osc.edu Fri Oct 2 19:00:46 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Fri, 2 Oct 2015 18:00:46 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955AEB4DF@CIO-KRC-D1MBX02.osuad.osu.edu> I'm not yet in the 4.x release stream so this may be taken with a grain (or more) of salt as we say. PLEASE keep the ability of commands to set -x or dump debug when the env DEBUG=1 is set. This has been extremely useful over the years. Granted I've never worked out why sometimes we see odd little things like machines deciding they suddenly need an FPO license or one nsd server suddenly decides it's name is part of the FQDN instead of just it's hostname and only for certain commands, but it's DAMN useful. Minor issues especially can be tracked down with it. Undocumented features and logged items abound. I'd say start there. 
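(For anyone who has not used the toggle Ed mentions: it is just an environment variable that the mm* wrapper scripts check, so a quick way to capture the shell trace for a misbehaving command is something along these lines -- illustrative only, since which commands honour it and exactly what they print varies by release:

  # run one admin command with debug tracing on, keeping the trace
  # separate from the normal output so it can be read afterwards
  DEBUG=1 mmlscluster > /tmp/mmlscluster.out 2> /tmp/mmlscluster.trace

The trace usually shows which helper scripts and ts* binaries the wrapper actually invoked, which is often enough to see where things went sideways.)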
This is one area where it is definitely more art than science with Spectrum Scale (meh GPFS still sounds better. So does Shark. Can we go back to calling it the Shark Server Project?) Complete failure of the verbs layer and fallback to other defined networks would be nice to know about during operation. It's excellent about telling you at startup but not so much during operation, at least in 3.5. I imagine with the 'automated compatibility layer building' I'll be looking for some serious amounts of PD for the issues we _will_ see there. We frequently build against kernels we are not yet running at this site, so this needs well documented PD and resolution. Ed Wahl OSC ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Patrick Byrne [PATBYRNE at uk.ibm.com] Sent: Thursday, October 01, 2015 6:09 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgiles at gmail.com Fri Oct 2 21:27:17 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 2 Oct 2015 16:27:17 -0400 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955AEB4DF@CIO-KRC-D1MBX02.osuad.osu.edu> References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> <9DA9EC7A281AC7428A9618AFDC49049955AEB4DF@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: I would like to see better performance metrics / counters from GPFS. I know we already have mmpmon, which is generally really good -- I've done some fun things with it and it has been a great tool. And, I realize that there is supposedly a new monitoring framework in 4.x.. which I haven't played with yet. But, Generally it would be extremely helpful to get synchronized (across all nodes) high accuracy counters of data flow, number of waiters, page pool stats, distribution of data from one layer to another down to NSDs.. etc etc etc. 
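(For anyone who has not tried mmpmon, the kind of counters it exposes today can be sampled with a small input file; a rough sketch, with the options quoted from memory so check the man page for your release:

  # sample this node's per-file-system I/O counters every 5 seconds,
  # indefinitely, in machine-parseable form
  echo fs_io_s > /tmp/pmon.in
  mmpmon -p -i /tmp/pmon.in -d 5000 -r 0

That gives per-node, per-file-system read/write byte and operation counts, which is useful but a long way from the synchronized, cluster-wide view described above.)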
I believe many of these counters already exist, but they're hidden in some mmfsadm xx command that one needs to troll through with possible performance implications. mmpmon can do some of this, but it's only a handful of counters, it's hard to say how synchronized the counters are across nodes, and I've personally seen an mmpmon run go bad and take down a cluster. It would be nice if it were pushed out, or provided in a safe manner with the design and expectation of "log-everything forever continuously". As GSS/ESS systems start popping up, I realize they have this other monitoring framework to watch the VD throughputs.. which is great. But, that doesn't allow us to monitor more traditional types. Would be nice to monitor it all together the same way so we don't miss-out on monitoring half the infrastructure or buying a cluster with some fancy GUI that can't do what we want.. -Zach On Fri, Oct 2, 2015 at 2:00 PM, Wahl, Edward wrote: > I'm not yet in the 4.x release stream so this may be taken with a grain (or > more) of salt as we say. > > PLEASE keep the ability of commands to set -x or dump debug when the env > DEBUG=1 is set. This has been extremely useful over the years. Granted > I've never worked out why sometimes we see odd little things like machines > deciding they suddenly need an FPO license or one nsd server suddenly > decides it's name is part of the FQDN instead of just it's hostname and only > for certain commands, but it's DAMN useful. Minor issues especially can be > tracked down with it. > > Undocumented features and logged items abound. I'd say start there. This > is one area where it is definitely more art than science with Spectrum Scale > (meh GPFS still sounds better. So does Shark. Can we go back to calling it > the Shark Server Project?) > > Complete failure of the verbs layer and fallback to other defined networks > would be nice to know about during operation. It's excellent about telling > you at startup but not so much during operation, at least in 3.5. > > I imagine with the 'automated compatibility layer building' I'll be looking > for some serious amounts of PD for the issues we _will_ see there. We > frequently build against kernels we are not yet running at this site, so > this needs well documented PD and resolution. > > Ed Wahl > OSC > > > ________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] > on behalf of Patrick Byrne [PATBYRNE at uk.ibm.com] > Sent: Thursday, October 01, 2015 6:09 AM > To: gpfsug-discuss at gpfsug.org > Subject: [gpfsug-discuss] Problem Determination > > Hi all, > > As I'm sure some of you aware, problem determination is an area that we are > looking to try and make significant improvements to over the coming releases > of Spectrum Scale. To help us target the areas we work to improve and make > it as useful as possible I am trying to get as much feedback as I can about > different problems users have, and how people go about solving them. > > I am interested in hearing everything from day to day annoyances to problems > that have caused major frustration in trying to track down the root cause. > Where possible it would be great to hear how the problems were dealt with as > well, so that others can benefit from your experience. Feel free to reply to > the mailing list - maybe others have seen similar problems and could provide > tips for the future - or to me directly if you'd prefer > (patbyrne at uk.ibm.com). 
> > On a related note, in 4.1.1 there was a component added that monitors the > state of the various protocols that are now supported (NFS, SMB, Object). > The output from this is available with the 'mmces state' and 'mmces events' > CLIs and I would like to get feedback from anyone who has had the chance > make use of this. Is it useful? How could it be improved? We are looking at > the possibility of extending this component to cover more than just > protocols, so any feedback would be greatly appreciated. > > Thanks in advance, > > Patrick Byrne > IBM Spectrum Scale - Development Engineer > IBM Systems - Manchester Lab > IBM UK Limited > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com From Luke.Raimbach at crick.ac.uk Mon Oct 5 13:57:14 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Mon, 5 Oct 2015 12:57:14 +0000 Subject: [gpfsug-discuss] Independent Inode Space Limit Message-ID: Hi All, When creating an independent inode space, I see the valid range for the number of inodes is between 1024 and 4294967294. Is the ~4.2billion upper limit something that can be increased in the future? I also see that the first 1024 inodes are immediately allocated upon creation. I assume these are allocated to internal data structures and are a copy of a subset of the first 4038 inodes allocated for new file systems? It would be useful to know if these internal structures are fixed for independent filesets and if they are not, what factors determine their layout (for performance purposes). Many Thanks, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer, The Francis Crick Institute, Gibbs Building, 215 Euston Road, London NW1 2BE. E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From usa-principal at gpfsug.org Mon Oct 5 14:55:15 2015 From: usa-principal at gpfsug.org (usa-principal-gpfsug.org) Date: Mon, 05 Oct 2015 09:55:15 -0400 Subject: [gpfsug-discuss] Final Reminder: Inaugural US "Meet the Developers" Message-ID: <9656d0110c2be4b339ec5ce662409b8e@webmail.gpfsug.org> A last reminder to check in with Janet if you have not done so already. Looking forward to this event on Wednesday this week. Best, Kristy --- Hello Everyone, Here is a reminder about our inaugural US "Meet the Developers" session. Details are below, and please send an e-mail to Janet Ellsworth (janetell at us.ibm.com) by next Friday September 18th if you wish to attend. Janet is on the product management team for Spectrum Scale and is helping with the logistics for this first event. Date: Wednesday, October 7th Place: IBM building at 590 Madison Avenue, New York City Time: 12:30 to 5 PM (Lunch will be served at 12:30, and sessions will start between 1 and 1:30 PM. Afternoon snacks will be served as well :-) Agenda IBM development architect to present the new protocols support that was released with Spectrum Scale 4.1.1 in June. IBM developer to demo future Graphical User Interface ***Member of user community to present an experience with using Spectrum Scale (still seeking volunteers for this !)*** Open Q&A with the development team We are happy to have heard from many of you so far who would like to attend. 
We still have room however, so please get in touch by the 9/18 date if you would like to attend. ***We also need someone to share an experience or use case scenario with Spectrum Scale for this event, so please let Janet know if you are willing to do that too.*** As you have likely seen, we are also working on the agenda and timing for day-long GPFS US UG event in Austin during November aligned with SC15 and there will be more details on that coming soon. From secretary at gpfsug.org Wed Oct 7 12:50:51 2015 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Wed, 07 Oct 2015 12:50:51 +0100 Subject: [gpfsug-discuss] Places available: Meet the Devs Message-ID: <813d82bd5074b90c3a67acc85a03995b@webmail.gpfsug.org> Hi All, There are still places available for the next 'Meet the Devs' event in Edinburgh on Friday 23rd October from 10:30/11am until 3/3:30pm. It's a great opportunity for you to meet with developers and talk through specific issues as well as learn more from the experts. Location: Room 2009a, Information Services, James Clerk Maxwell Building, Peter Guthrie Tait Road, Edinburgh EH9 3FD Google maps link: https://goo.gl/maps/Ta7DQ Agenda: - GUI - 4.2 Updates/show and tell - Open conversation on any areas of interest attendees may have Lunch and refreshments will be provided. Please email me (secretary at gpfsug.org) if you would like to attend including any particular topics of interest you would like to discuss. Best wishes, -- Claire O'Toole GPFS User Group Secretary +44 (0)7508 033896 www.gpfsug.org From service at metamodul.com Wed Oct 7 16:06:56 2015 From: service at metamodul.com (service at metamodul.com) Date: Wed, 07 Oct 2015 17:06:56 +0200 Subject: [gpfsug-discuss] Places available: Meet the Devs Message-ID: Hi Claire, I will attend the meeting. Hans-Joachim Ehlers MetaModul GmbH Germany Cheers Hajo Von Samsung Mobile gesendet
-------- Original message --------
From: Secretary GPFS UG
Date: 2015.10.07 13:50 (GMT+01:00)
To: gpfsug main discussion list
Subject: [gpfsug-discuss] Places available: Meet the Devs
Hi All, There are still places available for the next 'Meet the Devs' event in Edinburgh on Friday 23rd October from 10:30/11am until 3/3:30pm. It's a great opportunity for you to meet with developers and talk through specific issues as well as learn more from the experts. Location: Room 2009a, Information Services, James Clerk Maxwell Building, Peter Guthrie Tait Road, Edinburgh EH9 3FD Google maps link: https://goo.gl/maps/Ta7DQ Agenda: - GUI - 4.2 Updates/show and tell - Open conversation on any areas of interest attendees may have Lunch and refreshments will be provided. Please email me (secretary at gpfsug.org) if you would like to attend including any particular topics of interest you would like to discuss. Best wishes, -- Claire O'Toole GPFS User Group Secretary +44 (0)7508 033896 www.gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Douglas.Hughes at DEShawResearch.com Wed Oct 7 19:59:26 2015 From: Douglas.Hughes at DEShawResearch.com (Hughes, Doug) Date: Wed, 7 Oct 2015 18:59:26 +0000 Subject: [gpfsug-discuss] new member, first post Message-ID: sitting here in the US GPFS UG meeting in NYC and just found out about this list. We've been a GPFS user for many years, first with integrated DDN support, but now also with a GSS system. we have about 4PB of raw GPFS storage and 1 billion inodes. We keep our metadata on TMS ramsan for very fast policy execution for tiering and migration. We use GPFS to hold the primary source data from our custom supercomputers. We have many policies executed periodically for managing the data, including writing certain files to dedicated fast pools and then migrating the data off to wide swaths of disk for read access from cluster clients. One pain point, which I'm sure many of the rest of you have seen, restripe operations for just metadata are unnecessarily slow. If we experience a flash module failure and need to restripe, it also has to check all of the data. I have a feature request open to make metadata restripes only look at metadata (since it is on RamSan/FlashCache, this should be very fast) instead of scanning everything, which can and does take months with performance impacts. Doug Hughes D. E. Shaw Research, LLC. Sent from my android device. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at gpfsug.org Thu Oct 8 20:37:05 2015 From: chair at gpfsug.org (GPFS UG Chair (Simon Thompson)) Date: Thu, 08 Oct 2015 20:37:05 +0100 Subject: [gpfsug-discuss] User group update Message-ID: Hi, I thought I'd drop an update to the group on various admin things which have been going on behind the scenes. The first US meet the devs event was held yesterday, and I'm hoping someone who went will be preparing a blog post to cover the event a little. I know a bunch of people have joined the mailing list since then, so welcome to the group to all of those! ** User Group Engagement with IBM ** I also met with Akhtar yesterday who is the IBM VP for Technical Computing Developments (which includes Spectrum Scale). He was in the UK for a few days at the IBM Manchester Labs, so we managed to squeeze a meeting to talk a bit about the UG. I'm very pleased that Akhtar confirmed IBMs commitment to help the user group in both the UK and USA with developer support for the meet the devs and annual group meetings. 
I'd like to extend my thanks to those at IBM who are actively supporting the group in so many ways. One idea we have been mulling over is filming the talks at next year's events and then putting those on Youtube for people who can't get there. IBM have given us tentative agreement to do this, subject to a few conditions. Most importantly that the UG and IBM ensure we don't publish customer or IBM items which are NDA/not for general public consumption. I'm hopeful we can get this all approved and if we do, we'll be looking to the community to help us out (anyone got digital camera equipment we might be able to borrow, or some help with editing down afterwards?) Whilst in Manchester I also met with Patrick to talk over the various emails people have sent in about problem determination, which Patrick will be taking to the dev meeting in a few weeks. It sounds like there are some interesting ideas kicking about, so hopefully we'll get some value from the user group input. Some of the new features in 4.2 were also demo'd and for those who might not have been to a meet the devs session and are interested in the upcoming GUI, it is now in public beta, head over to developer works for more details: https://www.ibm.com/developerworks/community/forums/html/topic?id=4dc34bf1- 17d1-4dc0-af72-6dc5a3f93e82&ps=25 ** User Group Feedback ** Over the past few months, I've also been collecting feedback from people, either comments on the mailing list, or those who I've spoken to, which was all collated and sent in to IBM, we'll hopefully be getting some feedback on that in the next few weeks - there's a bunch of preliminary answers now, but a few places we still need a bit of clarification. There's also some longer term discussion going on about GPFS and cloud (in particular to those of us in scientific areas). We'll feed that back as and when we get responses we can share. We'd like to ensure that we gather as much feedback from users so that we can collectively take it to IBM, so please do continue to post comments etc to the mailing list. ** Diary Dates ** A few dates for diaries: * Meet the Devs in Edinburgh - Friday 23rd October 2015 * GPFS UG Meeting @ SC15 in Austin, USA - Sunday 15th November 2015 * GPFS UG Meeting @ Computing Insight UK, Coventry, UK - Tuesday 8th December 2015 (Note you must be registered also for CIUK) * GPFS UG Meeting May 2015 - IBM South Bank, London, UK- 17th/18th May 2016 ** User Group Admin ** Within the committee, we've been talking about how we can extend the reach of the group, so we may be reaching out to a few group members to take this forward. Of course if anyone has suggestions on how we can ensure we reach as many people as possible, please let me know, either via the mailing list of directly by email. I know there are lot of people on the mailing list who don't post (regularly), so I'd be interested to hear if you find the group mailing list discussion useful, if you feel there are barriers to asking questions, or what you'd like to see coming out of the user group - please feel free to email me directly if you'd like to comment on any of this! We've also registered spectrumscale.org to point to the user group, so you may start to see the group marketed as the Spectrum Scale User Group, but rest assured, its still the same old GPFS User Group ;-) Just a reminder that we made the mailing list so that only members can post. This was to reduce the amount of spam coming in and being held for moderation (and a few legit posts got lost this way). 
If you do want to post, but not receive the emails, you can set this as an option in the mailing list software. Finally, I've also fixed the mailing list archives, so these are now available at: http://www.gpfsug.org/pipermail/gpfsug-discuss/ Simon GPFS UG, UK Chair From L.A.Hurst at bham.ac.uk Fri Oct 9 09:25:52 2015 From: L.A.Hurst at bham.ac.uk (Laurence Alexander Hurst (IT Services)) Date: Fri, 9 Oct 2015 08:25:52 +0000 Subject: [gpfsug-discuss] User group update Message-ID: On 08/10/2015 20:37, "gpfsug-discuss-bounces at gpfsug.org on behalf of GPFS UG Chair (Simon Thompson)" wrote: >GPFS UG Meeting May 2015 - IBM South Bank, London, UK- 17th/18th May >2016 Daft question: is that 17th *and* 18th or 17th *or* 18th (presumably TBC)? Thanks, Laurence -- Laurence Hurst Research Support, IT Services, University of Birmingham From S.J.Thompson at bham.ac.uk Fri Oct 9 10:00:11 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 9 Oct 2015 09:00:11 +0000 Subject: [gpfsug-discuss] User group update In-Reply-To: References: Message-ID: Both days. May 2016 is a two day event. Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Laurence Alexander Hurst (IT Services) [L.A.Hurst at bham.ac.uk] Sent: 09 October 2015 09:25 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] User group update On 08/10/2015 20:37, "gpfsug-discuss-bounces at gpfsug.org on behalf of GPFS UG Chair (Simon Thompson)" wrote: >GPFS UG Meeting May 2015 - IBM South Bank, London, UK- 17th/18th May >2016 Daft question: is that 17th *and* 18th or 17th *or* 18th (presumably TBC)? Thanks, Laurence -- Laurence Hurst Research Support, IT Services, University of Birmingham _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Sat Oct 10 14:54:22 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Sat, 10 Oct 2015 13:54:22 +0000 Subject: [gpfsug-discuss] User group update Message-ID: > >We've also registered spectrumscale.org to point to the user group, so you >may start to see the group marketed as the Spectrum Scale User Group, but >rest assured, its still the same old GPFS User Group ;-) And this is just a test mail to ensure that mail to gpfsug-discuss at spectrumscale.org gets through OK. The old address should also still work. Simon From S.J.Thompson at bham.ac.uk Sat Oct 10 14:55:55 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Sat, 10 Oct 2015 13:55:55 +0000 Subject: [gpfsug-discuss] User group update In-Reply-To: References: Message-ID: On 10/10/2015 14:54, "Simon Thompson (Research Computing - IT Services)" wrote: >> >>We've also registered spectrumscale.org to point to the user group, so >>you >>may start to see the group marketed as the Spectrum Scale User Group, but >>rest assured, its still the same old GPFS User Group ;-) > >And this is just a test mail to ensure that mail to >gpfsug-discuss at spectrumscale.org gets through OK. The old address should >also still work. And checking the old address still works fine as well. 
Simon From Robert.Oesterlin at nuance.com Tue Oct 13 03:03:45 2015 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 13 Oct 2015 02:03:45 +0000 Subject: [gpfsug-discuss] User group Meeting at SC15 - Registration Message-ID: We?d like to have all those attending the user group meeting at SC15 to register ? details are below. Thanks to IBM for getting the space and arranging all the details. I?ll post a more detailed agenda soon. Looking forward to meeting everyone! Location: JW Marriott 110 E 2nd Street Austin, Texas United States Date and Time: Sunday Nov 15, 1:00 PM?5:30 PM Agenda: - Latest IBM Spectrum Scale enhancements - Future directions and roadmap* (NDA required) - Newer usecases and User presentations Registration: Please register at the below link to book your seat. https://www-950.ibm.com/events/wwe/grp/grp017.nsf/v17_agenda?openform&seminar=99QNTNES&locale=en_US&S_TACT=sales Bob Oesterlin Sr Storage Engineer, Nuance Communications 507-269-0413 -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at spectrumscale.org Sat Oct 17 20:51:50 2015 From: chair at spectrumscale.org (GPFS UG Chair (Simon Thompson)) Date: Sat, 17 Oct 2015 20:51:50 +0100 Subject: [gpfsug-discuss] Blog on USA Meet the Devs Message-ID: Hi All, Kirsty wrote a blog post on the inaugural meet the devs in the USA. You can find it here: http://www.spectrumscale.org/inaugural-usa-meet-the-devs/ Thanks to Kristy, Bob and Pallavi for organising, the IBM devs and the group members giving talks. Simon From Tomasz.Wolski at ts.fujitsu.com Wed Oct 21 15:23:54 2015 From: Tomasz.Wolski at ts.fujitsu.com (Wolski, Tomasz) Date: Wed, 21 Oct 2015 16:23:54 +0200 Subject: [gpfsug-discuss] Intro Message-ID: Hi All, My name is Tomasz Wolski and I?m development engineer at Fujitsu Technology Solutions in Lodz, Poland. We?ve been using GPFS in our main product, which is ETERNUS CS8000, for many years now. GPFS helps us to build a consolidation of backup and archiving solutions for our end customers. We make use of GPFS snapshots, NIFS/CIFS services, GPFS API for our internal components and many many more .. :) My main responsibility, except developing new features for our system, is integration new GPFS versions into our system and bug tracking GPFS issues. Best regards, Tomasz Wolski -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Oct 23 15:04:49 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 23 Oct 2015 14:04:49 +0000 Subject: [gpfsug-discuss] Independent Inode Space Limit Message-ID: >When creating an independent inode space, I see the valid range for the >number of inodes is between 1024 and 4294967294. > >Is the ~4.2billion upper limit something that can be increased in the >future? > >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Anyone have any thoughts on this? Anyone from IBM know? 
Thanks Simon From sfadden at us.ibm.com Fri Oct 23 13:42:14 2015 From: sfadden at us.ibm.com (Scott Fadden) Date: Fri, 23 Oct 2015 07:42:14 -0500 Subject: [gpfsug-discuss] Independent Inode Space Limit In-Reply-To: References: Message-ID: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> GPFS limits the max inodes based on metadata space. Add more metadata space and you should be able to add more inodes. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale From: "Simon Thompson (Research Computing - IT Services)" To: gpfsug main discussion list Date: 10/23/2015 09:05 AM Subject: Re: [gpfsug-discuss] Independent Inode Space Limit Sent by: gpfsug-discuss-bounces at spectrumscale.org >When creating an independent inode space, I see the valid range for the >number of inodes is between 1024 and 4294967294. > >Is the ~4.2billion upper limit something that can be increased in the >future? > >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Anyone have any thoughts on this? Anyone from IBM know? Thanks Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL:
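As a practical footnote to the answer above: the allocated and maximum inode counts for an independent fileset can be checked, and the limit raised, from the command line. A minimal sketch -- the file system name gpfs01, fileset name fset1 and the new limit are all invented here:

  # show per-fileset inode limits and usage (long listing)
  mmlsfileset gpfs01 -L
  # raise the maximum inodes for one independent fileset
  mmchfileset gpfs01 fset1 --inode-limit 20000000

Whether the new limit is actually reachable still comes down to how much metadata space the file system has, as noted above.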
From wsawdon at us.ibm.com Fri Oct 23 16:25:33 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Fri, 23 Oct 2015 08:25:33 -0700 Subject: [gpfsug-discuss] Independent Inode Space Limit In-Reply-To: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> References: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> Message-ID: <201510231525.t9NFPr1G010768@d03av04.boulder.ibm.com> >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Independent filesets don't have the internal structures that the file system has. Other than the fileset's root directory all of the remaining inodes can be allocated to user files. Inodes are always allocated in full metadata blocks. The inodes for an independent fileset are allocated in their own blocks. This makes fileset snapshots more efficient, since a copy-on-write of the block of inodes will only copy inodes in the fileset. The inode blocks for all filesets are in the same inode file, but the blocks for each independent fileset are strided, making them easy to prefetch for policy scans. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From kallbac at iu.edu Mon Oct 26 02:38:52 2015 From: kallbac at iu.edu (Kallback-Rose, Kristy A) Date: Sun, 25 Oct 2015 22:38:52 -0400 Subject: [gpfsug-discuss] ILM and Backup Question Message-ID: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think its amazing how everyone does these things in their own slightly different way.
I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!? Yes, please! I?m *very* interested in what others are doing. We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS Integration?we have had HPSS for a very long time), but I?m interested what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and with what regularity, what resources it takes, etc. If you have anything going on at your site that?s relevant, can you please share? Thanks, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 495 bytes Desc: Message signed with OpenPGP using GPGMail URL: From st.graf at fz-juelich.de Mon Oct 26 08:43:33 2015 From: st.graf at fz-juelich.de (Stephan Graf) Date: Mon, 26 Oct 2015 09:43:33 +0100 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: <562DE7B5.7080303@fz-juelich.de> Hi! We at J?lich Supercomputing Centre have two ILM managed file systems (GPFS and HSM from TSM). #50 mio files + 10 PB data on tape #30 mio files + 8 PB data on tape For backup we use mmbackup (dsmc) for the user HOME directory (no ILM) #120 mio files => 3 hours get candidate list + x hour backup We use also mmbackup for the ILM managed filesystem. Policy: the file must be backed up first before migrated to tape 2-3 hour for candidate list + x hours/days/weeks backups (!!!) -> a metadata change (e.g. renaming a directory by the user) enforces a new backup of the files which causes a very expensive tape inline copy! Greetings from J?lich, Germany Stephan On 10/26/15 03:38, Kallback-Rose, Kristy A wrote: Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think its amazing how everyone does these things in their own slightly different way. I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!? Yes, please! I?m *very* interested in what others are doing. We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS Integration?we have had HPSS for a very long time), but I?m interested what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and with what regularity, what resources it takes, etc. If you have anything going on at your site that?s relevant, can you please share? Thanks, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. 
Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Douglas.Hughes at DEShawResearch.com Mon Oct 26 13:42:47 2015 From: Douglas.Hughes at DEShawResearch.com (Hughes, Doug) Date: Mon, 26 Oct 2015 13:42:47 +0000 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: We have all of our GPFSmetadata on FlashCache devices (nee Ramsan) and that helps a lot. We also have our data going into monotonically increasing buckets of about 30TB that we call lockers (e.g. locker100, locker101, locker102), with 1 primary active at a time. We have an hourly job that scans the most recent 2 lockers (taked about 45 seconds each) to generate a file list using the ILM 'LIST' policy of all files that have been modified or created in the last hour. That goes to a file that has all of the names which then trickles to a custom backup daemon that has up to 10 threads for rsyncing these over to our HSM server (running GPFS/TSM space management). From there things automatically get backed up and archived. Not all hourlies are necessarily complete (we can't guarantee that nobody is still hanging on to $lockernum-2 for instance), so we have a daily that scans the entire 3PB to find anything created/updated in the last 24 hours and does an rsync on that. There's no harm in duplication of hourlies from the rsync perspective because rsync takes care of that (already exists on destination). The daily job takes about 45 minutes. Needless to say it would be impossible without metadata on a fast flash device. Sent from my android device. -----Original Message----- From: "Kallback-Rose, Kristy A" To: gpfsug main discussion list Sent: Sun, 25 Oct 2015 22:39 Subject: [gpfsug-discuss] ILM and Backup Question Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think its amazing how everyone does these things in their own slightly different way. I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!? Yes, please! I?m *very* interested in what others are doing. We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS Integration?we have had HPSS for a very long time), but I?m interested what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and with what regularity, what resources it takes, etc. If you have anything going on at your site that?s relevant, can you please share? Thanks, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University -------------- next part -------------- An HTML attachment was scrubbed... URL: From Douglas.Hughes at DEShawResearch.com Mon Oct 26 13:42:47 2015 From: Douglas.Hughes at DEShawResearch.com (Hughes, Doug) Date: Mon, 26 Oct 2015 13:42:47 +0000 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: We have all of our GPFSmetadata on FlashCache devices (nee Ramsan) and that helps a lot. We also have our data going into monotonically increasing buckets of about 30TB that we call lockers (e.g. 
locker100, locker101, locker102), with 1 primary active at a time. We have an hourly job that scans the most recent 2 lockers (taked about 45 seconds each) to generate a file list using the ILM 'LIST' policy of all files that have been modified or created in the last hour. That goes to a file that has all of the names which then trickles to a custom backup daemon that has up to 10 threads for rsyncing these over to our HSM server (running GPFS/TSM space management). From there things automatically get backed up and archived. Not all hourlies are necessarily complete (we can't guarantee that nobody is still hanging on to $lockernum-2 for instance), so we have a daily that scans the entire 3PB to find anything created/updated in the last 24 hours and does an rsync on that. There's no harm in duplication of hourlies from the rsync perspective because rsync takes care of that (already exists on destination). The daily job takes about 45 minutes. Needless to say it would be impossible without metadata on a fast flash device. Sent from my android device. -----Original Message----- From: "Kallback-Rose, Kristy A" To: gpfsug main discussion list Sent: Sun, 25 Oct 2015 22:39 Subject: [gpfsug-discuss] ILM and Backup Question Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think its amazing how everyone does these things in their own slightly different way. I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!? Yes, please! I?m *very* interested in what others are doing. We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS Integration?we have had HPSS for a very long time), but I?m interested what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and with what regularity, what resources it takes, etc. If you have anything going on at your site that?s relevant, can you please share? Thanks, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Oct 26 20:15:26 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Mon, 26 Oct 2015 20:15:26 +0000 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: Hi Kristy, Yes thanks for picking this up. So we (UoB) have 3 GPFS environments, each with different approaches. 1. OpenStack (GPFS as infrastructure) - we don't back this up at all. Partly this is because we are still in pilot phase, and partly because we also have ~7PB CEPH over 4 sites for this project, and the longer term aim is for us to ensure data sets and important VM images are copied into the CEPH store (and then replicated to at least 1 other site). We have some challenges with this, how should we do this? We're sorta thinging about maybe going down the irods route for this, policy scan the FS maybe, add xattr onto important data, and use that to get irods to send copies into CEPH (somehow). So this would be a bit of a hybrid home-grown solution going on here. Anyone got suggestions about how to approach this? I know IBM are now an irods consortium member, so any magic coming from IBM to integrate GFPS and irods? 2. HPC. 
We differentiate on our HPC file-system between backed up and non backed up space. Mostly its non backed up, where we encourage users to keep scratch data sets. We provide a small(ish) home directory which is backed up with TSM to tape, and also backup applications and system configs of the system. We use a bunch of jobs to sync some configs into local git which also is stored in the backed up part of the FS, so things like switch configs, icinga config can be backed up sanely. 3. Research Data Storage. This is a large bulk data storage solution. So far its not actually that large (few hundred TB), so we take the traditional TSM back to tape approach (its also sync replicated between data centres). We're already starting to see some possible slowness on this with data ingest and we've only just launched the service. Maybe that is a cause of launching that we suddenly see high data ingest. We are also experimenting with HSM to tape, but other than that we have no other ILM policies - only two tiers of disk, SAS for metadata and NL-SAS for bulk data. I'd like to see a flash tier in there for Metadata, which would free SAS drives and so we might be more into ILM policies. We have some more testing with snapshots to do, and have some questions about recovery of HSM files if the FS is snapshotted. Anyone any experience with this with 4.1 upwards versions of GPFS? Straight TSM backup for us means we can end up with 6 copies of data - once per data centre, backup + offsite backup tape set, HSM pool + offsite copy of HSM pool. (If an HSM tape fails, how do we know what to restore from backup? Hence we make copies of the HSM tapes as well). As our backups run on TSM, it uses the policy engine and mmbackup, so we only backup changes and new files, and never backup twice from the FS. Does anyone know how TSM backups handle XATTRs? This is one of the questions that was raised at meet the devs. Or even other attributes like immutability, as unless you are in complaint mode, its possible for immutable files to be deleted in some cases. In fact this is an interesting topic, it just occurred to me, what happens if your HSM tape fails and it contained immutable files. Would it be possible to recover these files if you don't have a copy of the HSM tape? - can you do a synthetic recreate of the TSM HSM tape from backups? We typically tell users that backups are for DR purposes, but that we'll make efforts to try and restore files subject to resource availability. Is anyone using SOBAR? What is your rationale for this? I can see that at scale, there are lot of benefits to this. But how do you handle users corrupting/deleting files etc? My understanding of SOBAR is that it doesn't give you the same ability to recover versions of files, deletions etc that straight TSM backup does. (this is something I've been meaning to raise for a while here). So what do others do? Do you have similar approaches to not backing up some types of data/areas? Do you use TSM or home-grown solutions? Or even other commercial backup solutions? What are your rationales for making decisions on backup approaches? Has anyone built their own DMAPI type interface for doing these sorts of things? Snapshots only? Do you allow users to restore themselves? If you are using ILM, are you doing it with straight policy, or is TSM playing part of the game? (If people want to comment anonymously on this without committing their company on list, happy to take email to the chair@ address and forward on anonymously to the group). 
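On the tagging idea in (1), the selection side at least is easy to prototype with a plain LIST policy keyed on an extended attribute. A rough, untested sketch -- the attribute name, paths, node class and output prefix are all invented:

  cat > /tmp/tagged.pol <<'EOF'
  /* with -I defer the matches are simply written to a file list */
  RULE 'ext' EXTERNAL LIST 'tagged' EXEC ''
  /* pick up anything carrying the (made-up) user.replicate attribute */
  RULE 'find-tagged' LIST 'tagged'
       WHERE XATTR('user.replicate') IS NOT NULL
  EOF
  mmapplypolicy /gpfs/cloud -P /tmp/tagged.pol -I defer -f /tmp/tagged -N managernodes -g /gpfs/cloud/.policytmp

The resulting file list (something like /tmp/tagged.list.tagged) could then be handed to whatever pushes the copies out via irods into the CEPH store - that last hop is the bit we haven't worked out.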
Simon On 26/10/2015, 02:38, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Kallback-Rose, Kristy A" wrote: >Simon wrote recently in the GPFS UG Blog: "We also got into discussion on >backup and ILM, and I think its amazing how everyone does these things in >their own slightly different way. I think this might be an interesting >area for discussion over on the group mailing list. There's a lot of >options and different ways to do things!? > >Yes, please! I?m *very* interested in what others are doing. > >We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS >Integration?we have had HPSS for a very long time), but I?m interested >what others are doing with either ILM or other methods to brew their own >backup solutions, how much they are backing up and with what regularity, >what resources it takes, etc. > >If you have anything going on at your site that?s relevant, can you >please share? > >Thanks, >Kristy > >Kristy Kallback-Rose >Manager, Research Storage >Indiana University From wsawdon at us.ibm.com Mon Oct 26 21:12:55 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Mon, 26 Oct 2015 13:12:55 -0800 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <562DE7B5.7080303@fz-juelich.de> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> Message-ID: <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> > From: Stephan Graf > > For backup we use mmbackup (dsmc) > for the user HOME directory (no ILM) > #120 mio files => 3 hours get candidate list + x hour backup That seems rather slow. What version of GPFS are you running? How many nodes are you using? Are you using a "-g global shared directory"? The original mmapplypolicy code was targeted to a single node, so by default it still runs on a single node and you have to specify -N to run it in parallel. When you run multi-node there is a "-g" option that defines a global shared directory that must be visible to all nodes specified in the -N list. Using "-g" with "-N" enables a scale-out parallel algorithm that substantially reduces the time for candidate selection. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From wsawdon at us.ibm.com Mon Oct 26 22:22:58 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Mon, 26 Oct 2015 14:22:58 -0800 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: <201510262224.t9QMOwRO006986@d03av03.boulder.ibm.com> > From: "Simon Thompson (Research Computing - IT Services)" > > Does anyone know how TSM backups handle XATTRs? TSM capture XATTRs and ACLs in an opaque "blob" using gpfs_fgetattrs. Unfortunately, TSM stores the opaque blob with the file data. Changes to the blob require the data to be backed up again. > Or even other attributes like immutability, Immutable files may be backed up and restored as immutable files. Immutability is restored after the data has been restored. > can you do a synthetic recreate of the TSM HSM tape from backups? TSM stores data from backups and data from HSM in different pools. A file that is both HSM'ed and backed up will have at least two copies of data off-line. I suspect that losing a tape from the HSM pool will have no effect on the backup pool, but you should verify that with someone from TSM. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From st.graf at fz-juelich.de Tue Oct 27 07:03:19 2015 From: st.graf at fz-juelich.de (Stephan Graf) Date: Tue, 27 Oct 2015 08:03:19 +0100 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> Message-ID: <562F21B7.8040007@fz-juelich.de> We are running the mmbackup on an AIX system oslevel -s 6100-07-10-1415 Current GPFS build: "4.1.0.8 ". So we only use one node for the policy run. Stephan On 10/26/15 22:12, Wayne Sawdon wrote: > From: Stephan Graf > > For backup we use mmbackup (dsmc) > for the user HOME directory (no ILM) > #120 mio files => 3 hours get candidate list + x hour backup That seems rather slow. What version of GPFS are you running? How many nodes are you using? Are you using a "-g global shared directory"? The original mmapplypolicy code was targeted to a single node, so by default it still runs on a single node and you have to specify -N to run it in parallel. When you run multi-node there is a "-g" option that defines a global shared directory that must be visible to all nodes specified in the -N list. Using "-g" with "-N" enables a scale-out parallel algorithm that substantially reduces the time for candidate selection. -Wayne _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Tue Oct 27 09:02:52 2015 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Tue, 27 Oct 2015 10:02:52 +0100 Subject: [gpfsug-discuss] Spectrum Scale v4.2 In-Reply-To: References: Message-ID: <201510270904.t9R940k4019623@d06av11.portsmouth.uk.ibm.com> see "IBM Spectrum Scale V4.2 delivers simple, efficient,and intelligent data management for highperformance,scale-out storage" http://www.ibm.com/common/ssi/rep_ca/8/897/ENUS215-398/ENUS215-398.PDF Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Hechtsheimer Str. 2, 55131 Mainz mailto:kraemerf at de.ibm.com voice: +49171-3043699 IBM Germany -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jonathan at buzzard.me.uk Tue Oct 27 10:47:43 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 27 Oct 2015 10:47:43 +0000 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <201510262224.t9QMOwRO006986@d03av03.boulder.ibm.com> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <201510262224.t9QMOwRO006986@d03av03.boulder.ibm.com> Message-ID: <1445942863.17909.89.camel@buzzard.phy.strath.ac.uk> On Mon, 2015-10-26 at 14:22 -0800, Wayne Sawdon wrote: [SNIP] > > > > can you do a synthetic recreate of the TSM HSM tape from backups? > > TSM stores data from backups and data from HSM in different pools. A > file that is both HSM'ed and backed up will have at least two copies > of data off-line. I suspect that losing a tape from the HSM pool will > have no effect on the backup pool, but you should verify that with > someone from TSM. > I am pretty sure that you have to restore the files first from backup, and it is a manual process. Least it was for me when a HSM tape went bad in the past. Had to use TSM to generate a list of the files on the HSM tape, and then feed that in to a dsmc restore, before doing a reconcile and removing the tape from the library for destruction. Finally all the files where punted back to tape. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From wsawdon at us.ibm.com Tue Oct 27 15:25:02 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Tue, 27 Oct 2015 07:25:02 -0800 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <562F21B7.8040007@fz-juelich.de> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> Message-ID: <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> > From: Stephan Graf > We are running the mmbackup on an AIX system > oslevel -s > 6100-07-10-1415 > Current GPFS build: "4.1.0.8 ". > > So we only use one node for the policy run. > Even on one node you should see a speedup using -g and -N. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Oct 27 17:28:00 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 27 Oct 2015 17:28:00 +0000 Subject: [gpfsug-discuss] Quotas, replication and hsm Message-ID: Hi, If we have replication enabled on a file, does the size from ls -l or du return the actual file size, or the replicated file size (I.e. Twice the actual size)?. >From experimentation, it appears to be double the actual size, I.e. Taking into account replication of 2. This appears to mean that quotas have to be double what we actually want to take account of the replication factor. Is this correct? Second part of the question. If a file is transferred to tape (or compressed maybe as well), does the file still count against quota, and how much for? As on hsm tape its no longer copies=2. Same for a compressed file, does the compressed file count as the original or compressed size against quota? I.e. Could a user accessing a compressed file suddenly go over quota by accessing the file? 
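A quick way to look at the first part of this on a single file, sketched here with a hypothetical path and assuming GNU coreutils for the --apparent-size option:

    ls -l /gpfs/fs0/somefile                  # logical length (st_size) recorded in the inode
    du -k /gpfs/fs0/somefile                  # allocated KB, which includes any replica copies
    du -k --apparent-size /gpfs/fs0/somefile  # logical size, ignoring allocation and replication
    mmlsattr -L /gpfs/fs0/somefile            # reports the file's data and metadata replication factors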
Thanks Simon From Robert.Oesterlin at nuance.com Tue Oct 27 19:48:04 2015 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 27 Oct 2015 19:48:04 +0000 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs Message-ID: <4E539EE4-596B-441C-9E60-46072E567765@nuance.com> With Spectrum Scale 4.2 announced, can anyone from IBM comment on what the outlook/process is for fixes and PTFs? When 4.1.1 came out, 4.1.0.X more or less dies, with 4.1.0.8 being the last level ? yes? Then move to 4.1.1 With 4.1.1 ? we are now at 4.1.1-2 and 4.2 is going to GA on 11/20/2015 Is the plan to ?encourage? the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future? Bob Oesterlin Sr Storage Engineer, Nuance Communications 507-269-0413 -------------- next part -------------- An HTML attachment was scrubbed... URL: From st.graf at fz-juelich.de Wed Oct 28 08:06:01 2015 From: st.graf at fz-juelich.de (Stephan Graf) Date: Wed, 28 Oct 2015 09:06:01 +0100 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> Message-ID: <563081E9.2090605@fz-juelich.de> Hi Wayne! We are using -g, and we only want to run it on one node, so we don't use the -N option. Stephan On 10/27/15 16:25, Wayne Sawdon wrote: > From: Stephan Graf > We are running the mmbackup on an AIX system > oslevel -s > 6100-07-10-1415 > Current GPFS build: "4.1.0.8 ". > > So we only use one node for the policy run. > Even on one node you should see a speedup using -g and -N. -Wayne _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dan.Foster at bristol.ac.uk Wed Oct 28 10:06:10 2015 From: Dan.Foster at bristol.ac.uk (Dan Foster) Date: Wed, 28 Oct 2015 10:06:10 +0000 Subject: [gpfsug-discuss] Quotas, replication and hsm In-Reply-To: References: Message-ID: On 27 October 2015 at 17:28, Simon Thompson (Research Computing - IT Services) wrote: > Hi, > > If we have replication enabled on a file, does the size from ls -l or du return the actual file size, or the replicated file size (I.e. Twice the actual size)?. > > From experimentation, it appears to be double the actual size, I.e. Taking into account replication of 2. 
> > This appears to mean that quotas have to be double what we actually want to take account of the replication factor. > > Is this correct? This is what we obverse here by default and currently have to double our fileset quotas to take this is to account on replicated filesystems. You've reminded me that I was going to ask this list if it's possible to report the un-replicated sizes? While the quota management is only a slight pain, what's reported to the user is more of a problem for us(e.g. via SMB share / df ). We're considering replicating a lot more of our filesystems and it would be useful if it didn't appear that everyones quotas had just doubled overnight. Thanks, Dan. -- Dan Foster | Senior Storage Systems Administrator | IT Services From duersch at us.ibm.com Wed Oct 28 12:47:52 2015 From: duersch at us.ibm.com (Steve Duersch) Date: Wed, 28 Oct 2015 08:47:52 -0400 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs Message-ID: >>Is the plan to ?encourage? the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future? IBM will continue to create PTFs for the 4.1.1 stream. Steve Duersch Spectrum Scale (GPFS) FVTest IBM Poughkeepsie, New York -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Oct 28 13:06:56 2015 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 28 Oct 2015 13:06:56 +0000 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs In-Reply-To: References: Message-ID: Hi Steve Thanks ? that?s puzzling (surprising?) given that 4.1.1 hasn?t really been out that long. (less than 6 months) I?m in a position of deciding of what my upgrade path and timeline should be. If I?m at 4.1.0.X and want to upgrade all my clusters, the ?safer? bet is probably 4.1.1-X. but all the new features are going to end up on the 4.2.X. If 4.2 is going to GA in November, perhaps it?s better to wait for the first 4.2 PTF package. Bob Oesterlin Sr Storage Engineer, Nuance Communications 507-269-0413 From: > on behalf of Steve Duersch > Reply-To: gpfsug main discussion list > Date: Wednesday, October 28, 2015 at 7:47 AM To: "gpfsug-discuss at spectrumscale.org" > Subject: Re: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs IBM will continue to create PTFs for the 4.1.1 stream. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Oct 28 13:09:52 2015 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 28 Oct 2015 13:09:52 +0000 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs In-Reply-To: References: Message-ID: <6AB4198E-DE7C-4F5D-9C3A-0067C85D1AE0@vanderbilt.edu> All, What about the 4.1.0-x stream? We?re on 4.1.0-8 and will soon be applying an efix to it to take care of the snapshot deletion and ?quotas are wrong? bugs. We?ve also go no immediate plans to go to either 4.1.1-x or 4.2 until they?ve had a chance to ? mature. It?s not that big of a deal - I don?t mind running on the efix for a while. Just curious. Thanks? Kevin On Oct 28, 2015, at 7:47 AM, Steve Duersch > wrote: >>Is the plan to ?encourage? the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future? IBM will continue to create PTFs for the 4.1.1 stream. 
Steve Duersch Spectrum Scale (GPFS) FVTest IBM Poughkeepsie, New York _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Oct 28 13:15:30 2015 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 28 Oct 2015 13:15:30 +0000 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs In-Reply-To: <6AB4198E-DE7C-4F5D-9C3A-0067C85D1AE0@vanderbilt.edu> References: <6AB4198E-DE7C-4F5D-9C3A-0067C85D1AE0@vanderbilt.edu> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB05CF6CF0@CHI-EXCHANGEW1.w2k.jumptrading.com> IBM has stated that there will no longer be PTF releases for 4.1.0, and that 4.1.0-8 is the last PTF release. Thus you?ll have to choose between upgrading to 4.1.1 (which has the latest GPFS Protocols feature, hence the numbering change), or wait and go with the 4.2 release. I heard rumor from somebody at IBM (honestly can?t remember who) that the first 3 releases of any major release has some additional debugging turned up, which is turned off after on the fourth PTF release and those going forward. Does anybody at IBM want to confirm or deny this rumor? I?m also leery of going with the first major release of GPFS (or any software, like RHEL 7.0 for instance). Thanks, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Buterbaugh, Kevin L Sent: Wednesday, October 28, 2015 8:10 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs All, What about the 4.1.0-x stream? We?re on 4.1.0-8 and will soon be applying an efix to it to take care of the snapshot deletion and ?quotas are wrong? bugs. We?ve also go no immediate plans to go to either 4.1.1-x or 4.2 until they?ve had a chance to ? mature. It?s not that big of a deal - I don?t mind running on the efix for a while. Just curious. Thanks? Kevin On Oct 28, 2015, at 7:47 AM, Steve Duersch > wrote: >>Is the plan to ?encourage? the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future? IBM will continue to create PTFs for the 4.1.1 stream. Steve Duersch Spectrum Scale (GPFS) FVTest IBM Poughkeepsie, New York _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. 
The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Oct 28 13:25:27 2015 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 28 Oct 2015 13:25:27 +0000 Subject: [gpfsug-discuss] Quotas, replication and hsm In-Reply-To: References: Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB05CF6E05@CHI-EXCHANGEW1.w2k.jumptrading.com> I'm not sure what kind of report you're looking for, but the `du` command has a "--apparent-size" option that has this description: print apparent sizes, rather than disk usage; although the apparent size is usually smaller, it may be larger due to holes in (?sparse?) files, internal fragmentation, indirect blocks, and the like This can be used to get the actual amount of space that files are using. I think that mmrepquota and mmlsquota show twice the amount of space of the actual file due to the replication, but somebody correct me if I'm mistaken. I also would like to know what the output of the ILM "LIST" policy reports for KB_ALLOCATED for replicated files. Is it the replicated amount of data? Thanks, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Dan Foster Sent: Wednesday, October 28, 2015 5:06 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Quotas, replication and hsm On 27 October 2015 at 17:28, Simon Thompson (Research Computing - IT Services) wrote: > Hi, > > If we have replication enabled on a file, does the size from ls -l or du return the actual file size, or the replicated file size (I.e. Twice the actual size)?. > > From experimentation, it appears to be double the actual size, I.e. Taking into account replication of 2. > > This appears to mean that quotas have to be double what we actually want to take account of the replication factor. > > Is this correct? This is what we obverse here by default and currently have to double our fileset quotas to take this is to account on replicated filesystems. You've reminded me that I was going to ask this list if it's possible to report the un-replicated sizes? While the quota management is only a slight pain, what's reported to the user is more of a problem for us(e.g. via SMB share / df ). We're considering replicating a lot more of our filesystems and it would be useful if it didn't appear that everyones quotas had just doubled overnight. Thanks, Dan. -- Dan Foster | Senior Storage Systems Administrator | IT Services _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. 
The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From wsawdon at us.ibm.com Wed Oct 28 13:36:27 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Wed, 28 Oct 2015 05:36:27 -0800 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <563081E9.2090605@fz-juelich.de> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> <563081E9.2090605@fz-juelich.de> Message-ID: <201510281336.t9SDaiNa015723@d01av01.pok.ibm.com> You have to use both options even if -N is only the local node. Sorry, -Wayne From: Stephan Graf To: Date: 10/28/2015 01:06 AM Subject: Re: [gpfsug-discuss] ILM and Backup Question Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Wayne! We are using -g, and we only want to run it on one node, so we don't use the -N option. Stephan On 10/27/15 16:25, Wayne Sawdon wrote: > From: Stephan Graf > We are running the mmbackup on an AIX system > oslevel -s > 6100-07-10-1415 > Current GPFS build: "4.1.0.8 ". > > So we only use one node for the policy run. > Even on one node you should see a speedup using -g and -N. -Wayne _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From wsawdon at us.ibm.com Wed Oct 28 14:11:25 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Wed, 28 Oct 2015 06:11:25 -0800 Subject: [gpfsug-discuss] Quotas, replication and hsm In-Reply-To: References: Message-ID: <201510281412.t9SEChQo030691@d01av03.pok.ibm.com> > From: "Simon Thompson (Research Computing - IT Services)" > > > Second part of the question. If a file is transferred to tape (or > compressed maybe as well), does the file still count against quota, > and how much for? As on hsm tape its no longer copies=2. Same for a > compressed file, does the compressed file count as the original or > compressed size against quota? I.e. 
Could a user accessing a > compressed file suddenly go over quota by accessing the file? > Quotas account for space in the file system. If you migrate a user's file to tape, then that user is credited for the space saved. If a later access recalls the file then the user is again charged for the space. Note that HSM recall is done as "root" which bypasses the quota check -- this allows the file to be recalled even if it pushes the user past his quota limit. Compression (which is currently in beta) has the same properties. If you compress a file, then the user is credited with the space saved. When the file is uncompressed the user is again charged. Since uncompression is done by the "user" the quota check is enforced and uncompression can fail. This includes writes to a compressed file. > From: Bryan Banister > > I also would like to know what the output of the ILM "LIST" policy > reports for KB_ALLOCATED for replicated files. Is it the replicated > amount of data? > KB_ALLOCATED shows the same value that stat shows, So yes it shows the replicated amount of data actually used by the file. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Oct 28 14:48:11 2015 From: makaplan at us.ibm.com (makaplan at us.ibm.com) Date: Wed, 28 Oct 2015 09:48:11 -0500 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <201510281336.t9SDaiNa015723@d01av01.pok.ibm.com> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> <563081E9.2090605@fz-juelich.de> <201510281336.t9SDaiNa015723@d01av01.pok.ibm.com> Message-ID: <201510281448.t9SEmFsr030044@d01av02.pok.ibm.com> IF you see one or more status messages like this: [I] %2$s Parallel-piped sort and policy evaluation. %1$llu files scanned. %3$s Then you are getting the (potentially) fastest version of the GPFS inode and policy scanning algorithm. You may also want to adjust the -a and -A options of the mmapplypolicy command, as mentioned in the command documentation. Oh I see the documentation for -A is wrong in many versions of the manual. There is an attempt to automagically estimate the proper number of buckets, based on the inodes allocated count. If you want to investigate performance more I recommend you use our debug option -d 7 or set environment variable MM_POLICY_DEBUG_BITS=7 - this will show you how the work is divided among the nodes and threads. --marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From knop at us.ibm.com Thu Oct 29 14:14:58 2015 From: knop at us.ibm.com (Felipe Knop) Date: Thu, 29 Oct 2015 09:14:58 -0500 Subject: [gpfsug-discuss] Intro (new member) Message-ID: Hi, I have just joined the GPFS (Spectrum Scale) UG list. I work in the GPFS development team. I had the chance of attending the "Inaugural USA Meet the Devs" session in New York City on Oct 7, which was a valuable opportunity to hear from customers using the product. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 (845) 433-9314 T/L 293-9314 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carlz at us.ibm.com Fri Oct 30 15:14:50 2015 From: carlz at us.ibm.com (Carl Zetie) Date: Fri, 30 Oct 2015 10:14:50 -0500 Subject: [gpfsug-discuss] Making an RFE Public (and an intro) Message-ID: <201510301520.t9UFKTUP032501@d01av04.pok.ibm.com> First the intro: I am the new Product Manager joining the Spectrum Scale team, taking the place of Janet Ellsworth. I'm looking forward to meeting with you all. I also have some news about RFEs: we are working to enable you to choose whether your RFEs for Scale are private or public. I know that many of you have requested public RFEs so that other people can see and vote on RFEs. We'd like to see that too as it's very valuable information for us (as well as reducing duplicates). So here's what we're doing: Short term: If you have an existing RFE that you would like to see made Public, please email me with the ID of the RFE. You can find my email address at the foot of this message. PLEASE don't email the entire list! Medium term: We are working to allow you to choose at the time of submission whether a request will be Private or Public. Unfortunately for technical internal reasons we can't simply make the Public / Private field selectable at submission time (don't ask!), so instead we are creating two submission queues, one for Private RFEs and another for public RFEs. So when you submit an RFE in future you'll start by selecting the appropriate queue. Inside IBM, they all go into the same evaluation process. As soon as I have an update on the availability of this fix, I will share with the group. Note that even for Public requests, some fields remain Private and hidden from other viewers, e.g. Business Case (look for the "key" icon next to the field to confirm). regards, Carl Carl Zetie Product Manager for Spectrum Scale, IBM (540) 882 9353 ][ 15750 Brookhill Ct, Waterford VA 20197 carlz at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfhamano at us.ibm.com Fri Oct 30 15:29:58 2015 From: jfhamano at us.ibm.com (John Hamano) Date: Fri, 30 Oct 2015 07:29:58 -0800 Subject: [gpfsug-discuss] Making an RFE Public (and an intro) In-Reply-To: <201510301520.t9UFKTUP032501@d01av04.pok.ibm.com> References: <201510301520.t9UFKTUP032501@d01av04.pok.ibm.com> Message-ID: <201510301530.t9UFUM0M004729@d03av05.boulder.ibm.com> Hi Carl, welcome and congratulations on your new role. I am North America Brand Sales for ESS and Spectrum Scale. Let me know when you have some time next weekg to talk. From: Carl Zetie/Fairfax/IBM at IBMUS To: gpfsug-discuss at spectrumscale.org, Date: 10/30/2015 08:20 AM Subject: [gpfsug-discuss] Making an RFE Public (and an intro) Sent by: gpfsug-discuss-bounces at spectrumscale.org First the intro: I am the new Product Manager joining the Spectrum Scale team, taking the place of Janet Ellsworth. I'm looking forward to meeting with you all. I also have some news about RFEs: we are working to enable you to choose whether your RFEs for Scale are private or public. I know that many of you have requested public RFEs so that other people can see and vote on RFEs. We'd like to see that too as it's very valuable information for us (as well as reducing duplicates). So here's what we're doing: Short term: If you have an existing RFE that you would like to see made Public, please email me with the ID of the RFE. You can find my email address at the foot of this message. PLEASE don't email the entire list! 
Medium term: We are working to allow you to choose at the time of submission whether a request will be Private or Public. Unfortunately for technical internal reasons we can't simply make the Public / Private field selectable at submission time (don't ask!), so instead we are creating two submission queues, one for Private RFEs and another for public RFEs. So when you submit an RFE in future you'll start by selecting the appropriate queue. Inside IBM, they all go into the same evaluation process. As soon as I have an update on the availability of this fix, I will share with the group. Note that even for Public requests, some fields remain Private and hidden from other viewers, e.g. Business Case (look for the "key" icon next to the field to confirm). regards, Carl Carl Zetie Product Manager for Spectrum Scale, IBM (540) 882 9353 ][ 15750 Brookhill Ct, Waterford VA 20197 carlz at us.ibm.com_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From PATBYRNE at uk.ibm.com Thu Oct 1 11:09:29 2015 From: PATBYRNE at uk.ibm.com (Patrick Byrne) Date: Thu, 1 Oct 2015 10:09:29 +0000 Subject: [gpfsug-discuss] Problem Determination Message-ID: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Oct 1 13:39:25 2015 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 1 Oct 2015 12:39:25 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> Message-ID: Hi Patrick I was going to mail you directly ? but this may help spark some discussion in this area. GPFS (pardon the use of the ?old school" term ? You need something easier to type that Spectrum Scale) problem determination is one of those areas that is (sometimes) more of an art than a science. IBM publishes a PD guide, and it?s a good start but doesn?t cover all the bases. - In the GPFS log (/var/mmfs/gen/mmfslog) there are a lot of messages generated. I continue to come across ones that are not documented ? or documented poorly. EVERYTHING that ends up in ANY log needs to be documented. - The PD guide gives some basic things to look at for many of the error messages, but doesn?t go into alternative explanation for many errors. Example: When a node gets expelled, the PD guide tells you it?s a communication issue, when it fact in may be related to other things like Linux network tuning. Covering all the possible causes is hard, but you can improve this. - GPFS waiter information ? understanding and analyzing this is key to getting to the bottom of many problems. The waiter information is not well documented. You should include at least a basic guide on how to use waiter information in determining cluster problems. Related: Undocumented config options. You can come across some by doing ?mmdiag ?config?. Using some of these can help you ? or get you in trouble in the long run. If I can see the option, document it. - Make sure that all information I might come across online is accurate, especially on those sites managed by IBM. The Developerworks wiki has great information, but there is a lot of information out there that?s out of date or inaccurate. This leads to confusion. 
- The automatic deadlock detection implemented in 4.1 can be useful, but it also can be problematic in a large cluster when you get into problems. Firing off traces and taking dumps in an automated manner can cause more problems if you have a large cluster. I ended up turning it off. - GPFS doesn?t have anything setup to alert you when conditions occur that may require your attention. There are some alerting capabilities that you can customize, but something out of the box might be useful. I know there is work going on in this area. mmces ? I did some early testing on this but haven?t had a chance to upgrade my protocol nodes to the new level. Upgrading 1000?s of node across many cluster is ? challenging :-) The newer commands are a great start. I like the ability to list out events related to a particular protocol. I could go on? Feel free to contact me directly for a more detailed discussion: robert.oesterlin @ nuance.com Bob Oesterlin Sr Storage Engineer, Nuance Communications From: > on behalf of Patrick Byrne Reply-To: gpfsug main discussion list Date: Thursday, October 1, 2015 at 5:09 AM To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Fri Oct 2 17:44:24 2015 From: bbanister at jumptrading.com (Bryan Banister) Date: Fri, 2 Oct 2015 16:44:24 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB05C8CE44@CHI-EXCHANGEW1.w2k.jumptrading.com> I would like to strongly echo what Bob has stated, especially the documentation or wrong documentation, and I have in-lining some comments below. I liken GPFS to a critical care patient at the hospital. You have to check on the state regularly, know the running heart rate (e.g. waiters), the response of every component from disk, to networks, to server load, etc. 
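A small sketch of that kind of routine vital-signs check, suitable for running from cron; the file system name and cut-offs are arbitrary examples rather than recommendations:

    #!/bin/bash
    # Poll a few basic GPFS health indicators once and print them.
    export PATH=$PATH:/usr/lpp/mmfs/bin
    date
    mmgetstate -a                # daemon state on every node in the cluster
    mmlsdisk fs0 -e              # any disks in fs0 (hypothetical name) not up/ready
    mmdiag --waiters | head -20  # a sample of the currently outstanding local waiters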
When a problem occurs, running tests (such as nsdperf) to help isolate the problem quickly is crucial. Capturing GPFS trace data is also very important if the problem isn?t obvious. But then you have to wait for IBM support to parse the information and give you their analysis of the situation. It would be great to get an advanced troubleshooting document that describes how to read the output of `mmfsadm dump` commands and the GPFS trace report that is generated. Cheers, -Bryan From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-bounces at gpfsug.org] On Behalf Of Oesterlin, Robert Sent: Thursday, October 01, 2015 7:39 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Problem Determination Hi Patrick I was going to mail you directly ? but this may help spark some discussion in this area. GPFS (pardon the use of the ?old school" term ? You need something easier to type that Spectrum Scale) problem determination is one of those areas that is (sometimes) more of an art than a science. IBM publishes a PD guide, and it?s a good start but doesn?t cover all the bases. - In the GPFS log (/var/mmfs/gen/mmfslog) there are a lot of messages generated. I continue to come across ones that are not documented ? or documented poorly. EVERYTHING that ends up in ANY log needs to be documented. - The PD guide gives some basic things to look at for many of the error messages, but doesn?t go into alternative explanation for many errors. Example: When a node gets expelled, the PD guide tells you it?s a communication issue, when it fact in may be related to other things like Linux network tuning. Covering all the possible causes is hard, but you can improve this. - GPFS waiter information ? understanding and analyzing this is key to getting to the bottom of many problems. The waiter information is not well documented. You should include at least a basic guide on how to use waiter information in determining cluster problems. Related: Undocumented config options. You can come across some by doing ?mmdiag ?config?. Using some of these can help you ? or get you in trouble in the long run. If I can see the option, document it. [Bryan: Also please, please provide a way to check whether or not the configuration parameters need to be changed. I assume that there is a `mmfsadm dump` command that can tell you whether the config parameter needs to be changed, if not make one! Just stating something like ?This could be increased to XX value for very large clusters? is not very helpful. - Make sure that all information I might come across online is accurate, especially on those sites managed by IBM. The Developerworks wiki has great information, but there is a lot of information out there that?s out of date or inaccurate. This leads to confusion. [Bryan: I know that Scott Fadden is a busy man, so I would recommend helping distribute the workload of maintaining the wiki documentation. This data should be reviewed on a more regular basis, at least once for each major release I would hope, and updated or deleted if found to be out of date.] - The automatic deadlock detection implemented in 4.1 can be useful, but it also can be problematic in a large cluster when you get into problems. Firing off traces and taking dumps in an automated manner can cause more problems if you have a large cluster. I ended up turning it off. [Bryan: From what I?ve heard, IBM is actively working to make the deadlock amelioration logic better. 
I agree that firing off traces can cause more problems, and we have turned off the automated collection as well. We are going to work on enabling the collection of some data during these events to help ensure we get enough data for IBM to analyze the problem.] - GPFS doesn?t have anything setup to alert you when conditions occur that may require your attention. There are some alerting capabilities that you can customize, but something out of the box might be useful. I know there is work going on in this area. [Bryan: The GPFS callback facilities are very useful for setting up alerts, but not well documented or advertised by the GPFS manuals. I hope to see more callback capabilities added to help monitor all aspects of the GPFS cluster and file systems] mmces ? I did some early testing on this but haven?t had a chance to upgrade my protocol nodes to the new level. Upgrading 1000?s of node across many cluster is ? challenging :-) The newer commands are a great start. I like the ability to list out events related to a particular protocol. I could go on? Feel free to contact me directly for a more detailed discussion: robert.oesterlin @ nuance.com Bob Oesterlin Sr Storage Engineer, Nuance Communications From: > on behalf of Patrick Byrne Reply-To: gpfsug main discussion list Date: Thursday, October 1, 2015 at 5:09 AM To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Oct 2 17:58:41 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 2 Oct 2015 16:58:41 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com>, Message-ID: I agree on docs, particularly on mmdiag, I think things like --lroc are not documented. I'm also not sure that --network always gives accurate network stats. (we were doing some ha failure testing where we have split site in and fabrics, yet the network counters didn't change even when the local ib nsd servers were shut down). It would be nice also to have a set of Icinga/Nagios plugins from IBM, maybe in samples whcich are updated on each release with new feature checks. And not problem determination, but id really like to see an inflight non disruptive upgrade path. Particularly as we run vms off gpfs, its bot always practical or possible to move vms, so would be nice to have upgrade in flight (not suggesting this would be a quick thing to implement). Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Oesterlin, Robert [Robert.Oesterlin at nuance.com] Sent: 01 October 2015 13:39 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Problem Determination Hi Patrick I was going to mail you directly ? but this may help spark some discussion in this area. GPFS (pardon the use of the ?old school" term ? You need something easier to type that Spectrum Scale) problem determination is one of those areas that is (sometimes) more of an art than a science. IBM publishes a PD guide, and it?s a good start but doesn?t cover all the bases. - In the GPFS log (/var/mmfs/gen/mmfslog) there are a lot of messages generated. I continue to come across ones that are not documented ? or documented poorly. EVERYTHING that ends up in ANY log needs to be documented. - The PD guide gives some basic things to look at for many of the error messages, but doesn?t go into alternative explanation for many errors. Example: When a node gets expelled, the PD guide tells you it?s a communication issue, when it fact in may be related to other things like Linux network tuning. Covering all the possible causes is hard, but you can improve this. - GPFS waiter information ? understanding and analyzing this is key to getting to the bottom of many problems. The waiter information is not well documented. You should include at least a basic guide on how to use waiter information in determining cluster problems. Related: Undocumented config options. You can come across some by doing ?mmdiag ?config?. Using some of these can help you ? or get you in trouble in the long run. If I can see the option, document it. - Make sure that all information I might come across online is accurate, especially on those sites managed by IBM. The Developerworks wiki has great information, but there is a lot of information out there that?s out of date or inaccurate. This leads to confusion. - The automatic deadlock detection implemented in 4.1 can be useful, but it also can be problematic in a large cluster when you get into problems. 
Firing off traces and taking dumps in an automated manner can cause more problems if you have a large cluster. I ended up turning it off. - GPFS doesn?t have anything setup to alert you when conditions occur that may require your attention. There are some alerting capabilities that you can customize, but something out of the box might be useful. I know there is work going on in this area. mmces ? I did some early testing on this but haven?t had a chance to upgrade my protocol nodes to the new level. Upgrading 1000?s of node across many cluster is ? challenging :-) The newer commands are a great start. I like the ability to list out events related to a particular protocol. I could go on? Feel free to contact me directly for a more detailed discussion: robert.oesterlin @ nuance.com Bob Oesterlin Sr Storage Engineer, Nuance Communications From: > on behalf of Patrick Byrne Reply-To: gpfsug main discussion list Date: Thursday, October 1, 2015 at 5:09 AM To: "gpfsug-discuss at gpfsug.org" Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited From ewahl at osc.edu Fri Oct 2 19:00:46 2015 From: ewahl at osc.edu (Wahl, Edward) Date: Fri, 2 Oct 2015 18:00:46 +0000 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> Message-ID: <9DA9EC7A281AC7428A9618AFDC49049955AEB4DF@CIO-KRC-D1MBX02.osuad.osu.edu> I'm not yet in the 4.x release stream so this may be taken with a grain (or more) of salt as we say. PLEASE keep the ability of commands to set -x or dump debug when the env DEBUG=1 is set. This has been extremely useful over the years. Granted I've never worked out why sometimes we see odd little things like machines deciding they suddenly need an FPO license or one nsd server suddenly decides it's name is part of the FQDN instead of just it's hostname and only for certain commands, but it's DAMN useful. Minor issues especially can be tracked down with it. Undocumented features and logged items abound. I'd say start there. 
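For anyone who has not used that trick, a hedged example of what it looks like in practice, assuming the mm* scripts still honour DEBUG=1 as described; the command and output file are arbitrary choices:

    DEBUG=1 mmlscluster 2> /tmp/mmlscluster.trace   # the shell 'set -x' trace goes to stderr
    less /tmp/mmlscluster.trace                     # read back the expanded commands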
This is one area where it is definitely more art than science with Spectrum Scale (meh GPFS still sounds better. So does Shark. Can we go back to calling it the Shark Server Project?) Complete failure of the verbs layer and fallback to other defined networks would be nice to know about during operation. It's excellent about telling you at startup but not so much during operation, at least in 3.5. I imagine with the 'automated compatibility layer building' I'll be looking for some serious amounts of PD for the issues we _will_ see there. We frequently build against kernels we are not yet running at this site, so this needs well documented PD and resolution. Ed Wahl OSC ________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Patrick Byrne [PATBYRNE at uk.ibm.com] Sent: Thursday, October 01, 2015 6:09 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] Problem Determination Hi all, As I'm sure some of you aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them. I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com). On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated. Thanks in advance, Patrick Byrne IBM Spectrum Scale - Development Engineer IBM Systems - Manchester Lab IBM UK Limited -------------- next part -------------- An HTML attachment was scrubbed... URL: From zgiles at gmail.com Fri Oct 2 21:27:17 2015 From: zgiles at gmail.com (Zachary Giles) Date: Fri, 2 Oct 2015 16:27:17 -0400 Subject: [gpfsug-discuss] Problem Determination In-Reply-To: <9DA9EC7A281AC7428A9618AFDC49049955AEB4DF@CIO-KRC-D1MBX02.osuad.osu.edu> References: <201510011010.t91AASm6029240@d06av08.portsmouth.uk.ibm.com> <9DA9EC7A281AC7428A9618AFDC49049955AEB4DF@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: I would like to see better performance metrics / counters from GPFS. I know we already have mmpmon, which is generally really good -- I've done some fun things with it and it has been a great tool. And, I realize that there is supposedly a new monitoring framework in 4.x.. which I haven't played with yet. But, Generally it would be extremely helpful to get synchronized (across all nodes) high accuracy counters of data flow, number of waiters, page pool stats, distribution of data from one layer to another down to NSDs.. etc etc etc. 
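As a concrete starting point for that kind of collection, a hedged sketch of driving mmpmon non-interactively; the request file, delay and repeat count are illustrative only:

    # Sample local per-file-system I/O counters six times at 10-second
    # intervals, in machine-parseable (-p) form. Adding an "nlist add node1
    # node2" request to the same file would collect from a list of nodes.
    echo fs_io_s > /tmp/mmpmon.cmd
    /usr/lpp/mmfs/bin/mmpmon -p -i /tmp/mmpmon.cmd -r 6 -d 10000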
I believe many of these counters already exist, but they're hidden in some mmfsadm xx command that one needs to troll through with possible performance implications. mmpmon can do some of this, but it's only a handful of counters, it's hard to say how synchronized the counters are across nodes, and I've personally seen an mmpmon run go bad and take down a cluster. It would be nice if it were pushed out, or provided in a safe manner with the design and expectation of "log-everything forever continuously". As GSS/ESS systems start popping up, I realize they have this other monitoring framework to watch the VD throughputs.. which is great. But, that doesn't allow us to monitor more traditional types. Would be nice to monitor it all together the same way so we don't miss-out on monitoring half the infrastructure or buying a cluster with some fancy GUI that can't do what we want.. -Zach On Fri, Oct 2, 2015 at 2:00 PM, Wahl, Edward wrote: > I'm not yet in the 4.x release stream so this may be taken with a grain (or > more) of salt as we say. > > PLEASE keep the ability of commands to set -x or dump debug when the env > DEBUG=1 is set. This has been extremely useful over the years. Granted > I've never worked out why sometimes we see odd little things like machines > deciding they suddenly need an FPO license or one nsd server suddenly > decides it's name is part of the FQDN instead of just it's hostname and only > for certain commands, but it's DAMN useful. Minor issues especially can be > tracked down with it. > > Undocumented features and logged items abound. I'd say start there. This > is one area where it is definitely more art than science with Spectrum Scale > (meh GPFS still sounds better. So does Shark. Can we go back to calling it > the Shark Server Project?) > > Complete failure of the verbs layer and fallback to other defined networks > would be nice to know about during operation. It's excellent about telling > you at startup but not so much during operation, at least in 3.5. > > I imagine with the 'automated compatibility layer building' I'll be looking > for some serious amounts of PD for the issues we _will_ see there. We > frequently build against kernels we are not yet running at this site, so > this needs well documented PD and resolution. > > Ed Wahl > OSC > > > ________________________________ > From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] > on behalf of Patrick Byrne [PATBYRNE at uk.ibm.com] > Sent: Thursday, October 01, 2015 6:09 AM > To: gpfsug-discuss at gpfsug.org > Subject: [gpfsug-discuss] Problem Determination > > Hi all, > > As I'm sure some of you aware, problem determination is an area that we are > looking to try and make significant improvements to over the coming releases > of Spectrum Scale. To help us target the areas we work to improve and make > it as useful as possible I am trying to get as much feedback as I can about > different problems users have, and how people go about solving them. > > I am interested in hearing everything from day to day annoyances to problems > that have caused major frustration in trying to track down the root cause. > Where possible it would be great to hear how the problems were dealt with as > well, so that others can benefit from your experience. Feel free to reply to > the mailing list - maybe others have seen similar problems and could provide > tips for the future - or to me directly if you'd prefer > (patbyrne at uk.ibm.com). 
> > On a related note, in 4.1.1 there was a component added that monitors the > state of the various protocols that are now supported (NFS, SMB, Object). > The output from this is available with the 'mmces state' and 'mmces events' > CLIs and I would like to get feedback from anyone who has had the chance > make use of this. Is it useful? How could it be improved? We are looking at > the possibility of extending this component to cover more than just > protocols, so any feedback would be greatly appreciated. > > Thanks in advance, > > Patrick Byrne > IBM Spectrum Scale - Development Engineer > IBM Systems - Manchester Lab > IBM UK Limited > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Zach Giles zgiles at gmail.com From Luke.Raimbach at crick.ac.uk Mon Oct 5 13:57:14 2015 From: Luke.Raimbach at crick.ac.uk (Luke Raimbach) Date: Mon, 5 Oct 2015 12:57:14 +0000 Subject: [gpfsug-discuss] Independent Inode Space Limit Message-ID: Hi All, When creating an independent inode space, I see the valid range for the number of inodes is between 1024 and 4294967294. Is the ~4.2billion upper limit something that can be increased in the future? I also see that the first 1024 inodes are immediately allocated upon creation. I assume these are allocated to internal data structures and are a copy of a subset of the first 4038 inodes allocated for new file systems? It would be useful to know if these internal structures are fixed for independent filesets and if they are not, what factors determine their layout (for performance purposes). Many Thanks, Luke. Luke Raimbach? Senior HPC Data and Storage Systems Engineer, The Francis Crick Institute, Gibbs Building, 215 Euston Road, London NW1 2BE. E: luke.raimbach at crick.ac.uk W: www.crick.ac.uk The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. From usa-principal at gpfsug.org Mon Oct 5 14:55:15 2015 From: usa-principal at gpfsug.org (usa-principal-gpfsug.org) Date: Mon, 05 Oct 2015 09:55:15 -0400 Subject: [gpfsug-discuss] Final Reminder: Inaugural US "Meet the Developers" Message-ID: <9656d0110c2be4b339ec5ce662409b8e@webmail.gpfsug.org> A last reminder to check in with Janet if you have not done so already. Looking forward to this event on Wednesday this week. Best, Kristy --- Hello Everyone, Here is a reminder about our inaugural US "Meet the Developers" session. Details are below, and please send an e-mail to Janet Ellsworth (janetell at us.ibm.com) by next Friday September 18th if you wish to attend. Janet is on the product management team for Spectrum Scale and is helping with the logistics for this first event. Date: Wednesday, October 7th Place: IBM building at 590 Madison Avenue, New York City Time: 12:30 to 5 PM (Lunch will be served at 12:30, and sessions will start between 1 and 1:30 PM. Afternoon snacks will be served as well :-) Agenda IBM development architect to present the new protocols support that was released with Spectrum Scale 4.1.1 in June. IBM developer to demo future Graphical User Interface ***Member of user community to present an experience with using Spectrum Scale (still seeking volunteers for this !)*** Open Q&A with the development team We are happy to have heard from many of you so far who would like to attend. 
We still have room however, so please get in touch by the 9/18 date if you would like to attend. ***We also need someone to share an experience or use case scenario with Spectrum Scale for this event, so please let Janet know if you are willing to do that too.*** As you have likely seen, we are also working on the agenda and timing for day-long GPFS US UG event in Austin during November aligned with SC15 and there will be more details on that coming soon. From secretary at gpfsug.org Wed Oct 7 12:50:51 2015 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Wed, 07 Oct 2015 12:50:51 +0100 Subject: [gpfsug-discuss] Places available: Meet the Devs Message-ID: <813d82bd5074b90c3a67acc85a03995b@webmail.gpfsug.org> Hi All, There are still places available for the next 'Meet the Devs' event in Edinburgh on Friday 23rd October from 10:30/11am until 3/3:30pm. It's a great opportunity for you to meet with developers and talk through specific issues as well as learn more from the experts. Location: Room 2009a, Information Services, James Clerk Maxwell Building, Peter Guthrie Tait Road, Edinburgh EH9 3FD Google maps link: https://goo.gl/maps/Ta7DQ Agenda: - GUI - 4.2 Updates/show and tell - Open conversation on any areas of interest attendees may have Lunch and refreshments will be provided. Please email me (secretary at gpfsug.org) if you would like to attend including any particular topics of interest you would like to discuss. Best wishes, -- Claire O'Toole GPFS User Group Secretary +44 (0)7508 033896 www.gpfsug.org From service at metamodul.com Wed Oct 7 16:06:56 2015 From: service at metamodul.com (service at metamodul.com) Date: Wed, 07 Oct 2015 17:06:56 +0200 Subject: [gpfsug-discuss] Places available: Meet the Devs Message-ID: Hi Claire, I will attend the meeting. Hans-Joachim Ehlers MetaModul GmbH Germany Cheers Hajo Von Samsung Mobile gesendet
-------- Original message --------
From: Secretary GPFS UG
Date: 2015.10.07 13:50 (GMT+01:00)
To: gpfsug main discussion list
Subject: [gpfsug-discuss] Places available: Meet the Devs
Hi All, There are still places available for the next 'Meet the Devs' event in Edinburgh on Friday 23rd October from 10:30/11am until 3/3:30pm. It's a great opportunity for you to meet with developers and talk through specific issues as well as learn more from the experts. Location: Room 2009a, Information Services, James Clerk Maxwell Building, Peter Guthrie Tait Road, Edinburgh EH9 3FD Google maps link: https://goo.gl/maps/Ta7DQ Agenda: - GUI - 4.2 Updates/show and tell - Open conversation on any areas of interest attendees may have Lunch and refreshments will be provided. Please email me (secretary at gpfsug.org) if you would like to attend including any particular topics of interest you would like to discuss. Best wishes, -- Claire O'Toole GPFS User Group Secretary +44 (0)7508 033896 www.gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Douglas.Hughes at DEShawResearch.com Wed Oct 7 19:59:26 2015 From: Douglas.Hughes at DEShawResearch.com (Hughes, Doug) Date: Wed, 7 Oct 2015 18:59:26 +0000 Subject: [gpfsug-discuss] new member, first post Message-ID: sitting here in the US GPFS UG meeting in NYC and just found out about this list. We've been a GPFS user for many years, first with integrated DDN support, but now also with a GSS system. we have about 4PB of raw GPFS storage and 1 billion inodes. We keep our metadata on TMS ramsan for very fast policy execution for tiering and migration. We use GPFS to hold the primary source data from our custom supercomputers. We have many policies executed periodically for managing the data, including writing certain files to dedicated fast pools and then migrating the data off to wide swaths of disk for read access from cluster clients. One pain point, which I'm sure many of the rest of you have seen, restripe operations for just metadata are unnecessarily slow. If we experience a flash module failure and need to restripe, it also has to check all of the data. I have a feature request open to make metadata restripes only look at metadata (since it is on RamSan/FlashCache, this should be very fast) instead of scanning everything, which can and does take months with performance impacts. Doug Hughes D. E. Shaw Research, LLC. Sent from my android device. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at gpfsug.org Thu Oct 8 20:37:05 2015 From: chair at gpfsug.org (GPFS UG Chair (Simon Thompson)) Date: Thu, 08 Oct 2015 20:37:05 +0100 Subject: [gpfsug-discuss] User group update Message-ID: Hi, I thought I'd drop an update to the group on various admin things which have been going on behind the scenes. The first US meet the devs event was held yesterday, and I'm hoping someone who went will be preparing a blog post to cover the event a little. I know a bunch of people have joined the mailing list since then, so welcome to the group to all of those! ** User Group Engagement with IBM ** I also met with Akhtar yesterday who is the IBM VP for Technical Computing Developments (which includes Spectrum Scale). He was in the UK for a few days at the IBM Manchester Labs, so we managed to squeeze a meeting to talk a bit about the UG. I'm very pleased that Akhtar confirmed IBMs commitment to help the user group in both the UK and USA with developer support for the meet the devs and annual group meetings. 
I'd like to extend my thanks to those at IBM who are actively supporting the group in so many ways. One idea we have been mulling over is filming the talks at next year's events and then putting those on Youtube for people who can't get there. IBM have given us tentative agreement to do this, subject to a few conditions. Most importantly that the UG and IBM ensure we don't publish customer or IBM items which are NDA/not for general public consumption. I'm hopeful we can get this all approved and if we do, we'll be looking to the community to help us out (anyone got digital camera equipment we might be able to borrow, or some help with editing down afterwards?) Whilst in Manchester I also met with Patrick to talk over the various emails people have sent in about problem determination, which Patrick will be taking to the dev meeting in a few weeks. It sounds like there are some interesting ideas kicking about, so hopefully we'll get some value from the user group input. Some of the new features in 4.2 were also demo'd and for those who might not have been to a meet the devs session and are interested in the upcoming GUI, it is now in public beta, head over to developer works for more details: https://www.ibm.com/developerworks/community/forums/html/topic?id=4dc34bf1- 17d1-4dc0-af72-6dc5a3f93e82&ps=25 ** User Group Feedback ** Over the past few months, I've also been collecting feedback from people, either comments on the mailing list, or those who I've spoken to, which was all collated and sent in to IBM, we'll hopefully be getting some feedback on that in the next few weeks - there's a bunch of preliminary answers now, but a few places we still need a bit of clarification. There's also some longer term discussion going on about GPFS and cloud (in particular to those of us in scientific areas). We'll feed that back as and when we get responses we can share. We'd like to ensure that we gather as much feedback from users so that we can collectively take it to IBM, so please do continue to post comments etc to the mailing list. ** Diary Dates ** A few dates for diaries: * Meet the Devs in Edinburgh - Friday 23rd October 2015 * GPFS UG Meeting @ SC15 in Austin, USA - Sunday 15th November 2015 * GPFS UG Meeting @ Computing Insight UK, Coventry, UK - Tuesday 8th December 2015 (Note you must be registered also for CIUK) * GPFS UG Meeting May 2015 - IBM South Bank, London, UK- 17th/18th May 2016 ** User Group Admin ** Within the committee, we've been talking about how we can extend the reach of the group, so we may be reaching out to a few group members to take this forward. Of course if anyone has suggestions on how we can ensure we reach as many people as possible, please let me know, either via the mailing list of directly by email. I know there are lot of people on the mailing list who don't post (regularly), so I'd be interested to hear if you find the group mailing list discussion useful, if you feel there are barriers to asking questions, or what you'd like to see coming out of the user group - please feel free to email me directly if you'd like to comment on any of this! We've also registered spectrumscale.org to point to the user group, so you may start to see the group marketed as the Spectrum Scale User Group, but rest assured, its still the same old GPFS User Group ;-) Just a reminder that we made the mailing list so that only members can post. This was to reduce the amount of spam coming in and being held for moderation (and a few legit posts got lost this way). 
If you do want to post, but not receive the emails, you can set this as an option in the mailing list software. Finally, I've also fixed the mailing list archives, so these are now available at: http://www.gpfsug.org/pipermail/gpfsug-discuss/ Simon GPFS UG, UK Chair From L.A.Hurst at bham.ac.uk Fri Oct 9 09:25:52 2015 From: L.A.Hurst at bham.ac.uk (Laurence Alexander Hurst (IT Services)) Date: Fri, 9 Oct 2015 08:25:52 +0000 Subject: [gpfsug-discuss] User group update Message-ID: On 08/10/2015 20:37, "gpfsug-discuss-bounces at gpfsug.org on behalf of GPFS UG Chair (Simon Thompson)" wrote: >GPFS UG Meeting May 2015 - IBM South Bank, London, UK- 17th/18th May >2016 Daft question: is that 17th *and* 18th or 17th *or* 18th (presumably TBC)? Thanks, Laurence -- Laurence Hurst Research Support, IT Services, University of Birmingham From S.J.Thompson at bham.ac.uk Fri Oct 9 10:00:11 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 9 Oct 2015 09:00:11 +0000 Subject: [gpfsug-discuss] User group update In-Reply-To: References: Message-ID: Both days. May 2016 is a two day event. Simon ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Laurence Alexander Hurst (IT Services) [L.A.Hurst at bham.ac.uk] Sent: 09 October 2015 09:25 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] User group update On 08/10/2015 20:37, "gpfsug-discuss-bounces at gpfsug.org on behalf of GPFS UG Chair (Simon Thompson)" wrote: >GPFS UG Meeting May 2015 - IBM South Bank, London, UK- 17th/18th May >2016 Daft question: is that 17th *and* 18th or 17th *or* 18th (presumably TBC)? Thanks, Laurence -- Laurence Hurst Research Support, IT Services, University of Birmingham _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Sat Oct 10 14:54:22 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Sat, 10 Oct 2015 13:54:22 +0000 Subject: [gpfsug-discuss] User group update Message-ID: > >We've also registered spectrumscale.org to point to the user group, so you >may start to see the group marketed as the Spectrum Scale User Group, but >rest assured, its still the same old GPFS User Group ;-) And this is just a test mail to ensure that mail to gpfsug-discuss at spectrumscale.org gets through OK. The old address should also still work. Simon From S.J.Thompson at bham.ac.uk Sat Oct 10 14:55:55 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Sat, 10 Oct 2015 13:55:55 +0000 Subject: [gpfsug-discuss] User group update In-Reply-To: References: Message-ID: On 10/10/2015 14:54, "Simon Thompson (Research Computing - IT Services)" wrote: >> >>We've also registered spectrumscale.org to point to the user group, so >>you >>may start to see the group marketed as the Spectrum Scale User Group, but >>rest assured, its still the same old GPFS User Group ;-) > >And this is just a test mail to ensure that mail to >gpfsug-discuss at spectrumscale.org gets through OK. The old address should >also still work. And checking the old address still works fine as well. 
Simon From Robert.Oesterlin at nuance.com Tue Oct 13 03:03:45 2015 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 13 Oct 2015 02:03:45 +0000 Subject: [gpfsug-discuss] User group Meeting at SC15 - Registration Message-ID: We?d like to have all those attending the user group meeting at SC15 to register ? details are below. Thanks to IBM for getting the space and arranging all the details. I?ll post a more detailed agenda soon. Looking forward to meeting everyone! Location: JW Marriott 110 E 2nd Street Austin, Texas United States Date and Time: Sunday Nov 15, 1:00 PM?5:30 PM Agenda: - Latest IBM Spectrum Scale enhancements - Future directions and roadmap* (NDA required) - Newer usecases and User presentations Registration: Please register at the below link to book your seat. https://www-950.ibm.com/events/wwe/grp/grp017.nsf/v17_agenda?openform&seminar=99QNTNES&locale=en_US&S_TACT=sales Bob Oesterlin Sr Storage Engineer, Nuance Communications 507-269-0413 -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at spectrumscale.org Sat Oct 17 20:51:50 2015 From: chair at spectrumscale.org (GPFS UG Chair (Simon Thompson)) Date: Sat, 17 Oct 2015 20:51:50 +0100 Subject: [gpfsug-discuss] Blog on USA Meet the Devs Message-ID: Hi All, Kirsty wrote a blog post on the inaugural meet the devs in the USA. You can find it here: http://www.spectrumscale.org/inaugural-usa-meet-the-devs/ Thanks to Kristy, Bob and Pallavi for organising, the IBM devs and the group members giving talks. Simon From Tomasz.Wolski at ts.fujitsu.com Wed Oct 21 15:23:54 2015 From: Tomasz.Wolski at ts.fujitsu.com (Wolski, Tomasz) Date: Wed, 21 Oct 2015 16:23:54 +0200 Subject: [gpfsug-discuss] Intro Message-ID: Hi All, My name is Tomasz Wolski and I?m development engineer at Fujitsu Technology Solutions in Lodz, Poland. We?ve been using GPFS in our main product, which is ETERNUS CS8000, for many years now. GPFS helps us to build a consolidation of backup and archiving solutions for our end customers. We make use of GPFS snapshots, NIFS/CIFS services, GPFS API for our internal components and many many more .. :) My main responsibility, except developing new features for our system, is integration new GPFS versions into our system and bug tracking GPFS issues. Best regards, Tomasz Wolski -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Oct 23 15:04:49 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Fri, 23 Oct 2015 14:04:49 +0000 Subject: [gpfsug-discuss] Independent Inode Space Limit Message-ID: >When creating an independent inode space, I see the valid range for the >number of inodes is between 1024 and 4294967294. > >Is the ~4.2billion upper limit something that can be increased in the >future? > >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Anyone have any thoughts on this? Anyone from IBM know? 
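For anyone wanting to poke at this in the meantime: the per-fileset limit is whatever was set at creation time and can be raised later, and the current allocation is visible per inode space. A minimal sketch, assuming a file system called gpfs0 and a fileset called fset1 (both names made up):

  # create an independent fileset with a 10M inode limit, 1M preallocated
  mmcrfileset gpfs0 fset1 --inode-space new --inode-limit 10485760:1048576
  # raise the limit later if it starts to fill
  mmchfileset gpfs0 fset1 --inode-limit 20971520
  # show per-fileset inode space, allocated and maximum inodes
  mmlsfileset gpfs0 -L
  mmdf gpfs0 -F

None of that answers the real question though, which is whether the ~4.2 billion ceiling itself can be lifted, or what the first 1024 inodes are used for.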
Thanks Simon From sfadden at us.ibm.com Fri Oct 23 13:42:14 2015 From: sfadden at us.ibm.com (Scott Fadden) Date: Fri, 23 Oct 2015 07:42:14 -0500 Subject: [gpfsug-discuss] Independent Inode Space Limit In-Reply-To: References: Message-ID: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> GPFS limits the max inodes based on metadata space. Add more metadata space and you should be able to add more inodes. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale From: "Simon Thompson (Research Computing - IT Services)" To: gpfsug main discussion list Date: 10/23/2015 09:05 AM Subject: Re: [gpfsug-discuss] Independent Inode Space Limit Sent by: gpfsug-discuss-bounces at spectrumscale.org >When creating an independent inode space, I see the valid range for the >number of inodes is between 1024 and 4294967294. > >Is the ~4.2billion upper limit something that can be increased in the >future? > >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Anyone have any thoughts on this? Anyone from IBM know? Thanks Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From sfadden at us.ibm.com Fri Oct 23 13:42:14 2015 From: sfadden at us.ibm.com (Scott Fadden) Date: Fri, 23 Oct 2015 07:42:14 -0500 Subject: [gpfsug-discuss] Independent Inode Space Limit In-Reply-To: References: Message-ID: <201510231442.t9NEgQ0M024262@d01av05.pok.ibm.com> GPFS limits the max inodes based on metadata space. Add more metadata space and you should be able to add more inodes. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale From: "Simon Thompson (Research Computing - IT Services)" To: gpfsug main discussion list Date: 10/23/2015 09:05 AM Subject: Re: [gpfsug-discuss] Independent Inode Space Limit Sent by: gpfsug-discuss-bounces at spectrumscale.org >When creating an independent inode space, I see the valid range for the >number of inodes is between 1024 and 4294967294. > >Is the ~4.2billion upper limit something that can be increased in the >future? > >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Anyone have any thoughts on this? Anyone from IBM know? Thanks Simon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From wsawdon at us.ibm.com Fri Oct 23 16:25:33 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Fri, 23 Oct 2015 08:25:33 -0700 Subject: [gpfsug-discuss] Independent Inode Space Limit In-Reply-To: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> References: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> Message-ID: <201510231525.t9NFPr1G010768@d03av04.boulder.ibm.com> >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Independent filesets don't have the internal structures that the file system has. Other than the fileset's root directory all of the remaining inodes can be allocated to user files. Inodes are always allocated in full metadata blocks. The inodes for an independent fileset are allocated in their own blocks. This makes fileset snapshots more efficient, since a copy-on-write of the block of inodes will only copy inodes in the fileset. The inode blocks for all filesets are in the same inode file, but the blocks for each independent fileset are strided, making them easy to prefetch for policy scans. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From wsawdon at us.ibm.com Fri Oct 23 16:25:33 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Fri, 23 Oct 2015 08:25:33 -0700 Subject: [gpfsug-discuss] Independent Inode Space Limit In-Reply-To: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> References: <201510231442.t9NEgt6b026687@d03av05.boulder.ibm.com> Message-ID: <201510231525.t9NFPv9P004320@d01av03.pok.ibm.com> >I also see that the first 1024 inodes are immediately allocated upon >creation. I assume these are allocated to internal data structures and >are a copy of a subset of the first 4038 inodes allocated for new file >systems? It would be useful to know if these internal structures are >fixed for independent filesets and if they are not, what factors >determine their layout (for performance purposes). Independent filesets don't have the internal structures that the file system has. Other than the fileset's root directory all of the remaining inodes can be allocated to user files. Inodes are always allocated in full metadata blocks. The inodes for an independent fileset are allocated in their own blocks. This makes fileset snapshots more efficient, since a copy-on-write of the block of inodes will only copy inodes in the fileset. The inode blocks for all filesets are in the same inode file, but the blocks for each independent fileset are strided, making them easy to prefetch for policy scans. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From kallbac at iu.edu Mon Oct 26 02:38:52 2015 From: kallbac at iu.edu (Kallback-Rose, Kristy A) Date: Sun, 25 Oct 2015 22:38:52 -0400 Subject: [gpfsug-discuss] ILM and Backup Question Message-ID: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think its amazing how everyone does these things in their own slightly different way. 
I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!? Yes, please! I?m *very* interested in what others are doing. We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS Integration?we have had HPSS for a very long time), but I?m interested what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and with what regularity, what resources it takes, etc. If you have anything going on at your site that?s relevant, can you please share? Thanks, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 495 bytes Desc: Message signed with OpenPGP using GPGMail URL: From st.graf at fz-juelich.de Mon Oct 26 08:43:33 2015 From: st.graf at fz-juelich.de (Stephan Graf) Date: Mon, 26 Oct 2015 09:43:33 +0100 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: <562DE7B5.7080303@fz-juelich.de> Hi! We at J?lich Supercomputing Centre have two ILM managed file systems (GPFS and HSM from TSM). #50 mio files + 10 PB data on tape #30 mio files + 8 PB data on tape For backup we use mmbackup (dsmc) for the user HOME directory (no ILM) #120 mio files => 3 hours get candidate list + x hour backup We use also mmbackup for the ILM managed filesystem. Policy: the file must be backed up first before migrated to tape 2-3 hour for candidate list + x hours/days/weeks backups (!!!) -> a metadata change (e.g. renaming a directory by the user) enforces a new backup of the files which causes a very expensive tape inline copy! Greetings from J?lich, Germany Stephan On 10/26/15 03:38, Kallback-Rose, Kristy A wrote: Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think its amazing how everyone does these things in their own slightly different way. I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!? Yes, please! I?m *very* interested in what others are doing. We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS Integration?we have had HPSS for a very long time), but I?m interested what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and with what regularity, what resources it takes, etc. If you have anything going on at your site that?s relevant, can you please share? Thanks, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. 
Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Douglas.Hughes at DEShawResearch.com Mon Oct 26 13:42:47 2015 From: Douglas.Hughes at DEShawResearch.com (Hughes, Doug) Date: Mon, 26 Oct 2015 13:42:47 +0000 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: We have all of our GPFSmetadata on FlashCache devices (nee Ramsan) and that helps a lot. We also have our data going into monotonically increasing buckets of about 30TB that we call lockers (e.g. locker100, locker101, locker102), with 1 primary active at a time. We have an hourly job that scans the most recent 2 lockers (taked about 45 seconds each) to generate a file list using the ILM 'LIST' policy of all files that have been modified or created in the last hour. That goes to a file that has all of the names which then trickles to a custom backup daemon that has up to 10 threads for rsyncing these over to our HSM server (running GPFS/TSM space management). From there things automatically get backed up and archived. Not all hourlies are necessarily complete (we can't guarantee that nobody is still hanging on to $lockernum-2 for instance), so we have a daily that scans the entire 3PB to find anything created/updated in the last 24 hours and does an rsync on that. There's no harm in duplication of hourlies from the rsync perspective because rsync takes care of that (already exists on destination). The daily job takes about 45 minutes. Needless to say it would be impossible without metadata on a fast flash device. Sent from my android device. -----Original Message----- From: "Kallback-Rose, Kristy A" To: gpfsug main discussion list Sent: Sun, 25 Oct 2015 22:39 Subject: [gpfsug-discuss] ILM and Backup Question Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think its amazing how everyone does these things in their own slightly different way. I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!? Yes, please! I?m *very* interested in what others are doing. We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS Integration?we have had HPSS for a very long time), but I?m interested what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and with what regularity, what resources it takes, etc. If you have anything going on at your site that?s relevant, can you please share? Thanks, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University -------------- next part -------------- An HTML attachment was scrubbed... URL: From Douglas.Hughes at DEShawResearch.com Mon Oct 26 13:42:47 2015 From: Douglas.Hughes at DEShawResearch.com (Hughes, Doug) Date: Mon, 26 Oct 2015 13:42:47 +0000 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: We have all of our GPFSmetadata on FlashCache devices (nee Ramsan) and that helps a lot. We also have our data going into monotonically increasing buckets of about 30TB that we call lockers (e.g. 
locker100, locker101, locker102), with 1 primary active at a time. We have an hourly job that scans the most recent 2 lockers (taked about 45 seconds each) to generate a file list using the ILM 'LIST' policy of all files that have been modified or created in the last hour. That goes to a file that has all of the names which then trickles to a custom backup daemon that has up to 10 threads for rsyncing these over to our HSM server (running GPFS/TSM space management). From there things automatically get backed up and archived. Not all hourlies are necessarily complete (we can't guarantee that nobody is still hanging on to $lockernum-2 for instance), so we have a daily that scans the entire 3PB to find anything created/updated in the last 24 hours and does an rsync on that. There's no harm in duplication of hourlies from the rsync perspective because rsync takes care of that (already exists on destination). The daily job takes about 45 minutes. Needless to say it would be impossible without metadata on a fast flash device. Sent from my android device. -----Original Message----- From: "Kallback-Rose, Kristy A" To: gpfsug main discussion list Sent: Sun, 25 Oct 2015 22:39 Subject: [gpfsug-discuss] ILM and Backup Question Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think its amazing how everyone does these things in their own slightly different way. I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!? Yes, please! I?m *very* interested in what others are doing. We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS Integration?we have had HPSS for a very long time), but I?m interested what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and with what regularity, what resources it takes, etc. If you have anything going on at your site that?s relevant, can you please share? Thanks, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Oct 26 20:15:26 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Mon, 26 Oct 2015 20:15:26 +0000 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: Hi Kristy, Yes thanks for picking this up. So we (UoB) have 3 GPFS environments, each with different approaches. 1. OpenStack (GPFS as infrastructure) - we don't back this up at all. Partly this is because we are still in pilot phase, and partly because we also have ~7PB CEPH over 4 sites for this project, and the longer term aim is for us to ensure data sets and important VM images are copied into the CEPH store (and then replicated to at least 1 other site). We have some challenges with this, how should we do this? We're sorta thinging about maybe going down the irods route for this, policy scan the FS maybe, add xattr onto important data, and use that to get irods to send copies into CEPH (somehow). So this would be a bit of a hybrid home-grown solution going on here. Anyone got suggestions about how to approach this? I know IBM are now an irods consortium member, so any magic coming from IBM to integrate GFPS and irods? 2. HPC. 
We differentiate on our HPC file-system between backed up and non backed up space. Mostly its non backed up, where we encourage users to keep scratch data sets. We provide a small(ish) home directory which is backed up with TSM to tape, and also backup applications and system configs of the system. We use a bunch of jobs to sync some configs into local git which also is stored in the backed up part of the FS, so things like switch configs, icinga config can be backed up sanely. 3. Research Data Storage. This is a large bulk data storage solution. So far its not actually that large (few hundred TB), so we take the traditional TSM back to tape approach (its also sync replicated between data centres). We're already starting to see some possible slowness on this with data ingest and we've only just launched the service. Maybe that is a cause of launching that we suddenly see high data ingest. We are also experimenting with HSM to tape, but other than that we have no other ILM policies - only two tiers of disk, SAS for metadata and NL-SAS for bulk data. I'd like to see a flash tier in there for Metadata, which would free SAS drives and so we might be more into ILM policies. We have some more testing with snapshots to do, and have some questions about recovery of HSM files if the FS is snapshotted. Anyone any experience with this with 4.1 upwards versions of GPFS? Straight TSM backup for us means we can end up with 6 copies of data - once per data centre, backup + offsite backup tape set, HSM pool + offsite copy of HSM pool. (If an HSM tape fails, how do we know what to restore from backup? Hence we make copies of the HSM tapes as well). As our backups run on TSM, it uses the policy engine and mmbackup, so we only backup changes and new files, and never backup twice from the FS. Does anyone know how TSM backups handle XATTRs? This is one of the questions that was raised at meet the devs. Or even other attributes like immutability, as unless you are in complaint mode, its possible for immutable files to be deleted in some cases. In fact this is an interesting topic, it just occurred to me, what happens if your HSM tape fails and it contained immutable files. Would it be possible to recover these files if you don't have a copy of the HSM tape? - can you do a synthetic recreate of the TSM HSM tape from backups? We typically tell users that backups are for DR purposes, but that we'll make efforts to try and restore files subject to resource availability. Is anyone using SOBAR? What is your rationale for this? I can see that at scale, there are lot of benefits to this. But how do you handle users corrupting/deleting files etc? My understanding of SOBAR is that it doesn't give you the same ability to recover versions of files, deletions etc that straight TSM backup does. (this is something I've been meaning to raise for a while here). So what do others do? Do you have similar approaches to not backing up some types of data/areas? Do you use TSM or home-grown solutions? Or even other commercial backup solutions? What are your rationales for making decisions on backup approaches? Has anyone built their own DMAPI type interface for doing these sorts of things? Snapshots only? Do you allow users to restore themselves? If you are using ILM, are you doing it with straight policy, or is TSM playing part of the game? (If people want to comment anonymously on this without committing their company on list, happy to take email to the chair@ address and forward on anonymously to the group). 
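For anyone tempted by the roll-your-own route Doug described earlier in the thread, a rough sketch of the "find what changed and hand it to a copy tool" step might look like the below - the paths, node names and the one-day window are placeholders, not a recommendation:

  cat > /tmp/changed.pol <<'EOF'
  RULE EXTERNAL LIST 'changed' EXEC ''
  RULE 'find-recent' LIST 'changed'
    WHERE (CURRENT_TIMESTAMP - MODIFICATION_TIME) <= INTERVAL '1' DAYS
  EOF
  mmapplypolicy /gpfs/bulk -P /tmp/changed.pol -I defer \
    -f /tmp/changed -g /gpfs/.policytmp -N nsd01,nsd02
  # the generated file list is then fed to rsync, dsmc or whatever else you trust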
Simon On 26/10/2015, 02:38, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Kallback-Rose, Kristy A" wrote: >Simon wrote recently in the GPFS UG Blog: "We also got into discussion on >backup and ILM, and I think its amazing how everyone does these things in >their own slightly different way. I think this might be an interesting >area for discussion over on the group mailing list. There's a lot of >options and different ways to do things!? > >Yes, please! I?m *very* interested in what others are doing. > >We (IU) are currently doing a POC with GHI for DR backups (GHI=GPFS HPSS >Integration?we have had HPSS for a very long time), but I?m interested >what others are doing with either ILM or other methods to brew their own >backup solutions, how much they are backing up and with what regularity, >what resources it takes, etc. > >If you have anything going on at your site that?s relevant, can you >please share? > >Thanks, >Kristy > >Kristy Kallback-Rose >Manager, Research Storage >Indiana University From wsawdon at us.ibm.com Mon Oct 26 21:12:55 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Mon, 26 Oct 2015 13:12:55 -0800 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <562DE7B5.7080303@fz-juelich.de> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> Message-ID: <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> > From: Stephan Graf > > For backup we use mmbackup (dsmc) > for the user HOME directory (no ILM) > #120 mio files => 3 hours get candidate list + x hour backup That seems rather slow. What version of GPFS are you running? How many nodes are you using? Are you using a "-g global shared directory"? The original mmapplypolicy code was targeted to a single node, so by default it still runs on a single node and you have to specify -N to run it in parallel. When you run multi-node there is a "-g" option that defines a global shared directory that must be visible to all nodes specified in the -N list. Using "-g" with "-N" enables a scale-out parallel algorithm that substantially reduces the time for candidate selection. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From wsawdon at us.ibm.com Mon Oct 26 22:22:58 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Mon, 26 Oct 2015 14:22:58 -0800 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> Message-ID: <201510262224.t9QMOwRO006986@d03av03.boulder.ibm.com> > From: "Simon Thompson (Research Computing - IT Services)" > > Does anyone know how TSM backups handle XATTRs? TSM capture XATTRs and ACLs in an opaque "blob" using gpfs_fgetattrs. Unfortunately, TSM stores the opaque blob with the file data. Changes to the blob require the data to be backed up again. > Or even other attributes like immutability, Immutable files may be backed up and restored as immutable files. Immutability is restored after the data has been restored. > can you do a synthetic recreate of the TSM HSM tape from backups? TSM stores data from backups and data from HSM in different pools. A file that is both HSM'ed and backed up will have at least two copies of data off-line. I suspect that losing a tape from the HSM pool will have no effect on the backup pool, but you should verify that with someone from TSM. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From st.graf at fz-juelich.de Tue Oct 27 07:03:19 2015 From: st.graf at fz-juelich.de (Stephan Graf) Date: Tue, 27 Oct 2015 08:03:19 +0100 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> Message-ID: <562F21B7.8040007@fz-juelich.de> We are running the mmbackup on an AIX system oslevel -s 6100-07-10-1415 Current GPFS build: "4.1.0.8 ". So we only use one node for the policy run. Stephan On 10/26/15 22:12, Wayne Sawdon wrote: > From: Stephan Graf > > For backup we use mmbackup (dsmc) > for the user HOME directory (no ILM) > #120 mio files => 3 hours get candidate list + x hour backup That seems rather slow. What version of GPFS are you running? How many nodes are you using? Are you using a "-g global shared directory"? The original mmapplypolicy code was targeted to a single node, so by default it still runs on a single node and you have to specify -N to run it in parallel. When you run multi-node there is a "-g" option that defines a global shared directory that must be visible to all nodes specified in the -N list. Using "-g" with "-N" enables a scale-out parallel algorithm that substantially reduces the time for candidate selection. -Wayne _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From kraemerf at de.ibm.com Tue Oct 27 09:02:52 2015 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Tue, 27 Oct 2015 10:02:52 +0100 Subject: [gpfsug-discuss] Spectrum Scale v4.2 In-Reply-To: References: Message-ID: <201510270904.t9R940k4019623@d06av11.portsmouth.uk.ibm.com> see "IBM Spectrum Scale V4.2 delivers simple, efficient,and intelligent data management for highperformance,scale-out storage" http://www.ibm.com/common/ssi/rep_ca/8/897/ENUS215-398/ENUS215-398.PDF Frank Kraemer IBM Consulting IT Specialist / Client Technical Architect Hechtsheimer Str. 2, 55131 Mainz mailto:kraemerf at de.ibm.com voice: +49171-3043699 IBM Germany -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jonathan at buzzard.me.uk Tue Oct 27 10:47:43 2015 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 27 Oct 2015 10:47:43 +0000 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <201510262224.t9QMOwRO006986@d03av03.boulder.ibm.com> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <201510262224.t9QMOwRO006986@d03av03.boulder.ibm.com> Message-ID: <1445942863.17909.89.camel@buzzard.phy.strath.ac.uk> On Mon, 2015-10-26 at 14:22 -0800, Wayne Sawdon wrote: [SNIP] > > > > can you do a synthetic recreate of the TSM HSM tape from backups? > > TSM stores data from backups and data from HSM in different pools. A > file that is both HSM'ed and backed up will have at least two copies > of data off-line. I suspect that losing a tape from the HSM pool will > have no effect on the backup pool, but you should verify that with > someone from TSM. > I am pretty sure that you have to restore the files first from backup, and it is a manual process. Least it was for me when a HSM tape went bad in the past. Had to use TSM to generate a list of the files on the HSM tape, and then feed that in to a dsmc restore, before doing a reconcile and removing the tape from the library for destruction. Finally all the files where punted back to tape. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From wsawdon at us.ibm.com Tue Oct 27 15:25:02 2015 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Tue, 27 Oct 2015 07:25:02 -0800 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <562F21B7.8040007@fz-juelich.de> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> Message-ID: <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> > From: Stephan Graf > We are running the mmbackup on an AIX system > oslevel -s > 6100-07-10-1415 > Current GPFS build: "4.1.0.8 ". > > So we only use one node for the policy run. > Even on one node you should see a speedup using -g and -N. -Wayne -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Oct 27 17:28:00 2015 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 27 Oct 2015 17:28:00 +0000 Subject: [gpfsug-discuss] Quotas, replication and hsm Message-ID: Hi, If we have replication enabled on a file, does the size from ls -l or du return the actual file size, or the replicated file size (I.e. Twice the actual size)?. >From experimentation, it appears to be double the actual size, I.e. Taking into account replication of 2. This appears to mean that quotas have to be double what we actually want to take account of the replication factor. Is this correct? Second part of the question. If a file is transferred to tape (or compressed maybe as well), does the file still count against quota, and how much for? As on hsm tape its no longer copies=2. Same for a compressed file, does the compressed file count as the original or compressed size against quota? I.e. Could a user accessing a compressed file suddenly go over quota by accessing the file? 
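For reference, the sort of comparison behind the "double the size" observation (file and file system names below are just examples, and --apparent-size assumes GNU du):

  # blocks actually allocated (counts both replicas) vs. logical file size
  ls -ls /gpfs/gpfs0/somefile
  du -k /gpfs/gpfs0/somefile
  du -k --apparent-size /gpfs/gpfs0/somefile
  # quota reporting, which appears to count allocated blocks
  mmlsquota -u someuser gpfs0
  mmrepquota -u gpfs0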
Thanks Simon From Robert.Oesterlin at nuance.com Tue Oct 27 19:48:04 2015 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 27 Oct 2015 19:48:04 +0000 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs Message-ID: <4E539EE4-596B-441C-9E60-46072E567765@nuance.com> With Spectrum Scale 4.2 announced, can anyone from IBM comment on what the outlook/process is for fixes and PTFs? When 4.1.1 came out, 4.1.0.X more or less dies, with 4.1.0.8 being the last level ? yes? Then move to 4.1.1 With 4.1.1 ? we are now at 4.1.1-2 and 4.2 is going to GA on 11/20/2015 Is the plan to ?encourage? the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future? Bob Oesterlin Sr Storage Engineer, Nuance Communications 507-269-0413 -------------- next part -------------- An HTML attachment was scrubbed... URL: From st.graf at fz-juelich.de Wed Oct 28 08:06:01 2015 From: st.graf at fz-juelich.de (Stephan Graf) Date: Wed, 28 Oct 2015 09:06:01 +0100 Subject: [gpfsug-discuss] ILM and Backup Question In-Reply-To: <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> Message-ID: <563081E9.2090605@fz-juelich.de> Hi Wayne! We are using -g, and we only want to run it on one node, so we don't use the -N option. Stephan On 10/27/15 16:25, Wayne Sawdon wrote: > From: Stephan Graf > We are running the mmbackup on an AIX system > oslevel -s > 6100-07-10-1415 > Current GPFS build: "4.1.0.8 ". > > So we only use one node for the policy run. > Even on one node you should see a speedup using -g and -N. -Wayne _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dan.Foster at bristol.ac.uk Wed Oct 28 10:06:10 2015 From: Dan.Foster at bristol.ac.uk (Dan Foster) Date: Wed, 28 Oct 2015 10:06:10 +0000 Subject: [gpfsug-discuss] Quotas, replication and hsm In-Reply-To: References: Message-ID: On 27 October 2015 at 17:28, Simon Thompson (Research Computing - IT Services) wrote: > Hi, > > If we have replication enabled on a file, does the size from ls -l or du return the actual file size, or the replicated file size (I.e. Twice the actual size)?. > > From experimentation, it appears to be double the actual size, I.e. Taking into account replication of 2. 
> > This appears to mean that quotas have to be double what we actually want to take account of the replication factor. > > Is this correct? This is what we obverse here by default and currently have to double our fileset quotas to take this is to account on replicated filesystems. You've reminded me that I was going to ask this list if it's possible to report the un-replicated sizes? While the quota management is only a slight pain, what's reported to the user is more of a problem for us(e.g. via SMB share / df ). We're considering replicating a lot more of our filesystems and it would be useful if it didn't appear that everyones quotas had just doubled overnight. Thanks, Dan. -- Dan Foster | Senior Storage Systems Administrator | IT Services From duersch at us.ibm.com Wed Oct 28 12:47:52 2015 From: duersch at us.ibm.com (Steve Duersch) Date: Wed, 28 Oct 2015 08:47:52 -0400 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs Message-ID: >>Is the plan to ?encourage? the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future? IBM will continue to create PTFs for the 4.1.1 stream. Steve Duersch Spectrum Scale (GPFS) FVTest IBM Poughkeepsie, New York -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Oct 28 13:06:56 2015 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 28 Oct 2015 13:06:56 +0000 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs In-Reply-To: References: Message-ID: Hi Steve Thanks ? that?s puzzling (surprising?) given that 4.1.1 hasn?t really been out that long. (less than 6 months) I?m in a position of deciding of what my upgrade path and timeline should be. If I?m at 4.1.0.X and want to upgrade all my clusters, the ?safer? bet is probably 4.1.1-X. but all the new features are going to end up on the 4.2.X. If 4.2 is going to GA in November, perhaps it?s better to wait for the first 4.2 PTF package. Bob Oesterlin Sr Storage Engineer, Nuance Communications 507-269-0413 From: > on behalf of Steve Duersch > Reply-To: gpfsug main discussion list > Date: Wednesday, October 28, 2015 at 7:47 AM To: "gpfsug-discuss at spectrumscale.org" > Subject: Re: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs IBM will continue to create PTFs for the 4.1.1 stream. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Oct 28 13:09:52 2015 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 28 Oct 2015 13:09:52 +0000 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs In-Reply-To: References: Message-ID: <6AB4198E-DE7C-4F5D-9C3A-0067C85D1AE0@vanderbilt.edu> All, What about the 4.1.0-x stream? We?re on 4.1.0-8 and will soon be applying an efix to it to take care of the snapshot deletion and ?quotas are wrong? bugs. We?ve also go no immediate plans to go to either 4.1.1-x or 4.2 until they?ve had a chance to ? mature. It?s not that big of a deal - I don?t mind running on the efix for a while. Just curious. Thanks? Kevin On Oct 28, 2015, at 7:47 AM, Steve Duersch > wrote: >>Is the plan to ?encourage? the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future? IBM will continue to create PTFs for the 4.1.1 stream. 
Steve Duersch Spectrum Scale (GPFS) FVTest IBM Poughkeepsie, New York _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Oct 28 13:15:30 2015 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 28 Oct 2015 13:15:30 +0000 Subject: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs In-Reply-To: <6AB4198E-DE7C-4F5D-9C3A-0067C85D1AE0@vanderbilt.edu> References: <6AB4198E-DE7C-4F5D-9C3A-0067C85D1AE0@vanderbilt.edu> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB05CF6CF0@CHI-EXCHANGEW1.w2k.jumptrading.com> IBM has stated that there will no longer be PTF releases for 4.1.0, and that 4.1.0-8 is the last PTF release. Thus you?ll have to choose between upgrading to 4.1.1 (which has the latest GPFS Protocols feature, hence the numbering change), or wait and go with the 4.2 release. I heard rumor from somebody at IBM (honestly can?t remember who) that the first 3 releases of any major release has some additional debugging turned up, which is turned off after on the fourth PTF release and those going forward. Does anybody at IBM want to confirm or deny this rumor? I?m also leery of going with the first major release of GPFS (or any software, like RHEL 7.0 for instance). Thanks, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Buterbaugh, Kevin L Sent: Wednesday, October 28, 2015 8:10 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Spectrum Scale 4.2 - upgrade and ongoing PTFs All, What about the 4.1.0-x stream? We?re on 4.1.0-8 and will soon be applying an efix to it to take care of the snapshot deletion and ?quotas are wrong? bugs. We?ve also go no immediate plans to go to either 4.1.1-x or 4.2 until they?ve had a chance to ? mature. It?s not that big of a deal - I don?t mind running on the efix for a while. Just curious. Thanks? Kevin On Oct 28, 2015, at 7:47 AM, Steve Duersch > wrote: >>Is the plan to ?encourage? the upgrade to 4.2, meaning if you want fixes and are at 4.1.1-x, you move to 4.2, or will IBM continue to PTF the 4.1.1 stream for the foreseeable future? IBM will continue to create PTFs for the 4.1.1 stream. Steve Duersch Spectrum Scale (GPFS) FVTest IBM Poughkeepsie, New York _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. 
From bbanister at jumptrading.com Wed Oct 28 13:25:27 2015
From: bbanister at jumptrading.com (Bryan Banister)
Date: Wed, 28 Oct 2015 13:25:27 +0000
Subject: [gpfsug-discuss] Quotas, replication and hsm
Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB05CF6E05@CHI-EXCHANGEW1.w2k.jumptrading.com>

I'm not sure what kind of report you're looking for, but the `du` command has an "--apparent-size" option with this description:

    print apparent sizes, rather than disk usage; although the apparent size is usually smaller, it may be larger due to holes in ('sparse') files, internal fragmentation, indirect blocks, and the like

This can be used to report the logical size of files rather than the space they actually occupy. I think that mmrepquota and mmlsquota show twice the actual file size because of the replication, but somebody correct me if I'm mistaken.

I would also like to know what the ILM "LIST" policy reports for KB_ALLOCATED on replicated files. Is it the replicated amount of data?

Thanks,
-Bryan

-----Original Message-----
From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Dan Foster
Sent: Wednesday, October 28, 2015 5:06 AM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Quotas, replication and hsm

On 27 October 2015 at 17:28, Simon Thompson (Research Computing - IT Services) wrote:
> Hi,
>
> If we have replication enabled on a file, does the size from ls -l or du return the actual file size, or the replicated file size (i.e. twice the actual size)?
>
> From experimentation, it appears to be double the actual size, i.e. taking into account replication of 2.
>
> This appears to mean that quotas have to be double what we actually want, to take account of the replication factor.
>
> Is this correct?

This is what we observe here by default, and we currently have to double our fileset quotas to take this into account on replicated filesystems.

You've reminded me that I was going to ask this list whether it's possible to report the un-replicated sizes. While the quota management is only a slight pain, what's reported to the user is more of a problem for us (e.g. via SMB share / df). We're considering replicating a lot more of our filesystems, and it would be useful if it didn't appear that everyone's quotas had just doubled overnight.

Thanks, Dan.

--
Dan Foster | Senior Storage Systems Administrator | IT Services
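To make the behaviour described in this thread concrete, here is a minimal sketch on a file system created with data replication of 2 (-r 2); the paths, fileset name and figures are made up, and the exact mmlsquota output varies by release:

    # write a 10 MiB test file
    dd if=/dev/zero of=/gpfs/fs0/test/10m bs=1M count=10

    ls -l /gpfs/fs0/test/10m                   # logical size: 10485760 bytes
    du -k /gpfs/fs0/test/10m                   # ~20480 KiB allocated (both copies)
    du -k --apparent-size /gpfs/fs0/test/10m   # ~10240 KiB (un-replicated size)

    mmlsquota -j test fs0                      # block usage also counts both copies

On the systems described in this thread, stat/ls report the un-replicated file size, while the allocated-block figures that du, mmlsquota and mmrepquota work from include every copy.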
From wsawdon at us.ibm.com Wed Oct 28 13:36:27 2015
From: wsawdon at us.ibm.com (Wayne Sawdon)
Date: Wed, 28 Oct 2015 05:36:27 -0800
Subject: [gpfsug-discuss] ILM and Backup Question
In-Reply-To: <563081E9.2090605@fz-juelich.de>
References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> <563081E9.2090605@fz-juelich.de>
Message-ID: <201510281336.t9SDaiNa015723@d01av01.pok.ibm.com>

You have to use both options, even if -N is only the local node.

Sorry,
-Wayne

From: Stephan Graf
Date: 10/28/2015 01:06 AM
Subject: Re: [gpfsug-discuss] ILM and Backup Question
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hi Wayne!

We are using -g, and we only want to run it on one node, so we don't use the -N option.

Stephan

On 10/27/15 16:25, Wayne Sawdon wrote:
> > From: Stephan Graf
> > We are running the mmbackup on an AIX system
> > oslevel -s
> > 6100-07-10-1415
> > Current GPFS build: "4.1.0.8"
> >
> > So we only use one node for the policy run.
>
> Even on one node you should see a speedup using -g and -N.
>
> -Wayne
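As a concrete sketch of Wayne's suggestion, a single-node mmbackup run that still passes both options might look roughly like this; the device, node name and shared work directory are illustrative, and option support should be confirmed for the mmbackup level installed:

    mmbackup /gpfs/fs0 -N aixnode01 -g /gpfs/fs0/.mmbackupShared

As I understand it, -g should point at a directory in a shared GPFS file system; together with -N it lets the scan use the parallel sort and policy-evaluation pipeline even when only one node is named.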
From wsawdon at us.ibm.com Wed Oct 28 14:11:25 2015
From: wsawdon at us.ibm.com (Wayne Sawdon)
Date: Wed, 28 Oct 2015 06:11:25 -0800
Subject: [gpfsug-discuss] Quotas, replication and hsm
Message-ID: <201510281412.t9SEChQo030691@d01av03.pok.ibm.com>

> From: "Simon Thompson (Research Computing - IT Services)"
>
> Second part of the question. If a file is transferred to tape (or compressed maybe as well), does the file still count against quota, and how much for? As on HSM tape it's no longer copies=2. Same for a compressed file: does the compressed file count as the original or the compressed size against quota? I.e. could a user accessing a compressed file suddenly go over quota by accessing the file?

Quotas account for space in the file system. If you migrate a user's file to tape, then that user is credited for the space saved. If a later access recalls the file, then the user is again charged for the space. Note that HSM recall is done as "root", which bypasses the quota check -- this allows the file to be recalled even if it pushes the user past his quota limit.

Compression (which is currently in beta) has the same properties. If you compress a file, then the user is credited with the space saved. When the file is uncompressed, the user is again charged. Since uncompression is done by the "user", the quota check is enforced and uncompression can fail. This includes writes to a compressed file.

> From: Bryan Banister
>
> I also would like to know what the output of the ILM "LIST" policy reports for KB_ALLOCATED for replicated files. Is it the replicated amount of data?

KB_ALLOCATED shows the same value that stat shows, so yes, it shows the replicated amount of data actually used by the file.

-Wayne


From makaplan at us.ibm.com Wed Oct 28 14:48:11 2015
From: makaplan at us.ibm.com (makaplan at us.ibm.com)
Date: Wed, 28 Oct 2015 09:48:11 -0500
Subject: [gpfsug-discuss] ILM and Backup Question
In-Reply-To: <201510281336.t9SDaiNa015723@d01av01.pok.ibm.com>
References: <81E9FF09-D666-4BD1-A727-39AF4ED1F54B@iu.edu> <562DE7B5.7080303@fz-juelich.de> <201510262114.t9QLENpG024083@d01av01.pok.ibm.com> <562F21B7.8040007@fz-juelich.de> <201510271526.t9RFQ2Bw027971@d03av02.boulder.ibm.com> <563081E9.2090605@fz-juelich.de> <201510281336.t9SDaiNa015723@d01av01.pok.ibm.com>
Message-ID: <201510281448.t9SEmFsr030044@d01av02.pok.ibm.com>

If you see one or more status messages like this:

    [I] %2$s Parallel-piped sort and policy evaluation. %1$llu files scanned. %3$s

then you are getting the (potentially) fastest version of the GPFS inode and policy scanning algorithm.

You may also want to adjust the -a and -A options of the mmapplypolicy command, as mentioned in the command documentation. Oh, I see the documentation for -A is wrong in many versions of the manual. There is an attempt to automagically estimate the proper number of buckets, based on the inodes-allocated count.

If you want to investigate performance further, I recommend you use our debug option -d 7, or set the environment variable MM_POLICY_DEBUG_BITS=7 - this will show you how the work is divided among the nodes and threads.

--marc of GPFS
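Tying Wayne's KB_ALLOCATED answer and Marc's tuning notes together, a small external-list policy run is one way to see the logical size next to the replicated allocation, and to watch the parallel-scan messages Marc describes. This is only a sketch: the rule names, paths, node list and the -a/-A values are illustrative and should be checked against the documentation for your release.

    /* list-alloc.pol : logical size vs. allocated (replicated) kilobytes */
    RULE EXTERNAL LIST 'sizecheck' EXEC ''
    RULE 'all' LIST 'sizecheck'
         SHOW(VARCHAR(FILE_SIZE) || ' ' || VARCHAR(KB_ALLOCATED))

    # run the scan with the parallel options and debug output enabled
    MM_POLICY_DEBUG_BITS=7 mmapplypolicy /gpfs/fs0 -P list-alloc.pol \
        -N nsd01,nsd02 -g /gpfs/fs0/.policytmp \
        -a 4 -A 61 \
        -I defer -f /tmp/fs0

With -I defer and -f, the matched files should end up in a flat file named something like /tmp/fs0.list.sizecheck rather than being handed to an interface script.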
From knop at us.ibm.com Thu Oct 29 14:14:58 2015
From: knop at us.ibm.com (Felipe Knop)
Date: Thu, 29 Oct 2015 09:14:58 -0500
Subject: [gpfsug-discuss] Intro (new member)

Hi,

I have just joined the GPFS (Spectrum Scale) UG list. I work in the GPFS development team. I had the chance to attend the "Inaugural USA Meet the Devs" session in New York City on Oct 7, which was a valuable opportunity to hear from customers using the product.

Felipe

----
Felipe Knop knop at us.ibm.com
GPFS Development
IBM Systems
IBM Building 008
2455 South Rd, Poughkeepsie, NY 12601
(845) 433-9314 T/L 293-9314


From carlz at us.ibm.com Fri Oct 30 15:14:50 2015
From: carlz at us.ibm.com (Carl Zetie)
Date: Fri, 30 Oct 2015 10:14:50 -0500
Subject: [gpfsug-discuss] Making an RFE Public (and an intro)
Message-ID: <201510301520.t9UFKTUP032501@d01av04.pok.ibm.com>

First the intro: I am the new Product Manager joining the Spectrum Scale team, taking the place of Janet Ellsworth. I'm looking forward to meeting with you all.

I also have some news about RFEs: we are working to enable you to choose whether your RFEs for Scale are private or public. I know that many of you have requested public RFEs so that other people can see and vote on them. We'd like to see that too, as it's very valuable information for us (as well as reducing duplicates). So here's what we're doing:

Short term: If you have an existing RFE that you would like to see made Public, please email me with the ID of the RFE. You can find my email address at the foot of this message. PLEASE don't email the entire list!
Medium term: We are working to allow you to choose, at the time of submission, whether a request will be Private or Public. Unfortunately, for internal technical reasons we can't simply make the Public / Private field selectable at submission time (don't ask!), so instead we are creating two submission queues, one for Private RFEs and another for Public RFEs. So when you submit an RFE in future, you'll start by selecting the appropriate queue. Inside IBM, they all go into the same evaluation process. As soon as I have an update on the availability of this fix, I will share it with the group.

Note that even for Public requests, some fields remain Private and hidden from other viewers, e.g. Business Case (look for the "key" icon next to the field to confirm).

regards,
Carl

Carl Zetie
Product Manager for Spectrum Scale, IBM
(540) 882 9353 | 15750 Brookhill Ct, Waterford VA 20197
carlz at us.ibm.com


From jfhamano at us.ibm.com Fri Oct 30 15:29:58 2015
From: jfhamano at us.ibm.com (John Hamano)
Date: Fri, 30 Oct 2015 07:29:58 -0800
Subject: [gpfsug-discuss] Making an RFE Public (and an intro)
In-Reply-To: <201510301520.t9UFKTUP032501@d01av04.pok.ibm.com>
References: <201510301520.t9UFKTUP032501@d01av04.pok.ibm.com>
Message-ID: <201510301530.t9UFUM0M004729@d03av05.boulder.ibm.com>

Hi Carl, welcome and congratulations on your new role. I am North America Brand Sales for ESS and Spectrum Scale. Let me know when you have some time next week to talk.