From Jared.Baker at uwyo.edu Fri Jan 2 18:37:19 2015
From: Jared.Baker at uwyo.edu (Jared David Baker)
Date: Fri, 2 Jan 2015 18:37:19 +0000
Subject: [gpfsug-discuss] Question about changing inode capacity safely
Message-ID:

Hello GPFS admins! I hope everybody has had a great start to the new year so far.

Lately, a few of my users have been getting an error similar to 'error creating file: no space left on device' when trying to create even simple files (using the Linux `touch` command). However, if they try again a second or two later, the file is created without a problem and they go on about their work. I can never tell when they are likely to get the 'no space left on device' message. The file system creates many files in parallel (depending on the usage of the system and the movement of files from other sites). However, let me first describe our environment a little better.

We have three GPFS file systems (home, project, gscratch) on a RHEL 6.3 InfiniBand HPC cluster. The version of GPFS is 3.5.0-11. We use fileset quotas (on block limits, not file limits) for each file system. Each user has a home fileset for storing basic configuration files, notes, and other small files. Each user belongs to at least one project, and the quota is shared between the users of the project. The gscratch file system is similar to the project file system, except that files are deleted after ~9 days. The partially good news (perhaps) is that the error mentioned above only occurs on the project file system; we have not observed it on the home or gscratch file systems.

Here's my initial investigation so far:

1.) Checked the fileset quota on one of the affected filesets:
--
# mmlsquota -j ModMast project
                         Block Limits                               |     File Limits
Filesystem type         KB  quota        limit  in_doubt  grace |    files  quota  limit  in_doubt  grace  Remarks
project    FILESET  953382016      0  16106127360        0   none |  8666828      0      0         0   none
--
It would seem from this that the project is indeed well under its block quota.
2.) Then I checked the overall file system to see whether the capacity or the inode count is nearly exhausted:
--
# mmdf project
disk                disk size  failure holds    holds            free KB             free KB
name                    in KB    group metadata data      in full blocks        in fragments
--------------- ------------- -------- -------- ----- -------------------- -------------------
Disks in storage pool: system (Maximum disk size allowed is 397 TB)
U01_L0            15623913472       -1 Yes      Yes     7404335104 ( 47%)    667820032 ( 4%)
U01_L1            15623913472       -1 Yes      Yes     7498215424 ( 48%)    642773120 ( 4%)
U01_L2            15623913472       -1 Yes      Yes     7497969664 ( 48%)    642664576 ( 4%)
U01_L3            15623913472       -1 Yes      Yes     7496232960 ( 48%)    644327936 ( 4%)
U01_L4            15623913472       -1 Yes      Yes     7499296768 ( 48%)    640117376 ( 4%)
U01_L5            15623913472       -1 Yes      Yes     7494881280 ( 48%)    644168320 ( 4%)
U01_L6            15623913472       -1 Yes      Yes     7494164480 ( 48%)    643673216 ( 4%)
U01_L7            15623913472       -1 Yes      Yes     7497433088 ( 48%)    639918976 ( 4%)
U01_L8            15623913472       -1 Yes      Yes     7494139904 ( 48%)    645130240 ( 4%)
U01_L9            15623913472       -1 Yes      Yes     7498375168 ( 48%)    639979520 ( 4%)
U01_L10           15623913472       -1 Yes      Yes     7496028160 ( 48%)    641909632 ( 4%)
U01_L11           15623913472       -1 Yes      Yes     7496093696 ( 48%)    643749504 ( 4%)
U01_L12           15623913472       -1 Yes      Yes     7496425472 ( 48%)    641556992 ( 4%)
U01_L13           15623913472       -1 Yes      Yes     7495516160 ( 48%)    643395840 ( 4%)
U01_L14           15623913472       -1 Yes      Yes     7496908800 ( 48%)    642418816 ( 4%)
U01_L15           15623913472       -1 Yes      Yes     7495823360 ( 48%)    643580416 ( 4%)
U01_L16           15623913472       -1 Yes      Yes     7499939840 ( 48%)    641538688 ( 4%)
U01_L17           15623913472       -1 Yes      Yes     7497355264 ( 48%)    642184704 ( 4%)
U13_L0             2339553280       -1 Yes      No      2322395136 ( 99%)      8190848 ( 0%)
U13_L1             2339553280       -1 Yes      No      2322411520 ( 99%)      8189312 ( 0%)
U13_L12           15623921664       -1 Yes      Yes     7799422976 ( 50%)    335150208 ( 2%)
U13_L13           15623921664       -1 Yes      Yes     8002662400 ( 51%)    126059264 ( 1%)
U13_L14           15623921664       -1 Yes      Yes     8001093632 ( 51%)    126107648 ( 1%)
U13_L15           15623921664       -1 Yes      Yes     8001732608 ( 51%)    126167168 ( 1%)
U13_L16           15623921664       -1 Yes      Yes     8000077824 ( 51%)    126240768 ( 1%)
U13_L17           15623921664       -1 Yes      Yes     8001458176 ( 51%)    126068480 ( 1%)
U13_L18           15623921664       -1 Yes      Yes     7998636032 ( 51%)    127111680 ( 1%)
U13_L19           15623921664       -1 Yes      Yes     8001892352 ( 51%)    125148928 ( 1%)
U13_L20           15623921664       -1 Yes      Yes     8001916928 ( 51%)    126187904 ( 1%)
U13_L21           15623921664       -1 Yes      Yes     8002568192 ( 51%)    126591616 ( 1%)
                -------------                          -------------------- -------------------
(pool total)     442148765696                           219305402368 ( 50%)  13078121728 ( 3%)
                =============                          ==================== ===================
(data)           437469659136                           214660595712 ( 49%)  13061741568 ( 3%)
(metadata)       442148765696                           219305402368 ( 50%)  13078121728 ( 3%)
                =============                          ==================== ===================
(total)          442148765696                           219305402368 ( 50%)  13078121728 ( 3%)

Inode Information
-----------------
Number of used inodes:        133031523
Number of free inodes:          1186205
Number of allocated inodes:   134217728
Maximum number of inodes:     134217728
--
Eureka! From here it seems that the inode capacity is teetering on its limit. At this point I think it would be best to educate our users not to write millions of small text files, as I don't think it is possible to change the GPFS block size to something smaller (the block size is currently 4MB). The system was originally targeted at large reads/writes from traditional HPC users, but we have now diversified our user base to include computing areas outside traditional HPC. Documentation states that if parallel writes are to be done, a minimum of 5% of the inodes needs to be free, otherwise performance will suffer.
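As a quick check against that 5% guideline, the inode counts from the mmdf output above can be compared directly: with 1186205 free inodes out of 134217728 allocated (and the same maximum), free inodes come out at roughly 0.9%. A minimal sketch of the arithmetic, in plain Python using only the numbers shown above:
--
# Free-inode percentage from the mmdf output above
free_inodes      = 1186205
allocated_inodes = 134217728
max_inodes       = 134217728

print("free vs allocated: %.2f%%" % (100.0 * free_inodes / allocated_inodes))  # ~0.88%
print("free vs maximum:   %.2f%%" % (100.0 * free_inodes / max_inodes))        # ~0.88%
--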
From the above, we have less than 1% free, which I think is the root of our problem.

Therefore, is there a method to safely increase the maximum inode count, and can it be done during operation, or should the file system be unmounted? I've been through the man pages and searched online and found a few hints suggesting the command below, but I was curious about its safety during operation:

mmchfs project --inode-limit

The man page describes the limit as:

max_files = total_filesystem_space / (inode_size + subblock_size)

and the subblock size is defined on IBM's website as 1/32 of the block size (which is 4MB here, giving a 128KB subblock). From that, I calculate that the maximum number of inodes I could potentially have is 3440846425, which is approximately 25x the current maximum, so I think I can increase the inode count without too much worry. Are there any caveats to my logic here? I'm not saying I'll increase it to the maximum value right away, because the inode space would take away from the usable capacity of the system.

Thanks for any comments and recommendations. I have a large maintenance period coming up due to datacenter power upgrades, with ~2 weeks of downtime, and I'm trying to get all my ducks in a row. If I need to do something time-consuming with the file systems, I'd like to know ahead of time so I can do it during that maintenance window, as I probably will not get another window for many months afterward.

Again, thank you all!

Jared Baker
ARCC
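For reference, plugging the figures from this thread into that formula reproduces the quoted ~3.44 billion number, assuming a 512-byte inode size (the inode size is not stated anywhere above; 512 bytes is simply the value that reproduces the quoted figure, so treat it as an assumption). A minimal sketch:
--
# Theoretical inode ceiling from the man page formula, using the mmdf pool total above.
# The 512-byte inode size is an assumption; it is not confirmed anywhere in this thread.
total_space_kb = 442148765696          # pool total from mmdf, in KB
block_size     = 4 * 1024 * 1024       # 4MB file system block size, in bytes
subblock_size  = block_size // 32      # subblock is 1/32 of the block size -> 131072 bytes
inode_size     = 512                   # assumed inode size, in bytes

max_files = (total_space_kb * 1024) // (inode_size + subblock_size)
print(max_files)                       # 3440846425, matching the figure quoted above
--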
From oester at gmail.com Fri Jan 2 18:52:50 2015
From: oester at gmail.com (Bob Oesterlin)
Date: Fri, 2 Jan 2015 12:52:50 -0600
Subject: [gpfsug-discuss] Question about changing inode capacity safely
In-Reply-To:
References:
Message-ID:

I increase the inode limit on my file systems on a regular basis with no problems. I do have my data/metadata split into separate pools, but the procedure is the same.

Bob Oesterlin
Nuance Communications

On Fri, Jan 2, 2015 at 12:37 PM, Jared David Baker wrote:
> Hello GPFS admins! I hope everybody had a great start to the new year so far.
>
> Therefore, is there a method to safely increase the maximum inode count
> and could it be done during operation or should the system be unmounted?
> I've man paged / searched online and found a few hints suggesting below but
> was curious about its safety during operation:
>
> mmchfs project --inode-limit
>
> Again, thank you all!
>
> Jared Baker
> ARCC

From Jared.Baker at uwyo.edu Fri Jan 2 21:06:30 2015
From: Jared.Baker at uwyo.edu (Jared David Baker)
Date: Fri, 2 Jan 2015 21:06:30 +0000
Subject: [gpfsug-discuss] Question about changing inode capacity safely
In-Reply-To:
References:
Message-ID:

Bob, thanks for the quick reply! I had a feeling it was a fairly safe operation. Since I joined the group after the file system decisions were made, we've had many discussions about separate storage pools for data/metadata, but mostly concluded that we would address it more critically on the next system. Again, thanks.

Jared Baker
ARCC

From: Bob Oesterlin [mailto:oester at gmail.com]
Sent: Friday, January 02, 2015 11:53 AM
To: gpfsug main discussion list
Cc: Jared David Baker
Subject: Re: [gpfsug-discuss] Question about changing inode capacity safely

I increase the inode limit on my file systems on a regular basis with no problems. I do have my data/metadata split into separate pools, but the procedure is the same.

Bob Oesterlin
Nuance Communications

On Fri, Jan 2, 2015 at 12:37 PM, Jared David Baker wrote:
Hello GPFS admins! I hope everybody had a great start to the new year so far.
Therefore, is there a method to safely increase the maximum inode count and could it be done during operation or should the system be unmounted? I've man paged / searched online and found a few hints suggesting below but was curious about its safety during operation:
mmchfs project --inode-limit
Again, thank you all!
Jared Baker
ARCC

From bruno.silva at crick.ac.uk Wed Jan 7 14:45:16 2015
From: bruno.silva at crick.ac.uk (Bruno Silva)
Date: Wed, 7 Jan 2015 14:45:16 +0000
Subject: [gpfsug-discuss] Another GPFS + Open Stack user
Message-ID:

Hello,

My name is Bruno Silva and I am HPC Lead for the Francis Crick Institute - www.crick.ac.uk

We are participating in eMedlab, an MRC-funded collaborative project to support computational biomedical research between UCL, the Crick, Queen Mary University of London, and the London School of Hygiene and Tropical Medicine. We are very interested in the integration of GPFS and OpenStack, and particularly in how to provide maximum performance securely on VMs. I believe that Orlando Richards, Simon Thompson, and others have brought this issue to the mailing list and that there have already been useful discussions on the matter, so we will be very happy to bring any information that might be useful to the discussion.

At the moment we are considering the possibility of automounting GPFS clients on VMs using Heat templates and restricting root access to those machines. GPFS would be available through a provider network in OpenStack. I suspect this is a very naïve approach, but it would be interesting to hear your thoughts on this idea.

Many thanks,
Bruno

___________________________________
Dr Bruno Silva
High Performance Computing Lead
The Francis Crick Institute
Gibbs Building
215 Euston Road
London NW1 2BE
T: 020 7611 2117
E: Bruno.Silva at crick.ac.uk
W: www.crick.ac.uk

The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.

From chair at gpfsug.org Wed Jan 21 16:00:13 2015
From: chair at gpfsug.org (Jez Tucker (Chair))
Date: Wed, 21 Jan 2015 16:00:13 +0000
Subject: [gpfsug-discuss] GPFS UG in 2015
Message-ID: <54BFCD0D.70101@gpfsug.org>

Hello everyone

The first of a few official posts for 2015.

Updates: We have an even stronger relationship with the IBM GPFS team in 2015 than in 2014. IBM have engaged with us directly at the highest level, and we're working together to bring you better news, features and technical insight into GPFS, in addition to organising the next User Group.
With that in mind, here are a few things that will be occurring this year:

*User Group Meeting*
The date for this is in the process of being finalised right now. It is likely to be April / May. We will update you as soon as we have a definitive date.

*'Meet the Devs Coffee Shop'*
The IBM devs would love to speak face to face with you all. So we thought, what better way to do that than in small groups over coffee and pizza? The aim will be to have small meetings with groups of geographically similar users on a regular basis. The first ('tester') will be _Wednesday 18th Feb_ in London, and I'll send out a formal invite email shortly. For those of you further afield, we intend to come to you (yes! Orlando in Edinburgh) and hopefully even EMEA / worldwide if possible. These small groups will be ideal situations for IBM devs to show you new features and what they're working on, and to solicit feedback from experienced GPFS users/admins. Oh, and the pizza.

*Website revamp* (yes, another..)
I'm in the process of moving all the content over to Ghost (https://ghost.org/). We'll flip the site over at some point this week. Shortly you'll see our first blog contribution from IBM dev. What topics would you like covered? Let the group know. Also, if anyone (member, dev, tech, etc.) would like to contribute technical blog posts regarding GPFS or related software/technologies, then let me know and we'll set you up an account.

All the best,

Jez (Chair) and Claire (Secretary)

From chair at gpfsug.org Fri Jan 23 21:21:43 2015
From: chair at gpfsug.org (Jez Tucker)
Date: Fri, 23 Jan 2015 21:21:43 +0000
Subject: [gpfsug-discuss] User Group speaker for IBM Edge 2015 ?
Message-ID: <54C2BB67.2000806@gpfsug.org>

Hi all

The IBM GPFS team has asked if anyone from our worldwide membership would like to go to IBM Edge and talk about their own experience with GPFS. Perhaps you've replaced HDFS with GPFS, implemented end-to-end GPFS+TSM/LTFS-EE, utilised Hot Files and Local Read Only Cache (v4.1) or flash tiers, or migrated from SONAS to Elastic? Or maybe you have a large video workflow, OpenStack SWIFT and Object, or a hybrid GPFS with Elastic/GSS?

If you're interested in participating, please drop Ross Keeping (ross.keeping at uk.ibm.com) an email ASAP. Call for speakers closes on 6th Feb 2015. Apologies for the short notice.

For further information, please see the Edge 2015 bulletin: http://goo.gl/h5RbXi

Jez