[gpfsug-discuss] Quotas and AFM
Venkateswara R Puvvada
vpuvvada at in.ibm.com
Mon Oct 14 07:29:05 BST 2019
As Simon already mentioned, set similar quotas at both the cache and home
clusters to avoid the queue getting stuck because quotas are exceeded at
home.
>At home we had replication of two, so it wasn't straightforward to set the
>same quotas on cache; we could just about fudge it for user home
>directories but not for most of our project storage, as we use dependent
>fileset quotas.
AFM will support dependent filesets from 5.0.4. Dependent filesets can be
created at the cache inside the independent fileset, and the same quotas as
at home can then be set on them.
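A minimal sketch of mirroring the home limits onto the cache, assuming you
drive mmsetquota from a small script (the filesystem name, fileset names
and limits below are made-up examples, and the mmsetquota option syntax
should be checked against your Scale release):

#!/usr/bin/env python3
# Mirror fileset quota limits from home onto the AFM cache filesystem.
# Device name, fileset names and limits are examples only.
import subprocess

CACHE_FS = "gpfs_cache"            # cache filesystem device name (example)
FILESET_LIMITS = [
    # (fileset, block soft:hard, files soft:hard) copied from home
    ("homedirs",   "100G:110G", "1M:1M"),
    ("projects_a", "10T:11T",   "5M:6M"),
]

for fileset, block, files in FILESET_LIMITS:
    # mmsetquota <device>:<fileset> --block soft:hard --files soft:hard
    subprocess.run(
        ["mmsetquota", f"{CACHE_FS}:{fileset}",
         "--block", block, "--files", files],
        check=True,
    )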
>We also saw issues with data in inode at home as this doesn't work at AFM
>cache so it goes into a block. I've forgotten the exact issues around that
>now.
AFM uses some inode space to store the remote file attributes (file
handle, file times, etc.) as part of the EAs. If the file does not have
hard links, the maximum inode space used by AFM is around 200 bytes. The
AFM cache can store the file's data in the inode if the inode has 200
bytes or more of free space; otherwise the file's data is stored in a
subblock rather than using a full block. For example, if the inode size is
4K at both cache and home, the home file size is 3K, and the inode uses
300 bytes to store the file metadata, then the free space in the inode at
home is 724 bytes (4096 - (3072 + 300)). When this file is cached by AFM,
AFM adds around 200 bytes of internal EAs, so the free space in the inode
at the cache is 524 bytes (4096 - (3072 + 300 + 200)). If the file size at
home is 3600 bytes, AFM cannot store the data in the inode at the cache.
So AFM stores the file data in a block only when the inode does not have
enough free space for the data plus the internal EAs.
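To make the arithmetic above concrete, a small back-of-the-envelope check
(the 4K inode, 300-byte metadata and ~200-byte AFM EA figures are just the
example numbers from above, not exact constants):

# Data-in-inode check using the example figures above (approximate).
INODE_SIZE    = 4096   # inode size at both cache and home
FILE_METADATA = 300    # bytes already used for file metadata/EAs at home
AFM_EA        = 200    # extra internal EAs AFM adds at the cache

def fits_in_inode(file_size, at_cache):
    overhead = FILE_METADATA + (AFM_EA if at_cache else 0)
    return file_size + overhead <= INODE_SIZE

for size in (3072, 3600):
    print(size, "home:", fits_in_inode(size, at_cache=False),
          "cache:", fits_in_inode(size, at_cache=True))
# 3072 bytes fits at home (724 bytes free) and at cache (524 bytes free);
# 3600 bytes fits at home but not at cache, so the data goes to a subblock.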
~Venkat (vpuvvada at in.ibm.com)
From: Simon Thompson <S.J.Thompson at bham.ac.uk>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 10/12/2019 01:52 AM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quotas and AFM
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Oh and I forgot. This only works if you precache the data from home.
Otherwise the disk usage at cache is only what you cached, as you don't
know what size it is from home.
Unless something has changed recently at any rate.
Simon
From: gpfsug-discuss-bounces at spectrumscale.org
<gpfsug-discuss-bounces at spectrumscale.org> on behalf of Simon Thompson
<S.J.Thompson at bham.ac.uk>
Sent: Friday, October 11, 2019 9:10:20 PM
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] Quotas and AFM
Yes, just set the quotas the same on both. Or a default quota with
exceptions, if that works in your case. But this is where I think
data-in-inode is an issue: if you have a lot of small files, at home they
sit in the inode and (I think) don't consume block quota, but at cache
they land in a data block and do. So it might not be quite so
straightforward.
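For a rough sense of how much extra block quota that costs at the cache, a
back-of-the-envelope estimate (the 8 KiB subblock size is only an
assumption; it depends on the cache filesystem's block size and
subblocks-per-block):

# Extra block-quota usage at the cache for files that are in-inode at home
# but land in a subblock at the cache. Subblock size is an assumption.
SUBBLOCK_BYTES  = 8 * 1024
NUM_SMALL_FILES = 2_000_000

extra_gib = NUM_SMALL_FILES * SUBBLOCK_BYTES / 2**30
print(f"~{extra_gib:.0f} GiB of extra quota needed at the cache")
# -> roughly 15 GiB for 2 million small files in this example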
And yes, writes at home just get an out-of-space error; it's the AFM cache
that fails on the write back to home, but then it sits in the queue and can
block it.
Simon
From: gpfsug-discuss-bounces at spectrumscale.org
<gpfsug-discuss-bounces at spectrumscale.org> on behalf of Ryan Novosielski
<novosirj at rutgers.edu>
Sent: Friday, October 11, 2019 9:05:15 PM
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] Quotas and AFM
Do you know if there is anything that prevents me from just setting the
quotas the same on the IW cache, if there’s no way to inherit? For the
case of the home directories, it’s simple, as they are all 100G with some
exceptions, so a default user quota takes care of almost all of it.
Luckily, that’s right now where our problem is, but we have the potential
with other filesets later.
I’m also wondering if you can confirm that I should /not/ need to be
looking at people who are writing to the home fileset, where the quotas
are set, as a problem syncing TO the cache, i.e. they don’t add to the
queue. I assume GPFS sees they are over quota and just denies the write, yes? I
originally thought the problem was in that direction and was totally
perplexed about how it could be so stupid. 😅
--
____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
`'
On Oct 11, 2019, at 15:56, Simon Thompson <S.J.Thompson at bham.ac.uk> wrote:
Yes.
When we ran AFM, we had exactly this issue. What would happen is that a
user/fileset quota would be hit and a compute job would continue writing.
This would eventually fill the AFM queue. If you were lucky, you could stop
and restart the queue and it would process other files from other users,
but inevitably we'd get back to the same state. The solution was to
increase the quota at home to clear the queue, kill user workload and then
reduce their quota again.
At home we had replication of two, so it wasn't straightforward to set the
same quotas on cache; we could just about fudge it for user home
directories but not for most of our project storage, as we use dependent
fileset quotas.
We also saw issues with data in inode at home as this doesn't work at AFM
cache so it goes into a block. I've forgotten the exact issues around that
now.
So our experience was much like you describe.
Simon
From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of Ryan
Novosielski <novosirj at rutgers.edu>
Sent: Friday, 11 October 2019, 18:43
To: gpfsug main discussion list
Subject: [gpfsug-discuss] Quotas and AFM
Does anyone have any good resources or experience with quotas and AFM
caches? Our scenario is that we have an AFM home on one site, an AFM
cache on another site, and then a client cluster on that remote site that
mounts the cache. The AFM filesets are IW. One of them contains our home
directories, which have a quota set on the home side. Quotas were disabled
entirely on the cache side (I enabled them recently, but did not set them
to anything). What I believe we’re running into is scary long AFM queues
that are caused by people writing an amount that is over the home quota to
the cache, but the cache is accepting it and then failing to sync back to
the home because the user is at their hard limit. I believe we’re also
seeing delays on unaffected users who are not over their quota, but that’s
harder to tell. Our AFM gateways are poorly tuned (or not tuned at all), so
that is likely a contributing factor. Is there any way to make the quotas apparent to the
cache cluster too, beyond setting a quota there as well, or do I just
fundamentally misunderstand this in some other way? We really just want
the quotas on the home cluster to be enforced everywhere, more or less.
Thanks!
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss