[gpfsug-discuss] GPFS, LTFS/EE and data-in-inode?

John Hearns john.hearns at asml.com
Tue Jul 25 10:30:28 BST 2017


I agree with Jonathan.
In my experience, if you look at why researchers store many small files, they are either the results of data acquisition - high-speed cameras, microscopes, or in my case a wind tunnel - or a sequence of images produced by a simulation, later post-processed into a movie or an Ensight/Paraview format. When questioned, the researchers will always say "but I would like to keep this data available just in case". In reality those files are never looked at again. And, as has been said, if you have a tape-based archiving system you can end up with thousands of small files spread all over your tapes. So it is legitimate to make zips/tars of directories like that.
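As a rough sketch of that approach, a script along these lines could pack any subdirectory holding more than some threshold of files into a single tarball (the paths, threshold and script name here are my own illustrative assumptions, not anything site-specific):

```shell
#!/bin/sh
# pack_dirs SRC THRESHOLD: tar up each immediate subdirectory of SRC
# that holds more than THRESHOLD files, writing DIR.tar.gz next to it.
pack_dirs() {
    src=$1; threshold=$2
    find "$src" -mindepth 1 -maxdepth 1 -type d | while read -r d; do
        # count every regular file below this subdirectory
        count=$(find "$d" -type f | wc -l)
        if [ "$count" -gt "$threshold" ]; then
            tar -czf "${d}.tar.gz" -C "$(dirname "$d")" "$(basename "$d")"
            echo "packed: $d ($count files)"
            # once the tarball is verified, 'rm -rf "$d"' would
            # reclaim the inodes - deliberately left manual here
        fi
    done
}
```

For example, "pack_dirs /archive/project 1000" would leave small directories alone and only roll up the pathological ones.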

I am intrigued to see that GPFS has a policy facility which can call an external program. That is useful.
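For anyone else new to it, a minimal sketch of such a rule might look like the following - the script path and the size/age thresholds are assumptions for illustration, not a recommendation:

```
/* Collect small, cold files into a named list and hand that list
   to an external script for packing or further action. */
RULE 'pack'  EXTERNAL LIST 'pack-candidates'
     EXEC '/usr/local/bin/pack_candidates.sh'

RULE 'small' LIST 'pack-candidates'
     WHERE FILE_SIZE < 32768
       AND (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 365
```

The rules would then be applied to a filesystem with mmapplypolicy, which invokes the named script with the generated file list.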

-----Original Message-----
From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard
Sent: Tuesday, July 25, 2017 11:02 AM
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] GPFS, LTFS/EE and data-in-inode?

On Mon, 2017-07-24 at 11:49 -0400, valdis.kletnieks at vt.edu wrote:
> On Mon, 24 Jul 2017 12:43:10 +0100, Jonathan Buzzard said:
>
> > For an archive service how about only accepting files in actual
> > "archive" formats and then severely restricting the number of files
> > a user can have?
> >
> > By archive files I am thinking like a .zip, tar.gz, tar.bz or similar.
>
> After having dealt with users who fill up disk storage for almost 4
> decades now, I'm fully aware of those advantages. :)
>
> ( /me ponders when an IBM 2314 disk pack with 27M of space was "a lot"
> in 1978, and when we moved 2 IBM mainframes in 1989, 400G took 2,500+
> square feet, and now 8T drives are all over the place...)
>
> On the flip side, my current project is migrating 5 petabytes of data
> from our old archive system that didn't have such rules (mostly due to
> politics and the fact that the underlying XFS filesystem uses a 4K
> blocksize so it wasn't as big an issue), so I'm stuck with what people put in there years ago.

I would be tempted to zip up the directories and move them zipped ;-)

JAB.

--
Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


