[gpfsug-discuss] Blog on "Automatic tiering of object data based on object heat"

Jonathan Buzzard jonathan at buzzard.me.uk
Wed Jun 22 13:21:54 BST 2016


On Wed, 2016-06-22 at 10:39 +0000, Daniel Kidger wrote:
> Does anyone in the field have much experience with using file heat for
> migration, whether for object or more generically? In particular using
> policies to move files both ways dependent on recent usage patterns.
> 

In my experience moving files from slow to fast storage (I am ignoring a
tape based or other really slow layer here) was generally a waste of
time for three reasons.

Firstly, by the time you notice that the file is in use again, 99% of
the time the user has stopped using it anyway, so there is no advantage
to moving it back to faster storage. You would need to scan the file
system more than once a day to capture more reused files in my experience.

Secondly, if the data is modified and saved, then because of the way
most software handles a save you end up with a "new" file, which will
land in the fast storage anyway. That is, on a save most software first
creates a new temporary file holding the new version, renames the old
file to something temporary, renames the new file to the right file name
and then finally deletes the old version of the file. That way if the
program crashes somewhere in the save there will always be a good
version of the file to go back to.
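
A minimal sketch of that save sequence in Python, for illustration only.
The function name safe_save and the ".new-" and ".old" temporary names
are hypothetical, and real applications vary in the details. The point
is that the saved file is a newly created file with a new inode, so it
is treated as a fresh file rather than as the old one.

```python
import os
import tempfile


def safe_save(path, data):
    """Save bytes to path using the temp-file-and-rename sequence."""
    directory = os.path.dirname(os.path.abspath(path))

    # 1. Write the new version into a brand-new temporary file
    #    in the same directory as the target.
    fd, new_tmp = tempfile.mkstemp(dir=directory, prefix=".new-")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())

    # 2. Rename the old file to something temporary (hypothetical name).
    old_tmp = None
    if os.path.exists(path):
        old_tmp = path + ".old"
        os.rename(path, old_tmp)

    # 3. Rename the new file to the right file name.
    os.rename(new_tmp, path)

    # 4. Finally delete the old version.
    if old_tmp is not None:
        os.remove(old_tmp)
```

Comparing os.stat(path).st_ino before and after a call shows that the
file under that name is a different file once the save completes.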

Thirdly the slower tier (I am thinking along the lines of large RAID6 or
equivalent devices) generally has lots of spare IOPS capacity anyway.
Consequently the small amount of "revisited" data that does remain in
use beyond a day is not going to have a significant impact. I measured
around 10 IOPS on average on the slow disks in a general purpose file
system. The only time they exceeded this was when a flush of data from
the faster tier arrived, at which point they would peg out at around 120
IOPS, which is what you would expect for a 7200 RPM disk.
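
As a rough sanity check on the 120 IOPS figure, the usual back-of-envelope
estimate for random IOPS on a spinning disk is one over the sum of the
average seek time and the average rotational latency. The rotational
latency follows directly from the spindle speed. The 4.2 ms average seek
time below is an assumed illustrative value, not a measurement, and the
function names are hypothetical.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency in ms: half of one full revolution."""
    return 0.5 * 60_000.0 / rpm


def estimate_iops(rpm, avg_seek_ms):
    """Rough random IOPS for a single disk: 1 / (seek + rotational latency)."""
    return 1000.0 / (avg_seek_ms + rotational_latency_ms(rpm))


if __name__ == "__main__":
    # avg_seek_ms=4.2 is an assumption chosen for illustration only.
    print(round(estimate_iops(7200, avg_seek_ms=4.2)))
```

With a longer assumed seek time the estimate drops, so treat the result
as an order-of-magnitude figure for a single 7200 RPM spindle.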

> 
> And also if you ever move files to colder storage without necessarily
> waiting until the files are say over 7 days old, since you know they
> are not going to be used for a while.
> 

I would do things like punt .iso images directly to slower storage.
Those sorts of files generally don't benefit from having low seek times
which is what your fast disk pools give you. I would suggest that video
files like MPEG and MP4 could potentially be treated similarly.
So a rule like the following:

/* force ISO images onto nearline storage  */
RULE 'iso' SET POOL 'slow' WHERE LOWER(NAME) LIKE '%.iso'

What you might punt to slower storage will likely depend heavily on what
your file system is being used for.

You can of course also use this to "discourage" your storage from being
used for "personal" use by giving certain file types lower performance.
So a rule like the following puts all the music files on slower storage.

/* force MP3's and the like onto nearline storage forever */
RULE 'mp3' SET POOL 'slow'
    WHERE LOWER(NAME) LIKE '%.mp3' OR LOWER(NAME) LIKE '%.m4a' OR
          LOWER(NAME) LIKE '%.wma'


JAB.

-- 
Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.




