[gpfsug-discuss] File_heat for GPFS File Systems

Marc A Kaplan makaplan at us.ibm.com
Mon Sep 26 16:11:52 BST 2016


fileHeatLossPercent=10, fileHeatPeriodMinutes=1440

means any file that has not been accessed for 1440 minutes (24 hours = 1 
day) will lose 10% of its Heat.

So if its heat was X at noon today, it will be 0.90 X tomorrow, 0.81 X the 
next day, and (0.90)**k * X on the k'th day.
After 63 fileHeatPeriods, we always round down and compute file heat as 
0.0.
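
As an illustration only, the decay described above can be sketched in a few 
lines of Python. This is an idealized model, not the GPFS implementation: 
the function name decayed_heat and its parameters are made up for this 
example, whole periods are assumed, and the cut-off is assumed to apply once 
more than 63 periods have passed without access.

```python
def decayed_heat(initial_heat, idle_periods, loss_percent=10.0,
                 zero_after_periods=63):
    """Idealized heat of a file left untouched for `idle_periods` periods.

    Hypothetical helper: models heat = X * (1 - loss/100) ** k and the
    assumed cut-off to 0.0 once more than `zero_after_periods` have passed.
    """
    if idle_periods < 0:
        raise ValueError("idle_periods must be non-negative")
    if idle_periods > zero_after_periods:
        return 0.0
    return initial_heat * (1.0 - loss_percent / 100.0) ** idle_periods


if __name__ == "__main__":
    # Print the modelled heat of a file that started at 1.0.
    for k in (0, 1, 2, 7, 63, 64):
        print(k, decayed_heat(1.0, k))
```

In this model the heat shrinks geometrically rather than in ten equal steps, 
so without the cut-off it would approach zero but never reach it. With 
fileHeatLossPercent=10 the value after 63 periods is still about 0.13% of X.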

The computation (in floating point, with some approximations) is done "on 
demand", based on the heat value stored in the inode at the last access, 
the time of that access (the Unix "atime") and the current time.  So the 
cost of maintaining FILE_HEAT for a file is some bit twiddling, but only 
when the file is accessed and the atime would be updated in the inode anyway.

File heat increases by approximately 1.0 each time the entire file is read 
from disk.   This is done proportionately so if you read in half of the 
blocks the increase is 0.5.
If you read all the blocks twice FROM DISK the file heat is increased by 
2. And so on.  But only IOPs are charged.  If you repeatedly do posix 
read()s but the data is in cache, no heat is added.
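
A rough sketch of that accounting, again in Python and again only a model: 
heat_gain is a hypothetical name, and the block counts are assumed to be 
known to the caller.

```python
def heat_gain(blocks_read_from_disk, total_blocks):
    """Approximate heat added by reading blocks of one file from disk.

    Hypothetical helper: reads served from cache must not be counted in
    `blocks_read_from_disk`, since only real disk I/O adds heat.
    """
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    if blocks_read_from_disk < 0:
        raise ValueError("blocks_read_from_disk must be non-negative")
    return blocks_read_from_disk / total_blocks


if __name__ == "__main__":
    print(heat_gain(50, 100))    # half of the file read from disk
    print(heat_gain(200, 100))   # whole file read from disk twice
    print(heat_gain(0, 100))     # every read satisfied from cache
```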


The easiest way to observe FILE_HEAT is to run mmapplypolicy in test mode:

mmapplypolicy directory -I test -L 2 -P fileheatrule.policy

where the file fileheatrule.policy contains:

RULE 'fileheatrule' LIST 'hot' SHOW('Heat=' || varchar(FILE_HEAT))

Because policy reads metadata from inodes as stored on disk, when 
experimenting/testing you may need to 

mmfsctl fs suspend-write;  mmfsctl fs resume

to see results immediately.
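
For turning such a listing into a report, a small parser along the following 
lines may be a starting point. It is a sketch under assumptions: it expects 
each file line to contain the SHOW text "Heat=<number>" followed by " -- " 
and the path name, which should be checked against the real output of your 
release. The function name parse_heat_lines and the sample lines are 
invented for illustration.

```python
import re

# Assumed line shape: "... Heat=<number> -- <path>". Verify against real output.
HEAT_LINE = re.compile(r"Heat=(?P<heat>[-+0-9.eE]+)\s+--\s+(?P<path>.+)$")


def parse_heat_lines(lines):
    """Return (path, heat) pairs, hottest first, skipping unrelated lines."""
    results = []
    for line in lines:
        match = HEAT_LINE.search(line.rstrip("\n"))
        if match is None:
            continue
        try:
            heat = float(match.group("heat"))
        except ValueError:
            continue  # text after "Heat=" was not a number
        results.append((match.group("path"), heat))
    results.sort(key=lambda item: item[1], reverse=True)
    return results


if __name__ == "__main__":
    # Invented sample lines, not real mmapplypolicy output.
    sample = [
        "1002 1 0  Heat=0 -- /gpfs/fs/projects/b.dat",
        "1001 1 0  Heat=2.5 -- /gpfs/fs/projects/a.dat",
        "unrelated log line",
    ]
    for path, heat in parse_heat_lines(sample):
        print(f"{heat:10.4f}  {path}")
```

The sorted list can then be cut into hot, lukewarm and cold classes at 
whatever thresholds suit the site.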





From:   Andreas Landhäußer <alandhae at gmx.de>
To:     gpfsug-discuss at spectrumscale.org
Date:   09/26/2016 08:12 AM
Subject:        [gpfsug-discuss] File_heat for GPFS File Systems Questions over Questions ...
Sent by:        gpfsug-discuss-bounces at spectrumscale.org




Hello GPFS experts,

a customer wants a report about the usage of a large file system, including 
file_heat. The report should be generated every month.

mmchconfig fileHeatLossPercent=10,fileHeatPeriodMinutes=30240 -i

fileHeatPeriodMinutes=30240 equals 21 days.
I'm wondering about the behavior of fileHeatLossPercent.

- If it is set to 10, will file_heat decrease from 1 to 0 in 10 steps?
- Or does file_heat have an asymptotic behavior, and heat 0 will never be 
reached?

Either way the results will be similar ;-) with the latter taking longer.

We want to produce the following file lists:

- File_Heat > 50% -> rather hot data
- 20% < File_Heat <= 50% -> lukewarm data
- 0% <= File_Heat <= 20% -> ice cold data

We will have to work on the limits between the File_Heat classes, 
depending on the customer's wishes.

Are there better parameter settings for achieving this?

Do any scripts/programs exist for analyzing the file_heat data?

We have observed that when running policy scans on a large GPFS file system, 
metadata performance dropped significantly until the job had finished.
It took about 15 minutes on an 880 TB GPFS file system with 150 million 
entries.

How does file_heat behave when it is first switched on?
Do all files in the GPFS file system start with the same temperature?

Thanks for your help

                 Ciao

                 Andreas

-- 
Andreas Landhäußer  +49 151 12133027 (mobile)
alandhae at gmx.de