[gpfsug-discuss] Policy scan against billion files for ILM/HSM
Marc A Kaplan
makaplan at us.ibm.com
Tue Apr 11 16:36:47 BST 2017
As the primary developer of mmapplypolicy, please allow me to comment:
1) Fast access to metadata in the system pool is most important, as several
have already noted. These days SSD is the favorite, but you can still go
with "spinning" media.
If you do go with disks, it's extremely important to spread your metadata
over independent disk "arms" -- so you can have many seeks in progress
concurrently. IOW, if there is a virtualization/mapping layer, watch out
that your logical disks don't all get mapped to the same physical disk.
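For illustration only (the device, NSD, and server names below are made
up), a metadata-only NSD on SSD is typically described with a disk stanza
along these lines:

    # hypothetical names; adjust device, servers, and failure group to your site
    %nsd:
      device=/dev/fast_ssd1
      nsd=meta_nsd1
      servers=nsdserver1,nsdserver2
      usage=metadataOnly
      failureGroup=1
      pool=system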
2) It is crucial to use both -g and -N:
-g /gpfs-not-necessarily-the-same-fs-as-Im-scanning/tempdir
-N several-nodes-that-will-be-accessing-the-system-pool
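For illustration only (the file system, policy file, temp directory, and
node names below are made up), an invocation might look like:

    # scan /gpfs/fs1, keep temp/work files on a different GPFS file system,
    # and spread the scan over three nodes with fast paths to the system pool
    mmapplypolicy /gpfs/fs1 -P policy.rules \
        -g /gpfs/fs2/policy-tmp \
        -N nodeA,nodeB,nodeC \
        -I test     # dry run: evaluate rules, take no actions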
3a) If at all possible, encourage your data and application designers to
"pack" their directories with lots of files. Keep in mind that
mmapplypolicy will read every directory. The more directories, the more
seeks, and the more time spent waiting for IO. OTOH, typical Unix/Linux
usage tends toward a low average number of files per directory.
3b) As admin, you may not be able to change your data design to pack
hundreds of files per directory, BUT you can make sure you are running a
sufficiently modern release of Spectrum Scale that supports "data in
inode". "Data in inode" also means "directory entries in inode", so
practically any small directory, up to a few hundred files, will fit in an
inode -- which means mmapplypolicy can read small directories with one
seek instead of two.
(Perhaps someone will remind us of the release number that first supported
"directories in inode".)
4) Sorry, Fred, but the recommendation to use RAID mirroring of metadata
on SSD is not necessarily important for metadata scanning. In fact it may
work against you. If you use GPFS replication of metadata, that can work
for you, since GPFS can then direct read operations to either copy,
preferring a locally attached copy, depending on how storage is attached
to each node, and so on. The choice of how to replicate metadata -- either
with GPFS replication or in the RAID controller -- is probably best made
based on reliability and recoverability requirements.
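If you do decide to let GPFS do the replication, a minimal sketch (the
file system name is hypothetical, and this assumes MaxMetadataReplicas was
set to at least 2 when the file system was created):

    # set the default number of metadata copies to 2 ("fs1" is a made-up name)
    mmchfs fs1 -m 2
    # re-replicate existing metadata/files to match the new default settings
    mmrestripefs fs1 -R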
5) YMMV - We'd love to hear/see your performance results for
mmapplypolicy, especially if they're good. Even if they're bad, come back
here for more tuning tips!
-- marc of Spectrum Scale (ne GPFS)