[gpfsug-discuss] GPFS de duplication

Jonathan Buzzard jonathan.buzzard at strath.ac.uk
Fri May 21 09:57:23 BST 2021

On 20/05/2021 13:58, Dave Bond wrote:

> As part of a project I am doing, I am looking at whether there are any 
> deduplication options for GPFS. I see there is no native dedupe for the 
> filesystem. The scenario would be user A creates a file or folder and 
> user B takes a copy within the same filesystem, though in separate 
> independent filesets.  The intention would be to store 1 copy.    So I 
> was wondering ....
> 1) Is this planned to be implemented in GPFS in the future?
> 2) Is anyone using any other solutions that have good GPFS integration?

Disk space in 2021 is insanely cheap. With an ESS/DSS you can get many 
PB in a single rack. The complexity that dedup introduces is simply not 
worth it IMHO.

Or put another way, there are better things the developers at IBM can be 
working on than dedup code.

Historically, if you crunched the numbers, the licensing for dedup on 
NetApp cost about the same as just buying more disk unless you were 
storing hundreds of copies of the same data. About the only use case 
would be storing lots of virtual machines. However, I refer you back to 
my original point :-)
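
For anyone who wants to crunch the numbers for their own site, here is a 
minimal sketch in Python. The pricing model is an assumption for 
illustration only and is not taken from any vendor's price list: disk is 
bought per physical TB, the dedup licence is charged per logical 
(pre-dedup) TB, and every unique block is stored "copies" times on 
average. The function names and the figures in the example are 
hypothetical.

```python
import math


def cost_without_dedup(logical_tb, disk_price_per_tb):
    """Cost of simply buying enough disk to hold every copy."""
    return logical_tb * disk_price_per_tb


def cost_with_dedup(logical_tb, copies, disk_price_per_tb, licence_per_logical_tb):
    """Cost of holding one physical copy plus a dedup licence.

    Assumed model: the licence is charged on the logical (pre-dedup)
    capacity, and disk is bought for the physical capacity only.
    """
    physical_tb = logical_tb / copies
    return physical_tb * disk_price_per_tb + logical_tb * licence_per_logical_tb


def break_even_copies(disk_price_per_tb, licence_per_logical_tb):
    """Average number of copies at which dedup costs the same as plain disk.

    Solves disk = licence + disk / copies for copies. If the licence
    costs at least as much as the disk, dedup never pays for itself.
    """
    if licence_per_logical_tb >= disk_price_per_tb:
        return math.inf
    return disk_price_per_tb / (disk_price_per_tb - licence_per_logical_tb)


if __name__ == "__main__":
    # Hypothetical prices in arbitrary currency units per TB.
    disk = 100.0
    for licence in (50.0, 90.0, 99.0):
        print(licence, break_even_copies(disk, licence))
```

Under this model the break-even number of copies grows quickly as the 
licence price approaches the disk price, which is the shape of the 
argument above. Plug in your own quotes before drawing any conclusions.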


Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
