[gpfsug-discuss] Capacity pool filling

Fri Jun 8 09:23:18 BST 2018

Hi Kevin, 

gpfsug-discuss-bounces at spectrumscale.org wrote on 07/06/2018 21:36:59:

> From: "Buterbaugh, Kevin L" <Kevin.Buterbaugh at Vanderbilt.Edu>

> So our restore software lays down the metadata first, then the data.
> While it has no specific knowledge of the extended attributes, it 
> does back them up and restore them.  So the only explanation that 
> makes sense to me is that since the inode for the file says that the
> file should be in the gpfs23capacity pool, the data gets written there.
Hm, fair enough. So it seems to extract and revise information from the 
inodes of backed-up files (since it cannot reuse the inode number, it must 
do so ...). 
Then you could ask your SW vendor to add functionality such as 
"restore using GPFS placement" (ignoring pool info from the inode), 
"restore data to pool XY" (all data restored, but all to pool XY), 
"restore only data from pool XY" (only data originally backed up from XY, 
and restored to XY), and, last but not least, "restore only data from pool 
XY to pool ZZ". The tapes could still do streaming reads, but all files 
not matching the condition would be ignored. That is a bit more 
sophisticated than just copying the inode content except for some fields 
such as the inode number. OTOH, how often are restores really needed ... 
so it might be over the top ...
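
To make the four restore modes concrete, here is a rough Python sketch of 
the per-file decision such a restore tool would have to take. Everything in 
it is hypothetical: BackupEntry, restore_target, the option names and the 
PLACEMENT marker are invented for illustration and do not describe any 
existing backup product.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical marker: do not pin a pool, let the GPFS placement policy decide.
PLACEMENT = "<gpfs-placement>"


@dataclass(frozen=True)
class BackupEntry:
    """One backed-up file as seen on tape (illustrative only)."""
    path: str
    source_pool: str  # pool recorded in the backed-up inode


def restore_target(entry: BackupEntry,
                   only_from: Optional[str] = None,
                   force_to: Optional[str] = None,
                   use_placement: bool = False) -> Optional[str]:
    """Return the pool to restore into, or None if the file is skipped."""
    # "restore only data from pool XY": ignore everything else on the tape.
    if only_from is not None and entry.source_pool != only_from:
        return None
    # "restore data to pool XY" / "... from pool XY to pool ZZ".
    if force_to is not None:
        return force_to
    # "restore using GPFS placement": drop the pool info from the inode.
    if use_placement:
        return PLACEMENT
    # Default: honour the pool stored in the inode (today's behaviour).
    return entry.source_pool
```

With an entry backed up from pool XY, restore_target(entry, only_from="XY", 
force_to="ZZ") gives "ZZ", while an entry from any other pool gives None, 
i.e. the tape streams past it.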
> 
> We've also determined that it is actually quicker to tell 
> our tape system to restore all files from a fileset than to take the
> time to tell it to selectively restore only certain files - and the 
> same amount of tape would have to be read in either case.
Given that you know where the restored files are going in the file 
system, you can also craft a policy that deletes all files which are in 
the capacity pool and whose path lies in the restore area. Running that 
every hour should keep your capacity pool from filling up. The tapes just 
need to read more, but because they do it in streaming mode, it is 
probably not more expensive than shoe-shining. And that could also be 
applied to the third data pool which also retrieves files.
But maybe your script is also sufficient.
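
A rough sketch of such a policy, written as a small Python helper that only 
generates the rule text. The restore path /gpfs23/RESTORE and the rule name 
are assumptions for illustration; the pool name is the one from your mail. 
The rule follows the DELETE rule form of the policy language as I remember 
it, so please check it against the documentation and try it with 
mmapplypolicy in test mode (-I test) before letting it delete anything.

```python
def build_purge_rule(pool: str, restore_dir: str,
                     rule_name: str = "purge_restored") -> str:
    """Build the text of a policy rule that deletes files which sit in
    `pool` and whose path lies below `restore_dir`."""
    for value in (pool, restore_dir, rule_name):
        # Keep the sketch simple: refuse anything that would break quoting.
        if "'" in value:
            raise ValueError(f"single quote not allowed in {value!r}")
    # Normalise to exactly one trailing slash so that a sibling directory
    # such as /gpfs23/RESTORE2 is not matched as well.
    prefix = restore_dir.rstrip("/") + "/"
    # Note: '_' and '%' inside the path act as LIKE wildcards.
    return (
        f"RULE '{rule_name}' DELETE\n"
        f"  FROM POOL '{pool}'\n"
        f"  WHERE PATH_NAME LIKE '{prefix}%'\n"
    )


if __name__ == "__main__":
    # Hypothetical restore area; adjust to the real one.
    print(build_purge_rule("gpfs23capacity", "/gpfs23/RESTORE"))
```

The printed text would go into a policy file that an hourly cron job hands 
to mmapplypolicy.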

Kind regards

Dr. Uwe Falke

IT Specialist
High Performance Computing Services / Integrated Technology Services / 
Data Center Services
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefalke at de.ibm.com
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland Business & Technology Services GmbH / Management: 
Thomas Wolter, Sven Schooß
Registered office: Ehningen / Court of registration: Amtsgericht Stuttgart, 
HRB 17122