[gpfsug-discuss] [EXTERNAL] mmbackup vs SOBAR

Alec anacreo at gmail.com
Fri Apr 28 11:41:28 BST 2023

I don't know how in-context this is... but we do our backups using
NetBackup to Data Domain with Accelerated Backups...

My chargeback is through the roof because they think we back up all 600TB
of data in about 10 hours, which is a synthetic backup speed of ~1TB/min.
Our actual data throughput on a single 10G channel is about 1.7GB/s, and
our backup reduction on the Data Domain is about 168:1.  Which is
phenomenal, since our primary source database is about 18TB, our stats
against that DB amount to about 600TB of data, and our 35 days of backups
on the Data Domain take about 17TB of space.  Our data is statistical
files, a mix of gz and non-gz, and the Data Domain seems to dedupe and
compress either format down to the original uncompressed bytes; e.g. we
gzipped about 30TB overnight and our backup footprint didn't grow at
all... (gotta love deduplication).
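A quick back-of-envelope check of those figures (a sketch only; decimal units assumed, i.e. TB = 10^12 bytes):

```python
TB = 10**12
GB = 10**9

protected = 600 * TB              # logical data per backup run
window_min = 10 * 60              # 10-hour backup window

# Synthetic (reported) speed: the Data Domain synthesizes the full image,
# so NetBackup reports the whole 600TB as protected within the window.
synthetic_tb_per_min = protected / TB / window_min
print(f"synthetic speed: {synthetic_tb_per_min:.1f} TB/min")

# Physically shipping 600TB at the quoted 1.7GB/s wire rate would take
# ~98 hours -- far longer than the window. Only changed data actually
# crosses the wire; the rest is deduplicated away.
hours_at_wire_speed = protected / (1.7 * GB) / 3600
print(f"hours to send 600TB at 1.7GB/s: {hours_at_wire_speed:.0f}")
```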

Our backup performance and our primary disk performance using Spectrum
Scale are pretty much wire speed.  Because we can scan millions of files a
minute, and we can gzip-compress our old data automatically using a simple
GPFS policy, we just haven't bothered creating an "archive" solution
(which can create its own headaches); essentially we archive our data
in place.
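The production mechanism here is a GPFS ILM policy, but the effect can be sketched in plain Python (illustrative only -- the age threshold and tree root are assumptions, not the poster's actual policy):

```python
import gzip
import os
import shutil
import time

def compress_old_files(root: str, min_age_days: int = 90) -> list[str]:
    """Gzip-compress regular files older than min_age_days, in place.

    Mirrors the effect of a GPFS policy selecting cold files;
    already-compressed .gz files are skipped.
    """
    cutoff = time.time() - min_age_days * 86400
    done = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".gz"):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) >= cutoff:
                continue
            with open(path, "rb") as src, gzip.open(path + ".gz", "wb") as dst:
                shutil.copyfileobj(src, dst)
            os.remove(path)          # keep only the compressed copy
            done.append(path + ".gz")
    return done
```

In real life mmapplypolicy does the candidate selection at millions of files a minute; this sketch just shows the "archive in place" idea.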

Honestly, this configuration has only a few weak points.  Since we have a
single file system, we have to manually break the backups up into multiple
policies by fileset to achieve an optimal amount of parallelism.  A full
data recovery could take more time than desired.  We replicate our backups
and our primary disk, but if we fail over using our disk-image replication
we must do a full backup from scratch, which can take two weeks to reach a
proper steady state again.  We could provide some relief by periodically
backing up from the BCP site so the backups don't fall off (though we'd
have to pause replication while doing so).  Or we could have used GPFS's
native data replication, run active/active sites, and simply backed up
from both.
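The manual fileset split amounts to sharding work across backup streams. A minimal sketch of the balancing step (fileset names, sizes, and stream count are invented for illustration):

```python
def shard_filesets(filesets: dict[str, int], streams: int) -> list[list[str]]:
    """Spread filesets across backup streams, balancing by approximate size.

    filesets maps fileset name -> approximate bytes; each returned list is
    one backup policy / stream.  Largest-first greedy bin packing.
    """
    bins = [([], 0) for _ in range(streams)]
    for name, size in sorted(filesets.items(), key=lambda kv: -kv[1]):
        # place into the currently lightest stream
        idx = min(range(streams), key=lambda i: bins[i][1])
        names, total = bins[idx]
        bins[idx] = (names + [name], total + size)
    return [names for names, _ in bins]

shards = shard_filesets({"a": 100, "b": 50, "c": 50, "d": 10}, streams=2)
print(shards)  # e.g. [['a', 'd'], ['b', 'c']]
```

Each shard would then get its own mmapplypolicy/backup policy, which is roughly what the "multiple policies by fileset" above does by hand.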

My favorite part of the solution: it uses a commodity backup that is
completely off disk.  If I needed to restore to any other location I
could, and I can rely on the backup team to fully support the solution
without any unique costs or configuration (except the multiple
policies)... yet we're the fastest and largest backup client in the
environment by miles, on a "commodity" solution.

Anyhow, just my two cents on how we've achieved success with the
wonderful Spectrum Scale FS and our standard NetBackup / Data Domain
backup stack.

Please reach out to me privately if anyone wants more details about this
configuration; I'll share what is permissible.


On Tue, Apr 25, 2023 at 4:58 AM Paul Ward <p.ward at nhm.ac.uk> wrote:

> Hi Peter and Robert,
> Sorry for the delayed reply; I only occasionally check the mailing list.
> I'm happy to have an MS Teams (or other platform) call about the setup you
> are talking about, as we've just decommissioned that kind of environment:
> Spectrum SCALE
> Spectrum Protect with HSM using a dual robot High density tape library
> with off site copy
> SOBAR - (in theory) implemented.
> Kindest regards,
> Paul
> Paul Ward
> TS Infrastructure Architect
> Natural History Museum
> T: 02079426450
> E: p.ward at nhm.ac.uk
> -----Original Message-----
> From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> On Behalf Of
> Peter Childs
> Sent: Thursday, March 9, 2023 4:08 PM
> To: gpfsug-discuss at gpfsug.org
> Subject: Re: [gpfsug-discuss] [EXTERNAL] mmbackup vs SOBAR
> I've been told "you really should be using SOBAR" a few times, but never
> really understood how to do so or the steps involved. I feel sure it
> should have some kind of white paper. So far I've been thinking of setting
> up some kind of test system, but get a little lost on where to start (and
> lack of time).
> We currently use mmbackup to x2 servers using `--tsm-servers TSMServer1,
> TSMServer2` to have two independent backups, and this works nicely until
> you lose a tape, when restoring that tape is going to be a nightmare
> (read: rebuild the whole shadow database).
> We started with a copy pool until we filled our tape library up, then
> swapped to Protect replication until we found it really did not work very
> well (really slow and missing files). IBM suggested we use mmbackup
> with 2 servers and two independent backups, which is working very well
> for us now.
> I think if I were going to implement SOBAR I'd want to run mmbackup as
> well, as SOBAR will not give you point-in-time or partial recovery and is
> really only a disaster-recovery solution. I'd also probably want 3 copies
> on tape: 1 via SOBAR, and 2x via mmbackup (two backups or a copy pool).
> I'm currently thinking of playing with HSM and SOBAR on a test system, but
> have not started yet... Maybe a talk on backups at the next UG would be
> helpful; I'm not sure if I want to do one, or if we can find an "expert".
> Peter Childs
> ________________________________________
> From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> on behalf of
> Robert Horton <robert.horton at icr.ac.uk>
> Sent: Thursday, March 9, 2023 3:44 PM
> To: gpfsug-discuss at gpfsug.org
> Subject: [EXTERNAL] [gpfsug-discuss] mmbackup vs SOBAR
> Hi Folks,
> I'm setting up a filesystem for "archive" data which will be aggressively
> tiered to tape using the Spectrum Protect (or whatever it's called today)
> Space Management. I would like to have two copies on tape for a) reading
> back the data on demand, b) recovering accidentally deleted files, etc.,
> and c) disaster recovery of the whole filesystem if necessary.
> My understanding is:
>   1.  Backup and Migration are completely separate things to Spectrum
> Protect. You can't "restore" from a migrated file nor do a DMAPI read from
> a backup.
>   2.  A SOBAR backup would enable the namespace to be restored if the
> filesystem were lost but needs all files to be (pre-)migrated and needs the
> filesystem blocksize etc to match.
>   3.  A SOBAR backup isn't much help for restoring individual (deleted)
> files. There is a dsmmigundelete utility that restores individual stubs but
> doesn't restore directories etc so you really want a separate backup.
> My thinking is to do backups to one (non-replicated) tape pool, migrate
> to another, and run mmimgbackup regularly. I'd then have a path to a full
> restore if either set of tapes were lost, although it seems rather messy,
> and it's a bit of a pain that SP needs to read everything twice.
> So... have I understood that correctly and does anyone have any better /
> alternative suggestions?
> Thanks,
> Rob
> Robert Horton | Scientific Computing Infrastructure Lead
> The Institute of Cancer Research | 237 Fulham Road, London, SW3 6JB
> T +44 (0) 20 7153 5350 | E robert.horton at icr.ac.uk | W http://www.icr.ac.uk/
> Twitter @ICR_London | Facebook http://www.facebook.com/theinstituteofcancerresearch
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
