[gpfsug-discuss] GPFS Evaluation List - Please give some comments

Jez Tucker Jez.Tucker at rushes.co.uk
Thu May 31 09:32:24 BST 2012


Hello Grace,

  I've cribbed out the questions you've already answered.
Though I think these would be best directed to IBM pre-sales technical staff to qualify them.

Regards,

Jez

> -----Original Message-----
> From: gpfsug-discuss-bounces at gpfsug.org [mailto:gpfsug-discuss-
> bounces at gpfsug.org] On Behalf Of Grace Tsai
> Sent: 30 May 2012 20:16
> To: gpfsug-discuss at gpfsug.org
> Subject: [gpfsug-discuss] GPFS Evaluation List - Please give some comments
> 
> Hi,
> 
> We are in the process of choosing a permanent file system for our
> institution; GPFS is one of the three candidates. Could someone give
> comments or answers to the requests listed below? Basically, I need your
> help to mark 1 or 0 in the GPFS column if a feature exists or doesn't
> exist, respectively. Please also add supporting comments if a feature has
> additional info, e.g. 100PB single-namespace file system supported, etc.
> I have answered some of them myself, where I have tested the feature or
> found the information in Google or the manuals.
> 
> 
> 9. Disk quotas based on directory. 

= 1 (per directory, via filesets: a fileset is linked into the directory tree as a junction and can be tied to a storage pool via placement rules, and quotas can be set per fileset).  The maximum number of filesets is 10,000 per file system in 3.5.
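As a rough sketch (the file system name 'gpfs0' and fileset 'projectA' are just placeholders - check the exact syntax against your GPFS level):

    # create a fileset and link it into the tree as a directory (junction)
    mmcrfileset gpfs0 projectA
    mmlinkfileset gpfs0 projectA -J /gpfs/gpfs0/projectA

    # enable quotas, then set block/inode limits on the fileset
    mmchfs gpfs0 -Q yes
    mmedquota -j gpfs0:projectA

    # optional placement rule (goes in a policy file, installed with mmchpolicy)
    RULE 'projA' SET POOL 'sata' FOR FILESET ('projectA')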


> Groupadmin-Visible Features
> ---------------------------------------
 
> 5. Nesting groups within groups is permitted.
> 
> 6. Groups are equal partners with users in terms of access control lists.

GPFS supports POSIX and NFSv4 ACLs (which are not quite the same as Windows ACLs).

> 7. Group managers can adjust disk quotas
> 
> 8. Group managers can create/delete/modify user spaces.
> 
> 9. Group managers can create/delete/modify group spaces.

...to paraphrase: users with admin privileges (root / sudoers) can adjust these things.  How you organise your user and group administration is up to you; it is external to GPFS.
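If you want group managers to be able to adjust quotas without handing out full root, a sudoers entry along these lines would do it (the group name is made up; the GPFS binaries normally live in /usr/lpp/mmfs/bin):

    # /etc/sudoers.d/gpfs-quota-admins (illustrative only)
    %grpmgrs ALL=(root) NOPASSWD: /usr/lpp/mmfs/bin/mmedquota, \
                                  /usr/lpp/mmfs/bin/mmrepquota, \
                                  /usr/lpp/mmfs/bin/mmlsquota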


> Sysadmin-Visible Features
> -----------------------------------
> 
> 1. Namespace is expandable and shrinkable without file system downtime.
>      (My answer:    1)
> 
> 2. Supports storage tiering (e.g., SSD, SAS, SATA, tape, grid, cloud) via some
> type of filtering, without manual user intervention (Data life-cycle
> management)

= 1.  You can do this with GPFS migration policies and THRESHOLD rules.  Or look at IBM's V7000 Easy Tier.
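Roughly, the migration rule looks like this (pool names are examples, and in 3.5 the THRESHOLD is normally acted on via a lowDiskSpace callback registered with mmaddcallback, which then runs mmapplypolicy):

    /* when the 'ssd' pool passes 80% full, migrate the coldest files
       down to 'sata' until it drops below 60% */
    RULE 'tierdown' MIGRATE FROM POOL 'ssd'
         THRESHOLD(80,60)
         WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME)
         TO POOL 'sata'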

> 3. User can provide manual "hints" on where to place files based on usage
> requirements.

Do you mean the user is prompted when writing a file?  If so, then no.  There is an API, though, so you could integrate that functionality if required, with your application deferring to your GPFS API program before writes.  I'd suggest user education is far simpler and cheaper to maintain.  If you need prompts, your workflow is inefficient; it should be transparent to the user.

> 4. Allows resource-configurable logical relocation or actual migration of data
> without user downtime (Hardware life-cycle
> management/patching/maintenance)

= 1

> 6. Product has at least two commercial companies providing support.

=1.  Many companies provide OEM GPFS support, though at some point an issue may be escalated back to IBM if it requires the development teams.

> 9. Customized levels of data redundancy at the file/subdirectory/partition
> layer, based on user requirements.
> Replication. Load-balancing.

=1
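For example (from memory, so verify the flags - I believe -r is data replicas and -m metadata replicas, and the maximum is fixed by -R/-M when the file system is created):

    # ask for two copies of data and metadata for a particular file
    mmchattr -r 2 -m 2 /gpfs/gpfs0/projectA/important.dat
    mmlsattr /gpfs/gpfs0/projectA/important.dat

    # or change the defaults for newly created files
    mmchfs gpfs0 -r 2 -m 2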

> 10. Management software fully supports command line interface (CLI)

=1


> 10. Management software supports a graphical user interface (GUI)

=1, if you buy IBM's SONAS.  I presume the V7000 has something similar.

> 11. Must run on non-proprietary x86/x64 hardware (Note: this might
> eliminate some proprietary solutions that meet every other requirement.)

=1

> 13. Robust and reliable: file system must recover gracefully from an
> unscheduled power outage, and not take forever for fsck.

=1.  I've been through this personally and it was all good.  All cluster nodes can participate in fsck.
(Actually, one of our Qlogic switches spat badness at two of our storage units, which caused both units to soft-reboot simultaneously.  Apparently the Qlogic firmware couldn't handle the amount of data we transfer per day in an internal counter.  Needless to say, new firmware was required.)
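For reference, the checks look like this ('gpfs0' is an example device name):

    # read-only online check while the file system is mounted
    mmfsck gpfs0 -o -n

    # full offline check/repair - file system must be unmounted everywhere
    mmfsck gpfs0 -y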
 
> 14. Client code must support RHEL.
>       (My answer:  1)

> 
> 18. Affordable
> 
> 19. Value for the money.

Both of the above points are arguable.  Nobody knows your budget.
That said, it's cheaper to buy a GPFS system than an Isilon system of similar spec (I have both - and we're just about to switch off the Isilon due to running and expansion costs).  StorNext is just too much management overhead and constant de-fragging.

> 20. Provides native accounting information to support a storage service
> model.

What does 'storage service model' mean?  Chargeback per GB per user?
If so, you can write a LIST policy to obtain this information, or use fileset quota accounting.
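A minimal sketch of the list-policy route (names and paths are examples; the SHOW columns can be whatever your billing scripts want):

    RULE 'acct' LIST 'peruser'
         SHOW(VARCHAR(USER_ID) || ' ' || VARCHAR(KB_ALLOCATED))

    # run it deferred so you just get the list files, then sum per user
    mmapplypolicy gpfs0 -P acct.pol -f /tmp/acct -I defer

    # or, for per-fileset accounting straight from the quota subsystem
    mmrepquota -j gpfs0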
 
> 21. Ability to change file owner throughout file system (generalized ability
> to implement metadata changes)

=1.  You'd run a policy to do this.
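For example, a LIST rule to find everything owned by the old UID, then chown from the resulting file list (the UID, names and paths are examples; check the format of the generated list file before scripting against it, as the path is preceded by inode/generation/snapshot columns):

    RULE 'reown' LIST 'olduid' WHERE USER_ID = 1234

    mmapplypolicy gpfs0 -P reown.pol -f /tmp/reown -I defer
    # then feed the path column of the generated /tmp/reown.list.olduid to chown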

> 22. Allows discrete resource allocation in case groups want physical
> resource separation, yet still allows central management.
>        Resource allocation might control bandwidth, LUNx, CPU,
> user/subdir/filesystem quotas, etc.

= 0.5.  You can control maximum bandwidth, but you can't guarantee a minimum.  CPU is irrelevant to GPFS.
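The relevant knob is maxMBpS, though it's really a per-node tuning hint (it influences prefetch and write-behind) rather than a hard cap, so treat this as approximate ('ioNodes' is just an example node class):

    # tell a set of nodes roughly how much I/O bandwidth they may use
    mmchconfig maxMBpS=2048 -N ioNodes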

> 23. Built-in file system compression option

No, there's no built-in compression.  Perhaps you could use TSM as an external storage pool and de-dupe to a VTL?  If you back-end that to tape, remember it will un-dedupe as it writes to tape.
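If you did go that route, it's driven by an EXTERNAL POOL rule plus a migration rule, roughly along these lines (the EXEC script shown is the sample interface script shipped with GPFS, if I remember the path correctly - adjust to whatever your TSM setup actually uses):

    RULE EXTERNAL POOL 'hsm'
         EXEC '/usr/lpp/mmfs/samples/ilm/mmpolicyExec-hsm.sample'

    RULE 'tohsm' MIGRATE FROM POOL 'sata'
         THRESHOLD(90,70)
         WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME)
         TO POOL 'hsm'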

> 24. Built-in file-level replication option

=1

> 25. Built-in file system deduplication option

=0, I think.

> 26. Built-in file system encryption option

=1, if you buy IBM storage with on-disk encryption, i.e. the disks themselves are encrypted and unreadable if removed, but the file system itself is not encrypted.

> 27. Support VM image movement among storage servers, including moving
> entire jobs (hypervisor requirement)

That's a huge scope.  Check the requirements of your chosen hypervisor; GPFS is just a file system.

> 28. Security/authentication of local user to allow access (something stronger
> than host-based access)

No.  Unless you chkconfig the GPFS start scripts off and then have the user authenticate to be able to run a script which starts GPFS and mounts the file system.
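Something like this on RHEL ('gpfs0' is an example device):

    # stop GPFS starting automatically at boot
    chkconfig gpfs off

    # wrapper script the user runs once authenticated
    /usr/lpp/mmfs/bin/mmstartup
    /usr/lpp/mmfs/bin/mmmount gpfs0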

> 29. WAN-based file system (e.g., for disaster recover site)

=1 

> 31. Can perform OPTIONAL file system rebalancing when adding new
> storage.

=1
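For example (the disk descriptor file name is a placeholder):

    # add the new NSDs to the file system, then rebalance existing data
    mmadddisk gpfs0 -F newdisks.desc
    mmrestripefs gpfs0 -b

The rebalance is optional and can be run whenever suits, as it generates a fair amount of I/O.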
 
> 32. Protection from accidental, large scale deletions

=1, via snapshots, though that's retrospective.  No system is idiot-proof.
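For example, a nightly snapshot from cron (file system name and retention are up to you):

    # create a snapshot; users can copy files back out of
    # /gpfs/gpfs0/.snapshots/<snapshot-name>/ themselves
    mmcrsnapshot gpfs0 nightly_$(date +%Y%m%d)

    # list and expire old ones
    mmlssnapshot gpfs0
    mmdelsnapshot gpfs0 nightly_20120501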

> 33. Ability to transfer snapshots among hosts.

Unknown.  All nodes in the GPFS cluster would see the snapshot.  Transferring a snapshot to a different GPFS cluster for DR - er, not quite sure.

> 34. Ability to promote snapshot to read/write partition

What does 'promote' mean in this context?

> 35. Consideration given to number of metadata servers required to support
> overall service, and how that affects HA, i.e.,
>       must be able to support HA on a per namespace basis . (How many MD
> servers would we need to keep file service running?)

Two dedicated NSD servers for all namespaces is a good setup.  Though note that metadata handling is shared between all nodes.

> 36. Consideration given to backup and restore capabilities and compatible
> hardware/software products. Look at timeframe requirements.
>        (What backup solutions does it recommend?)

I rather like TSM.  Not tried HPSS.

> 37. Need to specify how any given file system is not POSIX-compliant so we
> understand it. Make this info available to users.
>        (What are its POSIX shortcomings?)

GPFS is POSIX compliant.  I'm personally unaware of any POSIX compatibility shortcomings.

> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss




