[gpfsug-discuss] HDFS protocol in 4.2

Stijn De Weirdt stijn.deweirdt at ugent.be
Mon Nov 30 19:31:49 GMT 2015


hi all,

the gpfs 4.2.0 advanced administration guide has a section on the HDFS
protocol. while reading it, i'm a bit puzzled whether this has any
advantage for a non-FPO site.

we are still experimenting with the "regular" gpfs hadoop connector,
so it would be nice to hear of any advantages (besides protocol
transparency) over the hadoop connector.

in particular performance comes to mind ;)

the admin guide advises enabling local read, which seems understandable
for FPO, but what does this mean for a non-FPO site? sending data over
RPC is probably worse performance-wise compared to the gpfs hadoop binding.

also, are there any other advantages possible with proper name node and
data node services from the hdfs protocol? (like zero copy shuffle on gpfs,
something that didn't seem to exist with the connector during some tests
we ran, and which was a bit disappointing, gpfs being a shared filesystem
and all that)

many thanks,

stijn


