[gpfsug-discuss] Mounting GPFS data on OpenStack VM

Maloney, John Daniel malone12 at illinois.edu
Wed Jan 18 03:05:15 GMT 2017


I agree with Aaron on option 1; trusting users to do nothing malicious would be quite a stretch for most people’s use cases.  Even if you do trust them, a user’s credentials getting stolen and then used by someone else could be a real issue, since the attacker wouldn’t have to get lucky and find a VM with an unpatched root escalation vulnerability.  Security aside, you’ll probably want to make sure your VMs have an external IP that can be reached by the GPFS cluster.  We found that routing GPFS traffic through the OpenStack NAT was possible, but tricky (though this was an older version of OpenStack…it could be better now?).  Using an external IP may be the natural way for most folks, but I wanted to point it out nonetheless.
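
For anyone going the external-IP route, a minimal sketch with the OpenStack CLI might look like the below; the network name “public”, the server name “vm01”, and the address are just placeholders for your environment:

    # allocate a floating IP from the external network
    openstack floating ip create public
    # attach it to the VM that needs to talk to the GPFS cluster
    openstack server add floating ip vm01 203.0.113.25
    # from a GPFS cluster node, confirm the VM is reachable on that address
    ping -c 3 203.0.113.25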

We haven’t done much with option 2; we’ve done work using native clients on the hypervisors to provide Cinder/Glance storage, but not to share other data into the VMs.

We currently use option 3 to export groups’ project directories to their VMs using the CES protocol nodes.  It’s getting the job done right now (we have close to 100 VMs mounting from it).  I would definitely recommend giving your maxFilesToCache and maxStatCache parameters a big bump from the defaults on the export nodes if you weren’t planning to already (I set mine to 1,000,000 for each).  We saw those become a point of contention with our users’ workloads.  That change was implemented fairly recently and so far, so good.  Aaron’s point about logistics in his answer to option 1 is relevant here too, especially if you have a high VM turnover rate where IP addresses are recycled and different projects are being exported.  You’ll want to keep track of VMs and exports to prevent a new VM from picking up an old IP that still has access to an export it isn’t supposed to see because the old entry hasn’t been flushed out.  In our situation there are 30-40 projects, all of their names visible to anyone who runs ls on the project directory, so it wouldn’t take much for a user to spin up a new VM and give them all a try.
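
As a rough sketch of what that tuning and a per-client export look like (the node class, filesystem path, and client subnet below are placeholders for your setup, and the cache parameters only take effect after GPFS is restarted on those nodes):

    # bump the file/stat caches on the CES/protocol nodes only
    mmchconfig maxFilesToCache=1000000,maxStatCache=1000000 -N cesNodes

    # export a project directory to a specific client subnet, read/write, root squashed
    mmnfs export add /gpfs/fs0/projects/projA --client "203.0.113.0/24(Access_Type=RW,Squash=root_squash)"
    # review existing exports and their clients as VMs (and their IPs) come and go
    mmnfs export list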

I agree this is a really interesting topic; there are a lot of ways to come at this, so hopefully more folks chime in on what they’re doing.

Best,

J.D. Maloney
Storage Engineer | Storage Enabling Technologies Group
National Center for Supercomputing Applications (NCSA)

On 1/17/17, 6:47 PM, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Aaron Knister" <gpfsug-discuss-bounces at spectrumscale.org on behalf of aaron.s.knister at nasa.gov> wrote:

    I think the 1st option creates challenges both with security (e.g.
    do you fully trust the users of your VMs not to do bad things as root,
    either maliciously or accidentally? how do you ensure userids are
    properly mapped inside the guest?) and logistically (as VMs come and go,
    how do you automate adding them to / removing them from the GPFS cluster?).
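
    As a very rough sketch of that add/remove step (the hostname is hypothetical, and this assumes the image already has the GPFS client installed and passwordless ssh set up), the cluster side is roughly:

        # from an existing cluster node: add the new VM and accept a client license
        mmaddnode -N vm01.example.org
        mmchlicense client --accept -N vm01.example.org
        # ...and remove it again when the VM is torn down
        mmdelnode -N vm01.example.org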
    
    I think the 2nd option is ideal, perhaps using something like 9p
    (http://www.linux-kvm.org/page/9p_virtio) to export filesystems from the
    hypervisor to the guest. I'm not sure how you would integrate this with
    Nova, and I've heard from others that there are stability issues, but I
    can't comment first hand. Another option might be to NFS/CIFS export the
    filesystems from the hypervisor to the guests via the 169.254.169.254
    metadata address, although I don't know how feasible that may or may not
    be. The advantage of using the metadata address is that it should scale
    well and take the pain out of a guest having to map an IP address to its
    local hypervisor using an external method.
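
    As a sketch of the guest side of the 9p approach (the mount tag and mount point are hypothetical, and this assumes the hypervisor's libvirt domain already exports a GPFS directory through a <filesystem type='mount'> passthrough element with that target tag):

        # inside the guest: mount the hypervisor-exported directory over virtio-9p
        mount -t 9p -o trans=virtio,version=9p2000.L gpfs_projects /mnt/projects
        # or persist it in /etc/fstab
        gpfs_projects  /mnt/projects  9p  trans=virtio,version=9p2000.L  0  0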
    
    Perhaps number 3 is the best way to go, especially (arguably only) if 
    you use kerberized NFS or SMB. That way you don't have to trust anything 
    about the guest and you theoretically should get decent performance.
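
    For illustration, the guest-side mount of a kerberized NFS export from a CES/protocol node might look roughly like this (the server name and paths are hypothetical, and it assumes the export and the guest's keytab are already set up for Kerberos):

        # NFSv4 mount with Kerberos authentication and integrity checking
        mount -t nfs4 -o sec=krb5i ces-nfs.example.org:/gpfs/fs0/projects/projA /mnt/projA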
    
    I'm really curious what other folks have done on this front.
    
    -Aaron
    
    On 1/17/17 4:50 PM, Brian Marshall wrote:
    > UG,
    >
    > I have a GPFS filesystem.
    >
    > I have an OpenStack private cloud.
    >
    > What is the best way for Nova Compute VMs to have access to data inside
    > the GPFS filesystem?
    >
    > 1) Should VMs mount GPFS directly with a GPFS client?
    > 2) Should the hypervisor mount GPFS and share to nova computes?
    > 3) Should I create GPFS protocol servers that allow nova computes to
    > mount over NFS?
    >
    > All advice is welcome.
    >
    >
    > Best,
    > Brian Marshall
    > Virginia Tech
    >
    >
    
    -- 
    Aaron Knister
    NASA Center for Climate Simulation (Code 606.2)
    Goddard Space Flight Center
    (301) 286-2776
    _______________________________________________
    gpfsug-discuss mailing list
    gpfsug-discuss at spectrumscale.org
    http://gpfsug.org/mailman/listinfo/gpfsug-discuss
    


