[gpfsug-discuss] mmcrfs issue

Buterbaugh, Kevin L Kevin.Buterbaugh at Vanderbilt.Edu
Wed Mar 15 14:25:41 GMT 2017


Hi All,

Since I started this thread I guess I should chime in, too … for us it was simply that we were testing a device that did not have hardware RAID controllers, and we wanted to implement something roughly equivalent to RAID 6 LUNs.
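
For the curious, the sort of layout we were experimenting with looks roughly like the sketch below. This is a minimal, hypothetical example; the pool name "tank", the /dev/sd* device names, and the zvol size are placeholders, not our actual configuration:

    # 8 data + 2 parity drives, roughly analogous to an 8+2P RAID 6 LUN
    zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf \
                             /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk

    # Carve a zvol out of the pool to hand to GPFS as an NSD; on Linux it
    # shows up as a /dev/zdN block device that the NSD stanza can reference.
    zfs create -V 10T tank/nsd1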

Kevin

> On Mar 14, 2017, at 5:16 PM, Aaron Knister <aaron.s.knister at nasa.gov> wrote:
> 
> For me it's the protection against bitrot and silent data corruption, plus, in theory, the write caching offered by adding log devices, which could help with small random writes (although there are other problems with ZFS + synchronous workloads that stop this from actually materializing).
> 
> -Aaron
> 
> On 3/14/17 5:59 PM, Luke Raimbach wrote:
>> Can I ask what the fascination with zvols is? Using a copy-on-write file
>> system to underpin another block-based file system seems
>> counterintuitive. Perhaps I've missed something vital, in which case I'd
>> be delighted to have my eyes opened!
>> 
>> On Tue, 14 Mar 2017, 00:06 Aaron Knister, <aaron.s.knister at nasa.gov> wrote:
>> 
>>    I was doing this in testing (with fantastic performance, too) until I
>>    realized the issue with ZFS's behavior with direct I/O on zvols (i.e. not
>>    flushing a write to stable storage after acknowledging it to GPFS).
>>    After setting the sync=always parameter so as not to lose data in the
>>    event of a crash or power outage, the write performance became unbearably
>>    slow (under 100MB/s of writes for an 8+2 RAIDZ2, if I recall correctly).
>>    I even tried adding a battery-backed PCIe write cache
>>    (http://www.netlist.com/products/vault-memory-storage/expressvault-pcIe-ev3/default.aspx)
>>    as a log device to the zpool, but the performance was still really slow.
>>    I posted to the ZFS mailing list asking how to optimize for a
>>    large-block streaming workload, but I didn't get many bites
>>    (http://list.zfsonlinux.org/pipermail/zfs-discuss/2016-February/024851.html).
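
For context, the log-device experiment described above amounts to something like the following sketch; the pool name "tank" and the device path for the battery-backed card are placeholders, not the actual setup:

    # Dedicate the battery-backed PCIe card as a separate intent-log (SLOG)
    # device, so synchronous writes land on it instead of the raidz2 vdev.
    zpool add tank log /dev/nvram0

With sync=always in effect every acknowledged write must first be committed to that log, which helps explain why throughput stayed low even with the extra hardware.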
>> 
>>    I've got an RFE open with IBM
>>    (https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=84994)
>>    to see if the behavior of GPFS could be changed such that it would issue
>>    explicit cache flushes that would allow it to work with ZFS (it might
>>    even be beneficial in FPO environments too).
>> 
>>    -Aaron
>> 
>>    On 3/13/17 4:44 PM, Buterbaugh, Kevin L wrote:
>>    > Hi All,
>>    >
>>    > Two things:
>>    >
>>    > 1) Paul’s suggestion to look at the nsddevices script was the answer I
>>    > needed to fix my mmcrfs issue.  Thanks.
>>    >
>>    > 2) I am also interested in hearing whether anyone is using ZFS to create
>>    > the equivalent of 8+2P hardware RAID 6 LUNs and presenting those to GPFS
>>    > to use as disks.
>>    >
>>    > Thanks…
>>    >
>>    > Kevin
>>    >
>>    >> On Mar 11, 2017, at 4:37 AM, Daniel Kidger <daniel.kidger at uk.ibm.com> wrote:
>>    >>
>>    >> On the subject of using zvols for software RAID / replication, can I ask,
>>    >> as a quick poll, how many people are doing this?
>>    >>
>>    >> And any feedback on stability, tuning and performance?
>>    >>
>>    >> Daniel
>>    >> IBM Technical Presales
>>    >>
>>    >> > On 10 Mar 2017, at 22:44, Aaron Knister <aaron.s.knister at nasa.gov> wrote:
>>    >> >
>>    >> > Those look like zvols. Out of curiosity, have you set sync=always on the
>>    >> > filesystem root or on the zvols themselves? It's my understanding that
>>    >> > without that you risk data loss, since GPFS won't ever cause a sync to be
>>    >> > issued to the zvol for ZFS to flush acknowledged but uncommitted writes.
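
Concretely, that property can be set at either level; the pool and zvol names below are placeholders:

    zfs set sync=always tank          # on the pool root, inherited by child zvols
    zfs set sync=always tank/nsd1     # or on an individual zvol
    zfs get sync tank/nsd1            # verify which value is in effect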
>>    >> >
>>    >> > -Aaron
>>    >> >
>>    >> >> On 3/10/17 4:36 PM, Sanchez, Paul wrote:
>>    >> >> See:
>>    >> >> https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adm.doc/bl1adm_nsddevices.htm
>>    >> >>
>>    >> >>
>>    >> >>
>>    >> >> GPFS has a limited set of device search specs it uses to find connected
>>    >> >> NSDs. When using exotic devices, you need to whitelist the devices
>>    >> >> yourself using the user exit script at /var/mmfs/etc/nsddevices.
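
A minimal nsddevices user exit that whitelists zvol block devices might look something like the sketch below; it is loosely modeled on the sample shipped in /usr/lpp/mmfs/samples/nsddevices.sample, and the zd* name pattern and the "generic" device type are assumptions to adjust for your environment:

    #!/bin/ksh
    # /var/mmfs/etc/nsddevices: user exit consulted during NSD device discovery
    osName=$(/bin/uname -s)

    if [[ $osName = Linux ]]
    then
      # Emit "deviceName deviceType" pairs for the ZFS zvol block devices
      for dev in $(ls /dev 2>/dev/null | grep -E '^zd[0-9]+$')
      do
        echo "$dev generic"
      done
    fi

    # Returning non-zero lets GPFS also run its built-in discovery in addition
    # to the list above; return 0 instead to use only the devices listed here.
    return 1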
>>    >> >>
>>    >> >>
>>    >> >>
>>    >> >> *From:* gpfsug-discuss-bounces at spectrumscale.org
>>    >> >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] *On Behalf Of
>>    >> >> *Buterbaugh, Kevin L
>>    >> >> *Sent:* Friday, March 10, 2017 3:44 PM
>>    >> >> *To:* gpfsug main discussion list
>>    >> >> *Subject:* [gpfsug-discuss] mmcrfs issue
>>    >> >>
>>    >> >>
>>    >> >>
>>    >> >> Hi All,
>>    >> >>
>>    >> >>
>>    >> >>
>>    >> >> We are testing out some flash storage. I created a couple of NSDs
>>    >> >> successfully (?):
>>    >> >>
>>    >> >>
>>    >> >>
>>    >> >> root at nec:~/gpfs# mmlsnsd -F
>>    >> >>
>>    >> >> File system   Disk name    NSD servers
>>    >> >> ---------------------------------------------------------------------------
>>    >> >> (free disk)   nsd1         nec
>>    >> >> (free disk)   nsd2         nec
>>    >> >>
>>    >> >> root at nec:~/gpfs#
>>    >> >>
>>    >> >>
>>    >> >>
>>    >> >> So I tried to create a filesystem:
>>    >> >>
>>    >> >>
>>    >> >>
>>    >> >> root at nec:~/gpfs# mmcrfs gpfs0 -F ~/gpfs/flash.stanza -A yes -B 1M -j
>>    >> >> scatter -k all -m 1 -M 3 -Q no -r 1 -R 3 -T /gpfs0
>>    >> >> GPFS: 6027-441 Unable to open disk 'nsd2' on node nec.
>>    >> >> GPFS: 6027-441 Unable to open disk 'nsd1' on node nec.
>>    >> >> No such device
>>    >> >> No such device
>>    >> >> GPFS: 6027-538 Error accessing disks.
>>    >> >> mmcrfs: 6027-1200 tscrfs failed. Cannot create gpfs0
>>    >> >> mmcrfs: 6027-1639 Command failed. Examine previous error messages to
>>    >> >> determine cause.
>>    >> >> root at nec:~/gpfs#
>>    >> >>
>>    >> >>
>>    >> >>
>>    >> >> Does this output from readdescraw look normal?
>>    >> >>
>>    >> >>
>>    >> >>
>>    >> >> root at nec:~/gpfs# mmfsadm test readdescraw /dev/zd16
>>    >> >> NSD descriptor in sector 64 of /dev/zd16
>>    >> >> NSDid: 0A0023D258C1C02C format version: 1403 Label:
>>    >> >> Paxos sector: -1931478434 number of sectors: 8192 isPdisk: 0
>>    >> >> Comment: NSD descriptor for <unknown> Thu Mar 9 14:50:52 2017
>>    >> >> No Disk descriptor in sector 96 of /dev/zd16
>>    >> >> No FS descriptor in sector 2048 of /dev/zd16
>>    >> >>
>>    >> >> root at nec:~/gpfs# mmfsadm test readdescraw /dev/zd32
>>    >> >> NSD descriptor in sector 64 of /dev/zd32
>>    >> >> NSDid: 0A0023D258C1C02B format version: 1403 Label:
>>    >> >> Paxos sector: -1880562609 number of sectors: 8192 isPdisk: 0
>>    >> >> Comment: NSD descriptor for <unknown> Thu Mar 9 14:50:51 2017
>>    >> >> No Disk descriptor in sector 96 of /dev/zd32
>>    >> >> No FS descriptor in sector 2048 of /dev/zd32
>>    >> >> root at nec:~/gpfs#
>>    >> >>
>>    >> >>
>>    >> >>
>>    >> >> Thanks in advance, all…
>>    >> >>
>>    >> >>
>>    >> >>
>>    >> >> Kevin
>>    >> >>
>>    >> >> —
>>    >> >>
>>    >> >> Kevin Buterbaugh - Senior System Administrator
>>    >> >>
>>    >> >> Vanderbilt University - Advanced Computing Center for Research and Education
>>    >> >>
>>    >> >> Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633
>>    >> >>
>>    >> >> _______________________________________________
>>    >> >> gpfsug-discuss mailing list
>>    >> >> gpfsug-discuss at spectrumscale.org
>>    >> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>    >> >>
>>    >> >
>>    >> > --
>>    >> > Aaron Knister
>>    >> > NASA Center for Climate Simulation (Code 606.2)
>>    >> > Goddard Space Flight Center
>>    >> > (301) 286-2776
>>    >> > _______________________________________________
>>    >> > gpfsug-discuss mailing list
>>    >> > gpfsug-discuss at spectrumscale.org
>>    >> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>    >> >
>>    >> Unless stated otherwise above:
>>    >> IBM United Kingdom Limited - Registered in England and Wales with
>>    >> number 741598.
>>    >> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>>    >>
>>    >> _______________________________________________
>>    >> gpfsug-discuss mailing list
>>    >> gpfsug-discuss at spectrumscale.org
>>    >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>    >
>>    >
>>    >
>>    > _______________________________________________
>>    > gpfsug-discuss mailing list
>>    > gpfsug-discuss at spectrumscale.org
>>    > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>    >
>> 
>>    --
>>    Aaron Knister
>>    NASA Center for Climate Simulation (Code 606.2)
>>    Goddard Space Flight Center
>>    (301) 286-2776
>>    _______________________________________________
>>    gpfsug-discuss mailing list
>>    gpfsug-discuss at spectrumscale.org
>>    http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>> 
>> 
>> 
>> _______________________________________________
>> gpfsug-discuss mailing list
>> gpfsug-discuss at spectrumscale.org
>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>> 
> 
> -- 
> Aaron Knister
> NASA Center for Climate Simulation (Code 606.2)
> Goddard Space Flight Center
> (301) 286-2776
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss


