[gpfsug-discuss] 4K sector NSD support (was: Hardware refresh)

Aaron Knister aaron.s.knister at nasa.gov
Wed Oct 12 01:50:38 BST 2016


Yuri,

(Sorry for being somewhat spammy) I now understand the limitation after 
some more testing (I'm a hands-on learner, can you tell?). Given the 
right code/cluster/fs version levels I can add 4K dataOnly NSDv2 NSDs to 
a filesystem created with NSDv1 NSDs. What I seemingly can't do is add 
any metadataOnly or dataAndMetadata 4K LUNs to an fs that is not 4K 
aligned, which I assume means any fs originally created with NSDv1 
LUNs. It does seem possible to move all data off the NSDv1 LUNs in a 
filesystem behind the scenes onto NSDv2 LUNs using GPFS's own 
migration tools. In that case I believe what's missing is a tool to 
convert just the metadata structures to be 4K aligned, since the data 
would already be on 4K-based NSDv2 LUNs; is that the case? I'm trying 
to figure out exactly what I'm asking for in an RFE.
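
To make that concrete, the migration I have in mind looks roughly like 
this (filesystem and NSD names are made up; it's just a sketch of the 
standard add/delete path, not something I've run end to end):

   # add the new 4K dataOnly NSDv2 NSDs described in a stanza file
   mmadddisk tnb4k -F new4k_nsd.stanza

   # drain an old NSDv1 LUN; mmdeldisk migrates its data off as part
   # of the delete
   mmdeldisk tnb4k d01_nsdv1_001

   # rebalance across the remaining disks when done
   mmrestripefs tnb4k -b

What that still leaves behind is the non-4K-aligned metadata, which is 
the piece there's no tool to convert in place.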

-Aaron

On 10/11/16 7:57 PM, Aaron Knister wrote:
> I think I was a little quick on the trigger. I re-read your last mail
> after doing some testing and understand it differently. I was wrong
> about my interpretation -- you can add 4K NSDv2-formatted NSDs to a
> filesystem previously created with NSDv1 NSDs assuming, as you say, the
> minReleaseLevel and filesystem version are high enough. That negates
> about half of my last e-mail. The fs still doesn't show as 4K aligned:
>
> loressd01:~ # /usr/lpp/mmfs/bin/mmlsfs tnb4k --is4KAligned
> flag                value                    description
> ------------------- ------------------------ -----------------------------------
>  --is4KAligned      No                       is4KAligned?
>
> but *shrug* most of the I/O to these disks should be 1MB anyway. If
> somebody is pounding the FS with smaller than 4K I/O they're gonna get a
> talkin' to.
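>
> For reference, the add itself is just the usual stanza-file dance,
> something along these lines (device, server and pool names below are
> placeholders):
>
>    # new4k_nsd.stanza
>    %nsd:
>      device=/dev/mapper/lun042
>      nsd=d13_4k_001
>      servers=nsdserver1,nsdserver2
>      usage=dataOnly
>      failureGroup=13
>      pool=data
>
>    mmcrnsd -F new4k_nsd.stanza
>    mmadddisk tnb4k -F new4k_nsd.stanza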
>
> -Aaron
>
> On 10/11/16 6:41 PM, Aaron Knister wrote:
>> Thanks Yuri.
>>
>> I'm asking for my own purposes but I think it's still relevant here:
>> we're still at GPFS 3.5 and will be adding dataOnly NSDs with 4K sectors
>> in the near future. We're planning to update to 4.1 before we format
>> these NSDs, though. If I understand you correctly, we can't bring these
>> 4K NSDv2 NSDs into a filesystem with 512-byte NSDv1 NSDs? That's a
>> pretty big deal :(
>>
>> Reformatting every few years with tens of petabytes of data is not
>> realistic for us (it would take years to move the data around). It also
>> goes against my personal preachings about GPFS's storage virtualization
>> capabilities: the ability to perform upgrades/make underlying storage
>> infrastructure changes with behind-the-scenes data migration,
>> eliminating much of the manual hassle of storage administrators doing
>> rsync dances. I guess it's RFE time? It also seems as though AFM could
>> help with automating the migration, although many of our filesystems do
>> not have filesets on them so we would have to re-think how we lay out
>> our filesystems.
>>
>> This is also curious to me with IBM pitching GPFS as a filesystem for
>> cloud services (the cloud *never* goes down, right?). Granted, I believe
>> this pitch started after the NSDv2 format was defined, but if somebody
>> is building a large cloud with GPFS as the underlying filesystem for an
>> object or image store, one might think the idea of having to re-format
>> the filesystem to gain access to critical new features is inconsistent
>> with this pitch. It would be hugely impactful. Just my $.02.
>>
>> As you can tell, I'm frustrated there's no online conversion tool :) Not
>> that there couldn't be... you all are brilliant developers.
>>
>> -Aaron
>>
>> On 10/11/16 1:22 PM, Yuri L Volobuev wrote:
>>> This depends on the committed cluster version level (minReleaseLevel)
>>> and file system format. Since NSDv2 is an on-disk format change, older
>>> code wouldn't be able to understand what it is, and thus if there's a
>>> possibility of a downlevel node looking at the NSD, the NSDv1 format is
>>> going to be used. The code does NSDv1<->NSDv2 conversions under the
>>> covers as needed when adding an empty NSD to a file system.
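>>>
>>> Roughly, the relevant knobs to check (and, once every node is ready,
>>> raise) are these; the file system name is just an example:
>>>
>>>    mmlsconfig minReleaseLevel     # committed cluster version
>>>    mmchconfig release=LATEST      # commit the cluster to the new level
>>>    mmlsfs fs1 -V                  # current vs. original fs format version
>>>    mmchfs fs1 -V full             # enable new on-disk features (one-way)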
>>>
>>> I'd strongly recommend getting a fresh start by formatting a new file
>>> system. Many things have changed over the course of the last few years.
>>> In particular, having a 4K-aligned file system can be a pretty big deal,
>>> depending on what hardware one is going to deploy in the future, and
>>> this is something that can't be bolted onto an existing file system.
>>> Having 4K inodes is very handy for many reasons. New directory format
>>> and NSD format changes are attractive, too. And disks generally tend to
>>> get larger with time, and at some point you may want to add a disk to an
>>> existing storage pool that's larger than the existing allocation map
>>> format allows. Obviously, it's more hassle to migrate data to a new file
>>> system, as opposed to extending an existing one. In a perfect world,
>>> GPFS would offer a conversion tool that seamlessly and robustly converts
>>> old file systems, making them as good as new, but in the real world such
>>> a tool doesn't exist. Getting a clean slate by formatting a new file
>>> system every few years is a good long-term investment of time, although
>>> it comes front-loaded with extra work.
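>>>
>>> If you do go the new file system route, the shape of it is roughly the
>>> following; parameters here are purely illustrative, not recommendations:
>>>
>>>    mmcrnsd -F new_nsd.stanza
>>>    mmcrfs fs2 -F new_nsd.stanza -B 1M -i 4096 -m 2 -r 2 -T /gpfs/fs2
>>>
>>> Created at the current format level with NSDv2 disks and 4K inodes, the
>>> new file system should come up 4K-aligned from the start (worth
>>> verifying with mmlsfs --is4KAligned).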
>>>
>>> yuri
>>>
>>> From: Aaron Knister <aaron.s.knister at nasa.gov>
>>> To: <gpfsug-discuss at spectrumscale.org>,
>>> Date: 10/10/2016 04:45 PM
>>> Subject: Re: [gpfsug-discuss] Hardware refresh
>>> Sent by: gpfsug-discuss-bounces at spectrumscale.org
>>>
>>> ------------------------------------------------------------------------
>>>
>>>
>>>
>>> Can one format NSDv2 NSDs and put them in a filesystem with NSDv1 NSD's?
>>>
>>> -Aaron
>>>
>>> On 10/10/16 7:40 PM, Luis Bolinches wrote:
>>>> Hi
>>>>
>>>> Creating a new FS sounds like the best way to go, NSDv2 being a very good
>>>> reason to do so.
>>>>
>>>> AFM for migrations is quite good; the latest versions allow using the NSD
>>>> protocol for mounts as well. Olaf did a great job explaining this
>>>> scenario in chapter 6 of the redbook:
>>>>
>>>> http://www.redbooks.ibm.com/abstracts/sg248254.html?Open
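>>>>
>>>> A minimal sketch of what that looks like; the names and target path are
>>>> hypothetical and the syntax is from memory, so do check the redbook and
>>>> man pages:
>>>>
>>>>    # old FS remote-mounted on the new cluster, then an AFM fileset
>>>>    # pulls from it over the NSD protocol
>>>>    mmcrfileset newfs migr_projects --inode-space new \
>>>>        -p afmMode=lu,afmTarget=gpfs:///gpfs/oldfs/projects
>>>>    mmlinkfileset newfs migr_projects -J /gpfs/newfs/projects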
>>>>
>>>> --
>>>> Cheers
>>>>
>>>> On 10 Oct 2016, at 23.05, Buterbaugh, Kevin L
>>>> <Kevin.Buterbaugh at Vanderbilt.Edu> wrote:
>>>>
>>>>> Hi Mark,
>>>>>
>>>>> The last time we did something like this was 2010 (we’re doing rolling
>>>>> refreshes now), so there are probably lots of better ways to do this
>>>>> than what we did, but we:
>>>>>
>>>>> 1) set up the new hardware
>>>>> 2) created new filesystems (so that we could make adjustments we
>>>>> wanted to make that can only be made at FS creation time)
>>>>> 3) used rsync to make a 1st pass copy of everything (sketched below)
>>>>> 4) coordinated a time with users / groups to do a 2nd rsync when they
>>>>> weren’t active
>>>>> 5) used symbolic links during the transition (e.g. rm -rvf
>>>>> /gpfs0/home/joeuser; ln -s /gpfs2/home/joeuser /gpfs0/home/joeuser)
>>>>> 6) once everybody was migrated, updated the symlinks (i.e. /home
>>>>> became a symlink to /gpfs2/home)
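>>>>>
>>>>> For a given user's directory, steps 3-5 boiled down to something like
>>>>> the following (the rsync options are what I'd reach for today, not
>>>>> necessarily exactly what we ran back in 2010):
>>>>>
>>>>>    # 1st pass, users still active
>>>>>    rsync -aHAX --numeric-ids /gpfs0/home/joeuser/ /gpfs2/home/joeuser/
>>>>>    # 2nd pass during the agreed quiet window, then swap in the symlink
>>>>>    rsync -aHAX --numeric-ids --delete /gpfs0/home/joeuser/ /gpfs2/home/joeuser/
>>>>>    rm -rf /gpfs0/home/joeuser && ln -s /gpfs2/home/joeuser /gpfs0/home/joeuser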
>>>>>
>>>>> HTHAL…
>>>>>
>>>>> Kevin
>>>>>
>>>>>> On Oct 10, 2016, at 2:56 PM, Mark.Bush at siriuscom.com wrote:
>>>>>>
>>>>>> Have a very old cluster built on IBM X3650’s and DS3500.  Need to
>>>>>> refresh hardware.  Any lessons learned in this process?  Is it
>>>>>> easiest to just build a new cluster and then use AFM?  Add to the existing
>>>>>> cluster and then decommission the old nodes?  What is the recommended process
>>>>>> this?
>>>>>>
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>
>>>>> Kevin Buterbaugh - Senior System Administrator
>>>>> Vanderbilt University - Advanced Computing Center for Research and
>>>>> Education
>>>>> Kevin.Buterbaugh at vanderbilt.edu
>>>>> <mailto:Kevin.Buterbaugh at vanderbilt.edu> - (615)875-9633
>>>>>
>>>>>
>>>>>
>>>>


