From orlando.richards at ed.ac.uk Wed Dec 4 11:37:06 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Wed, 04 Dec 2013 11:37:06 +0000 Subject: [gpfsug-discuss] GPFS samba config In-Reply-To: <1380792182.19439.18.camel@buzzard.phy.strath.ac.uk> References: <1380792182.19439.18.camel@buzzard.phy.strath.ac.uk> Message-ID: <529F13E2.2000703@ed.ac.uk> On 03/10/13 10:23, Jonathan Buzzard wrote: > There are a couple of > other options as allowSambaCaseInsensitiveLookup and > syncSambaMetadataOps but I have not determined exactly what they do... Hi JAB - I think I've figured out what syncSambaMetadataOps does - I think it's a performance optimisation for the vfs_syncops module. With syncSambaMetadataOps set, you can then set: syncops:onmeta = no I found this in: http://sambaxp.org/fileadmin/user_upload/SambaXP2011-DATA/day2track1/SambaXP%202011%20Ambach%20Dietz.pdf under "metadata synchronization performance" (slides 21 & 22) referred to as the mysterious "new (internal) GPFS parameter" Of course, since it's undocumented this is purely guesswork on my part - would be good to get confirmation from IBM on this. Also - I'd love to get the output of "testparm -v" and "mmfsadm dump config" from an IBM SONAS cluster node to see what the best practice GPFS + samba settings are. Any chance of that from an IBMer? Cheers, Orlando -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From jonathan at buzzard.me.uk Wed Dec 4 12:13:08 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Wed, 04 Dec 2013 12:13:08 +0000 Subject: [gpfsug-discuss] GPFS samba config In-Reply-To: <529F13E2.2000703@ed.ac.uk> References: <1380792182.19439.18.camel@buzzard.phy.strath.ac.uk> <529F13E2.2000703@ed.ac.uk> Message-ID: <1386159188.3488.65.camel@buzzard.phy.strath.ac.uk> On Wed, 2013-12-04 at 11:37 +0000, Orlando Richards wrote: > On 03/10/13 10:23, Jonathan Buzzard wrote: > > There are a couple of > > other options as allowSambaCaseInsensitiveLookup and > > syncSambaMetadataOps but I have not determined exactly what they do... > > Hi JAB - I think I've figured out what syncSambaMetadataOps does - I > think it's a performance optimisation for the vfs_syncops module. With > syncSambaMetadataOps set, you can then set: > syncops:onmeta = no > > I found this in: > http://sambaxp.org/fileadmin/user_upload/SambaXP2011-DATA/day2track1/SambaXP%202011%20Ambach%20Dietz.pdf > > under "metadata synchronization performance" (slides 21 & 22) referred > to as the mysterious "new (internal) GPFS parameter" > > Of course, since it's undocumented this is purely guesswork on my part - > would be good to get confirmation from IBM on this. That's a really good catch, and the only one of the magic levers I had no real idea what it was doing. The allowSambaCaseInsensitiveLookup is easy: it is basically getting the file system to do the matching when you have "case sensitive = no" in your smb.conf, rather than Samba trying lots of combinations. What I don't have a handle on is what the performance benefit of setting it is. I am working up to doing testing on what the samba ACLs do. My hypothesis is that it makes the NFSv4 ACLs follow NTFS semantics. This should be relatively easy to test. > > Also - I'd love to get the output of "testparm -v" and "mmfsadm dump > config" from an IBM SONAS cluster node to see what the best practice > GPFS + samba settings are. Any chance of that from an IBMer? > I don't think that will work.
Try flipping some of the magic levers on a test cluster and you will see what I mean. Basically they don't show up in the output of mmfsadm dump :-( JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From chair at gpfsug.org Thu Dec 5 17:21:42 2013 From: chair at gpfsug.org (Chair) Date: Thu, 05 Dec 2013 17:21:42 +0000 Subject: [gpfsug-discuss] 3.5.0.15 changelog Message-ID: <52A0B626.3090501@gpfsug.org> I think this is my favourite changelog message of all time. * The default mount point for a GPFS file system cannot be set to "/". Who found that one out? :-) Jez --- GPFS UG Chair -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at gpfsug.org Thu Dec 5 17:53:42 2013 From: chair at gpfsug.org (Chair) Date: Thu, 05 Dec 2013 17:53:42 +0000 Subject: [gpfsug-discuss] 3.5.0.15 changelog In-Reply-To: <52A0B626.3090501@gpfsug.org> References: <52A0B626.3090501@gpfsug.org> Message-ID: <52A0BDA6.8040006@gpfsug.org> Correct, from .13, not .15. Still. Genius. On 05/12/13 17:21, Chair wrote: > I think this is my favourite changelog message of all time. > > * The default mount point for a GPFS file system cannot be set to "/". > > Who found that one out? :-) > > Jez > --- > GPFS UG Chair > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.lefebvre at calculquebec.ca Sat Dec 7 00:31:53 2013 From: richard.lefebvre at calculquebec.ca (Richard Lefebvre) Date: Fri, 06 Dec 2013 19:31:53 -0500 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? Message-ID: <52A26C79.7010706@calculquebec.ca> Hi, I'm looking for a way to see which node (or nodes) is having an impact on the gpfs server nodes and slowing the whole file system. What happens, usually, is that a user is doing some I/O that doesn't fit the configuration of the gpfs file system and the way it was explained how to use it efficiently. It is usually a lot of unbuffered, byte-sized, very random I/O on a file system that was made for large files and a large block size. My problem is finding out who is doing that. I haven't found a way to pinpoint the node or nodes that could be the source of the problem, with over 600 client nodes. I tried to use "mmlsnodes -N waiters -L" but there is so much waiting that I cannot pinpoint anything. I must be missing something simple. Anyone got any help? Note: there is another thing I'm trying to pinpoint. A temporary imbalance was created by adding a new NSD. It seems that a group of files have been created on that same NSD and a user keeps hitting that NSD causing a high load. I'm trying to pinpoint the origin of that too. At least until everything is balanced back. But will balancing spread those files since they are already on the most empty NSD? Richard From chekh at stanford.edu Mon Dec 9 19:52:52 2013 From: chekh at stanford.edu (Alex Chekholko) Date: Mon, 09 Dec 2013 11:52:52 -0800 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A26C79.7010706@calculquebec.ca> References: <52A26C79.7010706@calculquebec.ca> Message-ID: <52A61F94.7080702@stanford.edu> Hi Richard, I would just use something like 'iftop' to look at the traffic between the nodes. Or 'collectl'. Or 'dstat'. e.g.
dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 http://dag.wiee.rs/home-made/dstat/ For the NSD balance question, since GPFS stripes the blocks evenly across all the NSDs, they will end up balanced over time. Or you can rebalance manually with 'mmrestripefs -b' or similar. It is unlikely that particular files ended up on a single NSD, unless the other NSDs are totally full. Regards, Alex On 12/06/2013 04:31 PM, Richard Lefebvre wrote: > Hi, > > I'm looking for a way to see which node (or nodes) is having an impact > on the gpfs server nodes which is slowing the whole file system? What > happens, usually, is a user is doing some I/O that doesn't fit the > configuration of the gpfs file system and the way it was explain on how > to use it efficiently. It is usually by doing a lot of unbuffered byte > size, very random I/O on the file system that was made for large files > and large block size. > > My problem is finding out who is doing that. I haven't found a way to > pinpoint the node or nodes that could be the source of the problem, with > over 600 client nodes. > > I tried to use "mmlsnodes -N waiters -L" but there is too much waiting > that I cannot pinpoint on something. > > I must be missing something simple. Anyone got any help? > > Note: there is another thing I'm trying to pinpoint. A temporary > imbalance was created by adding a new NSD. It seems that a group of > files have been created on that same NSD and a user keeps hitting that > NSD causing a high load. I'm trying to pinpoint the origin of that too. > At least until everything is balance back. But will balancing spread > those files since they are already on the most empty NSD? > > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Alex Chekholko chekh at stanford.edu From richard.lefebvre at calculquebec.ca Mon Dec 9 21:05:41 2013 From: richard.lefebvre at calculquebec.ca (Richard Lefebvre) Date: Mon, 09 Dec 2013 16:05:41 -0500 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A61F94.7080702@stanford.edu> References: <52A26C79.7010706@calculquebec.ca> <52A61F94.7080702@stanford.edu> Message-ID: <52A630A5.4040400@calculquebec.ca> Hi Alex, I should have mention that my GPFS network is done through infiniband/RDMA, so looking at the TCP probably won't work. I will try to see if the traffic can be seen through ib0 (instead of eth0), but I have my doubts. As for the placement. The file system was 95% full when I added the new NSDs. I know that what is waiting now from the waiters commands is the to the 2 NSDs: waiting 0.791707000 seconds, NSDThread: for I/O completion on disk d9 I have added more NSDs since then but the waiting is still on the 2 disks. None of the others. Richard On 12/09/2013 02:52 PM, Alex Chekholko wrote: > Hi Richard, > > I would just use something like 'iftop' to look at the traffic between > the nodes. Or 'collectl'. Or 'dstat'. > > e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 > http://dag.wiee.rs/home-made/dstat/ > > For the NSD balance question, since GPFS stripes the blocks evenly > across all the NSDs, they will end up balanced over time. Or you can > rebalance manually with 'mmrestripefs -b' or similar. > > It is unlikely that particular files ended up on a single NSD, unless > the other NSDs are totally full. 
> > Regards, > Alex > > On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >> Hi, >> >> I'm looking for a way to see which node (or nodes) is having an impact >> on the gpfs server nodes which is slowing the whole file system? What >> happens, usually, is a user is doing some I/O that doesn't fit the >> configuration of the gpfs file system and the way it was explain on how >> to use it efficiently. It is usually by doing a lot of unbuffered byte >> size, very random I/O on the file system that was made for large files >> and large block size. >> >> My problem is finding out who is doing that. I haven't found a way to >> pinpoint the node or nodes that could be the source of the problem, with >> over 600 client nodes. >> >> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >> that I cannot pinpoint on something. >> >> I must be missing something simple. Anyone got any help? >> >> Note: there is another thing I'm trying to pinpoint. A temporary >> imbalance was created by adding a new NSD. It seems that a group of >> files have been created on that same NSD and a user keeps hitting that >> NSD causing a high load. I'm trying to pinpoint the origin of that too. >> At least until everything is balance back. But will balancing spread >> those files since they are already on the most empty NSD? >> >> Richard >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > From chekh at stanford.edu Mon Dec 9 21:21:24 2013 From: chekh at stanford.edu (Alex Chekholko) Date: Mon, 09 Dec 2013 13:21:24 -0800 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A630A5.4040400@calculquebec.ca> References: <52A26C79.7010706@calculquebec.ca> <52A61F94.7080702@stanford.edu> <52A630A5.4040400@calculquebec.ca> Message-ID: <52A63454.7010605@stanford.edu> Hi Richard, For IB traffic, you can use 'collectl -sx' http://collectl.sourceforge.net/Infiniband.html or else mmpmon (which is what 'dstat --gpfs' uses underneath anyway) If your other NSDs are full, then of course all writes will go to the empty NSDs. And then reading those new files your performance will be limited to just the new NSDs. Regards, Alex On 12/09/2013 01:05 PM, Richard Lefebvre wrote: > Hi Alex, > > I should have mention that my GPFS network is done through > infiniband/RDMA, so looking at the TCP probably won't work. I will try > to see if the traffic can be seen through ib0 (instead of eth0), but I > have my doubts. > > As for the placement. The file system was 95% full when I added the new > NSDs. I know that what is waiting now from the waiters commands is the > to the 2 NSDs: > > waiting 0.791707000 seconds, NSDThread: for I/O completion on disk d9 > > I have added more NSDs since then but the waiting is still on the 2 > disks. None of the others. > > Richard > > On 12/09/2013 02:52 PM, Alex Chekholko wrote: >> Hi Richard, >> >> I would just use something like 'iftop' to look at the traffic between >> the nodes. Or 'collectl'. Or 'dstat'. >> >> e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 >> http://dag.wiee.rs/home-made/dstat/ >> >> For the NSD balance question, since GPFS stripes the blocks evenly >> across all the NSDs, they will end up balanced over time. Or you can >> rebalance manually with 'mmrestripefs -b' or similar. >> >> It is unlikely that particular files ended up on a single NSD, unless >> the other NSDs are totally full. 
>> >> Regards, >> Alex >> >> On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >>> Hi, >>> >>> I'm looking for a way to see which node (or nodes) is having an impact >>> on the gpfs server nodes which is slowing the whole file system? What >>> happens, usually, is a user is doing some I/O that doesn't fit the >>> configuration of the gpfs file system and the way it was explain on how >>> to use it efficiently. It is usually by doing a lot of unbuffered byte >>> size, very random I/O on the file system that was made for large files >>> and large block size. >>> >>> My problem is finding out who is doing that. I haven't found a way to >>> pinpoint the node or nodes that could be the source of the problem, with >>> over 600 client nodes. >>> >>> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >>> that I cannot pinpoint on something. >>> >>> I must be missing something simple. Anyone got any help? >>> >>> Note: there is another thing I'm trying to pinpoint. A temporary >>> imbalance was created by adding a new NSD. It seems that a group of >>> files have been created on that same NSD and a user keeps hitting that >>> NSD causing a high load. I'm trying to pinpoint the origin of that too. >>> At least until everything is balance back. But will balancing spread >>> those files since they are already on the most empty NSD? >>> >>> Richard >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at gpfsug.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Alex Chekholko chekh at stanford.edu 347-401-4860 From v.andrews at noc.ac.uk Tue Dec 10 09:59:06 2013 From: v.andrews at noc.ac.uk (Andrews, Vincent) Date: Tue, 10 Dec 2013 09:59:06 +0000 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A63454.7010605@stanford.edu> Message-ID: We do not have as many client nodes, but we have an extensive Ganglia configuration that monitors all of the nodes on our network. For the client nodes we also run a script that pushes stats into Ganglia using 'mmpmon'. Using this we have been able to locate problem machines a lot quicker. I have attached the script, and is released on a 'it works for me' term. We run it every minute from cron. Vince. -- Vincent Andrews NOC, European Way, Southampton , SO14 3ZH Ext. 27616 External 023 80597616 This e-mail (and any attachments) is confidential and intended solely for the use of the individual or entity to whom it is addressed. Both NERC and the University of Southampton (who operate NOCS as a collaboration) are subject to the Freedom of Information Act 2000. The information contained in this e-mail and any reply you make may be disclosed unless it is legally from disclosure. Any material supplied to NOCS may be stored in the electronic records management system of either the University or NERC as appropriate. On 09/12/2013 21:21, "Alex Chekholko" wrote: >Hi Richard, > >For IB traffic, you can use 'collectl -sx' >http://collectl.sourceforge.net/Infiniband.html >or else mmpmon (which is what 'dstat --gpfs' uses underneath anyway) > >If your other NSDs are full, then of course all writes will go to the >empty NSDs. And then reading those new files your performance will be >limited to just the new NSDs. 
> > >Regards, >Alex > >On 12/09/2013 01:05 PM, Richard Lefebvre wrote: >> Hi Alex, >> >> I should have mention that my GPFS network is done through >> infiniband/RDMA, so looking at the TCP probably won't work. I will try >> to see if the traffic can be seen through ib0 (instead of eth0), but I >> have my doubts. >> >> As for the placement. The file system was 95% full when I added the new >> NSDs. I know that what is waiting now from the waiters commands is the >> to the 2 NSDs: >> >> waiting 0.791707000 seconds, NSDThread: for I/O completion on disk d9 >> >> I have added more NSDs since then but the waiting is still on the 2 >> disks. None of the others. >> >> Richard >> >> On 12/09/2013 02:52 PM, Alex Chekholko wrote: >>> Hi Richard, >>> >>> I would just use something like 'iftop' to look at the traffic between >>> the nodes. Or 'collectl'. Or 'dstat'. >>> >>> e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 >>> http://dag.wiee.rs/home-made/dstat/ >>> >>> For the NSD balance question, since GPFS stripes the blocks evenly >>> across all the NSDs, they will end up balanced over time. Or you can >>> rebalance manually with 'mmrestripefs -b' or similar. >>> >>> It is unlikely that particular files ended up on a single NSD, unless >>> the other NSDs are totally full. >>> >>> Regards, >>> Alex >>> >>> On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >>>> Hi, >>>> >>>> I'm looking for a way to see which node (or nodes) is having an impact >>>> on the gpfs server nodes which is slowing the whole file system? What >>>> happens, usually, is a user is doing some I/O that doesn't fit the >>>> configuration of the gpfs file system and the way it was explain on >>>>how >>>> to use it efficiently. It is usually by doing a lot of unbuffered >>>>byte >>>> size, very random I/O on the file system that was made for large files >>>> and large block size. >>>> >>>> My problem is finding out who is doing that. I haven't found a way to >>>> pinpoint the node or nodes that could be the source of the problem, >>>>with >>>> over 600 client nodes. >>>> >>>> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >>>> that I cannot pinpoint on something. >>>> >>>> I must be missing something simple. Anyone got any help? >>>> >>>> Note: there is another thing I'm trying to pinpoint. A temporary >>>> imbalance was created by adding a new NSD. It seems that a group of >>>> files have been created on that same NSD and a user keeps hitting that >>>> NSD causing a high load. I'm trying to pinpoint the origin of that >>>>too. >>>> At least until everything is balance back. But will balancing spread >>>> those files since they are already on the most empty NSD? >>>> >>>> Richard >>>> _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at gpfsug.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > >-- >Alex Chekholko chekh at stanford.edu 347-401-4860 >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at gpfsug.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss This message (and any attachments) is for the recipient only. NERC is subject to the Freedom of Information Act 2000 and the contents of this email and any reply you make may be disclosed by NERC unless it is exempt from release under the Act. 
Any material supplied to NERC may be stored in an electronic records management system. -------------- next part -------------- A non-text attachment was scrubbed... Name: ganglia_gpfs_client_stats.pl Type: text/x-perl-script Size: 3701 bytes Desc: ganglia_gpfs_client_stats.pl URL: From viccornell at gmail.com Tue Dec 10 10:13:20 2013 From: viccornell at gmail.com (Vic Cornell) Date: Tue, 10 Dec 2013 10:13:20 +0000 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A61F94.7080702@stanford.edu> References: <52A26C79.7010706@calculquebec.ca> <52A61F94.7080702@stanford.edu> Message-ID: <6726F05D-3332-4FF4-AB9D-F78B542E2249@gmail.com> Have you looked at mmpmon? Its a bit much for 600 nodes but if you run it with a reasonable interface specified then the output shouldn't be too hard to parse. Quick recipe: create a file called mmpmon.conf that looks like ################# cut here ######################### nlist add node1 node2 node3 node4 node5 io_s reset ################# cut here ######################### Where node1,node2 etc are your node names - it might be as well to do this for batches of 50 or so. then run something like: /usr/lpp/mmfs/bin/mmpmon -i mmpmon.conf -d 10000 -r 0 -p That will give you a set of stats for all of your named nodes aggregated over a 10 second period Dont run more than one of these as each one will reset the stats for the other :-) parse out the stats with something like: awk -F_ '{if ($2=="io"){print $8,$16/1024/1024,$18/1024/1024}}' which will give you read and write throughput. The docs (GPFS advanced Administration Guide) are reasonable. Cheers, Vic Cornell viccornell at gmail.com On 9 Dec 2013, at 19:52, Alex Chekholko wrote: > Hi Richard, > > I would just use something like 'iftop' to look at the traffic between the nodes. Or 'collectl'. Or 'dstat'. > > e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 > http://dag.wiee.rs/home-made/dstat/ > > For the NSD balance question, since GPFS stripes the blocks evenly across all the NSDs, they will end up balanced over time. Or you can rebalance manually with 'mmrestripefs -b' or similar. > > It is unlikely that particular files ended up on a single NSD, unless the other NSDs are totally full. > > Regards, > Alex > > On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >> Hi, >> >> I'm looking for a way to see which node (or nodes) is having an impact >> on the gpfs server nodes which is slowing the whole file system? What >> happens, usually, is a user is doing some I/O that doesn't fit the >> configuration of the gpfs file system and the way it was explain on how >> to use it efficiently. It is usually by doing a lot of unbuffered byte >> size, very random I/O on the file system that was made for large files >> and large block size. >> >> My problem is finding out who is doing that. I haven't found a way to >> pinpoint the node or nodes that could be the source of the problem, with >> over 600 client nodes. >> >> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >> that I cannot pinpoint on something. >> >> I must be missing something simple. Anyone got any help? >> >> Note: there is another thing I'm trying to pinpoint. A temporary >> imbalance was created by adding a new NSD. It seems that a group of >> files have been created on that same NSD and a user keeps hitting that >> NSD causing a high load. I'm trying to pinpoint the origin of that too. >> At least until everything is balance back. 
But will balancing spread >> those files since they are already on the most empty NSD? >> >> Richard >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Alex Chekholko chekh at stanford.edu > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From stefan.fritzsche at slub-dresden.de Tue Dec 10 10:16:35 2013 From: stefan.fritzsche at slub-dresden.de (Stefan Fritzsche) Date: Tue, 10 Dec 2013 11:16:35 +0100 Subject: [gpfsug-discuss] backup and hsm with gpfs Message-ID: <52A6EA03.30101@slub-dresden.de> Dear gpfsug, we are the SLUB, Saxon State and University Library Dresden. Our goal is to build a long term preservation system. We use gpfs and a tsm with hsm integration to backup, migrate and distribute the data over two computing centers. Currently, we are making backups with the normal tsm ba-client. Our pre-/migration runs with the gpfs-policy engine to find all files that are in the state "rersistent" and match some additional rules. After the scan, we create a filelist and premigrate the data with dsmmigfs. The normal backup takes a long time for the scan of the whole gpfs-filesystem, so we are looking for a better way to perfom the backups. I know that i can also use the policy engine to perfom the backup but my questions are: How do I perform backups with gpfs? Is there anyone who uses the mmbackup command or mmbackup in companies with snapshots? Does anyone have any expirence in writing an application with gpfs-api and/or dmapi? Thank you for your answers and proposals. Best regards, Stefan -- Stefan Fritzsche SLUB email: stefan.fritzsche at slub-dresden.de Zellescher Weg 18 --------------------------------------------- From chekh at stanford.edu Tue Dec 10 19:36:24 2013 From: chekh at stanford.edu (Alex Chekholko) Date: Tue, 10 Dec 2013 11:36:24 -0800 Subject: [gpfsug-discuss] backup and hsm with gpfs In-Reply-To: <52A6EA03.30101@slub-dresden.de> References: <52A6EA03.30101@slub-dresden.de> Message-ID: <52A76D38.8010301@stanford.edu> Hi Stefan, Since you're using TSM with GPFS, are you following their current integration instructions? My understanding is that what you want is a regular use case of TSM/GPFS backups. For file system scans, I believe that the policy engine scales linearly with the number of nodes you run it on. Can you add more storage nodes? Or run your policy scans across more existing nodes? Regards, Alex On 12/10/13, 2:16 AM, Stefan Fritzsche wrote: > Dear gpfsug, > > we are the SLUB, Saxon State and University Library Dresden. > > Our goal is to build a long term preservation system. We use gpfs and a > tsm with hsm integration to backup, migrate and distribute the data over > two computing centers. > Currently, we are making backups with the normal tsm ba-client. > Our pre-/migration runs with the gpfs-policy engine to find all files > that are in the state "rersistent" and match some additional rules. > After the scan, we create a filelist and premigrate the data with dsmmigfs. > > The normal backup takes a long time for the scan of the whole > gpfs-filesystem, so we are looking for a better way to perfom the backups. > I know that i can also use the policy engine to perfom the backup but my > questions are: > > How do I perform backups with gpfs? 
> > Is there anyone who uses the mmbackup command or mmbackup in companies > with snapshots? > > Does anyone have any expirence in writing an application with gpfs-api > and/or dmapi? > > Thank you for your answers and proposals. > > Best regards, > Stefan > > -- chekh at stanford.edu From ewahl at osc.edu Tue Dec 10 20:34:02 2013 From: ewahl at osc.edu (Ed Wahl) Date: Tue, 10 Dec 2013 20:34:02 +0000 Subject: [gpfsug-discuss] backup and hsm with gpfs In-Reply-To: <52A6EA03.30101@slub-dresden.de> References: <52A6EA03.30101@slub-dresden.de> Message-ID: I have moderate experience with mmbackup, dmapi (though NOT with custom apps) and both the Tivoli HSM and the newer LTFS-EE product (relies on dsmmigfs for a backend). How much time does a standard dsmc backup scan take? And an mmapplypolicy scan? So you have both a normal backup with dsmc today and also want to push to HSM with policy engine? Are these separate storage destinations? If they are the same, perhaps using mmbackup and making DR copies inside TSM is better? Or would that affect other systems being backup up to TSM? Or perhaps configure a storage pool for TSM that only handles the special files such that they don't mix tapes? mmbackup uses the standard policy engine scans with (unfortunately) a set # of directory and scan threads (defaults to 24 but ends up a somewhat higher # on first node of the backup) unlike a standard mmapplypolicy where you can adjust the thread levels and the only adjustment is "-m #" which adjusts how many dsmc threads/processes run per node. Overall I find the mmbackup with multi-node support to be _much_ faster than the linear dsmc scans. _WAY_ too thread heavy and insanely high IO loads on smaller GPFS's with mid-range to slower metadata though. (almost all IO Wait with loads well over 100 on an 8 core server) Depending on your version of both TSM and GPFS you can quickly convert from dsmc schedule to mmbackup with snapshots using -q or -rebuild options. Be aware there are some versions of GPFS that do NOT work with snapshots and mmbackup, and there are quite a few gotchas in the TSM integration. The largest of which is if you normally use TSM virtualmountpoints. That is NOT supported in GPFS. It will backup happily, but restoration is more amusing and it creates a TSM filespace per vmp. This currently breaks the shadowDB badly and makes 'rebuilds' damn near impossible in the newest GPFS and just annoying in older versions. All that being said, the latest version of GPFS and anything above about TSM 6.4.x client seem to work well for us. Ed Wahl OSC ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Stefan Fritzsche [stefan.fritzsche at slub-dresden.de] Sent: Tuesday, December 10, 2013 5:16 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] backup and hsm with gpfs Dear gpfsug, we are the SLUB, Saxon State and University Library Dresden. Our goal is to build a long term preservation system. We use gpfs and a tsm with hsm integration to backup, migrate and distribute the data over two computing centers. Currently, we are making backups with the normal tsm ba-client. Our pre-/migration runs with the gpfs-policy engine to find all files that are in the state "rersistent" and match some additional rules. After the scan, we create a filelist and premigrate the data with dsmmigfs. The normal backup takes a long time for the scan of the whole gpfs-filesystem, so we are looking for a better way to perfom the backups. 
I know that i can also use the policy engine to perfom the backup but my questions are: How do I perform backups with gpfs? Is there anyone who uses the mmbackup command or mmbackup in companies with snapshots? Does anyone have any expirence in writing an application with gpfs-api and/or dmapi? Thank you for your answers and proposals. Best regards, Stefan -- Stefan Fritzsche SLUB email: stefan.fritzsche at slub-dresden.de Zellescher Weg 18 --------------------------------------------- _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From rltodd.ml1 at gmail.com Thu Dec 12 18:14:08 2013 From: rltodd.ml1 at gmail.com (Lindsay Todd) Date: Thu, 12 Dec 2013 13:14:08 -0500 Subject: [gpfsug-discuss] GPFS and both Samba and NFS Message-ID: Hello, Since this is my first note to the group, I'll introduce myself first. I am Lindsay Todd, a Systems Programmer at Rensselaer Polytechnic Institute's Center for Computational Innovations, where I run a 1.2PiB GPFS cluster serving a Blue Gene/Q and a variety of Opteron and Intel clients, run an IBM Watson, and serve as an adjunct faculty. I also do some freelance consulting, including GPFS, for several customers. One of my customers is needing to serve GPFS storage through both NFS and Samba; they have GPFS 3.5 running on RHEL5 (not RHEL6) servers. I did not set this up for them, but was called to help fix it. Currently they export NFS using cNFS; I think we have that straightened out server-side now. Also they run Samba on several of the servers; I'm sure the group will not be surprised to hear they experience file corruption and other strange problems. I've been pushing them to use Samba-CTDB, and it looks like it will happen. Except, I've never used this myself. So this raises a couple questions: 1) It looks like RHEL5 bundles in an old version of CTDB. Should that be used, or would we be better with a build from the Enterprise Samba site, or even a build from source? 2) Given that CTDB can also run NFS, what are people who need both finding works best: run both cNFS + Samba-CTDB, or let CTDB run both? It seems to me that if I let CTDB run both, I only need a single floating IP address for each server, while if I also use cNFS, I will want a floating address for both NFS and Samba, on each server. Thanks for the help! R. Lindsay Todd, PhD -------------- next part -------------- An HTML attachment was scrubbed... URL: From seanlee at tw.ibm.com Fri Dec 13 12:53:14 2013 From: seanlee at tw.ibm.com (Sean S Lee) Date: Fri, 13 Dec 2013 20:53:14 +0800 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: Message-ID: Hi Todd Regarding your second question: it's usually better to use distinct IP's for the two services. If they're both using the same set of virtual IP's, then a failure of one service will cause the associated virtual IP to failover elsewhere which can be disruptive for users of the other service running at that VIP. Ideally they would use just one file sharing protocol, so maybe you can heck if that is possible. Regards Sean -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From orlando.richards at ed.ac.uk Fri Dec 13 15:15:03 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 13 Dec 2013 15:15:03 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: Message-ID: <52AB2477.50802@ed.ac.uk> On 12/12/13 18:14, Lindsay Todd wrote: > Hello, > > Since this is my first note to the group, I'll introduce myself first. > I am Lindsay Todd, a Systems Programmer at Rensselaer Polytechnic > Institute's Center for Computational Innovations, where I run a 1.2PiB > GPFS cluster serving a Blue Gene/Q and a variety of Opteron and Intel > clients, run an IBM Watson, and serve as an adjunct faculty. I also do > some freelance consulting, including GPFS, for several customers. > > One of my customers is needing to serve GPFS storage through both NFS > and Samba; they have GPFS 3.5 running on RHEL5 (not RHEL6) servers. I > did not set this up for them, but was called to help fix it. Currently > they export NFS using cNFS; I think we have that straightened out > server-side now. Also they run Samba on several of the servers; I'm > sure the group will not be surprised to hear they experience file > corruption and other strange problems. > > I've been pushing them to use Samba-CTDB, and it looks like it will > happen. Except, I've never used this myself. So this raises a couple > questions: > > 1) It looks like RHEL5 bundles in an old version of CTDB. Should that be > used, or would we be better with a build from the Enterprise Samba site, > or even a build from source? > Hi Lindsay, We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), after running into performance problems with the sernet bundled version (1.0.114). It's easy to build: git clone git://git.samba.org/ctdb.git ctdb.git cd ctdb.git git branch -r git checkout -b "my_build" origin/1.2.40 cd packaging/RPM/ ./makerpms.sh yum install /root/rpmbuild/RPMS/x86_64/ctdb*.rpm I then take the Sernet src rpm and rebuild it, using ctdb.h from the above rather than the 1.0.114 version they use. This is possibly not required, but I thought it best to be sure that the differing headers wouldn't cause any problems. I remain, as ever, very grateful to Sernet for providing these! > 2) Given that CTDB can also run NFS, what are people who need both > finding works best: run both cNFS + Samba-CTDB, or let CTDB run both? > It seems to me that if I let CTDB run both, I only need a single > floating IP address for each server, while if I also use cNFS, I will > want a floating address for both NFS and Samba, on each server. > We let CTDB run both, but we didn't come to that decision by comparing the merits of both options. I think Bristol (Bob Cregan is cc'd, I'm not sure he's on this list) run cNFS and CTDB side by side. As you say - you'd at least require different IP addresses to do that. > Thanks for the help! Best of luck :) > > R. Lindsay Todd, PhD > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
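To make the Sernet rebuild step above concrete, a rough sketch of what it might look like follows. The package and spec file names are illustrative assumptions (they vary between Sernet releases), and copying ctdb.h into the build tree is just one way of substituting the newer header, so treat this as a starting point rather than a verified procedure.

    # Hedged sketch of rebuilding the Sernet samba SRPM against the newer ctdb.h.
    # File and spec names below are placeholders - check your own Sernet src rpm.
    rpm -ivh sernet-samba-4.x.y-z.src.rpm              # unpacks into ~/rpmbuild/SOURCES and SPECS
    cp /usr/include/ctdb.h ~/rpmbuild/SOURCES/ctdb.h   # header installed by the ctdb 1.2.40 build above
    rpmbuild -ba ~/rpmbuild/SPECS/sernet-samba.spec    # rebuild on a node with the GPFS GPL headers present
    ls ~/rpmbuild/RPMS/x86_64/ | grep -i samba         # resulting packages to roll out to the cluster
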
From jonathan at buzzard.me.uk Fri Dec 13 15:31:03 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Fri, 13 Dec 2013 15:31:03 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AB2477.50802@ed.ac.uk> References: <52AB2477.50802@ed.ac.uk> Message-ID: <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: [SNIP] > Hi Lindsay, > > We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), > after running into performance problems with the sernet bundled version > (1.0.114). It's easy to build: Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with Samba 4.1 http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From orlando.richards at ed.ac.uk Fri Dec 13 15:35:01 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 13 Dec 2013 15:35:01 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AB2925.1020809@ed.ac.uk> On 13/12/13 15:31, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: > > [SNIP] > >> Hi Lindsay, >> >> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), >> after running into performance problems with the sernet bundled version >> (1.0.114). It's easy to build: > > Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with > Samba 4.1 > > http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ > > JAB. > The samba team are currently working to bring ctdb into the main samba source tree - so hopefully this will become a moot point soon! -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From oehmes at us.ibm.com Fri Dec 13 18:14:45 2013 From: oehmes at us.ibm.com (Sven Oehme) Date: Fri, 13 Dec 2013 10:14:45 -0800 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AB2925.1020809@ed.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> Message-ID: Orlando, give that there are so many emails/interest on this topic in recent month, let me share some personal expertise on this :-) any stock Samba or CTDB version you will find any distro is not sufficient and it doesn't matter which you choose (SLES, RHEL or any form of debian and any version of all of them). the reason is that Samba doesn't have the GPFS header and library files included in its source, and at compile time it dynamically enables/disables all GPFS related things based on the availability of the GPFS packages . as non of the distros build machines have GPFS installed all this packages end up with binaries in their rpms which don't have the required code enabled to properly support GPFS and non of the vfs modules get build either. the only way to get something working (don't get confused with officially Supported) is to recompile the CTDB src packages AND the Samba src packages on a node that has GPFS already installed. also the inclusion of CTDB into Samba will not address this, its just a more convenient packaging. 
Only if the build happens on such a node things like the vfs modules for GPFS are build and included in the package. said all this the binaries alone are only part of the Solution, after you have the correct packages, you need to properly configuration the system and setting all the right options (on GPFS as well as on CTDB and smbd.conf), which unfortunate are very System configuration specific, as otherwise you still can end up with data corruption if not set right. also some people in the past have used a single instance of Samba to export shares over CIFS as they believe its a safe alternative to a more complex CTDB setup. also here a word of caution, even if you have a single instance of Samba running on top of GPFS you are exposed to potential data corruption if you don't use the proper Samba version (explained above) and the proper configuration, you can skip CTDB For that, but you still require a proper compiled version of Samba with GPFS code installed on the build machine. to be very clear the problem is not GPFS, its that Samba does locking AND caching on top of the Filesystem without GPFS knowledge if you don't use the right code/config to 'tell' GPFS about it, so GPFS can not ensure data consistency, not even on the same physical node for data thats shared over CIFS. there are unfortunate no shortcuts. i also have to point out that if you recompile the packages and configure everything correctly this is most likely to work, but you won't get official support for the CIFS part of this setup from IBM. This email is not an official Statement/Response of IBM, see it as personal 'AS-IS' Information sharing. Sven From: Orlando Richards To: gpfsug-discuss at gpfsug.org Date: 12/13/2013 07:35 AM Subject: Re: [gpfsug-discuss] GPFS and both Samba and NFS Sent by: gpfsug-discuss-bounces at gpfsug.org On 13/12/13 15:31, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: > > [SNIP] > >> Hi Lindsay, >> >> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), >> after running into performance problems with the sernet bundled version >> (1.0.114). It's easy to build: > > Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with > Samba 4.1 > > http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ > > JAB. > The samba team are currently working to bring ctdb into the main samba source tree - so hopefully this will become a moot point soon! -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jonathan at buzzard.me.uk Mon Dec 16 11:02:13 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 11:02:13 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AB2925.1020809@ed.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> Message-ID: <1387191733.7230.8.camel@buzzard.phy.strath.ac.uk> On Fri, 2013-12-13 at 15:35 +0000, Orlando Richards wrote: > On 13/12/13 15:31, Jonathan Buzzard wrote: > > On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: > > > > [SNIP] > > > >> Hi Lindsay, > >> > >> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), > >> after running into performance problems with the sernet bundled version > >> (1.0.114). It's easy to build: > > > > Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with > > Samba 4.1 > > > > http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ > > > > JAB. > > > The samba team are currently working to bring ctdb into the main samba > source tree - so hopefully this will become a moot point soon! Yes I am aware of this. The point of bringing up what is going into RHEL7 was to get a flavour of what RedHat consider stable enough to push out into a supported enterprise product. I always ran my GPFS/Samba/CTDB clusters by taking a stock RHEL Samba, and patching the spec file to build the vfs_gpfs module possibly with extra patches to the vfs_gpfs module for bug fixes against the GPFS version running on the cluster and have it produce a suitable RPM with just the vfs_gpfs module that I could then load into the stock RHEL Samba. It would appear that RedHat are doing something similar in RHEL7 with a vfs_glusterfs RPM for Samba. Of course even with CTDB in Samba you are still going to need to do some level of rebuilding because you won't get the vfs_gpfs module otherwise. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From jonathan at buzzard.me.uk Mon Dec 16 11:21:29 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 11:21:29 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> Message-ID: <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: [SNIP] > the only way to get something working (don't get confused with > officially Supported) is to recompile the CTDB src packages AND the > Samba src packages on a node that has GPFS already installed. also the > inclusion of CTDB into Samba will not address this, its just a more > convenient packaging. > > Only if the build happens on such a node things like the vfs modules > for GPFS are build and included in the package. > That is a factually inaccurate statement. There is nothing in CTDB that is GPFS specific. Trust me I have examined the code closely to determine if this is the case. So unless this has changed recently you are flat out wrong. Consequently there is no requirement whatsoever to rebuild CTDB to get the vfs_gpfs module. In addition there is also no requirement to actually have GPFS installed to build the vfs_gpfs module either. What you need to have is the GPFS GPL header files and nothing else. As it is a loadable VFS module linking takes place at load time not compile time. It is unworthy of an IBM employee to spread such inaccurate misinformation. 
[SNIP] > said all this the binaries alone are only part of the Solution, after > you have the correct packages, you need to properly configuration the > system and setting all the right options (on GPFS as well as on CTDB > and smbd.conf), which unfortunate are very System configuration > specific, as otherwise you still can end up with data corruption if > not set right. Indeed. However I know not only what those options are, but also what they do despite IBM's refusal to tell us anything about them. I would also point out that there are sites that where running Samba on top of GPFS for many years before IBM began offering their SONAS/Storwize Unifed products. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From chair at gpfsug.org Mon Dec 16 11:30:22 2013 From: chair at gpfsug.org (Chair) Date: Mon, 16 Dec 2013 11:30:22 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AEE44E.1030805@gpfsug.org> Allo Just jumping in here a minute: > It is unworthy of an IBM employee to spread such inaccurate misinformation. Whilst this may be inaccurate - I very, very, much doubt that IBM or their employees have a secret master plan to spread misinformation (!) In the spirit of this group, let's work together to technically look at such issues. Sven, if that is the case, perhaps you could crib the lines of code / show your methodology that supports your views / experience. Regards, Jez -- UG Chair On 16/12/13 11:21, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: > > [SNIP] > >> the only way to get something working (don't get confused with >> officially Supported) is to recompile the CTDB src packages AND the >> Samba src packages on a node that has GPFS already installed. also the >> inclusion of CTDB into Samba will not address this, its just a more >> convenient packaging. >> >> Only if the build happens on such a node things like the vfs modules >> for GPFS are build and included in the package. >> > That is a factually inaccurate statement. There is nothing in CTDB that > is GPFS specific. Trust me I have examined the code closely to determine > if this is the case. So unless this has changed recently you are flat > out wrong. > > Consequently there is no requirement whatsoever to rebuild CTDB to get > the vfs_gpfs module. In addition there is also no requirement to > actually have GPFS installed to build the vfs_gpfs module either. What > you need to have is the GPFS GPL header files and nothing else. As it is > a loadable VFS module linking takes place at load time not compile time. > > It is unworthy of an IBM employee to spread such inaccurate > misinformation. > > [SNIP] > >> said all this the binaries alone are only part of the Solution, after >> you have the correct packages, you need to properly configuration the >> system and setting all the right options (on GPFS as well as on CTDB >> and smbd.conf), which unfortunate are very System configuration >> specific, as otherwise you still can end up with data corruption if >> not set right. > Indeed. However I know not only what those options are, but also what > they do despite IBM's refusal to tell us anything about them. 
> > I would also point out that there are sites that where running Samba on > top of GPFS for many years before IBM began offering their > SONAS/Storwize Unifed products. > > > JAB. > From orlando.richards at ed.ac.uk Mon Dec 16 12:26:16 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 16 Dec 2013 12:26:16 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1387191733.7230.8.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387191733.7230.8.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AEF168.1000701@ed.ac.uk> On 16/12/13 11:02, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 15:35 +0000, Orlando Richards wrote: >> On 13/12/13 15:31, Jonathan Buzzard wrote: >>> On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: >>> >>> [SNIP] >>> >>>> Hi Lindsay, >>>> >>>> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), >>>> after running into performance problems with the sernet bundled version >>>> (1.0.114). It's easy to build: >>> >>> Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with >>> Samba 4.1 >>> >>> http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ >>> >>> JAB. >>> >> The samba team are currently working to bring ctdb into the main samba >> source tree - so hopefully this will become a moot point soon! > > Yes I am aware of this. The point of bringing up what is going into > RHEL7 was to get a flavour of what RedHat consider stable enough to push > out into a supported enterprise product. > > I always ran my GPFS/Samba/CTDB clusters by taking a stock RHEL Samba, > and patching the spec file to build the vfs_gpfs module possibly with > extra patches to the vfs_gpfs module for bug fixes against the GPFS > version running on the cluster and have it produce a suitable RPM with > just the vfs_gpfs module that I could then load into the stock RHEL > Samba. > > It would appear that RedHat are doing something similar in RHEL7 with a > vfs_glusterfs RPM for Samba. > > Of course even with CTDB in Samba you are still going to need to do some > level of rebuilding because you won't get the vfs_gpfs module otherwise. > > > Sernet include GPFS in their builds - they truly are wonderful :) -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From orlando.richards at ed.ac.uk Mon Dec 16 12:31:47 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 16 Dec 2013 12:31:47 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AEF2B3.5030805@ed.ac.uk> On 16/12/13 11:30, Chair wrote: > Allo > > Just jumping in here a minute: > >> It is unworthy of an IBM employee to spread such inaccurate >> misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues. 
> > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. > Presumably this all comes down to the locking setup? Relevant to this, I've got: (from samba settings:) vfs objects = shadow_copy2, fileid, gpfs, syncops clustering = yes gpfs:sharemodes = yes syncops:onmeta = no blocking locks = Yes fake oplocks = No kernel oplocks = Yes locking = Yes oplocks = Yes level2 oplocks = Yes oplock contention limit = 2 posix locking = Yes strict locking = Auto (gpfs settings:) syncSambaMetadataOps yes > > Regards, > > Jez > -- > UG Chair > > > > On 16/12/13 11:21, Jonathan Buzzard wrote: >> On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: >> >> [SNIP] >> >>> the only way to get something working (don't get confused with >>> officially Supported) is to recompile the CTDB src packages AND the >>> Samba src packages on a node that has GPFS already installed. also the >>> inclusion of CTDB into Samba will not address this, its just a more >>> convenient packaging. >>> >>> Only if the build happens on such a node things like the vfs modules >>> for GPFS are build and included in the package. >>> >> That is a factually inaccurate statement. There is nothing in CTDB that >> is GPFS specific. Trust me I have examined the code closely to determine >> if this is the case. So unless this has changed recently you are flat >> out wrong. >> >> Consequently there is no requirement whatsoever to rebuild CTDB to get >> the vfs_gpfs module. In addition there is also no requirement to >> actually have GPFS installed to build the vfs_gpfs module either. What >> you need to have is the GPFS GPL header files and nothing else. As it is >> a loadable VFS module linking takes place at load time not compile time. >> >> It is unworthy of an IBM employee to spread such inaccurate >> misinformation. >> >> [SNIP] >> >>> said all this the binaries alone are only part of the Solution, after >>> you have the correct packages, you need to properly configuration the >>> system and setting all the right options (on GPFS as well as on CTDB >>> and smbd.conf), which unfortunate are very System configuration >>> specific, as otherwise you still can end up with data corruption if >>> not set right. >> Indeed. However I know not only what those options are, but also what >> they do despite IBM's refusal to tell us anything about them. >> >> I would also point out that there are sites that where running Samba on >> top of GPFS for many years before IBM began offering their >> SONAS/Storwize Unifed products. >> >> >> JAB. >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
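For anyone wanting to try the combination above on a test cluster, a hedged sketch of how the two halves might be applied is below. syncSambaMetadataOps is undocumented, so the assumption that it can be set and listed like any other GPFS configuration attribute needs confirming with IBM before relying on it; the Samba values are simply Orlando's settings restated.

    # Hedged sketch only - syncSambaMetadataOps is an undocumented parameter, so
    # verify the behaviour with IBM before using it anywhere that matters.
    mmchconfig syncSambaMetadataOps=yes -i        # -i: takes effect immediately and persists
    mmlsconfig | grep -i syncSambaMetadataOps     # explicitly set values normally show up here

    # The matching share-level Samba settings from the post above would then be:
    #   vfs objects = shadow_copy2 fileid gpfs syncops
    #   gpfs:sharemodes = yes
    #   syncops:onmeta = no
    testparm -s | grep -Ei 'vfs objects|syncops|gpfs:'   # sanity-check what smbd actually sees
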
From jonathan at buzzard.me.uk Mon Dec 16 12:49:06 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 12:49:06 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk>

On Mon, 2013-12-16 at 11:30 +0000, Chair wrote: > Allo > > Just jumping in here a minute: > > > It is unworthy of an IBM employee to spread such inaccurate misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues.

You made me begin to doubt myself so I have double checked. Specifically, the ctdb-2.1-3.el7.src.rpm from the RHEL7 beta, and it is as I claimed. There is no compilable code that is linked to or dependent on GPFS. Some of the support scripts (specifically the 60.ganesha and 62.cnfs) are GPFS aware, but no recompilation is needed as they are copied through to the binary RPM as is. You either know for a definite fact that CTDB needs recompiling on a machine with GPFS installed or you are simply spreading falsehoods. Why make such a long and detailed post if you don't know it to be true in the first place?

> > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. >

There is none, because he is flat out 100% wrong. I did extensive research on this when deploying a GPFS/Samba/CTDB solution. I needed a recompile to get the vfs_gpfs module, and I naturally was not going to use the RHEL-supplied CTDB RPMs without thoroughly checking whether they needed recompiling. Note this all pre-dated Sernet offering pre-compiled binaries with the vfs_gpfs module included. Personally I prefer running with standard RHEL binaries with my own vfs_gpfs. It makes it easy to have a test setup that is plain Samba so one can narrow things down to being CTDB related very easily. In addition you don't know which version of the GPFS headers Sernet have compiled against. It also helps to have pucka Windows SMB servers as well, so you can rule out client-side issues. The sort of "why is PowerPoint 2011 on a Mac taking minutes to save a file, when saving it under a different name takes seconds?" problem. When it exhibits the same behaviour on a Windows server, you know there is nothing wrong with your Samba servers.

JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom.

From orlando.richards at ed.ac.uk Mon Dec 16 13:34:59 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 16 Dec 2013 13:34:59 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AF0183.9030809@ed.ac.uk> On 16/12/13 12:49, Jonathan Buzzard wrote: > In addition you don't know which version of the GPFS > headers Sernet have compiled against.
Good point - in fact, it's an older version that I have on my GPFS 3.5 setup. I've updated my build process to fix that - thanks JAB :)

-- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

From oehmes at us.ibm.com Mon Dec 16 15:31:53 2013 From: oehmes at us.ibm.com (Sven Oehme) Date: Mon, 16 Dec 2013 07:31:53 -0800 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID:

Jez,

the other replies to my email were kind of unexpected, as I was indeed intending to help; I will continue to provide details and try to answer serious questions and comments.

Also, as you point out, there is no secret master plan to spread misinformation. The topic is simply complicated, and there are reasons, hard to explain here, why there is no full official support outside the current shipping IBM products; I will not dive into discussions around that.

A few words of clarification on the controversial point: compile CTDB, yes or no. While it is technically correct that you don't have to recompile CTDB, and that it has no code that links to or depends in any form on GPFS (Samba does), you still have to compile one if the version shipped with your distro is out of date, and while I didn't explicitly write it this way, that is what I intended to say.

The version RH ships is very old and has since gotten many fixes, which I highly recommend people use unless their distro picks up a more recent version, as RHEL7 will do. I probably wouldn't recommend a daily git build, but rather the packaged versions that are hosted on ftp.samba.org, like: https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/

So the proposed order of things would be: install GPFS, pull the src package of CTDB, compile and install CTDB and the devel packages, then pull a recent Samba src package, install all the dependencies and build Samba on this same host with the GPFS and CTDB packages already installed. The resulting RPMs should contain the proper code to continue.

After you have your versions compiled and installed and the basics of CTDB running, you should use the following smb.conf as a starting point:

[global]
    netbios name = cluster
    fileid:mapping = fsname
    gpfs:sharemodes = yes
    gpfs:leases = yes
    gpfs:dfreequota = yes
    gpfs:hsm = yes
    syncops:onmeta = no
    kernel oplocks = no
    level2 oplocks = yes
    notify:inotify = no
    vfs objects = shadow_copy2 syncops gpfs fileid
    shadow:snapdir = .snapshots
    shadow:fixinodes = yes
    shadow:snapdirseverywhere = yes
    shadow:sort = desc
    wide links = no
    async smb echo handler = yes
    smbd:backgroundqueue = False
    use sendfile = no
    strict locking = yes
    posix locking = yes
    force unknown acl user = yes
    nfs4:mode = simple
    nfs4:chown = yes
    nfs4:acedup = merge
    nfs4:sidmap = /etc/samba/sidmap.tdb
    gpfs:winattr = yes
    store dos attributes = yes
    map readonly = yes
    map archive = yes
    map system = yes
    map hidden = yes
    ea support = yes
    dmapi support = no
    unix extensions = no
    socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 TCP_KEEPIDLE=240 TCP_KEEPINTVL=15
    strict allocate = yes
    gpfs:prealloc = yes

If you don't configure using the registry you need to maintain the smb.conf file on all your nodes. I am not diving into how to set up the registry, but for the people who care, Michael Adam's presentation is a good starting point: http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf

Also, depending on your authentication/idmap setup, there are multiple changes/additions that need to be made.

On the GPFS side you should set the following config parameters:

    cifsBypassShareLocksOnRename=yes
    syncSambaMetadataOps=yes
    cifsBypassTraversalChecking=yes
    allowSambaCaseInsensitiveLookup=no

Your filesystem should have the following settings configured:

    -D nfs4          File locking semantics in effect
    -k nfs4          ACL semantics in effect
    -o nfssync       Additional mount options
    --fastea Yes     Fast external attributes enabled?
    -S relatime      Suppress atime mount option

The -S setting is a performance optimization, but you need to check whether your backup/AV or other software can deal with it; it essentially reduces the frequency of atime updates to once every 24 hours, which is the new default on NFS mounts and others as well.

A lot of the configuration parameters above go beyond locking and data integrity; they are for better Windows compatibility and should be checked for applicability to your environment.

I would also recommend running GPFS 3.5 TL3 or newer to get the proper level of GPFS fixes for this type of configuration.

I would like to repeat that I don't write this email to encourage people to all go and start installing/configuring Samba on top of GPFS; as I pointed out, you are kind of on your own unless you can read source code and/or have somebody who does and is able to help as soon as you run into a problem. The main point of sharing this information is to clarify a lot of misinformation out there, and to give the people who already have incorrect setups the information to at least not run into data corruption issues due to wrong configuration.

Sven

From: Chair To: gpfsug-discuss at gpfsug.org Date: 12/16/2013 03:31 AM Subject: Re: [gpfsug-discuss] GPFS and both Samba and NFS Sent by: gpfsug-discuss-bounces at gpfsug.org Allo Just jumping in here a minute: > It is unworthy of an IBM employee to spread such inaccurate misinformation. Whilst this may be inaccurate - I very, very, much doubt that IBM or their employees have a secret master plan to spread misinformation (!)
In the spirit of this group, let's work together to technically look at such issues. Sven, if that is the case, perhaps you could crib the lines of code / show your methodology that supports your views / experience. Regards, Jez -- UG Chair On 16/12/13 11:21, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: > > [SNIP] > >> the only way to get something working (don't get confused with >> officially Supported) is to recompile the CTDB src packages AND the >> Samba src packages on a node that has GPFS already installed. also the >> inclusion of CTDB into Samba will not address this, its just a more >> convenient packaging. >> >> Only if the build happens on such a node things like the vfs modules >> for GPFS are build and included in the package. >> > That is a factually inaccurate statement. There is nothing in CTDB that > is GPFS specific. Trust me I have examined the code closely to determine > if this is the case. So unless this has changed recently you are flat > out wrong. > > Consequently there is no requirement whatsoever to rebuild CTDB to get > the vfs_gpfs module. In addition there is also no requirement to > actually have GPFS installed to build the vfs_gpfs module either. What > you need to have is the GPFS GPL header files and nothing else. As it is > a loadable VFS module linking takes place at load time not compile time. > > It is unworthy of an IBM employee to spread such inaccurate > misinformation. > > [SNIP] > >> said all this the binaries alone are only part of the Solution, after >> you have the correct packages, you need to properly configuration the >> system and setting all the right options (on GPFS as well as on CTDB >> and smbd.conf), which unfortunate are very System configuration >> specific, as otherwise you still can end up with data corruption if >> not set right. > Indeed. However I know not only what those options are, but also what > they do despite IBM's refusal to tell us anything about them. > > I would also point out that there are sites that where running Samba on > top of GPFS for many years before IBM began offering their > SONAS/Storwize Unifed products. > > > JAB. > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at gpfsug.org Mon Dec 16 16:05:26 2013 From: chair at gpfsug.org (Chair) Date: Mon, 16 Dec 2013 16:05:26 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AF24C6.1000105@gpfsug.org> Hi Sven, Many thanks for taking the time to write back in detail. This is exactly the sort of discussion the user group is aimed at - I.E. technical discussion outside of the GPFS Developer Forum. I heartily encourage other IBMers to get involved. Regards, Jez On 16/12/13 15:31, Sven Oehme wrote: > Jez, > > the other replies to my email where kind of unexpected as i was indeed > intending to help, i will continue to provide details and try to > answer serious questions and comments. 
> > also as you point out there is no secret master plan to spread > misinformation, the topic is simply complicated and there are reasons > that are hard to explain why there is no full official support outside > the current shipping IBM products and i will not dive into discussions > around it. > > a few words of clarification, on the controversial point, compile ctdb > yes/no. > while its technically correct, that you don't have to recompile ctdb > and it has no code that links/depends in any form on GPFS (samba > does), you still have to compile one if the one shipped with your > distro is out of date and while i didn't explicitly wrote it this way, > this is what i intended to say. > > the version RH ships is very old and since has gotten many fixes which > i highly recommend people to use unless the distros will pickup more > recent version like RHEL7 will do. i probably wouldn't recommend a > daily git build, but rather the packaged versions that are hosted on > ftp.samba.org like : > _https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/_ > > so the proposed order of things would be to install gpfs, pull the src > package of ctdb, compile and install ctdb and the devel packages, then > pull a recent samba src package , install all the dependencies and > build samba on this same host with gpfs and ctdb packages already > installed. the resulting rpm's should contain the proper code to > continue. > > after you got your versions compiled, installed and the basics of ctdb > running, you should use the following smb.conf as a starting point : > > [global] > netbios name = cluster > fileid:mapping = fsname > gpfs:sharemodes = yes > gpfs:leases = yes > gpfs:dfreequota = yes > gpfs:hsm = yes > syncops:onmeta = no > kernel oplocks = no > level2 oplocks = yes > notify:inotify = no > vfs objects = shadow_copy2 syncops gpfs fileid > shadow:snapdir = .snapshots > shadow:fixinodes = yes > shadow:snapdirseverywhere = yes > shadow:sort = desc > wide links = no > async smb echo handler = yes > smbd:backgroundqueue = False > use sendfile = no > strict locking = yes > posix locking = yes > force unknown acl user = yes > nfs4:mode = simple > nfs4:chown = yes > nfs4:acedup = merge > nfs4:sidmap = /etc/samba/sidmap.tdb > gpfs:winattr = yes > store dos attributes = yes > map readonly = yes > map archive = yes > map system = yes > map hidden = yes > ea support = yes > dmapi support = no > unix extensions = no > socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 > TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 > strict allocate = yes > gpfs:prealloc = yes > > > if you don't configure using the registry you need to maintain the > smb.conf file on all your nodes and i am not diving into how to setup > the registry, but for the people who care, michael adam's presentation > is a good starting point > _http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf_ > > > also depending on your authentication/idmap setup , there are multiple > changes/additions that needs to be made. > > on the gpfs side you should set the following config parameters : > > cifsBypassShareLocksOnRename=yes > syncSambaMetadataOps=yes > cifsBypassTraversalChecking=yes > allowSambaCaseInsensitiveLookup=no > > your filesystem should have the following settings configured : > > -D nfs4 File locking semantics in effect > -k nfs4 ACL semantics in effect > -o nfssync Additional mount options > --fastea Yes Fast external attributes enabled? 
> -S relatime Suppress atime mount option > > the -S is a performance optimization, but you need to check if your > backup/av or other software can deal with it, it essentially reduces > the frequency of atime updates to once every 24 hours which is the new > default on nfs mounts and others as well. > > a lot of the configuration parameters above are also above and beyond > locking and data integrity, they are for better windows compatibility > and should be checked if applicable for your environment. > > i would also recommend to run on GPFS 3.5 TL3 or newer to get the > proper GPFS level of fixes for this type of configurations. > > i would like to repeat that i don't write this email to encourage > people to all go start installing/ configuring samba on top of GPFS as > i pointed out that you are kind on your own unless you can read source > code and/or have somebody who does and is able to help as soon as you > run into a problem. > the main point of sharing this information is to clarify a lot of > misinformation out there and provide the people who already have > setups that are incorrect the information to at least not run into > data corruption issues do to wrong configuration. > > Sven > > > > > From: Chair > To: gpfsug-discuss at gpfsug.org > Date: 12/16/2013 03:31 AM > Subject: Re: [gpfsug-discuss] GPFS and both Samba and NFS > Sent by: gpfsug-discuss-bounces at gpfsug.org > ------------------------------------------------------------------------ > > > > Allo > > Just jumping in here a minute: > > > It is unworthy of an IBM employee to spread such inaccurate > misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues. > > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. > > > Regards, > > Jez > -- > UG Chair > > > > On 16/12/13 11:21, Jonathan Buzzard wrote: > > On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: > > > > [SNIP] > > > >> the only way to get something working (don't get confused with > >> officially Supported) is to recompile the CTDB src packages AND the > >> Samba src packages on a node that has GPFS already installed. also the > >> inclusion of CTDB into Samba will not address this, its just a more > >> convenient packaging. > >> > >> Only if the build happens on such a node things like the vfs modules > >> for GPFS are build and included in the package. > >> > > That is a factually inaccurate statement. There is nothing in CTDB that > > is GPFS specific. Trust me I have examined the code closely to determine > > if this is the case. So unless this has changed recently you are flat > > out wrong. > > > > Consequently there is no requirement whatsoever to rebuild CTDB to get > > the vfs_gpfs module. In addition there is also no requirement to > > actually have GPFS installed to build the vfs_gpfs module either. What > > you need to have is the GPFS GPL header files and nothing else. As it is > > a loadable VFS module linking takes place at load time not compile time. > > > > It is unworthy of an IBM employee to spread such inaccurate > > misinformation. 
> > > > [SNIP] > > > >> said all this the binaries alone are only part of the Solution, after > >> you have the correct packages, you need to properly configuration the > >> system and setting all the right options (on GPFS as well as on CTDB > >> and smbd.conf), which unfortunate are very System configuration > >> specific, as otherwise you still can end up with data corruption if > >> not set right. > > Indeed. However I know not only what those options are, but also what > > they do despite IBM's refusal to tell us anything about them. > > > > I would also point out that there are sites that where running Samba on > > top of GPFS for many years before IBM began offering their > > SONAS/Storwize Unifed products. > > > > > > JAB. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Mon Dec 16 17:31:29 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 17:31:29 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <1387215089.7230.80.camel@buzzard.phy.strath.ac.uk> On Mon, 2013-12-16 at 07:31 -0800, Sven Oehme wrote: [SNIP] > a few words of clarification, on the controversial point, compile ctdb > yes/no. > while its technically correct, that you don't have to recompile ctdb > and it has no code that links/depends in any form on GPFS (samba > does), you still have to compile one if the one shipped with your > distro is out of date and while i didn't explicitly wrote it this way, > this is what i intended to say. > Fair enough, but what you actually said is not remotely close to that. > the version RH ships is very old and since has gotten many fixes which > i highly recommend people to use unless the distros will pickup more > recent version like RHEL7 will do. i probably wouldn't recommend a > daily git build, but rather the packaged versions that are hosted on > ftp.samba.org like : > https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/ > > so the proposed order of things would be to install gpfs, pull the src > package of ctdb, compile and install ctdb and the devel packages, As I pointed out that is a pointless exercise. Just install the packages. > then pull a recent samba src package , install all the dependencies > and build samba on this same host with gpfs and ctdb packages already > installed. the resulting rpm's should contain the proper code to > continue. I would in the strongest possible terms recommend using a VM for compiling packages. Anyone compiling and maintaining packages on the production or even for that matter a test GPFS cluster deserves sacking. 
> after you got your versions compiled, installed and the basics of ctdb > running, you should use the following smb.conf as a starting point : > > [global] > netbios name = cluster > fileid:mapping = fsname > gpfs:sharemodes = yes > gpfs:leases = yes > gpfs:dfreequota = yes > gpfs:hsm = yes > syncops:onmeta = no > kernel oplocks = no > level2 oplocks = yes > notify:inotify = no > vfs objects = shadow_copy2 syncops gpfs fileid > shadow:snapdir = .snapshots > shadow:fixinodes = yes > shadow:snapdirseverywhere = yes > shadow:sort = desc > wide links = no > async smb echo handler = yes > smbd:backgroundqueue = False > use sendfile = no > strict locking = yes > posix locking = yes > force unknown acl user = yes > nfs4:mode = simple > nfs4:chown = yes > nfs4:acedup = merge > nfs4:sidmap = /etc/samba/sidmap.tdb > gpfs:winattr = yes > store dos attributes = yes > map readonly = yes > map archive = yes > map system = yes > map hidden = yes > ea support = yes > dmapi support = no > unix extensions = no > socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 > TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 > strict allocate = yes > gpfs:prealloc = yes > Nothing new there for those in the know. That said some of those options might not be wanted. > > if you don't configure using the registry you need to maintain the > smb.conf file on all your nodes and i am not diving into how to setup > the registry, but for the people who care, michael adam's presentation > is a good starting point > http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf > > also depending on your authentication/idmap setup , there are multiple > changes/additions that needs to be made. > > on the gpfs side you should set the following config parameters : > > cifsBypassShareLocksOnRename=yes > syncSambaMetadataOps=yes > cifsBypassTraversalChecking=yes > allowSambaCaseInsensitiveLookup=no > The cifsBypassTraversalChecking=yes is very very much a site based choice. You need to understand what it does because it could create a security nightmare if you just randomly enable it. On the other hand it could be a life saver if you are moving from an existing Windows server. Basically it enables Windows security schematics rather than Posix based ones. That is you will no longer need access to all the parent folders to a file in order to be able to access the file. So long as you have permissions on the file you are good to go. > your filesystem should have the following settings configured : > > -D nfs4 File locking semantics in > effect > -k nfs4 ACL semantics in effect Really I would have expected -k samba :-) > -o nfssync Additional mount options > --fastea Yes Fast external attributes > enabled? > -S relatime Suppress atime mount > option > > the -S is a performance optimization, but you need to check if your > backup/av or other software can deal with it, it essentially reduces > the frequency of atime updates to once every 24 hours which is the new > default on nfs mounts and others as well. How does -S option impact on doing policy based tiering? I would approach that with caution if you are doing any tiering or HSM. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. 
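On the -S question, one point worth spelling out: with relatime-style updates the atime the policy engine sees can lag the last read by up to roughly a day (per the 24-hour figure above), which is normally harmless for day-granularity ILM rules but not for anything finer. A minimal sketch of the kind of rule that depends on ACCESS_TIME, with made-up pool names, thresholds and device name, run in test mode only:

    /* policy.rules -- hypothetical example, not a recommendation */
    RULE 'cold_to_nearline' MIGRATE FROM POOL 'system'
         THRESHOLD(85,70)
         WEIGHT(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))
         TO POOL 'nearline'
         WHERE DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME) > 30

    # dry run; check the candidate list before doing anything for real
    mmapplypolicy gpfsdev -P policy.rules -I test

A 30-day rule like this shifts by at most a day under relatime; a rule (or HSM recall pattern) that cares about hours rather than days would be the one to worry about.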
From jonathan at buzzard.me.uk Mon Dec 16 20:46:14 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 20:46:14 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AF6696.6090109@buzzard.me.uk>

On 16/12/13 15:31, Sven Oehme wrote: [SNIP] > > i would like to repeat that i don't write this email to encourage people > to all go start installing/ configuring samba on top of GPFS as i > pointed out that you are kind on your own unless you can read source > code and/or have somebody who does and is able to help as soon as you > run into a problem. > the main point of sharing this information is to clarify a lot of > misinformation out there and provide the people who already have setups > that are incorrect the information to at least not run into data > corruption issues do to wrong configuration. >

I meant to say that is *VERY* sound advice. Roughly speaking, if you had not already worked out just about everything in that email, and more besides, you should not be rolling your own GPFS/Samba/CTDB solution, because you don't know what you are doing and it will likely end in tears at some point.

JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom.

From amsterdamos at gmail.com Tue Dec 17 03:49:44 2013 From: amsterdamos at gmail.com (Adam Wead) Date: Mon, 16 Dec 2013 22:49:44 -0500 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AF6696.6090109@buzzard.me.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <52AF6696.6090109@buzzard.me.uk> Message-ID:

Hi all,

I've been following this discussion because I use GPFS with both NFS and Samba. Although, now I'm a bit concerned because it sounds like this may not be an appropriate thing to do. It seems like the issue is the clustered implementations of Samba and NFS, yes? Or is this for any implementation?

I use GPFS 3.4 under RHEL5, but only on one machine. We do not have a cluster of GPFS systems, so it's the plain version of NFS and Samba that comes with RedHat--no CTDB. My one-node GPFS system exports NFS shares to which other servers write data, as well as hosts a few Samba shares for desktop computers. So now I'm curious. Is this wrong to do? I haven't had any problems in the three years I've had it running.

To clarify, while NFS and Samba are both writing data to the same GPFS filesystem, they're not writing to the same files or directories. NFS is for server-to-server data, and Samba is for desktop clients, which is relegated to different directories of the same file system.

____________________________________________ Adam Wead Systems and Digital Collections Librarian Rock and Roll Hall of Fame and Museum 216.515.1960 (t) 215.515.1964 (f)

On Mon, Dec 16, 2013 at 3:46 PM, Jonathan Buzzard wrote: > On 16/12/13 15:31, Sven Oehme wrote: > > [SNIP] > > > >> i would like to repeat that i don't write this email to encourage people >> to all go start installing/ configuring samba on top of GPFS as i >> pointed out that you are kind on your own unless you can read source >> code and/or have somebody who does and is able to help as soon as you >> run into a problem.
>> the main point of sharing this information is to clarify a lot of >> misinformation out there and provide the people who already have setups >> that are incorrect the information to at least not run into data >> corruption issues do to wrong configuration. >> >> > I meant to say that is *VERY* sound advice. Roughly speaking if you had > not already worked out just about everything in that email and more besides > you should not be rolling your own GPFS/Samba/CTDB solution because you > don't know what you are doing and it will likely end in tears at some point. > > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From jonathan at buzzard.me.uk Tue Dec 17 21:21:51 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 17 Dec 2013 21:21:51 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <52AF6696.6090109@buzzard.me.uk> Message-ID: <52B0C06F.1040007@buzzard.me.uk>

On 17/12/13 03:49, Adam Wead wrote: > Hi all, > > I've been following this discussion because I use GPFS with both NFS and > Samba. Although, now I'm a bit concerned because it sounds like this > may not be an appropriate thing to do. It seems like the issue is the > clustered implementations of Samba and NFS, yes? Or is this for any > implementation? >

The issue is that running a GPFS cluster with Samba/NFS etc. on top is not in the same league as installing Linux on a single server with some storage attached and configuring Samba/NFS. The system is much more complex and there are nasty ways in which your data can get corrupted. To proceed, for example, without a functional clone test system would be foolhardy in the extreme.

For example, in my last job I had a functional clone of the live GPFS systems. By functional clone I mean real hardware, with similar FC cards, attached to the same type of storage (in this case LSI/NetApp Engenio based), with NSDs of the same type (though fewer), running the same OS image, same multipathing drivers, same GPFS version, same Samba/CTDB versions. In addition I then had a standalone Samba server, same OS, same storage, etc., all the same except no GPFS and no CTDB. It was identical apart from the vfs_gpfs module. One of the reasons I chose to compile my own vfs_gpfs and insert it into pucka RHEL is that I wanted to be able to test issues against a known target that I could get support on. Then, for good measure, a real Windows test server, because there is nothing like being able to rule out problems as being down to the client not working properly with SMB. Finally, virtual machines for building my vfs_gpfs modules.

If you think I am being over the top with my test platform, let me assure you that *ALL* of it was absolutely essential for the diagnosis of problems with the system and the generation of fixes at one time or another. The thing is, if you don't get this without being told, then running a GPFS/Samba/CTDB service is really not for you. Also you need to understand what I call "GPFS and the Dark Arts" aka "Magic levers for Samba", what they do and why you might want them.
There are probably only a handful of people outside IBM who understand those, which is why you get warnings from people inside IBM about doing it yourself. So by all means do it, but make sure you have the test systems in place and a thorough understanding of all the technologies involved, as you are going to have to do support for yourself; you cannot ring IBM or RedHat and say "my GPFS/Samba/CTDB storage cluster is not working", as you are well off the beaten path. Sure, you can get support from IBM for GPFS issues provided it is entirely GPFS related, but if saving from Office 2010 on a shared drive with rich permissions is giving whacked-out file ownership and permission issues, that is going to be down to you to fix.

JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom.

From jfosburg at mdanderson.org Thu Dec 19 17:00:17 2013 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Thu, 19 Dec 2013 17:00:17 +0000 Subject: [gpfsug-discuss] Strange performance issue on interface nodes Message-ID:

This morning we started getting complaints that NFS mounts from our GPFS filesystem were hanging. After investigating I found that our interface nodes (we have 2) had load averages over 800, but there was essentially no CPU usage (>99% idle). I rebooted the nodes, which restored service, but in less than an hour the load averages have already climbed higher than 70. I have no waiters:

[root at d1prpstg2nsd2 ~]# mmlsnode -N waiters -L 2>/dev/null
[root at d1prpstg2nsd2 ~]#

I also have a lot of nfsd and smbd processes in a 'D' state. One of my interface servers also shows the following processes:

root 29901 0.0 0.0 105172 620 ? D< 10:03 0:00 touch /rsrch1/cnfsSharedRoot/.ha/recovery/10.113.115.56/10.113.115.57.tmp
root 30076 0.0 0.0 115688 860 ? D 10:03 0:00 ls -A -I *.tmp /rsrch1/cnfsSharedRoot/.ha/recovery/10.113.115.56

Those processes have been running almost an hour. The cluster is running GPFS 3.5.12, there are 6 NSD servers and 2 interface servers. Does anyone have thoughts as to what is going on?

-------------- next part -------------- An HTML attachment was scrubbed... URL:
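One note on the numbers above: the Linux load average counts tasks in uninterruptible sleep ('D' state) as well as runnable ones, so a few hundred nfsd/smbd threads blocked on the filesystem will push the load into the hundreds while the CPUs sit idle. A rough first-pass triage from one of the interface nodes might look like the sketch below; the pid is just the example from the ps output above, and /proc/<pid>/stack needs a reasonably recent kernel.

    # which processes are stuck in D state, and what they are blocked in
    ps -eo pid,stat,wchan:30,args | awk '$2 ~ /^D/'

    # kernel call chain for one stuck process
    cat /proc/29901/stack

    # long-running GPFS waiters on this node only
    /usr/lpp/mmfs/bin/mmdiag --waiters

    # is the CNFS shared root itself responsive?
    time ls /rsrch1/cnfsSharedRoot/.ha

If the hangs all sit under the CNFS recovery directory, as the touch and ls above suggest, that points below the NFS layer, at access to the shared root on GPFS, rather than at the NFS daemons themselves.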
> > Note: there is another thing I'm trying to pinpoint. A temporary > imbalance was created by adding a new NSD. It seems that a group of > files have been created on that same NSD and a user keeps hitting that > NSD causing a high load. I'm trying to pinpoint the origin of that too. > At least until everything is balance back. But will balancing spread > those files since they are already on the most empty NSD? > > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Alex Chekholko chekh at stanford.edu From richard.lefebvre at calculquebec.ca Mon Dec 9 21:05:41 2013 From: richard.lefebvre at calculquebec.ca (Richard Lefebvre) Date: Mon, 09 Dec 2013 16:05:41 -0500 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A61F94.7080702@stanford.edu> References: <52A26C79.7010706@calculquebec.ca> <52A61F94.7080702@stanford.edu> Message-ID: <52A630A5.4040400@calculquebec.ca> Hi Alex, I should have mention that my GPFS network is done through infiniband/RDMA, so looking at the TCP probably won't work. I will try to see if the traffic can be seen through ib0 (instead of eth0), but I have my doubts. As for the placement. The file system was 95% full when I added the new NSDs. I know that what is waiting now from the waiters commands is the to the 2 NSDs: waiting 0.791707000 seconds, NSDThread: for I/O completion on disk d9 I have added more NSDs since then but the waiting is still on the 2 disks. None of the others. Richard On 12/09/2013 02:52 PM, Alex Chekholko wrote: > Hi Richard, > > I would just use something like 'iftop' to look at the traffic between > the nodes. Or 'collectl'. Or 'dstat'. > > e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 > http://dag.wiee.rs/home-made/dstat/ > > For the NSD balance question, since GPFS stripes the blocks evenly > across all the NSDs, they will end up balanced over time. Or you can > rebalance manually with 'mmrestripefs -b' or similar. > > It is unlikely that particular files ended up on a single NSD, unless > the other NSDs are totally full. > > Regards, > Alex > > On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >> Hi, >> >> I'm looking for a way to see which node (or nodes) is having an impact >> on the gpfs server nodes which is slowing the whole file system? What >> happens, usually, is a user is doing some I/O that doesn't fit the >> configuration of the gpfs file system and the way it was explain on how >> to use it efficiently. It is usually by doing a lot of unbuffered byte >> size, very random I/O on the file system that was made for large files >> and large block size. >> >> My problem is finding out who is doing that. I haven't found a way to >> pinpoint the node or nodes that could be the source of the problem, with >> over 600 client nodes. >> >> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >> that I cannot pinpoint on something. >> >> I must be missing something simple. Anyone got any help? >> >> Note: there is another thing I'm trying to pinpoint. A temporary >> imbalance was created by adding a new NSD. It seems that a group of >> files have been created on that same NSD and a user keeps hitting that >> NSD causing a high load. I'm trying to pinpoint the origin of that too. >> At least until everything is balance back. But will balancing spread >> those files since they are already on the most empty NSD? 
>> >> Richard >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > From chekh at stanford.edu Mon Dec 9 21:21:24 2013 From: chekh at stanford.edu (Alex Chekholko) Date: Mon, 09 Dec 2013 13:21:24 -0800 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A630A5.4040400@calculquebec.ca> References: <52A26C79.7010706@calculquebec.ca> <52A61F94.7080702@stanford.edu> <52A630A5.4040400@calculquebec.ca> Message-ID: <52A63454.7010605@stanford.edu> Hi Richard, For IB traffic, you can use 'collectl -sx' http://collectl.sourceforge.net/Infiniband.html or else mmpmon (which is what 'dstat --gpfs' uses underneath anyway) If your other NSDs are full, then of course all writes will go to the empty NSDs. And then reading those new files your performance will be limited to just the new NSDs. Regards, Alex On 12/09/2013 01:05 PM, Richard Lefebvre wrote: > Hi Alex, > > I should have mention that my GPFS network is done through > infiniband/RDMA, so looking at the TCP probably won't work. I will try > to see if the traffic can be seen through ib0 (instead of eth0), but I > have my doubts. > > As for the placement. The file system was 95% full when I added the new > NSDs. I know that what is waiting now from the waiters commands is the > to the 2 NSDs: > > waiting 0.791707000 seconds, NSDThread: for I/O completion on disk d9 > > I have added more NSDs since then but the waiting is still on the 2 > disks. None of the others. > > Richard > > On 12/09/2013 02:52 PM, Alex Chekholko wrote: >> Hi Richard, >> >> I would just use something like 'iftop' to look at the traffic between >> the nodes. Or 'collectl'. Or 'dstat'. >> >> e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 >> http://dag.wiee.rs/home-made/dstat/ >> >> For the NSD balance question, since GPFS stripes the blocks evenly >> across all the NSDs, they will end up balanced over time. Or you can >> rebalance manually with 'mmrestripefs -b' or similar. >> >> It is unlikely that particular files ended up on a single NSD, unless >> the other NSDs are totally full. >> >> Regards, >> Alex >> >> On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >>> Hi, >>> >>> I'm looking for a way to see which node (or nodes) is having an impact >>> on the gpfs server nodes which is slowing the whole file system? What >>> happens, usually, is a user is doing some I/O that doesn't fit the >>> configuration of the gpfs file system and the way it was explain on how >>> to use it efficiently. It is usually by doing a lot of unbuffered byte >>> size, very random I/O on the file system that was made for large files >>> and large block size. >>> >>> My problem is finding out who is doing that. I haven't found a way to >>> pinpoint the node or nodes that could be the source of the problem, with >>> over 600 client nodes. >>> >>> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >>> that I cannot pinpoint on something. >>> >>> I must be missing something simple. Anyone got any help? >>> >>> Note: there is another thing I'm trying to pinpoint. A temporary >>> imbalance was created by adding a new NSD. It seems that a group of >>> files have been created on that same NSD and a user keeps hitting that >>> NSD causing a high load. I'm trying to pinpoint the origin of that too. >>> At least until everything is balance back. 
But will balancing spread >>> those files since they are already on the most empty NSD? >>> >>> Richard >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at gpfsug.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Alex Chekholko chekh at stanford.edu 347-401-4860 From v.andrews at noc.ac.uk Tue Dec 10 09:59:06 2013 From: v.andrews at noc.ac.uk (Andrews, Vincent) Date: Tue, 10 Dec 2013 09:59:06 +0000 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A63454.7010605@stanford.edu> Message-ID: We do not have as many client nodes, but we have an extensive Ganglia configuration that monitors all of the nodes on our network. For the client nodes we also run a script that pushes stats into Ganglia using 'mmpmon'. Using this we have been able to locate problem machines a lot quicker. I have attached the script, and is released on a 'it works for me' term. We run it every minute from cron. Vince. -- Vincent Andrews NOC, European Way, Southampton , SO14 3ZH Ext. 27616 External 023 80597616 This e-mail (and any attachments) is confidential and intended solely for the use of the individual or entity to whom it is addressed. Both NERC and the University of Southampton (who operate NOCS as a collaboration) are subject to the Freedom of Information Act 2000. The information contained in this e-mail and any reply you make may be disclosed unless it is legally from disclosure. Any material supplied to NOCS may be stored in the electronic records management system of either the University or NERC as appropriate. On 09/12/2013 21:21, "Alex Chekholko" wrote: >Hi Richard, > >For IB traffic, you can use 'collectl -sx' >http://collectl.sourceforge.net/Infiniband.html >or else mmpmon (which is what 'dstat --gpfs' uses underneath anyway) > >If your other NSDs are full, then of course all writes will go to the >empty NSDs. And then reading those new files your performance will be >limited to just the new NSDs. > > >Regards, >Alex > >On 12/09/2013 01:05 PM, Richard Lefebvre wrote: >> Hi Alex, >> >> I should have mention that my GPFS network is done through >> infiniband/RDMA, so looking at the TCP probably won't work. I will try >> to see if the traffic can be seen through ib0 (instead of eth0), but I >> have my doubts. >> >> As for the placement. The file system was 95% full when I added the new >> NSDs. I know that what is waiting now from the waiters commands is the >> to the 2 NSDs: >> >> waiting 0.791707000 seconds, NSDThread: for I/O completion on disk d9 >> >> I have added more NSDs since then but the waiting is still on the 2 >> disks. None of the others. >> >> Richard >> >> On 12/09/2013 02:52 PM, Alex Chekholko wrote: >>> Hi Richard, >>> >>> I would just use something like 'iftop' to look at the traffic between >>> the nodes. Or 'collectl'. Or 'dstat'. >>> >>> e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 >>> http://dag.wiee.rs/home-made/dstat/ >>> >>> For the NSD balance question, since GPFS stripes the blocks evenly >>> across all the NSDs, they will end up balanced over time. Or you can >>> rebalance manually with 'mmrestripefs -b' or similar. >>> >>> It is unlikely that particular files ended up on a single NSD, unless >>> the other NSDs are totally full. 
>>> >>> Regards, >>> Alex >>> >>> On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >>>> Hi, >>>> >>>> I'm looking for a way to see which node (or nodes) is having an impact >>>> on the gpfs server nodes which is slowing the whole file system? What >>>> happens, usually, is a user is doing some I/O that doesn't fit the >>>> configuration of the gpfs file system and the way it was explain on >>>>how >>>> to use it efficiently. It is usually by doing a lot of unbuffered >>>>byte >>>> size, very random I/O on the file system that was made for large files >>>> and large block size. >>>> >>>> My problem is finding out who is doing that. I haven't found a way to >>>> pinpoint the node or nodes that could be the source of the problem, >>>>with >>>> over 600 client nodes. >>>> >>>> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >>>> that I cannot pinpoint on something. >>>> >>>> I must be missing something simple. Anyone got any help? >>>> >>>> Note: there is another thing I'm trying to pinpoint. A temporary >>>> imbalance was created by adding a new NSD. It seems that a group of >>>> files have been created on that same NSD and a user keeps hitting that >>>> NSD causing a high load. I'm trying to pinpoint the origin of that >>>>too. >>>> At least until everything is balance back. But will balancing spread >>>> those files since they are already on the most empty NSD? >>>> >>>> Richard >>>> _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at gpfsug.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > >-- >Alex Chekholko chekh at stanford.edu 347-401-4860 >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at gpfsug.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss This message (and any attachments) is for the recipient only. NERC is subject to the Freedom of Information Act 2000 and the contents of this email and any reply you make may be disclosed by NERC unless it is exempt from release under the Act. Any material supplied to NERC may be stored in an electronic records management system. -------------- next part -------------- A non-text attachment was scrubbed... Name: ganglia_gpfs_client_stats.pl Type: text/x-perl-script Size: 3701 bytes Desc: ganglia_gpfs_client_stats.pl URL: From viccornell at gmail.com Tue Dec 10 10:13:20 2013 From: viccornell at gmail.com (Vic Cornell) Date: Tue, 10 Dec 2013 10:13:20 +0000 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A61F94.7080702@stanford.edu> References: <52A26C79.7010706@calculquebec.ca> <52A61F94.7080702@stanford.edu> Message-ID: <6726F05D-3332-4FF4-AB9D-F78B542E2249@gmail.com> Have you looked at mmpmon? Its a bit much for 600 nodes but if you run it with a reasonable interface specified then the output shouldn't be too hard to parse. Quick recipe: create a file called mmpmon.conf that looks like ################# cut here ######################### nlist add node1 node2 node3 node4 node5 io_s reset ################# cut here ######################### Where node1,node2 etc are your node names - it might be as well to do this for batches of 50 or so. 
then run something like: /usr/lpp/mmfs/bin/mmpmon -i mmpmon.conf -d 10000 -r 0 -p That will give you a set of stats for all of your named nodes aggregated over a 10 second period Dont run more than one of these as each one will reset the stats for the other :-) parse out the stats with something like: awk -F_ '{if ($2=="io"){print $8,$16/1024/1024,$18/1024/1024}}' which will give you read and write throughput. The docs (GPFS advanced Administration Guide) are reasonable. Cheers, Vic Cornell viccornell at gmail.com On 9 Dec 2013, at 19:52, Alex Chekholko wrote: > Hi Richard, > > I would just use something like 'iftop' to look at the traffic between the nodes. Or 'collectl'. Or 'dstat'. > > e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 > http://dag.wiee.rs/home-made/dstat/ > > For the NSD balance question, since GPFS stripes the blocks evenly across all the NSDs, they will end up balanced over time. Or you can rebalance manually with 'mmrestripefs -b' or similar. > > It is unlikely that particular files ended up on a single NSD, unless the other NSDs are totally full. > > Regards, > Alex > > On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >> Hi, >> >> I'm looking for a way to see which node (or nodes) is having an impact >> on the gpfs server nodes which is slowing the whole file system? What >> happens, usually, is a user is doing some I/O that doesn't fit the >> configuration of the gpfs file system and the way it was explain on how >> to use it efficiently. It is usually by doing a lot of unbuffered byte >> size, very random I/O on the file system that was made for large files >> and large block size. >> >> My problem is finding out who is doing that. I haven't found a way to >> pinpoint the node or nodes that could be the source of the problem, with >> over 600 client nodes. >> >> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >> that I cannot pinpoint on something. >> >> I must be missing something simple. Anyone got any help? >> >> Note: there is another thing I'm trying to pinpoint. A temporary >> imbalance was created by adding a new NSD. It seems that a group of >> files have been created on that same NSD and a user keeps hitting that >> NSD causing a high load. I'm trying to pinpoint the origin of that too. >> At least until everything is balance back. But will balancing spread >> those files since they are already on the most empty NSD? >> >> Richard >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Alex Chekholko chekh at stanford.edu > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From stefan.fritzsche at slub-dresden.de Tue Dec 10 10:16:35 2013 From: stefan.fritzsche at slub-dresden.de (Stefan Fritzsche) Date: Tue, 10 Dec 2013 11:16:35 +0100 Subject: [gpfsug-discuss] backup and hsm with gpfs Message-ID: <52A6EA03.30101@slub-dresden.de> Dear gpfsug, we are the SLUB, Saxon State and University Library Dresden. Our goal is to build a long term preservation system. We use gpfs and a tsm with hsm integration to backup, migrate and distribute the data over two computing centers. Currently, we are making backups with the normal tsm ba-client. Our pre-/migration runs with the gpfs-policy engine to find all files that are in the state "rersistent" and match some additional rules. 
After the scan, we create a filelist and premigrate the data with dsmmigfs. The normal backup takes a long time for the scan of the whole gpfs-filesystem, so we are looking for a better way to perform the backups. I know that I can also use the policy engine to perform the backup, but my questions are: How do I perform backups with gpfs? Is there anyone who uses the mmbackup command, or mmbackup in combination with snapshots, in production? Does anyone have any experience in writing an application with gpfs-api and/or dmapi? Thank you for your answers and proposals. Best regards, Stefan -- Stefan Fritzsche SLUB email: stefan.fritzsche at slub-dresden.de Zellescher Weg 18 ---------------------------------------------
From chekh at stanford.edu Tue Dec 10 19:36:24 2013 From: chekh at stanford.edu (Alex Chekholko) Date: Tue, 10 Dec 2013 11:36:24 -0800 Subject: [gpfsug-discuss] backup and hsm with gpfs In-Reply-To: <52A6EA03.30101@slub-dresden.de> References: <52A6EA03.30101@slub-dresden.de> Message-ID: <52A76D38.8010301@stanford.edu> Hi Stefan, Since you're using TSM with GPFS, are you following their current integration instructions? My understanding is that what you want is a regular use case of TSM/GPFS backups. For file system scans, I believe that the policy engine scales linearly with the number of nodes you run it on. Can you add more storage nodes? Or run your policy scans across more existing nodes? Regards, Alex On 12/10/13, 2:16 AM, Stefan Fritzsche wrote: > Dear gpfsug, > > we are the SLUB, Saxon State and University Library Dresden. > > Our goal is to build a long term preservation system. We use gpfs and a > tsm with hsm integration to backup, migrate and distribute the data over > two computing centers. > Currently, we are making backups with the normal tsm ba-client. > Our pre-/migration runs with the gpfs-policy engine to find all files > that are in the state "rersistent" and match some additional rules. > After the scan, we create a filelist and premigrate the data with dsmmigfs. > > The normal backup takes a long time for the scan of the whole > gpfs-filesystem, so we are looking for a better way to perfom the backups. > I know that i can also use the policy engine to perfom the backup but my > questions are: > > How do I perform backups with gpfs? > > Is there anyone who uses the mmbackup command or mmbackup in companies > with snapshots? > > Does anyone have any expirence in writing an application with gpfs-api > and/or dmapi? > > Thank you for your answers and proposals. > > Best regards, > Stefan > > -- chekh at stanford.edu
From ewahl at osc.edu Tue Dec 10 20:34:02 2013 From: ewahl at osc.edu (Ed Wahl) Date: Tue, 10 Dec 2013 20:34:02 +0000 Subject: [gpfsug-discuss] backup and hsm with gpfs In-Reply-To: <52A6EA03.30101@slub-dresden.de> References: <52A6EA03.30101@slub-dresden.de> Message-ID: I have moderate experience with mmbackup, dmapi (though NOT with custom apps) and both the Tivoli HSM and the newer LTFS-EE product (relies on dsmmigfs for a backend). How much time does a standard dsmc backup scan take? And an mmapplypolicy scan? So you have both a normal backup with dsmc today and also want to push to HSM with policy engine? Are these separate storage destinations? If they are the same, perhaps using mmbackup and making DR copies inside TSM is better? Or would that affect other systems being backed up to TSM? Or perhaps configure a storage pool for TSM that only handles the special files such that they don't mix tapes?
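For anyone wanting to try the snapshot route, the flow is roughly as follows. This is only a sketch: the device name, snapshot name and node list are made up, and the exact mmbackup flags vary between GPFS/TSM levels, so check the man pages for your release first.

# take a consistent point-in-time view to back up from
mmcrsnapshot gpfs0 mmbackupSnap
# incremental backup of the file system, spread over two backup nodes
mmbackup /gpfs/gpfs0 -t incremental -N backup01,backup02 -S mmbackupSnap
# remove the snapshot once the backup completes
mmdelsnapshot gpfs0 mmbackupSnap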
mmbackup uses the standard policy engine scans with (unfortunately) a set # of directory and scan threads (defaults to 24 but ends up a somewhat higher # on first node of the backup) unlike a standard mmapplypolicy where you can adjust the thread levels and the only adjustment is "-m #" which adjusts how many dsmc threads/processes run per node. Overall I find the mmbackup with multi-node support to be _much_ faster than the linear dsmc scans. _WAY_ too thread heavy and insanely high IO loads on smaller GPFS's with mid-range to slower metadata though. (almost all IO Wait with loads well over 100 on an 8 core server) Depending on your version of both TSM and GPFS you can quickly convert from dsmc schedule to mmbackup with snapshots using -q or -rebuild options. Be aware there are some versions of GPFS that do NOT work with snapshots and mmbackup, and there are quite a few gotchas in the TSM integration. The largest of which is if you normally use TSM virtualmountpoints. That is NOT supported in GPFS. It will backup happily, but restoration is more amusing and it creates a TSM filespace per vmp. This currently breaks the shadowDB badly and makes 'rebuilds' damn near impossible in the newest GPFS and just annoying in older versions. All that being said, the latest version of GPFS and anything above about TSM 6.4.x client seem to work well for us. Ed Wahl OSC ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Stefan Fritzsche [stefan.fritzsche at slub-dresden.de] Sent: Tuesday, December 10, 2013 5:16 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] backup and hsm with gpfs Dear gpfsug, we are the SLUB, Saxon State and University Library Dresden. Our goal is to build a long term preservation system. We use gpfs and a tsm with hsm integration to backup, migrate and distribute the data over two computing centers. Currently, we are making backups with the normal tsm ba-client. Our pre-/migration runs with the gpfs-policy engine to find all files that are in the state "rersistent" and match some additional rules. After the scan, we create a filelist and premigrate the data with dsmmigfs. The normal backup takes a long time for the scan of the whole gpfs-filesystem, so we are looking for a better way to perfom the backups. I know that i can also use the policy engine to perfom the backup but my questions are: How do I perform backups with gpfs? Is there anyone who uses the mmbackup command or mmbackup in companies with snapshots? Does anyone have any expirence in writing an application with gpfs-api and/or dmapi? Thank you for your answers and proposals. Best regards, Stefan -- Stefan Fritzsche SLUB email: stefan.fritzsche at slub-dresden.de Zellescher Weg 18 --------------------------------------------- _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From rltodd.ml1 at gmail.com Thu Dec 12 18:14:08 2013 From: rltodd.ml1 at gmail.com (Lindsay Todd) Date: Thu, 12 Dec 2013 13:14:08 -0500 Subject: [gpfsug-discuss] GPFS and both Samba and NFS Message-ID: Hello, Since this is my first note to the group, I'll introduce myself first. I am Lindsay Todd, a Systems Programmer at Rensselaer Polytechnic Institute's Center for Computational Innovations, where I run a 1.2PiB GPFS cluster serving a Blue Gene/Q and a variety of Opteron and Intel clients, run an IBM Watson, and serve as an adjunct faculty. 
I also do some freelance consulting, including GPFS, for several customers. One of my customers is needing to serve GPFS storage through both NFS and Samba; they have GPFS 3.5 running on RHEL5 (not RHEL6) servers. I did not set this up for them, but was called to help fix it. Currently they export NFS using cNFS; I think we have that straightened out server-side now. Also they run Samba on several of the servers; I'm sure the group will not be surprised to hear they experience file corruption and other strange problems. I've been pushing them to use Samba-CTDB, and it looks like it will happen. Except, I've never used this myself. So this raises a couple questions: 1) It looks like RHEL5 bundles in an old version of CTDB. Should that be used, or would we be better with a build from the Enterprise Samba site, or even a build from source? 2) Given that CTDB can also run NFS, what are people who need both finding works best: run both cNFS + Samba-CTDB, or let CTDB run both? It seems to me that if I let CTDB run both, I only need a single floating IP address for each server, while if I also use cNFS, I will want a floating address for both NFS and Samba, on each server. Thanks for the help! R. Lindsay Todd, PhD -------------- next part -------------- An HTML attachment was scrubbed... URL: From seanlee at tw.ibm.com Fri Dec 13 12:53:14 2013 From: seanlee at tw.ibm.com (Sean S Lee) Date: Fri, 13 Dec 2013 20:53:14 +0800 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: Message-ID: Hi Todd Regarding your second question: it's usually better to use distinct IP's for the two services. If they're both using the same set of virtual IP's, then a failure of one service will cause the associated virtual IP to failover elsewhere which can be disruptive for users of the other service running at that VIP. Ideally they would use just one file sharing protocol, so maybe you can heck if that is possible. Regards Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From orlando.richards at ed.ac.uk Fri Dec 13 15:15:03 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 13 Dec 2013 15:15:03 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: Message-ID: <52AB2477.50802@ed.ac.uk> On 12/12/13 18:14, Lindsay Todd wrote: > Hello, > > Since this is my first note to the group, I'll introduce myself first. > I am Lindsay Todd, a Systems Programmer at Rensselaer Polytechnic > Institute's Center for Computational Innovations, where I run a 1.2PiB > GPFS cluster serving a Blue Gene/Q and a variety of Opteron and Intel > clients, run an IBM Watson, and serve as an adjunct faculty. I also do > some freelance consulting, including GPFS, for several customers. > > One of my customers is needing to serve GPFS storage through both NFS > and Samba; they have GPFS 3.5 running on RHEL5 (not RHEL6) servers. I > did not set this up for them, but was called to help fix it. Currently > they export NFS using cNFS; I think we have that straightened out > server-side now. Also they run Samba on several of the servers; I'm > sure the group will not be surprised to hear they experience file > corruption and other strange problems. > > I've been pushing them to use Samba-CTDB, and it looks like it will > happen. Except, I've never used this myself. So this raises a couple > questions: > > 1) It looks like RHEL5 bundles in an old version of CTDB. 
Should that be > used, or would we be better with a build from the Enterprise Samba site, > or even a build from source? > Hi Lindsay, We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), after running into performance problems with the sernet bundled version (1.0.114). It's easy to build: git clone git://git.samba.org/ctdb.git ctdb.git cd ctdb.git git branch -r git checkout -b "my_build" origin/1.2.40 cd packaging/RPM/ ./makerpms.sh yum install /root/rpmbuild/RPMS/x86_64/ctdb*.rpm I then take the Sernet src rpm and rebuild it, using ctdb.h from the above rather than the 1.0.114 version they use. This is possibly not required, but I thought it best to be sure that the differing headers wouldn't cause any problems. I remain, as ever, very grateful to Sernet for providing these! > 2) Given that CTDB can also run NFS, what are people who need both > finding works best: run both cNFS + Samba-CTDB, or let CTDB run both? > It seems to me that if I let CTDB run both, I only need a single > floating IP address for each server, while if I also use cNFS, I will > want a floating address for both NFS and Samba, on each server. > We let CTDB run both, but we didn't come to that decision by comparing the merits of both options. I think Bristol (Bob Cregan is cc'd, I'm not sure he's on this list) run cNFS and CTDB side by side. As you say - you'd at least require different IP addresses to do that. > Thanks for the help! Best of luck :) > > R. Lindsay Todd, PhD > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From jonathan at buzzard.me.uk Fri Dec 13 15:31:03 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Fri, 13 Dec 2013 15:31:03 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AB2477.50802@ed.ac.uk> References: <52AB2477.50802@ed.ac.uk> Message-ID: <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: [SNIP] > Hi Lindsay, > > We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), > after running into performance problems with the sernet bundled version > (1.0.114). It's easy to build: Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with Samba 4.1 http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From orlando.richards at ed.ac.uk Fri Dec 13 15:35:01 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 13 Dec 2013 15:35:01 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AB2925.1020809@ed.ac.uk> On 13/12/13 15:31, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: > > [SNIP] > >> Hi Lindsay, >> >> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), >> after running into performance problems with the sernet bundled version >> (1.0.114). 
It's easy to build: > > Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with > Samba 4.1 > > http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ > > JAB. > The samba team are currently working to bring ctdb into the main samba source tree - so hopefully this will become a moot point soon! -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
From oehmes at us.ibm.com Fri Dec 13 18:14:45 2013 From: oehmes at us.ibm.com (Sven Oehme) Date: Fri, 13 Dec 2013 10:14:45 -0800 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AB2925.1020809@ed.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> Message-ID: Orlando, given that there have been so many emails and so much interest on this topic in recent months, let me share some personal expertise on this :-)
any stock Samba or CTDB version you will find in any distro is not sufficient, and it doesn't matter which you choose (SLES, RHEL or any form of debian, and any version of all of them). the reason is that Samba doesn't have the GPFS header and library files included in its source, and at compile time it dynamically enables/disables all GPFS related things based on the availability of the GPFS packages. as none of the distros' build machines have GPFS installed, all these packages end up with binaries in their rpms which don't have the required code enabled to properly support GPFS, and none of the vfs modules get built either.
the only way to get something working (don't get confused with officially Supported) is to recompile the CTDB src packages AND the Samba src packages on a node that has GPFS already installed. also the inclusion of CTDB into Samba will not address this, its just a more convenient packaging. Only if the build happens on such a node things like the vfs modules for GPFS are build and included in the package.
said all this the binaries alone are only part of the Solution, after you have the correct packages, you need to properly configuration the system and setting all the right options (on GPFS as well as on CTDB and smbd.conf), which unfortunate are very System configuration specific, as otherwise you still can end up with data corruption if not set right.
also some people in the past have used a single instance of Samba to export shares over CIFS as they believe it's a safe alternative to a more complex CTDB setup. here too a word of caution: even if you have only a single instance of Samba running on top of GPFS, you are exposed to potential data corruption if you don't use the proper Samba version (explained above) and the proper configuration. you can skip CTDB for that, but you still require a properly compiled version of Samba, with the GPFS code installed on the build machine.
to be very clear, the problem is not GPFS; it's that Samba does locking AND caching on top of the filesystem without GPFS's knowledge if you don't use the right code/config to 'tell' GPFS about it, so GPFS cannot ensure data consistency, not even on the same physical node, for data that's shared over CIFS. there are unfortunately no shortcuts.
i also have to point out that if you recompile the packages and configure everything correctly this is most likely to work, but you won't get official support for the CIFS part of this setup from IBM.
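a quick sanity check on whether a given set of samba packages was actually built with the GPFS pieces - the paths are distro and build dependent, so treat this only as a sketch:

# the GPFS VFS module should exist (usual location for 64-bit RHEL style builds)
ls -l /usr/lib64/samba/vfs/gpfs.so
# and the share definitions should actually reference it
testparm -s /etc/samba/smb.conf 2>/dev/null | grep "vfs objects"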
This email is not an official Statement/Response of IBM, see it as personal 'AS-IS' Information sharing. Sven From: Orlando Richards To: gpfsug-discuss at gpfsug.org Date: 12/13/2013 07:35 AM Subject: Re: [gpfsug-discuss] GPFS and both Samba and NFS Sent by: gpfsug-discuss-bounces at gpfsug.org On 13/12/13 15:31, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: > > [SNIP] > >> Hi Lindsay, >> >> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), >> after running into performance problems with the sernet bundled version >> (1.0.114). It's easy to build: > > Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with > Samba 4.1 > > http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ > > JAB. > The samba team are currently working to bring ctdb into the main samba source tree - so hopefully this will become a moot point soon! -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Mon Dec 16 11:02:13 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 11:02:13 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AB2925.1020809@ed.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> Message-ID: <1387191733.7230.8.camel@buzzard.phy.strath.ac.uk> On Fri, 2013-12-13 at 15:35 +0000, Orlando Richards wrote: > On 13/12/13 15:31, Jonathan Buzzard wrote: > > On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: > > > > [SNIP] > > > >> Hi Lindsay, > >> > >> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), > >> after running into performance problems with the sernet bundled version > >> (1.0.114). It's easy to build: > > > > Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with > > Samba 4.1 > > > > http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ > > > > JAB. > > > The samba team are currently working to bring ctdb into the main samba > source tree - so hopefully this will become a moot point soon! Yes I am aware of this. The point of bringing up what is going into RHEL7 was to get a flavour of what RedHat consider stable enough to push out into a supported enterprise product. I always ran my GPFS/Samba/CTDB clusters by taking a stock RHEL Samba, and patching the spec file to build the vfs_gpfs module possibly with extra patches to the vfs_gpfs module for bug fixes against the GPFS version running on the cluster and have it produce a suitable RPM with just the vfs_gpfs module that I could then load into the stock RHEL Samba. It would appear that RedHat are doing something similar in RHEL7 with a vfs_glusterfs RPM for Samba. Of course even with CTDB in Samba you are still going to need to do some level of rebuilding because you won't get the vfs_gpfs module otherwise. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. 
From jonathan at buzzard.me.uk Mon Dec 16 11:21:29 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 11:21:29 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> Message-ID: <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: [SNIP] > the only way to get something working (don't get confused with > officially Supported) is to recompile the CTDB src packages AND the > Samba src packages on a node that has GPFS already installed. also the > inclusion of CTDB into Samba will not address this, its just a more > convenient packaging. > > Only if the build happens on such a node things like the vfs modules > for GPFS are build and included in the package. > That is a factually inaccurate statement. There is nothing in CTDB that is GPFS specific. Trust me I have examined the code closely to determine if this is the case. So unless this has changed recently you are flat out wrong. Consequently there is no requirement whatsoever to rebuild CTDB to get the vfs_gpfs module. In addition there is also no requirement to actually have GPFS installed to build the vfs_gpfs module either. What you need to have is the GPFS GPL header files and nothing else. As it is a loadable VFS module linking takes place at load time not compile time. It is unworthy of an IBM employee to spread such inaccurate misinformation. [SNIP] > said all this the binaries alone are only part of the Solution, after > you have the correct packages, you need to properly configuration the > system and setting all the right options (on GPFS as well as on CTDB > and smbd.conf), which unfortunate are very System configuration > specific, as otherwise you still can end up with data corruption if > not set right. Indeed. However I know not only what those options are, but also what they do despite IBM's refusal to tell us anything about them. I would also point out that there are sites that where running Samba on top of GPFS for many years before IBM began offering their SONAS/Storwize Unifed products. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From chair at gpfsug.org Mon Dec 16 11:30:22 2013 From: chair at gpfsug.org (Chair) Date: Mon, 16 Dec 2013 11:30:22 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AEE44E.1030805@gpfsug.org> Allo Just jumping in here a minute: > It is unworthy of an IBM employee to spread such inaccurate misinformation. Whilst this may be inaccurate - I very, very, much doubt that IBM or their employees have a secret master plan to spread misinformation (!) In the spirit of this group, let's work together to technically look at such issues. Sven, if that is the case, perhaps you could crib the lines of code / show your methodology that supports your views / experience. Regards, Jez -- UG Chair On 16/12/13 11:21, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: > > [SNIP] > >> the only way to get something working (don't get confused with >> officially Supported) is to recompile the CTDB src packages AND the >> Samba src packages on a node that has GPFS already installed. 
also the >> inclusion of CTDB into Samba will not address this, its just a more >> convenient packaging. >> >> Only if the build happens on such a node things like the vfs modules >> for GPFS are build and included in the package. >> > That is a factually inaccurate statement. There is nothing in CTDB that > is GPFS specific. Trust me I have examined the code closely to determine > if this is the case. So unless this has changed recently you are flat > out wrong. > > Consequently there is no requirement whatsoever to rebuild CTDB to get > the vfs_gpfs module. In addition there is also no requirement to > actually have GPFS installed to build the vfs_gpfs module either. What > you need to have is the GPFS GPL header files and nothing else. As it is > a loadable VFS module linking takes place at load time not compile time. > > It is unworthy of an IBM employee to spread such inaccurate > misinformation. > > [SNIP] > >> said all this the binaries alone are only part of the Solution, after >> you have the correct packages, you need to properly configuration the >> system and setting all the right options (on GPFS as well as on CTDB >> and smbd.conf), which unfortunate are very System configuration >> specific, as otherwise you still can end up with data corruption if >> not set right. > Indeed. However I know not only what those options are, but also what > they do despite IBM's refusal to tell us anything about them. > > I would also point out that there are sites that where running Samba on > top of GPFS for many years before IBM began offering their > SONAS/Storwize Unifed products. > > > JAB. > From orlando.richards at ed.ac.uk Mon Dec 16 12:26:16 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 16 Dec 2013 12:26:16 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1387191733.7230.8.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387191733.7230.8.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AEF168.1000701@ed.ac.uk> On 16/12/13 11:02, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 15:35 +0000, Orlando Richards wrote: >> On 13/12/13 15:31, Jonathan Buzzard wrote: >>> On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: >>> >>> [SNIP] >>> >>>> Hi Lindsay, >>>> >>>> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), >>>> after running into performance problems with the sernet bundled version >>>> (1.0.114). It's easy to build: >>> >>> Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with >>> Samba 4.1 >>> >>> http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ >>> >>> JAB. >>> >> The samba team are currently working to bring ctdb into the main samba >> source tree - so hopefully this will become a moot point soon! > > Yes I am aware of this. The point of bringing up what is going into > RHEL7 was to get a flavour of what RedHat consider stable enough to push > out into a supported enterprise product. > > I always ran my GPFS/Samba/CTDB clusters by taking a stock RHEL Samba, > and patching the spec file to build the vfs_gpfs module possibly with > extra patches to the vfs_gpfs module for bug fixes against the GPFS > version running on the cluster and have it produce a suitable RPM with > just the vfs_gpfs module that I could then load into the stock RHEL > Samba. > > It would appear that RedHat are doing something similar in RHEL7 with a > vfs_glusterfs RPM for Samba. 
> > Of course even with CTDB in Samba you are still going to need to do some > level of rebuilding because you won't get the vfs_gpfs module otherwise. > > > Sernet include GPFS in their builds - they truly are wonderful :) -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From orlando.richards at ed.ac.uk Mon Dec 16 12:31:47 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 16 Dec 2013 12:31:47 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AEF2B3.5030805@ed.ac.uk> On 16/12/13 11:30, Chair wrote: > Allo > > Just jumping in here a minute: > >> It is unworthy of an IBM employee to spread such inaccurate >> misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues. > > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. > Presumably this all comes down to the locking setup? Relevant to this, I've got: (from samba settings:) vfs objects = shadow_copy2, fileid, gpfs, syncops clustering = yes gpfs:sharemodes = yes syncops:onmeta = no blocking locks = Yes fake oplocks = No kernel oplocks = Yes locking = Yes oplocks = Yes level2 oplocks = Yes oplock contention limit = 2 posix locking = Yes strict locking = Auto (gpfs settings:) syncSambaMetadataOps yes > > Regards, > > Jez > -- > UG Chair > > > > On 16/12/13 11:21, Jonathan Buzzard wrote: >> On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: >> >> [SNIP] >> >>> the only way to get something working (don't get confused with >>> officially Supported) is to recompile the CTDB src packages AND the >>> Samba src packages on a node that has GPFS already installed. also the >>> inclusion of CTDB into Samba will not address this, its just a more >>> convenient packaging. >>> >>> Only if the build happens on such a node things like the vfs modules >>> for GPFS are build and included in the package. >>> >> That is a factually inaccurate statement. There is nothing in CTDB that >> is GPFS specific. Trust me I have examined the code closely to determine >> if this is the case. So unless this has changed recently you are flat >> out wrong. >> >> Consequently there is no requirement whatsoever to rebuild CTDB to get >> the vfs_gpfs module. In addition there is also no requirement to >> actually have GPFS installed to build the vfs_gpfs module either. What >> you need to have is the GPFS GPL header files and nothing else. As it is >> a loadable VFS module linking takes place at load time not compile time. >> >> It is unworthy of an IBM employee to spread such inaccurate >> misinformation. 
>> >> [SNIP] >> >>> said all this the binaries alone are only part of the Solution, after >>> you have the correct packages, you need to properly configuration the >>> system and setting all the right options (on GPFS as well as on CTDB >>> and smbd.conf), which unfortunate are very System configuration >>> specific, as otherwise you still can end up with data corruption if >>> not set right. >> Indeed. However I know not only what those options are, but also what >> they do despite IBM's refusal to tell us anything about them. >> >> I would also point out that there are sites that where running Samba on >> top of GPFS for many years before IBM began offering their >> SONAS/Storwize Unifed products. >> >> >> JAB. >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
From jonathan at buzzard.me.uk Mon Dec 16 12:49:06 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 12:49:06 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk> On Mon, 2013-12-16 at 11:30 +0000, Chair wrote: > Allo > > Just jumping in here a minute: > > > It is unworthy of an IBM employee to spread such inaccurate misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues. You made me begin to doubt myself so I have double checked. Specifically the ctdb-2.1-3.el7.src.rpm from the RHEL7 beta, and it is as I claimed. There is no compilable code that is linked to or dependent on GPFS. Some of the support scripts (specifically the 60.ganesha and 62.cnfs) are GPFS aware, but no recompilation is needed as they are copied through to the binary RPM as is. You either know for a definite fact that CTDB needs recompiling on a machine with GPFS installed or you are simply spreading falsehoods. Why make such a long and detailed post if you don't know it to be true in the first place? > > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. > There is none, because he is flat out 100% wrong. I did extensive research on this when deploying a GPFS/Samba/CTDB solution. I needed a recompile to get the vfs_gpfs module, and I naturally was not going to use the RHEL supplied CTDB rpms without thoroughly checking out whether they needed recompiling. Note this all pre-dated Sernet offering pre-compiled binaries with the vfs_gpfs module included. Personally I prefer running with standard RHEL binaries with my own vfs_gpfs. It makes it easy to have a test setup that is plain Samba so one can narrow things down to being CTDB related very easily. In addition you don't know which version of the GPFS headers Sernet have compiled against.
It also helps to have pucka Windows SMB servers as well so you can rule out client side issues. The sort of thing I mean is why Powerpoint 2011 on a Mac takes minutes to save a file, where saving it under a different name takes seconds. When it exhibits the same behaviour on a Windows server you know there is nothing wrong with your Samba servers. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom.
From orlando.richards at ed.ac.uk Mon Dec 16 13:34:59 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 16 Dec 2013 13:34:59 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AF0183.9030809@ed.ac.uk> On 16/12/13 12:49, Jonathan Buzzard wrote: > In addition you don't know which version of the GPFS > headers Sernet have compiled against. Good point - in fact, it's an older version that I have on my GPFS 3.5 setup. I've updated my build process to fix that - thanks JAB :) -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
From oehmes at us.ibm.com Mon Dec 16 15:31:53 2013 From: oehmes at us.ibm.com (Sven Oehme) Date: Mon, 16 Dec 2013 07:31:53 -0800 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: Jez, the other replies to my email were kind of unexpected as i was indeed intending to help; i will continue to provide details and try to answer serious questions and comments.
also, as you point out, there is no secret master plan to spread misinformation. the topic is simply complicated, and there are reasons, which are hard to explain, why there is no full official support outside the current shipping IBM products, and i will not dive into discussions around it.
a few words of clarification on the controversial point, compile ctdb yes/no. while it's technically correct that you don't have to recompile ctdb, and it has no code that links/depends in any form on GPFS (samba does), you still have to compile one if the one shipped with your distro is out of date, and while i didn't explicitly write it this way, this is what i intended to say.
the version RH ships is very old and has since gotten many fixes, which i highly recommend people use, unless the distros pick up a more recent version as RHEL7 will do. i probably wouldn't recommend a daily git build, but rather the packaged versions that are hosted on ftp.samba.org like: https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/
so the proposed order of things would be to install gpfs, pull the src package of ctdb, compile and install ctdb and the devel packages, then pull a recent samba src package, install all the dependencies and build samba on this same host with gpfs and ctdb packages already installed. the resulting rpms should contain the proper code to continue.
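purely as an illustration of that order (the version numbers are placeholders and the exact build dependencies vary by distro), on a host that already has the gpfs packages - and therefore /usr/lpp/mmfs/include - installed, it looks something like:

rpm -ivh ctdb-1.2.x-1.src.rpm samba-3.6.x-1.src.rpm
# install the BuildRequires from the spec files (yum-builddep from yum-utils, or by hand)
yum-builddep samba
rpmbuild -ba ~/rpmbuild/SPECS/ctdb.spec
rpmbuild -ba ~/rpmbuild/SPECS/samba.spec
# the resulting packages should now carry the gpfs vfs module
ls ~/rpmbuild/RPMS/x86_64/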
after you got your versions compiled, installed and the basics of ctdb running, you should use the following smb.conf as a starting point:

[global]
netbios name = cluster
fileid:mapping = fsname
gpfs:sharemodes = yes
gpfs:leases = yes
gpfs:dfreequota = yes
gpfs:hsm = yes
syncops:onmeta = no
kernel oplocks = no
level2 oplocks = yes
notify:inotify = no
vfs objects = shadow_copy2 syncops gpfs fileid
shadow:snapdir = .snapshots
shadow:fixinodes = yes
shadow:snapdirseverywhere = yes
shadow:sort = desc
wide links = no
async smb echo handler = yes
smbd:backgroundqueue = False
use sendfile = no
strict locking = yes
posix locking = yes
force unknown acl user = yes
nfs4:mode = simple
nfs4:chown = yes
nfs4:acedup = merge
nfs4:sidmap = /etc/samba/sidmap.tdb
gpfs:winattr = yes
store dos attributes = yes
map readonly = yes
map archive = yes
map system = yes
map hidden = yes
ea support = yes
dmapi support = no
unix extensions = no
socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 TCP_KEEPIDLE=240 TCP_KEEPINTVL=15
strict allocate = yes
gpfs:prealloc = yes

if you don't configure using the registry you need to maintain the smb.conf file on all your nodes, and i am not diving into how to set up the registry, but for the people who care, michael adam's presentation is a good starting point: http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf
also, depending on your authentication/idmap setup, there are multiple changes/additions that need to be made.
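for reference, one common way to bootstrap a registry based setup is a minimal stub smb.conf on every node plus 'net conf' to load the real configuration into the ctdb-replicated registry - the file names below are just placeholders:

# per-node stub; the real configuration lives in the registry
printf '[global]\n   clustering = yes\n   include = registry\n' > /etc/samba/smb.conf
# import the full config from a single master file, then check it
net conf import /etc/samba/smb.conf.master
net conf list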
In the spirit of this group, let's work together to technically look at such issues. Sven, if that is the case, perhaps you could crib the lines of code / show your methodology that supports your views / experience. Regards, Jez -- UG Chair On 16/12/13 11:21, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: > > [SNIP] > >> the only way to get something working (don't get confused with >> officially Supported) is to recompile the CTDB src packages AND the >> Samba src packages on a node that has GPFS already installed. also the >> inclusion of CTDB into Samba will not address this, its just a more >> convenient packaging. >> >> Only if the build happens on such a node things like the vfs modules >> for GPFS are build and included in the package. >> > That is a factually inaccurate statement. There is nothing in CTDB that > is GPFS specific. Trust me I have examined the code closely to determine > if this is the case. So unless this has changed recently you are flat > out wrong. > > Consequently there is no requirement whatsoever to rebuild CTDB to get > the vfs_gpfs module. In addition there is also no requirement to > actually have GPFS installed to build the vfs_gpfs module either. What > you need to have is the GPFS GPL header files and nothing else. As it is > a loadable VFS module linking takes place at load time not compile time. > > It is unworthy of an IBM employee to spread such inaccurate > misinformation. > > [SNIP] > >> said all this the binaries alone are only part of the Solution, after >> you have the correct packages, you need to properly configuration the >> system and setting all the right options (on GPFS as well as on CTDB >> and smbd.conf), which unfortunate are very System configuration >> specific, as otherwise you still can end up with data corruption if >> not set right. > Indeed. However I know not only what those options are, but also what > they do despite IBM's refusal to tell us anything about them. > > I would also point out that there are sites that where running Samba on > top of GPFS for many years before IBM began offering their > SONAS/Storwize Unifed products. > > > JAB. > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at gpfsug.org Mon Dec 16 16:05:26 2013 From: chair at gpfsug.org (Chair) Date: Mon, 16 Dec 2013 16:05:26 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AF24C6.1000105@gpfsug.org> Hi Sven, Many thanks for taking the time to write back in detail. This is exactly the sort of discussion the user group is aimed at - I.E. technical discussion outside of the GPFS Developer Forum. I heartily encourage other IBMers to get involved. Regards, Jez On 16/12/13 15:31, Sven Oehme wrote: > Jez, > > the other replies to my email where kind of unexpected as i was indeed > intending to help, i will continue to provide details and try to > answer serious questions and comments. 
> > also as you point out there is no secret master plan to spread > misinformation, the topic is simply complicated and there are reasons > that are hard to explain why there is no full official support outside > the current shipping IBM products and i will not dive into discussions > around it. > > a few words of clarification, on the controversial point, compile ctdb > yes/no. > while its technically correct, that you don't have to recompile ctdb > and it has no code that links/depends in any form on GPFS (samba > does), you still have to compile one if the one shipped with your > distro is out of date and while i didn't explicitly wrote it this way, > this is what i intended to say. > > the version RH ships is very old and since has gotten many fixes which > i highly recommend people to use unless the distros will pickup more > recent version like RHEL7 will do. i probably wouldn't recommend a > daily git build, but rather the packaged versions that are hosted on > ftp.samba.org like : > _https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/_ > > so the proposed order of things would be to install gpfs, pull the src > package of ctdb, compile and install ctdb and the devel packages, then > pull a recent samba src package , install all the dependencies and > build samba on this same host with gpfs and ctdb packages already > installed. the resulting rpm's should contain the proper code to > continue. > > after you got your versions compiled, installed and the basics of ctdb > running, you should use the following smb.conf as a starting point : > > [global] > netbios name = cluster > fileid:mapping = fsname > gpfs:sharemodes = yes > gpfs:leases = yes > gpfs:dfreequota = yes > gpfs:hsm = yes > syncops:onmeta = no > kernel oplocks = no > level2 oplocks = yes > notify:inotify = no > vfs objects = shadow_copy2 syncops gpfs fileid > shadow:snapdir = .snapshots > shadow:fixinodes = yes > shadow:snapdirseverywhere = yes > shadow:sort = desc > wide links = no > async smb echo handler = yes > smbd:backgroundqueue = False > use sendfile = no > strict locking = yes > posix locking = yes > force unknown acl user = yes > nfs4:mode = simple > nfs4:chown = yes > nfs4:acedup = merge > nfs4:sidmap = /etc/samba/sidmap.tdb > gpfs:winattr = yes > store dos attributes = yes > map readonly = yes > map archive = yes > map system = yes > map hidden = yes > ea support = yes > dmapi support = no > unix extensions = no > socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 > TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 > strict allocate = yes > gpfs:prealloc = yes > > > if you don't configure using the registry you need to maintain the > smb.conf file on all your nodes and i am not diving into how to setup > the registry, but for the people who care, michael adam's presentation > is a good starting point > _http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf_ > > > also depending on your authentication/idmap setup , there are multiple > changes/additions that needs to be made. > > on the gpfs side you should set the following config parameters : > > cifsBypassShareLocksOnRename=yes > syncSambaMetadataOps=yes > cifsBypassTraversalChecking=yes > allowSambaCaseInsensitiveLookup=no > > your filesystem should have the following settings configured : > > -D nfs4 File locking semantics in effect > -k nfs4 ACL semantics in effect > -o nfssync Additional mount options > --fastea Yes Fast external attributes enabled? 
> -S relatime Suppress atime mount option > > the -S is a performance optimization, but you need to check if your > backup/av or other software can deal with it, it essentially reduces > the frequency of atime updates to once every 24 hours which is the new > default on nfs mounts and others as well. > > a lot of the configuration parameters above are also above and beyond > locking and data integrity, they are for better windows compatibility > and should be checked if applicable for your environment. > > i would also recommend to run on GPFS 3.5 TL3 or newer to get the > proper GPFS level of fixes for this type of configurations. > > i would like to repeat that i don't write this email to encourage > people to all go start installing/ configuring samba on top of GPFS as > i pointed out that you are kind on your own unless you can read source > code and/or have somebody who does and is able to help as soon as you > run into a problem. > the main point of sharing this information is to clarify a lot of > misinformation out there and provide the people who already have > setups that are incorrect the information to at least not run into > data corruption issues do to wrong configuration. > > Sven > > > > > From: Chair > To: gpfsug-discuss at gpfsug.org > Date: 12/16/2013 03:31 AM > Subject: Re: [gpfsug-discuss] GPFS and both Samba and NFS > Sent by: gpfsug-discuss-bounces at gpfsug.org > ------------------------------------------------------------------------ > > > > Allo > > Just jumping in here a minute: > > > It is unworthy of an IBM employee to spread such inaccurate > misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues. > > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. > > > Regards, > > Jez > -- > UG Chair > > > > On 16/12/13 11:21, Jonathan Buzzard wrote: > > On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: > > > > [SNIP] > > > >> the only way to get something working (don't get confused with > >> officially Supported) is to recompile the CTDB src packages AND the > >> Samba src packages on a node that has GPFS already installed. also the > >> inclusion of CTDB into Samba will not address this, its just a more > >> convenient packaging. > >> > >> Only if the build happens on such a node things like the vfs modules > >> for GPFS are build and included in the package. > >> > > That is a factually inaccurate statement. There is nothing in CTDB that > > is GPFS specific. Trust me I have examined the code closely to determine > > if this is the case. So unless this has changed recently you are flat > > out wrong. > > > > Consequently there is no requirement whatsoever to rebuild CTDB to get > > the vfs_gpfs module. In addition there is also no requirement to > > actually have GPFS installed to build the vfs_gpfs module either. What > > you need to have is the GPFS GPL header files and nothing else. As it is > > a loadable VFS module linking takes place at load time not compile time. > > > > It is unworthy of an IBM employee to spread such inaccurate > > misinformation. 
> > > > [SNIP] > > > >> said all this the binaries alone are only part of the Solution, after > >> you have the correct packages, you need to properly configuration the > >> system and setting all the right options (on GPFS as well as on CTDB > >> and smbd.conf), which unfortunate are very System configuration > >> specific, as otherwise you still can end up with data corruption if > >> not set right. > > Indeed. However I know not only what those options are, but also what > > they do despite IBM's refusal to tell us anything about them. > > > > I would also point out that there are sites that where running Samba on > > top of GPFS for many years before IBM began offering their > > SONAS/Storwize Unifed products. > > > > > > JAB. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Mon Dec 16 17:31:29 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 17:31:29 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <1387215089.7230.80.camel@buzzard.phy.strath.ac.uk> On Mon, 2013-12-16 at 07:31 -0800, Sven Oehme wrote: [SNIP] > a few words of clarification, on the controversial point, compile ctdb > yes/no. > while its technically correct, that you don't have to recompile ctdb > and it has no code that links/depends in any form on GPFS (samba > does), you still have to compile one if the one shipped with your > distro is out of date and while i didn't explicitly wrote it this way, > this is what i intended to say. > Fair enough, but what you actually said is not remotely close to that. > the version RH ships is very old and since has gotten many fixes which > i highly recommend people to use unless the distros will pickup more > recent version like RHEL7 will do. i probably wouldn't recommend a > daily git build, but rather the packaged versions that are hosted on > ftp.samba.org like : > https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/ > > so the proposed order of things would be to install gpfs, pull the src > package of ctdb, compile and install ctdb and the devel packages, As I pointed out that is a pointless exercise. Just install the packages. > then pull a recent samba src package , install all the dependencies > and build samba on this same host with gpfs and ctdb packages already > installed. the resulting rpm's should contain the proper code to > continue. I would in the strongest possible terms recommend using a VM for compiling packages. Anyone compiling and maintaining packages on the production or even for that matter a test GPFS cluster deserves sacking. 
> after you got your versions compiled, installed and the basics of ctdb > running, you should use the following smb.conf as a starting point : > > [global] > netbios name = cluster > fileid:mapping = fsname > gpfs:sharemodes = yes > gpfs:leases = yes > gpfs:dfreequota = yes > gpfs:hsm = yes > syncops:onmeta = no > kernel oplocks = no > level2 oplocks = yes > notify:inotify = no > vfs objects = shadow_copy2 syncops gpfs fileid > shadow:snapdir = .snapshots > shadow:fixinodes = yes > shadow:snapdirseverywhere = yes > shadow:sort = desc > wide links = no > async smb echo handler = yes > smbd:backgroundqueue = False > use sendfile = no > strict locking = yes > posix locking = yes > force unknown acl user = yes > nfs4:mode = simple > nfs4:chown = yes > nfs4:acedup = merge > nfs4:sidmap = /etc/samba/sidmap.tdb > gpfs:winattr = yes > store dos attributes = yes > map readonly = yes > map archive = yes > map system = yes > map hidden = yes > ea support = yes > dmapi support = no > unix extensions = no > socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 > TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 > strict allocate = yes > gpfs:prealloc = yes > Nothing new there for those in the know. That said some of those options might not be wanted. > > if you don't configure using the registry you need to maintain the > smb.conf file on all your nodes and i am not diving into how to setup > the registry, but for the people who care, michael adam's presentation > is a good starting point > http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf > > also depending on your authentication/idmap setup , there are multiple > changes/additions that needs to be made. > > on the gpfs side you should set the following config parameters : > > cifsBypassShareLocksOnRename=yes > syncSambaMetadataOps=yes > cifsBypassTraversalChecking=yes > allowSambaCaseInsensitiveLookup=no > The cifsBypassTraversalChecking=yes is very very much a site based choice. You need to understand what it does because it could create a security nightmare if you just randomly enable it. On the other hand it could be a life saver if you are moving from an existing Windows server. Basically it enables Windows security schematics rather than Posix based ones. That is you will no longer need access to all the parent folders to a file in order to be able to access the file. So long as you have permissions on the file you are good to go. > your filesystem should have the following settings configured : > > -D nfs4 File locking semantics in > effect > -k nfs4 ACL semantics in effect Really I would have expected -k samba :-) > -o nfssync Additional mount options > --fastea Yes Fast external attributes > enabled? > -S relatime Suppress atime mount > option > > the -S is a performance optimization, but you need to check if your > backup/av or other software can deal with it, it essentially reduces > the frequency of atime updates to once every 24 hours which is the new > default on nfs mounts and others as well. How does -S option impact on doing policy based tiering? I would approach that with caution if you are doing any tiering or HSM. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. 
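(As an aside, a hedged sketch of how the GPFS-side settings discussed above
would typically be applied -- gpfs0 is a placeholder device name, the
Samba-related parameters are undocumented and simply taken on trust from the
listing above, and every value needs thinking about for your own site rather
than copying blindly:

  mmchconfig cifsBypassShareLocksOnRename=yes,syncSambaMetadataOps=yes,allowSambaCaseInsensitiveLookup=no
  mmchconfig cifsBypassTraversalChecking=yes   # site choice, see the caveat above
  mmchfs gpfs0 -D nfs4 -k nfs4                 # NFSv4 file locking and ACL semantics
  mmlsfs gpfs0 -D -k -S                        # check what is actually in effect

By default mmchconfig changes only take effect when the GPFS daemon restarts,
so try it on the test cluster first.)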
From jonathan at buzzard.me.uk Mon Dec 16 20:46:14 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 20:46:14 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AF6696.6090109@buzzard.me.uk> On 16/12/13 15:31, Sven Oehme wrote: [SNIP] > > i would like to repeat that i don't write this email to encourage people > to all go start installing/ configuring samba on top of GPFS as i > pointed out that you are kind on your own unless you can read source > code and/or have somebody who does and is able to help as soon as you > run into a problem. > the main point of sharing this information is to clarify a lot of > misinformation out there and provide the people who already have setups > that are incorrect the information to at least not run into data > corruption issues do to wrong configuration. > I meant to say that is *VERY* sound advice. Roughly speaking if you had not already worked out just about everything in that email and more besides you should not be rolling your own GPFS/Samba/CTDB solution because you don't know what you are doing and it will likely end in tears at some point. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From amsterdamos at gmail.com Tue Dec 17 03:49:44 2013 From: amsterdamos at gmail.com (Adam Wead) Date: Mon, 16 Dec 2013 22:49:44 -0500 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AF6696.6090109@buzzard.me.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <52AF6696.6090109@buzzard.me.uk> Message-ID: Hi all, I've been following this discussion because I use GPFS with both NFS and Samba. Although, now I'm a bit concerned because it sounds like this may not be an appropriate thing to do. It seems like the issue is the clustered implementations of Samba and NFS, yes? Or is this for any implementation? I use GPFS 3,4, under RHEL5, but only on one machine. We do not have a cluster of GPFS systems, so it's the plain version of NFS and Samba that comes with RedHat--no CTDB. My one-node GPFS system exports NFS shares to which other servers write data, as well as hosts a few Samba shares for desktop computers. So now I'm curious. Is this wrong to do? I haven't had any problems in the three years I've had it running. To clarify, while NFS and Samba are both writing data to the same GPFS filesystem, they're not writing to the same files or directories. NFS is for server-to-server data, and Samba is for desktop clients, which is relegated to different directories of the same file system. ____________________________________________ Adam Wead Systems and Digital Collections Librarian Rock and Roll Hall of Fame and Museum 216.515.1960 (t) 215.515.1964 (f) On Mon, Dec 16, 2013 at 3:46 PM, Jonathan Buzzard wrote: > On 16/12/13 15:31, Sven Oehme wrote: > > [SNIP] > > > >> i would like to repeat that i don't write this email to encourage people >> to all go start installing/ configuring samba on top of GPFS as i >> pointed out that you are kind on your own unless you can read source >> code and/or have somebody who does and is able to help as soon as you >> run into a problem. 
>> the main point of sharing this information is to clarify a lot of
>> misinformation out there and provide the people who already have setups
>> that are incorrect the information to at least not run into data
>> corruption issues do to wrong configuration.
>>
>
> I meant to say that is *VERY* sound advice. Roughly speaking if you had
> not already worked out just about everything in that email and more besides
> you should not be rolling your own GPFS/Samba/CTDB solution because you
> don't know what you are doing and it will likely end in tears at some point.
>
>
> JAB.
>
> --
> Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
> Fife, United Kingdom.
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jonathan at buzzard.me.uk  Tue Dec 17 21:21:51 2013
From: jonathan at buzzard.me.uk (Jonathan Buzzard)
Date: Tue, 17 Dec 2013 21:21:51 +0000
Subject: [gpfsug-discuss] GPFS and both Samba and NFS
In-Reply-To: 
References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <52AF6696.6090109@buzzard.me.uk>
Message-ID: <52B0C06F.1040007@buzzard.me.uk>

On 17/12/13 03:49, Adam Wead wrote:
> Hi all,
>
> I've been following this discussion because I use GPFS with both NFS and
> Samba. Although, now I'm a bit concerned because it sounds like this
> may not be an appropriate thing to do. It seems like the issue is the
> clustered implementations of Samba and NFS, yes? Or is this for any
> implementation?
>

The issue is that running a GPFS cluster with Samba/NFS etc. on top is not
in the same league as installing Linux on a single server with some storage
attached and configuring Samba/NFS. The system is much more complex and
there are nasty ways in which your data can get corrupted. To proceed, for
example, without a functional clone test system would be foolhardy in the
extreme.

For example in my last job I had a functional clone of the live GPFS
systems. By functional clone I mean real hardware with similar FC cards,
attached to the same type of storage (in this case LSI/Netapp Engenio
based), with NSDs of the same type (though fewer), running the same OS
image, same multipathing drivers, same GPFS version, same Samba/CTDB
versions.

In addition I then had a standalone Samba server, same OS, same storage,
etc., all the same except no GPFS and no CTDB. It was identical apart from
the vfs_gpfs module. One of the reasons I chose to compile my own vfs_gpfs
and insert it into pucka RHEL is that I wanted to be able to test issues
against a known target that I could get support on.

Then, for good measure, a real Windows test server, because there is
nothing like being able to rule out problems as being down to the client
not working properly with SMB. Finally, virtual machines for building my
vfs_gpfs modules.

If you think I am being over the top with my test platform, let me assure
you that *ALL* of it was absolutely essential for the diagnosis of problems
with the system and the generation of fixes at one time or another. The
thing is, if you don't get this without being told, then running a
GPFS/Samba/CTDB service is really not for you.

Also you need to understand what I call "GPFS and the Dark Arts" aka "Magic
levers for Samba", what they do and why you might want them.
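(Coming back to the test platform for a moment, a rough sketch of the sort of
smoke test worth running on the clone rig before a rebuilt module goes anywhere
near production -- the module path, log path and share name here are
assumptions for an RHEL-style Samba layout:

  ls -l /usr/lib64/samba/vfs/gpfs.so          # the rebuilt module is actually in place
  testparm -s | grep -A3 '\[testshare\]'      # the share really carries "vfs objects = ... gpfs ..."
  smbclient //localhost/testshare -U someuser -c 'put /etc/hosts hosts; ls'
  grep -i gpfs /var/log/samba/log.smbd        # module load failures turn up here

plus the same round trip from a real Windows client with some ACLs involved.)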
There are probably only a handful of people outside IBM who understand those
levers, which is why you get warnings from people inside IBM about doing it
yourself.

So by all means do it, but make sure you have the test systems in place and
a thorough understanding of all the technologies involved, as you are going
to have to do support for yourself; you cannot ring IBM or RedHat and say my
GPFS/Samba/CTDB storage cluster is not working, as you are well off the
beaten path. Sure you can get support from IBM for GPFS issues provided it
is entirely GPFS related, but if saving from Office 2010 on a shared drive
with rich permissions is giving whacked-out file ownership and permission
issues, that is going to be down to you to fix.

JAB.

--
Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.

From jfosburg at mdanderson.org  Thu Dec 19 17:00:17 2013
From: jfosburg at mdanderson.org (Fosburgh,Jonathan)
Date: Thu, 19 Dec 2013 17:00:17 +0000
Subject: [gpfsug-discuss] Strange performance issue on interface nodes
Message-ID: 

This morning we started getting complaints that NFS mounts from our GPFS
filesystem were hanging. After investigating I found that our interface
nodes (we have 2) had load averages over 800, but there was essentially no
CPU usage (>99% idle). I rebooted the nodes, which restored service, but in
less than an hour the load averages have already climbed higher than 70.

I have no waiters:

[root at d1prpstg2nsd2 ~]# mmlsnode -N waiters -L 2>/dev/null
[root at d1prpstg2nsd2 ~]#

I also have a lot of nfsd and smbd processes in a 'D' state. One of my
interface servers also shows the following processes:

root 29901 0.0 0.0 105172 620 ? D< 10:03 0:00 touch /rsrch1/cnfsSharedRoot/.ha/recovery/10.113.115.56/10.113.115.57.tmp
root 30076 0.0 0.0 115688 860 ? D 10:03 0:00 ls -A -I *.tmp /rsrch1/cnfsSharedRoot/.ha/recovery/10.113.115.56

Those processes have been running almost an hour. The cluster is running
GPFS 3.5.12, there are 6 NSD servers and 2 interface servers.

Does anyone have thoughts as to what is going on?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
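(Illustrative only -- a sketch of the sort of first checks usually suggested
for a hang like this, with the cnfsSharedRoot path taken from the ps output
above and everything else generic:

  mmgetstate -a                        # is the daemon healthy everywhere?
  mmdiag --waiters                     # run on each interface node, not just the NSD servers
  mmlsconfig | grep -i cnfs            # confirm cnfsSharedRoot and the other CNFS settings
  time ls /rsrch1/cnfsSharedRoot/.ha   # does the CNFS shared root respond at all?
  echo w > /proc/sysrq-trigger         # needs sysrq enabled; dumps the blocked (D state) task stacks to dmesg

If the D-state processes all turn out to be stuck on the shared root, the
problem usually lies with the filesystem or its NSD servers rather than with
NFS or Samba themselves.)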
> > Of course even with CTDB in Samba you are still going to need to do some > level of rebuilding because you won't get the vfs_gpfs module otherwise. > > > Sernet include GPFS in their builds - they truly are wonderful :) -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From orlando.richards at ed.ac.uk Mon Dec 16 12:31:47 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 16 Dec 2013 12:31:47 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AEF2B3.5030805@ed.ac.uk> On 16/12/13 11:30, Chair wrote: > Allo > > Just jumping in here a minute: > >> It is unworthy of an IBM employee to spread such inaccurate >> misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues. > > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. > Presumably this all comes down to the locking setup? Relevant to this, I've got: (from samba settings:) vfs objects = shadow_copy2, fileid, gpfs, syncops clustering = yes gpfs:sharemodes = yes syncops:onmeta = no blocking locks = Yes fake oplocks = No kernel oplocks = Yes locking = Yes oplocks = Yes level2 oplocks = Yes oplock contention limit = 2 posix locking = Yes strict locking = Auto (gpfs settings:) syncSambaMetadataOps yes > > Regards, > > Jez > -- > UG Chair > > > > On 16/12/13 11:21, Jonathan Buzzard wrote: >> On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: >> >> [SNIP] >> >>> the only way to get something working (don't get confused with >>> officially Supported) is to recompile the CTDB src packages AND the >>> Samba src packages on a node that has GPFS already installed. also the >>> inclusion of CTDB into Samba will not address this, its just a more >>> convenient packaging. >>> >>> Only if the build happens on such a node things like the vfs modules >>> for GPFS are build and included in the package. >>> >> That is a factually inaccurate statement. There is nothing in CTDB that >> is GPFS specific. Trust me I have examined the code closely to determine >> if this is the case. So unless this has changed recently you are flat >> out wrong. >> >> Consequently there is no requirement whatsoever to rebuild CTDB to get >> the vfs_gpfs module. In addition there is also no requirement to >> actually have GPFS installed to build the vfs_gpfs module either. What >> you need to have is the GPFS GPL header files and nothing else. As it is >> a loadable VFS module linking takes place at load time not compile time. >> >> It is unworthy of an IBM employee to spread such inaccurate >> misinformation. 
>> >> [SNIP] >> >>> said all this the binaries alone are only part of the Solution, after >>> you have the correct packages, you need to properly configuration the >>> system and setting all the right options (on GPFS as well as on CTDB >>> and smbd.conf), which unfortunate are very System configuration >>> specific, as otherwise you still can end up with data corruption if >>> not set right. >> Indeed. However I know not only what those options are, but also what >> they do despite IBM's refusal to tell us anything about them. >> >> I would also point out that there are sites that where running Samba on >> top of GPFS for many years before IBM began offering their >> SONAS/Storwize Unifed products. >> >> >> JAB. >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

From jonathan at buzzard.me.uk Mon Dec 16 12:49:06 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 12:49:06 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk>

On Mon, 2013-12-16 at 11:30 +0000, Chair wrote: > Allo > > Just jumping in here a minute: > > > It is unworthy of an IBM employee to spread such inaccurate misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues.

You made me begin to doubt myself so I have double checked. Specifically the ctdb-2.1-3.el7.src.rpm from the RHEL7 beta, and it is as I claimed. There is no compilable code that is linked to or dependent on GPFS. Some of the support scripts (specifically the 60.ganesha and 62.cnfs) are GPFS aware, but no recompilation is needed as they are copied through to the binary RPM as is.

You either know for a definite fact that CTDB needs recompiling on a machine with GPFS installed or you are simply spreading falsehoods. Why make such a long and detailed post if you don't know it to be true in the first place?

> > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. >

There is none, because he is flat out 100% wrong. I did extensive research on this when deploying a GPFS/Samba/CTDB solution. I needed a recompile to get the vfs_gpfs module, and I naturally was not going to use the RHEL supplied CTDB rpms without thoroughly checking out whether they needed recompiling. Note this all pre-dated Sernet offering pre-compiled binaries with the vfs_gpfs module included.

Personally I prefer running with standard RHEL binaries with my own vfs_gpfs. It makes it easy to have a test setup that is plain Samba so one can narrow things down to being CTDB related very easily. In addition you don't know which version of the GPFS headers Sernet have compiled against.
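For anyone who would rather check the CTDB point for themselves than take anybody's word for it, unpacking the source package and grepping it is enough - something along these lines:

    mkdir /tmp/ctdb-src && cd /tmp/ctdb-src
    rpm2cpio /path/to/ctdb-2.1-3.el7.src.rpm | cpio -idm
    tar xf ctdb-*.tar.*
    # the only GPFS references are in the event scripts (60.ganesha, 62.cnfs),
    # not in anything that gets compiled
    grep -rli gpfs ctdb-*/

Nothing to link against, so nothing that needs rebuilding.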
It also helps to have pucka Windows SMB servers as well so you can rule out client side issues. The sort of why is Powerpoint 2011 on a Mac taking minutes to save a file, where saving it under a different name takes seconds. When it exhibits the same behaviour on a Windows server you know there is nothing wrong with your Samba servers. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From orlando.richards at ed.ac.uk Mon Dec 16 13:34:59 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 16 Dec 2013 13:34:59 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AF0183.9030809@ed.ac.uk> On 16/12/13 12:49, Jonathan Buzzard wrote: > In addition you don't know which version of the GPFS > headers Sernet have compiled against. Good point - in fact, it's an older version that I have on my GPFS 3.5 setup. I've updated my build process to fix that - thanks Jab :) -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From oehmes at us.ibm.com Mon Dec 16 15:31:53 2013 From: oehmes at us.ibm.com (Sven Oehme) Date: Mon, 16 Dec 2013 07:31:53 -0800 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: Jez, the other replies to my email where kind of unexpected as i was indeed intending to help, i will continue to provide details and try to answer serious questions and comments. also as you point out there is no secret master plan to spread misinformation, the topic is simply complicated and there are reasons that are hard to explain why there is no full official support outside the current shipping IBM products and i will not dive into discussions around it. a few words of clarification, on the controversial point, compile ctdb yes/no. while its technically correct, that you don't have to recompile ctdb and it has no code that links/depends in any form on GPFS (samba does), you still have to compile one if the one shipped with your distro is out of date and while i didn't explicitly wrote it this way, this is what i intended to say. the version RH ships is very old and since has gotten many fixes which i highly recommend people to use unless the distros will pickup more recent version like RHEL7 will do. i probably wouldn't recommend a daily git build, but rather the packaged versions that are hosted on ftp.samba.org like : https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/ so the proposed order of things would be to install gpfs, pull the src package of ctdb, compile and install ctdb and the devel packages, then pull a recent samba src package , install all the dependencies and build samba on this same host with gpfs and ctdb packages already installed. the resulting rpm's should contain the proper code to continue. 
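In command form that order looks roughly like the following - a sketch rather than a supported recipe, with the version strings as placeholders:

    # 1. GPFS (including the GPL layer, i.e. the headers) already on the build host
    rpm -qa | grep gpfs

    # 2. rebuild and install a current ctdb from the samba.org source rpm
    rpmbuild --rebuild ctdb-<version>.src.rpm
    yum localinstall ~/rpmbuild/RPMS/x86_64/ctdb-*.rpm

    # 3. rebuild samba on the same host, so the GPFS vfs module gets built
    rpmbuild --rebuild samba-<version>.src.rpm
    yum localinstall ~/rpmbuild/RPMS/x86_64/samba*.rpm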
After you have your versions compiled and installed and the basics of ctdb running, you should use the following smb.conf as a starting point:

[global]
        netbios name = cluster
        fileid:mapping = fsname
        gpfs:sharemodes = yes
        gpfs:leases = yes
        gpfs:dfreequota = yes
        gpfs:hsm = yes
        syncops:onmeta = no
        kernel oplocks = no
        level2 oplocks = yes
        notify:inotify = no
        vfs objects = shadow_copy2 syncops gpfs fileid
        shadow:snapdir = .snapshots
        shadow:fixinodes = yes
        shadow:snapdirseverywhere = yes
        shadow:sort = desc
        wide links = no
        async smb echo handler = yes
        smbd:backgroundqueue = False
        use sendfile = no
        strict locking = yes
        posix locking = yes
        force unknown acl user = yes
        nfs4:mode = simple
        nfs4:chown = yes
        nfs4:acedup = merge
        nfs4:sidmap = /etc/samba/sidmap.tdb
        gpfs:winattr = yes
        store dos attributes = yes
        map readonly = yes
        map archive = yes
        map system = yes
        map hidden = yes
        ea support = yes
        dmapi support = no
        unix extensions = no
        socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 TCP_KEEPIDLE=240 TCP_KEEPINTVL=15
        strict allocate = yes
        gpfs:prealloc = yes

If you don't configure using the registry you need to maintain the smb.conf file on all your nodes. I am not diving into how to set up the registry, but for the people who care, Michael Adam's presentation is a good starting point: http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf

Also, depending on your authentication/idmap setup, there are multiple changes/additions that need to be made.

On the GPFS side you should set the following config parameters:

        cifsBypassShareLocksOnRename=yes
        syncSambaMetadataOps=yes
        cifsBypassTraversalChecking=yes
        allowSambaCaseInsensitiveLookup=no

Your filesystem should have the following settings configured:

        -D nfs4          File locking semantics in effect
        -k nfs4          ACL semantics in effect
        -o nfssync       Additional mount options
        --fastea Yes     Fast external attributes enabled?
        -S relatime      Suppress atime mount option

The -S setting is a performance optimization, but you need to check that your backup/AV or other software can deal with it; it essentially reduces the frequency of atime updates to once every 24 hours, which is the new default on NFS mounts and elsewhere as well.

A lot of the configuration parameters above go beyond locking and data integrity; they are there for better Windows compatibility and should be checked for applicability to your environment.

I would also recommend running GPFS 3.5 TL3 or newer to get the proper level of GPFS fixes for this type of configuration.

I would like to repeat that I am not writing this email to encourage people to all go and start installing/configuring Samba on top of GPFS; as I pointed out, you are kind of on your own unless you can read source code and/or have somebody who does and is able to help as soon as you run into a problem. The main point of sharing this information is to clarify a lot of misinformation out there and give the people who already have incorrect setups the information to at least not run into data corruption issues due to wrong configuration.

Sven

From: Chair To: gpfsug-discuss at gpfsug.org Date: 12/16/2013 03:31 AM Subject: Re: [gpfsug-discuss] GPFS and both Samba and NFS Sent by: gpfsug-discuss-bounces at gpfsug.org

Allo Just jumping in here a minute: > It is unworthy of an IBM employee to spread such inaccurate misinformation. Whilst this may be inaccurate - I very, very, much doubt that IBM or their employees have a secret master plan to spread misinformation (!)
In the spirit of this group, let's work together to technically look at such issues. Sven, if that is the case, perhaps you could crib the lines of code / show your methodology that supports your views / experience. Regards, Jez -- UG Chair On 16/12/13 11:21, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: > > [SNIP] > >> the only way to get something working (don't get confused with >> officially Supported) is to recompile the CTDB src packages AND the >> Samba src packages on a node that has GPFS already installed. also the >> inclusion of CTDB into Samba will not address this, its just a more >> convenient packaging. >> >> Only if the build happens on such a node things like the vfs modules >> for GPFS are build and included in the package. >> > That is a factually inaccurate statement. There is nothing in CTDB that > is GPFS specific. Trust me I have examined the code closely to determine > if this is the case. So unless this has changed recently you are flat > out wrong. > > Consequently there is no requirement whatsoever to rebuild CTDB to get > the vfs_gpfs module. In addition there is also no requirement to > actually have GPFS installed to build the vfs_gpfs module either. What > you need to have is the GPFS GPL header files and nothing else. As it is > a loadable VFS module linking takes place at load time not compile time. > > It is unworthy of an IBM employee to spread such inaccurate > misinformation. > > [SNIP] > >> said all this the binaries alone are only part of the Solution, after >> you have the correct packages, you need to properly configuration the >> system and setting all the right options (on GPFS as well as on CTDB >> and smbd.conf), which unfortunate are very System configuration >> specific, as otherwise you still can end up with data corruption if >> not set right. > Indeed. However I know not only what those options are, but also what > they do despite IBM's refusal to tell us anything about them. > > I would also point out that there are sites that where running Samba on > top of GPFS for many years before IBM began offering their > SONAS/Storwize Unifed products. > > > JAB. > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at gpfsug.org Mon Dec 16 16:05:26 2013 From: chair at gpfsug.org (Chair) Date: Mon, 16 Dec 2013 16:05:26 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AF24C6.1000105@gpfsug.org> Hi Sven, Many thanks for taking the time to write back in detail. This is exactly the sort of discussion the user group is aimed at - I.E. technical discussion outside of the GPFS Developer Forum. I heartily encourage other IBMers to get involved. Regards, Jez On 16/12/13 15:31, Sven Oehme wrote: > Jez, > > the other replies to my email where kind of unexpected as i was indeed > intending to help, i will continue to provide details and try to > answer serious questions and comments. 
> > also as you point out there is no secret master plan to spread > misinformation, the topic is simply complicated and there are reasons > that are hard to explain why there is no full official support outside > the current shipping IBM products and i will not dive into discussions > around it. > > a few words of clarification, on the controversial point, compile ctdb > yes/no. > while its technically correct, that you don't have to recompile ctdb > and it has no code that links/depends in any form on GPFS (samba > does), you still have to compile one if the one shipped with your > distro is out of date and while i didn't explicitly wrote it this way, > this is what i intended to say. > > the version RH ships is very old and since has gotten many fixes which > i highly recommend people to use unless the distros will pickup more > recent version like RHEL7 will do. i probably wouldn't recommend a > daily git build, but rather the packaged versions that are hosted on > ftp.samba.org like : > _https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/_ > > so the proposed order of things would be to install gpfs, pull the src > package of ctdb, compile and install ctdb and the devel packages, then > pull a recent samba src package , install all the dependencies and > build samba on this same host with gpfs and ctdb packages already > installed. the resulting rpm's should contain the proper code to > continue. > > after you got your versions compiled, installed and the basics of ctdb > running, you should use the following smb.conf as a starting point : > > [global] > netbios name = cluster > fileid:mapping = fsname > gpfs:sharemodes = yes > gpfs:leases = yes > gpfs:dfreequota = yes > gpfs:hsm = yes > syncops:onmeta = no > kernel oplocks = no > level2 oplocks = yes > notify:inotify = no > vfs objects = shadow_copy2 syncops gpfs fileid > shadow:snapdir = .snapshots > shadow:fixinodes = yes > shadow:snapdirseverywhere = yes > shadow:sort = desc > wide links = no > async smb echo handler = yes > smbd:backgroundqueue = False > use sendfile = no > strict locking = yes > posix locking = yes > force unknown acl user = yes > nfs4:mode = simple > nfs4:chown = yes > nfs4:acedup = merge > nfs4:sidmap = /etc/samba/sidmap.tdb > gpfs:winattr = yes > store dos attributes = yes > map readonly = yes > map archive = yes > map system = yes > map hidden = yes > ea support = yes > dmapi support = no > unix extensions = no > socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 > TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 > strict allocate = yes > gpfs:prealloc = yes > > > if you don't configure using the registry you need to maintain the > smb.conf file on all your nodes and i am not diving into how to setup > the registry, but for the people who care, michael adam's presentation > is a good starting point > _http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf_ > > > also depending on your authentication/idmap setup , there are multiple > changes/additions that needs to be made. > > on the gpfs side you should set the following config parameters : > > cifsBypassShareLocksOnRename=yes > syncSambaMetadataOps=yes > cifsBypassTraversalChecking=yes > allowSambaCaseInsensitiveLookup=no > > your filesystem should have the following settings configured : > > -D nfs4 File locking semantics in effect > -k nfs4 ACL semantics in effect > -o nfssync Additional mount options > --fastea Yes Fast external attributes enabled? 
> -S relatime Suppress atime mount option > > the -S is a performance optimization, but you need to check if your > backup/av or other software can deal with it, it essentially reduces > the frequency of atime updates to once every 24 hours which is the new > default on nfs mounts and others as well. > > a lot of the configuration parameters above are also above and beyond > locking and data integrity, they are for better windows compatibility > and should be checked if applicable for your environment. > > i would also recommend to run on GPFS 3.5 TL3 or newer to get the > proper GPFS level of fixes for this type of configurations. > > i would like to repeat that i don't write this email to encourage > people to all go start installing/ configuring samba on top of GPFS as > i pointed out that you are kind on your own unless you can read source > code and/or have somebody who does and is able to help as soon as you > run into a problem. > the main point of sharing this information is to clarify a lot of > misinformation out there and provide the people who already have > setups that are incorrect the information to at least not run into > data corruption issues do to wrong configuration. > > Sven > > > > > From: Chair > To: gpfsug-discuss at gpfsug.org > Date: 12/16/2013 03:31 AM > Subject: Re: [gpfsug-discuss] GPFS and both Samba and NFS > Sent by: gpfsug-discuss-bounces at gpfsug.org > ------------------------------------------------------------------------ > > > > Allo > > Just jumping in here a minute: > > > It is unworthy of an IBM employee to spread such inaccurate > misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues. > > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. > > > Regards, > > Jez > -- > UG Chair > > > > On 16/12/13 11:21, Jonathan Buzzard wrote: > > On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: > > > > [SNIP] > > > >> the only way to get something working (don't get confused with > >> officially Supported) is to recompile the CTDB src packages AND the > >> Samba src packages on a node that has GPFS already installed. also the > >> inclusion of CTDB into Samba will not address this, its just a more > >> convenient packaging. > >> > >> Only if the build happens on such a node things like the vfs modules > >> for GPFS are build and included in the package. > >> > > That is a factually inaccurate statement. There is nothing in CTDB that > > is GPFS specific. Trust me I have examined the code closely to determine > > if this is the case. So unless this has changed recently you are flat > > out wrong. > > > > Consequently there is no requirement whatsoever to rebuild CTDB to get > > the vfs_gpfs module. In addition there is also no requirement to > > actually have GPFS installed to build the vfs_gpfs module either. What > > you need to have is the GPFS GPL header files and nothing else. As it is > > a loadable VFS module linking takes place at load time not compile time. > > > > It is unworthy of an IBM employee to spread such inaccurate > > misinformation. 
> > > > [SNIP] > > > >> said all this the binaries alone are only part of the Solution, after > >> you have the correct packages, you need to properly configuration the > >> system and setting all the right options (on GPFS as well as on CTDB > >> and smbd.conf), which unfortunate are very System configuration > >> specific, as otherwise you still can end up with data corruption if > >> not set right. > > Indeed. However I know not only what those options are, but also what > > they do despite IBM's refusal to tell us anything about them. > > > > I would also point out that there are sites that where running Samba on > > top of GPFS for many years before IBM began offering their > > SONAS/Storwize Unifed products. > > > > > > JAB. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Mon Dec 16 17:31:29 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 17:31:29 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <1387215089.7230.80.camel@buzzard.phy.strath.ac.uk> On Mon, 2013-12-16 at 07:31 -0800, Sven Oehme wrote: [SNIP] > a few words of clarification, on the controversial point, compile ctdb > yes/no. > while its technically correct, that you don't have to recompile ctdb > and it has no code that links/depends in any form on GPFS (samba > does), you still have to compile one if the one shipped with your > distro is out of date and while i didn't explicitly wrote it this way, > this is what i intended to say. > Fair enough, but what you actually said is not remotely close to that. > the version RH ships is very old and since has gotten many fixes which > i highly recommend people to use unless the distros will pickup more > recent version like RHEL7 will do. i probably wouldn't recommend a > daily git build, but rather the packaged versions that are hosted on > ftp.samba.org like : > https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/ > > so the proposed order of things would be to install gpfs, pull the src > package of ctdb, compile and install ctdb and the devel packages, As I pointed out that is a pointless exercise. Just install the packages. > then pull a recent samba src package , install all the dependencies > and build samba on this same host with gpfs and ctdb packages already > installed. the resulting rpm's should contain the proper code to > continue. I would in the strongest possible terms recommend using a VM for compiling packages. Anyone compiling and maintaining packages on the production or even for that matter a test GPFS cluster deserves sacking. 
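A throwaway build environment does not need to be anything elaborate either - a scratch VM is fine, or something like mock, assuming you have put the GPFS packages carrying the headers somewhere the buildroot can see them (e.g. a local repo):

    yum install mock
    mock -r epel-6-x86_64 --init
    mock -r epel-6-x86_64 --install gpfs.base gpfs.gpl
    mock -r epel-6-x86_64 --rebuild samba-<version>.src.rpm

Either way the point stands: keep package building well away from anything running GPFS in production.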
> after you got your versions compiled, installed and the basics of ctdb > running, you should use the following smb.conf as a starting point : > > [global] > netbios name = cluster > fileid:mapping = fsname > gpfs:sharemodes = yes > gpfs:leases = yes > gpfs:dfreequota = yes > gpfs:hsm = yes > syncops:onmeta = no > kernel oplocks = no > level2 oplocks = yes > notify:inotify = no > vfs objects = shadow_copy2 syncops gpfs fileid > shadow:snapdir = .snapshots > shadow:fixinodes = yes > shadow:snapdirseverywhere = yes > shadow:sort = desc > wide links = no > async smb echo handler = yes > smbd:backgroundqueue = False > use sendfile = no > strict locking = yes > posix locking = yes > force unknown acl user = yes > nfs4:mode = simple > nfs4:chown = yes > nfs4:acedup = merge > nfs4:sidmap = /etc/samba/sidmap.tdb > gpfs:winattr = yes > store dos attributes = yes > map readonly = yes > map archive = yes > map system = yes > map hidden = yes > ea support = yes > dmapi support = no > unix extensions = no > socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 > TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 > strict allocate = yes > gpfs:prealloc = yes > Nothing new there for those in the know. That said some of those options might not be wanted. > > if you don't configure using the registry you need to maintain the > smb.conf file on all your nodes and i am not diving into how to setup > the registry, but for the people who care, michael adam's presentation > is a good starting point > http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf > > also depending on your authentication/idmap setup , there are multiple > changes/additions that needs to be made. > > on the gpfs side you should set the following config parameters : > > cifsBypassShareLocksOnRename=yes > syncSambaMetadataOps=yes > cifsBypassTraversalChecking=yes > allowSambaCaseInsensitiveLookup=no > The cifsBypassTraversalChecking=yes is very very much a site based choice. You need to understand what it does because it could create a security nightmare if you just randomly enable it. On the other hand it could be a life saver if you are moving from an existing Windows server. Basically it enables Windows security schematics rather than Posix based ones. That is you will no longer need access to all the parent folders to a file in order to be able to access the file. So long as you have permissions on the file you are good to go. > your filesystem should have the following settings configured : > > -D nfs4 File locking semantics in > effect > -k nfs4 ACL semantics in effect Really I would have expected -k samba :-) > -o nfssync Additional mount options > --fastea Yes Fast external attributes > enabled? > -S relatime Suppress atime mount > option > > the -S is a performance optimization, but you need to check if your > backup/av or other software can deal with it, it essentially reduces > the frequency of atime updates to once every 24 hours which is the new > default on nfs mounts and others as well. How does -S option impact on doing policy based tiering? I would approach that with caution if you are doing any tiering or HSM. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. 
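For completeness, the GPFS side of the settings being discussed boils down to a handful of commands - a sketch only, assuming the undocumented options are accepted by mmchconfig in the same way as the documented ones, and with "gpfs0" standing in for your filesystem device:

    # the "magic levers" - undocumented, so try them on a test cluster first
    mmchconfig cifsBypassShareLocksOnRename=yes
    mmchconfig syncSambaMetadataOps=yes
    mmchconfig cifsBypassTraversalChecking=yes
    mmchconfig allowSambaCaseInsensitiveLookup=no

    # check the filesystem-level flags
    mmlsfs gpfs0 -D -k -S

    # anything set explicitly with mmchconfig should be listed here
    mmlsconfig | grep -i -e samba -e cifs

Whether you actually want each of them set is, as above, a per-site decision.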
From jonathan at buzzard.me.uk Mon Dec 16 20:46:14 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 20:46:14 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AF6696.6090109@buzzard.me.uk> On 16/12/13 15:31, Sven Oehme wrote: [SNIP] > > i would like to repeat that i don't write this email to encourage people > to all go start installing/ configuring samba on top of GPFS as i > pointed out that you are kind on your own unless you can read source > code and/or have somebody who does and is able to help as soon as you > run into a problem. > the main point of sharing this information is to clarify a lot of > misinformation out there and provide the people who already have setups > that are incorrect the information to at least not run into data > corruption issues do to wrong configuration. > I meant to say that is *VERY* sound advice. Roughly speaking if you had not already worked out just about everything in that email and more besides you should not be rolling your own GPFS/Samba/CTDB solution because you don't know what you are doing and it will likely end in tears at some point. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From amsterdamos at gmail.com Tue Dec 17 03:49:44 2013 From: amsterdamos at gmail.com (Adam Wead) Date: Mon, 16 Dec 2013 22:49:44 -0500 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AF6696.6090109@buzzard.me.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <52AF6696.6090109@buzzard.me.uk> Message-ID: Hi all, I've been following this discussion because I use GPFS with both NFS and Samba. Although, now I'm a bit concerned because it sounds like this may not be an appropriate thing to do. It seems like the issue is the clustered implementations of Samba and NFS, yes? Or is this for any implementation? I use GPFS 3,4, under RHEL5, but only on one machine. We do not have a cluster of GPFS systems, so it's the plain version of NFS and Samba that comes with RedHat--no CTDB. My one-node GPFS system exports NFS shares to which other servers write data, as well as hosts a few Samba shares for desktop computers. So now I'm curious. Is this wrong to do? I haven't had any problems in the three years I've had it running. To clarify, while NFS and Samba are both writing data to the same GPFS filesystem, they're not writing to the same files or directories. NFS is for server-to-server data, and Samba is for desktop clients, which is relegated to different directories of the same file system. ____________________________________________ Adam Wead Systems and Digital Collections Librarian Rock and Roll Hall of Fame and Museum 216.515.1960 (t) 215.515.1964 (f) On Mon, Dec 16, 2013 at 3:46 PM, Jonathan Buzzard wrote: > On 16/12/13 15:31, Sven Oehme wrote: > > [SNIP] > > > >> i would like to repeat that i don't write this email to encourage people >> to all go start installing/ configuring samba on top of GPFS as i >> pointed out that you are kind on your own unless you can read source >> code and/or have somebody who does and is able to help as soon as you >> run into a problem. 
>> the main point of sharing this information is to clarify a lot of >> misinformation out there and provide the people who already have setups >> that are incorrect the information to at least not run into data >> corruption issues do to wrong configuration. >> >> > I meant to say that is *VERY* sound advice. Roughly speaking if you had > not already worked out just about everything in that email and more besides > you should not be rolling your own GPFS/Samba/CTDB solution because you > don't know what you are doing and it will likely end in tears at some point. > > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Tue Dec 17 21:21:51 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 17 Dec 2013 21:21:51 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <52AF6696.6090109@buzzard.me.uk> Message-ID: <52B0C06F.1040007@buzzard.me.uk> On 17/12/13 03:49, Adam Wead wrote: > Hi all, > > I've been following this discussion because I use GPFS with both NFS and > Samba. Although, now I'm a bit concerned because it sounds like this > may not be an appropriate thing to do. It seems like the issue is the > clustered implementations of Samba and NFS, yes? Or is this for any > implementation? > The issue is that running a GPFS cluster with Samba/NFS etc. on top is not in the same league as installing Linux on a single server with some storage attached and configuring Samba/NFS. The system is much more complex and there are nasty ways in which your data can get corrupted. To proceed for example without a functional clone test system would be fool hardy in the extreme. For example in my last job I had a functional clone of the live GPFS systems. By functional clone that means real hardware, with similar FC cards, attached to the same type of storage (in this case LSI/Netapp Engenio based) with NSD's of the same type though fewer running the same OS image, same multipathing drivers, same GPFS version, same Samba/CTDB versions. In addition I then had a standalone Samba server, same OS, same storage, etc. all the same except no GPFS and no CTDB. It was identical apart from the vfs_gpfs module. One of the reasons I choose to compile my own vfs_gpfs and insert it into pucka RHEL is that I wanted to be able to test issues against a known target that I could get support on. Then for good measure a test real Windows server, because there is nothing like being able to rule out problems as being down to the client not working properly with SMB. Finally virtual machines for building my vfs_gpfs modules. If you think I am being over the top, with my test platform let me assure you that *ALL* of it was absolutely essential for the diagnosis of problems with the system and the generation of fixes at one time or another. The thing is if you don't get this without being told then running a GPFS/Samba/CTDB service is really not for you. Also you need to understand what I call "GPFS and the Dark Arts" aka "Magic levers for Samba", what they do and why you might want them. 
There are probably only a handful of people outside IBM who understand those, which is why you get warnings from people inside IBM about doing it yourself.

So by all means do it, but make sure you have the test systems in place and a thorough understanding of all the technologies involved, as you are going to have to do support for yourself; you cannot ring IBM or RedHat and say my GPFS/Samba/CTDB storage cluster is not working, as you are well off the beaten path. Sure you can get support from IBM for GPFS issues provided it is entirely GPFS related, but if saving from Office 2010 on a shared drive with rich permissions is giving whacked-out file ownership and permission issues, that is going to be down to you to fix.

JAB.

-- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom.

From jfosburg at mdanderson.org Thu Dec 19 17:00:17 2013 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Thu, 19 Dec 2013 17:00:17 +0000 Subject: [gpfsug-discuss] Strange performance issue on interface nodes Message-ID:

This morning we started getting complaints that NFS mounts from our GPFS filesystem were hanging. After investigating, I found that our interface nodes (we have 2) had load averages over 800, but there was essentially no CPU usage (>99% idle). I rebooted the nodes, which restored service, but in less than an hour the load averages have already climbed higher than 70. I have no waiters:

[root at d1prpstg2nsd2 ~]# mmlsnode -N waiters -L 2>/dev/null [root at d1prpstg2nsd2 ~]#

I also have a lot of nfsd and smbd processes in a 'D' state. One of my interface servers also shows the following processes:

root 29901 0.0 0.0 105172 620 ? D< 10:03 0:00 touch /rsrch1/cnfsSharedRoot/.ha/recovery/10.113.115.56/10.113.115.57.tmp
root 30076 0.0 0.0 115688 860 ? D 10:03 0:00 ls -A -I *.tmp /rsrch1/cnfsSharedRoot/.ha/recovery/10.113.115.56

Those processes have been running almost an hour. The cluster is running GPFS 3.5.12, there are 6 NSD servers and 2 interface servers. Does anyone have thoughts as to what is going on?

-------------- next part -------------- An HTML attachment was scrubbed... URL:
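For anyone wanting to poke at something similar, a first pass usually looks something like the following (a sketch - the shared-root path is just the one from the output above):

    # any long GPFS waiters on this node?
    mmdiag --waiters

    # which processes are stuck in uninterruptible sleep, and where?
    ps -eo state,pid,wchan:30,cmd | awk '$1 ~ /^D/'

    # is the CNFS shared root itself responsive?
    time ls /rsrch1/cnfsSharedRoot/.ha

A pile of nfsd/smbd processes in D state while the waiters list stays empty may point at something other than slow disks - network, the lock/recovery machinery around CNFS, or the shared root - rather than GPFS I/O itself.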
From chekh at stanford.edu Mon Dec 9 19:52:52 2013 From: chekh at stanford.edu (Alex Chekholko) Date: Mon, 09 Dec 2013 11:52:52 -0800 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A26C79.7010706@calculquebec.ca> References: <52A26C79.7010706@calculquebec.ca> Message-ID: <52A61F94.7080702@stanford.edu>

Hi Richard,

I would just use something like 'iftop' to look at the traffic between the nodes. Or 'collectl'. Or 'dstat'.

e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 http://dag.wiee.rs/home-made/dstat/

For the NSD balance question, since GPFS stripes the blocks evenly across all the NSDs, they will end up balanced over time. Or you can rebalance manually with 'mmrestripefs -b' or similar.

It is unlikely that particular files ended up on a single NSD, unless the other NSDs are totally full.

Regards, Alex

On 12/06/2013 04:31 PM, Richard Lefebvre wrote: > Hi, > > I'm looking for a way to see which node (or nodes) is having an impact > on the gpfs server nodes which is slowing the whole file system? What > happens, usually, is a user is doing some I/O that doesn't fit the > configuration of the gpfs file system and the way it was explain on how > to use it efficiently. It is usually by doing a lot of unbuffered byte > size, very random I/O on the file system that was made for large files > and large block size. > > My problem is finding out who is doing that. I haven't found a way to > pinpoint the node or nodes that could be the source of the problem, with > over 600 client nodes. > > I tried to use "mmlsnodes -N waiters -L" but there is too much waiting > that I cannot pinpoint on something. > > I must be missing something simple. Anyone got any help?
> > Note: there is another thing I'm trying to pinpoint. A temporary > imbalance was created by adding a new NSD. It seems that a group of > files have been created on that same NSD and a user keeps hitting that > NSD causing a high load. I'm trying to pinpoint the origin of that too. > At least until everything is balance back. But will balancing spread > those files since they are already on the most empty NSD? > > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Alex Chekholko chekh at stanford.edu From richard.lefebvre at calculquebec.ca Mon Dec 9 21:05:41 2013 From: richard.lefebvre at calculquebec.ca (Richard Lefebvre) Date: Mon, 09 Dec 2013 16:05:41 -0500 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A61F94.7080702@stanford.edu> References: <52A26C79.7010706@calculquebec.ca> <52A61F94.7080702@stanford.edu> Message-ID: <52A630A5.4040400@calculquebec.ca> Hi Alex, I should have mention that my GPFS network is done through infiniband/RDMA, so looking at the TCP probably won't work. I will try to see if the traffic can be seen through ib0 (instead of eth0), but I have my doubts. As for the placement. The file system was 95% full when I added the new NSDs. I know that what is waiting now from the waiters commands is the to the 2 NSDs: waiting 0.791707000 seconds, NSDThread: for I/O completion on disk d9 I have added more NSDs since then but the waiting is still on the 2 disks. None of the others. Richard On 12/09/2013 02:52 PM, Alex Chekholko wrote: > Hi Richard, > > I would just use something like 'iftop' to look at the traffic between > the nodes. Or 'collectl'. Or 'dstat'. > > e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 > http://dag.wiee.rs/home-made/dstat/ > > For the NSD balance question, since GPFS stripes the blocks evenly > across all the NSDs, they will end up balanced over time. Or you can > rebalance manually with 'mmrestripefs -b' or similar. > > It is unlikely that particular files ended up on a single NSD, unless > the other NSDs are totally full. > > Regards, > Alex > > On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >> Hi, >> >> I'm looking for a way to see which node (or nodes) is having an impact >> on the gpfs server nodes which is slowing the whole file system? What >> happens, usually, is a user is doing some I/O that doesn't fit the >> configuration of the gpfs file system and the way it was explain on how >> to use it efficiently. It is usually by doing a lot of unbuffered byte >> size, very random I/O on the file system that was made for large files >> and large block size. >> >> My problem is finding out who is doing that. I haven't found a way to >> pinpoint the node or nodes that could be the source of the problem, with >> over 600 client nodes. >> >> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >> that I cannot pinpoint on something. >> >> I must be missing something simple. Anyone got any help? >> >> Note: there is another thing I'm trying to pinpoint. A temporary >> imbalance was created by adding a new NSD. It seems that a group of >> files have been created on that same NSD and a user keeps hitting that >> NSD causing a high load. I'm trying to pinpoint the origin of that too. >> At least until everything is balance back. But will balancing spread >> those files since they are already on the most empty NSD? 
>> >> Richard >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > From chekh at stanford.edu Mon Dec 9 21:21:24 2013 From: chekh at stanford.edu (Alex Chekholko) Date: Mon, 09 Dec 2013 13:21:24 -0800 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A630A5.4040400@calculquebec.ca> References: <52A26C79.7010706@calculquebec.ca> <52A61F94.7080702@stanford.edu> <52A630A5.4040400@calculquebec.ca> Message-ID: <52A63454.7010605@stanford.edu> Hi Richard, For IB traffic, you can use 'collectl -sx' http://collectl.sourceforge.net/Infiniband.html or else mmpmon (which is what 'dstat --gpfs' uses underneath anyway) If your other NSDs are full, then of course all writes will go to the empty NSDs. And then reading those new files your performance will be limited to just the new NSDs. Regards, Alex On 12/09/2013 01:05 PM, Richard Lefebvre wrote: > Hi Alex, > > I should have mention that my GPFS network is done through > infiniband/RDMA, so looking at the TCP probably won't work. I will try > to see if the traffic can be seen through ib0 (instead of eth0), but I > have my doubts. > > As for the placement. The file system was 95% full when I added the new > NSDs. I know that what is waiting now from the waiters commands is the > to the 2 NSDs: > > waiting 0.791707000 seconds, NSDThread: for I/O completion on disk d9 > > I have added more NSDs since then but the waiting is still on the 2 > disks. None of the others. > > Richard > > On 12/09/2013 02:52 PM, Alex Chekholko wrote: >> Hi Richard, >> >> I would just use something like 'iftop' to look at the traffic between >> the nodes. Or 'collectl'. Or 'dstat'. >> >> e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 >> http://dag.wiee.rs/home-made/dstat/ >> >> For the NSD balance question, since GPFS stripes the blocks evenly >> across all the NSDs, they will end up balanced over time. Or you can >> rebalance manually with 'mmrestripefs -b' or similar. >> >> It is unlikely that particular files ended up on a single NSD, unless >> the other NSDs are totally full. >> >> Regards, >> Alex >> >> On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >>> Hi, >>> >>> I'm looking for a way to see which node (or nodes) is having an impact >>> on the gpfs server nodes which is slowing the whole file system? What >>> happens, usually, is a user is doing some I/O that doesn't fit the >>> configuration of the gpfs file system and the way it was explain on how >>> to use it efficiently. It is usually by doing a lot of unbuffered byte >>> size, very random I/O on the file system that was made for large files >>> and large block size. >>> >>> My problem is finding out who is doing that. I haven't found a way to >>> pinpoint the node or nodes that could be the source of the problem, with >>> over 600 client nodes. >>> >>> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >>> that I cannot pinpoint on something. >>> >>> I must be missing something simple. Anyone got any help? >>> >>> Note: there is another thing I'm trying to pinpoint. A temporary >>> imbalance was created by adding a new NSD. It seems that a group of >>> files have been created on that same NSD and a user keeps hitting that >>> NSD causing a high load. I'm trying to pinpoint the origin of that too. >>> At least until everything is balance back. 
But will balancing spread >>> those files since they are already on the most empty NSD? >>> >>> Richard >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at gpfsug.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Alex Chekholko chekh at stanford.edu 347-401-4860 From v.andrews at noc.ac.uk Tue Dec 10 09:59:06 2013 From: v.andrews at noc.ac.uk (Andrews, Vincent) Date: Tue, 10 Dec 2013 09:59:06 +0000 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A63454.7010605@stanford.edu> Message-ID: We do not have as many client nodes, but we have an extensive Ganglia configuration that monitors all of the nodes on our network. For the client nodes we also run a script that pushes stats into Ganglia using 'mmpmon'. Using this we have been able to locate problem machines a lot quicker. I have attached the script, and is released on a 'it works for me' term. We run it every minute from cron. Vince. -- Vincent Andrews NOC, European Way, Southampton , SO14 3ZH Ext. 27616 External 023 80597616 This e-mail (and any attachments) is confidential and intended solely for the use of the individual or entity to whom it is addressed. Both NERC and the University of Southampton (who operate NOCS as a collaboration) are subject to the Freedom of Information Act 2000. The information contained in this e-mail and any reply you make may be disclosed unless it is legally from disclosure. Any material supplied to NOCS may be stored in the electronic records management system of either the University or NERC as appropriate. On 09/12/2013 21:21, "Alex Chekholko" wrote: >Hi Richard, > >For IB traffic, you can use 'collectl -sx' >http://collectl.sourceforge.net/Infiniband.html >or else mmpmon (which is what 'dstat --gpfs' uses underneath anyway) > >If your other NSDs are full, then of course all writes will go to the >empty NSDs. And then reading those new files your performance will be >limited to just the new NSDs. > > >Regards, >Alex > >On 12/09/2013 01:05 PM, Richard Lefebvre wrote: >> Hi Alex, >> >> I should have mention that my GPFS network is done through >> infiniband/RDMA, so looking at the TCP probably won't work. I will try >> to see if the traffic can be seen through ib0 (instead of eth0), but I >> have my doubts. >> >> As for the placement. The file system was 95% full when I added the new >> NSDs. I know that what is waiting now from the waiters commands is the >> to the 2 NSDs: >> >> waiting 0.791707000 seconds, NSDThread: for I/O completion on disk d9 >> >> I have added more NSDs since then but the waiting is still on the 2 >> disks. None of the others. >> >> Richard >> >> On 12/09/2013 02:52 PM, Alex Chekholko wrote: >>> Hi Richard, >>> >>> I would just use something like 'iftop' to look at the traffic between >>> the nodes. Or 'collectl'. Or 'dstat'. >>> >>> e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 >>> http://dag.wiee.rs/home-made/dstat/ >>> >>> For the NSD balance question, since GPFS stripes the blocks evenly >>> across all the NSDs, they will end up balanced over time. Or you can >>> rebalance manually with 'mmrestripefs -b' or similar. >>> >>> It is unlikely that particular files ended up on a single NSD, unless >>> the other NSDs are totally full. 
>>> >>> Regards, >>> Alex >>> >>> On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >>>> Hi, >>>> >>>> I'm looking for a way to see which node (or nodes) is having an impact >>>> on the gpfs server nodes which is slowing the whole file system? What >>>> happens, usually, is a user is doing some I/O that doesn't fit the >>>> configuration of the gpfs file system and the way it was explain on >>>>how >>>> to use it efficiently. It is usually by doing a lot of unbuffered >>>>byte >>>> size, very random I/O on the file system that was made for large files >>>> and large block size. >>>> >>>> My problem is finding out who is doing that. I haven't found a way to >>>> pinpoint the node or nodes that could be the source of the problem, >>>>with >>>> over 600 client nodes. >>>> >>>> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >>>> that I cannot pinpoint on something. >>>> >>>> I must be missing something simple. Anyone got any help? >>>> >>>> Note: there is another thing I'm trying to pinpoint. A temporary >>>> imbalance was created by adding a new NSD. It seems that a group of >>>> files have been created on that same NSD and a user keeps hitting that >>>> NSD causing a high load. I'm trying to pinpoint the origin of that >>>>too. >>>> At least until everything is balance back. But will balancing spread >>>> those files since they are already on the most empty NSD? >>>> >>>> Richard >>>> _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at gpfsug.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > >-- >Alex Chekholko chekh at stanford.edu 347-401-4860 >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at gpfsug.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss This message (and any attachments) is for the recipient only. NERC is subject to the Freedom of Information Act 2000 and the contents of this email and any reply you make may be disclosed by NERC unless it is exempt from release under the Act. Any material supplied to NERC may be stored in an electronic records management system. -------------- next part -------------- A non-text attachment was scrubbed... Name: ganglia_gpfs_client_stats.pl Type: text/x-perl-script Size: 3701 bytes Desc: ganglia_gpfs_client_stats.pl URL: From viccornell at gmail.com Tue Dec 10 10:13:20 2013 From: viccornell at gmail.com (Vic Cornell) Date: Tue, 10 Dec 2013 10:13:20 +0000 Subject: [gpfsug-discuss] Looking for a way to see which node is having an impact on server? In-Reply-To: <52A61F94.7080702@stanford.edu> References: <52A26C79.7010706@calculquebec.ca> <52A61F94.7080702@stanford.edu> Message-ID: <6726F05D-3332-4FF4-AB9D-F78B542E2249@gmail.com> Have you looked at mmpmon? Its a bit much for 600 nodes but if you run it with a reasonable interface specified then the output shouldn't be too hard to parse. Quick recipe: create a file called mmpmon.conf that looks like ################# cut here ######################### nlist add node1 node2 node3 node4 node5 io_s reset ################# cut here ######################### Where node1,node2 etc are your node names - it might be as well to do this for batches of 50 or so. 
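If typing out a few hundred node names by hand doesn't appeal, something along these lines can generate the file for you - just a sketch, and the mmlscluster output format differs a bit between releases, so check the awk fields against your own cluster first:

# rough helper: build an mmpmon.conf from the first 50 daemon node names in the cluster
mmlscluster | awk '$1 ~ /^[0-9]+$/ {print $2}' | head -50 > /tmp/nodes
{ printf "nlist add"; while read n; do printf " %s" "$n"; done < /tmp/nodes
  echo; echo "io_s"; echo "reset"; } > mmpmon.conf
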
then run something like: /usr/lpp/mmfs/bin/mmpmon -i mmpmon.conf -d 10000 -r 0 -p That will give you a set of stats for all of your named nodes aggregated over a 10 second period Dont run more than one of these as each one will reset the stats for the other :-) parse out the stats with something like: awk -F_ '{if ($2=="io"){print $8,$16/1024/1024,$18/1024/1024}}' which will give you read and write throughput. The docs (GPFS advanced Administration Guide) are reasonable. Cheers, Vic Cornell viccornell at gmail.com On 9 Dec 2013, at 19:52, Alex Chekholko wrote: > Hi Richard, > > I would just use something like 'iftop' to look at the traffic between the nodes. Or 'collectl'. Or 'dstat'. > > e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10 > http://dag.wiee.rs/home-made/dstat/ > > For the NSD balance question, since GPFS stripes the blocks evenly across all the NSDs, they will end up balanced over time. Or you can rebalance manually with 'mmrestripefs -b' or similar. > > It is unlikely that particular files ended up on a single NSD, unless the other NSDs are totally full. > > Regards, > Alex > > On 12/06/2013 04:31 PM, Richard Lefebvre wrote: >> Hi, >> >> I'm looking for a way to see which node (or nodes) is having an impact >> on the gpfs server nodes which is slowing the whole file system? What >> happens, usually, is a user is doing some I/O that doesn't fit the >> configuration of the gpfs file system and the way it was explain on how >> to use it efficiently. It is usually by doing a lot of unbuffered byte >> size, very random I/O on the file system that was made for large files >> and large block size. >> >> My problem is finding out who is doing that. I haven't found a way to >> pinpoint the node or nodes that could be the source of the problem, with >> over 600 client nodes. >> >> I tried to use "mmlsnodes -N waiters -L" but there is too much waiting >> that I cannot pinpoint on something. >> >> I must be missing something simple. Anyone got any help? >> >> Note: there is another thing I'm trying to pinpoint. A temporary >> imbalance was created by adding a new NSD. It seems that a group of >> files have been created on that same NSD and a user keeps hitting that >> NSD causing a high load. I'm trying to pinpoint the origin of that too. >> At least until everything is balance back. But will balancing spread >> those files since they are already on the most empty NSD? >> >> Richard >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Alex Chekholko chekh at stanford.edu > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From stefan.fritzsche at slub-dresden.de Tue Dec 10 10:16:35 2013 From: stefan.fritzsche at slub-dresden.de (Stefan Fritzsche) Date: Tue, 10 Dec 2013 11:16:35 +0100 Subject: [gpfsug-discuss] backup and hsm with gpfs Message-ID: <52A6EA03.30101@slub-dresden.de> Dear gpfsug, we are the SLUB, Saxon State and University Library Dresden. Our goal is to build a long term preservation system. We use gpfs and a tsm with hsm integration to backup, migrate and distribute the data over two computing centers. Currently, we are making backups with the normal tsm ba-client. Our pre-/migration runs with the gpfs-policy engine to find all files that are in the state "rersistent" and match some additional rules. 
After the scan, we create a filelist and premigrate the data with dsmmigfs. The normal backup takes a long time for the scan of the whole gpfs-filesystem, so we are looking for a better way to perfom the backups. I know that i can also use the policy engine to perfom the backup but my questions are: How do I perform backups with gpfs? Is there anyone who uses the mmbackup command or mmbackup in companies with snapshots? Does anyone have any expirence in writing an application with gpfs-api and/or dmapi? Thank you for your answers and proposals. Best regards, Stefan -- Stefan Fritzsche SLUB email: stefan.fritzsche at slub-dresden.de Zellescher Weg 18 --------------------------------------------- From chekh at stanford.edu Tue Dec 10 19:36:24 2013 From: chekh at stanford.edu (Alex Chekholko) Date: Tue, 10 Dec 2013 11:36:24 -0800 Subject: [gpfsug-discuss] backup and hsm with gpfs In-Reply-To: <52A6EA03.30101@slub-dresden.de> References: <52A6EA03.30101@slub-dresden.de> Message-ID: <52A76D38.8010301@stanford.edu> Hi Stefan, Since you're using TSM with GPFS, are you following their current integration instructions? My understanding is that what you want is a regular use case of TSM/GPFS backups. For file system scans, I believe that the policy engine scales linearly with the number of nodes you run it on. Can you add more storage nodes? Or run your policy scans across more existing nodes? Regards, Alex On 12/10/13, 2:16 AM, Stefan Fritzsche wrote: > Dear gpfsug, > > we are the SLUB, Saxon State and University Library Dresden. > > Our goal is to build a long term preservation system. We use gpfs and a > tsm with hsm integration to backup, migrate and distribute the data over > two computing centers. > Currently, we are making backups with the normal tsm ba-client. > Our pre-/migration runs with the gpfs-policy engine to find all files > that are in the state "rersistent" and match some additional rules. > After the scan, we create a filelist and premigrate the data with dsmmigfs. > > The normal backup takes a long time for the scan of the whole > gpfs-filesystem, so we are looking for a better way to perfom the backups. > I know that i can also use the policy engine to perfom the backup but my > questions are: > > How do I perform backups with gpfs? > > Is there anyone who uses the mmbackup command or mmbackup in companies > with snapshots? > > Does anyone have any expirence in writing an application with gpfs-api > and/or dmapi? > > Thank you for your answers and proposals. > > Best regards, > Stefan > > -- chekh at stanford.edu From ewahl at osc.edu Tue Dec 10 20:34:02 2013 From: ewahl at osc.edu (Ed Wahl) Date: Tue, 10 Dec 2013 20:34:02 +0000 Subject: [gpfsug-discuss] backup and hsm with gpfs In-Reply-To: <52A6EA03.30101@slub-dresden.de> References: <52A6EA03.30101@slub-dresden.de> Message-ID: I have moderate experience with mmbackup, dmapi (though NOT with custom apps) and both the Tivoli HSM and the newer LTFS-EE product (relies on dsmmigfs for a backend). How much time does a standard dsmc backup scan take? And an mmapplypolicy scan? So you have both a normal backup with dsmc today and also want to push to HSM with policy engine? Are these separate storage destinations? If they are the same, perhaps using mmbackup and making DR copies inside TSM is better? Or would that affect other systems being backup up to TSM? Or perhaps configure a storage pool for TSM that only handles the special files such that they don't mix tapes? 
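For reference, the sort of snapshot-driven mmbackup run Stefan is asking about looks roughly like the below - only a sketch, the device and snapshot names are made up and the exact options depend on your GPFS and TSM levels, so check the mmbackup man page on your release:

# take a snapshot, drive the backup from it, then clean up (example names only)
mmcrsnapshot gpfs0 backupsnap
mmbackup /gpfs/gpfs0 -t incremental -S backupsnap -N nsd01,nsd02
mmdelsnapshot gpfs0 backupsnap
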
mmbackup uses the standard policy engine scans with (unfortunately) a set # of directory and scan threads (defaults to 24 but ends up a somewhat higher # on first node of the backup) unlike a standard mmapplypolicy where you can adjust the thread levels and the only adjustment is "-m #" which adjusts how many dsmc threads/processes run per node. Overall I find the mmbackup with multi-node support to be _much_ faster than the linear dsmc scans. _WAY_ too thread heavy and insanely high IO loads on smaller GPFS's with mid-range to slower metadata though. (almost all IO Wait with loads well over 100 on an 8 core server) Depending on your version of both TSM and GPFS you can quickly convert from dsmc schedule to mmbackup with snapshots using -q or -rebuild options. Be aware there are some versions of GPFS that do NOT work with snapshots and mmbackup, and there are quite a few gotchas in the TSM integration. The largest of which is if you normally use TSM virtualmountpoints. That is NOT supported in GPFS. It will backup happily, but restoration is more amusing and it creates a TSM filespace per vmp. This currently breaks the shadowDB badly and makes 'rebuilds' damn near impossible in the newest GPFS and just annoying in older versions. All that being said, the latest version of GPFS and anything above about TSM 6.4.x client seem to work well for us. Ed Wahl OSC ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Stefan Fritzsche [stefan.fritzsche at slub-dresden.de] Sent: Tuesday, December 10, 2013 5:16 AM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] backup and hsm with gpfs Dear gpfsug, we are the SLUB, Saxon State and University Library Dresden. Our goal is to build a long term preservation system. We use gpfs and a tsm with hsm integration to backup, migrate and distribute the data over two computing centers. Currently, we are making backups with the normal tsm ba-client. Our pre-/migration runs with the gpfs-policy engine to find all files that are in the state "rersistent" and match some additional rules. After the scan, we create a filelist and premigrate the data with dsmmigfs. The normal backup takes a long time for the scan of the whole gpfs-filesystem, so we are looking for a better way to perfom the backups. I know that i can also use the policy engine to perfom the backup but my questions are: How do I perform backups with gpfs? Is there anyone who uses the mmbackup command or mmbackup in companies with snapshots? Does anyone have any expirence in writing an application with gpfs-api and/or dmapi? Thank you for your answers and proposals. Best regards, Stefan -- Stefan Fritzsche SLUB email: stefan.fritzsche at slub-dresden.de Zellescher Weg 18 --------------------------------------------- _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From rltodd.ml1 at gmail.com Thu Dec 12 18:14:08 2013 From: rltodd.ml1 at gmail.com (Lindsay Todd) Date: Thu, 12 Dec 2013 13:14:08 -0500 Subject: [gpfsug-discuss] GPFS and both Samba and NFS Message-ID: Hello, Since this is my first note to the group, I'll introduce myself first. I am Lindsay Todd, a Systems Programmer at Rensselaer Polytechnic Institute's Center for Computational Innovations, where I run a 1.2PiB GPFS cluster serving a Blue Gene/Q and a variety of Opteron and Intel clients, run an IBM Watson, and serve as an adjunct faculty. 
I also do some freelance consulting, including GPFS, for several customers. One of my customers is needing to serve GPFS storage through both NFS and Samba; they have GPFS 3.5 running on RHEL5 (not RHEL6) servers. I did not set this up for them, but was called to help fix it. Currently they export NFS using cNFS; I think we have that straightened out server-side now. Also they run Samba on several of the servers; I'm sure the group will not be surprised to hear they experience file corruption and other strange problems. I've been pushing them to use Samba-CTDB, and it looks like it will happen. Except, I've never used this myself. So this raises a couple questions: 1) It looks like RHEL5 bundles in an old version of CTDB. Should that be used, or would we be better with a build from the Enterprise Samba site, or even a build from source? 2) Given that CTDB can also run NFS, what are people who need both finding works best: run both cNFS + Samba-CTDB, or let CTDB run both? It seems to me that if I let CTDB run both, I only need a single floating IP address for each server, while if I also use cNFS, I will want a floating address for both NFS and Samba, on each server. Thanks for the help! R. Lindsay Todd, PhD -------------- next part -------------- An HTML attachment was scrubbed... URL: From seanlee at tw.ibm.com Fri Dec 13 12:53:14 2013 From: seanlee at tw.ibm.com (Sean S Lee) Date: Fri, 13 Dec 2013 20:53:14 +0800 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: Message-ID: Hi Todd Regarding your second question: it's usually better to use distinct IP's for the two services. If they're both using the same set of virtual IP's, then a failure of one service will cause the associated virtual IP to failover elsewhere which can be disruptive for users of the other service running at that VIP. Ideally they would use just one file sharing protocol, so maybe you can heck if that is possible. Regards Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From orlando.richards at ed.ac.uk Fri Dec 13 15:15:03 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 13 Dec 2013 15:15:03 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: Message-ID: <52AB2477.50802@ed.ac.uk> On 12/12/13 18:14, Lindsay Todd wrote: > Hello, > > Since this is my first note to the group, I'll introduce myself first. > I am Lindsay Todd, a Systems Programmer at Rensselaer Polytechnic > Institute's Center for Computational Innovations, where I run a 1.2PiB > GPFS cluster serving a Blue Gene/Q and a variety of Opteron and Intel > clients, run an IBM Watson, and serve as an adjunct faculty. I also do > some freelance consulting, including GPFS, for several customers. > > One of my customers is needing to serve GPFS storage through both NFS > and Samba; they have GPFS 3.5 running on RHEL5 (not RHEL6) servers. I > did not set this up for them, but was called to help fix it. Currently > they export NFS using cNFS; I think we have that straightened out > server-side now. Also they run Samba on several of the servers; I'm > sure the group will not be surprised to hear they experience file > corruption and other strange problems. > > I've been pushing them to use Samba-CTDB, and it looks like it will > happen. Except, I've never used this myself. So this raises a couple > questions: > > 1) It looks like RHEL5 bundles in an old version of CTDB. 
Should that be > used, or would we be better with a build from the Enterprise Samba site, > or even a build from source? > Hi Lindsay, We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), after running into performance problems with the sernet bundled version (1.0.114). It's easy to build: git clone git://git.samba.org/ctdb.git ctdb.git cd ctdb.git git branch -r git checkout -b "my_build" origin/1.2.40 cd packaging/RPM/ ./makerpms.sh yum install /root/rpmbuild/RPMS/x86_64/ctdb*.rpm I then take the Sernet src rpm and rebuild it, using ctdb.h from the above rather than the 1.0.114 version they use. This is possibly not required, but I thought it best to be sure that the differing headers wouldn't cause any problems. I remain, as ever, very grateful to Sernet for providing these! > 2) Given that CTDB can also run NFS, what are people who need both > finding works best: run both cNFS + Samba-CTDB, or let CTDB run both? > It seems to me that if I let CTDB run both, I only need a single > floating IP address for each server, while if I also use cNFS, I will > want a floating address for both NFS and Samba, on each server. > We let CTDB run both, but we didn't come to that decision by comparing the merits of both options. I think Bristol (Bob Cregan is cc'd, I'm not sure he's on this list) run cNFS and CTDB side by side. As you say - you'd at least require different IP addresses to do that. > Thanks for the help! Best of luck :) > > R. Lindsay Todd, PhD > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From jonathan at buzzard.me.uk Fri Dec 13 15:31:03 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Fri, 13 Dec 2013 15:31:03 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AB2477.50802@ed.ac.uk> References: <52AB2477.50802@ed.ac.uk> Message-ID: <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: [SNIP] > Hi Lindsay, > > We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), > after running into performance problems with the sernet bundled version > (1.0.114). It's easy to build: Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with Samba 4.1 http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From orlando.richards at ed.ac.uk Fri Dec 13 15:35:01 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 13 Dec 2013 15:35:01 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AB2925.1020809@ed.ac.uk> On 13/12/13 15:31, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: > > [SNIP] > >> Hi Lindsay, >> >> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), >> after running into performance problems with the sernet bundled version >> (1.0.114). 
It's easy to build: > > Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with > Samba 4.1 > > http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ > > JAB. > The samba team are currently working to bring ctdb into the main samba source tree - so hopefully this will become a moot point soon! -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From oehmes at us.ibm.com Fri Dec 13 18:14:45 2013 From: oehmes at us.ibm.com (Sven Oehme) Date: Fri, 13 Dec 2013 10:14:45 -0800 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AB2925.1020809@ed.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> Message-ID: Orlando, give that there are so many emails/interest on this topic in recent month, let me share some personal expertise on this :-) any stock Samba or CTDB version you will find any distro is not sufficient and it doesn't matter which you choose (SLES, RHEL or any form of debian and any version of all of them). the reason is that Samba doesn't have the GPFS header and library files included in its source, and at compile time it dynamically enables/disables all GPFS related things based on the availability of the GPFS packages . as non of the distros build machines have GPFS installed all this packages end up with binaries in their rpms which don't have the required code enabled to properly support GPFS and non of the vfs modules get build either. the only way to get something working (don't get confused with officially Supported) is to recompile the CTDB src packages AND the Samba src packages on a node that has GPFS already installed. also the inclusion of CTDB into Samba will not address this, its just a more convenient packaging. Only if the build happens on such a node things like the vfs modules for GPFS are build and included in the package. said all this the binaries alone are only part of the Solution, after you have the correct packages, you need to properly configuration the system and setting all the right options (on GPFS as well as on CTDB and smbd.conf), which unfortunate are very System configuration specific, as otherwise you still can end up with data corruption if not set right. also some people in the past have used a single instance of Samba to export shares over CIFS as they believe its a safe alternative to a more complex CTDB setup. also here a word of caution, even if you have a single instance of Samba running on top of GPFS you are exposed to potential data corruption if you don't use the proper Samba version (explained above) and the proper configuration, you can skip CTDB For that, but you still require a proper compiled version of Samba with GPFS code installed on the build machine. to be very clear the problem is not GPFS, its that Samba does locking AND caching on top of the Filesystem without GPFS knowledge if you don't use the right code/config to 'tell' GPFS about it, so GPFS can not ensure data consistency, not even on the same physical node for data thats shared over CIFS. there are unfortunate no shortcuts. i also have to point out that if you recompile the packages and configure everything correctly this is most likely to work, but you won't get official support for the CIFS part of this setup from IBM. 
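a quick sanity check on whether the samba build you are running actually has the gpfs support compiled in is to look for the vfs module and the build define - only a sketch, the module path varies by distro and build:

# module path varies by distro/build; /usr/lib64/samba/vfs is the usual 64-bit RHEL location
ls -l /usr/lib64/samba/vfs/gpfs.so
# if this prints nothing, the build most likely has no GPFS support compiled in
smbd -b | grep -i gpfs
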
This email is not an official Statement/Response of IBM, see it as personal 'AS-IS' Information sharing. Sven From: Orlando Richards To: gpfsug-discuss at gpfsug.org Date: 12/13/2013 07:35 AM Subject: Re: [gpfsug-discuss] GPFS and both Samba and NFS Sent by: gpfsug-discuss-bounces at gpfsug.org On 13/12/13 15:31, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: > > [SNIP] > >> Hi Lindsay, >> >> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), >> after running into performance problems with the sernet bundled version >> (1.0.114). It's easy to build: > > Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with > Samba 4.1 > > http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ > > JAB. > The samba team are currently working to bring ctdb into the main samba source tree - so hopefully this will become a moot point soon! -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Mon Dec 16 11:02:13 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 11:02:13 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AB2925.1020809@ed.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> Message-ID: <1387191733.7230.8.camel@buzzard.phy.strath.ac.uk> On Fri, 2013-12-13 at 15:35 +0000, Orlando Richards wrote: > On 13/12/13 15:31, Jonathan Buzzard wrote: > > On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: > > > > [SNIP] > > > >> Hi Lindsay, > >> > >> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), > >> after running into performance problems with the sernet bundled version > >> (1.0.114). It's easy to build: > > > > Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with > > Samba 4.1 > > > > http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ > > > > JAB. > > > The samba team are currently working to bring ctdb into the main samba > source tree - so hopefully this will become a moot point soon! Yes I am aware of this. The point of bringing up what is going into RHEL7 was to get a flavour of what RedHat consider stable enough to push out into a supported enterprise product. I always ran my GPFS/Samba/CTDB clusters by taking a stock RHEL Samba, and patching the spec file to build the vfs_gpfs module possibly with extra patches to the vfs_gpfs module for bug fixes against the GPFS version running on the cluster and have it produce a suitable RPM with just the vfs_gpfs module that I could then load into the stock RHEL Samba. It would appear that RedHat are doing something similar in RHEL7 with a vfs_glusterfs RPM for Samba. Of course even with CTDB in Samba you are still going to need to do some level of rebuilding because you won't get the vfs_gpfs module otherwise. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. 
From jonathan at buzzard.me.uk Mon Dec 16 11:21:29 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 11:21:29 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> Message-ID: <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: [SNIP] > the only way to get something working (don't get confused with > officially Supported) is to recompile the CTDB src packages AND the > Samba src packages on a node that has GPFS already installed. also the > inclusion of CTDB into Samba will not address this, its just a more > convenient packaging. > > Only if the build happens on such a node things like the vfs modules > for GPFS are build and included in the package. > That is a factually inaccurate statement. There is nothing in CTDB that is GPFS specific. Trust me I have examined the code closely to determine if this is the case. So unless this has changed recently you are flat out wrong. Consequently there is no requirement whatsoever to rebuild CTDB to get the vfs_gpfs module. In addition there is also no requirement to actually have GPFS installed to build the vfs_gpfs module either. What you need to have is the GPFS GPL header files and nothing else. As it is a loadable VFS module linking takes place at load time not compile time. It is unworthy of an IBM employee to spread such inaccurate misinformation. [SNIP] > said all this the binaries alone are only part of the Solution, after > you have the correct packages, you need to properly configuration the > system and setting all the right options (on GPFS as well as on CTDB > and smbd.conf), which unfortunate are very System configuration > specific, as otherwise you still can end up with data corruption if > not set right. Indeed. However I know not only what those options are, but also what they do despite IBM's refusal to tell us anything about them. I would also point out that there are sites that where running Samba on top of GPFS for many years before IBM began offering their SONAS/Storwize Unifed products. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From chair at gpfsug.org Mon Dec 16 11:30:22 2013 From: chair at gpfsug.org (Chair) Date: Mon, 16 Dec 2013 11:30:22 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AEE44E.1030805@gpfsug.org> Allo Just jumping in here a minute: > It is unworthy of an IBM employee to spread such inaccurate misinformation. Whilst this may be inaccurate - I very, very, much doubt that IBM or their employees have a secret master plan to spread misinformation (!) In the spirit of this group, let's work together to technically look at such issues. Sven, if that is the case, perhaps you could crib the lines of code / show your methodology that supports your views / experience. Regards, Jez -- UG Chair On 16/12/13 11:21, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: > > [SNIP] > >> the only way to get something working (don't get confused with >> officially Supported) is to recompile the CTDB src packages AND the >> Samba src packages on a node that has GPFS already installed. 
also the >> inclusion of CTDB into Samba will not address this, its just a more >> convenient packaging. >> >> Only if the build happens on such a node things like the vfs modules >> for GPFS are build and included in the package. >> > That is a factually inaccurate statement. There is nothing in CTDB that > is GPFS specific. Trust me I have examined the code closely to determine > if this is the case. So unless this has changed recently you are flat > out wrong. > > Consequently there is no requirement whatsoever to rebuild CTDB to get > the vfs_gpfs module. In addition there is also no requirement to > actually have GPFS installed to build the vfs_gpfs module either. What > you need to have is the GPFS GPL header files and nothing else. As it is > a loadable VFS module linking takes place at load time not compile time. > > It is unworthy of an IBM employee to spread such inaccurate > misinformation. > > [SNIP] > >> said all this the binaries alone are only part of the Solution, after >> you have the correct packages, you need to properly configuration the >> system and setting all the right options (on GPFS as well as on CTDB >> and smbd.conf), which unfortunate are very System configuration >> specific, as otherwise you still can end up with data corruption if >> not set right. > Indeed. However I know not only what those options are, but also what > they do despite IBM's refusal to tell us anything about them. > > I would also point out that there are sites that where running Samba on > top of GPFS for many years before IBM began offering their > SONAS/Storwize Unifed products. > > > JAB. > From orlando.richards at ed.ac.uk Mon Dec 16 12:26:16 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 16 Dec 2013 12:26:16 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1387191733.7230.8.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387191733.7230.8.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AEF168.1000701@ed.ac.uk> On 16/12/13 11:02, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 15:35 +0000, Orlando Richards wrote: >> On 13/12/13 15:31, Jonathan Buzzard wrote: >>> On Fri, 2013-12-13 at 15:15 +0000, Orlando Richards wrote: >>> >>> [SNIP] >>> >>>> Hi Lindsay, >>>> >>>> We rebuild ctdb from the (git) source (in the 1.2.40 branch currently), >>>> after running into performance problems with the sernet bundled version >>>> (1.0.114). It's easy to build: >>> >>> Interestingly the RHEL7 beta is shipping ctdb 2.1 in combination with >>> Samba 4.1 >>> >>> http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/Packages/ >>> >>> JAB. >>> >> The samba team are currently working to bring ctdb into the main samba >> source tree - so hopefully this will become a moot point soon! > > Yes I am aware of this. The point of bringing up what is going into > RHEL7 was to get a flavour of what RedHat consider stable enough to push > out into a supported enterprise product. > > I always ran my GPFS/Samba/CTDB clusters by taking a stock RHEL Samba, > and patching the spec file to build the vfs_gpfs module possibly with > extra patches to the vfs_gpfs module for bug fixes against the GPFS > version running on the cluster and have it produce a suitable RPM with > just the vfs_gpfs module that I could then load into the stock RHEL > Samba. > > It would appear that RedHat are doing something similar in RHEL7 with a > vfs_glusterfs RPM for Samba. 
> > Of course even with CTDB in Samba you are still going to need to do some > level of rebuilding because you won't get the vfs_gpfs module otherwise. > > > Sernet include GPFS in their builds - they truly are wonderful :) -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From orlando.richards at ed.ac.uk Mon Dec 16 12:31:47 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 16 Dec 2013 12:31:47 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AEF2B3.5030805@ed.ac.uk> On 16/12/13 11:30, Chair wrote: > Allo > > Just jumping in here a minute: > >> It is unworthy of an IBM employee to spread such inaccurate >> misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues. > > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. > Presumably this all comes down to the locking setup? Relevant to this, I've got: (from samba settings:) vfs objects = shadow_copy2, fileid, gpfs, syncops clustering = yes gpfs:sharemodes = yes syncops:onmeta = no blocking locks = Yes fake oplocks = No kernel oplocks = Yes locking = Yes oplocks = Yes level2 oplocks = Yes oplock contention limit = 2 posix locking = Yes strict locking = Auto (gpfs settings:) syncSambaMetadataOps yes > > Regards, > > Jez > -- > UG Chair > > > > On 16/12/13 11:21, Jonathan Buzzard wrote: >> On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: >> >> [SNIP] >> >>> the only way to get something working (don't get confused with >>> officially Supported) is to recompile the CTDB src packages AND the >>> Samba src packages on a node that has GPFS already installed. also the >>> inclusion of CTDB into Samba will not address this, its just a more >>> convenient packaging. >>> >>> Only if the build happens on such a node things like the vfs modules >>> for GPFS are build and included in the package. >>> >> That is a factually inaccurate statement. There is nothing in CTDB that >> is GPFS specific. Trust me I have examined the code closely to determine >> if this is the case. So unless this has changed recently you are flat >> out wrong. >> >> Consequently there is no requirement whatsoever to rebuild CTDB to get >> the vfs_gpfs module. In addition there is also no requirement to >> actually have GPFS installed to build the vfs_gpfs module either. What >> you need to have is the GPFS GPL header files and nothing else. As it is >> a loadable VFS module linking takes place at load time not compile time. >> >> It is unworthy of an IBM employee to spread such inaccurate >> misinformation. 
>> >> [SNIP] >> >>> said all this the binaries alone are only part of the Solution, after >>> you have the correct packages, you need to properly configuration the >>> system and setting all the right options (on GPFS as well as on CTDB >>> and smbd.conf), which unfortunate are very System configuration >>> specific, as otherwise you still can end up with data corruption if >>> not set right. >> Indeed. However I know not only what those options are, but also what >> they do despite IBM's refusal to tell us anything about them. >> >> I would also point out that there are sites that where running Samba on >> top of GPFS for many years before IBM began offering their >> SONAS/Storwize Unifed products. >> >> >> JAB. >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From jonathan at buzzard.me.uk Mon Dec 16 12:49:06 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 12:49:06 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk> On Mon, 2013-12-16 at 11:30 +0000, Chair wrote: > Allo > > Just jumping in here a minute: > > > It is unworthy of an IBM employee to spread such inaccurate misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues. You made me begin to douby myself so I have double checked. Specifically the ctdb-2.1-3.el7.src.rpm from the RHEL7 beta and it is as I calmed. There is no compilable code that is linked too or dependant on GPFS. Some of the support scripts (specifically the 60.ganesha and 62.cnfs) are GPFS aware, but no recompilation is needed as they are copied through to the binary RPM as is. You either know for a definite fact that CTDB needs recompiling on a machine with GPFS installed or you are simply spreading falsehoods. Why make such a long and detailed post if you don't know it to be true in the first place? > > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. > There is none, because he is flat out 100% wrong. I did extensive research on this when deploying a GPFS/Samba/CTDB solution. I need a recompile to get the vfs_gpfs module and I naturally as was not going to use the RHEL supplied CTDB rpms without thoroughly checking out whether they needed recompiling. Note this all pre-dated Sernet offering pre-compiled binaries with the vfs_gpfs module included. Personally I prefer running with standard RHEL binaries with my own vfs_gpfs. It makes it easy to have a test setup that is plain Samba so one can narrow things down to being CTDB related very easily. In addition you don't know which version of the GPFS headers Sernet have compiled against. 
It also helps to have pucka Windows SMB servers as well so you can rule out client side issues. The sort of why is Powerpoint 2011 on a Mac taking minutes to save a file, where saving it under a different name takes seconds. When it exhibits the same behaviour on a Windows server you know there is nothing wrong with your Samba servers. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From orlando.richards at ed.ac.uk Mon Dec 16 13:34:59 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 16 Dec 2013 13:34:59 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <1387198146.7230.51.camel@buzzard.phy.strath.ac.uk> Message-ID: <52AF0183.9030809@ed.ac.uk> On 16/12/13 12:49, Jonathan Buzzard wrote: > In addition you don't know which version of the GPFS > headers Sernet have compiled against. Good point - in fact, it's an older version that I have on my GPFS 3.5 setup. I've updated my build process to fix that - thanks Jab :) -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 skype: orlando.richards The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From oehmes at us.ibm.com Mon Dec 16 15:31:53 2013 From: oehmes at us.ibm.com (Sven Oehme) Date: Mon, 16 Dec 2013 07:31:53 -0800 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AEE44E.1030805@gpfsug.org> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: Jez, the other replies to my email where kind of unexpected as i was indeed intending to help, i will continue to provide details and try to answer serious questions and comments. also as you point out there is no secret master plan to spread misinformation, the topic is simply complicated and there are reasons that are hard to explain why there is no full official support outside the current shipping IBM products and i will not dive into discussions around it. a few words of clarification, on the controversial point, compile ctdb yes/no. while its technically correct, that you don't have to recompile ctdb and it has no code that links/depends in any form on GPFS (samba does), you still have to compile one if the one shipped with your distro is out of date and while i didn't explicitly wrote it this way, this is what i intended to say. the version RH ships is very old and since has gotten many fixes which i highly recommend people to use unless the distros will pickup more recent version like RHEL7 will do. i probably wouldn't recommend a daily git build, but rather the packaged versions that are hosted on ftp.samba.org like : https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/ so the proposed order of things would be to install gpfs, pull the src package of ctdb, compile and install ctdb and the devel packages, then pull a recent samba src package , install all the dependencies and build samba on this same host with gpfs and ctdb packages already installed. the resulting rpm's should contain the proper code to continue. 
after you got your versions compiled, installed and the basics of ctdb running, you should use the following smb.conf as a starting point : [global] netbios name = cluster fileid:mapping = fsname gpfs:sharemodes = yes gpfs:leases = yes gpfs:dfreequota = yes gpfs:hsm = yes syncops:onmeta = no kernel oplocks = no level2 oplocks = yes notify:inotify = no vfs objects = shadow_copy2 syncops gpfs fileid shadow:snapdir = .snapshots shadow:fixinodes = yes shadow:snapdirseverywhere = yes shadow:sort = desc wide links = no async smb echo handler = yes smbd:backgroundqueue = False use sendfile = no strict locking = yes posix locking = yes force unknown acl user = yes nfs4:mode = simple nfs4:chown = yes nfs4:acedup = merge nfs4:sidmap = /etc/samba/sidmap.tdb gpfs:winattr = yes store dos attributes = yes map readonly = yes map archive = yes map system = yes map hidden = yes ea support = yes dmapi support = no unix extensions = no socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 strict allocate = yes gpfs:prealloc = yes if you don't configure using the registry you need to maintain the smb.conf file on all your nodes and i am not diving into how to setup the registry, but for the people who care, michael adam's presentation is a good starting point http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf also depending on your authentication/idmap setup , there are multiple changes/additions that needs to be made. on the gpfs side you should set the following config parameters : cifsBypassShareLocksOnRename=yes syncSambaMetadataOps=yes cifsBypassTraversalChecking=yes allowSambaCaseInsensitiveLookup=no your filesystem should have the following settings configured : -D nfs4 File locking semantics in effect -k nfs4 ACL semantics in effect -o nfssync Additional mount options --fastea Yes Fast external attributes enabled? -S relatime Suppress atime mount option the -S is a performance optimization, but you need to check if your backup/av or other software can deal with it, it essentially reduces the frequency of atime updates to once every 24 hours which is the new default on nfs mounts and others as well. a lot of the configuration parameters above are also above and beyond locking and data integrity, they are for better windows compatibility and should be checked if applicable for your environment. i would also recommend to run on GPFS 3.5 TL3 or newer to get the proper GPFS level of fixes for this type of configurations. i would like to repeat that i don't write this email to encourage people to all go start installing/ configuring samba on top of GPFS as i pointed out that you are kind on your own unless you can read source code and/or have somebody who does and is able to help as soon as you run into a problem. the main point of sharing this information is to clarify a lot of misinformation out there and provide the people who already have setups that are incorrect the information to at least not run into data corruption issues do to wrong configuration. Sven From: Chair To: gpfsug-discuss at gpfsug.org Date: 12/16/2013 03:31 AM Subject: Re: [gpfsug-discuss] GPFS and both Samba and NFS Sent by: gpfsug-discuss-bounces at gpfsug.org Allo Just jumping in here a minute: > It is unworthy of an IBM employee to spread such inaccurate misinformation. Whilst this may be inaccurate - I very, very, much doubt that IBM or their employees have a secret master plan to spread misinformation (!) 
In the spirit of this group, let's work together to technically look at such issues. Sven, if that is the case, perhaps you could crib the lines of code / show your methodology that supports your views / experience. Regards, Jez -- UG Chair On 16/12/13 11:21, Jonathan Buzzard wrote: > On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: > > [SNIP] > >> the only way to get something working (don't get confused with >> officially Supported) is to recompile the CTDB src packages AND the >> Samba src packages on a node that has GPFS already installed. also the >> inclusion of CTDB into Samba will not address this, its just a more >> convenient packaging. >> >> Only if the build happens on such a node things like the vfs modules >> for GPFS are build and included in the package. >> > That is a factually inaccurate statement. There is nothing in CTDB that > is GPFS specific. Trust me I have examined the code closely to determine > if this is the case. So unless this has changed recently you are flat > out wrong. > > Consequently there is no requirement whatsoever to rebuild CTDB to get > the vfs_gpfs module. In addition there is also no requirement to > actually have GPFS installed to build the vfs_gpfs module either. What > you need to have is the GPFS GPL header files and nothing else. As it is > a loadable VFS module linking takes place at load time not compile time. > > It is unworthy of an IBM employee to spread such inaccurate > misinformation. > > [SNIP] > >> said all this the binaries alone are only part of the Solution, after >> you have the correct packages, you need to properly configuration the >> system and setting all the right options (on GPFS as well as on CTDB >> and smbd.conf), which unfortunate are very System configuration >> specific, as otherwise you still can end up with data corruption if >> not set right. > Indeed. However I know not only what those options are, but also what > they do despite IBM's refusal to tell us anything about them. > > I would also point out that there are sites that where running Samba on > top of GPFS for many years before IBM began offering their > SONAS/Storwize Unifed products. > > > JAB. > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From chair at gpfsug.org Mon Dec 16 16:05:26 2013 From: chair at gpfsug.org (Chair) Date: Mon, 16 Dec 2013 16:05:26 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AF24C6.1000105@gpfsug.org> Hi Sven, Many thanks for taking the time to write back in detail. This is exactly the sort of discussion the user group is aimed at - I.E. technical discussion outside of the GPFS Developer Forum. I heartily encourage other IBMers to get involved. Regards, Jez On 16/12/13 15:31, Sven Oehme wrote: > Jez, > > the other replies to my email where kind of unexpected as i was indeed > intending to help, i will continue to provide details and try to > answer serious questions and comments. 
> > also as you point out there is no secret master plan to spread > misinformation, the topic is simply complicated and there are reasons > that are hard to explain why there is no full official support outside > the current shipping IBM products and i will not dive into discussions > around it. > > a few words of clarification, on the controversial point, compile ctdb > yes/no. > while its technically correct, that you don't have to recompile ctdb > and it has no code that links/depends in any form on GPFS (samba > does), you still have to compile one if the one shipped with your > distro is out of date and while i didn't explicitly wrote it this way, > this is what i intended to say. > > the version RH ships is very old and since has gotten many fixes which > i highly recommend people to use unless the distros will pickup more > recent version like RHEL7 will do. i probably wouldn't recommend a > daily git build, but rather the packaged versions that are hosted on > ftp.samba.org like : > _https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/_ > > so the proposed order of things would be to install gpfs, pull the src > package of ctdb, compile and install ctdb and the devel packages, then > pull a recent samba src package , install all the dependencies and > build samba on this same host with gpfs and ctdb packages already > installed. the resulting rpm's should contain the proper code to > continue. > > after you got your versions compiled, installed and the basics of ctdb > running, you should use the following smb.conf as a starting point : > > [global] > netbios name = cluster > fileid:mapping = fsname > gpfs:sharemodes = yes > gpfs:leases = yes > gpfs:dfreequota = yes > gpfs:hsm = yes > syncops:onmeta = no > kernel oplocks = no > level2 oplocks = yes > notify:inotify = no > vfs objects = shadow_copy2 syncops gpfs fileid > shadow:snapdir = .snapshots > shadow:fixinodes = yes > shadow:snapdirseverywhere = yes > shadow:sort = desc > wide links = no > async smb echo handler = yes > smbd:backgroundqueue = False > use sendfile = no > strict locking = yes > posix locking = yes > force unknown acl user = yes > nfs4:mode = simple > nfs4:chown = yes > nfs4:acedup = merge > nfs4:sidmap = /etc/samba/sidmap.tdb > gpfs:winattr = yes > store dos attributes = yes > map readonly = yes > map archive = yes > map system = yes > map hidden = yes > ea support = yes > dmapi support = no > unix extensions = no > socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 > TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 > strict allocate = yes > gpfs:prealloc = yes > > > if you don't configure using the registry you need to maintain the > smb.conf file on all your nodes and i am not diving into how to setup > the registry, but for the people who care, michael adam's presentation > is a good starting point > _http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf_ > > > also depending on your authentication/idmap setup , there are multiple > changes/additions that needs to be made. > > on the gpfs side you should set the following config parameters : > > cifsBypassShareLocksOnRename=yes > syncSambaMetadataOps=yes > cifsBypassTraversalChecking=yes > allowSambaCaseInsensitiveLookup=no > > your filesystem should have the following settings configured : > > -D nfs4 File locking semantics in effect > -k nfs4 ACL semantics in effect > -o nfssync Additional mount options > --fastea Yes Fast external attributes enabled? 
> -S relatime Suppress atime mount option > > the -S is a performance optimization, but you need to check if your > backup/av or other software can deal with it, it essentially reduces > the frequency of atime updates to once every 24 hours which is the new > default on nfs mounts and others as well. > > a lot of the configuration parameters above are also above and beyond > locking and data integrity, they are for better windows compatibility > and should be checked if applicable for your environment. > > i would also recommend to run on GPFS 3.5 TL3 or newer to get the > proper GPFS level of fixes for this type of configurations. > > i would like to repeat that i don't write this email to encourage > people to all go start installing/ configuring samba on top of GPFS as > i pointed out that you are kind on your own unless you can read source > code and/or have somebody who does and is able to help as soon as you > run into a problem. > the main point of sharing this information is to clarify a lot of > misinformation out there and provide the people who already have > setups that are incorrect the information to at least not run into > data corruption issues do to wrong configuration. > > Sven > > > > > From: Chair > To: gpfsug-discuss at gpfsug.org > Date: 12/16/2013 03:31 AM > Subject: Re: [gpfsug-discuss] GPFS and both Samba and NFS > Sent by: gpfsug-discuss-bounces at gpfsug.org > ------------------------------------------------------------------------ > > > > Allo > > Just jumping in here a minute: > > > It is unworthy of an IBM employee to spread such inaccurate > misinformation. > > Whilst this may be inaccurate - I very, very, much doubt that IBM or > their employees have a secret master plan to spread misinformation (!) > In the spirit of this group, let's work together to technically look at > such issues. > > Sven, if that is the case, perhaps you could crib the lines of code / > show your methodology that supports your views / experience. > > > Regards, > > Jez > -- > UG Chair > > > > On 16/12/13 11:21, Jonathan Buzzard wrote: > > On Fri, 2013-12-13 at 10:14 -0800, Sven Oehme wrote: > > > > [SNIP] > > > >> the only way to get something working (don't get confused with > >> officially Supported) is to recompile the CTDB src packages AND the > >> Samba src packages on a node that has GPFS already installed. also the > >> inclusion of CTDB into Samba will not address this, its just a more > >> convenient packaging. > >> > >> Only if the build happens on such a node things like the vfs modules > >> for GPFS are build and included in the package. > >> > > That is a factually inaccurate statement. There is nothing in CTDB that > > is GPFS specific. Trust me I have examined the code closely to determine > > if this is the case. So unless this has changed recently you are flat > > out wrong. > > > > Consequently there is no requirement whatsoever to rebuild CTDB to get > > the vfs_gpfs module. In addition there is also no requirement to > > actually have GPFS installed to build the vfs_gpfs module either. What > > you need to have is the GPFS GPL header files and nothing else. As it is > > a loadable VFS module linking takes place at load time not compile time. > > > > It is unworthy of an IBM employee to spread such inaccurate > > misinformation. 
> > > > [SNIP] > > > >> said all this the binaries alone are only part of the Solution, after > >> you have the correct packages, you need to properly configuration the > >> system and setting all the right options (on GPFS as well as on CTDB > >> and smbd.conf), which unfortunate are very System configuration > >> specific, as otherwise you still can end up with data corruption if > >> not set right. > > Indeed. However I know not only what those options are, but also what > > they do despite IBM's refusal to tell us anything about them. > > > > I would also point out that there are sites that where running Samba on > > top of GPFS for many years before IBM began offering their > > SONAS/Storwize Unifed products. > > > > > > JAB. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Mon Dec 16 17:31:29 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 17:31:29 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <1387215089.7230.80.camel@buzzard.phy.strath.ac.uk> On Mon, 2013-12-16 at 07:31 -0800, Sven Oehme wrote: [SNIP] > a few words of clarification, on the controversial point, compile ctdb > yes/no. > while its technically correct, that you don't have to recompile ctdb > and it has no code that links/depends in any form on GPFS (samba > does), you still have to compile one if the one shipped with your > distro is out of date and while i didn't explicitly wrote it this way, > this is what i intended to say. > Fair enough, but what you actually said is not remotely close to that. > the version RH ships is very old and since has gotten many fixes which > i highly recommend people to use unless the distros will pickup more > recent version like RHEL7 will do. i probably wouldn't recommend a > daily git build, but rather the packaged versions that are hosted on > ftp.samba.org like : > https://ftp.samba.org/pub/ctdb/packages/redhat/RHEL6/ > > so the proposed order of things would be to install gpfs, pull the src > package of ctdb, compile and install ctdb and the devel packages, As I pointed out that is a pointless exercise. Just install the packages. > then pull a recent samba src package , install all the dependencies > and build samba on this same host with gpfs and ctdb packages already > installed. the resulting rpm's should contain the proper code to > continue. I would in the strongest possible terms recommend using a VM for compiling packages. Anyone compiling and maintaining packages on the production or even for that matter a test GPFS cluster deserves sacking. 
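To make the build route under discussion concrete, here is a rough sketch of the kind of build-VM flow being described. Package names and versions are illustrative only (they are not taken from this thread), and the GPFS GPL headers need to be visible to the samba build (a full GPFS install also works) so that configure picks up vfs_gpfs:

    # on a disposable build VM, never on a cluster node
    rpm -q gpfs.base gpfs.gpl                       # GPFS GPL headers must be present
    yum install rpm-build yum-utils
    yum-builddep ctdb samba                         # or install the BuildRequires by hand
    rpmbuild --rebuild ctdb-2.1-1.el6.src.rpm       # only if the distro ctdb is too old
    rpm -Uvh ~/rpmbuild/RPMS/x86_64/ctdb-*.rpm
    rpmbuild --rebuild samba-3.6.x-y.el6.src.rpm    # configure should now detect GPFS and build vfs_gpfs
    rpm -qlp ~/rpmbuild/RPMS/x86_64/samba-*.rpm | grep gpfs.so   # confirm the module made it in

The resulting binary RPMs are then what gets copied to and installed on the CTDB/Samba nodes.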
> after you got your versions compiled, installed and the basics of ctdb
> running, you should use the following smb.conf as a starting point :
>
> [global]
> netbios name = cluster
> fileid:mapping = fsname
> gpfs:sharemodes = yes
> gpfs:leases = yes
> gpfs:dfreequota = yes
> gpfs:hsm = yes
> syncops:onmeta = no
> kernel oplocks = no
> level2 oplocks = yes
> notify:inotify = no
> vfs objects = shadow_copy2 syncops gpfs fileid
> shadow:snapdir = .snapshots
> shadow:fixinodes = yes
> shadow:snapdirseverywhere = yes
> shadow:sort = desc
> wide links = no
> async smb echo handler = yes
> smbd:backgroundqueue = False
> use sendfile = no
> strict locking = yes
> posix locking = yes
> force unknown acl user = yes
> nfs4:mode = simple
> nfs4:chown = yes
> nfs4:acedup = merge
> nfs4:sidmap = /etc/samba/sidmap.tdb
> gpfs:winattr = yes
> store dos attributes = yes
> map readonly = yes
> map archive = yes
> map system = yes
> map hidden = yes
> ea support = yes
> dmapi support = no
> unix extensions = no
> socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4
> TCP_KEEPIDLE=240 TCP_KEEPINTVL=15
> strict allocate = yes
> gpfs:prealloc = yes

Nothing new there for those in the know. That said, some of those options might not be wanted.

>
> if you don't configure using the registry you need to maintain the
> smb.conf file on all your nodes and i am not diving into how to setup
> the registry, but for the people who care, michael adam's presentation
> is a good starting point
> http://www.samba.org/~obnox/presentations/linux-kongress-2008/lk2008-obnox.pdf
>
> also depending on your authentication/idmap setup , there are multiple
> changes/additions that needs to be made.
>
> on the gpfs side you should set the following config parameters :
>
> cifsBypassShareLocksOnRename=yes
> syncSambaMetadataOps=yes
> cifsBypassTraversalChecking=yes
> allowSambaCaseInsensitiveLookup=no

The cifsBypassTraversalChecking=yes option is very much a site-based choice. You need to understand what it does because it could create a security nightmare if you just randomly enable it. On the other hand, it could be a lifesaver if you are moving from an existing Windows server. Basically it enables Windows security semantics rather than POSIX-based ones. That is, you will no longer need access to all the parent folders of a file in order to access the file. So long as you have permissions on the file itself you are good to go.

> your filesystem should have the following settings configured :
>
> -D nfs4          File locking semantics in effect
> -k nfs4          ACL semantics in effect

Really I would have expected -k samba :-)

> -o nfssync       Additional mount options
> --fastea Yes     Fast external attributes enabled?
> -S relatime      Suppress atime mount option
>
> the -S is a performance optimization, but you need to check if your
> backup/av or other software can deal with it, it essentially reduces
> the frequency of atime updates to once every 24 hours which is the new
> default on nfs mounts and others as well.

How does the -S option impact policy-based tiering? I would approach that with caution if you are doing any tiering or HSM.

(A short sketch of applying and checking these settings on a test cluster follows this message.)

JAB.

--
Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.
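As referenced above, a minimal sketch of how the settings being discussed might be applied and verified on a test cluster. The device name gpfs0 is hypothetical, the mmchconfig options are exactly the undocumented "magic levers" under discussion (so whether they are right for a given site is a separate question), and -S relatime is only accepted on GPFS levels that support it:

    # GPFS side ("magic levers"); -i = take effect immediately and persist
    mmchconfig cifsBypassShareLocksOnRename=yes,syncSambaMetadataOps=yes -i
    mmchconfig cifsBypassTraversalChecking=yes,allowSambaCaseInsensitiveLookup=no -i

    # filesystem level
    mmchfs gpfs0 -D nfs4 -k nfs4      # NFSv4 locking and ACL semantics
    mmchfs gpfs0 -S relatime          # if supported at your GPFS level
    mmlsfs gpfs0                      # check -D, -k, -S and --fastea in the output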
From jonathan at buzzard.me.uk Mon Dec 16 20:46:14 2013 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Mon, 16 Dec 2013 20:46:14 +0000 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> Message-ID: <52AF6696.6090109@buzzard.me.uk> On 16/12/13 15:31, Sven Oehme wrote: [SNIP] > > i would like to repeat that i don't write this email to encourage people > to all go start installing/ configuring samba on top of GPFS as i > pointed out that you are kind on your own unless you can read source > code and/or have somebody who does and is able to help as soon as you > run into a problem. > the main point of sharing this information is to clarify a lot of > misinformation out there and provide the people who already have setups > that are incorrect the information to at least not run into data > corruption issues do to wrong configuration. > I meant to say that is *VERY* sound advice. Roughly speaking if you had not already worked out just about everything in that email and more besides you should not be rolling your own GPFS/Samba/CTDB solution because you don't know what you are doing and it will likely end in tears at some point. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From amsterdamos at gmail.com Tue Dec 17 03:49:44 2013 From: amsterdamos at gmail.com (Adam Wead) Date: Mon, 16 Dec 2013 22:49:44 -0500 Subject: [gpfsug-discuss] GPFS and both Samba and NFS In-Reply-To: <52AF6696.6090109@buzzard.me.uk> References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <52AF6696.6090109@buzzard.me.uk> Message-ID: Hi all, I've been following this discussion because I use GPFS with both NFS and Samba. Although, now I'm a bit concerned because it sounds like this may not be an appropriate thing to do. It seems like the issue is the clustered implementations of Samba and NFS, yes? Or is this for any implementation? I use GPFS 3,4, under RHEL5, but only on one machine. We do not have a cluster of GPFS systems, so it's the plain version of NFS and Samba that comes with RedHat--no CTDB. My one-node GPFS system exports NFS shares to which other servers write data, as well as hosts a few Samba shares for desktop computers. So now I'm curious. Is this wrong to do? I haven't had any problems in the three years I've had it running. To clarify, while NFS and Samba are both writing data to the same GPFS filesystem, they're not writing to the same files or directories. NFS is for server-to-server data, and Samba is for desktop clients, which is relegated to different directories of the same file system. ____________________________________________ Adam Wead Systems and Digital Collections Librarian Rock and Roll Hall of Fame and Museum 216.515.1960 (t) 215.515.1964 (f) On Mon, Dec 16, 2013 at 3:46 PM, Jonathan Buzzard wrote: > On 16/12/13 15:31, Sven Oehme wrote: > > [SNIP] > > > >> i would like to repeat that i don't write this email to encourage people >> to all go start installing/ configuring samba on top of GPFS as i >> pointed out that you are kind on your own unless you can read source >> code and/or have somebody who does and is able to help as soon as you >> run into a problem. 
>> the main point of sharing this information is to clarify a lot of
>> misinformation out there and provide the people who already have setups
>> that are incorrect the information to at least not run into data
>> corruption issues do to wrong configuration.
>>
>
> I meant to say that is *VERY* sound advice. Roughly speaking if you had
> not already worked out just about everything in that email and more besides
> you should not be rolling your own GPFS/Samba/CTDB solution because you
> don't know what you are doing and it will likely end in tears at some point.
>
>
> JAB.
>
> --
> Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
> Fife, United Kingdom.
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jonathan at buzzard.me.uk  Tue Dec 17 21:21:51 2013
From: jonathan at buzzard.me.uk (Jonathan Buzzard)
Date: Tue, 17 Dec 2013 21:21:51 +0000
Subject: [gpfsug-discuss] GPFS and both Samba and NFS
In-Reply-To:
References: <52AB2477.50802@ed.ac.uk> <1386948663.3421.13.camel@buzzard.phy.strath.ac.uk> <52AB2925.1020809@ed.ac.uk> <1387192889.7230.25.camel@buzzard.phy.strath.ac.uk> <52AEE44E.1030805@gpfsug.org> <52AF6696.6090109@buzzard.me.uk>
Message-ID: <52B0C06F.1040007@buzzard.me.uk>

On 17/12/13 03:49, Adam Wead wrote:
> Hi all,
>
> I've been following this discussion because I use GPFS with both NFS and
> Samba. Although, now I'm a bit concerned because it sounds like this
> may not be an appropriate thing to do. It seems like the issue is the
> clustered implementations of Samba and NFS, yes? Or is this for any
> implementation?
>

The issue is that running a GPFS cluster with Samba/NFS etc. on top is not in the same league as installing Linux on a single server with some storage attached and configuring Samba/NFS. The system is much more complex and there are nasty ways in which your data can get corrupted. To proceed, for example, without a functional clone test system would be foolhardy in the extreme.

For example, in my last job I had a functional clone of the live GPFS systems. By functional clone I mean real hardware with similar FC cards, attached to the same type of storage (in this case LSI/Netapp Engenio based), with NSDs of the same type though fewer of them, running the same OS image, same multipathing drivers, same GPFS version, same Samba/CTDB versions.

In addition I then had a standalone Samba server, same OS, same storage, etc., all the same except no GPFS and no CTDB. It was identical apart from the vfs_gpfs module. One of the reasons I chose to compile my own vfs_gpfs and insert it into pucka RHEL is that I wanted to be able to test issues against a known target that I could get support on.

Then, for good measure, a real Windows server to test against, because there is nothing like being able to rule out problems as being down to the client not working properly with SMB. Finally, virtual machines for building my vfs_gpfs modules.

If you think I am being over the top with my test platform, let me assure you that *ALL* of it was absolutely essential for the diagnosis of problems with the system and the generation of fixes at one time or another.

The thing is, if you don't get this without being told, then running a GPFS/Samba/CTDB service is really not for you. Also you need to understand what I call "GPFS and the Dark Arts" aka "Magic levers for Samba", what they do and why you might want them.
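One way to see which of those levers have actually been pulled on an existing cluster: parameters set explicitly with mmchconfig, including the undocumented ones, are recorded in the cluster configuration and should be listed by mmlsconfig, even though (as noted earlier in the thread) they do not show up in mmfsadm dump config. A hypothetical check:

    mmlsconfig | egrep -i 'samba|cifs'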
There are probably only a handful of people outside IBM who understand those, which is why you get warnings from people inside IBM about doing it yourself.

So by all means do it, but make sure you have the test systems in place and a thorough understanding of all the technologies involved, as you are going to have to do support for yourself; you cannot ring IBM or Red Hat and say my GPFS/Samba/CTDB storage cluster is not working, as you are well off the beaten path. Sure, you can get support from IBM for GPFS issues provided it is entirely GPFS related, but if saving from Office 2010 on a shared drive with rich permissions is giving whacked-out file ownership and permission issues, that is going to be down to you to fix.

JAB.

--
Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.

From jfosburg at mdanderson.org  Thu Dec 19 17:00:17 2013
From: jfosburg at mdanderson.org (Fosburgh,Jonathan)
Date: Thu, 19 Dec 2013 17:00:17 +0000
Subject: [gpfsug-discuss] Strange performance issue on interface nodes
Message-ID:

This morning we started getting complaints that NFS mounts from our GPFS filesystem were hanging. After investigating, I found that our interface nodes (we have 2) had load averages over 800, but there was essentially no CPU usage (>99% idle). I rebooted the nodes, which restored service, but in less than an hour the load averages have already climbed higher than 70.

I have no waiters:

[root at d1prpstg2nsd2 ~]# mmlsnode -N waiters -L 2>/dev/null
[root at d1prpstg2nsd2 ~]#

I also have a lot of nfsd and smbd processes in a 'D' state. One of my interface servers also shows the following processes:

root     29901  0.0  0.0 105172   620 ?  D<  10:03  0:00 touch /rsrch1/cnfsSharedRoot/.ha/recovery/10.113.115.56/10.113.115.57.tmp
root     30076  0.0  0.0 115688   860 ?  D   10:03  0:00 ls -A -I *.tmp /rsrch1/cnfsSharedRoot/.ha/recovery/10.113.115.56

Those processes have been running almost an hour.

The cluster is running GPFS 3.5.12, and there are 6 NSD servers and 2 interface servers. Does anyone have thoughts as to what is going on?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
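For readers who hit similar symptoms, a few generic first checks that can help separate a CNFS/Samba problem from a GPFS one. These are standard commands; the PID is taken from the message above, and paths and output will obviously differ per site:

    # what the D-state processes are blocked on
    ps -eo state,pid,wchan:30,cmd | awk '$1 == "D"'
    cat /proc/29901/stack            # kernel stack of a stuck process (RHEL6 and later)

    # GPFS's own view of long waiters on the interface nodes
    mmdiag --waiters
    mmfsadm dump waiters | head -20

    # CNFS membership/state for the interface nodes
    mmlscluster --cnfs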