From orlando.richards at ed.ac.uk Fri Jan 11 11:51:04 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 11 Jan 2013 11:51:04 +0000 Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute In-Reply-To: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> Message-ID: <50EFFCA8.8010300@ed.ac.uk> On 27/09/12 18:49, Barry Evans wrote: > Hello, > > Having a little problem with samba at the moment with the gpfs vfs module loaded. In *every* version of enterprise samba 3.6, when browsing folders in windows explorer a good chunk of the files will come back incorrectly with the 'O' attribute (which is offline - you see this quite often when files have been migrated by HSM off to tape, but there isn't any HSM here). > > I have also been through every version of GPFS from 3.4.0-10 up to 3.5.0-3 to make sure it's not actually GPFS causing this. With each version, the behaviour is perfect in enterprise samba 3.5.18. > > if you prevent the gpfs vfs module from loading in samba, the problem also disappears. > > Anyone else run into this? > > Cheers, > Barry > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > Hi Barry, Did you get anywhere with this? I've just run into it too! -- Orlando -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
From orlando.richards at ed.ac.uk Fri Jan 11 11:53:01 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 11 Jan 2013 11:53:01 +0000 Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute In-Reply-To: <50EFFCA8.8010300@ed.ac.uk> References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> <50EFFCA8.8010300@ed.ac.uk> Message-ID: <50EFFD1D.90601@ed.ac.uk> Oops - never mind, I've found it: https://lists.samba.org/archive/samba-technical/2012-October/087425.html On 11/01/13 11:51, Orlando Richards wrote: > On 27/09/12 18:49, Barry Evans wrote: >> Hello, >> >> Having a little problem with samba at the moment with the gpfs vfs >> module loaded. In *every* version of enterprise samba 3.6, when >> browsing folders in windows explorer a good chunk of the files will >> come back incorrectly with the 'O' attribute (which is offline - you >> see this quite often when files have been migrated by HSM off to tape, >> but there isn't any HSM here). >> >> I have also been through every version of GPFS from 3.4.0-10 up to >> 3.5.0-3 to make sure it's not actually GPFS causing this. With each >> version, the behaviour is perfect in enterprise samba 3.5.18. >> >> if you prevent the gpfs vfs module from loading in samba, the problem >> also disappears. >> >> Anyone else run into this? >> >> Cheers, >> Barry >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > Hi Barry, > > Did you get anywhere with this? I've just run into it too! > > -- > Orlando > > > -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
From bevans at canditmedia.co.uk Fri Jan 11 11:54:36 2013 From: bevans at canditmedia.co.uk (Barry Evans) Date: Fri, 11 Jan 2013 11:54:36 +0000 Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute In-Reply-To: <50EFFD1D.90601@ed.ac.uk> References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> <50EFFCA8.8010300@ed.ac.uk> <50EFFD1D.90601@ed.ac.uk> Message-ID: Too quick! It looks like it's definitely sorted in 3.6.10 if not 3.6.9 Cheers, Barry On 11 Jan 2013, at 11:53, Orlando Richards wrote: > Oops - never mind, I've found it: > > https://lists.samba.org/archive/samba-technical/2012-October/087425.html > > > > On 11/01/13 11:51, Orlando Richards wrote: >> On 27/09/12 18:49, Barry Evans wrote: >>> Hello, >>> >>> Having a little problem with samba at the moment with the gpfs vfs >>> module loaded. In *every* version of enterprise samba 3.6, when >>> browsing folders in windows explorer a good chunk of the files will >>> come back incorrectly with the 'O' attribute (which is offline - you >>> see this quite often when files have been migrated by HSM off to tape, >>> but there isn't any HSM here). >>> >>> I have also been through every version of GPFS from 3.4.0-10 up to >>> 3.5.0-3 to make sure it's not actually GPFS causing this. With each >>> version, the behaviour is perfect in enterprise samba 3.5.18. >>> >>> if you prevent the gpfs vfs module from loading in samba, the problem >>> also disappears. >>> >>> Anyone else run into this? >>> >>> Cheers, >>> Barry >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at gpfsug.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >> >> Hi Barry, >> >> Did you get anywhere with this? I've just run into it too! 
>>
>> --
>> Orlando
>>
>>
>>
>
>
> --
> --
> Dr Orlando Richards
> Information Services
> IT Infrastructure Division
> Unix Section
> Tel: 0131 650 4994
>
> The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

From christian.fey at sva.de Fri Jan 11 12:03:40 2013
From: christian.fey at sva.de (christian.fey at sva.de)
Date: Fri, 11 Jan 2013 13:03:40 +0100
Subject: [gpfsug-discuss] AUTO: Christian Fey is out of the office (returning 25.01.2013)
Message-ID:

I am away until 25.01.2013.

In urgent cases please contact: Lars Möhler (lars.moehler at sva.de), Martina Garland (martina.garland at sva.de), or leave me a message on my mobile voicemail.

Note: This is an automatic reply to your message "Re: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute" sent on 11.01.2013 12:51:04.

This is the only notification you will receive while this person is away.

From orlando.richards at ed.ac.uk Mon Jan 14 09:09:53 2013
From: orlando.richards at ed.ac.uk (Orlando Richards)
Date: Mon, 14 Jan 2013 09:09:53 +0000
Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute
In-Reply-To:
References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> <50EFFCA8.8010300@ed.ac.uk> <50EFFD1D.90601@ed.ac.uk>
Message-ID: <50F3CB61.1030808@ed.ac.uk>

On 11/01/13 11:54, Barry Evans wrote:
> Too quick!
>
> It looks like it's definitely sorted in 3.6.10 if not 3.6.9

With the settings:

 gpfs:winattr = yes
 store dos attributes = yes

it's all working fine under 3.6.10 from Sernet.

Also - snapshots appear to be working properly on 3.6 now, as do dos attributes and SMB2. Whoop! I'm actually considering making the leap from 3.5->3.6 now...
> > Cheers, > Barry > > > > > On 11 Jan 2013, at 11:53, Orlando Richards > wrote: > >> Oops - never mind, I've found it: >> >> https://lists.samba.org/archive/samba-technical/2012-October/087425.html >> >> >> >> On 11/01/13 11:51, Orlando Richards wrote: >>> On 27/09/12 18:49, Barry Evans wrote: >>>> Hello, >>>> >>>> Having a little problem with samba at the moment with the gpfs vfs >>>> module loaded. In *every* version of enterprise samba 3.6, when >>>> browsing folders in windows explorer a good chunk of the files will >>>> come back incorrectly with the 'O' attribute (which is offline - you >>>> see this quite often when files have been migrated by HSM off to tape, >>>> but there isn't any HSM here). >>>> >>>> I have also been through every version of GPFS from 3.4.0-10 up to >>>> 3.5.0-3 to make sure it's not actually GPFS causing this. With each >>>> version, the behaviour is perfect in enterprise samba 3.5.18. >>>> >>>> if you prevent the gpfs vfs module from loading in samba, the problem >>>> also disappears. >>>> >>>> Anyone else run into this? >>>> >>>> Cheers, >>>> Barry >>>> >>>> >>>> _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at gpfsug.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>> >>> Hi Barry, >>> >>> Did you get anywhere with this? I've just run into it too! >>> >>> -- >>> Orlando >>> >>> >>> >> >> >> -- >> -- >> Dr Orlando Richards >> Information Services >> IT Infrastructure Division >> Unix Section >> Tel: 0131 650 4994 >> >> The University of Edinburgh is a charitable body, registered in >> Scotland, with registration number SC005336. > -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
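For reference, a minimal sketch of how the two settings from the message above might sit in smb.conf. Only `gpfs:winattr` and `store dos attributes` come from the thread; the `vfs objects = gpfs` line and the placement in `[global]` are assumptions (the thread only says the gpfs vfs module was loaded), and the same lines could equally go in an individual share section.

```ini
[global]
    # Assumption: load the GPFS VFS module here; the thread does not show this line
    vfs objects = gpfs
    # The two settings quoted in the thread, reported working with Samba 3.6.10
    gpfs:winattr = yes
    store dos attributes = yes
```

Running `testparm` after editing is a reasonable way to confirm that Samba parses the file.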
From linda at epcc.ed.ac.uk Wed Jan 16 16:00:15 2013
From: linda at epcc.ed.ac.uk (Linda Dewar)
Date: Wed, 16 Jan 2013 16:00:15 +0000
Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster
Message-ID: <50F6CE8F.4060900@epcc.ed.ac.uk>

Hello,

Just wondering if anyone else has seen anything like this.

I am using GPFS 3.4.0.7.

I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster.

All clients in both clusters are connected to an IBM BNT G8200 switch.

Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster or at the request of nodes in the Server cluster. This means that GPFS is shut down on the expelled node and its filesystems are unmounted, which is obviously inconvenient to the users.

Basic connectivity (i.e. response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem.

Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for?

Many thanks,

Linda

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

From bevans at canditmedia.co.uk Wed Jan 16 16:09:37 2013
From: bevans at canditmedia.co.uk (Barry Evans)
Date: Wed, 16 Jan 2013 16:09:37 +0000
Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster
In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk>
References: <50F6CE8F.4060900@epcc.ed.ac.uk>
Message-ID:

Hi Linda,

A couple of questions -

* How many clusters in total?
* How many subnets in total, and if more than 2, are they all fully routable to each other?
* Do you run any firewall on the servers or clients that would be blocking port 1191 from/to anywhere?
Cheers, Barry On 16 Jan 2013, at 16:00, Linda Dewar wrote: > Hello, > > Just wondering if anyone else has seen anything like this. > > I am using GPFS 3.4.0.7. > > I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster. > > All clients in both clusters are connected to a IBM BNT G8200 switch > > Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted . Obviously inconvenient to the users. > > Basic connectivity (ie response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem. > > Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for? > > > Many thanks, > > Linda > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jez.Tucker at rushes.co.uk Wed Jan 16 19:58:20 2013 From: Jez.Tucker at rushes.co.uk (Jez Tucker) Date: Wed, 16 Jan 2013 19:58:20 +0000 Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster In-Reply-To: Message-ID: <39571EA9316BE44899D59C7A640C13F5306D6327@WARVWEXC1.uk.deluxe-eu.com> I have something amazingly similar with 3.4.0.10-PTF1. My node is on a remote subnet a couple of switches away past a router. Again, ping etc. is all good, no tx/rx errors but node is expelled by local subnet nodes. Some nodes more common than the others. 
Do you also find that to be the case? I think it could well be worth checking future GPFS version changelogs. Jez From: Barry Evans [mailto:bevans at canditmedia.co.uk] Sent: Wednesday, January 16, 2013 04:09 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Node expulsion from GPFS Cluster Hi Linda, A couple of questions - * How many clusters in total? * How many subnets in total and if more than 2 are they all fully routable to each other? * Do you run any firewall on the server or clients that would be blocking 1191 from/to anywhere? Cheers, Barry On 16 Jan 2013, at 16:00, Linda Dewar > wrote: Hello, Just wondering if anyone else has seen anything like this. I am using GPFS 3.4.0.7. I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster. All clients in both clusters are connected to a IBM BNT G8200 switch Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted . Obviously inconvenient to the users. Basic connectivity (ie response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem. Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for? Many thanks, Linda -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
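A minimal sketch, in Python, of the check behind the firewall question: a successful ping does not show that TCP port 1191 is open between two nodes, so it can help to test the port directly and to print the address each node name resolves to. The helper names and the example node names are hypothetical; only the port number comes from the thread.

```python
import socket

GPFS_DAEMON_PORT = 1191  # the port number mentioned in the thread


def tcp_reachable(host, port=GPFS_DAEMON_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unroutable, or name not resolvable
        return False


def resolved_addresses(host):
    """Return the sorted IP addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def check_nodes(nodes, port=GPFS_DAEMON_PORT):
    """Yield (node, addresses, reachable) for each node name."""
    for node in nodes:
        try:
            addrs = resolved_addresses(node)
        except socket.gaierror:
            addrs = []
        yield node, addrs, tcp_reachable(node, port)


# Hypothetical node names; substitute the names GPFS itself uses for the nodes.
# for node, addrs, ok in check_nodes(["client01", "nsd01"]):
#     print(node, addrs, "reachable" if ok else "NOT reachable")
```

Run it from the node that asked for the expulsion, against the node that was expelled, and then in the other direction. A name that resolves to an unexpected address, or a port that is unreachable in one direction only, would both be worth following up.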
From mattw at vpac.org Wed Jan 16 21:12:28 2013
From: mattw at vpac.org (Matthew Wallis)
Date: Thu, 17 Jan 2013 08:12:28 +1100
Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster
In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk>
References: <50F6CE8F.4060900@epcc.ed.ac.uk>
Message-ID: <66FB5812-2F3A-46BE-A9EE-594CAF47AE41@vpac.org>

Hi Linda,

I might be stating the obvious, but do make sure you check the logs on the node that requested the expulsion. Frequently the master nodes don't get the full details of why a node is to be expelled; they just get the expulsion request and log the action.

If you check the node that made the request, it often has exactly why it made the request, usually a reachability issue. From that node, copy and paste the name of the host it asked to have expelled and do a host lookup, to make sure it's getting the right IP or interface.

Usually at that point it turns out that someone added the expelled node with the wrong interface.

Matt.

On 17/01/2013, at 3:00 AM, Linda Dewar wrote:

> Hello,
>
> Just wondering if anyone else has seen anything like this.
>
> I am using GPFS 3.4.0.7.
>
> I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster.
>
> All clients in both clusters are connected to a IBM BNT G8200 switch
>
> Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted . Obviously inconvenient to the users.
>
> Basic connectivity (ie response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem.
> > Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for? > > > Many thanks, > > Linda > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From j.g.c.monk at dundee.ac.uk Wed Jan 16 16:40:16 2013 From: j.g.c.monk at dundee.ac.uk (Jonathan Monk) Date: Wed, 16 Jan 2013 16:40:16 +0000 Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk> References: <50F6CE8F.4060900@epcc.ed.ac.uk> Message-ID: <21EC389A5E46094A90D2AE61E7445FFB3468CFC1@DBXPRD0410MB359.eurprd04.prod.outlook.com> We saw something similar with disk leases that was due to load on the manager node but that was with a much simpler config than you describe. Do you have a sample extract from mmfs.log ? Jonathan ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org on behalf of Linda Dewar Sent: Wednesday, January 16, 2013 4:00:15 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster Hello, Just wondering if anyone else has seen anything like this. I am using GPFS 3.4.0.7. I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster. All clients in both clusters are connected to a IBM BNT G8200 switch Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted . Obviously inconvenient to the users. 
Basic connectivity (ie response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem. Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for? Many thanks, Linda -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss The University of Dundee is a registered Scottish Charity, No: SC015096 From orlando.richards at ed.ac.uk Fri Jan 11 11:51:04 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 11 Jan 2013 11:51:04 +0000 Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute In-Reply-To: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> Message-ID: <50EFFCA8.8010300@ed.ac.uk> On 27/09/12 18:49, Barry Evans wrote: > Hello, > > Having a little problem with samba at the moment with the gpfs vfs module loaded. In *every* version of enterprise samba 3.6, when browsing folders in windows explorer a good chunk of the files will come back incorrectly with the 'O' attribute (which is offline - you see this quite often when files have been migrated by HSM off to tape, but there isn't any HSM here). > > I have also been through every version of GPFS from 3.4.0-10 up to 3.5.0-3 to make sure it's not actually GPFS causing this. With each version, the behaviour is perfect in enterprise samba 3.5.18. > > if you prevent the gpfs vfs module from loading in samba, the problem also disappears. > > Anyone else run into this? 
> > Cheers, > Barry > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > Hi Barry, Did you get anywhere with this? I've just run into it too! -- Orlando -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From orlando.richards at ed.ac.uk Fri Jan 11 11:53:01 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 11 Jan 2013 11:53:01 +0000 Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute In-Reply-To: <50EFFCA8.8010300@ed.ac.uk> References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> <50EFFCA8.8010300@ed.ac.uk> Message-ID: <50EFFD1D.90601@ed.ac.uk> Oops - never mind, I've found it: https://lists.samba.org/archive/samba-technical/2012-October/087425.html On 11/01/13 11:51, Orlando Richards wrote: > On 27/09/12 18:49, Barry Evans wrote: >> Hello, >> >> Having a little problem with samba at the moment with the gpfs vfs >> module loaded. In *every* version of enterprise samba 3.6, when >> browsing folders in windows explorer a good chunk of the files will >> come back incorrectly with the 'O' attribute (which is offline - you >> see this quite often when files have been migrated by HSM off to tape, >> but there isn't any HSM here). >> >> I have also been through every version of GPFS from 3.4.0-10 up to >> 3.5.0-3 to make sure it's not actually GPFS causing this. With each >> version, the behaviour is perfect in enterprise samba 3.5.18. >> >> if you prevent the gpfs vfs module from loading in samba, the problem >> also disappears. >> >> Anyone else run into this? 
>> >> Cheers, >> Barry >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > Hi Barry, > > Did you get anywhere with this? I've just run into it too! > > -- > Orlando > > > -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From bevans at canditmedia.co.uk Fri Jan 11 11:54:36 2013 From: bevans at canditmedia.co.uk (Barry Evans) Date: Fri, 11 Jan 2013 11:54:36 +0000 Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute In-Reply-To: <50EFFD1D.90601@ed.ac.uk> References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> <50EFFCA8.8010300@ed.ac.uk> <50EFFD1D.90601@ed.ac.uk> Message-ID: Too quick! It looks like it's definitely sorted in 3.6.10 if not 3.6.9 Cheers, Barry On 11 Jan 2013, at 11:53, Orlando Richards wrote: > Oops - never mind, I've found it: > > https://lists.samba.org/archive/samba-technical/2012-October/087425.html > > > > On 11/01/13 11:51, Orlando Richards wrote: >> On 27/09/12 18:49, Barry Evans wrote: >>> Hello, >>> >>> Having a little problem with samba at the moment with the gpfs vfs >>> module loaded. In *every* version of enterprise samba 3.6, when >>> browsing folders in windows explorer a good chunk of the files will >>> come back incorrectly with the 'O' attribute (which is offline - you >>> see this quite often when files have been migrated by HSM off to tape, >>> but there isn't any HSM here). >>> >>> I have also been through every version of GPFS from 3.4.0-10 up to >>> 3.5.0-3 to make sure it's not actually GPFS causing this. With each >>> version, the behaviour is perfect in enterprise samba 3.5.18. >>> >>> if you prevent the gpfs vfs module from loading in samba, the problem >>> also disappears. >>> >>> Anyone else run into this? 
>>> >>> Cheers, >>> Barry >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at gpfsug.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >> >> Hi Barry, >> >> Did you get anywhere with this? I've just run into it too! >> >> -- >> Orlando >> >> >> > > > -- > -- > Dr Orlando Richards > Information Services > IT Infrastructure Division > Unix Section > Tel: 0131 650 4994 > > The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.fey at sva.de Fri Jan 11 12:03:40 2013 From: christian.fey at sva.de (christian.fey at sva.de) Date: Fri, 11 Jan 2013 13:03:40 +0100 Subject: [gpfsug-discuss] =?iso-8859-1?q?AUTO=3A_Christian_Fey_ist_au=DFer?= =?iso-8859-1?q?_Haus_=28R=FCckkehr_am_25=2E01=2E2013=29?= Message-ID: Ich bin bis 25.01.2013 abwesend. In dringenden F?llen wenden Sie sich bitte an: Lars M?hler (lars.moehler at sva.de), Martina Garland (martina.garland at sva.de) oder hinterlassen mir eine Nachricht auf der Mobilbox. Hinweis: Dies ist eine automatische Antwort auf Ihre Nachricht "Re: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute" gesendet am 11.01.2013 12:51:04. Diese ist die einzige Benachrichtigung, die Sie empfangen werden, w?hrend diese Person abwesend ist. -------------- next part -------------- An HTML attachment was scrubbed... URL: From orlando.richards at ed.ac.uk Mon Jan 14 09:09:53 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 14 Jan 2013 09:09:53 +0000 Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute In-Reply-To: References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> <50EFFCA8.8010300@ed.ac.uk> <50EFFD1D.90601@ed.ac.uk> Message-ID: <50F3CB61.1030808@ed.ac.uk> On 11/01/13 11:54, Barry Evans wrote: > Too quick! 
> > It looks like it's definitely sorted in 3.6.10 if not 3.6.9 With the settings: gpfs:winattr = yes store dos attributes=yes it's all working fine under 3.6.10 from Sernet. Also - snapshots appear to be working properly on 3.6 now, as do dos attributes and SMB2. Whoop! I'm actually considering making the leap from 3.5->3.6 now... > > Cheers, > Barry > > > > > On 11 Jan 2013, at 11:53, Orlando Richards > wrote: > >> Oops - never mind, I've found it: >> >> https://lists.samba.org/archive/samba-technical/2012-October/087425.html >> >> >> >> On 11/01/13 11:51, Orlando Richards wrote: >>> On 27/09/12 18:49, Barry Evans wrote: >>>> Hello, >>>> >>>> Having a little problem with samba at the moment with the gpfs vfs >>>> module loaded. In *every* version of enterprise samba 3.6, when >>>> browsing folders in windows explorer a good chunk of the files will >>>> come back incorrectly with the 'O' attribute (which is offline - you >>>> see this quite often when files have been migrated by HSM off to tape, >>>> but there isn't any HSM here). >>>> >>>> I have also been through every version of GPFS from 3.4.0-10 up to >>>> 3.5.0-3 to make sure it's not actually GPFS causing this. With each >>>> version, the behaviour is perfect in enterprise samba 3.5.18. >>>> >>>> if you prevent the gpfs vfs module from loading in samba, the problem >>>> also disappears. >>>> >>>> Anyone else run into this? >>>> >>>> Cheers, >>>> Barry >>>> >>>> >>>> _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at gpfsug.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>> >>> Hi Barry, >>> >>> Did you get anywhere with this? I've just run into it too! >>> >>> -- >>> Orlando >>> >>> >>> >> >> >> -- >> -- >> Dr Orlando Richards >> Information Services >> IT Infrastructure Division >> Unix Section >> Tel: 0131 650 4994 >> >> The University of Edinburgh is a charitable body, registered in >> Scotland, with registration number SC005336. 
> -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From linda at epcc.ed.ac.uk Wed Jan 16 16:00:15 2013 From: linda at epcc.ed.ac.uk (Linda Dewar) Date: Wed, 16 Jan 2013 16:00:15 +0000 Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster Message-ID: <50F6CE8F.4060900@epcc.ed.ac.uk> Hello, Just wondering if anyone else has seen anything like this. I am using GPFS 3.4.0.7. I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster. All clients in both clusters are connected to a IBM BNT G8200 switch Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted . Obviously inconvenient to the users. Basic connectivity (ie response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem. Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for? Many thanks, Linda -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From bevans at canditmedia.co.uk Wed Jan 16 16:09:37 2013 From: bevans at canditmedia.co.uk (Barry Evans) Date: Wed, 16 Jan 2013 16:09:37 +0000 Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk> References: <50F6CE8F.4060900@epcc.ed.ac.uk> Message-ID: Hi Linda, A couple of questions - * How many clusters in total? 
* How many subnets in total and if more than 2 are they all fully routable to each other? * Do you run any firewall on the server or clients that would be blocking 1191 from/to anywhere? Cheers, Barry On 16 Jan 2013, at 16:00, Linda Dewar wrote: > Hello, > > Just wondering if anyone else has seen anything like this. > > I am using GPFS 3.4.0.7. > > I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster. > > All clients in both clusters are connected to a IBM BNT G8200 switch > > Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted . Obviously inconvenient to the users. > > Basic connectivity (ie response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem. > > Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for? > > > Many thanks, > > Linda > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jez.Tucker at rushes.co.uk Wed Jan 16 19:58:20 2013 From: Jez.Tucker at rushes.co.uk (Jez Tucker) Date: Wed, 16 Jan 2013 19:58:20 +0000 Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster In-Reply-To: Message-ID: <39571EA9316BE44899D59C7A640C13F5306D6327@WARVWEXC1.uk.deluxe-eu.com> I have something amazingly similar with 3.4.0.10-PTF1. 
My node is on a remote subnet a couple of switches away, past a router. Again, ping etc. is all good, with no tx/rx errors, but the node is expelled by local subnet nodes. Some nodes more commonly than others. Do you also find that to be the case?

I think it could well be worth checking future GPFS version changelogs.

Jez
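
The checks raised in this thread (is TCP port 1191 blocked anywhere, and does the requesting node resolve the expelled host to the expected address) could be scripted roughly as below. This is only a minimal sketch, not something posted by any of the authors: the function names are made up for illustration, the port number is simply the one Barry mentions, and a successful connection at one moment does not rule out an intermittent fault.

```python
import socket

GPFS_DAEMON_PORT = 1191  # the port mentioned in the thread; adjust if your site differs


def resolve_host(name):
    """Return the sorted, de-duplicated addresses that `name` resolves to on this node."""
    infos = socket.getaddrinfo(name, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


def port_reachable(host, port=GPFS_DAEMON_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed name lookups.
        return False


def check_nodes(nodes, port=GPFS_DAEMON_PORT, timeout=3.0):
    """Map each node name to its resolved addresses and whether `port` is reachable."""
    report = {}
    for node in nodes:
        try:
            addresses = resolve_host(node)
        except socket.gaierror:
            addresses = []  # the name did not resolve on this node
        report[node] = {
            "addresses": addresses,
            "reachable": port_reachable(node, port, timeout),
        }
    return report
```

Run from the node that requested the expulsion, something like check_nodes(["client01.example.org"]) (a placeholder host name) would show which address that node resolves for the expelled host, which is the comparison to make against the interface the cluster is expected to use.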
From mattw at vpac.org  Wed Jan 16 21:12:28 2013
From: mattw at vpac.org (Matthew Wallis)
Date: Thu, 17 Jan 2013 08:12:28 +1100
Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster
In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk>
References: <50F6CE8F.4060900@epcc.ed.ac.uk>
Message-ID: <66FB5812-2F3A-46BE-A9EE-594CAF47AE41@vpac.org>

Hi Linda,

I might be stating the obvious, but do make sure you check the logs on the node that requested the expulsion. Frequently the master nodes don't get the full details of why a node is to be expelled; they just get the expulsion request and log the action. If you check the node that made the request, its log often says exactly why it made the request, usually a reachability issue. From that node, copy and paste the name of the host it asked to have expelled and do a host lookup to make sure it's getting the right IP or interface.

Usually at that point it turns out that someone added the expelled node with the wrong interface.

Matt.

From j.g.c.monk at dundee.ac.uk  Wed Jan 16 16:40:16 2013
From: j.g.c.monk at dundee.ac.uk (Jonathan Monk)
Date: Wed, 16 Jan 2013 16:40:16 +0000
Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster
In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk>
References: <50F6CE8F.4060900@epcc.ed.ac.uk>
Message-ID: <21EC389A5E46094A90D2AE61E7445FFB3468CFC1@DBXPRD0410MB359.eurprd04.prod.outlook.com>

We saw something similar with disk leases that was due to load on the manager node, but that was with a much simpler config than you describe. Do you have a sample extract from mmfs.log?

Jonathan

The University of Dundee is a registered Scottish Charity, No: SC015096

From bevans at canditmedia.co.uk  Fri Jan 11 11:54:36 2013
From: bevans at canditmedia.co.uk (Barry Evans)
Date: Fri, 11 Jan 2013 11:54:36 +0000
Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute
In-Reply-To: <50EFFD1D.90601@ed.ac.uk>
References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> <50EFFCA8.8010300@ed.ac.uk> <50EFFD1D.90601@ed.ac.uk>
Message-ID:

Too quick!

It looks like it's definitely sorted in 3.6.10 if not 3.6.9

Cheers,
Barry

On 11 Jan 2013, at 11:53, Orlando Richards wrote:

> Oops - never mind, I've found it:
>
> https://lists.samba.org/archive/samba-technical/2012-October/087425.html

From christian.fey at sva.de  Fri Jan 11 12:03:40 2013
From: christian.fey at sva.de (christian.fey at sva.de)
Date: Fri, 11 Jan 2013 13:03:40 +0100
Subject: [gpfsug-discuss] AUTO: Christian Fey is out of the office (returning 25.01.2013)
Message-ID:

I am away until 25.01.2013.

In urgent cases please contact Lars Möhler (lars.moehler at sva.de) or Martina Garland (martina.garland at sva.de), or leave me a message on my mobile voicemail.

Note: This is an automatic reply to your message "Re: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute" sent on 11.01.2013 12:51:04.

This is the only notification you will receive while this person is away.

From orlando.richards at ed.ac.uk  Mon Jan 14 09:09:53 2013
From: orlando.richards at ed.ac.uk (Orlando Richards)
Date: Mon, 14 Jan 2013 09:09:53 +0000
Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute
In-Reply-To:
References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> <50EFFCA8.8010300@ed.ac.uk> <50EFFD1D.90601@ed.ac.uk>
Message-ID: <50F3CB61.1030808@ed.ac.uk>

On 11/01/13 11:54, Barry Evans wrote:
> Too quick!
>
> It looks like it's definitely sorted in 3.6.10 if not 3.6.9

With the settings:

  gpfs:winattr = yes
  store dos attributes = yes

it's all working fine under 3.6.10 from Sernet.

Also - snapshots appear to be working properly on 3.6 now, as do dos attributes and SMB2. Whoop! I'm actually considering making the leap from 3.5->3.6 now...
> -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From linda at epcc.ed.ac.uk Wed Jan 16 16:00:15 2013 From: linda at epcc.ed.ac.uk (Linda Dewar) Date: Wed, 16 Jan 2013 16:00:15 +0000 Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster Message-ID: <50F6CE8F.4060900@epcc.ed.ac.uk> Hello, Just wondering if anyone else has seen anything like this. I am using GPFS 3.4.0.7. I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster. All clients in both clusters are connected to a IBM BNT G8200 switch Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted . Obviously inconvenient to the users. Basic connectivity (ie response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem. Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for? Many thanks, Linda -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From bevans at canditmedia.co.uk Wed Jan 16 16:09:37 2013 From: bevans at canditmedia.co.uk (Barry Evans) Date: Wed, 16 Jan 2013 16:09:37 +0000 Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk> References: <50F6CE8F.4060900@epcc.ed.ac.uk> Message-ID: Hi Linda, A couple of questions - * How many clusters in total? 
* How many subnets in total and if more than 2 are they all fully routable to each other? * Do you run any firewall on the server or clients that would be blocking 1191 from/to anywhere? Cheers, Barry On 16 Jan 2013, at 16:00, Linda Dewar wrote: > Hello, > > Just wondering if anyone else has seen anything like this. > > I am using GPFS 3.4.0.7. > > I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster. > > All clients in both clusters are connected to a IBM BNT G8200 switch > > Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted . Obviously inconvenient to the users. > > Basic connectivity (ie response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem. > > Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for? > > > Many thanks, > > Linda > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jez.Tucker at rushes.co.uk Wed Jan 16 19:58:20 2013 From: Jez.Tucker at rushes.co.uk (Jez Tucker) Date: Wed, 16 Jan 2013 19:58:20 +0000 Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster In-Reply-To: Message-ID: <39571EA9316BE44899D59C7A640C13F5306D6327@WARVWEXC1.uk.deluxe-eu.com> I have something amazingly similar with 3.4.0.10-PTF1. 
My node is on a remote subnet a couple of switches away past a router. Again, ping etc. is all good, no tx/rx errors but node is expelled by local subnet nodes. Some nodes more common than the others. Do you also find that to be the case? I think it could well be worth checking future GPFS version changelogs. Jez From: Barry Evans [mailto:bevans at canditmedia.co.uk] Sent: Wednesday, January 16, 2013 04:09 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Node expulsion from GPFS Cluster Hi Linda, A couple of questions - * How many clusters in total? * How many subnets in total and if more than 2 are they all fully routable to each other? * Do you run any firewall on the server or clients that would be blocking 1191 from/to anywhere? Cheers, Barry On 16 Jan 2013, at 16:00, Linda Dewar > wrote: Hello, Just wondering if anyone else has seen anything like this. I am using GPFS 3.4.0.7. I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster. All clients in both clusters are connected to a IBM BNT G8200 switch Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted . Obviously inconvenient to the users. Basic connectivity (ie response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem. Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for? Many thanks, Linda -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From mattw at vpac.org Wed Jan 16 21:12:28 2013 From: mattw at vpac.org (Matthew Wallis) Date: Thu, 17 Jan 2013 08:12:28 +1100 Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk> References: <50F6CE8F.4060900@epcc.ed.ac.uk> Message-ID: <66FB5812-2F3A-46BE-A9EE-594CAF47AE41@vpac.org> Hi Linda, I might be stating the obvious, but do make sure you check the logs on the node that requested the expulsion. Frequently the master nodes don't get the full details of why a node is to be expelled, they just get the expulsion request, and log the action. If you check the node that made the request, it often has exactly why it made the request, usually a reachability issue, and from that node, you want to do a copy and past of the host it requested expelled, and do a host lookup to make sure it's getting the right IP or interface. Usually at that point it turns out that someone added the expelled node with the wrong interface. Matt. On 17/01/2013, at 3:00 AM, Linda Dewar wrote: > Hello, > > Just wondering if anyone else has seen anything like this. > > I am using GPFS 3.4.0.7. > > I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster. > > All clients in both clusters are connected to a IBM BNT G8200 switch > > Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted . Obviously inconvenient to the users. 
> > Basic connectivity (ie response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem. > > Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for? > > > Many thanks, > > Linda > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From j.g.c.monk at dundee.ac.uk Wed Jan 16 16:40:16 2013 From: j.g.c.monk at dundee.ac.uk (Jonathan Monk) Date: Wed, 16 Jan 2013 16:40:16 +0000 Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk> References: <50F6CE8F.4060900@epcc.ed.ac.uk> Message-ID: <21EC389A5E46094A90D2AE61E7445FFB3468CFC1@DBXPRD0410MB359.eurprd04.prod.outlook.com> We saw something similar with disk leases that was due to load on the manager node but that was with a much simpler config than you describe. Do you have a sample extract from mmfs.log ? Jonathan ________________________________________ From: gpfsug-discuss-bounces at gpfsug.org on behalf of Linda Dewar Sent: Wednesday, January 16, 2013 4:00:15 PM To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster Hello, Just wondering if anyone else has seen anything like this. I am using GPFS 3.4.0.7. I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster. 
All clients in both clusters are connected to a IBM BNT G8200 switch Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted . Obviously inconvenient to the users. Basic connectivity (ie response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem. Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for? Many thanks, Linda -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss The University of Dundee is a registered Scottish Charity, No: SC015096 From orlando.richards at ed.ac.uk Fri Jan 11 11:51:04 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 11 Jan 2013 11:51:04 +0000 Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute In-Reply-To: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> Message-ID: <50EFFCA8.8010300@ed.ac.uk> On 27/09/12 18:49, Barry Evans wrote: > Hello, > > Having a little problem with samba at the moment with the gpfs vfs module loaded. In *every* version of enterprise samba 3.6, when browsing folders in windows explorer a good chunk of the files will come back incorrectly with the 'O' attribute (which is offline - you see this quite often when files have been migrated by HSM off to tape, but there isn't any HSM here). > > I have also been through every version of GPFS from 3.4.0-10 up to 3.5.0-3 to make sure it's not actually GPFS causing this. 
With each version, the behaviour is perfect in enterprise samba 3.5.18. > > if you prevent the gpfs vfs module from loading in samba, the problem also disappears. > > Anyone else run into this? > > Cheers, > Barry > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > Hi Barry, Did you get anywhere with this? I've just run into it too! -- Orlando -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From orlando.richards at ed.ac.uk Fri Jan 11 11:53:01 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Fri, 11 Jan 2013 11:53:01 +0000 Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute In-Reply-To: <50EFFCA8.8010300@ed.ac.uk> References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> <50EFFCA8.8010300@ed.ac.uk> Message-ID: <50EFFD1D.90601@ed.ac.uk> Oops - never mind, I've found it: https://lists.samba.org/archive/samba-technical/2012-October/087425.html On 11/01/13 11:51, Orlando Richards wrote: > On 27/09/12 18:49, Barry Evans wrote: >> Hello, >> >> Having a little problem with samba at the moment with the gpfs vfs >> module loaded. In *every* version of enterprise samba 3.6, when >> browsing folders in windows explorer a good chunk of the files will >> come back incorrectly with the 'O' attribute (which is offline - you >> see this quite often when files have been migrated by HSM off to tape, >> but there isn't any HSM here). >> >> I have also been through every version of GPFS from 3.4.0-10 up to >> 3.5.0-3 to make sure it's not actually GPFS causing this. With each >> version, the behaviour is perfect in enterprise samba 3.5.18. >> >> if you prevent the gpfs vfs module from loading in samba, the problem >> also disappears. 
>> >> Anyone else run into this? >> >> Cheers, >> Barry >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > Hi Barry, > > Did you get anywhere with this? I've just run into it too! > > -- > Orlando > > > -- -- Dr Orlando Richards Information Services IT Infrastructure Division Unix Section Tel: 0131 650 4994 The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From bevans at canditmedia.co.uk Fri Jan 11 11:54:36 2013 From: bevans at canditmedia.co.uk (Barry Evans) Date: Fri, 11 Jan 2013 11:54:36 +0000 Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute In-Reply-To: <50EFFD1D.90601@ed.ac.uk> References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> <50EFFCA8.8010300@ed.ac.uk> <50EFFD1D.90601@ed.ac.uk> Message-ID: Too quick! It looks like it's definitely sorted in 3.6.10 if not 3.6.9 Cheers, Barry On 11 Jan 2013, at 11:53, Orlando Richards wrote: > Oops - never mind, I've found it: > > https://lists.samba.org/archive/samba-technical/2012-October/087425.html > > > > On 11/01/13 11:51, Orlando Richards wrote: >> On 27/09/12 18:49, Barry Evans wrote: >>> Hello, >>> >>> Having a little problem with samba at the moment with the gpfs vfs >>> module loaded. In *every* version of enterprise samba 3.6, when >>> browsing folders in windows explorer a good chunk of the files will >>> come back incorrectly with the 'O' attribute (which is offline - you >>> see this quite often when files have been migrated by HSM off to tape, >>> but there isn't any HSM here). >>> >>> I have also been through every version of GPFS from 3.4.0-10 up to >>> 3.5.0-3 to make sure it's not actually GPFS causing this. With each >>> version, the behaviour is perfect in enterprise samba 3.5.18. >>> >>> if you prevent the gpfs vfs module from loading in samba, the problem >>> also disappears. 
>>> >>> Anyone else run into this? >>> >>> Cheers, >>> Barry >>> >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at gpfsug.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >> >> Hi Barry, >> >> Did you get anywhere with this? I've just run into it too! >> >> -- >> Orlando >> >> >> > > > -- > -- > Dr Orlando Richards > Information Services > IT Infrastructure Division > Unix Section > Tel: 0131 650 4994 > > The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.fey at sva.de Fri Jan 11 12:03:40 2013 From: christian.fey at sva.de (christian.fey at sva.de) Date: Fri, 11 Jan 2013 13:03:40 +0100 Subject: [gpfsug-discuss] =?iso-8859-1?q?AUTO=3A_Christian_Fey_ist_au=DFer?= =?iso-8859-1?q?_Haus_=28R=FCckkehr_am_25=2E01=2E2013=29?= Message-ID: Ich bin bis 25.01.2013 abwesend. In dringenden F?llen wenden Sie sich bitte an: Lars M?hler (lars.moehler at sva.de), Martina Garland (martina.garland at sva.de) oder hinterlassen mir eine Nachricht auf der Mobilbox. Hinweis: Dies ist eine automatische Antwort auf Ihre Nachricht "Re: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute" gesendet am 11.01.2013 12:51:04. Diese ist die einzige Benachrichtigung, die Sie empfangen werden, w?hrend diese Person abwesend ist. -------------- next part -------------- An HTML attachment was scrubbed... URL: From orlando.richards at ed.ac.uk Mon Jan 14 09:09:53 2013 From: orlando.richards at ed.ac.uk (Orlando Richards) Date: Mon, 14 Jan 2013 09:09:53 +0000 Subject: [gpfsug-discuss] Samba 3.6 and the Windows 'O' attribtute In-Reply-To: References: <520271A3-9C81-45AB-AD55-8CD3708C5730@canditmedia.co.uk> <50EFFCA8.8010300@ed.ac.uk> <50EFFD1D.90601@ed.ac.uk> Message-ID: <50F3CB61.1030808@ed.ac.uk> On 11/01/13 11:54, Barry Evans wrote: > Too quick! 
>
> It looks like it's definitely sorted in 3.6.10 if not 3.6.9

With the settings:

  gpfs:winattr = yes
  store dos attributes = yes

it's all working fine under 3.6.10 from Sernet.

Also - snapshots appear to be working properly on 3.6 now, as do dos attributes and SMB2. Whoop! I'm actually considering making the leap from 3.5->3.6 now...

--
--
Dr Orlando Richards
Information Services
IT Infrastructure Division
Unix Section
Tel: 0131 650 4994

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

From linda at epcc.ed.ac.uk Wed Jan 16 16:00:15 2013
From: linda at epcc.ed.ac.uk (Linda Dewar)
Date: Wed, 16 Jan 2013 16:00:15 +0000
Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster
Message-ID: <50F6CE8F.4060900@epcc.ed.ac.uk>

Hello,

Just wondering if anyone else has seen anything like this.

I am using GPFS 3.4.0.7.

I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster.

All clients in both clusters are connected to an IBM BNT G8200 switch.

Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster, or nodes in the Server cluster. This means that GPFS is shut down on the expelled node and filesystems unmounted. Obviously inconvenient to the users.

Basic connectivity (i.e. response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem.

Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for?

Many thanks,

Linda

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

From bevans at canditmedia.co.uk Wed Jan 16 16:09:37 2013
From: bevans at canditmedia.co.uk (Barry Evans)
Date: Wed, 16 Jan 2013 16:09:37 +0000
Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster
In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk>
References: <50F6CE8F.4060900@epcc.ed.ac.uk>
Message-ID:

Hi Linda,

A couple of questions -

* How many clusters in total?
* How many subnets in total, and if more than 2, are they all fully routable to each other?
* Do you run any firewall on the server or clients that would be blocking port 1191 from/to anywhere?

Cheers,
Barry

From Jez.Tucker at rushes.co.uk Wed Jan 16 19:58:20 2013
From: Jez.Tucker at rushes.co.uk (Jez Tucker)
Date: Wed, 16 Jan 2013 19:58:20 +0000
Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster
In-Reply-To:
Message-ID: <39571EA9316BE44899D59C7A640C13F5306D6327@WARVWEXC1.uk.deluxe-eu.com>

I have something amazingly similar with 3.4.0.10-PTF1.
My node is on a remote subnet a couple of switches away, past a router. Again, ping etc. is all good, no tx/rx errors, but the node is expelled by local subnet nodes. Some nodes more commonly than others. Do you also find that to be the case?

I think it could well be worth checking future GPFS version changelogs.

Jez

From mattw at vpac.org Wed Jan 16 21:12:28 2013
From: mattw at vpac.org (Matthew Wallis)
Date: Thu, 17 Jan 2013 08:12:28 +1100
Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster
In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk>
References: <50F6CE8F.4060900@epcc.ed.ac.uk>
Message-ID: <66FB5812-2F3A-46BE-A9EE-594CAF47AE41@vpac.org>

Hi Linda,

I might be stating the obvious, but do make sure you check the logs on the node that requested the expulsion. Frequently the master nodes don't get the full details of why a node is to be expelled; they just get the expulsion request and log the action.

If you check the node that made the request, its log often says exactly why it made the request, usually a reachability issue. From that node, copy and paste the name of the host it asked to have expelled and do a host lookup to make sure it's getting the right IP or interface.

Usually at that point it turns out that someone added the expelled node with the wrong interface.

Matt.

From j.g.c.monk at dundee.ac.uk Wed Jan 16 16:40:16 2013
From: j.g.c.monk at dundee.ac.uk (Jonathan Monk)
Date: Wed, 16 Jan 2013 16:40:16 +0000
Subject: [gpfsug-discuss] Node expulsion from GPFS Cluster
In-Reply-To: <50F6CE8F.4060900@epcc.ed.ac.uk>
References: <50F6CE8F.4060900@epcc.ed.ac.uk>
Message-ID: <21EC389A5E46094A90D2AE61E7445FFB3468CFC1@DBXPRD0410MB359.eurprd04.prod.outlook.com>

We saw something similar with disk leases that was due to load on the manager node, but that was with a much simpler config than you describe.

Do you have a sample extract from mmfs.log?

Jonathan

The University of Dundee is a registered Scottish Charity, No: SC015096
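
Two of the checks suggested in this thread, whether port 1191 is reachable and whether a node name resolves to the expected address, can be sketched in a few lines of Python. This is only a minimal illustration and not a GPFS tool: the function names `resolve`, `can_connect` and `check_node` are made up for this example, the 3-second timeout is an arbitrary assumption, and a successful TCP connect says nothing about disk leases or load on the manager node.

```python
import socket

# Port number taken from the firewall question earlier in the thread.
GPFS_DAEMON_PORT = 1191


def resolve(host):
    """Return the sorted IP addresses that `host` resolves to on this node."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def can_connect(address, port=GPFS_DAEMON_PORT, timeout=3.0):
    """Attempt a plain TCP connect; True if the handshake completes."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_node(host, port=GPFS_DAEMON_PORT, timeout=3.0):
    """Resolve `host`, then test every resulting address on `port`.

    Returns a dict mapping address -> reachable (bool).
    An empty dict means the name did not resolve at all.
    """
    try:
        addresses = resolve(host)
    except socket.gaierror:
        return {}
    return {addr: can_connect(addr, port, timeout) for addr in addresses}
```

It is probably most useful when run on the node that requested the expulsion, for example `check_node("client42")`, where `client42` is a placeholder for the name of the expelled host as it appears in that node's log. If the addresses returned are not on the interface you expect GPFS to use, that matches the "wrong interface" case described above.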