From r.sobey at imperial.ac.uk Fri Sep 1 09:45:24 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 1 Sep 2017 08:45:24 +0000 Subject: [gpfsug-discuss] GPFS GUI Nodes > NSD no data Message-ID: For some time now if I go into the GUI, select Monitoring > Nodes > NSD Server Nodes, the only columns with good data are Name, State and NSD Count. Everything else e.g. Avg Disk Wait Read is listed "N/A". Is this another config option I need to enable? It's been bugging me for a while, I don't think I've seen it work since 4.2.1 which was the first time I saw the GUI. Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From bart.vandamme at sdnsquare.com Fri Sep 1 10:30:59 2017 From: bart.vandamme at sdnsquare.com (Bart Van Damme) Date: Fri, 1 Sep 2017 11:30:59 +0200 Subject: [gpfsug-discuss] SMB2 leases - oplocks - growing files Message-ID: We are a company located in Belgium that mainly implements spectrum scale clusters in the Media and broadcasting industry. Currently we have a customer who wants to export the scale file system over samba 4.5 and 4.6. In these versions the SMB2 leases are activated by default for enhancing the oplocks system. The problem is when this option is not disabled Adobe (and probably Windows) is not notified the size of the file have changed, resulting that reading growing file in Adobe is not working, the timeline is not updated. Does anybody had this issues before and know how to solve it. This is the smb.conf file: ============================ # Global options smb2 leases = yes client use spnego = yes clustering = yes unix extensions = no mangled names = no ea support = yes store dos attributes = yes map readonly = no map archive = yes map system = no force unknown acl user = yes obey pam restrictions = no deadtime = 480 disable netbios = yes server signing = disabled server min protocol = SMB2 smb encrypt = off # We do not allow guest usage. guest ok = no guest account = nobody map to guest = bad user # disable printing load printers = no printing = bsd printcap name = /dev/null disable spoolss = yes # log settings log file = /var/log/samba/log.%m # max 500KB per log file, then rotate max log size = 500 log level = 1 passdb:1 auth:1 winbind:1 idmap:1 #============ Share Definitions ============ [pfs] comment = GPFS path = /gpfs/pfs valid users = @ug_numpr writeable = yes inherit permissions = yes create mask = 664 force create mode = 664 nfs4:chown = yes nfs4:acedup = merge nfs4:mode = special fileid:algorithm = fsname vfs objects = shadow_copy2 gpfs fileid full_audit full_audit:prefix = %u|%I|%m|%S full_audit:success = rename unlink rmdir full_audit:failure = none full_audit:facility = local6 full_audit:priority = NOTICE shadow:fixinodes = yes gpfs:sharemodes = yes gpfs:winattr = yes gpfs:leases = no locking = yes posix locking = yes oplocks = yes kernel oplocks = no Grtz, Bart *Bart Van Damme * *Customer Project Manager* *SDNsquare* Technologiepark 3, 9052 Zwijnaarde, Belgium www.sdnsquare.com T: + 32 9 241 56 01 <09%20241%2056%2001> M: + 32 496 59 23 09 *This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. 
Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email.* Virusvrij. www.avast.com <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Fri Sep 1 14:36:56 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 1 Sep 2017 13:36:56 +0000 Subject: [gpfsug-discuss] GPFS GUI Nodes > NSD no data In-Reply-To: References: Message-ID: Resolved this, guessed at changing GPFSNSDDisk.period to 5. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 01 September 2017 09:45 To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] GPFS GUI Nodes > NSD no data For some time now if I go into the GUI, select Monitoring > Nodes > NSD Server Nodes, the only columns with good data are Name, State and NSD Count. Everything else e.g. Avg Disk Wait Read is listed "N/A". Is this another config option I need to enable? It's been bugging me for a while, I don't think I've seen it work since 4.2.1 which was the first time I saw the GUI. Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewahl at osc.edu Fri Sep 1 21:56:25 2017 From: ewahl at osc.edu (Edward Wahl) Date: Fri, 1 Sep 2017 16:56:25 -0400 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? Message-ID: <20170901165625.6e4edd4c@osc.edu> Howdy. Just noticed this change to min RDMA packet size and I don't seem to see it in any patch notes. Maybe I just skipped the one where this changed? mmlsconfig verbsRdmaMinBytes verbsRdmaMinBytes 16384 (in case someone thinks we changed it) [root at proj-nsd01 ~]# mmlsconfig |grep verbs verbsRdma enable verbsRdma disable verbsRdmasPerConnection 14 verbsRdmasPerNode 1024 verbsPorts mlx5_3/1 verbsPorts mlx4_0 verbsPorts mlx5_0 verbsPorts mlx5_0 mlx5_1 verbsPorts mlx4_1/1 verbsPorts mlx4_1/2 Oddly I also see this in config, though I've seen these kinds of things before. mmdiag --config |grep verbsRdmaMinBytes verbsRdmaMinBytes 8192 We're on a recent efix. Current GPFS build: "4.2.2.3 efix21 (1028007)". -- Ed Wahl Ohio Supercomputer Center 614-292-9302 From akers at vt.edu Fri Sep 1 22:06:15 2017 From: akers at vt.edu (Joshua Akers) Date: Fri, 01 Sep 2017 21:06:15 +0000 Subject: [gpfsug-discuss] Quorum managers Message-ID: Hi all, I was wondering how most people set up quorum managers. We historically had physical admin nodes be the quorum managers, but are switching to a virtualized admin services infrastructure. We have been choosing a few compute nodes to act as quorum managers in our client clusters, but have considered using virtual machines instead. Has anyone else done this? Regards, Josh -- *Joshua D. Akers* *HPC Team Lead* NI&S Systems Support (MC0214) 1700 Pratt Drive Blacksburg, VA 24061 540-231-9506 -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at gmail.com Fri Sep 1 23:42:55 2017 From: oehmes at gmail.com (Sven Oehme) Date: Fri, 01 Sep 2017 22:42:55 +0000 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? 
In-Reply-To: <20170901165625.6e4edd4c@osc.edu> References: <20170901165625.6e4edd4c@osc.edu> Message-ID: Hi Ed, yes the defaults for that have changed for customers who had not overridden the default settings. the reason we did this was that many systems in the field including all ESS systems that come pre-tuned where manually changed to 8k from the 16k default due to better performance that was confirmed in multiple customer engagements and tests with various settings , therefore we change the default to what it should be in the field so people are not bothered to set it anymore (simplification) or get benefits by changing the default to provides better performance. all this happened when we did the communication code overhaul that did lead to significant (think factors) of improved RPC performance for RDMA and VERBS workloads. there is another round of significant enhancements coming soon , that will make even more parameters either obsolete or change some of the defaults for better out of the box performance. i see that we should probably enhance the communication of this changes, not that i think this will have any negative effect compared to what your performance was with the old setting i am actually pretty confident that you get better performance with the new code, but by setting parameters back to default on most 'manual tuned' probably makes your system even faster. if you have a Scale Client on 4.2.3+ you really shouldn't have anything set beside maxfilestocache, pagepool, workerthreads and potential prefetch , if you are a protocol node, this and settings specific to an export (e.g. SMB, NFS set some special settings) , pretty much everything else these days should be set to default so the code can pick the correct parameters., if its not and you get better performance by manual tweaking something i like to hear about it. on the communication side in the next release will eliminate another set of parameters that are now 'auto set' and we plan to work on NSD next. i presented various slides about the communication and simplicity changes in various forums, latest public non NDA slides i presented are here --> http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf hope this helps . Sven On Fri, Sep 1, 2017 at 1:56 PM Edward Wahl wrote: > Howdy. Just noticed this change to min RDMA packet size and I don't seem > to > see it in any patch notes. Maybe I just skipped the one where this > changed? > > mmlsconfig verbsRdmaMinBytes > verbsRdmaMinBytes 16384 > > (in case someone thinks we changed it) > > [root at proj-nsd01 ~]# mmlsconfig |grep verbs > verbsRdma enable > verbsRdma disable > verbsRdmasPerConnection 14 > verbsRdmasPerNode 1024 > verbsPorts mlx5_3/1 > verbsPorts mlx4_0 > verbsPorts mlx5_0 > verbsPorts mlx5_0 mlx5_1 > verbsPorts mlx4_1/1 > verbsPorts mlx4_1/2 > > > Oddly I also see this in config, though I've seen these kinds of things > before. > mmdiag --config |grep verbsRdmaMinBytes > verbsRdmaMinBytes 8192 > > We're on a recent efix. > Current GPFS build: "4.2.2.3 efix21 (1028007)". > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 <(614)%20292-9302> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... 
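A hedged, minimal sketch of how the "set it back to default" advice above can be applied to a single attribute (the attribute is only an example, and some verbs settings take effect only after GPFS is restarted on the affected nodes, so check the mmchconfig man page for your release first):

mmlsconfig verbsRdmaMinBytes                # value recorded in the cluster configuration
mmdiag --config | grep verbsRdmaMinBytes    # value the running daemon is using on this node
mmchconfig verbsRdmaMinBytes=DEFAULT        # remove a manual override and fall back to the shipped default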
URL: From truongv at us.ibm.com Fri Sep 1 23:56:23 2017 From: truongv at us.ibm.com (Truong Vu) Date: Fri, 1 Sep 2017 18:56:23 -0400 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: Message-ID: The discrepancy between the mmlsconfig view and mmdiag has been fixed in GFPS 4.2.3 version. Note, mmdiag reports the correct default value. Tru. From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 09/01/2017 06:43 PM Subject: gpfsug-discuss Digest, Vol 68, Issue 2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=xZHUN9ZlFjvgBmBB8wnX2cQDQQV42R_q-xHubNA3JBM&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: GPFS GUI Nodes > NSD no data (Sobey, Richard A) 2. Change to default for verbsRdmaMinBytes? (Edward Wahl) 3. Quorum managers (Joshua Akers) 4. Re: Change to default for verbsRdmaMinBytes? (Sven Oehme) ---------------------------------------------------------------------- Message: 1 Date: Fri, 1 Sep 2017 13:36:56 +0000 From: "Sobey, Richard A" To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS GUI Nodes > NSD no data Message-ID: Content-Type: text/plain; charset="us-ascii" Resolved this, guessed at changing GPFSNSDDisk.period to 5. From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 01 September 2017 09:45 To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] GPFS GUI Nodes > NSD no data For some time now if I go into the GUI, select Monitoring > Nodes > NSD Server Nodes, the only columns with good data are Name, State and NSD Count. Everything else e.g. Avg Disk Wait Read is listed "N/A". Is this another config option I need to enable? It's been bugging me for a while, I don't think I've seen it work since 4.2.1 which was the first time I saw the GUI. Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20170901_2a4162e9_attachment-2D0001.html&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=jcPGl5zwtQFMbnEmBpNErsD43uwoVeKgKk_8j7ZeCJY&e= > ------------------------------ Message: 2 Date: Fri, 1 Sep 2017 16:56:25 -0400 From: Edward Wahl To: gpfsug main discussion list Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? Message-ID: <20170901165625.6e4edd4c at osc.edu> Content-Type: text/plain; charset="US-ASCII" Howdy. Just noticed this change to min RDMA packet size and I don't seem to see it in any patch notes. Maybe I just skipped the one where this changed? 
mmlsconfig verbsRdmaMinBytes verbsRdmaMinBytes 16384 (in case someone thinks we changed it) [root at proj-nsd01 ~]# mmlsconfig |grep verbs verbsRdma enable verbsRdma disable verbsRdmasPerConnection 14 verbsRdmasPerNode 1024 verbsPorts mlx5_3/1 verbsPorts mlx4_0 verbsPorts mlx5_0 verbsPorts mlx5_0 mlx5_1 verbsPorts mlx4_1/1 verbsPorts mlx4_1/2 Oddly I also see this in config, though I've seen these kinds of things before. mmdiag --config |grep verbsRdmaMinBytes verbsRdmaMinBytes 8192 We're on a recent efix. Current GPFS build: "4.2.2.3 efix21 (1028007)". -- Ed Wahl Ohio Supercomputer Center 614-292-9302 ------------------------------ Message: 3 Date: Fri, 01 Sep 2017 21:06:15 +0000 From: Joshua Akers To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] Quorum managers Message-ID: Content-Type: text/plain; charset="utf-8" Hi all, I was wondering how most people set up quorum managers. We historically had physical admin nodes be the quorum managers, but are switching to a virtualized admin services infrastructure. We have been choosing a few compute nodes to act as quorum managers in our client clusters, but have considered using virtual machines instead. Has anyone else done this? Regards, Josh -- *Joshua D. Akers* *HPC Team Lead* NI&S Systems Support (MC0214) 1700 Pratt Drive Blacksburg, VA 24061 540-231-9506 -------------- next part -------------- An HTML attachment was scrubbed... URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20170901_a49947db_attachment-2D0001.html&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=Gag0raQbp7KZAyINlnmuxlnpjboo9XOWO3dDL2HCsZo&e= > ------------------------------ Message: 4 Date: Fri, 01 Sep 2017 22:42:55 +0000 From: Sven Oehme To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? Message-ID: Content-Type: text/plain; charset="utf-8" Hi Ed, yes the defaults for that have changed for customers who had not overridden the default settings. the reason we did this was that many systems in the field including all ESS systems that come pre-tuned where manually changed to 8k from the 16k default due to better performance that was confirmed in multiple customer engagements and tests with various settings , therefore we change the default to what it should be in the field so people are not bothered to set it anymore (simplification) or get benefits by changing the default to provides better performance. all this happened when we did the communication code overhaul that did lead to significant (think factors) of improved RPC performance for RDMA and VERBS workloads. there is another round of significant enhancements coming soon , that will make even more parameters either obsolete or change some of the defaults for better out of the box performance. i see that we should probably enhance the communication of this changes, not that i think this will have any negative effect compared to what your performance was with the old setting i am actually pretty confident that you get better performance with the new code, but by setting parameters back to default on most 'manual tuned' probably makes your system even faster. if you have a Scale Client on 4.2.3+ you really shouldn't have anything set beside maxfilestocache, pagepool, workerthreads and potential prefetch , if you are a protocol node, this and settings specific to an export (e.g. 
SMB, NFS set some special settings) , pretty much everything else these days should be set to default so the code can pick the correct parameters., if its not and you get better performance by manual tweaking something i like to hear about it. on the communication side in the next release will eliminate another set of parameters that are now 'auto set' and we plan to work on NSD next. i presented various slides about the communication and simplicity changes in various forums, latest public non NDA slides i presented are here --> https://urldefense.proofpoint.com/v2/url?u=http-3A__files.gpfsug.org_presentations_2017_Manchester_08-5FResearch-5FTopics.pdf&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=8c_55Ld_iAC2sr_QU0cyGiOiyU7Z9NjcVknVuRpRIlk&e= hope this helps . Sven On Fri, Sep 1, 2017 at 1:56 PM Edward Wahl wrote: > Howdy. Just noticed this change to min RDMA packet size and I don't seem > to > see it in any patch notes. Maybe I just skipped the one where this > changed? > > mmlsconfig verbsRdmaMinBytes > verbsRdmaMinBytes 16384 > > (in case someone thinks we changed it) > > [root at proj-nsd01 ~]# mmlsconfig |grep verbs > verbsRdma enable > verbsRdma disable > verbsRdmasPerConnection 14 > verbsRdmasPerNode 1024 > verbsPorts mlx5_3/1 > verbsPorts mlx4_0 > verbsPorts mlx5_0 > verbsPorts mlx5_0 mlx5_1 > verbsPorts mlx4_1/1 > verbsPorts mlx4_1/2 > > > Oddly I also see this in config, though I've seen these kinds of things > before. > mmdiag --config |grep verbsRdmaMinBytes > verbsRdmaMinBytes 8192 > > We're on a recent efix. > Current GPFS build: "4.2.2.3 efix21 (1028007)". > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 <(614)%20292-9302> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=xZHUN9ZlFjvgBmBB8wnX2cQDQQV42R_q-xHubNA3JBM&e= > -------------- next part -------------- An HTML attachment was scrubbed... URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20170901_b75cfc74_attachment.html&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=LpVpXMgqE_LD-t_J7yfNwURUrdUR29TzWvjVTi18kpA&e= > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=xZHUN9ZlFjvgBmBB8wnX2cQDQQV42R_q-xHubNA3JBM&e= End of gpfsug-discuss Digest, Vol 68, Issue 2 ********************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Sat Sep 2 10:35:34 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Sat, 2 Sep 2017 09:35:34 +0000 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Message-ID: Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. Pid=5134 Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From truongv at us.ibm.com Sat Sep 2 12:40:15 2017 From: truongv at us.ibm.com (Truong Vu) Date: Sat, 2 Sep 2017 07:40:15 -0400 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log In-Reply-To: References: Message-ID: The dates that have the zone abbreviation are from the scripts which use the OS date command. The daemon has its own format. This inconsistency has been address in 4.2.2. From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 09/02/2017 07:00 AM Subject: gpfsug-discuss Digest, Vol 68, Issue 4 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=wiPE5K_0qzTwdloCshNcSyamVNRJKz5WyOBal7dMz8w&s=pd3-zi8UQxVOjxOYxqbuaFSvv_71WENUBJsw0KUV3ro&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Date formats inconsistent mmfs.log (Sobey, Richard A) ---------------------------------------------------------------------- Message: 1 Date: Sat, 2 Sep 2017 09:35:34 +0000 From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Message-ID: Content-Type: text/plain; charset="us-ascii" Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. 
Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. Pid=5134 Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20170902_4f65f336_attachment-2D0001.html&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=wiPE5K_0qzTwdloCshNcSyamVNRJKz5WyOBal7dMz8w&s=fNT71mM8obJ9rwxzm3Uzxw4mayi2pQg1u950E1raYK4&e= > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=wiPE5K_0qzTwdloCshNcSyamVNRJKz5WyOBal7dMz8w&s=pd3-zi8UQxVOjxOYxqbuaFSvv_71WENUBJsw0KUV3ro&e= End of gpfsug-discuss Digest, Vol 68, Issue 4 ********************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From john.hearns at asml.com Mon Sep 4 08:43:59 2017 From: john.hearns at asml.com (John Hearns) Date: Mon, 4 Sep 2017 07:43:59 +0000 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log In-Reply-To: References: Message-ID: Richard, The date format changed at an update level. We recently updated to 4.2.3 and when you run mmchconfig release=LATEST you are prompted to confirm that the new log format can be used. I guess you might not have cut all nodes over yet on your update over the weekend? Cut and paste from the documentation: mmfsLogTimeStampISO8601={yes | no} Setting this parameter to no allows the cluster to continue running with the earlier log time stamp format. For more information, see Security mode. * Set mmfsLogTimeStampISO8061 to no if you save log information and you are not yet ready to switch to the new log time stamp format. After you complete the migration, you can change the log time stamp format at any time with the mmchconfig command. * Omit this parameter if you are ready to switch to the new format. The default value is yes From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: Saturday, September 02, 2017 11:36 AM To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. 
Pid=5134 Cheers Richard -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Sep 4 09:05:10 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 4 Sep 2017 08:05:10 +0000 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log In-Reply-To: References: , Message-ID: Ah. I'm running 4.2.3 but haven't changed the release level. I'll get that sorted out. Thanks for the replies! Get Outlook for Android ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of John Hearns Sent: Monday, September 4, 2017 8:43:59 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Date formats inconsistent mmfs.log Richard, The date format changed at an update level. We recently updated to 4.2.3 and when you run mmchconfig release=LATEST you are prompted to confirm that the new log format can be used. I guess you might not have cut all nodes over yet on your update over the weekend? Cut and paste from the documentation: mmfsLogTimeStampISO8601={yes | no} Setting this parameter to no allows the cluster to continue running with the earlier log time stamp format. For more information, see Security mode. ? Set mmfsLogTimeStampISO8061 to no if you save log information and you are not yet ready to switch to the new log time stamp format. After you complete the migration, you can change the log time stamp format at any time with the mmchconfig command. ? Omit this parameter if you are ready to switch to the new format. The default value is yes From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: Saturday, September 02, 2017 11:36 AM To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. 
Pid=5134 Cheers Richard -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckrafft at de.ibm.com Mon Sep 4 13:02:49 2017 From: ckrafft at de.ibm.com (Christoph Krafft) Date: Mon, 4 Sep 2017 12:02:49 +0000 Subject: [gpfsug-discuss] Looking for Use-Cases with Spectrum Scale / ESS with vRanger & VMware Message-ID: An HTML attachment was scrubbed... URL: From heiner.billich at psi.ch Mon Sep 4 17:48:20 2017 From: heiner.billich at psi.ch (Billich Heinrich Rainer (PSI)) Date: Mon, 4 Sep 2017 16:48:20 +0000 Subject: [gpfsug-discuss] Use AFM for migration of many small files Message-ID: <467FB293-D33B-46F4-BA1B-A5CB4D28DDE6@psi.ch> Hello, We use AFM prefetch to migrate data between two clusters (using NFS). This works fine with large files, say 1+GB. But we have millions of smaller files, about 1MB each. Here I see just ~150MB/s ? compare this to the 1000+MB/s we get for larger files. I assume that we would need more parallelism, does prefetch pull just one file at a time? So each file needs some or many metadata operations plus a single or just a few read and writes. Doing this sequentially adds up all the latencies of NFS+GPFS. This is my explanation. With larger files gpfs prefetch on home will help. Please can anybody comment: Is this right, does AFM prefetch handle one file at a time in a sequential manner? And is there any way to change this behavior? Or am I wrong and I need to look elsewhere to get better performance for prefetch of many smaller files? We will migrate several filesets in parallel, but still with individual filesets up to 350TB in size 150MB/s isn?t fun. Also just about 150 files/s seconds looks poor. The setup is quite new, hence there may be other places to look at. It?s all RHEL7 an spectrum scale 4.2.2-3 on the afm cache. Thank you, Heiner --, Paul Scherrer Institut Science IT Heiner Billich WHGA 106 CH 5232 Villigen PSI 056 310 36 02 https://www.psi.ch From vpuvvada at in.ibm.com Tue Sep 5 15:27:21 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 5 Sep 2017 19:57:21 +0530 Subject: [gpfsug-discuss] Use AFM for migration of many small files In-Reply-To: <467FB293-D33B-46F4-BA1B-A5CB4D28DDE6@psi.ch> References: <467FB293-D33B-46F4-BA1B-A5CB4D28DDE6@psi.ch> Message-ID: Which version of Spectrum Scale ? What is the fileset mode ? >We use AFM prefetch to migrate data between two clusters (using NFS). This works fine with large files, say 1+GB. But we have millions of smaller files, about 1MB each. Here >I see just ~150MB/s ? compare this to the 1000+MB/s we get for larger files. How was the performance measured ? 
If parallel IO is enabled, AFM uses multiple gateway nodes to prefetch the large files (if file size if more than 1GB). Performance difference between small and lager file is huge (1000MB - 150MB = 850MB) here, and generally it is not the case. How many files were present in list file for prefetch ? Could you also share full internaldump from the gateway node ? >I assume that we would need more parallelism, does prefetch pull just one file at a time? So each file needs some or many metadata operations plus a single or just a few >read and writes. Doing this sequentially adds up all the latencies of NFS+GPFS. This is my explanation. With larger files gpfs prefetch on home will help. AFM prefetches the files on multiple threads. Default flush threads for prefetch are 36 (fileset.afmNumFlushThreads (default 4) + afmNumIOFlushThreads (default 32)). >Please can anybody comment: Is this right, does AFM prefetch handle one file at a time in a sequential manner? And is there any way to change this behavior? Or am I wrong and >I need to look elsewhere to get better performance for prefetch of many smaller files? See above, AFM reads files on multiple threads parallelly. Try increasing the afmNumFlushThreads on fileset and verify if it improves the performance. ~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (PSI)" To: "gpfsug-discuss at spectrumscale.org" Date: 09/04/2017 10:18 PM Subject: [gpfsug-discuss] Use AFM for migration of many small files Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, We use AFM prefetch to migrate data between two clusters (using NFS). This works fine with large files, say 1+GB. But we have millions of smaller files, about 1MB each. Here I see just ~150MB/s ? compare this to the 1000+MB/s we get for larger files. I assume that we would need more parallelism, does prefetch pull just one file at a time? So each file needs some or many metadata operations plus a single or just a few read and writes. Doing this sequentially adds up all the latencies of NFS+GPFS. This is my explanation. With larger files gpfs prefetch on home will help. Please can anybody comment: Is this right, does AFM prefetch handle one file at a time in a sequential manner? And is there any way to change this behavior? Or am I wrong and I need to look elsewhere to get better performance for prefetch of many smaller files? We will migrate several filesets in parallel, but still with individual filesets up to 350TB in size 150MB/s isn?t fun. Also just about 150 files/s seconds looks poor. The setup is quite new, hence there may be other places to look at. It?s all RHEL7 an spectrum scale 4.2.2-3 on the afm cache. Thank you, Heiner --, Paul Scherrer Institut Science IT Heiner Billich WHGA 106 CH 5232 Villigen PSI 056 310 36 02 https://urldefense.proofpoint.com/v2/url?u=https-3A__www.psi.ch&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=4y79Y-3M5sHV1Fm6aUFPEDIl8W5jxVP64XSlBsAYBb4&s=eHcVdovN10-m-Qk0Ln2qvol3pkKNFwrzz2wgf1zXVXE&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=4y79Y-3M5sHV1Fm6aUFPEDIl8W5jxVP64XSlBsAYBb4&s=LbRyuSM_djs0FDXr27hPottQHAn3OGcivpyRcIDBN3U&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kenneth.waegeman at ugent.be Wed Sep 6 12:55:20 2017 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Wed, 6 Sep 2017 13:55:20 +0200 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: <20170901165625.6e4edd4c@osc.edu> Message-ID: Hi Sven, I see two parameters that we have set to non-default values that are not in your list of options still to configure. verbsRdmasPerConnection (256) and socketMaxListenConnections (1024) I remember we had to set socketMaxListenConnections because our cluster consist of +550 nodes. Are these settings still needed, or is this also tackled in the code? Thank you!! Cheers, Kenneth On 02/09/17 00:42, Sven Oehme wrote: > Hi Ed, > > yes the defaults for that have changed for customers who had not > overridden the default settings. the reason we did this was that many > systems in the field including all ESS systems that come pre-tuned > where manually changed to 8k from the 16k default due to better > performance that was confirmed in multiple customer engagements and > tests with various settings , therefore we change the default to what > it should be in the field so people are not bothered to set it anymore > (simplification) or get benefits by changing the default to provides > better performance. > all this happened when we did the communication code overhaul that did > lead to significant (think factors) of improved RPC performance for > RDMA and VERBS workloads. > there is another round of significant enhancements coming soon , that > will make even more parameters either obsolete or change some of the > defaults for better out of the box performance. > i see that we should probably enhance the communication of this > changes, not that i think this will have any negative effect compared > to what your performance was with the old setting i am actually pretty > confident that you get better performance with the new code, but by > setting parameters back to default on most 'manual tuned' probably > makes your system even faster. > if you have a Scale Client on 4.2.3+ you really shouldn't have > anything set beside maxfilestocache, pagepool, workerthreads and > potential prefetch , if you are a protocol node, this and settings > specific to an export (e.g. SMB, NFS set some special settings) , > pretty much everything else these days should be set to default so the > code can pick the correct parameters., if its not and you get better > performance by manual tweaking something i like to hear about it. > on the communication side in the next release will eliminate another > set of parameters that are now 'auto set' and we plan to work on NSD > next. > i presented various slides about the communication and simplicity > changes in various forums, latest public non NDA slides i presented > are here --> > http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf > > hope this helps . > > Sven > > > > On Fri, Sep 1, 2017 at 1:56 PM Edward Wahl > wrote: > > Howdy. Just noticed this change to min RDMA packet size and I > don't seem to > see it in any patch notes. Maybe I just skipped the one where > this changed? 
> > mmlsconfig verbsRdmaMinBytes > verbsRdmaMinBytes 16384 > > (in case someone thinks we changed it) > > [root at proj-nsd01 ~]# mmlsconfig |grep verbs > verbsRdma enable > verbsRdma disable > verbsRdmasPerConnection 14 > verbsRdmasPerNode 1024 > verbsPorts mlx5_3/1 > verbsPorts mlx4_0 > verbsPorts mlx5_0 > verbsPorts mlx5_0 mlx5_1 > verbsPorts mlx4_1/1 > verbsPorts mlx4_1/2 > > > Oddly I also see this in config, though I've seen these kinds of > things before. > mmdiag --config |grep verbsRdmaMinBytes > verbsRdmaMinBytes 8192 > > We're on a recent efix. > Current GPFS build: "4.2.2.3 efix21 (1028007)". > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Sep 6 13:22:41 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 6 Sep 2017 14:22:41 +0200 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: <20170901165625.6e4edd4c@osc.edu> Message-ID: An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Sep 6 13:29:44 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 6 Sep 2017 12:29:44 +0000 Subject: [gpfsug-discuss] Save the date! GPFS-UG meeting at SC17 - Sunday November 12th Message-ID: <7838054B-8A46-46A0-8A53-81E3049B4AE7@nuance.com> The 2017 Supercomputing conference is only 2 months away, and here?s a reminder to come early and attend the GPFS user group meeting. The meeting is tentatively scheduled from the afternoon of Sunday, November 12th. Exact location and times are still being discussed. If you have an interest in presenting at the user group meeting, please let us know. More details in the coming weeks. Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... URL: From damir.krstic at gmail.com Wed Sep 6 13:35:45 2017 From: damir.krstic at gmail.com (Damir Krstic) Date: Wed, 06 Sep 2017 12:35:45 +0000 Subject: [gpfsug-discuss] filesets inside of filesets Message-ID: Today we have following fileset structure on our filesystem: /projects <-- gpfs filesystem /projects/b1000 <-- b1000 is a fileset with a fileset quota applied to it I need to create a fileset or a directory inside of this project and have separate quota applied to it e.g.: /projects/b1000 (b1000 has 10TB quota applied) /projects/b1000/backup (backup has 1TB quota applied) Is this possible? I am thinking nested filesets would work if GPFS supports that. Otherwise, I was going to create a separate filesystem, create corresponding backup filesets on it and symlink them to the /projects/ directory. Thanks in advance. Damir -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Wed Sep 6 13:43:09 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Wed, 6 Sep 2017 12:43:09 +0000 Subject: [gpfsug-discuss] filesets inside of filesets In-Reply-To: References: Message-ID: Filesets in filesets are fine. BUT if you use scoped backups with TSM... 
Er Spectrum Protect, then there are restrictions on creating an IFS inside an IFS ... Simon

From: > on behalf of "damir.krstic at gmail.com" >
Reply-To: "gpfsug-discuss at spectrumscale.org" >
Date: Wednesday, 6 September 2017 at 13:35
To: "gpfsug-discuss at spectrumscale.org" >
Subject: [gpfsug-discuss] filesets inside of filesets

Today we have following fileset structure on our filesystem:
/projects <-- gpfs filesystem
/projects/b1000 <-- b1000 is a fileset with a fileset quota applied to it
I need to create a fileset or a directory inside of this project and have separate quota applied to it e.g.:
/projects/b1000 (b1000 has 10TB quota applied)
/projects/b1000/backup (backup has 1TB quota applied)
Is this possible? I am thinking nested filesets would work if GPFS supports that. Otherwise, I was going to create a separate filesystem, create corresponding backup filesets on it and symlink them to the /projects/ directory. Thanks in advance. Damir
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rohwedder at de.ibm.com Wed Sep 6 13:51:47 2017
From: rohwedder at de.ibm.com (Markus Rohwedder)
Date: Wed, 6 Sep 2017 14:51:47 +0200
Subject: [gpfsug-discuss] filesets inside of filesets
In-Reply-To: References: Message-ID:

Hello Damir,
the files that belong to your fileset "backup" have a separate quota; it is not related to the quota in "b1000". There is no cumulative quota.
Fileset nesting may need other considerations as well, as in some cases filesets behave differently from simple directories:
-> For NFSv4 ACLs, inheritance stops at the fileset boundaries
-> Snapshots include the independent parent and the dependent children. Nested independent filesets are not included in a fileset snapshot.
-> Export protocols like NFS or SMB will cross fileset boundaries and just treat them like a directory.

Mit freundlichen Grüßen / Kind regards

Dr. Markus Rohwedder
Spectrum Scale GUI Development
Phone: +49 7034 6430190 IBM Deutschland
E-Mail: rohwedder at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany
IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina Köderitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

From: Damir Krstic
To: gpfsug main discussion list
Date: 09/06/2017 02:36 PM
Subject: [gpfsug-discuss] filesets inside of filesets
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Today we have following fileset structure on our filesystem:
/projects <-- gpfs filesystem
/projects/b1000 <-- b1000 is a fileset with a fileset quota applied to it
I need to create a fileset or a directory inside of this project and have separate quota applied to it e.g.:
/projects/b1000 (b1000 has 10TB quota applied)
/projects/b1000/backup (backup has 1TB quota applied)
Is this possible? I am thinking nested filesets would work if GPFS supports that. Otherwise, I was going to create a separate filesystem, create corresponding backup filesets on it and symlink them to the /projects/ directory. Thanks in advance. Damir
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
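For completeness, a minimal sketch of the nesting Damir asked about, assuming the device name is projects and that quota enforcement is already enabled on it (mmchfs projects -Q yes); the fileset name below is illustrative, the paths and limits are taken from the original example:

mmcrfileset projects b1000_backup                    # a dependent fileset is enough to carry its own fileset quota
mmlinkfileset projects b1000_backup -J /projects/b1000/backup
mmsetquota projects:b1000_backup --block 1T:1T       # soft:hard block limits for the nested fileset

As noted above, data written under the backup junction then counts only against this quota, not against the 10TB quota of b1000.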
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 1B378274.gif Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From oehmes at gmail.com Wed Sep 6 14:32:40 2017 From: oehmes at gmail.com (Sven Oehme) Date: Wed, 06 Sep 2017 13:32:40 +0000 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: <20170901165625.6e4edd4c@osc.edu> Message-ID: Hi, you still need both of them, but they are both on the list to be removed, the first is already integrated for the next major release, the 2nd we still work on. Sven On Wed, Sep 6, 2017 at 4:55 AM Kenneth Waegeman wrote: > Hi Sven, > > I see two parameters that we have set to non-default values that are not > in your list of options still to configure. > verbsRdmasPerConnection (256) and > socketMaxListenConnections (1024) > > I remember we had to set socketMaxListenConnections because our cluster > consist of +550 nodes. > > Are these settings still needed, or is this also tackled in the code? > > Thank you!! > > Cheers, > Kenneth > > > > On 02/09/17 00:42, Sven Oehme wrote: > > Hi Ed, > > yes the defaults for that have changed for customers who had not > overridden the default settings. the reason we did this was that many > systems in the field including all ESS systems that come pre-tuned where > manually changed to 8k from the 16k default due to better performance that > was confirmed in multiple customer engagements and tests with various > settings , therefore we change the default to what it should be in the > field so people are not bothered to set it anymore (simplification) or get > benefits by changing the default to provides better performance. > all this happened when we did the communication code overhaul that did > lead to significant (think factors) of improved RPC performance for RDMA > and VERBS workloads. > there is another round of significant enhancements coming soon , that will > make even more parameters either obsolete or change some of the defaults > for better out of the box performance. > i see that we should probably enhance the communication of this changes, > not that i think this will have any negative effect compared to what your > performance was with the old setting i am actually pretty confident that > you get better performance with the new code, but by setting parameters > back to default on most 'manual tuned' probably makes your system even > faster. > if you have a Scale Client on 4.2.3+ you really shouldn't have anything > set beside maxfilestocache, pagepool, workerthreads and potential prefetch > , if you are a protocol node, this and settings specific to an export > (e.g. SMB, NFS set some special settings) , pretty much everything else > these days should be set to default so the code can pick the correct > parameters., if its not and you get better performance by manual tweaking > something i like to hear about it. > on the communication side in the next release will eliminate another set > of parameters that are now 'auto set' and we plan to work on NSD next. 
> i presented various slides about the communication and simplicity changes > in various forums, latest public non NDA slides i presented are here --> > http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf > > hope this helps . > > Sven > > > > On Fri, Sep 1, 2017 at 1:56 PM Edward Wahl < ewahl at osc.edu> > wrote: > >> Howdy. Just noticed this change to min RDMA packet size and I don't >> seem to >> see it in any patch notes. Maybe I just skipped the one where this >> changed? >> >> mmlsconfig verbsRdmaMinBytes >> verbsRdmaMinBytes 16384 >> >> (in case someone thinks we changed it) >> >> [root at proj-nsd01 ~]# mmlsconfig |grep verbs >> verbsRdma enable >> verbsRdma disable >> verbsRdmasPerConnection 14 >> verbsRdmasPerNode 1024 >> verbsPorts mlx5_3/1 >> verbsPorts mlx4_0 >> verbsPorts mlx5_0 >> verbsPorts mlx5_0 mlx5_1 >> verbsPorts mlx4_1/1 >> verbsPorts mlx4_1/2 >> >> >> Oddly I also see this in config, though I've seen these kinds of things >> before. >> mmdiag --config |grep verbsRdmaMinBytes >> verbsRdmaMinBytes 8192 >> >> We're on a recent efix. >> Current GPFS build: "4.2.2.3 efix21 (1028007)". >> >> -- >> >> Ed Wahl >> Ohio Supercomputer Center >> 614-292-9302 <%28614%29%20292-9302> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From heiner.billich at psi.ch Wed Sep 6 17:16:18 2017 From: heiner.billich at psi.ch (Billich Heinrich Rainer (PSI)) Date: Wed, 6 Sep 2017 16:16:18 +0000 Subject: [gpfsug-discuss] Use AFM for migration of many small files Message-ID: <7D6EFD03-5D74-4A7B-A0E8-2AD41B050E15@psi.ch> Hello Venkateswara, Edward, Thank you for the comments on how to speed up AFM prefetch with small files. We run 4.2.2-3 and the AFM mode is RO and we have just a single gateway, i.e. no parallel reads for large files. We will try to increase the value of afmNumFlushThreads. It wasn?t clear to me that these threads do read from home, too - at least for prefetch. First I will try a plain NFS mount and see how parallel reads of many small files scale the throughput. Next I will try AFM prefetch. I don?t do nice benchmarking, just watching dstat output. We prefetch 100?000 files in one bunch, so there is ample time to observe. The basic issue is that we get just about 45MB/s for sequential read of many 1000 files with 1MB per file on the home cluster. I.e. we read one file at a time before we switch to the next. This is no surprise. Each read takes about 20ms to complete, so at max we get 50 reads of 1MB per second. We?ve seen this on classical raid storage and on DSS/ESS systems. It?s likely just the physics of spinning disks and the fact that we do one read at a time and don?t allow any parallelism. We wait for one or two I/Os to single disks to complete before we continue With larger files prefetch jumps in and fires many reads in parallel ? To get 1?000MB/s I need to do 1?000 read/s and need to have ~20 reads in progress in parallel all the time ? we?ll see how close we get to 1?000MB/s with ?many small files?. 
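(A rough recipe for that plain-NFS parallel-read test, with the path and the degree of parallelism as placeholders only: on a client that has the home export mounted,

cd /mnt/home-nfs/migration-test                        # plain NFS mount of the home export
find . -type f | xargs -P 20 -n 50 cat > /dev/null     # ~20 reads in flight, i.e. roughly 1000 x 1MB reads/s at 20 ms each

and watch dstat or nfsiostat while it runs; varying -P shows how throughput scales with the number of outstanding reads.)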
Kind regards, Heiner
--
Paul Scherrer Institut
Science IT
Heiner Billich WHGA 106 CH 5232 Villigen PSI 056 310 36 02
https://www.psi.ch

From stijn.deweirdt at ugent.be Wed Sep 6 18:13:48 2017
From: stijn.deweirdt at ugent.be (Stijn De Weirdt)
Date: Wed, 6 Sep 2017 19:13:48 +0200
Subject: [gpfsug-discuss] mixed verbsRdmaSend
Message-ID:

hi all, what is the expected behaviour of a mixed verbsRdmaSend setup: some nodes enabled, most disabled. we have some nodes that have a very high iops workload, but most of the cluster of 500+ nodes does not have such a use case. we enabled verbsRdmaSend on the managers/quorum nodes (<10) and on the few (<10) clients with this workload, but not on the others (500+). it seems to work out fine, but is this acceptable as a config? (the docs mention that enabling verbsRdmaSend on >100 nodes might lead to errors). the nodes use IPoIB as the IP network, and running with verbsRdmaSend disabled on all nodes leads to an unstable cluster (TX errors (<1 error in 1M packets) on some clients leading to gpfs expelling nodes etc). (we still need to open a case with Mellanox to investigate further) many thanks, stijn

From gcorneau at us.ibm.com Thu Sep 7 00:30:23 2017
From: gcorneau at us.ibm.com (Glen Corneau)
Date: Wed, 6 Sep 2017 18:30:23 -0500
Subject: [gpfsug-discuss] Happy 20th birthday GPFS !!
Message-ID:

Sorry I missed the anniversary of your conception (announcement letter) back on August 26th, so I hope you'll accept my belated congratulations on this long and exciting journey!
https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=ca&infotype=an&appname=iSource&supplier=897&letternum=ENUS297-318
I remember your parent, PIOFS, as well! Ahh the fun times.
------------------
Glen Corneau
Power Systems
Washington Systems Center
gcorneau at us.ibm.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: image/jpeg Size: 26117 bytes Desc: not available URL:

From xhejtman at ics.muni.cz Thu Sep 7 16:07:20 2017
From: xhejtman at ics.muni.cz (Lukas Hejtmanek)
Date: Thu, 7 Sep 2017 17:07:20 +0200
Subject: [gpfsug-discuss] Overwriting migrated files
Message-ID: <20170907150720.h3t5fowvdlibvik4@ics.muni.cz>

Hello, we have files of about 100GB per file. Many of these files are migrated to tapes. (GPFS+TSM, tape storage is an external pool and dsmmigrate, dsmrecall are in place). These files are images from the Bacula backup system. When Bacula wants to reuse some of the images, it needs to truncate the file to 64kB and overwrite it. Is there a way not to recall the whole 100GB from tape just to truncate the file?

I tried to do a partial recall:
dsmrecall -D -size=65k Vol03797
after recall processing finished, I tried to truncate the file using:
dd if=/dev/zero of=Vol03797 count=0 bs=64k seek=1
which caused further recall of the whole file:
$ dsmls Vol03797
IBM Spectrum Protect Command Line Space Management Client Interface Client Version 8, Release 1, Level 2.0 Client date/time: 09/07/2017 15:01:59 (c) Copyright by IBM Corporation and other(s) 1990, 2017. All Rights Reserved.
ActS ResS ResB FSt FName
107380819676 10485760 31373312 m (p) Vol03797
and ResB size has been growing to 107380819676. After dd finished:
dsmls Vol03797
IBM Spectrum Protect Command Line Space Management Client Interface Client Version 8, Release 1, Level 2.0 Client date/time: 09/07/2017 15:08:03 (c) Copyright by IBM Corporation and other(s) 1990, 2017. All Rights Reserved.
ActS ResS ResB FSt FName 65536 65536 64 r Vol03797 Is there another way to truncate the file and drop whole migrated part? -- Luk?? Hejtm?nek From john.hearns at asml.com Thu Sep 7 16:15:00 2017 From: john.hearns at asml.com (John Hearns) Date: Thu, 7 Sep 2017 15:15:00 +0000 Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig Message-ID: If I have an AFM setup where the home is located on a generic NFS share, let's say server:/volume/share When I come ot set this up does it make sense to run mmafmconfig on /volume/share ? I can mount the share as a plain old NFS mount in order to run this operation, before I create the cache side fileset. Apologies if I am being dumb as an ox here, and I deserve to be slapped in the face with a wet fish https://en.wikipedia.org/wiki/The_Fish-Slapping_Dance -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil.wilson at metoffice.gov.uk Thu Sep 7 16:33:58 2017 From: neil.wilson at metoffice.gov.uk (Wilson, Neil) Date: Thu, 7 Sep 2017 15:33:58 +0000 Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig In-Reply-To: References: Message-ID: I think you need to configure a gateway node (use mmchnode to change an existing node class to gateway) Then use mmafmconfig to setup export server maps on the gateway node. e.g. mmafmconfig -add "mapping1" -export-map "nfsServerIP"/"GatewayNode" (double quotes not required) mafmconfig show all Map name: mapping1 Export server map: IP/GatewayNode From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 07 September 2017 16:15 To: gpfsug main discussion list Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig If I have an AFM setup where the home is located on a generic NFS share, let's say server:/volume/share When I come ot set this up does it make sense to run mmafmconfig on /volume/share ? I can mount the share as a plain old NFS mount in order to run this operation, before I create the cache side fileset. Apologies if I am being dumb as an ox here, and I deserve to be slapped in the face with a wet fish https://en.wikipedia.org/wiki/The_Fish-Slapping_Dance -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. 
Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt.

From john.hearns at asml.com Thu Sep 7 16:52:19 2017
From: john.hearns at asml.com (John Hearns)
Date: Thu, 7 Sep 2017 15:52:19 +0000
Subject: [gpfsug-discuss] Deletion of a pcache snapshot?
Message-ID:

Firmly lining myself up for a smack round the chops with a wet haddock...

I try to delete an AFM cache fileset which I created a few days ago (it has an NFS home).

mmdelfileset responds that:
Fileset obfuscated has 1 fileset snapshot(s).

When I try to delete the snapshot:
Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user.

I find this reference, which is about as useful as a wet haddock:
https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm

The advice of the gallery is sought, please.

From janusz.malka at desy.de Thu Sep 7 20:23:36 2017
From: janusz.malka at desy.de (Malka, Janusz)
Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST)
Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot?
In-Reply-To:
References:
Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de>

I had a similar issue, I had to recover the connection to home.

From: "John Hearns"
To: "gpfsug main discussion list"
Sent: Thursday, 7 September, 2017 17:52:19
Subject: [gpfsug-discuss] Deletion of a pcache snapshot?

Firmly lining myself up for a smack round the chops with a wet haddock...

I try to delete an AFM cache fileset which I created a few days ago (it has an NFS home).

mmdelfileset responds that:
Fileset obfuscated has 1 fileset snapshot(s).

When I try to delete the snapshot:
Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user.
I find this reference, which is about as useful as a wet haddock: [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Thu Sep 7 22:16:34 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 7 Sep 2017 21:16:34 +0000 Subject: [gpfsug-discuss] SMB2 leases - oplocks - growing files In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Fri Sep 8 03:11:48 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 7 Sep 2017 22:11:48 -0400 Subject: [gpfsug-discuss] mmfsd write behavior Message-ID: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Hi Everyone, This is something that's come up in the past and has recently resurfaced with a project I've been working on, and that is-- it seems to me as though mmfsd never attempts to flush the cache of the block devices its writing to (looking at blktrace output seems to confirm this). Is this actually the case? I've looked at the gpl headers for linux and I don't see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or REQ_FLUSH. I'm sure there's other ways to trigger this behavior that GPFS may very well be using that I've missed. That's why I'm asking :) I figure with FPO being pushed as an HDFS replacement using commodity drives this feature has *got* to be in the code somewhere. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From oehmes at gmail.com Fri Sep 8 03:55:14 2017 From: oehmes at gmail.com (Sven Oehme) Date: Fri, 08 Sep 2017 02:55:14 +0000 Subject: [gpfsug-discuss] mmfsd write behavior In-Reply-To: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> References: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Message-ID: I am not sure what exactly you are looking for but all blockdevices are opened with O_DIRECT , we never cache anything on this layer . 
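just to be clear what that means from user space - O_DIRECT is the open(2) flag, it bypasses the linux page cache, nothing more. a quick sketch of the distinction (illustration only, this is not the mmfsd code path, and the file name is made up):

import mmap, os

# sketch: write through O_DIRECT so the page cache is bypassed. O_DIRECT needs
# aligned buffers, hence the page-aligned anonymous mmap. use a scratch file on
# a filesystem that supports O_DIRECT (tmpfs does not).
path = "odirect.test"                   # made-up scratch file
buf = mmap.mmap(-1, 4096)               # page-aligned 4 KiB buffer
buf.write(b"A" * 4096)

fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o600)
try:
    os.write(fd, buf)                   # bypasses the page cache
    os.fdatasync(fd)                    # the separate, explicit call that asks the
                                        # stack to flush a volatile write cache below
finally:
    os.close(fd)

the direct write itself does not ask the device to flush its volatile cache, that would be the separate fdatasync (or a flush/FUA request at the bio layer).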
On Thu, Sep 7, 2017, 7:11 PM Aaron Knister wrote: > Hi Everyone, > > This is something that's come up in the past and has recently resurfaced > with a project I've been working on, and that is-- it seems to me as > though mmfsd never attempts to flush the cache of the block devices its > writing to (looking at blktrace output seems to confirm this). Is this > actually the case? I've looked at the gpl headers for linux and I don't > see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or > REQ_FLUSH. I'm sure there's other ways to trigger this behavior that > GPFS may very well be using that I've missed. That's why I'm asking :) > > I figure with FPO being pushed as an HDFS replacement using commodity > drives this feature has *got* to be in the code somewhere. > > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Fri Sep 8 04:05:42 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 7 Sep 2017 23:05:42 -0400 Subject: [gpfsug-discuss] mmfsd write behavior In-Reply-To: References: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Message-ID: Thanks Sven. I didn't think GPFS itself was caching anything on that layer, but it's my understanding that O_DIRECT isn't sufficient to force I/O to be flushed (e.g. the device itself might have a volatile caching layer). Take someone using ZFS zvol's as NSDs. I can write() all day log to that zvol (even with O_DIRECT) but there is absolutely no guarantee those writes have been committed to stable storage and aren't just sitting in RAM until an fsync() occurs (or some other bio function that causes a flush). I also don't believe writing to a SATA drive with O_DIRECT will force cache flushes of the drive's writeback cache.. although I just tested that one and it seems to actually trigger a scsi cache sync. Interesting. -Aaron On 9/7/17 10:55 PM, Sven Oehme wrote: > I am not sure what exactly you are looking for but all blockdevices are > opened with O_DIRECT , we never cache anything on this layer . > > > On Thu, Sep 7, 2017, 7:11 PM Aaron Knister > wrote: > > Hi Everyone, > > This is something that's come up in the past and has recently resurfaced > with a project I've been working on, and that is-- it seems to me as > though mmfsd never attempts to flush the cache of the block devices its > writing to (looking at blktrace output seems to confirm this). Is this > actually the case? I've looked at the gpl headers for linux and I don't > see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or > REQ_FLUSH. I'm sure there's other ways to trigger this behavior that > GPFS may very well be using that I've missed. That's why I'm asking :) > > I figure with FPO being pushed as an HDFS replacement using commodity > drives this feature has *got* to be in the code somewhere. 
> -Aaron
>
> --
> Aaron Knister
> NASA Center for Climate Simulation (Code 606.2)
> Goddard Space Flight Center
> (301) 286-2776

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776

From aaron.s.knister at nasa.gov Fri Sep 8 04:26:02 2017
From: aaron.s.knister at nasa.gov (Aaron Knister)
Date: Thu, 7 Sep 2017 23:26:02 -0400
Subject: [gpfsug-discuss] Happy 20th birthday GPFS !!
In-Reply-To:
References:
Message-ID: <4a9feeb2-bb9d-8c9a-e506-926d8537cada@nasa.gov>

Sounds like celebratory cake is in order for the users group in a few weeks ;)

On 9/6/17 7:30 PM, Glen Corneau wrote:
> Sorry I missed the anniversary of your conception (announcement letter)
> back on August 26th, so I hope you'll accept my belated congratulations
> on this long and exciting journey!
>
> https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=ca&infotype=an&appname=iSource&supplier=897&letternum=ENUS297-318
>
> I remember your parent, PIOFS, as well! Ahh, the fun times.
> ------------------
> Glen Corneau
> Power Systems
> Washington Systems Center
> gcorneau at us.ibm.com

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776

From vpuvvada at in.ibm.com Fri Sep 8 06:00:46 2017
From: vpuvvada at in.ibm.com (Venkateswara R Puvvada)
Date: Fri, 8 Sep 2017 10:30:46 +0530
Subject: [gpfsug-discuss] Deletion of a pcache snapshot?
In-Reply-To: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de>
References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de>
Message-ID:

Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by the user with the mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshot deletion the mmpsnap command is used.

Which version of GPFS? Try with -p (undocumented) option.

mmdelsnapshot device snapname -j fset -p

~Venkat (vpuvvada at in.ibm.com)

From: "Malka, Janusz"
To: gpfsug main discussion list
Date: 09/08/2017 12:53 AM
Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

I had a similar issue, I had to recover the connection to home.

From: "John Hearns"
To: "gpfsug main discussion list"
Sent: Thursday, 7 September, 2017 17:52:19
Subject: [gpfsug-discuss] Deletion of a pcache snapshot?

Firmly lining myself up for a smack round the chops with a wet haddock...

I try to delete an AFM cache fileset which I created a few days ago (it has an NFS home).

mmdelfileset responds that:
Fileset obfuscated has 1 fileset snapshot(s).

When I try to delete the snapshot:
Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user.

I find this reference, which is about as useful as a wet haddock:
https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm

The advice of the gallery is sought, please.
-- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Fri Sep 8 06:21:47 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Fri, 8 Sep 2017 10:51:47 +0530 Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig In-Reply-To: References: Message-ID: mmafmconfig command should be run on the target path (path specified in the afmTarget option when fileset is created). If many filesets are sharing the same target (ex independent writer mode) , enable AFM once on target path. Run the command at home cluster. mmafmconifg enable afmTarget ~Venkat (vpuvvada at in.ibm.com) From: "Wilson, Neil" To: gpfsug main discussion list Date: 09/07/2017 09:04 PM Subject: Re: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig Sent by: gpfsug-discuss-bounces at spectrumscale.org I think you need to configure a gateway node (use mmchnode to change an existing node class to gateway) Then use mmafmconfig to setup export server maps on the gateway node. e.g. mmafmconfig ?add ?mapping1? ?export-map ?nfsServerIP?/?GatewayNode? (double quotes not required) mafmconfig show all Map name: mapping1 Export server map: IP/GatewayNode From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 07 September 2017 16:15 To: gpfsug main discussion list Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig If I have an AFM setup where the home is located on a generic NFS share, let?s say server:/volume/share When I come ot set this up does it make sense to run mmafmconfig on /volume/share ? I can mount the share as a plain old NFS mount in order to run this operation, before I create the cache side fileset. Apologies if I am being dumb as an ox here, and I deserve to be slapped in the face with a wet fish https://en.wikipedia.org/wiki/The_Fish-Slapping_Dance -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). 
Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=kKlSEJqmVE6q8Qt02JNaDLsewp13C0yRAmlfc_djRkk&s=JIbuXlCiReZx3ws5__6juuGC-sAqM74296BuyzgyNYg&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From gellis at ocf.co.uk Fri Sep 8 08:04:51 2017 From: gellis at ocf.co.uk (Georgina Ellis) Date: Fri, 8 Sep 2017 07:04:51 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: References: Message-ID: <0CBB283A-A0A9-4FC9-A1CD-9E019D74CDB9@ocf.co.uk> I am still populating your lot 2 response - it is split across 3 word docs and a whole heap of emails so easier for me to keep going - I dropped u off a lot of emails to save filling your inbox :-) Could you poke around other tenders for the portal question please? Sent from my iPhone > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > From: "Malka, Janusz" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > Content-Type: text/plain; charset="utf-8" > > I had similar issue, I had to recover connection to home > > > From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > Mmdelfileset responds that : > > Fileset obfuscated has 1 fileset snapshot(s). > > > > When I try to delete the snapshot: > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. 
> > > > I find this reference, which is about as useful as a wet haddock: > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > The advice of the gallery is sought, please. > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 2 > Date: Thu, 7 Sep 2017 21:16:34 +0000 > From: "Christof Schmitt" > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > Message-ID: > > > Content-Type: text/plain; charset="us-ascii" > > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > ********************************************** From john.hearns at asml.com Fri Sep 8 08:26:01 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 8 Sep 2017 07:26:01 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? 
Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gellis at ocf.co.uk Fri Sep 8 08:33:51 2017 From: gellis at ocf.co.uk (Georgina Ellis) Date: Fri, 8 Sep 2017 07:33:51 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: References: Message-ID: <93DCF805-F703-4ED5-A079-A44992A9268C@ocf.co.uk> Apologies All, slip of the keyboard and not a comment on GPFS! Sent from my iPhone > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > From: "Malka, Janusz" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > Content-Type: text/plain; charset="utf-8" > > I had similar issue, I had to recover connection to home > > > From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > Mmdelfileset responds that : > > Fileset obfuscated has 1 fileset snapshot(s). > > > > When I try to delete the snapshot: > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. > > > > I find this reference, which is about as useful as a wet haddock: > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > The advice of the gallery is sought, please. > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 2 > Date: Thu, 7 Sep 2017 21:16:34 +0000 > From: "Christof Schmitt" > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > Message-ID: > > > Content-Type: text/plain; charset="us-ascii" > > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > ********************************************** From Sandra.McLaughlin at astrazeneca.com Fri Sep 8 10:12:02 2017 From: Sandra.McLaughlin at astrazeneca.com (McLaughlin, Sandra M) Date: Fri, 8 Sep 2017 09:12:02 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: John, I had a period when I had to delete and remake AFM filesets rather frequently ? this always worked for me: mmunlinkfileset device fset -f mmdelsnapshot device snapname -j fset mmdelfileset device fset -f Sandra From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 08 September 2017 08:26 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. 
-- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. ________________________________ AstraZeneca UK Limited is a company incorporated in England and Wales with registered number:03674842 and its registered office at 1 Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0AA. This e-mail and its attachments are intended for the above named recipient only and may contain confidential and privileged information. If they have come to you in error, you must not copy or show them to anyone; instead, please reply to this e-mail, highlighting the error to the sender and then immediately delete the message. For information about how AstraZeneca UK Limited and its affiliates may process information, personal data and monitor communications, please see our privacy notice at www.astrazeneca.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Sep 8 11:57:14 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 8 Sep 2017 10:57:14 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: Sandra, Thankyou for the help. 
I have a support ticket outstanding, and will see what is suggested. I am sure this is a simple matter of deleting the fileset as you say! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of McLaughlin, Sandra M Sent: Friday, September 08, 2017 11:12 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? John, I had a period when I had to delete and remake AFM filesets rather frequently ? this always worked for me: mmunlinkfileset device fset -f mmdelsnapshot device snapname -j fset mmdelfileset device fset -f Sandra From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 08 September 2017 08:26 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. 
Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. ________________________________ AstraZeneca UK Limited is a company incorporated in England and Wales with registered number:03674842 and its registered office at 1 Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0AA. This e-mail and its attachments are intended for the above named recipient only and may contain confidential and privileged information. If they have come to you in error, you must not copy or show them to anyone; instead, please reply to this e-mail, highlighting the error to the sender and then immediately delete the message. For information about how AstraZeneca UK Limited and its affiliates may process information, personal data and monitor communications, please see our privacy notice at www.astrazeneca.com -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kkr at lbl.gov Fri Sep 8 11:58:05 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Fri, 8 Sep 2017 03:58:05 -0700 Subject: [gpfsug-discuss] Hold the Date - Spectrum Scale Day @ HPCXXL (Sept 2017, NYC) In-Reply-To: <6EF4187F-D8A1-4927-9E4F-4DF703DA04F5@lbl.gov> References: <6EF4187F-D8A1-4927-9E4F-4DF703DA04F5@lbl.gov> Message-ID: Hello, The agenda for the GPFS Day during HPCXXL is fairly fleshed out here: http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ See notes on registration below, which is free but required. Use the HPCXXL registration form, which has a $0 GPFS Day registration option. Hope to see some of you there. Best, Kristy > On Aug 21, 2017, at 3:33 PM, Kristy Kallback-Rose wrote: > > If you plan on attending the GPFS Day, please use the HPCXXL registration form (link to Eventbrite registration at the link below). The GPFS day is a free event, but you *must* register so we can make sure there are enough seats and food available. > > If you would like to speak or suggest a topic, please let me know. > > http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ > > The agenda is still being worked on, here are some likely topics: > > --RoadMap/Updates > --"New features - New Bugs? (Julich) > --GPFS + Openstack (CSCS) > --ORNL Update on Spider3-related GPFS work > --ANL Site Update > --File Corruption Session > > Best, > Kristy > >> On Aug 8, 2017, at 11:33 AM, Kristy Kallback-Rose > wrote: >> >> Hello, >> >> The GPFS Day of the HPCXXL conference is confirmed for Thursday, September 28th. Here is an updated URL, the agenda and registration are still being put together http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ . The GPFS Day will require registration, so we can make sure there is enough room (and coffee/food) for all attendees ?however, there will be no registration fee if you attend the GPFS Day only. >> >> I?ll send another update when the agenda is closer to settled. >> >> Cheers, >> Kristy >> >>> On Jul 7, 2017, at 3:32 PM, Kristy Kallback-Rose > wrote: >>> >>> Hello, >>> >>> More details will be provided as they become available, but just so you can make a placeholder on your calendar, there will be a Spectrum Scale Day the week of September 25th - 29th, likely Thursday, September 28th. >>> >>> This will be a part of the larger HPCXXL meeting (https://www.spxxl.org/?q=New-York-City-2017 ). You may recall this group was formerly called SPXXL and the website is in the process of transitioning to the new name (and getting a new certificate). You will be able to attend *just* the Spectrum Scale day if that is the only portion of the event you would like to attend. >>> >>> The NYC location is great for Spectrum Scale events because many IBMers, including developers, can come in from Poughkeepsie. >>> >>> More as we get closer to the date and details are settled. >>> >>> Cheers, >>> Kristy >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From hpc.ken.tw25qn at gmail.com Fri Sep 8 19:30:32 2017 From: hpc.ken.tw25qn at gmail.com (Ken Atkinson) Date: Fri, 8 Sep 2017 19:30:32 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: References: <93DCF805-F703-4ED5-A079-A44992A9268C@ocf.co.uk> Message-ID: Not on too many G&Ts Georgina? How are things. 
Ken Atkinson On 8 Sep 2017 08:33, "Georgina Ellis" wrote: Apologies All, slip of the keyboard and not a comment on GPFS! Sent from my iPhone > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > From: "Malka, Janusz" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > Content-Type: text/plain; charset="utf-8" > > I had similar issue, I had to recover connection to home > > > From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > Mmdelfileset responds that : > > Fileset obfuscated has 1 fileset snapshot(s). > > > > When I try to delete the snapshot: > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. > > > > I find this reference, which is about as useful as a wet haddock: > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2. 3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2. 3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > The advice of the gallery is sought, please. > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: > > ------------------------------ > > Message: 2 > Date: Thu, 7 Sep 2017 21:16:34 +0000 > From: "Christof Schmitt" > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > Message-ID: > > > Content-Type: text/plain; charset="us-ascii" > > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > ********************************************** _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Fri Sep 8 22:14:04 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 8 Sep 2017 17:14:04 -0400 Subject: [gpfsug-discuss] multicluster security In-Reply-To: References: <83936033-ce82-0a9b-3714-1dbea4c317db@nasa.gov> Message-ID: <529f575b-eb11-a81e-2905-ab3237d41678@nasa.gov> Interesting! Thank you for the explanation. This makes me wish GPFS had a client access model that more closely mimicked parallel NAS, specifically for this reason. That then got me wondering about pNFS support. I've not been able to find much about that but in theory Ganesha supports pNFS. Does anyone know of successful pNFS testing with GPFS and if so how one would set up such a thing? -Aaron On 08/25/2017 06:41 PM, IBM Spectrum Scale wrote: > > Hi Aaron, > > If cluster A uses the mmauth command to grant a file system read-only > access to a remote cluster B, nodes on cluster B can only mount that > file system with read-only access. But the only checking being done at > the RPC level is the TLS authentication. This should prevent non-root > users from initiating RPCs, since TLS authentication requires access > to the local cluster's private key. However, a root user on cluster B, > having access to cluster B's private key, might be able to craft RPCs > that may allow one to work around the checks which are implemented at > the file system level. > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum > Scale (GPFS), then please post it to the public IBM developerWroks > Forum at > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. > > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please > contact 1-800-237-5511 in the United States or your local IBM Service > Center in other countries. > > The forum is informally monitored as time permits and should not be > used for priority messages to the Spectrum Scale (GPFS) team. > > Inactive hide details for Aaron Knister ---08/21/2017 11:04:06 PM---Hi > Everyone, I have a theoretical question about GPFS multiAaron Knister > ---08/21/2017 11:04:06 PM---Hi Everyone, I have a theoretical question > about GPFS multiclusters and security. 
> > From: Aaron Knister > To: gpfsug main discussion list > Date: 08/21/2017 11:04 PM > Subject: [gpfsug-discuss] multicluster security > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hi Everyone, > > I have a theoretical question about GPFS multiclusters and security. > Let's say I have clusters A and B. Cluster A is exporting a filesystem > as read-only to cluster B. > > Where does the authorization burden lay? Meaning, does the security rely > on mmfsd in cluster B to behave itself and enforce the conditions of the > multi-cluster export? Could someone using the credentials on a > compromised node in cluster B just start sending arbitrary nsd > read/write commands to the nsds from cluster A (or something along those > lines)? Do the NSD servers in cluster A do any sort of sanity or > security checking on the I/O requests coming from cluster B to the NSDs > they're serving to exported filesystems? > > I imagine any enforcement would go out the window with shared disks in a > multi-cluster environment since a compromised node could just "dd" over > the LUNs. > > Thanks! > > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=oK_bEPbjuD7j6qLTHbe7HM4ujUlpcNYtX3tMW2QC7_w&s=BliMQ0pToLIIiO1jfyUp2Q3icewcONrcmHpsIj_hMtY&e= > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From oehmes at gmail.com Fri Sep 8 22:21:00 2017 From: oehmes at gmail.com (Sven Oehme) Date: Fri, 08 Sep 2017 21:21:00 +0000 Subject: [gpfsug-discuss] mmfsd write behavior In-Reply-To: References: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Message-ID: Hi, the code assumption is that the underlying device has no volatile write cache, i was absolute sure we have that somewhere in the FAQ, but i couldn't find it, so i will talk to somebody to correct this. if i understand https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt correct one could enforce this by setting REQ_FUA, but thats not explicitly set today, at least i can't see it. i will discuss this with one of our devs who owns this code and come back. sven On Thu, Sep 7, 2017 at 8:05 PM Aaron Knister wrote: > Thanks Sven. I didn't think GPFS itself was caching anything on that > layer, but it's my understanding that O_DIRECT isn't sufficient to force > I/O to be flushed (e.g. the device itself might have a volatile caching > layer). Take someone using ZFS zvol's as NSDs. I can write() all day log > to that zvol (even with O_DIRECT) but there is absolutely no guarantee > those writes have been committed to stable storage and aren't just > sitting in RAM until an fsync() occurs (or some other bio function that > causes a flush). I also don't believe writing to a SATA drive with > O_DIRECT will force cache flushes of the drive's writeback cache.. 
> although I just tested that one and it seems to actually trigger a scsi > cache sync. Interesting. > > -Aaron > > On 9/7/17 10:55 PM, Sven Oehme wrote: > > I am not sure what exactly you are looking for but all blockdevices are > > opened with O_DIRECT , we never cache anything on this layer . > > > > > > On Thu, Sep 7, 2017, 7:11 PM Aaron Knister > > wrote: > > > > Hi Everyone, > > > > This is something that's come up in the past and has recently > resurfaced > > with a project I've been working on, and that is-- it seems to me as > > though mmfsd never attempts to flush the cache of the block devices > its > > writing to (looking at blktrace output seems to confirm this). Is > this > > actually the case? I've looked at the gpl headers for linux and I > don't > > see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or > > REQ_FLUSH. I'm sure there's other ways to trigger this behavior that > > GPFS may very well be using that I've missed. That's why I'm asking > :) > > > > I figure with FPO being pushed as an HDFS replacement using commodity > > drives this feature has *got* to be in the code somewhere. > > > > -Aaron > > > > -- > > Aaron Knister > > NASA Center for Climate Simulation (Code 606.2) > > Goddard Space Flight Center > > (301) 286-2776 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Sat Sep 9 09:05:31 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Sat, 9 Sep 2017 10:05:31 +0200 Subject: [gpfsug-discuss] multicluster security In-Reply-To: <529f575b-eb11-a81e-2905-ab3237d41678@nasa.gov> References: <83936033-ce82-0a9b-3714-1dbea4c317db@nasa.gov> <529f575b-eb11-a81e-2905-ab3237d41678@nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 105 bytes Desc: not available URL: From aaron.s.knister at nasa.gov Mon Sep 11 01:43:56 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Sun, 10 Sep 2017 20:43:56 -0400 Subject: [gpfsug-discuss] tuning parameters question Message-ID: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> Hi All (but mostly Sven), I stumbled across this great gem: files.gpfsug.org/presentations/2014/UG10_GPFS_Performance_Session_v10.pdf and I'm wondering which, if any, of those tuning parameters are still relevant with the 4.2.3 code. Specifically for a CNFS cluster. I'm exporting a gpfs fs as an NFS root to 1k nodes. The boot storm is particularly ugly and the storage doesn't appear to be bottlenecked. 
I see a lot of waiters like these: Waiting 0.0009 sec since 20:41:31, monitored, thread 2881 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 26231 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 26146 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 18637 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 25013 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 27879 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 26553 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 25334 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 25337 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' and I'm wondering if there's anything immediate one would suggest to help with that. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From aaron.s.knister at nasa.gov Mon Sep 11 01:50:39 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Sun, 10 Sep 2017 20:50:39 -0400 Subject: [gpfsug-discuss] tuning parameters question In-Reply-To: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> References: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> Message-ID: <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> As an aside, my initial attempt was to use Ganesha via CES but the performance was significantly worse than CNFS for this workload. The docs seem to suggest that CNFS performs better for metadata intensive workloads which certainly seems to fit the bill here. -Aaron On 9/10/17 8:43 PM, Aaron Knister wrote: > Hi All (but mostly Sven), > > I stumbled across this great gem: > > files.gpfsug.org/presentations/2014/UG10_GPFS_Performance_Session_v10.pdf > > and I'm wondering which, if any, of those tuning parameters are still > relevant with the 4.2.3 code. Specifically for a CNFS cluster. I'm > exporting a gpfs fs as an NFS root to 1k nodes. The boot storm is > particularly ugly and the storage doesn't appear to be bottlenecked. 
> > I see a lot of waiters like these: > > Waiting 0.0009 sec since 20:41:31, monitored, thread 2881 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 26231 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 26146 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 18637 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 25013 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 27879 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 26553 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 25334 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 25337 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > > and I'm wondering if there's anything immediate one would suggest to > help with that. > > -Aaron > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From stefan.dietrich at desy.de Mon Sep 11 08:40:14 2017 From: stefan.dietrich at desy.de (Dietrich, Stefan) Date: Mon, 11 Sep 2017 09:40:14 +0200 (CEST) Subject: [gpfsug-discuss] Switch from IPoIB connected mode to datagram with ESS 5.2.0? Message-ID: <743361352.9211728.1505115614463.JavaMail.zimbra@desy.de> Hello, during reading the upgrade docs for ESS 5.2.0, I noticed a change in the IPoIB mode. Now it specifies, that datagram (CONNECTED_MODE=no) instead of connected mode should be used. All earlier versions used connected mode. I am wondering about the reason for this change? Or is this only relevant for bonded IPoIB interfaces? Regards, Stefan -- ------------------------------------------------------------------------ Stefan Dietrich Deutsches Elektronen-Synchrotron (IT-Systems) Ein Forschungszentrum der Helmholtz-Gemeinschaft Notkestr. 85 phone: +49-40-8998-4696 22607 Hamburg e-mail: stefan.dietrich at desy.de Germany ------------------------------------------------------------------------ From john.hearns at asml.com Mon Sep 11 08:41:54 2017 From: john.hearns at asml.com (John Hearns) Date: Mon, 11 Sep 2017 07:41:54 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: Thankyou all for advice. The ?-p? option was the fix here (thankyou to IBM support). From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of McLaughlin, Sandra M Sent: Friday, September 08, 2017 11:12 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? John, I had a period when I had to delete and remake AFM filesets rather frequently ? 
this always worked for me: mmunlinkfileset device fset -f mmdelsnapshot device snapname -j fset mmdelfileset device fset -f Sandra From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 08 September 2017 08:26 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. 
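Putting the advice in this thread together, the teardown sequence Sandra describes combined with the undocumented -p flag Venkat mentions (and which John reports was the fix) would look roughly like this; the device, fileset and snapshot names below are made-up placeholders rather than the obfuscated names from the original post:

mmunlinkfileset fs1 afmcache -f
mmdelsnapshot fs1 afmcache.afm.1234 -j afmcache -p    # -p is undocumented, use only as advised by IBM support
mmdelfileset fs1 afmcache -f

Note Venkat's caveat that AFM recovery/resync snapshots are normally removed automatically once recovery or resync completes, and that peer snapshots are deleted with mmpsnap, so forcing the deletion of an internal pcache recovery snapshot should be a last resort.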
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. ________________________________ AstraZeneca UK Limited is a company incorporated in England and Wales with registered number:03674842 and its registered office at 1 Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0AA. This e-mail and its attachments are intended for the above named recipient only and may contain confidential and privileged information. If they have come to you in error, you must not copy or show them to anyone; instead, please reply to this e-mail, highlighting the error to the sender and then immediately delete the message. For information about how AstraZeneca UK Limited and its affiliates may process information, personal data and monitor communications, please see our privacy notice at www.astrazeneca.com -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From olaf.weiser at de.ibm.com Mon Sep 11 09:11:15 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Mon, 11 Sep 2017 10:11:15 +0200 Subject: [gpfsug-discuss] tuning parameters question In-Reply-To: <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> References: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: From ed.swindelles at uconn.edu Mon Sep 11 16:49:15 2017 From: ed.swindelles at uconn.edu (Swindelles, Ed) Date: Mon, 11 Sep 2017 15:49:15 +0000 Subject: [gpfsug-discuss] UConn hiring GPFS administrator Message-ID: The University of Connecticut is hiring three full time, permanent technical positions for its HPC team on the Storrs campus. One of these positions is focused on storage administration, including a GPFS cluster. I would greatly appreciate it if you would forward this announcement to contacts of yours who may have an interest in these positions. Here are direct links to the job descriptions and applications: HPC Storage Administrator http://s.uconn.edu/3tx HPC Systems Administrator (2 positions to be filled) http://s.uconn.edu/3tw Thank you, -- Ed Swindelles Team Lead for Research Technology University of Connecticut 860-486-4522 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Mon Sep 11 23:15:10 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 11 Sep 2017 18:15:10 -0400 Subject: [gpfsug-discuss] tuning parameters question In-Reply-To: References: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> Message-ID: <9de64193-c60c-8ee1-b681-6cfe3993772b@nasa.gov> Thanks, Olaf. I ended up un-setting a bunch of settings that are now auto-tuned (worker1threads, worker3threads, etc.) and just set workerthreads as you suggest. That combined with increasing maxfilestocache to above the max concurrent open file threshold of the workload got me consistently with in 1%-3% of the performance of the same storage hardware running btrfs instead of GPFS. I think that's pretty darned good considering the additional complexity GPFS has over btrfs of being a clustered filesystem. Plus I now get NFS server failover for very little effort and without having to deal with corosync or pacemaker. -Aaron On 9/11/17 4:11 AM, Olaf Weiser wrote: > Hi Aaron , > > 0,0009 s response time for your meta data IO ... seems to be a very > good/fast storage BE.. which is hard to improve.. > you can raise the parallelism a bit for accessing metadata , but if this > will help to improve your "workload" is not assured > > The worker3threads parameter specifies the number of threads to use for > inode prefetch. Usually , I would suggest, that you should not touch > single parameters any longer. By the great improvements of the last few > releases.. GPFS can calculate / retrieve the right settings > semi-automatically... > You only need to set simpler "workerThreads" .. > > But in your case , you can see, if this more specific value will help > you out . > > depending on your blocksize and average filesize .. you may see > additional improvements when tuning nfsPrefetchStrategy , which tells > GPFS to consider all IOs wihtin */N/* blockboundaries as sequential ?and > starts prefetch > > l.b.n.t. set ignoreprefetchLunCount to yes .. (if not already done) . 
> this helps GPFS to use all available workerThreads > > cheers > olaf > > > > From: Aaron Knister > To: > Date: 09/11/2017 02:50 AM > Subject: Re: [gpfsug-discuss] tuning parameters question > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------------------------------------------------ > > > > As an aside, my initial attempt was to use Ganesha via CES but the > performance was significantly worse than CNFS for this workload. The > docs seem to suggest that CNFS performs better for metadata intensive > workloads which certainly seems to fit the bill here. > > -Aaron > > On 9/10/17 8:43 PM, Aaron Knister wrote: > > Hi All (but mostly Sven), > > > > I stumbled across this great gem: > > > > files.gpfsug.org/presentations/2014/UG10_GPFS_Performance_Session_v10.pdf > > > > and I'm wondering which, if any, of those tuning parameters are still > > relevant with the 4.2.3 code. Specifically for a CNFS cluster. I'm > > exporting a gpfs fs as an NFS root to 1k nodes. The boot storm is > > particularly ugly and the storage doesn't appear to be bottlenecked. > > > > I see a lot of waiters like these: > > > > Waiting 0.0009 sec since 20:41:31, monitored, thread 2881 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 26231 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 26146 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 18637 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 25013 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 27879 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 26553 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 25334 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 25337 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > > > and I'm wondering if there's anything immediate one would suggest to > > help with that. 
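Condensed into commands, the combination Olaf recommends above and that Aaron reports worked for him might look like the sketch below. The node class name and the numeric values are illustrative placeholders, not recommendations; maxFilesToCache in particular should be sized against the peak number of concurrently open files and only takes effect after the daemons are restarted:

mmchconfig workerThreads=512 -N cnfsNodes       # replaces hand-tuning worker1Threads, worker3Threads, etc.
mmchconfig maxFilesToCache=200000 -N cnfsNodes
mmchconfig nfsPrefetchStrategy=1 -N cnfsNodes   # treat I/O within N block boundaries as sequential and prefetch
mmchconfig ignorePrefetchLUNCount=yes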
> > > > -Aaron > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From zacekm at img.cas.cz Tue Sep 12 10:40:35 2017 From: zacekm at img.cas.cz (Michal Zacek) Date: Tue, 12 Sep 2017 11:40:35 +0200 Subject: [gpfsug-discuss] Wrong nodename after server restart Message-ID: <7e565699-b583-eeeb-c6b9-d11a39206331@img.cas.cz> Hi, I had to restart two of my gpfs servers (gpfs-n4 and gpfs-quorum) and after that I was unable to move CES IP address back with strange error "mmces address move: GPFS is down on this node". After I double checked that gpfs state is active on all nodes, I dug deeper and I think I found problem, but I don't really know how this could happen. Look at the names of nodes: [root at gpfs-n2 ~]# mmlscluster # Looks good GPFS cluster information ======================== GPFS cluster name: gpfscl1.img.local GPFS cluster id: 17792677515884116443 GPFS UID domain: img.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------------- 1 gpfs-n4.img.local 192.168.20.64 gpfs-n4.img.local quorum-manager 2 gpfs-quorum.img.local 192.168.20.60 gpfs-quorum.img.local quorum 3 gpfs-n3.img.local 192.168.20.63 gpfs-n3.img.local quorum-manager 4 tau.img.local 192.168.1.248 tau.img.local 5 gpfs-n1.img.local 192.168.20.61 gpfs-n1.img.local quorum-manager 6 gpfs-n2.img.local 192.168.20.62 gpfs-n2.img.local quorum-manager 8 whale.img.cas.cz 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# mmlsmount gpfs01 -L # not so good File system gpfs01 is mounted on 7 nodes: 192.168.20.63 gpfs-n3 192.168.20.61 gpfs-n1 192.168.20.62 gpfs-n2 192.168.1.248 tau 192.168.20.64 gpfs-n4.img.local 192.168.20.60 gpfs-quorum.img.local 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# tsctl shownodes up | tr ',' '\n' # very wrong whale.img.cas.cz.img.local tau.img.local gpfs-quorum.img.local.img.local gpfs-n1.img.local gpfs-n2.img.local gpfs-n3.img.local gpfs-n4.img.local.img.local The "tsctl shownodes up" is the reason why I'm not able to move CES address back to gpfs-n4 node, but the real problem are different nodenames. I think OS is configured correctly: [root at gpfs-n4 /]# hostname gpfs-n4 [root at gpfs-n4 /]# hostname -f gpfs-n4.img.local [root at gpfs-n4 /]# cat /etc/resolv.conf nameserver 192.168.20.30 nameserver 147.231.150.2 search img.local domain img.local [root at gpfs-n4 /]# cat /etc/hosts | grep gpfs-n4 192.168.20.64 gpfs-n4.img.local gpfs-n4 [root at gpfs-n4 /]# host gpfs-n4 gpfs-n4.img.local has address 192.168.20.64 [root at gpfs-n4 /]# host 192.168.20.64 64.20.168.192.in-addr.arpa domain name pointer gpfs-n4.img.local. Can someone help me with this. Thanks, Michal p.s. 
gpfs version: 4.2.3-2 (CentOS 7) From secretary at gpfsug.org Tue Sep 12 15:22:41 2017 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Tue, 12 Sep 2017 15:22:41 +0100 Subject: [gpfsug-discuss] SS UG UK 2018 Message-ID: Dear all, A date for your diary, #SSUG18 in the UK will be taking place on April 18th & 19th 2018. Please mark it in your diaries now! We'll confirm other details (venue, agenda etc.) nearer the time, but the date is confirmed. Thanks, -- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Tue Sep 12 16:01:21 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 12 Sep 2017 11:01:21 -0400 Subject: [gpfsug-discuss] Wrong nodename after server restart In-Reply-To: <7e565699-b583-eeeb-c6b9-d11a39206331@img.cas.cz> References: <7e565699-b583-eeeb-c6b9-d11a39206331@img.cas.cz> Message-ID: Michal, When a node is added to a cluster that has a different domain than the rest of the nodes in the cluster, the GPFS daemons running on the various nodes can develop an inconsistent understanding of what the common suffix of all the domain names are. The symptoms you show with the "tsctl shownodes up" output, and in particular the incorrect node names of the two nodes you restarted, as seen on a node you did not restart, are consistent with this problem. I also note your cluster appears to have the necessary pre-condition to trip on this problem, whale.img.cas.cz does not share a common suffix with the other nodes in the cluster. The common suffix of the other nodes in the cluster is ".img.local". Was whale.img.cas.cz recently added to the cluster? Unfortunately, the general work-around is to recycle all the nodes at once: mmshutdown -a, followed by mmstartup -a. I hope this helps. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Michal Zacek To: gpfsug-discuss at spectrumscale.org Date: 09/12/2017 05:41 AM Subject: [gpfsug-discuss] Wrong nodename after server restart Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I had to restart two of my gpfs servers (gpfs-n4 and gpfs-quorum) and after that I was unable to move CES IP address back with strange error "mmces address move: GPFS is down on this node". After I double checked that gpfs state is active on all nodes, I dug deeper and I think I found problem, but I don't really know how this could happen. 
Look at the names of nodes: [root at gpfs-n2 ~]# mmlscluster # Looks good GPFS cluster information ======================== GPFS cluster name: gpfscl1.img.local GPFS cluster id: 17792677515884116443 GPFS UID domain: img.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------------- 1 gpfs-n4.img.local 192.168.20.64 gpfs-n4.img.local quorum-manager 2 gpfs-quorum.img.local 192.168.20.60 gpfs-quorum.img.local quorum 3 gpfs-n3.img.local 192.168.20.63 gpfs-n3.img.local quorum-manager 4 tau.img.local 192.168.1.248 tau.img.local 5 gpfs-n1.img.local 192.168.20.61 gpfs-n1.img.local quorum-manager 6 gpfs-n2.img.local 192.168.20.62 gpfs-n2.img.local quorum-manager 8 whale.img.cas.cz 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# mmlsmount gpfs01 -L # not so good File system gpfs01 is mounted on 7 nodes: 192.168.20.63 gpfs-n3 192.168.20.61 gpfs-n1 192.168.20.62 gpfs-n2 192.168.1.248 tau 192.168.20.64 gpfs-n4.img.local 192.168.20.60 gpfs-quorum.img.local 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# tsctl shownodes up | tr ',' '\n' # very wrong whale.img.cas.cz.img.local tau.img.local gpfs-quorum.img.local.img.local gpfs-n1.img.local gpfs-n2.img.local gpfs-n3.img.local gpfs-n4.img.local.img.local The "tsctl shownodes up" is the reason why I'm not able to move CES address back to gpfs-n4 node, but the real problem are different nodenames. I think OS is configured correctly: [root at gpfs-n4 /]# hostname gpfs-n4 [root at gpfs-n4 /]# hostname -f gpfs-n4.img.local [root at gpfs-n4 /]# cat /etc/resolv.conf nameserver 192.168.20.30 nameserver 147.231.150.2 search img.local domain img.local [root at gpfs-n4 /]# cat /etc/hosts | grep gpfs-n4 192.168.20.64 gpfs-n4.img.local gpfs-n4 [root at gpfs-n4 /]# host gpfs-n4 gpfs-n4.img.local has address 192.168.20.64 [root at gpfs-n4 /]# host 192.168.20.64 64.20.168.192.in-addr.arpa domain name pointer gpfs-n4.img.local. Can someone help me with this. Thanks, Michal p.s. gpfs version: 4.2.3-2 (CentOS 7) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=l_sz-tPolX87WmSf2zBhhPpggnfQJKp7-BqV8euBp7A&s=XSPGkKRMza8PhYQg8AxeKW9cOTNeCI9uph486_6Xajo&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Tue Sep 12 16:36:06 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Tue, 12 Sep 2017 15:36:06 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: Message-ID: Well George is not the only one to have replied to the list with a one to one message. ? Remember folks, this mailing list has a *lot* of people on it. Hope my message is last that forgets who is in the 'To' field. Daniel Daniel Kidger Technical Sales Specialist, IBM UK IBM Spectrum Storage Software daniel.kidger at uk.ibm.com +44 (0)7818 522266 > On 8 Sep 2017, at 19:30, Ken Atkinson wrote: > > Not on too many G&Ts Georgina? > How are things. > Ken Atkinson > > On 8 Sep 2017 08:33, "Georgina Ellis" wrote: > Apologies All, slip of the keyboard and not a comment on GPFS! 
> > Sent from my iPhone > > > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > > > Send gpfsug-discuss mailing list submissions to > > gpfsug-discuss at spectrumscale.org > > > > To subscribe or unsubscribe via the World Wide Web, visit > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > or, via email, send a message with subject or body 'help' to > > gpfsug-discuss-request at spectrumscale.org > > > > You can reach the person managing the list at > > gpfsug-discuss-owner at spectrumscale.org > > > > When replying, please edit your Subject line so it is more specific > > than "Re: Contents of gpfsug-discuss digest..." > > > > > > Today's Topics: > > > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > > > > ---------------------------------------------------------------------- > > > > Message: 1 > > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > > From: "Malka, Janusz" > > To: gpfsug main discussion list > > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > > Content-Type: text/plain; charset="utf-8" > > > > I had similar issue, I had to recover connection to home > > > > > > From: "John Hearns" > > To: "gpfsug main discussion list" > > Sent: Thursday, 7 September, 2017 17:52:19 > > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > > > Mmdelfileset responds that : > > > > Fileset obfuscated has 1 fileset snapshot(s). > > > > > > > > When I try to delete the snapshot: > > > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. > > > > > > > > I find this reference, which is about as useful as a wet haddock: > > > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > > > > > The advice of the gallery is sought, please. > > > > > > > > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- > > An HTML attachment was scrubbed... 
> > URL: > > > > ------------------------------ > > > > Message: 2 > > Date: Thu, 7 Sep 2017 21:16:34 +0000 > > From: "Christof Schmitt" > > To: gpfsug-discuss at spectrumscale.org > > Cc: gpfsug-discuss at spectrumscale.org > > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > > Message-ID: > > > > > > Content-Type: text/plain; charset="us-ascii" > > > > An HTML attachment was scrubbed... > > URL: > > > > ------------------------------ > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > > ********************************************** > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.mills at nasa.gov Tue Sep 12 17:06:23 2017 From: jonathan.mills at nasa.gov (Jonathan Mills) Date: Tue, 12 Sep 2017 12:06:23 -0400 (EDT) Subject: [gpfsug-discuss] Support for SLES 12 SP3 Message-ID: SLES 12 SP3 has been released. And for what it?s worth, there does not appear to be substantial changes in either kernel or glibc as compared to SLES 12 SP2. In fact, the latest SLES 12 SP2 kernel is ?4.4.74-92.29?, while the initial SLES 12 SP3 kernel is ?4.4.73-5.1?. Given this, I wanted to ask the team at IBM: 1) have you begun looking into SLES 12 SP3 yet? 2) if so, do you have any idea when you might release a fully supported version of Spectrum Scale for SLES 12 SP3? Those of us who run SLES and are looking to deploy new infrastructure this fall would prefer to do so on the latest rev of our OS, as opposed to one that is already on life support... -- Jonathan Mills / jonathan.mills at nasa.gov NASA GSFC / NCCS HPC (606.2) Bldg 28, Rm. S230 / c. 252-412-5710 From Greg.Lehmann at csiro.au Wed Sep 13 00:12:55 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Tue, 12 Sep 2017 23:12:55 +0000 Subject: [gpfsug-discuss] Support for SLES 12 SP3 In-Reply-To: References: Message-ID: <67f390a558244c41b154a7a6a9e5efe8@exch1-cdc.nexus.csiro.au> +1. We are interested in SLES 12 SP3 too. BTW had anybody done any comparisons of SLES 12 SP2 (4.4) kernel vs RHEL 7.3 in terms of GPFS IO performance? I would think the 4.4 kernel might give it an edge. I'll probably get around to comparing them myself one day, but if anyone else has some numbers... -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Mills Sent: Wednesday, 13 September 2017 2:06 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Support for SLES 12 SP3 SLES 12 SP3 has been released. And for what it?s worth, there does not appear to be substantial changes in either kernel or glibc as compared to SLES 12 SP2. In fact, the latest SLES 12 SP2 kernel is ?4.4.74-92.29?, while the initial SLES 12 SP3 kernel is ?4.4.73-5.1?. Given this, I wanted to ask the team at IBM: 1) have you begun looking into SLES 12 SP3 yet? 2) if so, do you have any idea when you might release a fully supported version of Spectrum Scale for SLES 12 SP3? 
Those of us who run SLES and are looking to deploy new infrastructure this fall would prefer to do so on the latest rev of our OS, as opposed to one that is already on life support... -- Jonathan Mills / jonathan.mills at nasa.gov NASA GSFC / NCCS HPC (606.2) Bldg 28, Rm. S230 / c. 252-412-5710 From scale at us.ibm.com Wed Sep 13 22:33:30 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 13 Sep 2017 17:33:30 -0400 Subject: [gpfsug-discuss] Fw: Wrong nodename after server restart Message-ID: ----- Forwarded by Eric Agar/Poughkeepsie/IBM on 09/13/2017 05:32 PM ----- From: IBM Spectrum Scale/Poughkeepsie/IBM To: Michal Zacek Date: 09/13/2017 05:29 PM Subject: Re: [gpfsug-discuss] Wrong nodename after server restart Sent by: Eric Agar Hello Michal, It should not be necessary to delete whale.img.cas.cz and rename it. But, that is an option you can take, if you prefer it. If you decide to take that option, please see the last paragraph of this response. The confusion starts at the moment a node is added to the active cluster where the new node does not have the same common domain suffix as the nodes that were already in the cluster. The confusion increases when the GPFS daemons on some nodes, but not all nodes, are recycled. Doing mmshutdown -a, followed by mmstartup -a, once after the new node has been added allows all GPFS daemons on all nodes to come up at the same time and arrive at the same answer to the question, "what is the common domain suffix for all the nodes in the cluster now?" In the case of your cluster, the answer will be "the common domain suffix is the empty string" or, put another way, "there is no common domain suffix"; that is okay, as long as all the GPFS daemons come to the same conclusion. After you recycle the cluster, you can check to make sure all seems well by running "tsctl shownodes up" on every node, and make sure the answer is correct on each node. If the mmshutdown -a / mmstartup -a recycle works, the problem should not recur with the current set of nodes in the cluster. Even as individual GPFS daemons are recycled going forward, they should still understand the cluster's nodes have no common domain suffix. However, I can imagine sequences of events that would cause the issue to occur again after nodes are deleted or added to the cluster while the cluster is active. For example, if whale.img.cas.cz were to be deleted from the current cluster, that action would restore the cluster to having a common domain suffix of ".img.local", but already running GPFS daemons would not realize it. If the delete of whale occurred while the cluster was active, subsequent recycling of the GPFS daemon on just a subset of the nodes would cause the recycled daemons to understand the common domain suffix to now be ".img.local". But, daemons that had not been recycled would still think there is no common domain suffix. The confusion would occur again. On the other hand, adding and deleting nodes to/from the cluster should not cause the issue to occur again as long as the cluster continues to have the same (in this case, no) common domain suffix. If you decide to delete whale.img.case.cz, rename it to have the ".img.local" domain suffix, and add it back to the cluster, it would be best to do so after all the GPFS daemons are shut down with mmshutdown -a, but before any of the daemons are restarted with mmstartup. This would allow all the subsequent running daemons to come to the conclusion that ".img.local" is now the common domain suffix. I hope this helps. 
Regards, Eric Agar Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Michal Zacek To: IBM Spectrum Scale Date: 09/13/2017 03:42 AM Subject: Re: [gpfsug-discuss] Wrong nodename after server restart Hello yes you are correct, Whale was added two days a go. It's necessary to delete whale.img.cas.cz from cluster before mmshutdown/mmstartup? If the two domains may cause problems in the future I can rename whale (and all planed nodes) to img.local suffix. Many thanks for the prompt reply. Regards Michal Dne 12.9.2017 v 17:01 IBM Spectrum Scale napsal(a): Michal, When a node is added to a cluster that has a different domain than the rest of the nodes in the cluster, the GPFS daemons running on the various nodes can develop an inconsistent understanding of what the common suffix of all the domain names are. The symptoms you show with the "tsctl shownodes up" output, and in particular the incorrect node names of the two nodes you restarted, as seen on a node you did not restart, are consistent with this problem. I also note your cluster appears to have the necessary pre-condition to trip on this problem, whale.img.cas.cz does not share a common suffix with the other nodes in the cluster. The common suffix of the other nodes in the cluster is ".img.local". Was whale.img.cas.cz recently added to the cluster? Unfortunately, the general work-around is to recycle all the nodes at once: mmshutdown -a, followed by mmstartup -a. I hope this helps. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Michal Zacek To: gpfsug-discuss at spectrumscale.org Date: 09/12/2017 05:41 AM Subject: [gpfsug-discuss] Wrong nodename after server restart Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I had to restart two of my gpfs servers (gpfs-n4 and gpfs-quorum) and after that I was unable to move CES IP address back with strange error "mmces address move: GPFS is down on this node". After I double checked that gpfs state is active on all nodes, I dug deeper and I think I found problem, but I don't really know how this could happen. 
Look at the names of nodes: [root at gpfs-n2 ~]# mmlscluster # Looks good GPFS cluster information ======================== GPFS cluster name: gpfscl1.img.local GPFS cluster id: 17792677515884116443 GPFS UID domain: img.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------------- 1 gpfs-n4.img.local 192.168.20.64 gpfs-n4.img.local quorum-manager 2 gpfs-quorum.img.local 192.168.20.60 gpfs-quorum.img.local quorum 3 gpfs-n3.img.local 192.168.20.63 gpfs-n3.img.local quorum-manager 4 tau.img.local 192.168.1.248 tau.img.local 5 gpfs-n1.img.local 192.168.20.61 gpfs-n1.img.local quorum-manager 6 gpfs-n2.img.local 192.168.20.62 gpfs-n2.img.local quorum-manager 8 whale.img.cas.cz 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# mmlsmount gpfs01 -L # not so good File system gpfs01 is mounted on 7 nodes: 192.168.20.63 gpfs-n3 192.168.20.61 gpfs-n1 192.168.20.62 gpfs-n2 192.168.1.248 tau 192.168.20.64 gpfs-n4.img.local 192.168.20.60 gpfs-quorum.img.local 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# tsctl shownodes up | tr ',' '\n' # very wrong whale.img.cas.cz.img.local tau.img.local gpfs-quorum.img.local.img.local gpfs-n1.img.local gpfs-n2.img.local gpfs-n3.img.local gpfs-n4.img.local.img.local The "tsctl shownodes up" is the reason why I'm not able to move CES address back to gpfs-n4 node, but the real problem are different nodenames. I think OS is configured correctly: [root at gpfs-n4 /]# hostname gpfs-n4 [root at gpfs-n4 /]# hostname -f gpfs-n4.img.local [root at gpfs-n4 /]# cat /etc/resolv.conf nameserver 192.168.20.30 nameserver 147.231.150.2 search img.local domain img.local [root at gpfs-n4 /]# cat /etc/hosts | grep gpfs-n4 192.168.20.64 gpfs-n4.img.local gpfs-n4 [root at gpfs-n4 /]# host gpfs-n4 gpfs-n4.img.local has address 192.168.20.64 [root at gpfs-n4 /]# host 192.168.20.64 64.20.168.192.in-addr.arpa domain name pointer gpfs-n4.img.local. Can someone help me with this. Thanks, Michal p.s. gpfs version: 4.2.3-2 (CentOS 7) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=l_sz-tPolX87WmSf2zBhhPpggnfQJKp7-BqV8euBp7A&s=XSPGkKRMza8PhYQg8AxeKW9cOTNeCI9uph486_6Xajo&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Michal ???ek | Information Technologies +420 296 443 128 +420 296 443 333 michal.zacek at img.cas.cz www.img.cas.cz Institute of Molecular Genetics of the ASCR, v. v. i., V?de?sk? 1083, 142 20 Prague 4, Czech Republic ID: 68378050 | VAT ID: CZ68378050 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1997 bytes Desc: not available URL: From valdis.kletnieks at vt.edu Thu Sep 14 01:18:51 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Wed, 13 Sep 2017 20:18:51 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. Message-ID: <52657.1505348331@turing-police.cc.vt.edu> So we have a number of very similar policy files that get applied for file migration etc. 
And they vary drastically in the runtime to process, apparently due to different selections on whether to do the work in parallel. Running a set of rules with 'mmapplypolicy -I defer' that look like this: RULE 'VBI_FILES_RULE' MIGRATE FROM POOL 'system' THRESHOLD(0,100,0) WEIGHT(FILE_SIZE) TO POOL 'VBI_FILES' FOR FILESET('vbi') WHERE (mb_allocated >= 8) for 10 filesets can scan 325M directory entries in 6 minutes, and sort and evaluate the policy in 3 more minutes. However, this takes a bit over 30 minutes for the scan and another 20 for sorting and policy evaluation over the same set of filesets: RULE 'VBI_FILES_RULE' LIST 'pruned_files' THRESHOLD(90,80) WEIGHT(FILE_SIZE) FOR FILESET('vbi') WHERE (mb_allocated >= 8) even though the output is essentially identical. Why is LIST so much more expensive than 'MIGRATE" with '-I defer'? I could understand if I had an expensive SHOW clause, but there isn't one here (and a different policy that I run that *does* have a big SHOW clause takes almost the same amount of time as the minimal LIST).... I'm thinking that it has *something* to do with the MIGRATE job outputting: [I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned. (...) [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned. while the LIST job says: [I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records. (...) [I] 2017-09-12 at 14:02:04.223 Policy evaluation. 0 files scanned. (Both output the same message during the 'Directory entries scanned: 0.' phase, but I suspect MIGRATE is multi-threading that part as well, as it completes much faster). What's the controlling factor in mmapplypolicy's decision whether or not to parallelize the policy? From oehmes at gmail.com Thu Sep 14 01:28:46 2017 From: oehmes at gmail.com (Sven Oehme) Date: Thu, 14 Sep 2017 00:28:46 +0000 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. In-Reply-To: <52657.1505348331@turing-police.cc.vt.edu> References: <52657.1505348331@turing-police.cc.vt.edu> Message-ID: can you please share the entire command line you are using ? also gpfs version, mmlsconfig output would help as well as if this is a shared storage filesystem or a system using local disks. thx. Sven On Wed, Sep 13, 2017 at 5:19 PM wrote: > So we have a number of very similar policy files that get applied for file > migration etc. And they vary drastically in the runtime to process, > apparently > due to different selections on whether to do the work in parallel. > > Running a set of rules with 'mmapplypolicy -I defer' that look like this: > > RULE 'VBI_FILES_RULE' MIGRATE FROM POOL 'system' > THRESHOLD(0,100,0) > WEIGHT(FILE_SIZE) > TO POOL 'VBI_FILES' > FOR FILESET('vbi') > WHERE (mb_allocated >= 8) > > for 10 filesets can scan 325M directory entries in 6 minutes, and sort and > evaluate the policy in 3 more minutes. > > However, this takes a bit over 30 minutes for the scan and another 20 for > sorting and policy evaluation over the same set of filesets: > > RULE 'VBI_FILES_RULE' LIST 'pruned_files' > THRESHOLD(90,80) > WEIGHT(FILE_SIZE) > FOR FILESET('vbi') > WHERE (mb_allocated >= 8) > > even though the output is essentially identical. Why is LIST so much more > expensive than 'MIGRATE" with '-I defer'? I could understand if I > had an > expensive SHOW clause, but there isn't one here (and a different policy > that I > run that *does* have a big SHOW clause takes almost the same amount of > time as > the minimal LIST).... 
> > I'm thinking that it has *something* to do with the MIGRATE job outputting: > > [I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 > files scanned. > (...) > [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 > records scanned. > > while the LIST job says: > > [I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records. > (...) > [I] 2017-09-12 at 14:02:04.223 Policy evaluation. 0 files scanned. > > (Both output the same message during the 'Directory entries scanned: 0.' > phase, but I suspect MIGRATE is multi-threading that part as well, as it > completes much faster). > > What's the controlling factor in mmapplypolicy's decision whether or > not to parallelize the policy? > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kh.atmane at gmail.com Thu Sep 14 13:49:55 2017 From: kh.atmane at gmail.com (atmane) Date: Thu, 14 Sep 2017 13:49:55 +0100 Subject: [gpfsug-discuss] Disk change problem in gss GNR Message-ID: dear all, I change A Disk In Gss Storage Server mmchcarrier BB1RGL --release --pdisk 'e1d1s02' mmchcarrier BB1RGL --replace --pdisk 'e1d1s02' after replace disk Now I Have 2 Discs In My Gss the first disc was well changed name = "e1d1s02" the second disk still after I use this cmd mmdelpdisk BB1RGL --pdisk e1d1s02#004 -a the disk is still in use i need to reboot the system or ?? mmlspdisk all | less pdisk: replacementPriority = 1000 name = "e1d1s02" device = "/dev/sdik,/dev/sdih" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "ok" capacity = 3000034656256 freeSpace = 1453846429696 fru = "00W1572" location = "SV30820390-1-2" WWN = "naa.5000C5008D783E37" server = "gss0-ib0" pdisk: replacementPriority = 1000 name = "e1d1s02#004" device = "" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "missing/noPath/systemDrain/adminDrain/noRGD/noVCD" capacity = 3000034656256 freeSpace = 1599875317760 fru = "00W1572" location = "" WWN = "naa.5000C50056714E83" server = "gss0-ib0" -- -- Atmane Khiredine HPC System Admin | Office National de la M?t?orologie T?l : +213 21 50 73 93 Poste 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz From makaplan at us.ibm.com Thu Sep 14 19:55:39 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 14 Sep 2017 14:55:39 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. In-Reply-To: <52657.1505348331@turing-police.cc.vt.edu> References: <52657.1505348331@turing-police.cc.vt.edu> Message-ID: Read the doc again. Specify both -g and -N options on the command line to get fully parallel directory and inode/policy scanning. I'm curious as to what you're trying to do with THRESHOLD(0,100,0) ... Perhaps premigrate everything (that matches the other conditions)? You are correct about I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned. (...) [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned. If you don't see messages like that, you did not specify both -N and -g. From: valdis.kletnieks at vt.edu To: gpfsug-discuss at spectrumscale.org Date: 09/13/2017 08:19 PM Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. Sent by: gpfsug-discuss-bounces at spectrumscale.org So we have a number of very similar policy files that get applied for file migration etc. 
And they vary drastically in the runtime to process, apparently due to different selections on whether to do the work in parallel. Running a set of rules with 'mmapplypolicy -I defer' that look like this: RULE 'VBI_FILES_RULE' MIGRATE FROM POOL 'system' THRESHOLD(0,100,0) WEIGHT(FILE_SIZE) TO POOL 'VBI_FILES' FOR FILESET('vbi') WHERE (mb_allocated >= 8) for 10 filesets can scan 325M directory entries in 6 minutes, and sort and evaluate the policy in 3 more minutes. However, this takes a bit over 30 minutes for the scan and another 20 for sorting and policy evaluation over the same set of filesets: RULE 'VBI_FILES_RULE' LIST 'pruned_files' THRESHOLD(90,80) WEIGHT(FILE_SIZE) FOR FILESET('vbi') WHERE (mb_allocated >= 8) even though the output is essentially identical. Why is LIST so much more expensive than 'MIGRATE" with '-I defer'? I could understand if I had an expensive SHOW clause, but there isn't one here (and a different policy that I run that *does* have a big SHOW clause takes almost the same amount of time as the minimal LIST).... I'm thinking that it has *something* to do with the MIGRATE job outputting: [I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned. (...) [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned. while the LIST job says: [I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records. (...) [I] 2017-09-12 at 14:02:04.223 Policy evaluation. 0 files scanned. (Both output the same message during the 'Directory entries scanned: 0.' phase, but I suspect MIGRATE is multi-threading that part as well, as it completes much faster). What's the controlling factor in mmapplypolicy's decision whether or not to parallelize the policy? _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=SGbwD3m5mZ16_vwIFK8Ym48lwdF1tVktnSao0a_tkfA&s=sLt9AtZiZ0qZCKzuQoQuyxN76_R66jfAwQxdIY-w2m0&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Thu Sep 14 21:09:40 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Thu, 14 Sep 2017 16:09:40 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. In-Reply-To: References: <52657.1505348331@turing-police.cc.vt.edu> Message-ID: <26551.1505419780@turing-police.cc.vt.edu> On Thu, 14 Sep 2017 14:55:39 -0400, "Marc A Kaplan" said: > Read the doc again. Specify both -g and -N options on the command line to > get fully parallel directory and inode/policy scanning. Yeah, figured that out, with help from somebody. :) > I'm curious as to what you're trying to do with THRESHOLD(0,100,0) ... > Perhaps premigrate everything (that matches the other conditions)? Yeah, it's actually feeding to LTFS/EE - where we premigrate everything that matches to tape. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From makaplan at us.ibm.com Thu Sep 14 22:13:59 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 14 Sep 2017 17:13:59 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. 
In-Reply-To: <26551.1505419780@turing-police.cc.vt.edu> References: <52657.1505348331@turing-police.cc.vt.edu> <26551.1505419780@turing-police.cc.vt.edu> Message-ID: BTW - we realize that mmapplypolicy -g and -N is a "gotcha" for some (many?) customer/admins -- so we're considering ways to make that easier -- but without "breaking" scripts and callbacks and what-have-yous that might depend on the current/old defaults... Always a balancing act -- considering that GPFS ne Spectrum Scale just hit its 20th birthday (by IBM reckoning) --marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil.wilson at metoffice.gov.uk Fri Sep 15 11:47:19 2017 From: neil.wilson at metoffice.gov.uk (Wilson, Neil) Date: Fri, 15 Sep 2017 10:47:19 +0000 Subject: [gpfsug-discuss] ZIMON Sensors config files... Message-ID: Hi, Does anyone know how to use "mmperfmon config update" to get the "hostname =" field in the ZImonSensors.cfg file populated with the hostname of the node that it's been installed on? By default the field is empty and for some reason on our cluster it doesn't transmit any metrics unless we put the node hostname into that field. Is there some kind of wildcard that I can set? Thanks Neil Neil Wilson Senior IT Practitioner Storage Team IT Services Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 885959 Email: neil.wilson at metoffice.gov.uk Website www.metoffice.gov.uk Our magazine Barometer is now available online at http://www.metoffice.gov.uk/barometer/ P Please consider the environment before printing this e-mail. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Sep 15 16:37:13 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 15 Sep 2017 15:37:13 +0000 Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? Message-ID: This is very probably off topic here.. I would be happy to get any responses off list. My question is has anyone here set up NFS re-export / proxy with nfs-ganesha? John Hearns -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Mon Sep 18 01:14:52 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Mon, 18 Sep 2017 00:14:52 +0000 Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? In-Reply-To: References: Message-ID: <5d1811f4d6ad4605bd2a7c7441f4dd1b@exch1-cdc.nexus.csiro.au> I am interested too, so maybe keep it on list? 
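For anyone who wants to experiment in the meantime: the re-export piece in ganesha is its PROXY FSAL. A rough, untested export sketch follows -- the back-end server address, paths and export id are invented, and the option names should be checked against the documentation for the ganesha version you are actually running:

    EXPORT {
        Export_Id = 101;
        Path = /data;               # export path on the back-end NFS server
        Pseudo = /reexport/data;    # where clients of the proxy see it
        Access_Type = RW;
        Protocols = 3, 4;
        FSAL {
            Name = PROXY;
            Srv_Addr = 192.0.2.10;  # back-end NFS server (example address)
        }
    }

Whether such a proxy in front of a Spectrum Scale CES cluster is a supported combination is a separate question, so treat this purely as a lab exercise.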
From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: Saturday, 16 September 2017 1:37 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? This is very probably off topic here.. I would be happy to get any responses off list. My question is has anyone here set up NFS re-export / proxy with nfs-ganesha? John Hearns -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.lefebvre+gpfsug at calculquebec.ca Mon Sep 18 20:16:57 2017 From: richard.lefebvre+gpfsug at calculquebec.ca (Richard Lefebvre) Date: Mon, 18 Sep 2017 15:16:57 -0400 Subject: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Message-ID: Hi I have a 3.5 GPFS system with 700+ nodes. I sometime have nodes that generate a lot of iops on the large file system but I cannot find the right tool to find which node is the source. I'm guessing under 4.2.X, there are now easy tools, but what can be done under GPFS 3.5. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Sep 18 20:27:49 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 18 Sep 2017 19:27:49 +0000 Subject: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Message-ID: <39FB5D56-A8C4-47DA-8A56-A2E453724875@nuance.com> You do realize 3.5 is out of service, correct? You should be looking at upgrading :-) Catching this is real time, when you have a large number of nodes is going to be tough. How you recognizing that the file system is overloaded? Waiters? Looking at which nodes/NSDs have the longest/largest waiters may provide a clue. You might also take a look at mmpmon ? it?s a bit difficult to use in its raw state, but it does provide some good stats on a per file system basis. But you need to track these over times to get what you need. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Richard Lefebvre Reply-To: gpfsug main discussion list Date: Monday, September 18, 2017 at 2:18 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Hi I have a 3.5 GPFS system with 700+ nodes. I sometime have nodes that generate a lot of iops on the large file system but I cannot find the right tool to find which node is the source. I'm guessing under 4.2.X, there are now easy tools, but what can be done under GPFS 3.5. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From scale at us.ibm.com Tue Sep 19 07:47:42 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 19 Sep 2017 14:47:42 +0800 Subject: [gpfsug-discuss] ZIMON Sensors config files... In-Reply-To: References: Message-ID: Hi Neil, Have you tried these steps? mmperfmon config show --config-file /tmp/a vi /tmp/a mmperfmon config update --collectors oc8757286465 --config-file /tmp/a mmperfmon config show Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Wilson, Neil" To: gpfsug main discussion list Date: 09/15/2017 06:48 PM Subject: [gpfsug-discuss] ZIMON Sensors config files... Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, Does anyone know how to use ?mmperfmon config update? to get the ?hostname =? field in the ZImonSensors.cfg file populated with the hostname of the node that it?s been installed on? By default the field is empty and for some reason on our cluster it doesn?t transmit any metrics unless we put the node hostname into that field. Is there some kind of wildcard that I can set? Thanks Neil Neil Wilson Senior IT Practitioner Storage Team IT Services Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 885959 Email: neil.wilson at metoffice.gov.uk Website www.metoffice.gov.uk Our magazine Barometer is now available online at http://www.metoffice.gov.uk/barometer/ P Please consider the environment before printing this e-mail. Thank you. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JJA1q39zaRyjClihY50646c-CyY4ZvrmpSjR1qs5rTc&s=GWOiCpEHiZ_TqlFj0AeKmjcccnez-X2rHMa5UtvGPTk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From scale at us.ibm.com Tue Sep 19 07:54:50 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 19 Sep 2017 14:54:50 +0800 Subject: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 In-Reply-To: <39FB5D56-A8C4-47DA-8A56-A2E453724875@nuance.com> References: <39FB5D56-A8C4-47DA-8A56-A2E453724875@nuance.com> Message-ID: Hi Richard, Is any of tool in https://www.ibm.com/developerworks/community/wikis/home?_escaped_fragment_=/wiki/General%2520Parallel%2520File%2520System%2520%2528GPFS%2529/page/Display%2520per%2520node%2520IO%2520statstics can help you? BTW, I agree with Bob that 3.5 is out-of-service. Without an extended service, you should consider to upgrade your cluster as soon as possible. 
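As a rough starting point with what 3.5 already ships, the mmpmon counters Bob mentioned can be sampled on every node and compared over an interval. A sketch only -- the node list and interval are placeholders:

    # snapshot the per-node I/O counters, wait, snapshot again
    mmdsh -N all 'echo io_s | /usr/lpp/mmfs/bin/mmpmon -p -s' > /tmp/iops.t0
    sleep 60
    mmdsh -N all 'echo io_s | /usr/lpp/mmfs/bin/mmpmon -p -s' > /tmp/iops.t1
    # mmdsh prefixes each line with the node name, so the nodes whose
    # read/write byte and request counters grow fastest between the two
    # snapshots are the ones generating the load

(Using fs_io_s instead of io_s gives the same counters broken down per file system.)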
Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 09/19/2017 03:28 AM Subject: Re: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Sent by: gpfsug-discuss-bounces at spectrumscale.org You do realize 3.5 is out of service, correct? You should be looking at upgrading :-) Catching this is real time, when you have a large number of nodes is going to be tough. How you recognizing that the file system is overloaded? Waiters? Looking at which nodes/NSDs have the longest/largest waiters may provide a clue. You might also take a look at mmpmon ? it?s a bit difficult to use in its raw state, but it does provide some good stats on a per file system basis. But you need to track these over times to get what you need. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Richard Lefebvre Reply-To: gpfsug main discussion list Date: Monday, September 18, 2017 at 2:18 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Hi I have a 3.5 GPFS system with 700+ nodes. I sometime have nodes that generate a lot of iops on the large file system but I cannot find the right tool to find which node is the source. I'm guessing under 4.2.X, there are now easy tools, but what can be done under GPFS 3.5. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=AYwUf61wv-Hq63KU7veQSxavdZy-e9eT9bkJFav8MVU&s=W42AQE74bvmOlw7P0D0wTqT0Rxop4KktnXeuDeGGdmk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From rohwedder at de.ibm.com Tue Sep 19 08:42:46 2017 From: rohwedder at de.ibm.com (Markus Rohwedder) Date: Tue, 19 Sep 2017 09:42:46 +0200 Subject: [gpfsug-discuss] ZIMON Sensors config files... In-Reply-To: References: Message-ID: Hello Neil, While the description below provides a way on how to edit the hostname parameter, you should not have the need to edit the "hostname" parameter. Sensors use the hostname() call to get the hostname where the sensor is running and use this as key in the performance database, which is what you typically want to see. From the description you provide I assume you want to have a sensor running on every node that has the perfmon designation? 
There could be different issues: > In order to enable sensors on every node, you need to ensure there is no "restrict" clause in the sensor description, or the restrict clause has to be set correctly > There could be some other communication issue between sensors and collectors. Restart sensors and collectors and check the logfiles in /var/log/zimon/. You should be able to see which sensors start up and if they can connect. > Can you check if you have the perfmon designation set for the nodes where you expect data from (mmlscluster) Mit freundlichen Gr??en / Kind regards Dr. Markus Rohwedder Spectrum Scale GUI Development Phone: +49 7034 6430190 IBM Deutschland E-Mail: rohwedder at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina K?deritz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 From: "IBM Spectrum Scale" To: gpfsug main discussion list Cc: gpfsug-discuss-bounces at spectrumscale.org Date: 09/19/2017 08:48 AM Subject: Re: [gpfsug-discuss] ZIMON Sensors config files... Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Neil, Have you tried these steps? mmperfmon config show --config-file /tmp/a vi /tmp/a mmperfmon config update --collectors oc8757286465 --config-file /tmp/a mmperfmon config show Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. Inactive hide details for "Wilson, Neil" ---09/15/2017 06:48:26 PM---Hi, Does anyone know how to use "mmperfmon config update" "Wilson, Neil" ---09/15/2017 06:48:26 PM---Hi, Does anyone know how to use "mmperfmon config update" to get the "hostname =" field in the ZImon From: "Wilson, Neil" To: gpfsug main discussion list Date: 09/15/2017 06:48 PM Subject: [gpfsug-discuss] ZIMON Sensors config files... Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, Does anyone know how to use ?mmperfmon config update? to get the ?hostname =? field in the ZImonSensors.cfg file populated with the hostname of the node that it?s been installed on? By default the field is empty and for some reason on our cluster it doesn?t transmit any metrics unless we put the node hostname into that field. Is there some kind of wildcard that I can set? Thanks Neil Neil Wilson Senior IT Practitioner Storage Team IT Services Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 885959 Email: neil.wilson at metoffice.gov.uk Website www.metoffice.gov.uk Our magazine Barometer is now available online at http://www.metoffice.gov.uk/barometer/ P Please consider the environment before printing this e-mail. Thank you. 
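For reference, the checks listed above map onto a handful of commands; a rough sketch, with the node name being a placeholder:

    mmlscluster                                # 'perfmon' should appear in the node's designation
    mmchnode --perfmon -N somenode             # add the designation if it is missing
    mmperfmon config show | grep -i restrict   # look for restrict clauses limiting where sensors run
    systemctl restart pmsensors                # on the node(s) that should send metrics
    systemctl restart pmcollector              # on the collector node(s)
    tail /var/log/zimon/*.log                  # confirm the sensors start up and connect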
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JJA1q39zaRyjClihY50646c-CyY4ZvrmpSjR1qs5rTc&s=GWOiCpEHiZ_TqlFj0AeKmjcccnez-X2rHMa5UtvGPTk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=Ow2bpnoab1kboH2xuSUrbx65ALeoAAicG7csl1sV-Qc&s=qZ1XUXWfOayLSSuvcCyHQ2ZgY1mu0Zs3kmpgeVQUCYI&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 1D696444.gif Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From mnaineni at in.ibm.com Tue Sep 19 12:50:50 2017 From: mnaineni at in.ibm.com (Malahal R Naineni) Date: Tue, 19 Sep 2017 11:50:50 +0000 Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? (Greg.Lehmann@csiro.au) Message-ID: An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Tue Sep 19 22:02:03 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Tue, 19 Sep 2017 21:02:03 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? Message-ID: <26ABC473-387D-4D58-9059-518E455724A9@vanderbilt.edu> Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Robert.Oesterlin at nuance.com Wed Sep 20 00:39:37 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 19 Sep 2017 23:39:37 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? Message-ID: OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Buterbaugh, Kevin L" Reply-To: gpfsug main discussion list Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Wed Sep 20 02:21:36 2017 From: bevans at pixitmedia.com (Barry Evans) Date: Tue, 19 Sep 2017 18:21:36 -0700 Subject: [gpfsug-discuss] RoCE not playing ball Message-ID: Hi All, Weirdness with a RoCE interface - verbs is not playing ball and is complaining about the inet6 address not matching up: 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version >= 1.1) loaded and initialized. 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 * nspdQueues 1)). 
2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981E1 state DOWN 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 with GID c081f9feff078a26. Please check if the correct inet6 address for the corresponding IP network interface is set 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid verbsPorts defined. Anyone run into this before? I have another node imaged the *exact* same way and no dice. Have tried a variety of drivers, cards, etc, same result every time. Cheers, Barry -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Wed Sep 20 04:07:18 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 20 Sep 2017 11:07:18 +0800 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: References: Message-ID: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. 
mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Buterbaugh, Kevin L" Reply-To: gpfsug main discussion list Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=mBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y&s=YJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From scale at us.ibm.com Wed Sep 20 04:33:16 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 20 Sep 2017 11:33:16 +0800 Subject: [gpfsug-discuss] Disk change problem in gss GNR In-Reply-To: References: Message-ID: Hi Atmane, In terms of this kind of disk management question, I would like to suggest to open a PMR to make IBM service help you. mmdelpdisk command would not need to reboot system to take effect. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: atmane To: "gpfsug-discuss at spectrumscale.org" Date: 09/14/2017 08:50 PM Subject: [gpfsug-discuss] Disk change problem in gss GNR Sent by: gpfsug-discuss-bounces at spectrumscale.org dear all, I change A Disk In Gss Storage Server mmchcarrier BB1RGL --release --pdisk 'e1d1s02' mmchcarrier BB1RGL --replace --pdisk 'e1d1s02' after replace disk Now I Have 2 Discs In My Gss the first disc was well changed name = "e1d1s02" the second disk still after I use this cmd mmdelpdisk BB1RGL --pdisk e1d1s02#004 -a the disk is still in use i need to reboot the system or ?? mmlspdisk all | less pdisk: replacementPriority = 1000 name = "e1d1s02" device = "/dev/sdik,/dev/sdih" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "ok" capacity = 3000034656256 freeSpace = 1453846429696 fru = "00W1572" location = "SV30820390-1-2" WWN = "naa.5000C5008D783E37" server = "gss0-ib0" pdisk: replacementPriority = 1000 name = "e1d1s02#004" device = "" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "missing/noPath/systemDrain/adminDrain/noRGD/noVCD" capacity = 3000034656256 freeSpace = 1599875317760 fru = "00W1572" location = "" WWN = "naa.5000C50056714E83" server = "gss0-ib0" -- -- Atmane Khiredine HPC System Admin | Office National de la M?t?orologie T?l : +213 21 50 73 93 Poste 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFbA&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=hQ86ctTaI7i14NrB-58_SzqSWnCR8p6b5bFxtzNcSbk&s=mthjH7ebhnNlSJl71hFjF4wZU0iygm3I9wH_Bu7_3Ds&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From olaf.weiser at de.ibm.com Wed Sep 20 06:00:49 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 20 Sep 2017 07:00:49 +0200 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From jonathon.anderson at colorado.edu Wed Sep 20 06:13:13 2017 From: jonathon.anderson at colorado.edu (Jonathon A Anderson) Date: Wed, 20 Sep 2017 05:13:13 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , Message-ID: Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. 
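(For reference, the NFS-only flow the earlier replies describe would look roughly like the following -- the export path and client spec are placeholders, and as this thread shows, some code levels refuse the userdefined step unless SMB is also enabled:

    mmces service enable NFS
    mmuserauth service create --data-access-method file --type userdefined
    mmnfs export add /gpfs/fs1/export \
        --client 'client1(Access_Type=RW,Protocols=3,Squash=no_root_squash)'

)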
I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathon.anderson at colorado.edu Wed Sep 20 06:33:14 2017 From: jonathon.anderson at colorado.edu (Jonathon A Anderson) Date: Wed, 20 Sep 2017 05:33:14 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , , Message-ID: I should have said, here are the package versions: [root at sgate1 ~]# rpm -qa | grep gpfs gpfs.gpl-4.2.2-3.noarch gpfs.docs-4.2.2-3.noarch gpfs.base-4.2.2-3.x86_64 gpfs.gplbin-3.10.0-514.26.2.el7.x86_64-4.2.2-3.x86_64 nfs-ganesha-gpfs-2.3.2-0.ibm32_2.el7.x86_64 gpfs.ext-4.2.2-3.x86_64 gpfs.msg.en_US-4.2.2-3.noarch gpfs.gskit-8.0.50-57.x86_64 gpfs.gplbin-3.10.0-327.36.3.el7.x86_64-4.2.2-3.x86_64 ________________________________________ From: Jonathon A Anderson Sent: Tuesday, September 19, 2017 11:13:13 PM To: gpfsug main discussion list Cc: varun.mittal at in.ibm.com; Mark.Bush at siriuscom.com Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. 
Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. 
Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From gangqiu at cn.ibm.com Wed Sep 20 06:58:15 2017 From: gangqiu at cn.ibm.com (Gang Qiu) Date: Wed, 20 Sep 2017 13:58:15 +0800 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: Do you set ip address for these adapters? Refer to the description of verbsRdmaCm in ?Command and Programming Reference': If RDMA CM is enabled for a node, the node will only be able to establish RDMA connections using RDMA CM to other nodes with verbsRdmaCm enabled. RDMA CM enablement requires IPoIB (IP over InfiniBand) with an active IP address for each port. Although IPv6 must be enabled, the GPFS implementation of RDMA CM does not currently support IPv6 addresses, so an IPv4 address must be used. Regards, Gang Qiu ********************************************************************************************** IBM China Systems & Technology Lab Tel: 86-10-82452193 Fax: 86-10-82452312 Moble: 132-6134-8284 Email: gangqiu at cn.ibm.com Address: Ring Bldg. No.28 Building, Zhong Guan Cun Software Park, No. 8 Dong Bei Wang West Road, ShangDi, Haidian District, Beijing 100193, P.R.China ??????????????8???????28???????????100193 ********************************************************************************************** From: "Olaf Weiser" To: gpfsug main discussion list Date: 09/20/2017 01:01 PM Subject: Re: [gpfsug-discuss] RoCE not playing ball Sent by: gpfsug-discuss-bounces at spectrumscale.org is ib_read_bw working ? just test it between the two nodes ... From: Barry Evans To: gpfsug main discussion list Date: 09/20/2017 03:21 AM Subject: [gpfsug-discuss] RoCE not playing ball Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi All, Weirdness with a RoCE interface - verbs is not playing ball and is complaining about the inet6 address not matching up: 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version >= 1.1) loaded and initialized. 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 * nspdQueues 1)). 
2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981E1 state DOWN 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 with GID c081f9feff078a26. Please check if the correct inet6 address for the corresponding IP network interface is set 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid verbsPorts defined. Anyone run into this before? I have another node imaged the *exact* same way and no dice. Have tried a variety of drivers, cards, etc, same result every time. Cheers, Barry This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=NCthMXTjizwdEVDBqoDwAfRswiFbdQVHRb4mzseFLEM&m=u155tVFn5u91gqIsTXSOSVvpbR7GQRPoVpviUDH73R0&s=63nY5ozD8mej1jefNBZjLGCkNOFD9-swr-lc7CRPbrM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From tortay at cc.in2p3.fr Wed Sep 20 09:03:54 2017 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Wed, 20 Sep 2017 10:03:54 +0200 Subject: [gpfsug-discuss] CCR cluster down for the count? 
In-Reply-To: <26ABC473-387D-4D58-9059-518E455724A9@vanderbilt.edu> References: <26ABC473-387D-4D58-9059-518E455724A9@vanderbilt.edu> Message-ID: <853ffcf7-7900-457b-0d8a-2c63886ed245@cc.in2p3.fr> On 19/09/2017 23:02, Buterbaugh, Kevin L wrote: > Hi All, > > We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. > > Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) > > I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. > > However, when I try to startup GPFS ? or run any GPFS command I get: > > /root > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /root > root at testnsd2# > > I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? > Hello, I have had the same issue multiple times. The "trick" is to execute "/usr/lpp/mmfs/bin/mmcommon startCcrMonitor" on a majority of quorum nodes (once they have the correct configuration files) to be able to start the cluster. I noticed a call to the above command in the "gpfs.gplbin" spec file in the "%postun" section (when doing RPM upgrades, if I'm not mistaken). . Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | From r.sobey at imperial.ac.uk Wed Sep 20 09:23:37 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 20 Sep 2017 08:23:37 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , Message-ID: This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. 
I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). 
>> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From douglasof at us.ibm.com Wed Sep 20 09:28:44 2017 From: douglasof at us.ibm.com (Douglas O'flaherty) Date: Wed, 20 Sep 2017 08:28:44 +0000 Subject: [gpfsug-discuss] User Meeting & SPXXL in NYC Message-ID: Reminder that the SPXXL day on IBM Spectrum Scale in New York is open to all. It is Thursday the 28th. There is also a Power day on Wednesday. For more information http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ Doug Mobile -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckrafft at de.ibm.com Wed Sep 20 11:47:35 2017 From: ckrafft at de.ibm.com (Christoph Krafft) Date: Wed, 20 Sep 2017 12:47:35 +0200 Subject: [gpfsug-discuss] WANTED: Official support statement using Spectrum Scale 4.2.x with Oracle DB v12 Message-ID: Hi folks, is anyone aware if there is now an official support statement for Spectrum Scale 4.2.x? As far as my understanding goes - we currently have an "older" official support statement for v4.1 with Oracle. Many thanks up-front for any useful hints ... :) Mit freundlichen Gr??en / Sincerely Christoph Krafft Client Technical Specialist - Power Systems, IBM Systems Certified IT Specialist @ The Open Group Phone: +49 (0) 7034 643 2171 IBM Deutschland GmbH Mobile: +49 (0) 160 97 81 86 12 Am Weiher 24 Email: ckrafft at de.ibm.com 65451 Kelsterbach Germany IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Norbert Janzen, Stefan Lutz, Nicole Reimer, Dr. Klaus Seifert, Wolfgang Wendt Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 15225079.gif Type: image/gif Size: 1851 bytes Desc: not available URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Sep 20 14:55:28 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 20 Sep 2017 13:55:28 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: References: Message-ID: Hi All, testnsd1 and testnsd3 both had hardware issues (power supply and internal HD respectively). Given that they were 12 year old boxes, we decided to replace them with other boxes that are a mere 7 years old ? keep in mind that this is a test cluster. Disabling CCR does not work, even with the undocumented ??force? option: /var/mmfs/gen root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force mmchcluster: Unable to obtain the GPFS configuration file lock. mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. mmchcluster: Processing continues without lock protection. The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. Verifying GPFS is stopped on all nodes ... The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. 
ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: root at vmp610.vampire's password: root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. mmchcluster: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# I believe that part of the problem may be that there are 4 client nodes that were removed from the cluster without removing them from the cluster (done by another SysAdmin who was in a hurry to repurpose those machines). They?re up and pingable but not reachable by GPFS anymore, which I?m pretty sure is making things worse. Nor does Loic?s suggestion of running mmcommon work (but thanks for the suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to start the cluster up failed: /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# Thanks. Kevin On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > wrote: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. 
Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and From: "Oesterlin, Robert" > To: gpfsug main discussion list > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=mBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y&s=YJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Wed Sep 20 15:17:34 2017 From: bevans at pixitmedia.com (Barry Evans) Date: Wed, 20 Sep 2017 07:17:34 -0700 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: Yep, IP's set ok. We did try with ipv6 off to see what would happen, then turned it back on again. There are ipv6 addresses on the cards, but ipv4 is the only thing actually being used. On Tue, Sep 19, 2017 at 10:58 PM, Gang Qiu wrote: > > > > Do you set ip address for these adapters? > > Refer to the description of verbsRdmaCm in ?Command and Programming > Reference': > > If RDMA CM is enabled for a node, the node will only be able to establish > RDMA connections > using RDMA CM to other nodes with *verbsRdmaCm *enabled. RDMA CM > enablement requires > IPoIB (IP over InfiniBand) with an active IP address for each port. > Although IPv6 must be > enabled, the GPFS implementation of RDMA CM does not currently support > IPv6 addresses, so > an IPv4 address must be used. > > > > Regards, > Gang Qiu > > ************************************************************ > ********************************** > IBM China Systems & Technology Lab > Tel: 86-10-82452193 > Fax: 86-10-82452312 > Moble: 132-6134-8284 > Email: gangqiu at cn.ibm.com > Address: Ring Bldg. No.28 Building, Zhong Guan Cun Software Park, No. 8 > Dong Bei Wang West Road, ShangDi, Haidian District, Beijing 100193, > P.R.China > ??????????????8???????28???????????100193 > ************************************************************ > ********************************** > > > > From: "Olaf Weiser" > To: gpfsug main discussion list > Date: 09/20/2017 01:01 PM > Subject: Re: [gpfsug-discuss] RoCE not playing ball > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > is ib_read_bw working ? > just test it between the two nodes ... 
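A minimal way to run that check, assuming the perftest package (which provides ib_read_bw) is installed and reusing the mlx4_0 / port 1 names from the log above (the device, port and address below are taken from this thread and would need adapting to the actual setup):

    # on the first node, start the server side listening on the RoCE port
    ib_read_bw -d mlx4_0 -i 1 -R

    # on the second node, point the client at the first node's IP on that interface
    # (0x0000FFFFAC106404 in the discover lines looks like the IPv4-mapped GID for 172.16.100.4)
    ib_read_bw -d mlx4_0 -i 1 -R 172.16.100.4

    # if it fails, dump the GID table and check the interface addresses
    ibv_devinfo -v -d mlx4_0
    ip -6 addr show

The -R option makes ib_read_bw connect through the RDMA connection manager, so it exercises roughly the same rdma_cm + GID/IP lookup that the "interface not found for port 1 of device mlx4_0" parse error above is complaining about; it is a reasonable sanity check that RoCE itself works before blaming the verbsPorts configuration.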
> > > > > From: Barry Evans > To: gpfsug main discussion list > Date: 09/20/2017 03:21 AM > Subject: [gpfsug-discuss] RoCE not playing ball > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi All, > > Weirdness with a RoCE interface - verbs is not playing ball and is > complaining about the inet6 address not matching up: > > 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes > verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version > >= 1.1) loaded and initialized. > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced > from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 > * nspdQueues 1)). > 2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981E1 state DOWN > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE > 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 > 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort > mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 > with GID c081f9feff078a26. Please check if the correct inet6 address for > the corresponding IP network interface is set > 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 > 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. > 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid > verbsPorts defined. > > > Anyone run into this before? I have another node imaged the *exact* same > way and no dice. Have tried a variety of drivers, cards, etc, same result > every time. > > Cheers, > Barry > > > > > > > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. 
Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email._______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r= > NCthMXTjizwdEVDBqoDwAfRswiFbdQVHRb4mzseFLEM&m= > u155tVFn5u91gqIsTXSOSVvpbR7GQRPoVpviUDH73R0&s= > 63nY5ozD8mej1jefNBZjLGCkNOFD9-swr-lc7CRPbrM&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Wed Sep 20 15:23:21 2017 From: bevans at pixitmedia.com (Barry Evans) Date: Wed, 20 Sep 2017 07:23:21 -0700 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: It has worked, yes, and while the issue has been present. At the moment it's not working, but I'm not entirely surprised with the amount it's been poked at. Cheers, Barry On Tue, Sep 19, 2017 at 10:00 PM, Olaf Weiser wrote: > is ib_read_bw working ? > just test it between the two nodes ... > > > > > From: Barry Evans > To: gpfsug main discussion list > Date: 09/20/2017 03:21 AM > Subject: [gpfsug-discuss] RoCE not playing ball > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi All, > > Weirdness with a RoCE interface - verbs is not playing ball and is > complaining about the inet6 address not matching up: > > 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes > verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version > >= 1.1) loaded and initialized. > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced > from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 > * nspdQueues 1)). 
> 2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981E1 state DOWN > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE > 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 > 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort > mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 > with GID c081f9feff078a26. Please check if the correct inet6 address for > the corresponding IP network interface is set > 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 > 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. > 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid > verbsPorts defined. > > > Anyone run into this before? I have another node imaged the *exact* same > way and no dice. Have tried a variety of drivers, cards, etc, same result > every time. > > Cheers, > Barry > > > > > > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email._______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. 
Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkr at lbl.gov Wed Sep 20 17:00:15 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Wed, 20 Sep 2017 09:00:15 -0700 Subject: [gpfsug-discuss] User Meeting & SPXXL in NYC In-Reply-To: References: Message-ID: Thanks Doug. If you plan to go, *do register*. GPFS Day is free, but we need to know how many will attend. Register using the link on the HPCXXL event page below. Cheers, Kristy > On Sep 20, 2017, at 1:28 AM, Douglas O'flaherty wrote: > > > Reminder that the SPXXL day on IBM Spectrum Scale in New York is open to all. It is Thursday the 28th. There is also a Power day on Wednesday. > > > For more information > http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ > > Doug > > Mobile > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Sep 20 17:27:48 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 20 Sep 2017 16:27:48 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: <20170920114844.6bf9f27b@osc.edu> References: <20170920114844.6bf9f27b@osc.edu> Message-ID: <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> Hi Ed, Thanks for the suggestion ? that?s basically what I had done yesterday after Googling and getting a hit or two on the IBM DeveloperWorks site. I?m including some output below which seems to show that I?ve got everything set up but it?s still not working. Am I missing something? We don?t use CCR on our production cluster (and this experience doesn?t make me eager to do so!), so I?m not that familiar with it... Kevin /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v grep" | sort testdellnode1: root 2583 1 0 May30 ? 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testdellnode1: root 6694 2583 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 2023 5828 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 5828 1 0 Sep18 ? 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 19356 4628 0 11:19 tty1 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 4628 1 0 Sep19 tty1 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 22149 2983 0 11:16 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 2983 1 0 Sep18 ? 00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 15685 6557 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 6557 1 0 Sep19 ? 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 29424 6512 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 6512 1 0 Sep18 ? 
00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort testdellnode1: drwxr-xr-x 2 root root 4096 Mar 3 2017 cached testdellnode1: drwxr-xr-x 2 root root 4096 Nov 10 2016 committed testdellnode1: -rw-r--r-- 1 root root 99 Nov 10 2016 ccr.nodes testdellnode1: total 12 testgateway: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testgateway: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testgateway: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes testgateway: total 12 testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 cached testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 committed testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth testnsd1: -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes testnsd1: total 8 testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached testnsd2: drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.1 testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.2 testnsd2: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd2: total 16 testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed testnsd3: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd3: -rw-r--r-- 1 root root 4 Sep 19 15:41 ccr.noauth testnsd3: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd3: total 8 testsched: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes testsched: total 12 /var/mmfs/gen root at testnsd2# more ../ccr/ccr.nodes 3,0,10.0.6.215,,testnsd3.vampire 1,0,10.0.6.213,,testnsd1.vampire 2,0,10.0.6.214,,testnsd2.vampire /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testgateway: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testsched: -rw-r--r--. 
1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/ssl/stage/genkeyData1" testnsd3: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd2: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testdellnode1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testgateway: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testsched: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 /var/mmfs/gen root at testnsd2# On Sep 20, 2017, at 10:48 AM, Edward Wahl > wrote: I've run into this before. We didn't use to use CCR. And restoring nodes for us is a major pain in the rear as we only allow one-way root SSH, so we have a number of useful little scripts to work around problems like this. Assuming that you have all the necessary files copied to the correct places, you can manually kick off CCR. I think my script does something like: (copy the encryption key info) scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor you should then see like 2 copies of it running under mmksh. Ed On Wed, 20 Sep 2017 13:55:28 +0000 "Buterbaugh, Kevin L" > wrote: Hi All, testnsd1 and testnsd3 both had hardware issues (power supply and internal HD respectively). Given that they were 12 year old boxes, we decided to replace them with other boxes that are a mere 7 years old ? keep in mind that this is a test cluster. Disabling CCR does not work, even with the undocumented ??force? option: /var/mmfs/gen root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force mmchcluster: Unable to obtain the GPFS configuration file lock. mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. mmchcluster: Processing continues without lock protection. The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. 
ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. Verifying GPFS is stopped on all nodes ... The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: root at vmp610.vampire's password: root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp610.vampire: Permission denied, please try again. 
vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. mmchcluster: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# I believe that part of the problem may be that there are 4 client nodes that were removed from the cluster without removing them from the cluster (done by another SysAdmin who was in a hurry to repurpose those machines). They?re up and pingable but not reachable by GPFS anymore, which I?m pretty sure is making things worse. Nor does Loic?s suggestion of running mmcommon work (but thanks for the suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to start the cluster up failed: /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# Thanks. Kevin On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > wrote: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and From: "Oesterlin, Robert" > To: gpfsug main discussion list > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. 
Bob Oesterlin Sr Principal Storage Engineer, Nuance From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 -- Ed Wahl Ohio Supercomputer Center 614-292-9302 ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From stijn.deweirdt at ugent.be Wed Sep 20 18:48:26 2017 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Wed, 20 Sep 2017 19:48:26 +0200 Subject: [gpfsug-discuss] CCR cluster down for the count? 
In-Reply-To: <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> References: <20170920114844.6bf9f27b@osc.edu> <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> Message-ID: <1f0b2657-8ca3-7b35-95f3-7c4edb6c0818@ugent.be> hi kevin, we were hit by similar issue when we did something not so smart: we had a 5 node quorum, and we wanted to replace 1 test node with 3 more production quorum node. we however first removed the test node, and then with 4 quorum nodes we did mmshutdown for some other config modifications. when we tried to start it, we hit the same "Not enough CCR quorum nodes available" errors. also, none of the ccr commands were helpful; they also hanged, even simple ones like show etc etc. what we did in the end was the following (and some try-and-error): from the /var/adm/ras/mmsdrserv.log logfiles we guessed that we had some sort of split brain paxos cluster (some reported " ccrd: recovery complete (rc 809)", some same message with 'rc 0' and some didn't have the recovery complete on the last line(s)) * stop ccr everywhere mmshutdown -a mmdsh -N all pkill -9 -f mmccr * one by one, start the paxos cluster using mmshutdown on the quorum nodes (mmshutdown will start ccr and there is no unit or something to help with that). * the nodes will join after 3-4 minutes and report "recovery complete"; wait for it before you start another one * the trial-and-error part was that sometimes there was recovery complete with rc=809, sometimes with rc=0. in the end, once they all had same rc=0, paxos was happy again and eg mmlsconfig worked again. this left a very bad experience with CCR with us, but we want to use ces, so no real alternative (and to be honest, with odd number of quorum, we saw no more issues, everyting was smooth). in particular we were missing * unit files for all extra services that gpfs launched (mmccrmoniotr, mmsysmon); so we can monitor and start/stop them cleanly * ccr commands that work with broken paxos setup; eg to report that the paxos cluster is broken or operating in some split-brain mode. anyway, YMMV and good luck. stijn On 09/20/2017 06:27 PM, Buterbaugh, Kevin L wrote: > Hi Ed, > > Thanks for the suggestion ? that?s basically what I had done yesterday after Googling and getting a hit or two on the IBM DeveloperWorks site. I?m including some output below which seems to show that I?ve got everything set up but it?s still not working. > > Am I missing something? We don?t use CCR on our production cluster (and this experience doesn?t make me eager to do so!), so I?m not that familiar with it... > > Kevin > > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v grep" | sort > testdellnode1: root 2583 1 0 May30 ? 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testdellnode1: root 6694 2583 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 2023 5828 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 5828 1 0 Sep18 ? 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd1: root 19356 4628 0 11:19 tty1 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd1: root 4628 1 0 Sep19 tty1 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd2: root 22149 2983 0 11:16 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd2: root 2983 1 0 Sep18 ? 
00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd3: root 15685 6557 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd3: root 6557 1 0 Sep19 ? 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 29424 6512 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 6512 1 0 Sep18 ? 00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > /var/mmfs/gen > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort > testdellnode1: drwxr-xr-x 2 root root 4096 Mar 3 2017 cached > testdellnode1: drwxr-xr-x 2 root root 4096 Nov 10 2016 committed > testdellnode1: -rw-r--r-- 1 root root 99 Nov 10 2016 ccr.nodes > testdellnode1: total 12 > testgateway: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed > testgateway: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached > testgateway: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes > testgateway: total 12 > testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 cached > testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 committed > testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks > testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth > testnsd1: -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes > testnsd1: total 8 > testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached > testnsd2: drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed > testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.1 > testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.2 > testnsd2: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks > testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes > testnsd2: total 16 > testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached > testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed > testnsd3: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks > testnsd3: -rw-r--r-- 1 root root 4 Sep 19 15:41 ccr.noauth > testnsd3: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes > testnsd3: total 8 > testsched: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed > testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached > testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes > testsched: total 12 > /var/mmfs/gen > root at testnsd2# more ../ccr/ccr.nodes > 3,0,10.0.6.215,,testnsd3.vampire > 1,0,10.0.6.213,,testnsd1.vampire > 2,0,10.0.6.214,,testnsd2.vampire > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" > testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs > testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs > testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs > testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs > testgateway: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs > testsched: -rw-r--r--. 
1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" > testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/ssl/stage/genkeyData1" > testnsd3: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testnsd1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testnsd2: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testdellnode1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testgateway: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testsched: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > /var/mmfs/gen > root at testnsd2# > > On Sep 20, 2017, at 10:48 AM, Edward Wahl > wrote: > > I've run into this before. We didn't use to use CCR. And restoring nodes for > us is a major pain in the rear as we only allow one-way root SSH, so we have a > number of useful little scripts to work around problems like this. > > Assuming that you have all the necessary files copied to the correct > places, you can manually kick off CCR. > > I think my script does something like: > > (copy the encryption key info) > > scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ > > scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ > > scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ > > :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor > > you should then see like 2 copies of it running under mmksh. > > Ed > > > On Wed, 20 Sep 2017 13:55:28 +0000 > "Buterbaugh, Kevin L" > wrote: > > Hi All, > > testnsd1 and testnsd3 both had hardware issues (power supply and internal HD > respectively). Given that they were 12 year old boxes, we decided to replace > them with other boxes that are a mere 7 years old ? keep in mind that this is > a test cluster. > > Disabling CCR does not work, even with the undocumented ??force? option: > > /var/mmfs/gen > root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force > mmchcluster: Unable to obtain the GPFS configuration file lock. > mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. > mmchcluster: Processing continues without lock protection. > The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key > fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key > fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? 
The authenticity of host 'vmp608.vampire > (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp612.vampire > (10.0.21.12)' can't be established. ECDSA key fingerprint is > SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is > MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's password: > testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire > remote shell process had return code 255. testnsd1.vampire: Host key > verification failed. mmdsh: testnsd1.vampire remote shell process had return > code 255. vmp609.vampire: Host key verification failed. mmdsh: > vmp609.vampire remote shell process had return code 255. vmp608.vampire: > Host key verification failed. mmdsh: vmp608.vampire remote shell process had > return code 255. vmp612.vampire: Host key verification failed. mmdsh: > vmp612.vampire remote shell process had return code 255. > > root at vmp610.vampire's password: vmp610.vampire: > Permission denied, please try again. > > root at vmp610.vampire's password: vmp610.vampire: > Permission denied, please try again. > > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. > > Verifying GPFS is stopped on all nodes ... > The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key > fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key > fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp609.vampire > (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire > (10.0.6.213)' can't be established. ECDSA key fingerprint is > SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is > MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's password: > root at vmp610.vampire's password: > root at vmp610.vampire's password: > > testnsd3.vampire: Host key verification failed. > mmdsh: testnsd3.vampire remote shell process had return code 255. > vmp612.vampire: Host key verification failed. > mmdsh: vmp612.vampire remote shell process had return code 255. > vmp608.vampire: Host key verification failed. > mmdsh: vmp608.vampire remote shell process had return code 255. 
> vmp609.vampire: Host key verification failed. > mmdsh: vmp609.vampire remote shell process had return code 255. > testnsd1.vampire: Host key verification failed. > mmdsh: testnsd1.vampire remote shell process had return code 255. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. mmchcluster: Command failed. > Examine previous error messages to determine cause. /var/mmfs/gen > root at testnsd2# > > I believe that part of the problem may be that there are 4 client nodes that > were removed from the cluster without removing them from the cluster (done by > another SysAdmin who was in a hurry to repurpose those machines). They?re up > and pingable but not reachable by GPFS anymore, which I?m pretty sure is > making things worse. > > Nor does Loic?s suggestion of running mmcommon work (but thanks for the > suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to > start the cluster up failed: > > /var/mmfs/gen > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /var/mmfs/gen > root at testnsd2# > > Thanks. > > Kevin > > On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > > wrote: > > > Hi Kevin, > > Let's me try to understand the problem you have. What's the meaning of node > died here. Are you mean that there are some hardware/OS issue which cannot be > fixed and OS cannot be up anymore? > > I agree with Bob that you can have a try to disable CCR temporally, restore > cluster configuration and enable it again. > > Such as: > > 1. Login to a node which has proper GPFS config, e.g NodeA > 2. Shutdown daemon in all client cluster. > 3. mmchcluster --ccr-disable -p NodeA > 4. mmsdrrestore -a -p NodeA > 5. mmauth genkey propagate -N testnsd1, testnsd3 > 6. mmchcluster --ccr-enable > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale > (GPFS), then please post it to the public IBM developerWroks Forum at > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. > > If your query concerns a potential software error in Spectrum Scale (GPFS) > and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in other > countries. > > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > > "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run > across this before, and it?s because of a bug (as I recall) having to do with > CCR and > > From: "Oesterlin, Robert" > > To: gpfsug > main discussion list > > > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for > the count? 
Sent by: > gpfsug-discuss-bounces at spectrumscale.org > > ________________________________ > > > > OK ? I?ve run across this before, and it?s because of a bug (as I recall) > having to do with CCR and quorum. What I think you can do is set the cluster > to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back > up and then re-enable ccr. > > I?ll see if I can find this in one of the recent 4.2 release nodes. > > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > From: > > > on behalf of "Buterbaugh, Kevin L" > > > Reply-To: gpfsug main discussion list > > > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > > > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? > > Hi All, > > We have a small test cluster that is CCR enabled. It only had/has 3 NSD > servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while > back. I did nothing about it at the time because it was due to be life-cycled > as soon as I finished a couple of higher priority projects. > > Yesterday, testnsd1 also died, which took the whole cluster down. So now > resolving this has become higher priority? ;-) > > I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve > done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also > done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from > testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to > testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? > ssh without a password between those 3 boxes is fine. > > However, when I try to startup GPFS ? or run any GPFS command I get: > > /root > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /root > root at testnsd2# > > I?ve got to run to a meeting right now, so I hope I?m not leaving out any > crucial details here ? does anyone have an idea what I need to do? Thanks? > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 > > > > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 > > > > ? 
> Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From jonathon.anderson at colorado.edu Wed Sep 20 19:55:04 2017 From: jonathon.anderson at colorado.edu (Jonathon A Anderson) Date: Wed, 20 Sep 2017 18:55:04 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , , Message-ID: I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? 
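(For readers working through this thread: the replies further down converge on roughly the following sequence for a CES node where only NFS is wanted. This is only a sketch - the export path and client pattern are the examples from the message above, and removing an existing file-auth configuration is only appropriate if nothing else depends on it:

/usr/lpp/mmfs/bin/mmuserauth service remove --data-access-method file    # clear any partially configured file auth, if one exists
/usr/lpp/mmfs/bin/mmces service disable smb                              # with SMB disabled, userdefined auth no longer complains about it
/usr/lpp/mmfs/bin/mmces service list -a                                  # confirm which protocol services remain enabled
/usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined
mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(Access_Type=rw,Squash=root_squash);dtn*.rc.int.colorado.edu(Access_Type=rw,Squash=root_squash)'

Note that the Access_Type= and Squash= keywords have to be spelled out in the client definition, as a later reply points out.)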
~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... 
> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss From ewahl at osc.edu Wed Sep 20 20:07:39 2017 From: ewahl at osc.edu (Edward Wahl) Date: Wed, 20 Sep 2017 15:07:39 -0400 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> References: <20170920114844.6bf9f27b@osc.edu> <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> Message-ID: <20170920150739.39f0a4a0@osc.edu> So who was the ccrmaster before? What is/was the quorum config? (tiebreaker disks?) what does 'mmccr check' say? Have you set DEBUG=1 and tried mmstartup to see if it teases out any more info from the error? Ed On Wed, 20 Sep 2017 16:27:48 +0000 "Buterbaugh, Kevin L" wrote: > Hi Ed, > > Thanks for the suggestion ? that?s basically what I had done yesterday after > Googling and getting a hit or two on the IBM DeveloperWorks site. 
I?m > including some output below which seems to show that I?ve got everything set > up but it?s still not working. > > Am I missing something? We don?t use CCR on our production cluster (and this > experience doesn?t make me eager to do so!), so I?m not that familiar with > it... > > Kevin > > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v > grep" | sort testdellnode1: root 2583 1 0 May30 ? > 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testdellnode1: root 6694 2583 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 2023 5828 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 5828 1 0 Sep18 ? > 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: > root 19356 4628 0 11:19 tty1 > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: > root 4628 1 0 Sep19 tty1 > 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: > root 22149 2983 0 11:16 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: > root 2983 1 0 Sep18 ? > 00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: > root 15685 6557 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: > root 6557 1 0 Sep19 ? > 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 29424 6512 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 6512 1 0 Sep18 ? > 00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor > 15 /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR > quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr > fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous > error messages to determine cause. /var/mmfs/gen root at testnsd2# mmdsh > -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort testdellnode1: > drwxr-xr-x 2 root root 4096 Mar 3 2017 cached testdellnode1: drwxr-xr-x 2 > root root 4096 Nov 10 2016 committed testdellnode1: -rw-r--r-- 1 root > root 99 Nov 10 2016 ccr.nodes testdellnode1: total 12 testgateway: > drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testgateway: drwxr-xr-x. > 2 root root 4096 Mar 3 2017 cached testgateway: -rw-r--r--. 1 root root > 99 Jun 29 2016 ccr.nodes testgateway: total 12 testnsd1: drwxr-xr-x 2 root > root 6 Sep 19 15:38 cached testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 > committed testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks > testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth testnsd1: > -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes testnsd1: total 8 > testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached testnsd2: > drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed testnsd2: -rw------- 1 > root root 4096 Sep 18 11:50 ccr.paxos.1 testnsd2: -rw------- 1 root root > 4096 Sep 18 11:50 ccr.paxos.2 testnsd2: -rw-r--r-- 1 root root 0 Jun 29 > 2016 ccr.disks testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes > testnsd2: total 16 testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached > testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed testnsd3: > -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd3: -rw-r--r-- 1 root > root 4 Sep 19 15:41 ccr.noauth testnsd3: -rw-r--r-- 1 root root 99 Jun 29 > 2016 ccr.nodes testnsd3: total 8 testsched: drwxr-xr-x. 
2 root root 4096 > Jun 29 2016 committed testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 > cached testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes > testsched: total 12 /var/mmfs/gen root at testnsd2# more ../ccr/ccr.nodes > 3,0,10.0.6.215,,testnsd3.vampire > 1,0,10.0.6.213,,testnsd1.vampire > 2,0,10.0.6.214,,testnsd2.vampire > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" > testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs > testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs > testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs > testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 > 17:43 /var/mmfs/gen/mmsdrfs testgateway: -rw-r--r--. 1 root root 20360 Aug > 25 17:43 /var/mmfs/gen/mmsdrfs testsched: -rw-r--r--. 1 root root 20360 Aug > 25 17:43 /var/mmfs/gen/mmsdrfs /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" > testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames > "md5sum /var/mmfs/ssl/stage/genkeyData1" testnsd3: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd1: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd2: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testdellnode1: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testgateway: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testsched: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 /var/mmfs/gen > root at testnsd2# > > On Sep 20, 2017, at 10:48 AM, Edward Wahl > > wrote: > > I've run into this before. We didn't use to use CCR. And restoring nodes for > us is a major pain in the rear as we only allow one-way root SSH, so we have a > number of useful little scripts to work around problems like this. > > Assuming that you have all the necessary files copied to the correct > places, you can manually kick off CCR. > > I think my script does something like: > > (copy the encryption key info) > > scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ > > scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ > > scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ > > :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor > > you should then see like 2 copies of it running under mmksh. > > Ed > > > On Wed, 20 Sep 2017 13:55:28 +0000 > "Buterbaugh, Kevin L" > > > wrote: > > Hi All, > > testnsd1 and testnsd3 both had hardware issues (power supply and internal HD > respectively). Given that they were 12 year old boxes, we decided to replace > them with other boxes that are a mere 7 years old ? keep in mind that this is > a test cluster. > > Disabling CCR does not work, even with the undocumented ??force? option: > > /var/mmfs/gen > root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force > mmchcluster: Unable to obtain the GPFS configuration file lock. > mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. > mmchcluster: Processing continues without lock protection. 
> The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key > fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key > fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp608.vampire > (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp612.vampire > (10.0.21.12)' can't be established. ECDSA key fingerprint is > SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is > MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's > password: testnsd3.vampire: Host key verification failed. mmdsh: > testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: > Host key verification failed. mmdsh: testnsd1.vampire remote shell process > had return code 255. vmp609.vampire: Host key verification failed. mmdsh: > vmp609.vampire remote shell process had return code 255. vmp608.vampire: > Host key verification failed. mmdsh: vmp608.vampire remote shell process had > return code 255. vmp612.vampire: Host key verification failed. mmdsh: > vmp612.vampire remote shell process had return code 255. > > root at vmp610.vampire's > password: vmp610.vampire: Permission denied, please try again. > > root at vmp610.vampire's > password: vmp610.vampire: Permission denied, please try again. > > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. > > Verifying GPFS is stopped on all nodes ... > The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key > fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key > fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp609.vampire > (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. 
ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire > (10.0.6.213)' can't be established. ECDSA key fingerprint is > SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is > MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's > password: > root at vmp610.vampire's > password: > root at vmp610.vampire's > password: > > testnsd3.vampire: Host key verification failed. > mmdsh: testnsd3.vampire remote shell process had return code 255. > vmp612.vampire: Host key verification failed. > mmdsh: vmp612.vampire remote shell process had return code 255. > vmp608.vampire: Host key verification failed. > mmdsh: vmp608.vampire remote shell process had return code 255. > vmp609.vampire: Host key verification failed. > mmdsh: vmp609.vampire remote shell process had return code 255. > testnsd1.vampire: Host key verification failed. > mmdsh: testnsd1.vampire remote shell process had return code 255. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. mmchcluster: Command failed. > Examine previous error messages to determine cause. /var/mmfs/gen > root at testnsd2# > > I believe that part of the problem may be that there are 4 client nodes that > were removed from the cluster without removing them from the cluster (done by > another SysAdmin who was in a hurry to repurpose those machines). They?re up > and pingable but not reachable by GPFS anymore, which I?m pretty sure is > making things worse. > > Nor does Loic?s suggestion of running mmcommon work (but thanks for the > suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to > start the cluster up failed: > > /var/mmfs/gen > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /var/mmfs/gen > root at testnsd2# > > Thanks. > > Kevin > > On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > > wrote: > > > Hi Kevin, > > Let's me try to understand the problem you have. What's the meaning of node > died here. Are you mean that there are some hardware/OS issue which cannot be > fixed and OS cannot be up anymore? > > I agree with Bob that you can have a try to disable CCR temporally, restore > cluster configuration and enable it again. > > Such as: > > 1. Login to a node which has proper GPFS config, e.g NodeA > 2. Shutdown daemon in all client cluster. > 3. mmchcluster --ccr-disable -p NodeA > 4. mmsdrrestore -a -p NodeA > 5. mmauth genkey propagate -N testnsd1, testnsd3 > 6. 
mmchcluster --ccr-enable > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale > (GPFS), then please post it to the public IBM developerWroks Forum at > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. > > If your query concerns a potential software error in Spectrum Scale (GPFS) > and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in other > countries. > > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > > "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run > across this before, and it?s because of a bug (as I recall) having to do with > CCR and > > From: "Oesterlin, Robert" > > > To: gpfsug main discussion list > > > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for > the count? Sent by: > gpfsug-discuss-bounces at spectrumscale.org > > ________________________________ > > > > OK ? I?ve run across this before, and it?s because of a bug (as I recall) > having to do with CCR and quorum. What I think you can do is set the cluster > to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back > up and then re-enable ccr. > > I?ll see if I can find this in one of the recent 4.2 release nodes. > > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > From: > > > on behalf of "Buterbaugh, Kevin L" > > > Reply-To: gpfsug main discussion list > > > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > > > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? > > Hi All, > > We have a small test cluster that is CCR enabled. It only had/has 3 NSD > servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while > back. I did nothing about it at the time because it was due to be life-cycled > as soon as I finished a couple of higher priority projects. > > Yesterday, testnsd1 also died, which took the whole cluster down. So now > resolving this has become higher priority? ;-) > > I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve > done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also > done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from > testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to > testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? > ssh without a password between those 3 boxes is fine. > > However, when I try to startup GPFS ? or run any GPFS command I get: > > /root > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /root > root at testnsd2# > > I?ve got to run to a meeting right now, so I hope I?m not leaving out any > crucial details here ? 
does anyone have an idea what I need to do? Thanks? > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu > - (615)875-9633 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at > spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at > spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 > > > > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 > > > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > -- Ed Wahl Ohio Supercomputer Center 614-292-9302 From tarak.patel at canada.ca Wed Sep 20 21:23:00 2017 From: tarak.patel at canada.ca (Patel, Tarak (SSC/SPC)) Date: Wed, 20 Sep 2017 20:23:00 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , , Message-ID: Hi, Recently we deployed 3 sets of CES nodes where we are using LDAP for authentication service. We had to create a user in ldap which was used by 'mmuserauth service create' command. Note that SMB needs to be disabled ('mmces service disable smb') if not being used before issuing 'mmuserauth service create'. By default, CES deployment enables SMB (' spectrumscale config protocols'). Tarak -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September, 2017 14:55 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." 
I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. 
mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but not > for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the > NFS client tells you". This of course only works sanely if each NFS > export is only to a set of machines in the same administrative domain > that manages their UID/GIDs. Exporting to two sets of machines that > don't coordinate their UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpi > Bv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiy > liSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ > 0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGV > srSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwC > YeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbj > XI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuv > EeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discus > s > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org 
http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chetkulk at in.ibm.com Thu Sep 21 06:33:53 2017 From: chetkulk at in.ibm.com (Chetan R Kulkarni) Date: Thu, 21 Sep 2017 11:03:53 +0530 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu>, , Message-ID: Hi Jonathon, I can configure file userdefined authentication with only NFS enabled/running on my test setup (SMB was disabled). Please check if following steps help fix your issue: 1> remove existing file auth if any /usr/lpp/mmfs/bin/mmuserauth service remove --data-access-method file 2> disable smb service /usr/lpp/mmfs/bin/mmces service disable smb /usr/lpp/mmfs/bin/mmces service list -a 3> configure userdefined file auth /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined 4> if above fails retry mmuserauth in debug mode as below and please share error log /tmp/userdefined.log. Also share spectrum scale version you are running with. export DEBUG=1; /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined > /tmp/userdefined.log 2>&1; unset DEBUG /usr/lpp/mmfs/bin/mmdiag --version 5> if mmuserauth succeeds in step 3> above; you also need to correct your mmnfs cli command as below. You missed to type in Access_Type= and Squash= in client definition. mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu (Access_Type=rw,Squash=root_squash);dtn*.rc.int.colorado.edu (Access_Type=rw,Squash=root_squash)' Thanks, Chetan. From: Jonathon A Anderson To: gpfsug main discussion list Date: 09/21/2017 12:25 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. 
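(A quick way to see which protocol services and which authentication scheme a CES node is actually running - useful when deciding whether SMB is needed at all - is a check along these lines; mmuserauth service list is assumed here to be available in the release in use:

/usr/lpp/mmfs/bin/mmces service list -a      # per-node enabled/running state of NFS, SMB and OBJ
/usr/lpp/mmfs/bin/mmuserauth service list    # currently configured file and object authentication

If SMB shows up enabled only because the installer toolkit enables it by default, disabling it with mmces service disable smb before running mmuserauth, as suggested in the replies above, is what lets the userdefined method go through.)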
HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu (rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. 
Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__secure-2Dweb.cisco.com_1w-2Dldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-2DfNVoZ49ioTlOwQoRbyC-5FMjpoBPlD3jfpV-5FknuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM-5FjYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-2DVs-5FqLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4-5FMtVXKzQRwQqemODDjSa5my7zl98vobN-5Fui-2DcRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-2DCl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A_http-253A-252F-252Fgpfsug.org-252Fmailman-252Flistinfo-252Fgpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=uic-29lyJ5TCiTRi0FyznYhKJx5I7Vzu80WyYuZ4_iM&m=VqyIekg3Wtz0ukw-QSXsEXOoi5rZ0gnMeIPyFNGpllA&s=DNgplGZ30awqnvnd4Ju39pzv3rlk18Kf6NGe7iDX4Mk&e= > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__secure-2Dweb.cisco.com_1w-2Dldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-2DfNVoZ49ioTlOwQoRbyC-5FMjpoBPlD3jfpV-5FknuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM-5FjYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-2DVs-5FqLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4-5FMtVXKzQRwQqemODDjSa5my7zl98vobN-5Fui-2DcRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-2DCl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A_http-253A-252F-252Fgpfsug.org-252Fmailman-252Flistinfo-252Fgpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=uic-29lyJ5TCiTRi0FyznYhKJx5I7Vzu80WyYuZ4_iM&m=VqyIekg3Wtz0ukw-QSXsEXOoi5rZ0gnMeIPyFNGpllA&s=DNgplGZ30awqnvnd4Ju39pzv3rlk18Kf6NGe7iDX4Mk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org 
https://urldefense.proofpoint.com/v2/url?u=http-3A__secure-2Dweb.cisco.com_1w-2Dldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-2DfNVoZ49ioTlOwQoRbyC-5FMjpoBPlD3jfpV-5FknuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM-5FjYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-2DVs-5FqLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4-5FMtVXKzQRwQqemODDjSa5my7zl98vobN-5Fui-2DcRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-2DCl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A_http-253A-252F-252Fgpfsug.org-252Fmailman-252Flistinfo-252Fgpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=uic-29lyJ5TCiTRi0FyznYhKJx5I7Vzu80WyYuZ4_iM&m=VqyIekg3Wtz0ukw-QSXsEXOoi5rZ0gnMeIPyFNGpllA&s=DNgplGZ30awqnvnd4Ju39pzv3rlk18Kf6NGe7iDX4Mk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__secure-2Dweb.cisco.com_1w-2Dldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-2DfNVoZ49ioTlOwQoRbyC-5FMjpoBPlD3jfpV-5FknuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM-5FjYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-2DVs-5FqLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4-5FMtVXKzQRwQqemODDjSa5my7zl98vobN-5Fui-2DcRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-2DCl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A_http-253A-252F-252Fgpfsug.org-252Fmailman-252Flistinfo-252Fgpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=uic-29lyJ5TCiTRi0FyznYhKJx5I7Vzu80WyYuZ4_iM&m=VqyIekg3Wtz0ukw-QSXsEXOoi5rZ0gnMeIPyFNGpllA&s=DNgplGZ30awqnvnd4Ju39pzv3rlk18Kf6NGe7iDX4Mk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=uic-29lyJ5TCiTRi0FyznYhKJx5I7Vzu80WyYuZ4_iM&m=VqyIekg3Wtz0ukw-QSXsEXOoi5rZ0gnMeIPyFNGpllA&s=AliY037R_W1y8Ym6nPI1XDP2yCq47JwtTPhj9IppwOM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From andreas.mattsson at maxiv.lu.se Thu Sep 21 13:09:29 2017 From: andreas.mattsson at maxiv.lu.se (Andreas Mattsson) Date: Thu, 21 Sep 2017 12:09:29 +0000 Subject: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES In-Reply-To: References: , Message-ID: <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se> Since I solved this old issue a long time ago, I'd thought I'd come back and report the solution in case someone else encounters similar problems in the future. Original problem reported by users: Copying files between folders on NFS exports from a CES server gave random timestamps on the files. Also, apart from the initial reported problem, there where issues where users sometimes couldn't change or delete files that they where owners of. Background: We have a Active Directory with RFC2307 posix attributes populated, and use the built in Winbind-based AD authentication with RFC2307 ID mapping of our Spectrum Scale CES protocol servers. All our Linux clients and servers are also AD integrated, using Nslcd and nss-pam-ldapd. 
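Before getting into the trigger and cause, a quick way to see the underlying difference for yourself is to resolve the same AD group by its GID on both sides and compare the spelling of the name that comes back (the GID 1234 below is just a placeholder, substitute one of your own RFC2307 groups):

# On a CES protocol node (winbind-based file authentication):
getent group 1234    # the group name is returned folded to lower case
# On a client running nslcd / nss-pam-ldapd:
getent group 1234    # the group name keeps the mixed case stored in AD

Same GID on both sides, but two different spellings of the group name, and that mismatch is what the rest of this mail comes down to.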
Trigger: If a user was part of a AD group with a mixed case name, and this group gave access to a folder, and the NFS mount was done using NFSv4, the behavior in my original post occurred when copying or changing files in that folder. Cause: Active Directory handle LDAP-requests case insensitive, but results are returned with case retained. Winbind and SSSD-AD converts groups and usernames to lower case. Nslcd retains case. We run NFS with managed GIDs. Managed GIDs in NFSv3 seems to be handled case insensitive, or to ignore the actual group name after it has resolved the GID-number of the group, while NFSv4 seems to handle group names case sensitive and check the actual group name for certain operations even if the GID-number matches. Don't fully understand the mechanism behind why certain file operations would work but others not, but in essence a user would be part of a group called "UserGroup" with GID-number 1234 in AD and on the client, but would be part of a group called "usergroup" with GID-number 1234 on the CES server. Any operation that's authorized on the GID-number, or a case insensitive lookup of the group name, would work. Any operation authorized by a case sensitive group lookup would fail. Three different workarounds where found to work: 1. Rename groups and users to lower case in AD 2. Change from Nslcd to either SSSD or Winbind on the clients 3. Change from NFSv4 to NFSv3 when mounting NFS Remember to clear ID-mapping caches. Regards, Andreas ___________________________________ [https://mail.google.com/mail/u/0/?ui=2&ik=b0a6f02971&view=att&th=14618fab2daf0e10&attid=0.1.1&disp=emb&zw&atsh=1] Andreas Mattsson System Engineer MAX IV Laboratory Lund University Tel: +46-706-649544 E-mail: andreas.mattsson at maxlab.lu.se ________________________________ Fr?n: gpfsug-discuss-bounces at spectrumscale.org f?r Stephen Ulmer Skickat: den 3 februari 2017 14:35:21 Till: gpfsug main discussion list ?mne: Re: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES Does the cp actually complete? As in, does it copy all of the blocks? What?s the exit code? A cp?d file should have ?new? metadata. That is, it should have it?s own dates, owners, etc. (not necessarily copied from the source file). I ran ?strace cp foo1 foo2?, and it was pretty instructive, maybe that would get you more info. On CentOS strace is in it?s own package, YMMV. -- Stephen On Feb 3, 2017, at 8:19 AM, Andreas Mattsson > wrote: That works. ?touch test100? Feb 3 14:16 test100 ?cp test100 test101? Feb 3 14:16 test100 Apr 21 2027 test101 ?touch ?r test100 test101? Feb 3 14:16 test100 Feb 3 14:16 test101 /Andreas That?s a cool one. :) What if you use the "random date" file as a time reference to touch another file (like, 'touch -r file02 file03?)? -- Stephen On Feb 3, 2017, at 7:46 AM, Andreas Mattsson > wrote: I?m having some really strange timestamp behaviour when doing file operations on NFS mounts shared via CES on spectrum scale 4.2.1.1 The NFS clients are up to date Centos and Debian machines. All Scale servers and NFS clients have correct date and time via NTP. Creating a file, for instance ?touch file00?, gives correct timestamp. Moving the file, ?mv file00 file01?, gives correct timestamp Copying the file, ?cp file01 file02?, gives a random timestamp anywhere in time, for instance Oct 12 2095 or Feb 29 1976 or something similar. This is only via NFS. Copying the file via a native gpfs-mount or via SMB gives a correct timestamp. 
Doing the same operation over NFS to other NFS-servers works correct, it is only when operating on the NFS-share from the Spectrum Scale CES the issue occurs. Have anyone seen this before? Regards, Andreas Mattsson _____________________________________________ Andreas Mattsson Systems Engineer MAX IV Laboratory Lund University P.O. Box 118, SE-221 00 Lund, Sweden Visiting address: Fotongatan 2, 225 94 Lund Mobile: +46 706 64 95 44 andreas.mattsson at maxiv.se www.maxiv.se _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From taylorm at us.ibm.com Thu Sep 21 15:33:00 2017 From: taylorm at us.ibm.com (Michael L Taylor) Date: Thu, 21 Sep 2017 07:33:00 -0700 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: Message-ID: Hi Jonathon, We were able to run this scenario successfully in our lab at the latest released 4.2.3.4. # /usr/lpp/mmfs/bin/mmdiag --version === mmdiag: version === Current GPFS build: "4.2.3.4 ". # /usr/lpp/mmfs/bin/mmces service list -a Enabled services: NFS node1.test.ibm.com: NFS is running # /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined File authentication configuration completed successfully. # rpm -qa | grep gpfs gpfs.ext-4.2.3-4.x86_64 gpfs.docs-4.2.3-4.noarch gpfs.gskit-8.0.50-75.x86_64 gpfs.gpl-4.2.3-4.noarch gpfs.msg.en_US-4.2.3-4.noarch nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 gpfs.base-4.2.3-4.x86_64 # rpm -qa | grep nfs-gan nfs-ganesha-utils-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 09/20/2017 12:07 PM Subject: gpfsug-discuss Digest, Vol 68, Issue 42 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=BpVUgvFT2Qwgw0hveEgQaHFwn2mjeQjeBrkXHX_aC0A&m=2oGcWc1xx6zOclryoU2BdJykABuIR118zXTmSAA8msU&s=7q0JMYVHMSGlUAYquNMlrDRF6BDj6-76Oc4VbXrvlHE&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: export nfs share on gpfs with no authentication (Jonathon A Anderson) ---------------------------------------------------------------------- Message: 1 Date: Wed, 20 Sep 2017 18:55:04 +0000 From: Jonathon A Anderson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Message-ID: Content-Type: text/plain; charset="us-ascii" I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. 
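(For anyone following the thread: pulling the working 4.2.3.4 lab run above together, the NFS-only sequence looks roughly like this. The export path and client pattern are simply the examples already used in this thread, the enable step can be skipped if NFS is already running, and the final list command is only there to confirm the export was created.)

/usr/lpp/mmfs/bin/mmces service enable NFS
/usr/lpp/mmfs/bin/mmces service list -a
/usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined
/usr/lpp/mmfs/bin/mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash)'
/usr/lpp/mmfs/bin/mmnfs export list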
________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu (rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Thu Sep 21 18:09:52 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Thu, 21 Sep 2017 17:09:52 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: <20170920150739.39f0a4a0@osc.edu> References: <20170920114844.6bf9f27b@osc.edu> <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> <20170920150739.39f0a4a0@osc.edu> Message-ID: Hi All, Ralf Eberhard of IBM helped me resolve this off list. The key was to temporarily make testnsd1 and testnsd3 not be quorum nodes by making sure GPFS was down and then executing: mmchnode --nonquorum -N testnsd1,testnsd3 --force That gave me some scary messages about overriding normal GPFS quorum semantics, but nce that was done I was able to run an ?mmstartup -a? and bring up the cluster! Once it was up and I had verified things were working properly I then shut it back down so that I could rerun the mmchnode (without the ?force) to make testnsd1 and testnsd3 quorum nodes again. Thanks to all who helped me out here? Kevin On Sep 20, 2017, at 2:07 PM, Edward Wahl > wrote: So who was the ccrmaster before? What is/was the quorum config? (tiebreaker disks?) 
what does 'mmccr check' say? Have you set DEBUG=1 and tried mmstartup to see if it teases out any more info from the error? Ed On Wed, 20 Sep 2017 16:27:48 +0000 "Buterbaugh, Kevin L" > wrote: Hi Ed, Thanks for the suggestion ? that?s basically what I had done yesterday after Googling and getting a hit or two on the IBM DeveloperWorks site. I?m including some output below which seems to show that I?ve got everything set up but it?s still not working. Am I missing something? We don?t use CCR on our production cluster (and this experience doesn?t make me eager to do so!), so I?m not that familiar with it... Kevin /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v grep" | sort testdellnode1: root 2583 1 0 May30 ? 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testdellnode1: root 6694 2583 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 2023 5828 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 5828 1 0 Sep18 ? 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 19356 4628 0 11:19 tty1 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 4628 1 0 Sep19 tty1 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 22149 2983 0 11:16 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 2983 1 0 Sep18 ? 00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 15685 6557 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 6557 1 0 Sep19 ? 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 29424 6512 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 6512 1 0 Sep18 ? 00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort testdellnode1: drwxr-xr-x 2 root root 4096 Mar 3 2017 cached testdellnode1: drwxr-xr-x 2 root root 4096 Nov 10 2016 committed testdellnode1: -rw-r--r-- 1 root root 99 Nov 10 2016 ccr.nodes testdellnode1: total 12 testgateway: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testgateway: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testgateway: -rw-r--r--. 
1 root root 99 Jun 29 2016 ccr.nodes testgateway: total 12 testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 cached testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 committed testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth testnsd1: -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes testnsd1: total 8 testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached testnsd2: drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.1 testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.2 testnsd2: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd2: total 16 testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed testnsd3: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd3: -rw-r--r-- 1 root root 4 Sep 19 15:41 ccr.noauth testnsd3: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd3: total 8 testsched: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes testsched: total 12 /var/mmfs/gen root at testnsd2# more ../ccr/ccr.nodes 3,0,10.0.6.215,,testnsd3.vampire 1,0,10.0.6.213,,testnsd1.vampire 2,0,10.0.6.214,,testnsd2.vampire /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testgateway: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testsched: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/ssl/stage/genkeyData1" testnsd3: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd2: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testdellnode1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testgateway: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testsched: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 /var/mmfs/gen root at testnsd2# On Sep 20, 2017, at 10:48 AM, Edward Wahl > wrote: I've run into this before. We didn't use to use CCR. And restoring nodes for us is a major pain in the rear as we only allow one-way root SSH, so we have a number of useful little scripts to work around problems like this. Assuming that you have all the necessary files copied to the correct places, you can manually kick off CCR. 
I think my script does something like: (copy the encryption key info) scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor you should then see like 2 copies of it running under mmksh. Ed On Wed, 20 Sep 2017 13:55:28 +0000 "Buterbaugh, Kevin L" > wrote: Hi All, testnsd1 and testnsd3 both had hardware issues (power supply and internal HD respectively). Given that they were 12 year old boxes, we decided to replace them with other boxes that are a mere 7 years old ? keep in mind that this is a test cluster. Disabling CCR does not work, even with the undocumented ??force? option: /var/mmfs/gen root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force mmchcluster: Unable to obtain the GPFS configuration file lock. mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. mmchcluster: Processing continues without lock protection. The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. Verifying GPFS is stopped on all nodes ... The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. 
ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: root at vmp610.vampire's password: root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. mmchcluster: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# I believe that part of the problem may be that there are 4 client nodes that were removed from the cluster without removing them from the cluster (done by another SysAdmin who was in a hurry to repurpose those machines). They?re up and pingable but not reachable by GPFS anymore, which I?m pretty sure is making things worse. Nor does Loic?s suggestion of running mmcommon work (but thanks for the suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to start the cluster up failed: /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# Thanks. Kevin On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > wrote: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? 
I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and From: "Oesterlin, Robert" > To: gpfsug main discussion list > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. 
Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 -- Ed Wahl Ohio Supercomputer Center 614-292-9302 ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -- Ed Wahl Ohio Supercomputer Center 614-292-9302 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cfabfdb4659d249e2d20308d5005ae1ab%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415312700069585&sdata=Z59ik0w%2BaK6bV2JsDxSNt%2FsqwR1ESuqkXTQVBlRjDgw%3D&reserved=0 ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkr at lbl.gov Thu Sep 21 19:49:29 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Thu, 21 Sep 2017 11:49:29 -0700 Subject: [gpfsug-discuss] User Meeting & SPXXL in NYC In-Reply-To: References: Message-ID: Registration space is getting tight. We decided on a room reconfiguration today to make a little more room. So if you tried to register and were told it was full try again. If it fills up again and you want to register, but can?t drop me an email and I?ll see what we can do. Best, Kristy > On Sep 20, 2017, at 9:00 AM, Kristy Kallback-Rose wrote: > > Thanks Doug. > > If you plan to go, *do register*. GPFS Day is free, but we need to know how many will attend. Register using the link on the HPCXXL event page below. > > Cheers, > Kristy > >> On Sep 20, 2017, at 1:28 AM, Douglas O'flaherty > wrote: >> >> >> Reminder that the SPXXL day on IBM Spectrum Scale in New York is open to all. It is Thursday the 28th. There is also a Power day on Wednesday. 
>> >> >> For more information >> http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ >> >> Doug >> >> Mobile >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Sep 22 23:08:58 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 22 Sep 2017 22:08:58 +0000 Subject: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES In-Reply-To: <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se> References: <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se>, , Message-ID: An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Sep 22 23:10:45 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 22 Sep 2017 22:10:45 +0000 Subject: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES In-Reply-To: References: , <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se>, , Message-ID: An HTML attachment was scrubbed... URL: From bipcuds at gmail.com Sun Sep 24 19:04:59 2017 From: bipcuds at gmail.com (Keith Ball) Date: Sun, 24 Sep 2017 14:04:59 -0400 Subject: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? Message-ID: Hello All, In a recent Spectrum Scale performance study, we used zimon/mmperfmon to gather metrics. During a period of 2 months, we ended up losing data twice from the zimon database; once after the virtual disk serving both the OS files and zimon collector and DB storage was resized, and a second time after an unknown event (the loss was discovered when plotting in Grafana only went back to a certain data and time; likewise, mmperfmon query output only went back to the same time). Details: - Spectrum Scale 4.2.1.1 (on NSD servers); 4.2.1.2 on the zimon collector node and other clients - Data retention in the "raw" stratum was set to 2 months; the "domains" settings were as follows (note that we did not hit the ceiling of 60GB (1GB/file * 60 files): domains = { # this is the raw domain aggregation = 0 # aggregation factor for the raw domain is always 0. ram = "12g" # amount of RAM to be used duration = "2m" # amount of time that data with the highest precision is kept. filesize = "1g" # maximum file size files = 60 # number of files. }, { # this is the first aggregation domain that aggregates to 10 seconds aggregation = 10 ram = "800m" # amount of RAM to be used duration = "6m" # keep aggregates for 1 week. filesize = "1g" # maximum file size files = 10 # number of files. }, { # this is the second aggregation domain that aggregates to 30*10 seconds == 5 minutes aggregation = 30 ram = "800m" # amount of RAM to be used duration = "1y" # keep averages for 2 months. filesize = "1g" # maximum file size files = 5 # number of files. }, { # this is the third aggregation domain that aggregates to 24*30*10 seconds == 2 hours aggregation = 24 ram = "800m" # amount of RAM to be used duration = "2y" # filesize = "1g" # maximum file size files = 5 # number of files. } Questions: 1.) Has anyone had similar issues with losing data from zimon? 2.) Are there known circumstances where data could be lost, e.g. 
changing the aggregation domain definitions, or even simply restarting the zimon collector? 3.) Does anyone have any "best practices" for backing up the zimon database? We were taking weekly "snapshots" by shutting down the collector, and making a tarball copy of the /opt/ibm/zimon directory (but the database corruption/data loss still crept through for various reasons). In terms of debugging, we do not have Scale or zimon logs going back to the suspected dates of data loss; we do have a gpfs.snap from about a month after the last data loss - would it have any useful clues? Opening a PMR could be tricky, as it was the customer who has the support entitlement, and the environment (specifically the old cluster definitino and the zimon collector VM) was torn down. Many Thanks, Keith -- Keith D. Ball, PhD RedLine Performance Solutions, LLC web: http://www.redlineperf.com/ email: kball at redlineperf.com cell: 540-557-7851 <%28540%29%20557-7851> -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkr at lbl.gov Sun Sep 24 20:29:10 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Sun, 24 Sep 2017 12:29:10 -0700 Subject: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? In-Reply-To: References: Message-ID: Hi Keith, We have barely begun with Zimon and have not (knock, knock) run up against any loss or corruption issues with Zimon. However, getting data out of Zimon for various reasons is something I have been thinking about. I'm interested partly because of the granularity that is lost over time like with any round robin style data collection scheme. So I guess one question is whether you have considered pulling the data out to another database, looked at the SS GUI which uses a postgres db (iirc, about to take off on a flight and can't check), or looked at the Grafana bridge which would get data into OpenTsdb format, again iirc. Anyway, just some things for consideration and a request to share back whatever you find out if it's off list. Thanks, getting stink eye to go to airplane mode. More later. Cheers Kristy On Sep 24, 2017 11:05 AM, "Keith Ball" wrote: Hello All, In a recent Spectrum Scale performance study, we used zimon/mmperfmon to gather metrics. During a period of 2 months, we ended up losing data twice from the zimon database; once after the virtual disk serving both the OS files and zimon collector and DB storage was resized, and a second time after an unknown event (the loss was discovered when plotting in Grafana only went back to a certain data and time; likewise, mmperfmon query output only went back to the same time). Details: - Spectrum Scale 4.2.1.1 (on NSD servers); 4.2.1.2 on the zimon collector node and other clients - Data retention in the "raw" stratum was set to 2 months; the "domains" settings were as follows (note that we did not hit the ceiling of 60GB (1GB/file * 60 files): domains = { # this is the raw domain aggregation = 0 # aggregation factor for the raw domain is always 0. ram = "12g" # amount of RAM to be used duration = "2m" # amount of time that data with the highest precision is kept. filesize = "1g" # maximum file size files = 60 # number of files. }, { # this is the first aggregation domain that aggregates to 10 seconds aggregation = 10 ram = "800m" # amount of RAM to be used duration = "6m" # keep aggregates for 1 week. filesize = "1g" # maximum file size files = 10 # number of files. 
}, { # this is the second aggregation domain that aggregates to 30*10 seconds == 5 minutes aggregation = 30 ram = "800m" # amount of RAM to be used duration = "1y" # keep averages for 2 months. filesize = "1g" # maximum file size files = 5 # number of files. }, { # this is the third aggregation domain that aggregates to 24*30*10 seconds == 2 hours aggregation = 24 ram = "800m" # amount of RAM to be used duration = "2y" # filesize = "1g" # maximum file size files = 5 # number of files. } Questions: 1.) Has anyone had similar issues with losing data from zimon? 2.) Are there known circumstances where data could be lost, e.g. changing the aggregation domain definitions, or even simply restarting the zimon collector? 3.) Does anyone have any "best practices" for backing up the zimon database? We were taking weekly "snapshots" by shutting down the collector, and making a tarball copy of the /opt/ibm/zimon directory (but the database corruption/data loss still crept through for various reasons). In terms of debugging, we do not have Scale or zimon logs going back to the suspected dates of data loss; we do have a gpfs.snap from about a month after the last data loss - would it have any useful clues? Opening a PMR could be tricky, as it was the customer who has the support entitlement, and the environment (specifically the old cluster definitino and the zimon collector VM) was torn down. Many Thanks, Keith -- Keith D. Ball, PhD RedLine Performance Solutions, LLC web: http://www.redlineperf.com/ email: kball at redlineperf.com cell: 540-557-7851 <%28540%29%20557-7851> _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkomandu at in.ibm.com Mon Sep 25 06:26:15 2017 From: rkomandu at in.ibm.com (Ravi K Komanduri) Date: Mon, 25 Sep 2017 10:56:15 +0530 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: Message-ID: Jonathon, This requires SMB service when you are at 422 PTF2. As Mike pointed out if you upgrade to the 4.2.3-3/4 build you will no longer hit that issue With Regards, Ravi K Komanduri Email:rkomandu at in.ibm.com From: "Michael L Taylor" To: gpfsug-discuss at spectrumscale.org Date: 09/21/2017 08:03 PM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Jonathon, We were able to run this scenario successfully in our lab at the latest released 4.2.3.4. # /usr/lpp/mmfs/bin/mmdiag --version === mmdiag: version === Current GPFS build: "4.2.3.4 ". # /usr/lpp/mmfs/bin/mmces service list -a Enabled services: NFS node1.test.ibm.com: NFS is running # /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined File authentication configuration completed successfully. 
# rpm -qa | grep gpfs gpfs.ext-4.2.3-4.x86_64 gpfs.docs-4.2.3-4.noarch gpfs.gskit-8.0.50-75.x86_64 gpfs.gpl-4.2.3-4.noarch gpfs.msg.en_US-4.2.3-4.noarch nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 gpfs.base-4.2.3-4.x86_64 # rpm -qa | grep nfs-gan nfs-ganesha-utils-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 09/20/2017 12:07 PM Subject: gpfsug-discuss Digest, Vol 68, Issue 42 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=BpVUgvFT2Qwgw0hveEgQaHFwn2mjeQjeBrkXHX_aC0A&m=2oGcWc1xx6zOclryoU2BdJykABuIR118zXTmSAA8msU&s=7q0JMYVHMSGlUAYquNMlrDRF6BDj6-76Oc4VbXrvlHE&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: export nfs share on gpfs with no authentication (Jonathon A Anderson) ---------------------------------------------------------------------- Message: 1 Date: Wed, 20 Sep 2017 18:55:04 +0000 From: Jonathon A Anderson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Message-ID: Content-Type: text/plain; charset="us-ascii" I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. 
Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=ilYETqcaNr1y1ulWWDPjVg_X9pt35O1eYBTyFwJP56Y&m=VW8gJLSqT4rru6lFZXxCFp-Y3ngi6IUydv5czoG8kTE&s=deIQZQr-qfqLqW377yNysTJI8y7QJOdbokVjlnDr2d8&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Mon Sep 25 08:40:34 2017 From: john.hearns at asml.com (John Hearns) Date: Mon, 25 Sep 2017 07:40:34 +0000 Subject: [gpfsug-discuss] SPectrum Scale on AWS Message-ID: I guess this is not news on this list, however I did see a reference to SpectrumScale on The Register this morning, which linked to this paper: https://s3.amazonaws.com/quickstart-reference/ibm/spectrum/scale/latest/doc/ibm-spectrum-scale-on-the-aws-cloud.pdf The article is here https://www.theregister.co.uk/2017/09/25/storage_super_club_sandwich/ 12 Terabyte Helium drives now available. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikeowen at thinkboxsoftware.com Mon Sep 25 10:26:21 2017 From: mikeowen at thinkboxsoftware.com (Mike Owen) Date: Mon, 25 Sep 2017 10:26:21 +0100 Subject: [gpfsug-discuss] SPectrum Scale on AWS In-Reply-To: References: Message-ID: Full PR release below: https://aws.amazon.com/about-aws/whats-new/2017/09/deploy-ibm-spectrum-scale-on-the-aws-cloud-with-new-quick-start/ Posted On: Sep 13, 2017 This new Quick Start automatically deploys a highly available IBM Spectrum Scale cluster with replication on the Amazon Web Services (AWS) Cloud, into a configuration of your choice. (A small cluster can be deployed in about 25 minutes.) IBM Spectrum Scale is a flexible, software-defined storage solution that can be deployed as highly available, high-performance file storage. 
It can scale in several dimensions, including performance (bandwidth and IOPS), capacity, and number of nodes that can mount the file system. The product?s high performance and scalability helps address the needs of applications whose performance (or performance-to-capacity ratio) demands cannot be met by traditional scale-up storage systems. The IBM Spectrum Scale software is being made available through a 90-day trial license evaluation program. This Quick Start automates the deployment of IBM Spectrum Scale on AWS for users who require highly available access to a shared name space across multiple instances with good performance, without requiring an in-depth knowledge of IBM Spectrum Scale. The Quick Start deploys IBM Network Shared Disk (NSD) storage server instances and IBM Spectrum Scale compute instances into a virtual private cloud (VPC) in your AWS account. Data and metadata elements are replicated across two Availability Zones for optimal data protection. You can build a new VPC for IBM Spectrum Scale, or deploy the software into your existing VPC. The automated deployment provisions the IBM Spectrum Scale instances in Auto Scaling groups for instance scaling and management. The deployment and configuration tasks are automated by AWS CloudFormation templates that you can customize during launch. You can also use the templates as a starting point for your own implementation, by downloading them from the GitHub repository . The Quick Start includes a guide with step-by-step deployment and configuration instructions. To get started with IBM Spectrum Scale on AWS, use the following resources: - View the architecture and details - View the deployment guide - Browse and launch other AWS Quick Start reference deployments On 25 September 2017 at 08:40, John Hearns wrote: > I guess this is not news on this list, however I did see a reference to > SpectrumScale on The Register this morning, > > which linked to this paper: > > https://s3.amazonaws.com/quickstart-reference/ibm/ > spectrum/scale/latest/doc/ibm-spectrum-scale-on-the-aws-cloud.pdf > > > > The article is here https://www.theregister.co.uk/ > 2017/09/25/storage_super_club_sandwich/ > > 12 Terabyte Helium drives now available. > > > > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. Neither the sender nor the > company/group of companies he or she represents shall be liable for the > proper and complete transmission of the information contained in this > communication, or for any delay in its receipt. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Robert.Oesterlin at nuance.com Mon Sep 25 12:42:15 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 25 Sep 2017 11:42:15 +0000 Subject: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? Message-ID: <018DE6B7-ADE3-4A01-B23C-9DB668FD95DB@nuance.com> Another data point for Keith/Kristy, I?ve been using Zimon for about 18 months now, and I?ll have to admit it?s been less than robust for long-term data. The biggest issue I?ve run into is the stability of the collector process. I have it crash on a fairly regular basis, most due to memory usage. This results in data loss You can configure it in a highly-available mode that should mitigate this to some degree. However, I don?t think IBM has published any details on how reliable the data collection process is. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Kristy Kallback-Rose Reply-To: gpfsug main discussion list Date: Sunday, September 24, 2017 at 2:29 PM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? Hi Keith, We have barely begun with Zimon and have not (knock, knock) run up against any loss or corruption issues with Zimon. However, getting data out of Zimon for various reasons is something I have been thinking about. I'm interested partly because of the granularity that is lost over time like with any round robin style data collection scheme. So I guess one question is whether you have considered pulling the data out to another database, looked at the SS GUI which uses a postgres db (iirc, about to take off on a flight and can't check), or looked at the Grafana bridge which would get data into OpenTsdb format, again iirc. Anyway, just some things for consideration and a request to share back whatever you find out if it's off list. Thanks, getting stink eye to go to airplane mode. More later. Cheers Kristy On Sep 24, 2017 11:05 AM, "Keith Ball" > wrote: Hello All, In a recent Spectrum Scale performance study, we used zimon/mmperfmon to gather metrics. During a period of 2 months, we ended up losing data twice from the zimon database; once after the virtual disk serving both the OS files and zimon collector and DB storage was resized, and a second time after an unknown event (the loss was discovered when plotting in Grafana only went back to a certain data and time; likewise, mmperfmon query output only went back to the same time). Details: - Spectrum Scale 4.2.1.1 (on NSD servers); 4.2.1.2 on the zimon collector node and other clients - Data retention in the "raw" stratum was set to 2 months; the "domains" settings were as follows (note that we did not hit the ceiling of 60GB (1GB/file * 60 files): domains = { # this is the raw domain aggregation = 0 # aggregation factor for the raw domain is always 0. ram = "12g" # amount of RAM to be used duration = "2m" # amount of time that data with the highest precision is kept. filesize = "1g" # maximum file size files = 60 # number of files. }, { # this is the first aggregation domain that aggregates to 10 seconds aggregation = 10 ram = "800m" # amount of RAM to be used duration = "6m" # keep aggregates for 1 week. filesize = "1g" # maximum file size files = 10 # number of files. }, { # this is the second aggregation domain that aggregates to 30*10 seconds == 5 minutes aggregation = 30 ram = "800m" # amount of RAM to be used duration = "1y" # keep averages for 2 months. 
filesize = "1g" # maximum file size files = 5 # number of files. }, { # this is the third aggregation domain that aggregates to 24*30*10 seconds == 2 hours aggregation = 24 ram = "800m" # amount of RAM to be used duration = "2y" # filesize = "1g" # maximum file size files = 5 # number of files. } Questions: 1.) Has anyone had similar issues with losing data from zimon? 2.) Are there known circumstances where data could be lost, e.g. changing the aggregation domain definitions, or even simply restarting the zimon collector? 3.) Does anyone have any "best practices" for backing up the zimon database? We were taking weekly "snapshots" by shutting down the collector, and making a tarball copy of the /opt/ibm/zimon directory (but the database corruption/data loss still crept through for various reasons). In terms of debugging, we do not have Scale or zimon logs going back to the suspected dates of data loss; we do have a gpfs.snap from about a month after the last data loss - would it have any useful clues? Opening a PMR could be tricky, as it was the customer who has the support entitlement, and the environment (specifically the old cluster definitino and the zimon collector VM) was torn down. Many Thanks, Keith -- Keith D. Ball, PhD RedLine Performance Solutions, LLC web: http://www.redlineperf.com/ email: kball at redlineperf.com cell: 540-557-7851 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Sep 25 15:35:33 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 25 Sep 2017 14:35:33 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Message-ID: <1506350132.352.17.camel@imperial.ac.uk> Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Mon Sep 25 22:41:11 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Mon, 25 Sep 2017 21:41:11 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: <1506350132.352.17.camel@imperial.ac.uk> References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: An HTML attachment was scrubbed... 
URL: From christof.schmitt at us.ibm.com Mon Sep 25 22:41:11 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Mon, 25 Sep 2017 21:41:11 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: <1506350132.352.17.camel@imperial.ac.uk> References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 09:22:05 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 08:22:05 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 09:22:05 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 08:22:05 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: Hi Christof, thanks I?ll try it on a test cluster. 
Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 10:59:13 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 09:59:13 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: There isn?t a default ACL being applied to the export at all now, which is fine, but it differs from the behaviour in 4.2.3 PTF2. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 26 September 2017 09:22 To: gpfsug main discussion list Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? 
Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 10:59:13 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 09:59:13 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: There isn?t a default ACL being applied to the export at all now, which is fine, but it differs from the behaviour in 4.2.3 PTF2. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 26 September 2017 09:22 To: gpfsug main discussion list Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 
Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Tue Sep 26 21:49:09 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Tue, 26 Sep 2017 20:49:09 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: , <1506350132.352.17.camel@imperial.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Wed Sep 27 09:02:51 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 27 Sep 2017 08:02:51 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: , <1506350132.352.17.camel@imperial.ac.uk> Message-ID: I?m sorry, you?re right. I can only assume my brain was looking for an SID entry so when I saw Everyone:ALLOWED/FULL it didn?t process it at all. 4.2.3-4: [root at cesnode ~]# mmsmb exportacl list [testces] ACL:\Everyone:ALLOWED/FULL From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 26 September 2017 21:49 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? The default for the "export ACL" is always to allow access to "Everyone", so that the the "export ACL" does not limit access by default, but only the file system ACL. I do not have systems with these code levels at hand, could you show the difference you see between PTF2 and PTF4? Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list > Cc: "gpfsug-discuss at gpfsug.org" > Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Tue, Sep 26, 2017 2:59 AM There isn?t a default ACL being applied to the export at all now, which is fine, but it differs from the behaviour in 4.2.3 PTF2. 
Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 26 September 2017 09:22 To: gpfsug main discussion list > Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=B-AqKIRCmLBzoWAhGn7NY-ZASOX25NuP_c_ndE8gy4A&s=S06OD3mbRedYjfwETO8tUnlOjnWT7pOX8nsYX5ebIdA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenneth.waegeman at ugent.be Wed Sep 27 09:16:49 2017 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Wed, 27 Sep 2017 10:16:49 +0200 Subject: [gpfsug-discuss] el7.4 compatibility Message-ID: Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! 
Kenneth From michael.holliday at crick.ac.uk Wed Sep 27 09:25:58 2017 From: michael.holliday at crick.ac.uk (Michael Holliday) Date: Wed, 27 Sep 2017 08:25:58 +0000 Subject: [gpfsug-discuss] File Quotas vs Inode Limits Message-ID: Hi All, I'm in the process of setting up quotas for our users. We currently have block quotas per fileset, and an inode limit for each inode space. Our users have requested more transparency relating to the inode limit, as at the moment they can't see any information about it. Are there any disadvantages to implementing file quotas, and increasing the inode limits so that they will not be reached? Michael Michael Holliday HPC Systems Engineer Tel: 0203 796 3167 The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 1 Midland Road London NW1 1AT -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Sep 27 14:59:08 2017 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 27 Sep 2017 13:59:08 +0000 Subject: [gpfsug-discuss] File Quotas vs Inode Limits In-Reply-To: References: Message-ID: Actually, you will get a benefit in that you can set up a callback so that users get alerted when they go over a soft quota. We also set up a fileset quota so that the callback will automatically notify users when they exceed their block and file quotas for their fileset as well. Hope that helps, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Michael Holliday Sent: Wednesday, September 27, 2017 4:26 AM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] File Quotas vs Inode Limits Note: External Email ________________________________ Hi All, I'm in process of setting up quota for our users. We currently have block quotas per file set, and an inode limit for each inode space. Our users have request more transparency relating to the inode limit as as it is they can't see any information. Are there any disadvantages to implementing file quotas, and increasing the inode limits so that they will not be reached? Michael Michael Holliday HPC Systems Engineer Tel: 0203 796 3167 The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 1 Midland Road London NW1 1AT ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... 
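As a sketch of the kind of callback Bryan describes, something along the following lines could be registered; the softQuotaExceeded event name and the %-variables should be checked against the mmaddcallback documentation for the release in use, and the notification script is entirely hypothetical:

# Hypothetical example -- confirm the event name and the available
# %-variables in the mmaddcallback man page before relying on this.
mmaddcallback quotaMailer \
    --command /usr/local/sbin/quota_mailer.sh \
    --event softQuotaExceeded \
    --parms "%eventName %fsName"

# /usr/local/sbin/quota_mailer.sh could then be as simple as:
#!/bin/bash
EVENT="$1"
FSNAME="$2"
mail -s "GPFS quota warning on $FSNAME" hpc-support@example.com <<EOF
Event $EVENT was raised on file system $FSNAME.
Run 'mmrepquota -u $FSNAME' to see which users are over their soft limits.
EOF

mmrepquota -u reports per-user usage against the soft and hard limits, which gives the recipients something concrete to act on.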
URL: From Greg.Lehmann at csiro.au Thu Sep 28 00:44:53 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 27 Sep 2017 23:44:53 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: Message-ID: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> I guess I may as well ask about SLES 12 SP3 as well! TIA. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, 27 September 2017 6:17 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] el7.4 compatibility Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From bbanister at jumptrading.com Thu Sep 28 14:21:34 2017 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 28 Sep 2017 13:21:34 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: Please review this site: https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html Hope that helps, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au Sent: Wednesday, September 27, 2017 6:45 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] el7.4 compatibility Note: External Email ------------------------------------------------- I guess I may as well ask about SLES 12 SP3 as well! TIA. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, 27 September 2017 6:17 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] el7.4 compatibility Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From JRLang at uwyo.edu Thu Sep 28 15:18:52 2017 From: JRLang at uwyo.edu (Jeffrey R. 
Lang) Date: Thu, 28 Sep 2017 14:18:52 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: I just tired to build the GPFS GPL module against the latest version of RHEL 7.4 kernel and the build fails. The link below show that it should work. cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread kdump-kern.o: In function `GetOffset': kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' kdump-kern.o: In function `KernInit': kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' collect2: error: ld returned 1 exit status make[1]: *** [modules] Error 1 make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' make: *** [Modules] Error 1 -------------------------------------------------------- mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. -------------------------------------------------------- mmbuildgpl: Command failed. Examine previous error messages to determine cause. [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# uname -a Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux [root at bkupsvr3 ~]# mmdiag --version === mmdiag: version === Current GPFS build: "4.2.2.3 ". Built on Mar 16 2017 at 11:19:59 In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my case 514.26.2 If I'm missing something can some one point me in the right direction? -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan Banister Sent: Thursday, September 28, 2017 8:22 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Please review this site: https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html Hope that helps, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au Sent: Wednesday, September 27, 2017 6:45 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] el7.4 compatibility Note: External Email ------------------------------------------------- I guess I may as well ask about SLES 12 SP3 as well! TIA. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, 27 September 2017 6:17 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] el7.4 compatibility Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. 
Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From xhejtman at ics.muni.cz Thu Sep 28 15:22:54 2017 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Thu, 28 Sep 2017 16:22:54 +0200 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: <20170928142254.xwjvp3qwnilazer7@ics.muni.cz> You need 4.2.3.4 GPFS version and it will work. On Thu, Sep 28, 2017 at 02:18:52PM +0000, Jeffrey R. Lang wrote: > I just tired to build the GPFS GPL module against the latest version of RHEL 7.4 kernel and the build fails. The link below show that it should work. > > cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread > kdump-kern.o: In function `GetOffset': > kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' > kdump-kern.o: In function `KernInit': > kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' > collect2: error: ld returned 1 exit status > make[1]: *** [modules] Error 1 > make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' > make: *** [Modules] Error 1 > -------------------------------------------------------- > mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. > -------------------------------------------------------- > mmbuildgpl: Command failed. Examine previous error messages to determine cause. > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# uname -a > Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux > [root at bkupsvr3 ~]# mmdiag --version > > === mmdiag: version === > Current GPFS build: "4.2.2.3 ". > Built on Mar 16 2017 at 11:19:59 > > In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my case 514.26.2 > > If I'm missing something can some one point me in the right direction? > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan Banister > Sent: Thursday, September 28, 2017 8:22 AM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] el7.4 compatibility > > Please review this site: > > https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html > > Hope that helps, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au > Sent: Wednesday, September 27, 2017 6:45 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] el7.4 compatibility > > Note: External Email > ------------------------------------------------- > > I guess I may as well ask about SLES 12 SP3 as well! TIA. 
> > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman > Sent: Wednesday, 27 September 2017 6:17 PM > To: gpfsug-discuss at spectrumscale.org > Subject: [gpfsug-discuss] el7.4 compatibility > > Hi, > > Is there already some information available of gpfs (and protocols) on > el7.4 ? > > Thanks! > > Kenneth > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Luk?? Hejtm?nek From S.J.Thompson at bham.ac.uk Thu Sep 28 15:23:53 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 28 Sep 2017 14:23:53 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: The 7.4 kernels are listed as having been tested by IBM. Having said that, we have clients running 7.4 kernel and its OK, but we are 4.2.3.4efix2, so bump versions... Simon On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Jeffrey R. Lang" wrote: >I just tired to build the GPFS GPL module against the latest version of >RHEL 7.4 kernel and the build fails. The link below show that it should >work. > >cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >kdump-kern.o: In function `GetOffset': >kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >kdump-kern.o: In function `KernInit': >kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >collect2: error: ld returned 1 exit status >make[1]: *** [modules] Error 1 >make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >make: *** [Modules] Error 1 >-------------------------------------------------------- >mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >-------------------------------------------------------- >mmbuildgpl: Command failed. Examine previous error messages to determine >cause. 
>[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# uname -a >Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 >03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >[root at bkupsvr3 ~]# mmdiag --version > >=== mmdiag: version === >Current GPFS build: "4.2.2.3 ". >Built on Mar 16 2017 at 11:19:59 > >In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my >case 514.26.2 > >If I'm missing something can some one point me in the right direction? > > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >Banister >Sent: Thursday, September 28, 2017 8:22 AM >To: gpfsug main discussion list >Subject: Re: [gpfsug-discuss] el7.4 compatibility > >Please review this site: > >https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html > >Hope that helps, >-Bryan > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >Greg.Lehmann at csiro.au >Sent: Wednesday, September 27, 2017 6:45 PM >To: gpfsug-discuss at spectrumscale.org >Subject: Re: [gpfsug-discuss] el7.4 compatibility > >Note: External Email >------------------------------------------------- > >I guess I may as well ask about SLES 12 SP3 as well! TIA. > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth >Waegeman >Sent: Wednesday, 27 September 2017 6:17 PM >To: gpfsug-discuss at spectrumscale.org >Subject: [gpfsug-discuss] el7.4 compatibility > >Hi, > >Is there already some information available of gpfs (and protocols) on >el7.4 ? > >Thanks! > >Kenneth > >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > >________________________________ > >Note: This email is for the confidential use of the named addressee(s) >only and may contain proprietary, confidential or privileged information. >If you are not the intended recipient, you are hereby notified that any >review, dissemination or copying of this email is strictly prohibited, >and to please notify the sender immediately and destroy this email and >any attachments. Email transmission cannot be guaranteed to be secure or >error-free. The Company, therefore, does not make any guarantees as to >the completeness or accuracy of this email or any attachments. This email >is for informational purposes only and does not constitute a >recommendation, offer, request or solicitation of any kind to buy, sell, >subscribe, redeem or perform any type of transaction of a financial >product. 
>_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From kenneth.waegeman at ugent.be Thu Sep 28 15:36:04 2017 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Thu, 28 Sep 2017 16:36:04 +0200 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: > The 7.4 kernels are listed as having been tested by IBM. Hi, Were did you find this? > > Having said that, we have clients running 7.4 kernel and its OK, but we > are 4.2.3.4efix2, so bump versions... Do you have some information about the efix2? Is this for 7.4 ? And where should we find this :-) Thank you! Kenneth > > Simon > > On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on behalf > of Jeffrey R. Lang" JRLang at uwyo.edu> wrote: > >> I just tired to build the GPFS GPL module against the latest version of >> RHEL 7.4 kernel and the build fails. The link below show that it should >> work. >> >> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >> kdump-kern.o: In function `GetOffset': >> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >> kdump-kern.o: In function `KernInit': >> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >> collect2: error: ld returned 1 exit status >> make[1]: *** [modules] Error 1 >> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >> make: *** [Modules] Error 1 >> -------------------------------------------------------- >> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >> -------------------------------------------------------- >> mmbuildgpl: Command failed. Examine previous error messages to determine >> cause. >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# uname -a >> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 >> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >> [root at bkupsvr3 ~]# mmdiag --version >> >> === mmdiag: version === >> Current GPFS build: "4.2.2.3 ". >> Built on Mar 16 2017 at 11:19:59 >> >> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my >> case 514.26.2 >> >> If I'm missing something can some one point me in the right direction? 
>> >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >> Banister >> Sent: Thursday, September 28, 2017 8:22 AM >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] el7.4 compatibility >> >> Please review this site: >> >> https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html >> >> Hope that helps, >> -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >> Greg.Lehmann at csiro.au >> Sent: Wednesday, September 27, 2017 6:45 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] el7.4 compatibility >> >> Note: External Email >> ------------------------------------------------- >> >> I guess I may as well ask about SLES 12 SP3 as well! TIA. >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth >> Waegeman >> Sent: Wednesday, 27 September 2017 6:17 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: [gpfsug-discuss] el7.4 compatibility >> >> Hi, >> >> Is there already some information available of gpfs (and protocols) on >> el7.4 ? >> >> Thanks! >> >> Kenneth >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) >> only and may contain proprietary, confidential or privileged information. >> If you are not the intended recipient, you are hereby notified that any >> review, dissemination or copying of this email is strictly prohibited, >> and to please notify the sender immediately and destroy this email and >> any attachments. Email transmission cannot be guaranteed to be secure or >> error-free. The Company, therefore, does not make any guarantees as to >> the completeness or accuracy of this email or any attachments. This email >> is for informational purposes only and does not constitute a >> recommendation, offer, request or solicitation of any kind to buy, sell, >> subscribe, redeem or perform any type of transaction of a financial >> product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Thu Sep 28 15:45:25 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 28 Sep 2017 14:45:25 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Aren't listed as tested Sorry ... 
4.2.3.4 we have used with 7.4 as well, efix2 includes a fix for an AFM issue we have. Simon On 28/09/2017, 15:36, "kenneth.waegeman at ugent.be" wrote: > > >On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: >> The 7.4 kernels are listed as having been tested by IBM. >Hi, > >Were did you find this? >> >> Having said that, we have clients running 7.4 kernel and its OK, but we >> are 4.2.3.4efix2, so bump versions... >Do you have some information about the efix2? Is this for 7.4 ? And >where should we find this :-) > >Thank you! > >Kenneth > >> >> Simon >> >> On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on >>behalf >> of Jeffrey R. Lang" >of >> JRLang at uwyo.edu> wrote: >> >>> I just tired to build the GPFS GPL module against the latest version of >>> RHEL 7.4 kernel and the build fails. The link below show that it >>>should >>> work. >>> >>> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >>> kdump-kern.o: In function `GetOffset': >>> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >>> kdump-kern.o: In function `KernInit': >>> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >>> collect2: error: ld returned 1 exit status >>> make[1]: *** [modules] Error 1 >>> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >>> make: *** [Modules] Error 1 >>> -------------------------------------------------------- >>> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >>> -------------------------------------------------------- >>> mmbuildgpl: Command failed. Examine previous error messages to >>>determine >>> cause. >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# uname -a >>> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 >>> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >>> [root at bkupsvr3 ~]# mmdiag --version >>> >>> === mmdiag: version === >>> Current GPFS build: "4.2.2.3 ". >>> Built on Mar 16 2017 at 11:19:59 >>> >>> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my >>> case 514.26.2 >>> >>> If I'm missing something can some one point me in the right direction? >>> >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >>> Banister >>> Sent: Thursday, September 28, 2017 8:22 AM >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Please review this site: >>> >>> >>>https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.ht >>>ml >>> >>> Hope that helps, >>> -Bryan >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Greg.Lehmann at csiro.au >>> Sent: Wednesday, September 27, 2017 6:45 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Note: External Email >>> ------------------------------------------------- >>> >>> I guess I may as well ask about SLES 12 SP3 as well! TIA. 
>>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth >>> Waegeman >>> Sent: Wednesday, 27 September 2017 6:17 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: [gpfsug-discuss] el7.4 compatibility >>> >>> Hi, >>> >>> Is there already some information available of gpfs (and protocols) on >>> el7.4 ? >>> >>> Thanks! >>> >>> Kenneth >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> ________________________________ >>> >>> Note: This email is for the confidential use of the named addressee(s) >>> only and may contain proprietary, confidential or privileged >>>information. >>> If you are not the intended recipient, you are hereby notified that any >>> review, dissemination or copying of this email is strictly prohibited, >>> and to please notify the sender immediately and destroy this email and >>> any attachments. Email transmission cannot be guaranteed to be secure >>>or >>> error-free. The Company, therefore, does not make any guarantees as to >>> the completeness or accuracy of this email or any attachments. This >>>email >>> is for informational purposes only and does not constitute a >>> recommendation, offer, request or solicitation of any kind to buy, >>>sell, >>> subscribe, redeem or perform any type of transaction of a financial >>> product. >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From aaron.s.knister at nasa.gov Fri Sep 29 02:59:39 2017 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Fri, 29 Sep 2017 01:59:39 +0000 Subject: [gpfsug-discuss] Latest recommended 4.2 efix? Message-ID: Hi Everyone, What?s the latest recommended efix release for 4.2.3.4? I?m working on testing a 4.1 to 4.2 migration and was reminded today of some fun bugs in 4.2.3.4 for which I think there are efixes. Alternatively, any word on a 4.2.3.5 release date? -Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Sep 29 10:02:26 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 29 Sep 2017 09:02:26 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Simon, I would appreciate a heads up on that AFM issue. I upgraded to 4.2.3.4 this morning, to deal with an AFM issue, which is if a remote NFS mount goes down then an asynchronous operation such as a read can be stopped. I must admit to being not clued up on how the efixes are distributed. I downloaded the 4.2.3.4 installer for Linux yesterday. 
Should I be searching for additional fix packs on top of that (which I am in fact doing now). John H -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Thursday, September 28, 2017 4:45 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Aren't listed as tested Sorry ... 4.2.3.4 we have used with 7.4 as well, efix2 includes a fix for an AFM issue we have. Simon On 28/09/2017, 15:36, "kenneth.waegeman at ugent.be" wrote: > > >On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: >> The 7.4 kernels are listed as having been tested by IBM. >Hi, > >Were did you find this? >> >> Having said that, we have clients running 7.4 kernel and its OK, but >> we are 4.2.3.4efix2, so bump versions... >Do you have some information about the efix2? Is this for 7.4 ? And >where should we find this :-) > >Thank you! > >Kenneth > >> >> Simon >> >> On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on >>behalf of Jeffrey R. Lang" >on behalf of JRLang at uwyo.edu> wrote: >> >>> I just tired to build the GPFS GPL module against the latest version >>>of RHEL 7.4 kernel and the build fails. The link below show that it >>>should work. >>> >>> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >>> kdump-kern.o: In function `GetOffset': >>> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >>> kdump-kern.o: In function `KernInit': >>> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >>> collect2: error: ld returned 1 exit status >>> make[1]: *** [modules] Error 1 >>> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >>> make: *** [Modules] Error 1 >>> -------------------------------------------------------- >>> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >>> -------------------------------------------------------- >>> mmbuildgpl: Command failed. Examine previous error messages to >>>determine cause. >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# uname -a >>> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat >>>Sep 9 >>> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >>> [root at bkupsvr3 ~]# mmdiag --version >>> >>> === mmdiag: version === >>> Current GPFS build: "4.2.2.3 ". >>> Built on Mar 16 2017 at 11:19:59 >>> >>> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In >>> my case 514.26.2 >>> >>> If I'm missing something can some one point me in the right direction? 
>>> >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >>> Banister >>> Sent: Thursday, September 28, 2017 8:22 AM >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Please review this site: >>> >>> >>>https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fww >>>w.ibm.com%2Fsupport%2Fknowledgecenter%2Fen%2FSTXKQY%2Fgpfsclustersfaq >>>.ht&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d50 >>>67f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=nK6KEzCD62kU3njL >>>kIFKL69V3jyN836K5pHMX19tWk8%3D&reserved=0 >>>ml >>> >>> Hope that helps, >>> -Bryan >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Greg.Lehmann at csiro.au >>> Sent: Wednesday, September 27, 2017 6:45 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Note: External Email >>> ------------------------------------------------- >>> >>> I guess I may as well ask about SLES 12 SP3 as well! TIA. >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Kenneth Waegeman >>> Sent: Wednesday, 27 September 2017 6:17 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: [gpfsug-discuss] el7.4 compatibility >>> >>> Hi, >>> >>> Is there already some information available of gpfs (and protocols) >>> on >>> el7.4 ? >>> >>> Thanks! >>> >>> Kenneth >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 >>> >>> >>> ________________________________ >>> >>> Note: This email is for the confidential use of the named >>>addressee(s) only and may contain proprietary, confidential or >>>privileged information. >>> If you are not the intended recipient, you are hereby notified that >>>any review, dissemination or copying of this email is strictly >>>prohibited, and to please notify the sender immediately and destroy >>>this email and any attachments. Email transmission cannot be >>>guaranteed to be secure or error-free. The Company, therefore, does >>>not make any guarantees as to the completeness or accuracy of this >>>email or any attachments. This email is for informational purposes >>>only and does not constitute a recommendation, offer, request or >>>solicitation of any kind to buy, sell, subscribe, redeem or perform >>>any type of transaction of a financial product. 
>>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> >>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>pw%3D&reserved=0 _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> >>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>pw%3D&reserved=0 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >> sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >> rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >> 39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >> pw%3D&reserved=0 > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6pw%3D&reserved=0 -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. From r.sobey at imperial.ac.uk Fri Sep 29 10:04:49 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 29 Sep 2017 09:04:49 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Efixes (in my one time only limited experience!) come direct from IBM as a result of a PMR. Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 29 September 2017 10:02 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Simon, I would appreciate a heads up on that AFM issue. I upgraded to 4.2.3.4 this morning, to deal with an AFM issue, which is if a remote NFS mount goes down then an asynchronous operation such as a read can be stopped. 
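For reference, a rough sketch of the sequence being discussed in this thread, assuming root access on the affected node and the standard /usr/lpp/mmfs/bin command path; this is not an official upgrade procedure, just the checks and the portability-layer rebuild step the posts above refer to:

# confirm which kernel is booted and which GPFS level is installed
uname -r
/usr/lpp/mmfs/bin/mmdiag --version

# after updating the GPFS packages to a level that supports the kernel (e.g. 4.2.3.4), rebuild the portability layer
/usr/lpp/mmfs/bin/mmbuildgpl

# only once all nodes run the new level, commit the cluster to it
/usr/lpp/mmfs/bin/mmchconfig release=LATEST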
I must admit to being not clued up on how the efixes are distributed. I downloaded the 4.2.3.4 installer for Linux yesterday. Should I be searching for additional fix packs on top of that (which I am in fact doing now). John H -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Thursday, September 28, 2017 4:45 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Aren't listed as tested Sorry ... 4.2.3.4 we have used with 7.4 as well, efix2 includes a fix for an AFM issue we have. Simon On 28/09/2017, 15:36, "kenneth.waegeman at ugent.be" wrote: > > >On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: >> The 7.4 kernels are listed as having been tested by IBM. >Hi, > >Were did you find this? >> >> Having said that, we have clients running 7.4 kernel and its OK, but >> we are 4.2.3.4efix2, so bump versions... >Do you have some information about the efix2? Is this for 7.4 ? And >where should we find this :-) > >Thank you! > >Kenneth > >> >> Simon >> >> On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on >>behalf of Jeffrey R. Lang" >on behalf of JRLang at uwyo.edu> wrote: >> >>> I just tired to build the GPFS GPL module against the latest version >>>of RHEL 7.4 kernel and the build fails. The link below show that it >>>should work. >>> >>> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >>> kdump-kern.o: In function `GetOffset': >>> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >>> kdump-kern.o: In function `KernInit': >>> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >>> collect2: error: ld returned 1 exit status >>> make[1]: *** [modules] Error 1 >>> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >>> make: *** [Modules] Error 1 >>> -------------------------------------------------------- >>> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >>> -------------------------------------------------------- >>> mmbuildgpl: Command failed. Examine previous error messages to >>>determine cause. >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# uname -a >>> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat >>>Sep 9 >>> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >>> [root at bkupsvr3 ~]# mmdiag --version >>> >>> === mmdiag: version === >>> Current GPFS build: "4.2.2.3 ". >>> Built on Mar 16 2017 at 11:19:59 >>> >>> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In >>> my case 514.26.2 >>> >>> If I'm missing something can some one point me in the right direction? 
>>> >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >>> Banister >>> Sent: Thursday, September 28, 2017 8:22 AM >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Please review this site: >>> >>> >>>https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fww >>>w.ibm.com%2Fsupport%2Fknowledgecenter%2Fen%2FSTXKQY%2Fgpfsclustersfaq >>>.ht&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d50 >>>67f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=nK6KEzCD62kU3njL >>>kIFKL69V3jyN836K5pHMX19tWk8%3D&reserved=0 >>>ml >>> >>> Hope that helps, >>> -Bryan >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Greg.Lehmann at csiro.au >>> Sent: Wednesday, September 27, 2017 6:45 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Note: External Email >>> ------------------------------------------------- >>> >>> I guess I may as well ask about SLES 12 SP3 as well! TIA. >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Kenneth Waegeman >>> Sent: Wednesday, 27 September 2017 6:17 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: [gpfsug-discuss] el7.4 compatibility >>> >>> Hi, >>> >>> Is there already some information available of gpfs (and protocols) >>> on >>> el7.4 ? >>> >>> Thanks! >>> >>> Kenneth >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 >>> >>> >>> ________________________________ >>> >>> Note: This email is for the confidential use of the named >>>addressee(s) only and may contain proprietary, confidential or >>>privileged information. >>> If you are not the intended recipient, you are hereby notified that >>>any review, dissemination or copying of this email is strictly >>>prohibited, and to please notify the sender immediately and destroy >>>this email and any attachments. Email transmission cannot be >>>guaranteed to be secure or error-free. The Company, therefore, does >>>not make any guarantees as to the completeness or accuracy of this >>>email or any attachments. This email is for informational purposes >>>only and does not constitute a recommendation, offer, request or >>>solicitation of any kind to buy, sell, subscribe, redeem or perform >>>any type of transaction of a financial product. 
>>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> >>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>pw%3D&reserved=0 _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> >>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>pw%3D&reserved=0 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >> sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >> rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >> 39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >> pw%3D&reserved=0 > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6pw%3D&reserved=0 -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Fri Sep 29 10:39:43 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Fri, 29 Sep 2017 09:39:43 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Correct they some from IBM support. The AFM issue we have (and is fixed in the efix) is if you have client code running on the AFM cache that uses truncate. The AFM write coalescing processing does something funny with it, so the file isn't truncated and then the data you write afterwards isn't copied back to home. We found this with ABAQUS code running on our HPC nodes onto the AFM cache, I.e. 
At home, the final packed output file from ABAQUS is corrupt as its the "untruncated and then filled" version of the file (so just a big blob of empty data). I would guess that anything using truncate would see the same issue. 4.2.3.x: APAR IV99796 See IBM Flash Alert at: http://www-01.ibm.com/support/docview.wss?uid=ssg1S1010629&myns=s033&mynp=O CSTXKQY&mynp=OCSWJ00&mync=E&cm_sp=s033-_-OCSTXKQY-OCSWJ00-_-E Its remedied in efix2, of course remember that an efix has not gone through a full testing validation cycle (otherwise it would be a PTF), but we have not seen any issues in our environments running 4.2.3.4efix2. Simon On 29/09/2017, 10:04, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A" wrote: >Efixes (in my one time only limited experience!) come direct from IBM as >a result of a PMR. >Richard > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns >Sent: 29 September 2017 10:02 >To: gpfsug main discussion list >Subject: Re: [gpfsug-discuss] el7.4 compatibility > >Simon, >I would appreciate a heads up on that AFM issue. >I upgraded to 4.2.3.4 this morning, to deal with an AFM issue, which is >if a remote NFS mount goes down then an asynchronous operation such as a >read can be stopped. > >I must admit to being not clued up on how the efixes are distributed. I >downloaded the 4.2.3.4 installer for Linux yesterday. >Should I be searching for additional fix packs on top of that (which I am >in fact doing now). > >John H > > > > > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon >Thompson (IT Research Support) >Sent: Thursday, September 28, 2017 4:45 PM >To: gpfsug main discussion list >Subject: Re: [gpfsug-discuss] el7.4 compatibility > > >Aren't listed as tested > >Sorry ... >4.2.3.4 we have used with 7.4 as well, efix2 includes a fix for an AFM >issue we have. > >Simon > >On 28/09/2017, 15:36, "kenneth.waegeman at ugent.be" > wrote: > >> >> >>On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: >>> The 7.4 kernels are listed as having been tested by IBM. >>Hi, >> >>Were did you find this? >>> >>> Having said that, we have clients running 7.4 kernel and its OK, but >>> we are 4.2.3.4efix2, so bump versions... >>Do you have some information about the efix2? Is this for 7.4 ? And >>where should we find this :-) >> >>Thank you! >> >>Kenneth >> >>> >>> Simon >>> >>> On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on >>>behalf of Jeffrey R. Lang" >>on behalf of JRLang at uwyo.edu> wrote: >>> >>>> I just tired to build the GPFS GPL module against the latest version >>>>of RHEL 7.4 kernel and the build fails. The link below show that it >>>>should work. >>>> >>>> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >>>> kdump-kern.o: In function `GetOffset': >>>> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >>>> kdump-kern.o: In function `KernInit': >>>> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >>>> collect2: error: ld returned 1 exit status >>>> make[1]: *** [modules] Error 1 >>>> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >>>> make: *** [Modules] Error 1 >>>> -------------------------------------------------------- >>>> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT >>>>2017. 
>>>> -------------------------------------------------------- >>>> mmbuildgpl: Command failed. Examine previous error messages to >>>>determine cause. >>>> [root at bkupsvr3 ~]# >>>> [root at bkupsvr3 ~]# >>>> [root at bkupsvr3 ~]# >>>> [root at bkupsvr3 ~]# >>>> [root at bkupsvr3 ~]# uname -a >>>> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat >>>>Sep 9 >>>> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >>>> [root at bkupsvr3 ~]# mmdiag --version >>>> >>>> === mmdiag: version === >>>> Current GPFS build: "4.2.2.3 ". >>>> Built on Mar 16 2017 at 11:19:59 >>>> >>>> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In >>>> my case 514.26.2 >>>> >>>> If I'm missing something can some one point me in the right direction? >>>> >>>> >>>> -----Original Message----- >>>> From: gpfsug-discuss-bounces at spectrumscale.org >>>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >>>> Banister >>>> Sent: Thursday, September 28, 2017 8:22 AM >>>> To: gpfsug main discussion list >>>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>>> >>>> Please review this site: >>>> >>>> >>>>https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fww >>>>w.ibm.com%2Fsupport%2Fknowledgecenter%2Fen%2FSTXKQY%2Fgpfsclustersfaq >>>>.ht&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d50 >>>>67f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=nK6KEzCD62kU3njL >>>>kIFKL69V3jyN836K5pHMX19tWk8%3D&reserved=0 >>>>ml >>>> >>>> Hope that helps, >>>> -Bryan >>>> >>>> -----Original Message----- >>>> From: gpfsug-discuss-bounces at spectrumscale.org >>>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>>> Greg.Lehmann at csiro.au >>>> Sent: Wednesday, September 27, 2017 6:45 PM >>>> To: gpfsug-discuss at spectrumscale.org >>>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>>> >>>> Note: External Email >>>> ------------------------------------------------- >>>> >>>> I guess I may as well ask about SLES 12 SP3 as well! TIA. >>>> >>>> -----Original Message----- >>>> From: gpfsug-discuss-bounces at spectrumscale.org >>>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>>> Kenneth Waegeman >>>> Sent: Wednesday, 27 September 2017 6:17 PM >>>> To: gpfsug-discuss at spectrumscale.org >>>> Subject: [gpfsug-discuss] el7.4 compatibility >>>> >>>> Hi, >>>> >>>> Is there already some information available of gpfs (and protocols) >>>> on >>>> el7.4 ? >>>> >>>> Thanks! 
>>>> >>>> Kenneth >>>> >>>> _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>>> tqc6pw%3D&reserved=0 _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>>> tqc6pw%3D&reserved=0 >>>> >>>> >>>> ________________________________ >>>> >>>> Note: This email is for the confidential use of the named >>>>addressee(s) only and may contain proprietary, confidential or >>>>privileged information. >>>> If you are not the intended recipient, you are hereby notified that >>>>any review, dissemination or copying of this email is strictly >>>>prohibited, and to please notify the sender immediately and destroy >>>>this email and any attachments. Email transmission cannot be >>>>guaranteed to be secure or error-free. The Company, therefore, does >>>>not make any guarantees as to the completeness or accuracy of this >>>>email or any attachments. This email is for informational purposes >>>>only and does not constitute a recommendation, offer, request or >>>>solicitation of any kind to buy, sell, subscribe, redeem or perform >>>>any type of transaction of a financial product. 
>>>> _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> >>>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>>pw%3D&reserved=0 _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> >>>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>>pw%3D&reserved=0 >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>> sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>> rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>> 39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>> pw%3D&reserved=0 >> > >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.o >rg%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml >.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a39d93e96cad61fc >%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6pw%3D&reserved=0 >-- The information contained in this communication and any attachments is >confidential and may be privileged, and is for the sole use of the >intended recipient(s). Any unauthorized review, use, disclosure or >distribution is prohibited. Unless explicitly stated otherwise in the >body of this communication or the attachment thereto (if any), the >information is provided on an AS-IS basis without any express or implied >warranties or liabilities. To the extent you are relying on this >information, you are doing so at your own risk. If you are not the >intended recipient, please notify the sender immediately by replying to >this message and destroy all copies of this message and any attachments. >Neither the sender nor the company/group of companies he or she >represents shall be liable for the proper and complete transmission of >the information contained in this communication, or for any delay in its >receipt. >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From scale at us.ibm.com Fri Sep 29 13:26:51 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Fri, 29 Sep 2017 07:26:51 -0500 Subject: [gpfsug-discuss] Latest recommended 4.2 efix? In-Reply-To: References: Message-ID: There isn't a "recommended" efix as such. Generally, fixes go into the next ptf so that they go through a test cycle. 
If a customer hits a serious issue that cannot wait for the next ptf, they can request an efix be built, but since efixes do not get the same level of rigorous testing as a ptf, they are not generally recommended unless you report an issue and service determines you need it. To address your other questions: We are currently up to efix3 on 4.2.3.4 We don't announce PTF dates, because they depend upon the testing; however, you can see that we generally release a PTF roughly every 6 weeks and I believe ptf4 was out on 8/24 Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" To: "discussion, gpfsug main" Date: 09/28/2017 08:59 PM Subject: [gpfsug-discuss] Latest recommended 4.2 efix? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Everyone, What?s the latest recommended efix release for 4.2.3.4? I?m working on testing a 4.1 to 4.2 migration and was reminded today of some fun bugs in 4.2.3.4 for which I think there are efixes. Alternatively, any word on a 4.2.3.5 release date? -Aaron _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=IVcYH9EDg-UaA4Jt2GbsxN5XN1XbvejXTX0gAzNxtpM&s=9SmogyyA6QNSWxlZrpE-vBbslts0UexwJwPzp78LgKs&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From sandeep.patil at in.ibm.com Sat Sep 30 05:02:22 2017 From: sandeep.patil at in.ibm.com (Sandeep Ramesh) Date: Sat, 30 Sep 2017 09:32:22 +0530 Subject: [gpfsug-discuss] Spectrum Scale Enablement Material - 1H 2017 Message-ID: Hi Folks I was asked by Doris Conti to send the below to our Spectrum Scale User group. Below is a consolidated link that list all the enablement on Spectrum Scale/ESS that was done in 1H 2017 - which have blogs and videos from development and offering management. https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/White%20Papers%20%26%20Media Do note, Spectrum Scale developers keep blogging on the below site which is worth bookmarking: https://developer.ibm.com/storage/blog/ (as recent as 4 new blogs in Sept) Thanks Sandeep Linkedin: https://www.linkedin.com/in/sandeeprpatil Spectrum Scale Dev. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From oehmes at gmail.com Fri Sep 1 23:42:55 2017 From: oehmes at gmail.com (Sven Oehme) Date: Fri, 01 Sep 2017 22:42:55 +0000 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes?
In-Reply-To: <20170901165625.6e4edd4c@osc.edu> References: <20170901165625.6e4edd4c@osc.edu> Message-ID: Hi Ed, yes the defaults for that have changed for customers who had not overridden the default settings. the reason we did this was that many systems in the field including all ESS systems that come pre-tuned where manually changed to 8k from the 16k default due to better performance that was confirmed in multiple customer engagements and tests with various settings , therefore we change the default to what it should be in the field so people are not bothered to set it anymore (simplification) or get benefits by changing the default to provides better performance. all this happened when we did the communication code overhaul that did lead to significant (think factors) of improved RPC performance for RDMA and VERBS workloads. there is another round of significant enhancements coming soon , that will make even more parameters either obsolete or change some of the defaults for better out of the box performance. i see that we should probably enhance the communication of this changes, not that i think this will have any negative effect compared to what your performance was with the old setting i am actually pretty confident that you get better performance with the new code, but by setting parameters back to default on most 'manual tuned' probably makes your system even faster. if you have a Scale Client on 4.2.3+ you really shouldn't have anything set beside maxfilestocache, pagepool, workerthreads and potential prefetch , if you are a protocol node, this and settings specific to an export (e.g. SMB, NFS set some special settings) , pretty much everything else these days should be set to default so the code can pick the correct parameters., if its not and you get better performance by manual tweaking something i like to hear about it. on the communication side in the next release will eliminate another set of parameters that are now 'auto set' and we plan to work on NSD next. i presented various slides about the communication and simplicity changes in various forums, latest public non NDA slides i presented are here --> http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf hope this helps . Sven On Fri, Sep 1, 2017 at 1:56 PM Edward Wahl wrote: > Howdy. Just noticed this change to min RDMA packet size and I don't seem > to > see it in any patch notes. Maybe I just skipped the one where this > changed? > > mmlsconfig verbsRdmaMinBytes > verbsRdmaMinBytes 16384 > > (in case someone thinks we changed it) > > [root at proj-nsd01 ~]# mmlsconfig |grep verbs > verbsRdma enable > verbsRdma disable > verbsRdmasPerConnection 14 > verbsRdmasPerNode 1024 > verbsPorts mlx5_3/1 > verbsPorts mlx4_0 > verbsPorts mlx5_0 > verbsPorts mlx5_0 mlx5_1 > verbsPorts mlx4_1/1 > verbsPorts mlx4_1/2 > > > Oddly I also see this in config, though I've seen these kinds of things > before. > mmdiag --config |grep verbsRdmaMinBytes > verbsRdmaMinBytes 8192 > > We're on a recent efix. > Current GPFS build: "4.2.2.3 efix21 (1028007)". > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 <(614)%20292-9302> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From truongv at us.ibm.com Fri Sep 1 23:56:23 2017 From: truongv at us.ibm.com (Truong Vu) Date: Fri, 1 Sep 2017 18:56:23 -0400 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: Message-ID: The discrepancy between the mmlsconfig view and mmdiag has been fixed in GFPS 4.2.3 version. Note, mmdiag reports the correct default value. Tru. From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 09/01/2017 06:43 PM Subject: gpfsug-discuss Digest, Vol 68, Issue 2 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=xZHUN9ZlFjvgBmBB8wnX2cQDQQV42R_q-xHubNA3JBM&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: GPFS GUI Nodes > NSD no data (Sobey, Richard A) 2. Change to default for verbsRdmaMinBytes? (Edward Wahl) 3. Quorum managers (Joshua Akers) 4. Re: Change to default for verbsRdmaMinBytes? (Sven Oehme) ---------------------------------------------------------------------- Message: 1 Date: Fri, 1 Sep 2017 13:36:56 +0000 From: "Sobey, Richard A" To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS GUI Nodes > NSD no data Message-ID: Content-Type: text/plain; charset="us-ascii" Resolved this, guessed at changing GPFSNSDDisk.period to 5. From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 01 September 2017 09:45 To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] GPFS GUI Nodes > NSD no data For some time now if I go into the GUI, select Monitoring > Nodes > NSD Server Nodes, the only columns with good data are Name, State and NSD Count. Everything else e.g. Avg Disk Wait Read is listed "N/A". Is this another config option I need to enable? It's been bugging me for a while, I don't think I've seen it work since 4.2.1 which was the first time I saw the GUI. Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20170901_2a4162e9_attachment-2D0001.html&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=jcPGl5zwtQFMbnEmBpNErsD43uwoVeKgKk_8j7ZeCJY&e= > ------------------------------ Message: 2 Date: Fri, 1 Sep 2017 16:56:25 -0400 From: Edward Wahl To: gpfsug main discussion list Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? Message-ID: <20170901165625.6e4edd4c at osc.edu> Content-Type: text/plain; charset="US-ASCII" Howdy. Just noticed this change to min RDMA packet size and I don't seem to see it in any patch notes. Maybe I just skipped the one where this changed? 
mmlsconfig verbsRdmaMinBytes verbsRdmaMinBytes 16384 (in case someone thinks we changed it) [root at proj-nsd01 ~]# mmlsconfig |grep verbs verbsRdma enable verbsRdma disable verbsRdmasPerConnection 14 verbsRdmasPerNode 1024 verbsPorts mlx5_3/1 verbsPorts mlx4_0 verbsPorts mlx5_0 verbsPorts mlx5_0 mlx5_1 verbsPorts mlx4_1/1 verbsPorts mlx4_1/2 Oddly I also see this in config, though I've seen these kinds of things before. mmdiag --config |grep verbsRdmaMinBytes verbsRdmaMinBytes 8192 We're on a recent efix. Current GPFS build: "4.2.2.3 efix21 (1028007)". -- Ed Wahl Ohio Supercomputer Center 614-292-9302 ------------------------------ Message: 3 Date: Fri, 01 Sep 2017 21:06:15 +0000 From: Joshua Akers To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] Quorum managers Message-ID: Content-Type: text/plain; charset="utf-8" Hi all, I was wondering how most people set up quorum managers. We historically had physical admin nodes be the quorum managers, but are switching to a virtualized admin services infrastructure. We have been choosing a few compute nodes to act as quorum managers in our client clusters, but have considered using virtual machines instead. Has anyone else done this? Regards, Josh -- *Joshua D. Akers* *HPC Team Lead* NI&S Systems Support (MC0214) 1700 Pratt Drive Blacksburg, VA 24061 540-231-9506 -------------- next part -------------- An HTML attachment was scrubbed... URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20170901_a49947db_attachment-2D0001.html&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=Gag0raQbp7KZAyINlnmuxlnpjboo9XOWO3dDL2HCsZo&e= > ------------------------------ Message: 4 Date: Fri, 01 Sep 2017 22:42:55 +0000 From: Sven Oehme To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? Message-ID: Content-Type: text/plain; charset="utf-8" Hi Ed, yes the defaults for that have changed for customers who had not overridden the default settings. the reason we did this was that many systems in the field including all ESS systems that come pre-tuned where manually changed to 8k from the 16k default due to better performance that was confirmed in multiple customer engagements and tests with various settings , therefore we change the default to what it should be in the field so people are not bothered to set it anymore (simplification) or get benefits by changing the default to provides better performance. all this happened when we did the communication code overhaul that did lead to significant (think factors) of improved RPC performance for RDMA and VERBS workloads. there is another round of significant enhancements coming soon , that will make even more parameters either obsolete or change some of the defaults for better out of the box performance. i see that we should probably enhance the communication of this changes, not that i think this will have any negative effect compared to what your performance was with the old setting i am actually pretty confident that you get better performance with the new code, but by setting parameters back to default on most 'manual tuned' probably makes your system even faster. if you have a Scale Client on 4.2.3+ you really shouldn't have anything set beside maxfilestocache, pagepool, workerthreads and potential prefetch , if you are a protocol node, this and settings specific to an export (e.g. 
SMB, NFS set some special settings) , pretty much everything else these days should be set to default so the code can pick the correct parameters., if its not and you get better performance by manual tweaking something i like to hear about it. on the communication side in the next release will eliminate another set of parameters that are now 'auto set' and we plan to work on NSD next. i presented various slides about the communication and simplicity changes in various forums, latest public non NDA slides i presented are here --> https://urldefense.proofpoint.com/v2/url?u=http-3A__files.gpfsug.org_presentations_2017_Manchester_08-5FResearch-5FTopics.pdf&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=8c_55Ld_iAC2sr_QU0cyGiOiyU7Z9NjcVknVuRpRIlk&e= hope this helps . Sven On Fri, Sep 1, 2017 at 1:56 PM Edward Wahl wrote: > Howdy. Just noticed this change to min RDMA packet size and I don't seem > to > see it in any patch notes. Maybe I just skipped the one where this > changed? > > mmlsconfig verbsRdmaMinBytes > verbsRdmaMinBytes 16384 > > (in case someone thinks we changed it) > > [root at proj-nsd01 ~]# mmlsconfig |grep verbs > verbsRdma enable > verbsRdma disable > verbsRdmasPerConnection 14 > verbsRdmasPerNode 1024 > verbsPorts mlx5_3/1 > verbsPorts mlx4_0 > verbsPorts mlx5_0 > verbsPorts mlx5_0 mlx5_1 > verbsPorts mlx4_1/1 > verbsPorts mlx4_1/2 > > > Oddly I also see this in config, though I've seen these kinds of things > before. > mmdiag --config |grep verbsRdmaMinBytes > verbsRdmaMinBytes 8192 > > We're on a recent efix. > Current GPFS build: "4.2.2.3 efix21 (1028007)". > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 <(614)%20292-9302> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=xZHUN9ZlFjvgBmBB8wnX2cQDQQV42R_q-xHubNA3JBM&e= > -------------- next part -------------- An HTML attachment was scrubbed... URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20170901_b75cfc74_attachment.html&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=LpVpXMgqE_LD-t_J7yfNwURUrdUR29TzWvjVTi18kpA&e= > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=yK4FkYvJ21ubvurR6W1Pi3qvNw9ydj2XP0ghXPc7DUw&s=xZHUN9ZlFjvgBmBB8wnX2cQDQQV42R_q-xHubNA3JBM&e= End of gpfsug-discuss Digest, Vol 68, Issue 2 ********************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Sat Sep 2 10:35:34 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Sat, 2 Sep 2017 09:35:34 +0000 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Message-ID: Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. Pid=5134 Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From truongv at us.ibm.com Sat Sep 2 12:40:15 2017 From: truongv at us.ibm.com (Truong Vu) Date: Sat, 2 Sep 2017 07:40:15 -0400 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log In-Reply-To: References: Message-ID: The dates that have the zone abbreviation are from the scripts which use the OS date command. The daemon has its own format. This inconsistency has been address in 4.2.2. From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 09/02/2017 07:00 AM Subject: gpfsug-discuss Digest, Vol 68, Issue 4 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=wiPE5K_0qzTwdloCshNcSyamVNRJKz5WyOBal7dMz8w&s=pd3-zi8UQxVOjxOYxqbuaFSvv_71WENUBJsw0KUV3ro&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Date formats inconsistent mmfs.log (Sobey, Richard A) ---------------------------------------------------------------------- Message: 1 Date: Sat, 2 Sep 2017 09:35:34 +0000 From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Message-ID: Content-Type: text/plain; charset="us-ascii" Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. 
Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. Pid=5134 Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20170902_4f65f336_attachment-2D0001.html&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=wiPE5K_0qzTwdloCshNcSyamVNRJKz5WyOBal7dMz8w&s=fNT71mM8obJ9rwxzm3Uzxw4mayi2pQg1u950E1raYK4&e= > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=wiPE5K_0qzTwdloCshNcSyamVNRJKz5WyOBal7dMz8w&s=pd3-zi8UQxVOjxOYxqbuaFSvv_71WENUBJsw0KUV3ro&e= End of gpfsug-discuss Digest, Vol 68, Issue 4 ********************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From john.hearns at asml.com Mon Sep 4 08:43:59 2017 From: john.hearns at asml.com (John Hearns) Date: Mon, 4 Sep 2017 07:43:59 +0000 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log In-Reply-To: References: Message-ID: Richard, The date format changed at an update level. We recently updated to 4.2.3 and when you run mmchconfig release=LATEST you are prompted to confirm that the new log format can be used. I guess you might not have cut all nodes over yet on your update over the weekend? Cut and paste from the documentation: mmfsLogTimeStampISO8601={yes | no} Setting this parameter to no allows the cluster to continue running with the earlier log time stamp format. For more information, see Security mode. * Set mmfsLogTimeStampISO8061 to no if you save log information and you are not yet ready to switch to the new log time stamp format. After you complete the migration, you can change the log time stamp format at any time with the mmchconfig command. * Omit this parameter if you are ready to switch to the new format. The default value is yes From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: Saturday, September 02, 2017 11:36 AM To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. 
Pid=5134 Cheers Richard -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Sep 4 09:05:10 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 4 Sep 2017 08:05:10 +0000 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log In-Reply-To: References: , Message-ID: Ah. I'm running 4.2.3 but haven't changed the release level. I'll get that sorted out. Thanks for the replies! Get Outlook for Android ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of John Hearns Sent: Monday, September 4, 2017 8:43:59 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Date formats inconsistent mmfs.log Richard, The date format changed at an update level. We recently updated to 4.2.3 and when you run mmchconfig release=LATEST you are prompted to confirm that the new log format can be used. I guess you might not have cut all nodes over yet on your update over the weekend? Cut and paste from the documentation: mmfsLogTimeStampISO8601={yes | no} Setting this parameter to no allows the cluster to continue running with the earlier log time stamp format. For more information, see Security mode. ? Set mmfsLogTimeStampISO8061 to no if you save log information and you are not yet ready to switch to the new log time stamp format. After you complete the migration, you can change the log time stamp format at any time with the mmchconfig command. ? Omit this parameter if you are ready to switch to the new format. The default value is yes From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: Saturday, September 02, 2017 11:36 AM To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. 
Pid=5134 Cheers Richard -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckrafft at de.ibm.com Mon Sep 4 13:02:49 2017 From: ckrafft at de.ibm.com (Christoph Krafft) Date: Mon, 4 Sep 2017 12:02:49 +0000 Subject: [gpfsug-discuss] Looking for Use-Cases with Spectrum Scale / ESS with vRanger & VMware Message-ID: An HTML attachment was scrubbed... URL: From heiner.billich at psi.ch Mon Sep 4 17:48:20 2017 From: heiner.billich at psi.ch (Billich Heinrich Rainer (PSI)) Date: Mon, 4 Sep 2017 16:48:20 +0000 Subject: [gpfsug-discuss] Use AFM for migration of many small files Message-ID: <467FB293-D33B-46F4-BA1B-A5CB4D28DDE6@psi.ch> Hello, We use AFM prefetch to migrate data between two clusters (using NFS). This works fine with large files, say 1+GB. But we have millions of smaller files, about 1MB each. Here I see just ~150MB/s ? compare this to the 1000+MB/s we get for larger files. I assume that we would need more parallelism, does prefetch pull just one file at a time? So each file needs some or many metadata operations plus a single or just a few read and writes. Doing this sequentially adds up all the latencies of NFS+GPFS. This is my explanation. With larger files gpfs prefetch on home will help. Please can anybody comment: Is this right, does AFM prefetch handle one file at a time in a sequential manner? And is there any way to change this behavior? Or am I wrong and I need to look elsewhere to get better performance for prefetch of many smaller files? We will migrate several filesets in parallel, but still with individual filesets up to 350TB in size 150MB/s isn?t fun. Also just about 150 files/s seconds looks poor. The setup is quite new, hence there may be other places to look at. It?s all RHEL7 an spectrum scale 4.2.2-3 on the afm cache. Thank you, Heiner --, Paul Scherrer Institut Science IT Heiner Billich WHGA 106 CH 5232 Villigen PSI 056 310 36 02 https://www.psi.ch From vpuvvada at in.ibm.com Tue Sep 5 15:27:21 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 5 Sep 2017 19:57:21 +0530 Subject: [gpfsug-discuss] Use AFM for migration of many small files In-Reply-To: <467FB293-D33B-46F4-BA1B-A5CB4D28DDE6@psi.ch> References: <467FB293-D33B-46F4-BA1B-A5CB4D28DDE6@psi.ch> Message-ID: Which version of Spectrum Scale ? What is the fileset mode ? >We use AFM prefetch to migrate data between two clusters (using NFS). This works fine with large files, say 1+GB. But we have millions of smaller files, about 1MB each. Here >I see just ~150MB/s ? compare this to the 1000+MB/s we get for larger files. How was the performance measured ? 
If parallel IO is enabled, AFM uses multiple gateway nodes to prefetch the large files (if the file size is more than 1GB). The performance difference between small and large files is huge here (1000MB/s - 150MB/s = 850MB/s), and generally that is not the case. How many files were present in the list file for prefetch? Could you also share a full internaldump from the gateway node? >I assume that we would need more parallelism, does prefetch pull just one file at a time? So each file needs some or many metadata operations plus a single or just a few >read and writes. Doing this sequentially adds up all the latencies of NFS+GPFS. This is my explanation. With larger files gpfs prefetch on home will help. AFM prefetches the files on multiple threads. Default flush threads for prefetch are 36 (fileset.afmNumFlushThreads (default 4) + afmNumIOFlushThreads (default 32)). >Please can anybody comment: Is this right, does AFM prefetch handle one file at a time in a sequential manner? And is there any way to change this behavior? Or am I wrong and >I need to look elsewhere to get better performance for prefetch of many smaller files? See above, AFM reads files on multiple threads in parallel. Try increasing afmNumFlushThreads on the fileset and verify whether it improves the performance. ~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (PSI)" To: "gpfsug-discuss at spectrumscale.org" Date: 09/04/2017 10:18 PM Subject: [gpfsug-discuss] Use AFM for migration of many small files Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, We use AFM prefetch to migrate data between two clusters (using NFS). This works fine with large files, say 1+GB. But we have millions of smaller files, about 1MB each. Here I see just ~150MB/s - compare this to the 1000+MB/s we get for larger files. I assume that we would need more parallelism, does prefetch pull just one file at a time? So each file needs some or many metadata operations plus a single or just a few read and writes. Doing this sequentially adds up all the latencies of NFS+GPFS. This is my explanation. With larger files gpfs prefetch on home will help. Please can anybody comment: Is this right, does AFM prefetch handle one file at a time in a sequential manner? And is there any way to change this behavior? Or am I wrong and I need to look elsewhere to get better performance for prefetch of many smaller files? We will migrate several filesets in parallel, but still with individual filesets up to 350TB in size 150MB/s isn't fun. Also, just about 150 files per second looks poor. The setup is quite new, hence there may be other places to look at. It's all RHEL7 and Spectrum Scale 4.2.2-3 on the AFM cache. Thank you, Heiner -- Paul Scherrer Institut Science IT Heiner Billich WHGA 106 CH 5232 Villigen PSI 056 310 36 02 https://www.psi.ch _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed...
URL: From kenneth.waegeman at ugent.be Wed Sep 6 12:55:20 2017 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Wed, 6 Sep 2017 13:55:20 +0200 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: <20170901165625.6e4edd4c@osc.edu> Message-ID: Hi Sven, I see two parameters that we have set to non-default values that are not in your list of options still to configure. verbsRdmasPerConnection (256) and socketMaxListenConnections (1024) I remember we had to set socketMaxListenConnections because our cluster consist of +550 nodes. Are these settings still needed, or is this also tackled in the code? Thank you!! Cheers, Kenneth On 02/09/17 00:42, Sven Oehme wrote: > Hi Ed, > > yes the defaults for that have changed for customers who had not > overridden the default settings. the reason we did this was that many > systems in the field including all ESS systems that come pre-tuned > where manually changed to 8k from the 16k default due to better > performance that was confirmed in multiple customer engagements and > tests with various settings , therefore we change the default to what > it should be in the field so people are not bothered to set it anymore > (simplification) or get benefits by changing the default to provides > better performance. > all this happened when we did the communication code overhaul that did > lead to significant (think factors) of improved RPC performance for > RDMA and VERBS workloads. > there is another round of significant enhancements coming soon , that > will make even more parameters either obsolete or change some of the > defaults for better out of the box performance. > i see that we should probably enhance the communication of this > changes, not that i think this will have any negative effect compared > to what your performance was with the old setting i am actually pretty > confident that you get better performance with the new code, but by > setting parameters back to default on most 'manual tuned' probably > makes your system even faster. > if you have a Scale Client on 4.2.3+ you really shouldn't have > anything set beside maxfilestocache, pagepool, workerthreads and > potential prefetch , if you are a protocol node, this and settings > specific to an export (e.g. SMB, NFS set some special settings) , > pretty much everything else these days should be set to default so the > code can pick the correct parameters., if its not and you get better > performance by manual tweaking something i like to hear about it. > on the communication side in the next release will eliminate another > set of parameters that are now 'auto set' and we plan to work on NSD > next. > i presented various slides about the communication and simplicity > changes in various forums, latest public non NDA slides i presented > are here --> > http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf > > hope this helps . > > Sven > > > > On Fri, Sep 1, 2017 at 1:56 PM Edward Wahl > wrote: > > Howdy. Just noticed this change to min RDMA packet size and I > don't seem to > see it in any patch notes. Maybe I just skipped the one where > this changed? 
> > mmlsconfig verbsRdmaMinBytes > verbsRdmaMinBytes 16384 > > (in case someone thinks we changed it) > > [root at proj-nsd01 ~]# mmlsconfig |grep verbs > verbsRdma enable > verbsRdma disable > verbsRdmasPerConnection 14 > verbsRdmasPerNode 1024 > verbsPorts mlx5_3/1 > verbsPorts mlx4_0 > verbsPorts mlx5_0 > verbsPorts mlx5_0 mlx5_1 > verbsPorts mlx4_1/1 > verbsPorts mlx4_1/2 > > > Oddly I also see this in config, though I've seen these kinds of > things before. > mmdiag --config |grep verbsRdmaMinBytes > verbsRdmaMinBytes 8192 > > We're on a recent efix. > Current GPFS build: "4.2.2.3 efix21 (1028007)". > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Sep 6 13:22:41 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 6 Sep 2017 14:22:41 +0200 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: <20170901165625.6e4edd4c@osc.edu> Message-ID: An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Sep 6 13:29:44 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 6 Sep 2017 12:29:44 +0000 Subject: [gpfsug-discuss] Save the date! GPFS-UG meeting at SC17 - Sunday November 12th Message-ID: <7838054B-8A46-46A0-8A53-81E3049B4AE7@nuance.com> The 2017 Supercomputing conference is only 2 months away, and here?s a reminder to come early and attend the GPFS user group meeting. The meeting is tentatively scheduled from the afternoon of Sunday, November 12th. Exact location and times are still being discussed. If you have an interest in presenting at the user group meeting, please let us know. More details in the coming weeks. Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... URL: From damir.krstic at gmail.com Wed Sep 6 13:35:45 2017 From: damir.krstic at gmail.com (Damir Krstic) Date: Wed, 06 Sep 2017 12:35:45 +0000 Subject: [gpfsug-discuss] filesets inside of filesets Message-ID: Today we have following fileset structure on our filesystem: /projects <-- gpfs filesystem /projects/b1000 <-- b1000 is a fileset with a fileset quota applied to it I need to create a fileset or a directory inside of this project and have separate quota applied to it e.g.: /projects/b1000 (b1000 has 10TB quota applied) /projects/b1000/backup (backup has 1TB quota applied) Is this possible? I am thinking nested filesets would work if GPFS supports that. Otherwise, I was going to create a separate filesystem, create corresponding backup filesets on it and symlink them to the /projects/ directory. Thanks in advance. Damir -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Wed Sep 6 13:43:09 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Wed, 6 Sep 2017 12:43:09 +0000 Subject: [gpfsug-discuss] filesets inside of filesets In-Reply-To: References: Message-ID: Filesets in filesets are fine. BUT if you use scoped backups with TSM... 
Er Spectrum Protect, then there are restrictions on creating an IFS inside an IFS ... Simon From: > on behalf of "damir.krstic at gmail.com" > Reply-To: "gpfsug-discuss at spectrumscale.org" > Date: Wednesday, 6 September 2017 at 13:35 To: "gpfsug-discuss at spectrumscale.org" > Subject: [gpfsug-discuss] filesets inside of filesets Today we have following fileset structure on our filesystem: /projects <-- gpfs filesystem /projects/b1000 <-- b1000 is a fileset with a fileset quota applied to it I need to create a fileset or a directory inside of this project and have separate quota applied to it e.g.: /projects/b1000 (b1000 has 10TB quota applied) /projects/b1000/backup (backup has 1TB quota applied) Is this possible? I am thinking nested filesets would work if GPFS supports that. Otherwise, I was going to create a separate filesystem, create corresponding backup filesets on it and symlink them to the /projects/ directory. Thanks in advance. Damir -------------- next part -------------- An HTML attachment was scrubbed... URL: From rohwedder at de.ibm.com Wed Sep 6 13:51:47 2017 From: rohwedder at de.ibm.com (Markus Rohwedder) Date: Wed, 6 Sep 2017 14:51:47 +0200 Subject: [gpfsug-discuss] filesets inside of filesets In-Reply-To: References: Message-ID: Hello Damir, the files that belong to your fileset "backup" has a separate quota, it is not related to the quota in "b1000". There is no cumulative quota. Fileset Nesting may need other considerations as well, in some cases filesets behave different than simple directories. -> For NFSV4 ACLs, inheritance stops at the fileset boundaries -> Snapshots include the independent parent and the dependent children. Nested independent filesets are not included in a fileset snapshot. -> Export protocols like NFS or SMB will cross fileset boundaries and just treat them like a directory. Mit freundlichen Gr??en / Kind regards Dr. Markus Rohwedder Spectrum Scale GUI Development Phone: +49 7034 6430190 IBM Deutschland E-Mail: rohwedder at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina K?deritz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 From: Damir Krstic To: gpfsug main discussion list Date: 09/06/2017 02:36 PM Subject: [gpfsug-discuss] filesets inside of filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Today we have following fileset structure on our filesystem: /projects <-- gpfs filesystem /projects/b1000 <-- b1000 is a fileset with a fileset quota applied to it I need to create a fileset or a directory inside of this project and have separate quota applied to it e.g.: /projects/b1000 (b1000 has 10TB quota applied) /projects/b1000/backup (backup has 1TB quota applied) Is this possible? I am thinking nested filesets would work if GPFS supports that. Otherwise, I was going to create a separate filesystem, create corresponding backup filesets on it and symlink them to the /projects/ directory. Thanks in advance. Damir_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=5jyA3TazAAOckIeQUeIG0CJ4TG0aMWv7jDLDk3gYNkE&s=CbzPKTgh7mO6om2LTQr94LM1qfshrEdm58cJydejAfE&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 1B378274.gif Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From oehmes at gmail.com Wed Sep 6 14:32:40 2017 From: oehmes at gmail.com (Sven Oehme) Date: Wed, 06 Sep 2017 13:32:40 +0000 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: <20170901165625.6e4edd4c@osc.edu> Message-ID: Hi, you still need both of them, but they are both on the list to be removed, the first is already integrated for the next major release, the 2nd we still work on. Sven On Wed, Sep 6, 2017 at 4:55 AM Kenneth Waegeman wrote: > Hi Sven, > > I see two parameters that we have set to non-default values that are not > in your list of options still to configure. > verbsRdmasPerConnection (256) and > socketMaxListenConnections (1024) > > I remember we had to set socketMaxListenConnections because our cluster > consist of +550 nodes. > > Are these settings still needed, or is this also tackled in the code? > > Thank you!! > > Cheers, > Kenneth > > > > On 02/09/17 00:42, Sven Oehme wrote: > > Hi Ed, > > yes the defaults for that have changed for customers who had not > overridden the default settings. the reason we did this was that many > systems in the field including all ESS systems that come pre-tuned where > manually changed to 8k from the 16k default due to better performance that > was confirmed in multiple customer engagements and tests with various > settings , therefore we change the default to what it should be in the > field so people are not bothered to set it anymore (simplification) or get > benefits by changing the default to provides better performance. > all this happened when we did the communication code overhaul that did > lead to significant (think factors) of improved RPC performance for RDMA > and VERBS workloads. > there is another round of significant enhancements coming soon , that will > make even more parameters either obsolete or change some of the defaults > for better out of the box performance. > i see that we should probably enhance the communication of this changes, > not that i think this will have any negative effect compared to what your > performance was with the old setting i am actually pretty confident that > you get better performance with the new code, but by setting parameters > back to default on most 'manual tuned' probably makes your system even > faster. > if you have a Scale Client on 4.2.3+ you really shouldn't have anything > set beside maxfilestocache, pagepool, workerthreads and potential prefetch > , if you are a protocol node, this and settings specific to an export > (e.g. SMB, NFS set some special settings) , pretty much everything else > these days should be set to default so the code can pick the correct > parameters., if its not and you get better performance by manual tweaking > something i like to hear about it. > on the communication side in the next release will eliminate another set > of parameters that are now 'auto set' and we plan to work on NSD next. 
> i presented various slides about the communication and simplicity changes > in various forums, latest public non NDA slides i presented are here --> > http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf > > hope this helps . > > Sven > > > > On Fri, Sep 1, 2017 at 1:56 PM Edward Wahl < ewahl at osc.edu> > wrote: > >> Howdy. Just noticed this change to min RDMA packet size and I don't >> seem to >> see it in any patch notes. Maybe I just skipped the one where this >> changed? >> >> mmlsconfig verbsRdmaMinBytes >> verbsRdmaMinBytes 16384 >> >> (in case someone thinks we changed it) >> >> [root at proj-nsd01 ~]# mmlsconfig |grep verbs >> verbsRdma enable >> verbsRdma disable >> verbsRdmasPerConnection 14 >> verbsRdmasPerNode 1024 >> verbsPorts mlx5_3/1 >> verbsPorts mlx4_0 >> verbsPorts mlx5_0 >> verbsPorts mlx5_0 mlx5_1 >> verbsPorts mlx4_1/1 >> verbsPorts mlx4_1/2 >> >> >> Oddly I also see this in config, though I've seen these kinds of things >> before. >> mmdiag --config |grep verbsRdmaMinBytes >> verbsRdmaMinBytes 8192 >> >> We're on a recent efix. >> Current GPFS build: "4.2.2.3 efix21 (1028007)". >> >> -- >> >> Ed Wahl >> Ohio Supercomputer Center >> 614-292-9302 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From heiner.billich at psi.ch Wed Sep 6 17:16:18 2017 From: heiner.billich at psi.ch (Billich Heinrich Rainer (PSI)) Date: Wed, 6 Sep 2017 16:16:18 +0000 Subject: [gpfsug-discuss] Use AFM for migration of many small files Message-ID: <7D6EFD03-5D74-4A7B-A0E8-2AD41B050E15@psi.ch> Hello Venkateswara, Edward, Thank you for the comments on how to speed up AFM prefetch with small files. We run 4.2.2-3 and the AFM mode is RO and we have just a single gateway, i.e. no parallel reads for large files. We will try to increase the value of afmNumFlushThreads. It wasn't clear to me that these threads do read from home, too - at least for prefetch. First I will try a plain NFS mount and see how parallel reads of many small files scale the throughput. Next I will try AFM prefetch. I don't do nice benchmarking, just watching dstat output. We prefetch 100,000 files in one bunch, so there is ample time to observe. The basic issue is that we get just about 45MB/s for sequential read of many 1000 files with 1MB per file on the home cluster, i.e. we read one file at a time before we switch to the next. This is no surprise. Each read takes about 20ms to complete, so at most we get 50 reads of 1MB per second. We've seen this on classical RAID storage and on DSS/ESS systems. It's likely just the physics of spinning disks and the fact that we do one read at a time and don't allow any parallelism: we wait for one or two I/Os to single disks to complete before we continue. With larger files prefetch jumps in and fires many reads in parallel. To get 1,000MB/s I need to do 1,000 reads/s and need to have ~20 reads in progress in parallel all the time - we'll see how close we get to 1,000MB/s with "many small files".
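Roughly what we plan to try on the cache cluster is sketched below. The file system, fileset and list file names (fs1, migr_fset, /tmp/small-files.list) are only placeholders, and the exact syntax, the allowed values and whether the fileset has to be unlinked first still need to be checked against the mmchfileset, mmchconfig and mmafmctl documentation for 4.2.2 before we run it:

# Raise the per-fileset AFM flush threads used by prefetch
# (default 4 as quoted above; 32 is just a value to test with)
mmchfileset fs1 migr_fset -p afmNumFlushThreads=32

# Possibly also raise the cluster-wide IO flush threads on the gateway
# (afmNumIOFlushThreads, default 32 as quoted above)
mmchconfig afmNumIOFlushThreads=64 -i

# Re-run the prefetch with a list file of the small files and watch dstat
# on the gateway while it runs
mmafmctl fs1 prefetch -j migr_fset --list-file /tmp/small-files.list

If the throughput scales with the thread count then the per-file latency really is the limit; if it does not, the plain NFS mount test should tell us whether the bottleneck sits on the home side instead.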
Kind regards, Heiner -- Paul Scherrer Institut Science IT Heiner Billich WHGA 106 CH 5232 Villigen PSI 056 310 36 02 https://www.psi.ch From stijn.deweirdt at ugent.be Wed Sep 6 18:13:48 2017 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Wed, 6 Sep 2017 19:13:48 +0200 Subject: [gpfsug-discuss] mixed verbsRdmaSend Message-ID: hi all, what is the expected behaviour of a mixed verbsRdmaSend setup: some nodes enabled, most disabled. we have some nodes that have a very high iops workload, but most of the cluster of 500+ nodes do not have such usecase. we enabled verbsRdmaSend on the managers/quorum nodes (<10) and on the few (<10) clients with this workload, but not on the others (500+). it seems to work out fine, but is this acceptable as config? (the docs mention that enabling verbsrdamSend on a> 100 nodes might lead to errors). the nodes use ipoib as ip network, and running with verbsRdmaSend disabled on all nodes leads to unstable cluster (TX errors (<1 error in 1M packets) on some clients leading to gpfs expel nodes etc). (we still need to open a case wil mellanox to investigate further) many thanks, stijn From gcorneau at us.ibm.com Thu Sep 7 00:30:23 2017 From: gcorneau at us.ibm.com (Glen Corneau) Date: Wed, 6 Sep 2017 18:30:23 -0500 Subject: [gpfsug-discuss] Happy 20th birthday GPFS !! Message-ID: Sorry I missed the anniversary of your conception (announcement letter) back on August 26th, so I hope you'll accept my belated congratulations on this long and exciting journey! https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=ca&infotype=an&appname=iSource&supplier=897&letternum=ENUS297-318 I remember your parent, PIOFS, as well! Ahh the fun times. ------------------ Glen Corneau Power Systems Washington Systems Center gcorneau at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 26117 bytes Desc: not available URL: From xhejtman at ics.muni.cz Thu Sep 7 16:07:20 2017 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Thu, 7 Sep 2017 17:07:20 +0200 Subject: [gpfsug-discuss] Overwritting migrated files Message-ID: <20170907150720.h3t5fowvdlibvik4@ics.muni.cz> Hello, we have files about 100GB per file. Many of these files are migrated to tapes. (GPFS+TSM, tape storage is external pool and dsmmigrate, dsmrecall are in place). These files are images from bacula backup system. When bacula wants to reuse some of images, it needs to truncate the file to 64kB and overwrite it. Is there a way not to recall whole 100GB from tapes for only to truncate the file? I tried to do partial recall: dsmrecall -D -size=65k Vol03797 after recall processing finished, I tried to truncate the file using: dd if=/dev/zero of=Vol03797 count=0 bs=64k seek=1 which caused futher recall of the whole file: $ dsmls Vol03797 IBM Spectrum Protect Command Line Space Management Client Interface Client Version 8, Release 1, Level 2.0 Client date/time: 09/07/2017 15:01:59 (c) Copyright by IBM Corporation and other(s) 1990, 2017. All Rights Reserved. ActS ResS ResB FSt FName 107380819676 10485760 31373312 m (p) Vol03797 and ResB size has been growing to 107380819676. After dd finished: dsmls Vol03797 IBM Spectrum Protect Command Line Space Management Client Interface Client Version 8, Release 1, Level 2.0 Client date/time: 09/07/2017 15:08:03 (c) Copyright by IBM Corporation and other(s) 1990, 2017. All Rights Reserved. 
ActS ResS ResB FSt FName 65536 65536 64 r Vol03797 Is there another way to truncate the file and drop whole migrated part? -- Luk?? Hejtm?nek From john.hearns at asml.com Thu Sep 7 16:15:00 2017 From: john.hearns at asml.com (John Hearns) Date: Thu, 7 Sep 2017 15:15:00 +0000 Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig Message-ID: If I have an AFM setup where the home is located on a generic NFS share, let's say server:/volume/share When I come ot set this up does it make sense to run mmafmconfig on /volume/share ? I can mount the share as a plain old NFS mount in order to run this operation, before I create the cache side fileset. Apologies if I am being dumb as an ox here, and I deserve to be slapped in the face with a wet fish https://en.wikipedia.org/wiki/The_Fish-Slapping_Dance -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil.wilson at metoffice.gov.uk Thu Sep 7 16:33:58 2017 From: neil.wilson at metoffice.gov.uk (Wilson, Neil) Date: Thu, 7 Sep 2017 15:33:58 +0000 Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig In-Reply-To: References: Message-ID: I think you need to configure a gateway node (use mmchnode to change an existing node class to gateway) Then use mmafmconfig to setup export server maps on the gateway node. e.g. mmafmconfig -add "mapping1" -export-map "nfsServerIP"/"GatewayNode" (double quotes not required) mafmconfig show all Map name: mapping1 Export server map: IP/GatewayNode From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 07 September 2017 16:15 To: gpfsug main discussion list Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig If I have an AFM setup where the home is located on a generic NFS share, let's say server:/volume/share When I come ot set this up does it make sense to run mmafmconfig on /volume/share ? I can mount the share as a plain old NFS mount in order to run this operation, before I create the cache side fileset. Apologies if I am being dumb as an ox here, and I deserve to be slapped in the face with a wet fish https://en.wikipedia.org/wiki/The_Fish-Slapping_Dance -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. 
Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Thu Sep 7 16:52:19 2017 From: john.hearns at asml.com (John Hearns) Date: Thu, 7 Sep 2017 15:52:19 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Message-ID: Firmly lining myself up for a smack round the chops with a wet haddock... I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From janusz.malka at desy.de Thu Sep 7 20:23:36 2017 From: janusz.malka at desy.de (Malka, Janusz) Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> I had similar issue, I had to recover connection to home From: "John Hearns" To: "gpfsug main discussion list" Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. 
I find this reference, which is about as useful as a wet haddock: [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Thu Sep 7 22:16:34 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 7 Sep 2017 21:16:34 +0000 Subject: [gpfsug-discuss] SMB2 leases - oplocks - growing files In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Fri Sep 8 03:11:48 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 7 Sep 2017 22:11:48 -0400 Subject: [gpfsug-discuss] mmfsd write behavior Message-ID: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Hi Everyone, This is something that's come up in the past and has recently resurfaced with a project I've been working on, and that is-- it seems to me as though mmfsd never attempts to flush the cache of the block devices its writing to (looking at blktrace output seems to confirm this). Is this actually the case? I've looked at the gpl headers for linux and I don't see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or REQ_FLUSH. I'm sure there's other ways to trigger this behavior that GPFS may very well be using that I've missed. That's why I'm asking :) I figure with FPO being pushed as an HDFS replacement using commodity drives this feature has *got* to be in the code somewhere. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From oehmes at gmail.com Fri Sep 8 03:55:14 2017 From: oehmes at gmail.com (Sven Oehme) Date: Fri, 08 Sep 2017 02:55:14 +0000 Subject: [gpfsug-discuss] mmfsd write behavior In-Reply-To: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> References: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Message-ID: I am not sure what exactly you are looking for but all blockdevices are opened with O_DIRECT , we never cache anything on this layer . 
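If you want to see what actually reaches the devices you can trace the block layer underneath one NSD while GPFS is writing, or simply take the drive's volatile write cache out of the picture. A rough sketch with standard Linux tools - /dev/sdb is only an example device, and the exact RWBS flag letters are worth double checking against the blkparse man page on your kernel:

# Trace one NSD device while the workload runs; explicit cache flushes / FUA
# writes should show up with an 'F' in the RWBS column
blktrace -d /dev/sdb -o - | blkparse -i - | less

# Check and, if you do not trust the power-loss behaviour, disable the
# volatile write cache on the drive itself
hdparm -W /dev/sdb                   # SATA: show write-cache state
hdparm -W0 /dev/sdb                  # SATA: disable the write cache
sdparm --get=WCE /dev/sdb            # SAS/SCSI: show the WCE bit
sdparm --clear=WCE --save /dev/sdb   # SAS/SCSI: clear it persistently

With the volatile cache disabled, a completed O_DIRECT write should already be on stable media whether or not anyone sends an explicit flush, which is usually a simpler answer for NSDs than relying on the flush semantics of every device in the stack.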
On Thu, Sep 7, 2017, 7:11 PM Aaron Knister wrote: > Hi Everyone, > > This is something that's come up in the past and has recently resurfaced > with a project I've been working on, and that is-- it seems to me as > though mmfsd never attempts to flush the cache of the block devices its > writing to (looking at blktrace output seems to confirm this). Is this > actually the case? I've looked at the gpl headers for linux and I don't > see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or > REQ_FLUSH. I'm sure there's other ways to trigger this behavior that > GPFS may very well be using that I've missed. That's why I'm asking :) > > I figure with FPO being pushed as an HDFS replacement using commodity > drives this feature has *got* to be in the code somewhere. > > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Fri Sep 8 04:05:42 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 7 Sep 2017 23:05:42 -0400 Subject: [gpfsug-discuss] mmfsd write behavior In-Reply-To: References: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Message-ID: Thanks Sven. I didn't think GPFS itself was caching anything on that layer, but it's my understanding that O_DIRECT isn't sufficient to force I/O to be flushed (e.g. the device itself might have a volatile caching layer). Take someone using ZFS zvol's as NSDs. I can write() all day log to that zvol (even with O_DIRECT) but there is absolutely no guarantee those writes have been committed to stable storage and aren't just sitting in RAM until an fsync() occurs (or some other bio function that causes a flush). I also don't believe writing to a SATA drive with O_DIRECT will force cache flushes of the drive's writeback cache.. although I just tested that one and it seems to actually trigger a scsi cache sync. Interesting. -Aaron On 9/7/17 10:55 PM, Sven Oehme wrote: > I am not sure what exactly you are looking for but all blockdevices are > opened with O_DIRECT , we never cache anything on this layer . > > > On Thu, Sep 7, 2017, 7:11 PM Aaron Knister > wrote: > > Hi Everyone, > > This is something that's come up in the past and has recently resurfaced > with a project I've been working on, and that is-- it seems to me as > though mmfsd never attempts to flush the cache of the block devices its > writing to (looking at blktrace output seems to confirm this). Is this > actually the case? I've looked at the gpl headers for linux and I don't > see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or > REQ_FLUSH. I'm sure there's other ways to trigger this behavior that > GPFS may very well be using that I've missed. That's why I'm asking :) > > I figure with FPO being pushed as an HDFS replacement using commodity > drives this feature has *got* to be in the code somewhere. 
> > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From aaron.s.knister at nasa.gov Fri Sep 8 04:26:02 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 7 Sep 2017 23:26:02 -0400 Subject: [gpfsug-discuss] Happy 20th birthday GPFS !! In-Reply-To: References: Message-ID: <4a9feeb2-bb9d-8c9a-e506-926d8537cada@nasa.gov> Sounds like celebratory cake is in order for the users group in a few weeks ;) On 9/6/17 7:30 PM, Glen Corneau wrote: > Sorry I missed the anniversary of your conception ?(announcement letter) > back on August 26th, so I hope you'll accept my belated congratulations > on this long and exciting journey! > > https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=ca&infotype=an&appname=iSource&supplier=897&letternum=ENUS297-318 > > I remember your parent, PIOFS, as well! ?Ahh the fun times. > ------------------ > Glen Corneau > Power Systems > Washington Systems Center > gcorneau at us.ibm.com > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From vpuvvada at in.ibm.com Fri Sep 8 06:00:46 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Fri, 8 Sep 2017 10:30:46 +0530 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" To: gpfsug main discussion list Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Sent by: gpfsug-discuss-bounces at spectrumscale.org I had similar issue, I had to recover connection to home From: "John Hearns" To: "gpfsug main discussion list" Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. 
-- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Fri Sep 8 06:21:47 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Fri, 8 Sep 2017 10:51:47 +0530 Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig In-Reply-To: References: Message-ID: mmafmconfig command should be run on the target path (path specified in the afmTarget option when fileset is created). If many filesets are sharing the same target (ex independent writer mode) , enable AFM once on target path. Run the command at home cluster. mmafmconifg enable afmTarget ~Venkat (vpuvvada at in.ibm.com) From: "Wilson, Neil" To: gpfsug main discussion list Date: 09/07/2017 09:04 PM Subject: Re: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig Sent by: gpfsug-discuss-bounces at spectrumscale.org I think you need to configure a gateway node (use mmchnode to change an existing node class to gateway) Then use mmafmconfig to setup export server maps on the gateway node. e.g. mmafmconfig ?add ?mapping1? ?export-map ?nfsServerIP?/?GatewayNode? (double quotes not required) mafmconfig show all Map name: mapping1 Export server map: IP/GatewayNode From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 07 September 2017 16:15 To: gpfsug main discussion list Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig If I have an AFM setup where the home is located on a generic NFS share, let?s say server:/volume/share When I come ot set this up does it make sense to run mmafmconfig on /volume/share ? I can mount the share as a plain old NFS mount in order to run this operation, before I create the cache side fileset. Apologies if I am being dumb as an ox here, and I deserve to be slapped in the face with a wet fish https://en.wikipedia.org/wiki/The_Fish-Slapping_Dance -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). 
Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=kKlSEJqmVE6q8Qt02JNaDLsewp13C0yRAmlfc_djRkk&s=JIbuXlCiReZx3ws5__6juuGC-sAqM74296BuyzgyNYg&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From gellis at ocf.co.uk Fri Sep 8 08:04:51 2017 From: gellis at ocf.co.uk (Georgina Ellis) Date: Fri, 8 Sep 2017 07:04:51 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: References: Message-ID: <0CBB283A-A0A9-4FC9-A1CD-9E019D74CDB9@ocf.co.uk> I am still populating your lot 2 response - it is split across 3 word docs and a whole heap of emails so easier for me to keep going - I dropped u off a lot of emails to save filling your inbox :-) Could you poke around other tenders for the portal question please? Sent from my iPhone > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > From: "Malka, Janusz" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > Content-Type: text/plain; charset="utf-8" > > I had similar issue, I had to recover connection to home > > > From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > Mmdelfileset responds that : > > Fileset obfuscated has 1 fileset snapshot(s). > > > > When I try to delete the snapshot: > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. 
> > > > I find this reference, which is about as useful as a wet haddock: > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > The advice of the gallery is sought, please. > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 2 > Date: Thu, 7 Sep 2017 21:16:34 +0000 > From: "Christof Schmitt" > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > Message-ID: > > > Content-Type: text/plain; charset="us-ascii" > > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > ********************************************** From john.hearns at asml.com Fri Sep 8 08:26:01 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 8 Sep 2017 07:26:01 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? 
Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gellis at ocf.co.uk Fri Sep 8 08:33:51 2017 From: gellis at ocf.co.uk (Georgina Ellis) Date: Fri, 8 Sep 2017 07:33:51 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: References: Message-ID: <93DCF805-F703-4ED5-A079-A44992A9268C@ocf.co.uk> Apologies All, slip of the keyboard and not a comment on GPFS! Sent from my iPhone > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > From: "Malka, Janusz" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > Content-Type: text/plain; charset="utf-8" > > I had similar issue, I had to recover connection to home > > > From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > Mmdelfileset responds that : > > Fileset obfuscated has 1 fileset snapshot(s). > > > > When I try to delete the snapshot: > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. > > > > I find this reference, which is about as useful as a wet haddock: > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > The advice of the gallery is sought, please. > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 2 > Date: Thu, 7 Sep 2017 21:16:34 +0000 > From: "Christof Schmitt" > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > Message-ID: > > > Content-Type: text/plain; charset="us-ascii" > > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > ********************************************** From Sandra.McLaughlin at astrazeneca.com Fri Sep 8 10:12:02 2017 From: Sandra.McLaughlin at astrazeneca.com (McLaughlin, Sandra M) Date: Fri, 8 Sep 2017 09:12:02 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: John, I had a period when I had to delete and remake AFM filesets rather frequently ? this always worked for me: mmunlinkfileset device fset -f mmdelsnapshot device snapname -j fset mmdelfileset device fset -f Sandra From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 08 September 2017 08:26 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. 
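Before forcing anything, it can be worth confirming what AFM still holds on the cache fileset. A rough sketch, using invented device and fileset names (gpfs01, cachefset) rather than anything from this thread, and assuming the mmlsfileset/mmafmctl/mmlssnapshot syntax of recent 4.2.x releases:

# Show the AFM attributes and cache state of the fileset
mmlsfileset gpfs01 cachefset --afm -L
mmafmctl gpfs01 getstate -j cachefset

# List any fileset snapshots still attached; an internal pcache recovery
# snapshot left behind by an interrupted recovery should show up here
mmlssnapshot gpfs01 -j cachefset
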
-- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. ________________________________ AstraZeneca UK Limited is a company incorporated in England and Wales with registered number:03674842 and its registered office at 1 Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0AA. This e-mail and its attachments are intended for the above named recipient only and may contain confidential and privileged information. If they have come to you in error, you must not copy or show them to anyone; instead, please reply to this e-mail, highlighting the error to the sender and then immediately delete the message. For information about how AstraZeneca UK Limited and its affiliates may process information, personal data and monitor communications, please see our privacy notice at www.astrazeneca.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Sep 8 11:57:14 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 8 Sep 2017 10:57:14 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: Sandra, Thankyou for the help. 
I have a support ticket outstanding, and will see what is suggested. I am sure this is a simple matter of deleting the fileset as you say! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of McLaughlin, Sandra M Sent: Friday, September 08, 2017 11:12 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? John, I had a period when I had to delete and remake AFM filesets rather frequently ? this always worked for me: mmunlinkfileset device fset -f mmdelsnapshot device snapname -j fset mmdelfileset device fset -f Sandra From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 08 September 2017 08:26 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. 
Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. ________________________________ AstraZeneca UK Limited is a company incorporated in England and Wales with registered number:03674842 and its registered office at 1 Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0AA. This e-mail and its attachments are intended for the above named recipient only and may contain confidential and privileged information. If they have come to you in error, you must not copy or show them to anyone; instead, please reply to this e-mail, highlighting the error to the sender and then immediately delete the message. For information about how AstraZeneca UK Limited and its affiliates may process information, personal data and monitor communications, please see our privacy notice at www.astrazeneca.com -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kkr at lbl.gov Fri Sep 8 11:58:05 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Fri, 8 Sep 2017 03:58:05 -0700 Subject: [gpfsug-discuss] Hold the Date - Spectrum Scale Day @ HPCXXL (Sept 2017, NYC) In-Reply-To: <6EF4187F-D8A1-4927-9E4F-4DF703DA04F5@lbl.gov> References: <6EF4187F-D8A1-4927-9E4F-4DF703DA04F5@lbl.gov> Message-ID: Hello, The agenda for the GPFS Day during HPCXXL is fairly fleshed out here: http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ See notes on registration below, which is free but required. Use the HPCXXL registration form, which has a $0 GPFS Day registration option. Hope to see some of you there. Best, Kristy > On Aug 21, 2017, at 3:33 PM, Kristy Kallback-Rose wrote: > > If you plan on attending the GPFS Day, please use the HPCXXL registration form (link to Eventbrite registration at the link below). The GPFS day is a free event, but you *must* register so we can make sure there are enough seats and food available. > > If you would like to speak or suggest a topic, please let me know. > > http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ > > The agenda is still being worked on, here are some likely topics: > > --RoadMap/Updates > --"New features - New Bugs? (Julich) > --GPFS + Openstack (CSCS) > --ORNL Update on Spider3-related GPFS work > --ANL Site Update > --File Corruption Session > > Best, > Kristy > >> On Aug 8, 2017, at 11:33 AM, Kristy Kallback-Rose > wrote: >> >> Hello, >> >> The GPFS Day of the HPCXXL conference is confirmed for Thursday, September 28th. Here is an updated URL, the agenda and registration are still being put together http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ . The GPFS Day will require registration, so we can make sure there is enough room (and coffee/food) for all attendees ?however, there will be no registration fee if you attend the GPFS Day only. >> >> I?ll send another update when the agenda is closer to settled. >> >> Cheers, >> Kristy >> >>> On Jul 7, 2017, at 3:32 PM, Kristy Kallback-Rose > wrote: >>> >>> Hello, >>> >>> More details will be provided as they become available, but just so you can make a placeholder on your calendar, there will be a Spectrum Scale Day the week of September 25th - 29th, likely Thursday, September 28th. >>> >>> This will be a part of the larger HPCXXL meeting (https://www.spxxl.org/?q=New-York-City-2017 ). You may recall this group was formerly called SPXXL and the website is in the process of transitioning to the new name (and getting a new certificate). You will be able to attend *just* the Spectrum Scale day if that is the only portion of the event you would like to attend. >>> >>> The NYC location is great for Spectrum Scale events because many IBMers, including developers, can come in from Poughkeepsie. >>> >>> More as we get closer to the date and details are settled. >>> >>> Cheers, >>> Kristy >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From hpc.ken.tw25qn at gmail.com Fri Sep 8 19:30:32 2017 From: hpc.ken.tw25qn at gmail.com (Ken Atkinson) Date: Fri, 8 Sep 2017 19:30:32 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: References: <93DCF805-F703-4ED5-A079-A44992A9268C@ocf.co.uk> Message-ID: Not on too many G&Ts Georgina? How are things. 
Ken Atkinson On 8 Sep 2017 08:33, "Georgina Ellis" wrote: Apologies All, slip of the keyboard and not a comment on GPFS! Sent from my iPhone > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > From: "Malka, Janusz" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > Content-Type: text/plain; charset="utf-8" > > I had similar issue, I had to recover connection to home > > > From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > Mmdelfileset responds that : > > Fileset obfuscated has 1 fileset snapshot(s). > > > > When I try to delete the snapshot: > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. > > > > I find this reference, which is about as useful as a wet haddock: > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2. 3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2. 3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > The advice of the gallery is sought, please. > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: > > ------------------------------ > > Message: 2 > Date: Thu, 7 Sep 2017 21:16:34 +0000 > From: "Christof Schmitt" > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > Message-ID: > > > Content-Type: text/plain; charset="us-ascii" > > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > ********************************************** _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Fri Sep 8 22:14:04 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 8 Sep 2017 17:14:04 -0400 Subject: [gpfsug-discuss] multicluster security In-Reply-To: References: <83936033-ce82-0a9b-3714-1dbea4c317db@nasa.gov> Message-ID: <529f575b-eb11-a81e-2905-ab3237d41678@nasa.gov> Interesting! Thank you for the explanation. This makes me wish GPFS had a client access model that more closely mimicked parallel NAS, specifically for this reason. That then got me wondering about pNFS support. I've not been able to find much about that but in theory Ganesha supports pNFS. Does anyone know of successful pNFS testing with GPFS and if so how one would set up such a thing? -Aaron On 08/25/2017 06:41 PM, IBM Spectrum Scale wrote: > > Hi Aaron, > > If cluster A uses the mmauth command to grant a file system read-only > access to a remote cluster B, nodes on cluster B can only mount that > file system with read-only access. But the only checking being done at > the RPC level is the TLS authentication. This should prevent non-root > users from initiating RPCs, since TLS authentication requires access > to the local cluster's private key. However, a root user on cluster B, > having access to cluster B's private key, might be able to craft RPCs > that may allow one to work around the checks which are implemented at > the file system level. > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum > Scale (GPFS), then please post it to the public IBM developerWroks > Forum at > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. > > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please > contact 1-800-237-5511 in the United States or your local IBM Service > Center in other countries. > > The forum is informally monitored as time permits and should not be > used for priority messages to the Spectrum Scale (GPFS) team. > > Inactive hide details for Aaron Knister ---08/21/2017 11:04:06 PM---Hi > Everyone, I have a theoretical question about GPFS multiAaron Knister > ---08/21/2017 11:04:06 PM---Hi Everyone, I have a theoretical question > about GPFS multiclusters and security. 
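To make it concrete where that read-only grant actually lives, here is a minimal sketch of a multicluster export. All cluster, node, file system and path names are invented placeholders, and the real procedure has more steps (key exchange between the two administrators, setting the cipherList, daemon restarts), so treat this as an outline rather than the documented sequence:

# On the owning cluster (cluster A): generate a key, register cluster B, grant ro access
mmauth genkey new
mmauth add clusterB.example.com -k /tmp/clusterB_id_rsa.pub
mmauth grant clusterB.example.com -f gpfs01 -a ro     # the file-system-level check discussed above

# On the accessing cluster (cluster B): define the remote cluster and remote file system
mmremotecluster add clusterA.example.com -n nsd01,nsd02 -k /tmp/clusterA_id_rsa.pub
mmremotefs add rgpfs01 -f gpfs01 -C clusterA.example.com -T /gpfs/rgpfs01 -A no -o ro
mmmount rgpfs01
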
> > From: Aaron Knister > To: gpfsug main discussion list > Date: 08/21/2017 11:04 PM > Subject: [gpfsug-discuss] multicluster security > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hi Everyone, > > I have a theoretical question about GPFS multiclusters and security. > Let's say I have clusters A and B. Cluster A is exporting a filesystem > as read-only to cluster B. > > Where does the authorization burden lay? Meaning, does the security rely > on mmfsd in cluster B to behave itself and enforce the conditions of the > multi-cluster export? Could someone using the credentials on a > compromised node in cluster B just start sending arbitrary nsd > read/write commands to the nsds from cluster A (or something along those > lines)? Do the NSD servers in cluster A do any sort of sanity or > security checking on the I/O requests coming from cluster B to the NSDs > they're serving to exported filesystems? > > I imagine any enforcement would go out the window with shared disks in a > multi-cluster environment since a compromised node could just "dd" over > the LUNs. > > Thanks! > > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=oK_bEPbjuD7j6qLTHbe7HM4ujUlpcNYtX3tMW2QC7_w&s=BliMQ0pToLIIiO1jfyUp2Q3icewcONrcmHpsIj_hMtY&e= > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From oehmes at gmail.com Fri Sep 8 22:21:00 2017 From: oehmes at gmail.com (Sven Oehme) Date: Fri, 08 Sep 2017 21:21:00 +0000 Subject: [gpfsug-discuss] mmfsd write behavior In-Reply-To: References: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Message-ID: Hi, the code assumption is that the underlying device has no volatile write cache, i was absolute sure we have that somewhere in the FAQ, but i couldn't find it, so i will talk to somebody to correct this. if i understand https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt correct one could enforce this by setting REQ_FUA, but thats not explicitly set today, at least i can't see it. i will discuss this with one of our devs who owns this code and come back. sven On Thu, Sep 7, 2017 at 8:05 PM Aaron Knister wrote: > Thanks Sven. I didn't think GPFS itself was caching anything on that > layer, but it's my understanding that O_DIRECT isn't sufficient to force > I/O to be flushed (e.g. the device itself might have a volatile caching > layer). Take someone using ZFS zvol's as NSDs. I can write() all day log > to that zvol (even with O_DIRECT) but there is absolutely no guarantee > those writes have been committed to stable storage and aren't just > sitting in RAM until an fsync() occurs (or some other bio function that > causes a flush). I also don't believe writing to a SATA drive with > O_DIRECT will force cache flushes of the drive's writeback cache.. 
> although I just tested that one and it seems to actually trigger a scsi > cache sync. Interesting. > > -Aaron > > On 9/7/17 10:55 PM, Sven Oehme wrote: > > I am not sure what exactly you are looking for but all blockdevices are > > opened with O_DIRECT , we never cache anything on this layer . > > > > > > On Thu, Sep 7, 2017, 7:11 PM Aaron Knister > > wrote: > > > > Hi Everyone, > > > > This is something that's come up in the past and has recently > resurfaced > > with a project I've been working on, and that is-- it seems to me as > > though mmfsd never attempts to flush the cache of the block devices > its > > writing to (looking at blktrace output seems to confirm this). Is > this > > actually the case? I've looked at the gpl headers for linux and I > don't > > see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or > > REQ_FLUSH. I'm sure there's other ways to trigger this behavior that > > GPFS may very well be using that I've missed. That's why I'm asking > :) > > > > I figure with FPO being pushed as an HDFS replacement using commodity > > drives this feature has *got* to be in the code somewhere. > > > > -Aaron > > > > -- > > Aaron Knister > > NASA Center for Climate Simulation (Code 606.2) > > Goddard Space Flight Center > > (301) 286-2776 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Sat Sep 9 09:05:31 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Sat, 9 Sep 2017 10:05:31 +0200 Subject: [gpfsug-discuss] multicluster security In-Reply-To: <529f575b-eb11-a81e-2905-ab3237d41678@nasa.gov> References: <83936033-ce82-0a9b-3714-1dbea4c317db@nasa.gov> <529f575b-eb11-a81e-2905-ab3237d41678@nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 105 bytes Desc: not available URL: From aaron.s.knister at nasa.gov Mon Sep 11 01:43:56 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Sun, 10 Sep 2017 20:43:56 -0400 Subject: [gpfsug-discuss] tuning parameters question Message-ID: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> Hi All (but mostly Sven), I stumbled across this great gem: files.gpfsug.org/presentations/2014/UG10_GPFS_Performance_Session_v10.pdf and I'm wondering which, if any, of those tuning parameters are still relevant with the 4.2.3 code. Specifically for a CNFS cluster. I'm exporting a gpfs fs as an NFS root to 1k nodes. The boot storm is particularly ugly and the storage doesn't appear to be bottlenecked. 
I see a lot of waiters like these: Waiting 0.0009 sec since 20:41:31, monitored, thread 2881 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 26231 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 26146 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 18637 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 25013 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 27879 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 26553 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 25334 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 25337 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' and I'm wondering if there's anything immediate one would suggest to help with that. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From aaron.s.knister at nasa.gov Mon Sep 11 01:50:39 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Sun, 10 Sep 2017 20:50:39 -0400 Subject: [gpfsug-discuss] tuning parameters question In-Reply-To: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> References: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> Message-ID: <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> As an aside, my initial attempt was to use Ganesha via CES but the performance was significantly worse than CNFS for this workload. The docs seem to suggest that CNFS performs better for metadata intensive workloads which certainly seems to fit the bill here. -Aaron On 9/10/17 8:43 PM, Aaron Knister wrote: > Hi All (but mostly Sven), > > I stumbled across this great gem: > > files.gpfsug.org/presentations/2014/UG10_GPFS_Performance_Session_v10.pdf > > and I'm wondering which, if any, of those tuning parameters are still > relevant with the 4.2.3 code. Specifically for a CNFS cluster. I'm > exporting a gpfs fs as an NFS root to 1k nodes. The boot storm is > particularly ugly and the storage doesn't appear to be bottlenecked. 
> > I see a lot of waiters like these: > > Waiting 0.0009 sec since 20:41:31, monitored, thread 2881 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 26231 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 26146 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 18637 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 25013 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 27879 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 26553 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 25334 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 25337 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > > and I'm wondering if there's anything immediate one would suggest to > help with that. > > -Aaron > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From stefan.dietrich at desy.de Mon Sep 11 08:40:14 2017 From: stefan.dietrich at desy.de (Dietrich, Stefan) Date: Mon, 11 Sep 2017 09:40:14 +0200 (CEST) Subject: [gpfsug-discuss] Switch from IPoIB connected mode to datagram with ESS 5.2.0? Message-ID: <743361352.9211728.1505115614463.JavaMail.zimbra@desy.de> Hello, during reading the upgrade docs for ESS 5.2.0, I noticed a change in the IPoIB mode. Now it specifies, that datagram (CONNECTED_MODE=no) instead of connected mode should be used. All earlier versions used connected mode. I am wondering about the reason for this change? Or is this only relevant for bonded IPoIB interfaces? Regards, Stefan -- ------------------------------------------------------------------------ Stefan Dietrich Deutsches Elektronen-Synchrotron (IT-Systems) Ein Forschungszentrum der Helmholtz-Gemeinschaft Notkestr. 85 phone: +49-40-8998-4696 22607 Hamburg e-mail: stefan.dietrich at desy.de Germany ------------------------------------------------------------------------ From john.hearns at asml.com Mon Sep 11 08:41:54 2017 From: john.hearns at asml.com (John Hearns) Date: Mon, 11 Sep 2017 07:41:54 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: Thankyou all for advice. The ?-p? option was the fix here (thankyou to IBM support). From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of McLaughlin, Sandra M Sent: Friday, September 08, 2017 11:12 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? John, I had a period when I had to delete and remake AFM filesets rather frequently ? 
this always worked for me: mmunlinkfileset device fset -f mmdelsnapshot device snapname -j fset mmdelfileset device fset -f Sandra From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 08 September 2017 08:26 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. 
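Pulling the commands from this thread into one place, the cleanup sequence looks roughly like the following. Device, fileset and snapshot names are placeholders, and the -p flag is the undocumented option mentioned above, so it is best used only when support advises it:

mmunlinkfileset gpfs01 cachefset -f
mmdelsnapshot gpfs01 cachefset.afm.1234 -j cachefset -p   # -p is undocumented; support-guided only
mmdelfileset gpfs01 cachefset -f
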
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. ________________________________ AstraZeneca UK Limited is a company incorporated in England and Wales with registered number:03674842 and its registered office at 1 Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0AA. This e-mail and its attachments are intended for the above named recipient only and may contain confidential and privileged information. If they have come to you in error, you must not copy or show them to anyone; instead, please reply to this e-mail, highlighting the error to the sender and then immediately delete the message. For information about how AstraZeneca UK Limited and its affiliates may process information, personal data and monitor communications, please see our privacy notice at www.astrazeneca.com -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From olaf.weiser at de.ibm.com Mon Sep 11 09:11:15 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Mon, 11 Sep 2017 10:11:15 +0200 Subject: [gpfsug-discuss] tuning parameters question In-Reply-To: <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> References: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: From ed.swindelles at uconn.edu Mon Sep 11 16:49:15 2017 From: ed.swindelles at uconn.edu (Swindelles, Ed) Date: Mon, 11 Sep 2017 15:49:15 +0000 Subject: [gpfsug-discuss] UConn hiring GPFS administrator Message-ID: The University of Connecticut is hiring three full time, permanent technical positions for its HPC team on the Storrs campus. One of these positions is focused on storage administration, including a GPFS cluster. I would greatly appreciate it if you would forward this announcement to contacts of yours who may have an interest in these positions. Here are direct links to the job descriptions and applications: HPC Storage Administrator http://s.uconn.edu/3tx HPC Systems Administrator (2 positions to be filled) http://s.uconn.edu/3tw Thank you, -- Ed Swindelles Team Lead for Research Technology University of Connecticut 860-486-4522 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Mon Sep 11 23:15:10 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 11 Sep 2017 18:15:10 -0400 Subject: [gpfsug-discuss] tuning parameters question In-Reply-To: References: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> Message-ID: <9de64193-c60c-8ee1-b681-6cfe3993772b@nasa.gov> Thanks, Olaf. I ended up un-setting a bunch of settings that are now auto-tuned (worker1threads, worker3threads, etc.) and just set workerthreads as you suggest. That combined with increasing maxfilestocache to above the max concurrent open file threshold of the workload got me consistently with in 1%-3% of the performance of the same storage hardware running btrfs instead of GPFS. I think that's pretty darned good considering the additional complexity GPFS has over btrfs of being a clustered filesystem. Plus I now get NFS server failover for very little effort and without having to deal with corosync or pacemaker. -Aaron On 9/11/17 4:11 AM, Olaf Weiser wrote: > Hi Aaron , > > 0,0009 s response time for your meta data IO ... seems to be a very > good/fast storage BE.. which is hard to improve.. > you can raise the parallelism a bit for accessing metadata , but if this > will help to improve your "workload" is not assured > > The worker3threads parameter specifies the number of threads to use for > inode prefetch. Usually , I would suggest, that you should not touch > single parameters any longer. By the great improvements of the last few > releases.. GPFS can calculate / retrieve the right settings > semi-automatically... > You only need to set simpler "workerThreads" .. > > But in your case , you can see, if this more specific value will help > you out . > > depending on your blocksize and average filesize .. you may see > additional improvements when tuning nfsPrefetchStrategy , which tells > GPFS to consider all IOs wihtin */N/* blockboundaries as sequential ?and > starts prefetch > > l.b.n.t. set ignoreprefetchLunCount to yes .. (if not already done) . 
> this helps GPFS to use all available workerThreads > > cheers > olaf > > > > From: Aaron Knister > To: > Date: 09/11/2017 02:50 AM > Subject: Re: [gpfsug-discuss] tuning parameters question > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------------------------------------------------ > > > > As an aside, my initial attempt was to use Ganesha via CES but the > performance was significantly worse than CNFS for this workload. The > docs seem to suggest that CNFS performs better for metadata intensive > workloads which certainly seems to fit the bill here. > > -Aaron > > On 9/10/17 8:43 PM, Aaron Knister wrote: > > Hi All (but mostly Sven), > > > > I stumbled across this great gem: > > > > files.gpfsug.org/presentations/2014/UG10_GPFS_Performance_Session_v10.pdf > > > > and I'm wondering which, if any, of those tuning parameters are still > > relevant with the 4.2.3 code. Specifically for a CNFS cluster. I'm > > exporting a gpfs fs as an NFS root to 1k nodes. The boot storm is > > particularly ugly and the storage doesn't appear to be bottlenecked. > > > > I see a lot of waiters like these: > > > > Waiting 0.0009 sec since 20:41:31, monitored, thread 2881 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 26231 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 26146 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 18637 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 25013 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 27879 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 26553 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 25334 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 25337 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > > > and I'm wondering if there's anything immediate one would suggest to > > help with that. 
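For anyone wanting to try the knobs mentioned in this exchange, a hedged sketch of the corresponding mmchconfig calls follows. The node class name and all values are illustrative only and need sizing against the real workload, several of the settings only take effect after the daemon is restarted on the affected nodes, and nfsPrefetchStrategy/ignorePrefetchLUNCount should be checked against the documentation for the installed release:

# Let GPFS derive the old worker1/worker3-style sub-values from one umbrella setting
mmchconfig workerThreads=512 -N cnfsNodes

# Keep the inode/file cache above the peak number of concurrently open files
mmchconfig maxFilesToCache=1000000 -N cnfsNodes

# Treat IOs within N block boundaries as sequential, and do not limit
# prefetch by the number of visible LUNs
mmchconfig nfsPrefetchStrategy=1 -N cnfsNodes
mmchconfig ignorePrefetchLUNCount=yes -N cnfsNodes

# Confirm what the daemon is actually running with
mmdiag --config | egrep -i 'workerthreads|maxfilestocache|prefetch'
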
> > > > -Aaron > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From zacekm at img.cas.cz Tue Sep 12 10:40:35 2017 From: zacekm at img.cas.cz (Michal Zacek) Date: Tue, 12 Sep 2017 11:40:35 +0200 Subject: [gpfsug-discuss] Wrong nodename after server restart Message-ID: <7e565699-b583-eeeb-c6b9-d11a39206331@img.cas.cz> Hi, I had to restart two of my gpfs servers (gpfs-n4 and gpfs-quorum) and after that I was unable to move CES IP address back with strange error "mmces address move: GPFS is down on this node". After I double checked that gpfs state is active on all nodes, I dug deeper and I think I found problem, but I don't really know how this could happen. Look at the names of nodes: [root at gpfs-n2 ~]# mmlscluster # Looks good GPFS cluster information ======================== GPFS cluster name: gpfscl1.img.local GPFS cluster id: 17792677515884116443 GPFS UID domain: img.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------------- 1 gpfs-n4.img.local 192.168.20.64 gpfs-n4.img.local quorum-manager 2 gpfs-quorum.img.local 192.168.20.60 gpfs-quorum.img.local quorum 3 gpfs-n3.img.local 192.168.20.63 gpfs-n3.img.local quorum-manager 4 tau.img.local 192.168.1.248 tau.img.local 5 gpfs-n1.img.local 192.168.20.61 gpfs-n1.img.local quorum-manager 6 gpfs-n2.img.local 192.168.20.62 gpfs-n2.img.local quorum-manager 8 whale.img.cas.cz 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# mmlsmount gpfs01 -L # not so good File system gpfs01 is mounted on 7 nodes: 192.168.20.63 gpfs-n3 192.168.20.61 gpfs-n1 192.168.20.62 gpfs-n2 192.168.1.248 tau 192.168.20.64 gpfs-n4.img.local 192.168.20.60 gpfs-quorum.img.local 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# tsctl shownodes up | tr ',' '\n' # very wrong whale.img.cas.cz.img.local tau.img.local gpfs-quorum.img.local.img.local gpfs-n1.img.local gpfs-n2.img.local gpfs-n3.img.local gpfs-n4.img.local.img.local The "tsctl shownodes up" is the reason why I'm not able to move CES address back to gpfs-n4 node, but the real problem are different nodenames. I think OS is configured correctly: [root at gpfs-n4 /]# hostname gpfs-n4 [root at gpfs-n4 /]# hostname -f gpfs-n4.img.local [root at gpfs-n4 /]# cat /etc/resolv.conf nameserver 192.168.20.30 nameserver 147.231.150.2 search img.local domain img.local [root at gpfs-n4 /]# cat /etc/hosts | grep gpfs-n4 192.168.20.64 gpfs-n4.img.local gpfs-n4 [root at gpfs-n4 /]# host gpfs-n4 gpfs-n4.img.local has address 192.168.20.64 [root at gpfs-n4 /]# host 192.168.20.64 64.20.168.192.in-addr.arpa domain name pointer gpfs-n4.img.local. Can someone help me with this. Thanks, Michal p.s. 
gpfs version: 4.2.3-2 (CentOS 7) From secretary at gpfsug.org Tue Sep 12 15:22:41 2017 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Tue, 12 Sep 2017 15:22:41 +0100 Subject: [gpfsug-discuss] SS UG UK 2018 Message-ID: Dear all, A date for your diary, #SSUG18 in the UK will be taking place on April 18th & 19th 2018. Please mark it in your diaries now! We'll confirm other details (venue, agenda etc.) nearer the time, but the date is confirmed. Thanks, -- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Tue Sep 12 16:01:21 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 12 Sep 2017 11:01:21 -0400 Subject: [gpfsug-discuss] Wrong nodename after server restart In-Reply-To: <7e565699-b583-eeeb-c6b9-d11a39206331@img.cas.cz> References: <7e565699-b583-eeeb-c6b9-d11a39206331@img.cas.cz> Message-ID: Michal, When a node is added to a cluster that has a different domain than the rest of the nodes in the cluster, the GPFS daemons running on the various nodes can develop an inconsistent understanding of what the common suffix of all the domain names are. The symptoms you show with the "tsctl shownodes up" output, and in particular the incorrect node names of the two nodes you restarted, as seen on a node you did not restart, are consistent with this problem. I also note your cluster appears to have the necessary pre-condition to trip on this problem, whale.img.cas.cz does not share a common suffix with the other nodes in the cluster. The common suffix of the other nodes in the cluster is ".img.local". Was whale.img.cas.cz recently added to the cluster? Unfortunately, the general work-around is to recycle all the nodes at once: mmshutdown -a, followed by mmstartup -a. I hope this helps. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Michal Zacek To: gpfsug-discuss at spectrumscale.org Date: 09/12/2017 05:41 AM Subject: [gpfsug-discuss] Wrong nodename after server restart Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I had to restart two of my gpfs servers (gpfs-n4 and gpfs-quorum) and after that I was unable to move CES IP address back with strange error "mmces address move: GPFS is down on this node". After I double checked that gpfs state is active on all nodes, I dug deeper and I think I found problem, but I don't really know how this could happen. 
Look at the names of nodes: [root at gpfs-n2 ~]# mmlscluster # Looks good GPFS cluster information ======================== GPFS cluster name: gpfscl1.img.local GPFS cluster id: 17792677515884116443 GPFS UID domain: img.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------------- 1 gpfs-n4.img.local 192.168.20.64 gpfs-n4.img.local quorum-manager 2 gpfs-quorum.img.local 192.168.20.60 gpfs-quorum.img.local quorum 3 gpfs-n3.img.local 192.168.20.63 gpfs-n3.img.local quorum-manager 4 tau.img.local 192.168.1.248 tau.img.local 5 gpfs-n1.img.local 192.168.20.61 gpfs-n1.img.local quorum-manager 6 gpfs-n2.img.local 192.168.20.62 gpfs-n2.img.local quorum-manager 8 whale.img.cas.cz 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# mmlsmount gpfs01 -L # not so good File system gpfs01 is mounted on 7 nodes: 192.168.20.63 gpfs-n3 192.168.20.61 gpfs-n1 192.168.20.62 gpfs-n2 192.168.1.248 tau 192.168.20.64 gpfs-n4.img.local 192.168.20.60 gpfs-quorum.img.local 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# tsctl shownodes up | tr ',' '\n' # very wrong whale.img.cas.cz.img.local tau.img.local gpfs-quorum.img.local.img.local gpfs-n1.img.local gpfs-n2.img.local gpfs-n3.img.local gpfs-n4.img.local.img.local The "tsctl shownodes up" is the reason why I'm not able to move CES address back to gpfs-n4 node, but the real problem are different nodenames. I think OS is configured correctly: [root at gpfs-n4 /]# hostname gpfs-n4 [root at gpfs-n4 /]# hostname -f gpfs-n4.img.local [root at gpfs-n4 /]# cat /etc/resolv.conf nameserver 192.168.20.30 nameserver 147.231.150.2 search img.local domain img.local [root at gpfs-n4 /]# cat /etc/hosts | grep gpfs-n4 192.168.20.64 gpfs-n4.img.local gpfs-n4 [root at gpfs-n4 /]# host gpfs-n4 gpfs-n4.img.local has address 192.168.20.64 [root at gpfs-n4 /]# host 192.168.20.64 64.20.168.192.in-addr.arpa domain name pointer gpfs-n4.img.local. Can someone help me with this. Thanks, Michal p.s. gpfs version: 4.2.3-2 (CentOS 7) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=l_sz-tPolX87WmSf2zBhhPpggnfQJKp7-BqV8euBp7A&s=XSPGkKRMza8PhYQg8AxeKW9cOTNeCI9uph486_6Xajo&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Tue Sep 12 16:36:06 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Tue, 12 Sep 2017 15:36:06 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: Message-ID: Well George is not the only one to have replied to the list with a one to one message. ? Remember folks, this mailing list has a *lot* of people on it. Hope my message is last that forgets who is in the 'To' field. Daniel Daniel Kidger Technical Sales Specialist, IBM UK IBM Spectrum Storage Software daniel.kidger at uk.ibm.com +44 (0)7818 522266 > On 8 Sep 2017, at 19:30, Ken Atkinson wrote: > > Not on too many G&Ts Georgina? > How are things. > Ken Atkinson > > On 8 Sep 2017 08:33, "Georgina Ellis" wrote: > Apologies All, slip of the keyboard and not a comment on GPFS! 
> > Sent from my iPhone > > > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > > > Send gpfsug-discuss mailing list submissions to > > gpfsug-discuss at spectrumscale.org > > > > To subscribe or unsubscribe via the World Wide Web, visit > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > or, via email, send a message with subject or body 'help' to > > gpfsug-discuss-request at spectrumscale.org > > > > You can reach the person managing the list at > > gpfsug-discuss-owner at spectrumscale.org > > > > When replying, please edit your Subject line so it is more specific > > than "Re: Contents of gpfsug-discuss digest..." > > > > > > Today's Topics: > > > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > > > > ---------------------------------------------------------------------- > > > > Message: 1 > > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > > From: "Malka, Janusz" > > To: gpfsug main discussion list > > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > > Content-Type: text/plain; charset="utf-8" > > > > I had similar issue, I had to recover connection to home > > > > > > From: "John Hearns" > > To: "gpfsug main discussion list" > > Sent: Thursday, 7 September, 2017 17:52:19 > > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > > > Mmdelfileset responds that : > > > > Fileset obfuscated has 1 fileset snapshot(s). > > > > > > > > When I try to delete the snapshot: > > > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. > > > > > > > > I find this reference, which is about as useful as a wet haddock: > > > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > > > > > The advice of the gallery is sought, please. > > > > > > > > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- > > An HTML attachment was scrubbed... 
> > URL: > > > > ------------------------------ > > > > Message: 2 > > Date: Thu, 7 Sep 2017 21:16:34 +0000 > > From: "Christof Schmitt" > > To: gpfsug-discuss at spectrumscale.org > > Cc: gpfsug-discuss at spectrumscale.org > > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > > Message-ID: > > > > > > Content-Type: text/plain; charset="us-ascii" > > > > An HTML attachment was scrubbed... > > URL: > > > > ------------------------------ > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > > ********************************************** > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.mills at nasa.gov Tue Sep 12 17:06:23 2017 From: jonathan.mills at nasa.gov (Jonathan Mills) Date: Tue, 12 Sep 2017 12:06:23 -0400 (EDT) Subject: [gpfsug-discuss] Support for SLES 12 SP3 Message-ID: SLES 12 SP3 has been released. And for what it?s worth, there does not appear to be substantial changes in either kernel or glibc as compared to SLES 12 SP2. In fact, the latest SLES 12 SP2 kernel is ?4.4.74-92.29?, while the initial SLES 12 SP3 kernel is ?4.4.73-5.1?. Given this, I wanted to ask the team at IBM: 1) have you begun looking into SLES 12 SP3 yet? 2) if so, do you have any idea when you might release a fully supported version of Spectrum Scale for SLES 12 SP3? Those of us who run SLES and are looking to deploy new infrastructure this fall would prefer to do so on the latest rev of our OS, as opposed to one that is already on life support... -- Jonathan Mills / jonathan.mills at nasa.gov NASA GSFC / NCCS HPC (606.2) Bldg 28, Rm. S230 / c. 252-412-5710 From Greg.Lehmann at csiro.au Wed Sep 13 00:12:55 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Tue, 12 Sep 2017 23:12:55 +0000 Subject: [gpfsug-discuss] Support for SLES 12 SP3 In-Reply-To: References: Message-ID: <67f390a558244c41b154a7a6a9e5efe8@exch1-cdc.nexus.csiro.au> +1. We are interested in SLES 12 SP3 too. BTW had anybody done any comparisons of SLES 12 SP2 (4.4) kernel vs RHEL 7.3 in terms of GPFS IO performance? I would think the 4.4 kernel might give it an edge. I'll probably get around to comparing them myself one day, but if anyone else has some numbers... -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Mills Sent: Wednesday, 13 September 2017 2:06 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Support for SLES 12 SP3 SLES 12 SP3 has been released. And for what it?s worth, there does not appear to be substantial changes in either kernel or glibc as compared to SLES 12 SP2. In fact, the latest SLES 12 SP2 kernel is ?4.4.74-92.29?, while the initial SLES 12 SP3 kernel is ?4.4.73-5.1?. Given this, I wanted to ask the team at IBM: 1) have you begun looking into SLES 12 SP3 yet? 2) if so, do you have any idea when you might release a fully supported version of Spectrum Scale for SLES 12 SP3? 
Those of us who run SLES and are looking to deploy new infrastructure this fall would prefer to do so on the latest rev of our OS, as opposed to one that is already on life support... -- Jonathan Mills / jonathan.mills at nasa.gov NASA GSFC / NCCS HPC (606.2) Bldg 28, Rm. S230 / c. 252-412-5710 From scale at us.ibm.com Wed Sep 13 22:33:30 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 13 Sep 2017 17:33:30 -0400 Subject: [gpfsug-discuss] Fw: Wrong nodename after server restart Message-ID: ----- Forwarded by Eric Agar/Poughkeepsie/IBM on 09/13/2017 05:32 PM ----- From: IBM Spectrum Scale/Poughkeepsie/IBM To: Michal Zacek Date: 09/13/2017 05:29 PM Subject: Re: [gpfsug-discuss] Wrong nodename after server restart Sent by: Eric Agar Hello Michal, It should not be necessary to delete whale.img.cas.cz and rename it. But, that is an option you can take, if you prefer it. If you decide to take that option, please see the last paragraph of this response. The confusion starts at the moment a node is added to the active cluster where the new node does not have the same common domain suffix as the nodes that were already in the cluster. The confusion increases when the GPFS daemons on some nodes, but not all nodes, are recycled. Doing mmshutdown -a, followed by mmstartup -a, once after the new node has been added allows all GPFS daemons on all nodes to come up at the same time and arrive at the same answer to the question, "what is the common domain suffix for all the nodes in the cluster now?" In the case of your cluster, the answer will be "the common domain suffix is the empty string" or, put another way, "there is no common domain suffix"; that is okay, as long as all the GPFS daemons come to the same conclusion. After you recycle the cluster, you can check to make sure all seems well by running "tsctl shownodes up" on every node, and make sure the answer is correct on each node. If the mmshutdown -a / mmstartup -a recycle works, the problem should not recur with the current set of nodes in the cluster. Even as individual GPFS daemons are recycled going forward, they should still understand the cluster's nodes have no common domain suffix. However, I can imagine sequences of events that would cause the issue to occur again after nodes are deleted or added to the cluster while the cluster is active. For example, if whale.img.cas.cz were to be deleted from the current cluster, that action would restore the cluster to having a common domain suffix of ".img.local", but already running GPFS daemons would not realize it. If the delete of whale occurred while the cluster was active, subsequent recycling of the GPFS daemon on just a subset of the nodes would cause the recycled daemons to understand the common domain suffix to now be ".img.local". But, daemons that had not been recycled would still think there is no common domain suffix. The confusion would occur again. On the other hand, adding and deleting nodes to/from the cluster should not cause the issue to occur again as long as the cluster continues to have the same (in this case, no) common domain suffix. If you decide to delete whale.img.case.cz, rename it to have the ".img.local" domain suffix, and add it back to the cluster, it would be best to do so after all the GPFS daemons are shut down with mmshutdown -a, but before any of the daemons are restarted with mmstartup. This would allow all the subsequent running daemons to come to the conclusion that ".img.local" is now the common domain suffix. I hope this helps. 
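To make the work-around above concrete, here is a minimal shell sketch of the recycle-and-verify sequence it describes. The optional delete/re-add step and the renamed host name (whale.img.local) are assumptions to adapt, not a tested recipe.

# 1. Stop the GPFS daemon on every node, so that all daemons later re-derive
#    the common domain suffix together.
mmshutdown -a

# 2. Optional, as discussed above: while everything is down, remove the node
#    that breaks the common suffix and add it back under the common domain.
#    (whale.img.local is a hypothetical renamed host.)
# mmdelnode -N whale.img.cas.cz
# mmaddnode -N whale.img.local

# 3. Start the daemons everywhere at once.
mmstartup -a

# 4. Check that every node now agrees on the daemon node names.
mmdsh -N all "/usr/lpp/mmfs/bin/tsctl shownodes up" | tr ',' '\n'
mmlscluster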
Regards, Eric Agar Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Michal Zacek To: IBM Spectrum Scale Date: 09/13/2017 03:42 AM Subject: Re: [gpfsug-discuss] Wrong nodename after server restart Hello yes you are correct, Whale was added two days a go. It's necessary to delete whale.img.cas.cz from cluster before mmshutdown/mmstartup? If the two domains may cause problems in the future I can rename whale (and all planed nodes) to img.local suffix. Many thanks for the prompt reply. Regards Michal Dne 12.9.2017 v 17:01 IBM Spectrum Scale napsal(a): Michal, When a node is added to a cluster that has a different domain than the rest of the nodes in the cluster, the GPFS daemons running on the various nodes can develop an inconsistent understanding of what the common suffix of all the domain names are. The symptoms you show with the "tsctl shownodes up" output, and in particular the incorrect node names of the two nodes you restarted, as seen on a node you did not restart, are consistent with this problem. I also note your cluster appears to have the necessary pre-condition to trip on this problem, whale.img.cas.cz does not share a common suffix with the other nodes in the cluster. The common suffix of the other nodes in the cluster is ".img.local". Was whale.img.cas.cz recently added to the cluster? Unfortunately, the general work-around is to recycle all the nodes at once: mmshutdown -a, followed by mmstartup -a. I hope this helps. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Michal Zacek To: gpfsug-discuss at spectrumscale.org Date: 09/12/2017 05:41 AM Subject: [gpfsug-discuss] Wrong nodename after server restart Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I had to restart two of my gpfs servers (gpfs-n4 and gpfs-quorum) and after that I was unable to move CES IP address back with strange error "mmces address move: GPFS is down on this node". After I double checked that gpfs state is active on all nodes, I dug deeper and I think I found problem, but I don't really know how this could happen. 
Look at the names of nodes: [root at gpfs-n2 ~]# mmlscluster # Looks good GPFS cluster information ======================== GPFS cluster name: gpfscl1.img.local GPFS cluster id: 17792677515884116443 GPFS UID domain: img.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------------- 1 gpfs-n4.img.local 192.168.20.64 gpfs-n4.img.local quorum-manager 2 gpfs-quorum.img.local 192.168.20.60 gpfs-quorum.img.local quorum 3 gpfs-n3.img.local 192.168.20.63 gpfs-n3.img.local quorum-manager 4 tau.img.local 192.168.1.248 tau.img.local 5 gpfs-n1.img.local 192.168.20.61 gpfs-n1.img.local quorum-manager 6 gpfs-n2.img.local 192.168.20.62 gpfs-n2.img.local quorum-manager 8 whale.img.cas.cz 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# mmlsmount gpfs01 -L # not so good File system gpfs01 is mounted on 7 nodes: 192.168.20.63 gpfs-n3 192.168.20.61 gpfs-n1 192.168.20.62 gpfs-n2 192.168.1.248 tau 192.168.20.64 gpfs-n4.img.local 192.168.20.60 gpfs-quorum.img.local 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# tsctl shownodes up | tr ',' '\n' # very wrong whale.img.cas.cz.img.local tau.img.local gpfs-quorum.img.local.img.local gpfs-n1.img.local gpfs-n2.img.local gpfs-n3.img.local gpfs-n4.img.local.img.local The "tsctl shownodes up" is the reason why I'm not able to move CES address back to gpfs-n4 node, but the real problem are different nodenames. I think OS is configured correctly: [root at gpfs-n4 /]# hostname gpfs-n4 [root at gpfs-n4 /]# hostname -f gpfs-n4.img.local [root at gpfs-n4 /]# cat /etc/resolv.conf nameserver 192.168.20.30 nameserver 147.231.150.2 search img.local domain img.local [root at gpfs-n4 /]# cat /etc/hosts | grep gpfs-n4 192.168.20.64 gpfs-n4.img.local gpfs-n4 [root at gpfs-n4 /]# host gpfs-n4 gpfs-n4.img.local has address 192.168.20.64 [root at gpfs-n4 /]# host 192.168.20.64 64.20.168.192.in-addr.arpa domain name pointer gpfs-n4.img.local. Can someone help me with this. Thanks, Michal p.s. gpfs version: 4.2.3-2 (CentOS 7) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=l_sz-tPolX87WmSf2zBhhPpggnfQJKp7-BqV8euBp7A&s=XSPGkKRMza8PhYQg8AxeKW9cOTNeCI9uph486_6Xajo&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Michal ???ek | Information Technologies +420 296 443 128 +420 296 443 333 michal.zacek at img.cas.cz www.img.cas.cz Institute of Molecular Genetics of the ASCR, v. v. i., V?de?sk? 1083, 142 20 Prague 4, Czech Republic ID: 68378050 | VAT ID: CZ68378050 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1997 bytes Desc: not available URL: From valdis.kletnieks at vt.edu Thu Sep 14 01:18:51 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Wed, 13 Sep 2017 20:18:51 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. Message-ID: <52657.1505348331@turing-police.cc.vt.edu> So we have a number of very similar policy files that get applied for file migration etc. 
And they vary drastically in the runtime to process, apparently due to different selections on whether to do the work in parallel. Running a set of rules with 'mmapplypolicy -I defer' that look like this: RULE 'VBI_FILES_RULE' MIGRATE FROM POOL 'system' THRESHOLD(0,100,0) WEIGHT(FILE_SIZE) TO POOL 'VBI_FILES' FOR FILESET('vbi') WHERE (mb_allocated >= 8) for 10 filesets can scan 325M directory entries in 6 minutes, and sort and evaluate the policy in 3 more minutes. However, this takes a bit over 30 minutes for the scan and another 20 for sorting and policy evaluation over the same set of filesets: RULE 'VBI_FILES_RULE' LIST 'pruned_files' THRESHOLD(90,80) WEIGHT(FILE_SIZE) FOR FILESET('vbi') WHERE (mb_allocated >= 8) even though the output is essentially identical. Why is LIST so much more expensive than 'MIGRATE" with '-I defer'? I could understand if I had an expensive SHOW clause, but there isn't one here (and a different policy that I run that *does* have a big SHOW clause takes almost the same amount of time as the minimal LIST).... I'm thinking that it has *something* to do with the MIGRATE job outputting: [I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned. (...) [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned. while the LIST job says: [I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records. (...) [I] 2017-09-12 at 14:02:04.223 Policy evaluation. 0 files scanned. (Both output the same message during the 'Directory entries scanned: 0.' phase, but I suspect MIGRATE is multi-threading that part as well, as it completes much faster). What's the controlling factor in mmapplypolicy's decision whether or not to parallelize the policy? From oehmes at gmail.com Thu Sep 14 01:28:46 2017 From: oehmes at gmail.com (Sven Oehme) Date: Thu, 14 Sep 2017 00:28:46 +0000 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. In-Reply-To: <52657.1505348331@turing-police.cc.vt.edu> References: <52657.1505348331@turing-police.cc.vt.edu> Message-ID: can you please share the entire command line you are using ? also gpfs version, mmlsconfig output would help as well as if this is a shared storage filesystem or a system using local disks. thx. Sven On Wed, Sep 13, 2017 at 5:19 PM wrote: > So we have a number of very similar policy files that get applied for file > migration etc. And they vary drastically in the runtime to process, > apparently > due to different selections on whether to do the work in parallel. > > Running a set of rules with 'mmapplypolicy -I defer' that look like this: > > RULE 'VBI_FILES_RULE' MIGRATE FROM POOL 'system' > THRESHOLD(0,100,0) > WEIGHT(FILE_SIZE) > TO POOL 'VBI_FILES' > FOR FILESET('vbi') > WHERE (mb_allocated >= 8) > > for 10 filesets can scan 325M directory entries in 6 minutes, and sort and > evaluate the policy in 3 more minutes. > > However, this takes a bit over 30 minutes for the scan and another 20 for > sorting and policy evaluation over the same set of filesets: > > RULE 'VBI_FILES_RULE' LIST 'pruned_files' > THRESHOLD(90,80) > WEIGHT(FILE_SIZE) > FOR FILESET('vbi') > WHERE (mb_allocated >= 8) > > even though the output is essentially identical. Why is LIST so much more > expensive than 'MIGRATE" with '-I defer'? I could understand if I > had an > expensive SHOW clause, but there isn't one here (and a different policy > that I > run that *does* have a big SHOW clause takes almost the same amount of > time as > the minimal LIST).... 
> > I'm thinking that it has *something* to do with the MIGRATE job outputting: > > [I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 > files scanned. > (...) > [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 > records scanned. > > while the LIST job says: > > [I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records. > (...) > [I] 2017-09-12 at 14:02:04.223 Policy evaluation. 0 files scanned. > > (Both output the same message during the 'Directory entries scanned: 0.' > phase, but I suspect MIGRATE is multi-threading that part as well, as it > completes much faster). > > What's the controlling factor in mmapplypolicy's decision whether or > not to parallelize the policy? > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kh.atmane at gmail.com Thu Sep 14 13:49:55 2017 From: kh.atmane at gmail.com (atmane) Date: Thu, 14 Sep 2017 13:49:55 +0100 Subject: [gpfsug-discuss] Disk change problem in gss GNR Message-ID: dear all, I change A Disk In Gss Storage Server mmchcarrier BB1RGL --release --pdisk 'e1d1s02' mmchcarrier BB1RGL --replace --pdisk 'e1d1s02' after replace disk Now I Have 2 Discs In My Gss the first disc was well changed name = "e1d1s02" the second disk still after I use this cmd mmdelpdisk BB1RGL --pdisk e1d1s02#004 -a the disk is still in use i need to reboot the system or ?? mmlspdisk all | less pdisk: replacementPriority = 1000 name = "e1d1s02" device = "/dev/sdik,/dev/sdih" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "ok" capacity = 3000034656256 freeSpace = 1453846429696 fru = "00W1572" location = "SV30820390-1-2" WWN = "naa.5000C5008D783E37" server = "gss0-ib0" pdisk: replacementPriority = 1000 name = "e1d1s02#004" device = "" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "missing/noPath/systemDrain/adminDrain/noRGD/noVCD" capacity = 3000034656256 freeSpace = 1599875317760 fru = "00W1572" location = "" WWN = "naa.5000C50056714E83" server = "gss0-ib0" -- -- Atmane Khiredine HPC System Admin | Office National de la M?t?orologie T?l : +213 21 50 73 93 Poste 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz From makaplan at us.ibm.com Thu Sep 14 19:55:39 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 14 Sep 2017 14:55:39 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. In-Reply-To: <52657.1505348331@turing-police.cc.vt.edu> References: <52657.1505348331@turing-police.cc.vt.edu> Message-ID: Read the doc again. Specify both -g and -N options on the command line to get fully parallel directory and inode/policy scanning. I'm curious as to what you're trying to do with THRESHOLD(0,100,0) ... Perhaps premigrate everything (that matches the other conditions)? You are correct about I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned. (...) [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned. If you don't see messages like that, you did not specify both -N and -g. From: valdis.kletnieks at vt.edu To: gpfsug-discuss at spectrumscale.org Date: 09/13/2017 08:19 PM Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. Sent by: gpfsug-discuss-bounces at spectrumscale.org So we have a number of very similar policy files that get applied for file migration etc. 
And they vary drastically in the runtime to process, apparently due to different selections on whether to do the work in parallel. Running a set of rules with 'mmapplypolicy -I defer' that look like this: RULE 'VBI_FILES_RULE' MIGRATE FROM POOL 'system' THRESHOLD(0,100,0) WEIGHT(FILE_SIZE) TO POOL 'VBI_FILES' FOR FILESET('vbi') WHERE (mb_allocated >= 8) for 10 filesets can scan 325M directory entries in 6 minutes, and sort and evaluate the policy in 3 more minutes. However, this takes a bit over 30 minutes for the scan and another 20 for sorting and policy evaluation over the same set of filesets: RULE 'VBI_FILES_RULE' LIST 'pruned_files' THRESHOLD(90,80) WEIGHT(FILE_SIZE) FOR FILESET('vbi') WHERE (mb_allocated >= 8) even though the output is essentially identical. Why is LIST so much more expensive than 'MIGRATE" with '-I defer'? I could understand if I had an expensive SHOW clause, but there isn't one here (and a different policy that I run that *does* have a big SHOW clause takes almost the same amount of time as the minimal LIST).... I'm thinking that it has *something* to do with the MIGRATE job outputting: [I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned. (...) [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned. while the LIST job says: [I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records. (...) [I] 2017-09-12 at 14:02:04.223 Policy evaluation. 0 files scanned. (Both output the same message during the 'Directory entries scanned: 0.' phase, but I suspect MIGRATE is multi-threading that part as well, as it completes much faster). What's the controlling factor in mmapplypolicy's decision whether or not to parallelize the policy? _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=SGbwD3m5mZ16_vwIFK8Ym48lwdF1tVktnSao0a_tkfA&s=sLt9AtZiZ0qZCKzuQoQuyxN76_R66jfAwQxdIY-w2m0&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Thu Sep 14 21:09:40 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Thu, 14 Sep 2017 16:09:40 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. In-Reply-To: References: <52657.1505348331@turing-police.cc.vt.edu> Message-ID: <26551.1505419780@turing-police.cc.vt.edu> On Thu, 14 Sep 2017 14:55:39 -0400, "Marc A Kaplan" said: > Read the doc again. Specify both -g and -N options on the command line to > get fully parallel directory and inode/policy scanning. Yeah, figured that out, with help from somebody. :) > I'm curious as to what you're trying to do with THRESHOLD(0,100,0) ... > Perhaps premigrate everything (that matches the other conditions)? Yeah, it's actually feeding to LTFS/EE - where we premigrate everything that matches to tape. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From makaplan at us.ibm.com Thu Sep 14 22:13:59 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 14 Sep 2017 17:13:59 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. 
In-Reply-To: <26551.1505419780@turing-police.cc.vt.edu> References: <52657.1505348331@turing-police.cc.vt.edu> <26551.1505419780@turing-police.cc.vt.edu> Message-ID: BTW - we realize that mmapplypolicy -g and -N is a "gotcha" for some (many?) customer/admins -- so we're considering ways to make that easier -- but without "breaking" scripts and callbacks and what-have-yous that might depend on the current/old defaults... Always a balancing act -- considering that GPFS ne Spectrum Scale just hit its 20th birthday (by IBM reckoning) --marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil.wilson at metoffice.gov.uk Fri Sep 15 11:47:19 2017 From: neil.wilson at metoffice.gov.uk (Wilson, Neil) Date: Fri, 15 Sep 2017 10:47:19 +0000 Subject: [gpfsug-discuss] ZIMON Sensors config files... Message-ID: Hi, Does anyone know how to use "mmperfmon config update" to get the "hostname =" field in the ZImonSensors.cfg file populated with the hostname of the node that it's been installed on? By default the field is empty and for some reason on our cluster it doesn't transmit any metrics unless we put the node hostname into that field. Is there some kind of wildcard that I can set? Thanks Neil Neil Wilson Senior IT Practitioner Storage Team IT Services Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 885959 Email: neil.wilson at metoffice.gov.uk Website www.metoffice.gov.uk Our magazine Barometer is now available online at http://www.metoffice.gov.uk/barometer/ P Please consider the environment before printing this e-mail. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Sep 15 16:37:13 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 15 Sep 2017 15:37:13 +0000 Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? Message-ID: This is very probably off topic here.. I would be happy to get any responses off list. My question is has anyone here set up NFS re-export / proxy with nfs-ganesha? John Hearns -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Mon Sep 18 01:14:52 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Mon, 18 Sep 2017 00:14:52 +0000 Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? In-Reply-To: References: Message-ID: <5d1811f4d6ad4605bd2a7c7441f4dd1b@exch1-cdc.nexus.csiro.au> I am interested too, so maybe keep it on list? 
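In case a concrete starting point helps: nfs-ganesha ships a PROXY FSAL that fronts another NFS server, which is the usual building block for this kind of re-export. The block below is only a rough, untested sketch of what such an export in ganesha.conf might look like; the export id, paths and backend address are placeholders, and the option names should be checked against the ganesha version actually deployed.

EXPORT {
    Export_Id = 77;              # arbitrary but unique id (placeholder)
    Path = /backend/export;      # path as exported by the backend NFS server (placeholder)
    Pseudo = /reexport;          # path presented to clients of this ganesha instance (placeholder)
    Access_Type = RW;
    Protocols = 3, 4;            # protocols offered to the ganesha clients
    Squash = No_Root_Squash;
    FSAL {
        Name = PROXY;            # re-export through ganesha's proxy FSAL
        Srv_Addr = 192.0.2.10;   # placeholder address of the backend NFS server
    }
}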
From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: Saturday, 16 September 2017 1:37 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? This is very probably off topic here.. I would be happy to get any responses off list. My question is has anyone here set up NFS re-export / proxy with nfs-ganesha? John Hearns -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.lefebvre+gpfsug at calculquebec.ca Mon Sep 18 20:16:57 2017 From: richard.lefebvre+gpfsug at calculquebec.ca (Richard Lefebvre) Date: Mon, 18 Sep 2017 15:16:57 -0400 Subject: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Message-ID: Hi I have a 3.5 GPFS system with 700+ nodes. I sometime have nodes that generate a lot of iops on the large file system but I cannot find the right tool to find which node is the source. I'm guessing under 4.2.X, there are now easy tools, but what can be done under GPFS 3.5. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Sep 18 20:27:49 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 18 Sep 2017 19:27:49 +0000 Subject: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Message-ID: <39FB5D56-A8C4-47DA-8A56-A2E453724875@nuance.com> You do realize 3.5 is out of service, correct? You should be looking at upgrading :-) Catching this is real time, when you have a large number of nodes is going to be tough. How you recognizing that the file system is overloaded? Waiters? Looking at which nodes/NSDs have the longest/largest waiters may provide a clue. You might also take a look at mmpmon ? it?s a bit difficult to use in its raw state, but it does provide some good stats on a per file system basis. But you need to track these over times to get what you need. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Richard Lefebvre Reply-To: gpfsug main discussion list Date: Monday, September 18, 2017 at 2:18 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Hi I have a 3.5 GPFS system with 700+ nodes. I sometime have nodes that generate a lot of iops on the large file system but I cannot find the right tool to find which node is the source. I'm guessing under 4.2.X, there are now easy tools, but what can be done under GPFS 3.5. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From scale at us.ibm.com Tue Sep 19 07:47:42 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 19 Sep 2017 14:47:42 +0800 Subject: [gpfsug-discuss] ZIMON Sensors config files... In-Reply-To: References: Message-ID: Hi Neil, Have you tried these steps? mmperfmon config show --config-file /tmp/a vi /tmp/a mmperfmon config update --collectors oc8757286465 --config-file /tmp/a mmperfmon config show Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Wilson, Neil" To: gpfsug main discussion list Date: 09/15/2017 06:48 PM Subject: [gpfsug-discuss] ZIMON Sensors config files... Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, Does anyone know how to use ?mmperfmon config update? to get the ?hostname =? field in the ZImonSensors.cfg file populated with the hostname of the node that it?s been installed on? By default the field is empty and for some reason on our cluster it doesn?t transmit any metrics unless we put the node hostname into that field. Is there some kind of wildcard that I can set? Thanks Neil Neil Wilson Senior IT Practitioner Storage Team IT Services Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 885959 Email: neil.wilson at metoffice.gov.uk Website www.metoffice.gov.uk Our magazine Barometer is now available online at http://www.metoffice.gov.uk/barometer/ P Please consider the environment before printing this e-mail. Thank you. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JJA1q39zaRyjClihY50646c-CyY4ZvrmpSjR1qs5rTc&s=GWOiCpEHiZ_TqlFj0AeKmjcccnez-X2rHMa5UtvGPTk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From scale at us.ibm.com Tue Sep 19 07:54:50 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 19 Sep 2017 14:54:50 +0800 Subject: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 In-Reply-To: <39FB5D56-A8C4-47DA-8A56-A2E453724875@nuance.com> References: <39FB5D56-A8C4-47DA-8A56-A2E453724875@nuance.com> Message-ID: Hi Richard, Is any of tool in https://www.ibm.com/developerworks/community/wikis/home?_escaped_fragment_=/wiki/General%2520Parallel%2520File%2520System%2520%2528GPFS%2529/page/Display%2520per%2520node%2520IO%2520statstics can help you? BTW, I agree with Bob that 3.5 is out-of-service. Without an extended service, you should consider to upgrade your cluster as soon as possible. 
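For the mmpmon route on a 3.5-era cluster, a rough sketch along these lines can help narrow down the busy node: it takes two cluster-wide samples of the per-node I/O counters and leaves the comparison to the admin. The node list, sample interval and temporary file names are assumptions to adapt.

#!/bin/bash
# Sample the per-node GPFS I/O counters twice and compare the snapshots to
# spot the node generating the load. Assumes mmdsh can reach every node.
MMDSH=/usr/lpp/mmfs/bin/mmdsh
MMPMON=/usr/lpp/mmfs/bin/mmpmon

$MMDSH -N all "echo io_s | $MMPMON -p" > /tmp/io_sample.1   # first snapshot
sleep 30                                                    # measurement window
$MMDSH -N all "echo io_s | $MMPMON -p" > /tmp/io_sample.2   # second snapshot

# The parseable (-p) io_s output reports cumulative per-node counters such as
# _br_ (bytes read), _bw_ (bytes written), _rdc_ (read calls) and _wc_ (write
# calls); the node whose counters grow fastest between the two snapshots is
# the likely source of the iops.
diff /tmp/io_sample.1 /tmp/io_sample.2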
Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 09/19/2017 03:28 AM Subject: Re: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Sent by: gpfsug-discuss-bounces at spectrumscale.org You do realize 3.5 is out of service, correct? You should be looking at upgrading :-) Catching this is real time, when you have a large number of nodes is going to be tough. How you recognizing that the file system is overloaded? Waiters? Looking at which nodes/NSDs have the longest/largest waiters may provide a clue. You might also take a look at mmpmon ? it?s a bit difficult to use in its raw state, but it does provide some good stats on a per file system basis. But you need to track these over times to get what you need. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Richard Lefebvre Reply-To: gpfsug main discussion list Date: Monday, September 18, 2017 at 2:18 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Hi I have a 3.5 GPFS system with 700+ nodes. I sometime have nodes that generate a lot of iops on the large file system but I cannot find the right tool to find which node is the source. I'm guessing under 4.2.X, there are now easy tools, but what can be done under GPFS 3.5. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=AYwUf61wv-Hq63KU7veQSxavdZy-e9eT9bkJFav8MVU&s=W42AQE74bvmOlw7P0D0wTqT0Rxop4KktnXeuDeGGdmk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From rohwedder at de.ibm.com Tue Sep 19 08:42:46 2017 From: rohwedder at de.ibm.com (Markus Rohwedder) Date: Tue, 19 Sep 2017 09:42:46 +0200 Subject: [gpfsug-discuss] ZIMON Sensors config files... In-Reply-To: References: Message-ID: Hello Neil, While the description below provides a way on how to edit the hostname parameter, you should not have the need to edit the "hostname" parameter. Sensors use the hostname() call to get the hostname where the sensor is running and use this as key in the performance database, which is what you typically want to see. From the description you provide I assume you want to have a sensor running on every node that has the perfmon designation? 
There could be different issues: > In order to enable sensors on every node, you need to ensure there is no "restrict" clause in the sensor description, or the restrict clause has to be set correctly > There could be some other communication issue between sensors and collectors. Restart sensors and collectors and check the logfiles in /var/log/zimon/. You should be able to see which sensors start up and if they can connect. > Can you check if you have the perfmon designation set for the nodes where you expect data from (mmlscluster) Mit freundlichen Gr??en / Kind regards Dr. Markus Rohwedder Spectrum Scale GUI Development Phone: +49 7034 6430190 IBM Deutschland E-Mail: rohwedder at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina K?deritz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 From: "IBM Spectrum Scale" To: gpfsug main discussion list Cc: gpfsug-discuss-bounces at spectrumscale.org Date: 09/19/2017 08:48 AM Subject: Re: [gpfsug-discuss] ZIMON Sensors config files... Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Neil, Have you tried these steps? mmperfmon config show --config-file /tmp/a vi /tmp/a mmperfmon config update --collectors oc8757286465 --config-file /tmp/a mmperfmon config show Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. Inactive hide details for "Wilson, Neil" ---09/15/2017 06:48:26 PM---Hi, Does anyone know how to use "mmperfmon config update" "Wilson, Neil" ---09/15/2017 06:48:26 PM---Hi, Does anyone know how to use "mmperfmon config update" to get the "hostname =" field in the ZImon From: "Wilson, Neil" To: gpfsug main discussion list Date: 09/15/2017 06:48 PM Subject: [gpfsug-discuss] ZIMON Sensors config files... Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, Does anyone know how to use ?mmperfmon config update? to get the ?hostname =? field in the ZImonSensors.cfg file populated with the hostname of the node that it?s been installed on? By default the field is empty and for some reason on our cluster it doesn?t transmit any metrics unless we put the node hostname into that field. Is there some kind of wildcard that I can set? Thanks Neil Neil Wilson Senior IT Practitioner Storage Team IT Services Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 885959 Email: neil.wilson at metoffice.gov.uk Website www.metoffice.gov.uk Our magazine Barometer is now available online at http://www.metoffice.gov.uk/barometer/ P Please consider the environment before printing this e-mail. Thank you. 
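For anyone working through the three suggestions above, a short command-line checklist covering the restrict clause, the sensor/collector restart plus logs, and the perfmon designation (node names below are placeholders):

# 1. Look for a restrict clause that would keep sensors from running on every node.
mmperfmon config show | grep -i restrict

# 2. Restart sensors and collector, then check the ZIMon logs for connection errors.
systemctl restart pmsensors      # on the sensor nodes
systemctl restart pmcollector    # on the collector node(s)
tail -n 50 /var/log/zimon/*.log

# 3. Confirm the nodes you expect data from carry the perfmon designation,
#    and add it where it is missing (node names are hypothetical).
mmlscluster | grep -i perfmon
# mmchnode --perfmon -N node1,node2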
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JJA1q39zaRyjClihY50646c-CyY4ZvrmpSjR1qs5rTc&s=GWOiCpEHiZ_TqlFj0AeKmjcccnez-X2rHMa5UtvGPTk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=Ow2bpnoab1kboH2xuSUrbx65ALeoAAicG7csl1sV-Qc&s=qZ1XUXWfOayLSSuvcCyHQ2ZgY1mu0Zs3kmpgeVQUCYI&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 1D696444.gif Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From mnaineni at in.ibm.com Tue Sep 19 12:50:50 2017 From: mnaineni at in.ibm.com (Malahal R Naineni) Date: Tue, 19 Sep 2017 11:50:50 +0000 Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? (Greg.Lehmann@csiro.au) Message-ID: An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Tue Sep 19 22:02:03 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Tue, 19 Sep 2017 21:02:03 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? Message-ID: <26ABC473-387D-4D58-9059-518E455724A9@vanderbilt.edu> Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Robert.Oesterlin at nuance.com Wed Sep 20 00:39:37 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 19 Sep 2017 23:39:37 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? Message-ID: OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Buterbaugh, Kevin L" Reply-To: gpfsug main discussion list Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Wed Sep 20 02:21:36 2017 From: bevans at pixitmedia.com (Barry Evans) Date: Tue, 19 Sep 2017 18:21:36 -0700 Subject: [gpfsug-discuss] RoCE not playing ball Message-ID: Hi All, Weirdness with a RoCE interface - verbs is not playing ball and is complaining about the inet6 address not matching up: 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version >= 1.1) loaded and initialized. 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 * nspdQueues 1)). 
2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981E1 state DOWN 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 with GID c081f9feff078a26. Please check if the correct inet6 address for the corresponding IP network interface is set 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid verbsPorts defined. Anyone run into this before? I have another node imaged the *exact* same way and no dice. Have tried a variety of drivers, cards, etc, same result every time. Cheers, Barry -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Wed Sep 20 04:07:18 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 20 Sep 2017 11:07:18 +0800 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: References: Message-ID: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. 
mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org OK - I've run across this before, and it's because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster --ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I'll see if I can find this in one of the recent 4.2 release notes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Buterbaugh, Kevin L" Reply-To: gpfsug main discussion list Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority... ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I've done a "mmsdrrestore -p testnsd2 -R /usr/bin/scp" on both of them. I've also done a "mmccr setup -F" and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I've copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it's not obvious from the above, networking is fine - ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS - or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I've got to run to a meeting right now, so I hope I'm not leaving out any crucial details here - does anyone have an idea what I need to do? Thanks... - Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
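
For anyone following the recovery steps suggested above, pulled together as commands they would look roughly like this, using the node names from this thread and assuming testnsd2 is the node that still holds a good configuration. The only command not spelled out in the reply is mmshutdown -a for the "shut the daemon down everywhere" step; treat the whole thing as an untested sketch to adapt, not a verified procedure:

  # stop the daemon on every node first
  mmshutdown -a

  # from the node with the intact configuration (testnsd2 here)
  mmchcluster --ccr-disable -p testnsd2
  mmsdrrestore -a -p testnsd2

  # push the authentication key to the rebuilt quorum nodes
  mmauth genkey propagate -N testnsd1,testnsd3

  # switch back to CCR and bring the cluster up
  mmchcluster --ccr-enable
  mmstartup -a
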
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From scale at us.ibm.com Wed Sep 20 04:33:16 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 20 Sep 2017 11:33:16 +0800 Subject: [gpfsug-discuss] Disk change problem in gss GNR In-Reply-To: References: Message-ID: Hi Atmane, In terms of this kind of disk management question, I would like to suggest to open a PMR to make IBM service help you. mmdelpdisk command would not need to reboot system to take effect. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: atmane To: "gpfsug-discuss at spectrumscale.org" Date: 09/14/2017 08:50 PM Subject: [gpfsug-discuss] Disk change problem in gss GNR Sent by: gpfsug-discuss-bounces at spectrumscale.org dear all, I change A Disk In Gss Storage Server mmchcarrier BB1RGL --release --pdisk 'e1d1s02' mmchcarrier BB1RGL --replace --pdisk 'e1d1s02' after replace disk Now I Have 2 Discs In My Gss the first disc was well changed name = "e1d1s02" the second disk still after I use this cmd mmdelpdisk BB1RGL --pdisk e1d1s02#004 -a the disk is still in use i need to reboot the system or ?? mmlspdisk all | less pdisk: replacementPriority = 1000 name = "e1d1s02" device = "/dev/sdik,/dev/sdih" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "ok" capacity = 3000034656256 freeSpace = 1453846429696 fru = "00W1572" location = "SV30820390-1-2" WWN = "naa.5000C5008D783E37" server = "gss0-ib0" pdisk: replacementPriority = 1000 name = "e1d1s02#004" device = "" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "missing/noPath/systemDrain/adminDrain/noRGD/noVCD" capacity = 3000034656256 freeSpace = 1599875317760 fru = "00W1572" location = "" WWN = "naa.5000C50056714E83" server = "gss0-ib0" -- -- Atmane Khiredine HPC System Admin | Office National de la M?t?orologie T?l : +213 21 50 73 93 Poste 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFbA&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=hQ86ctTaI7i14NrB-58_SzqSWnCR8p6b5bFxtzNcSbk&s=mthjH7ebhnNlSJl71hFjF4wZU0iygm3I9wH_Bu7_3Ds&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
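
For anyone who hits the same leftover-pdisk situation as in the GSS/GNR exchange above, the commands discussed there, strung together, would look something like the following. The recovery group (BB1RGL) and pdisk names are the ones from Atmane's output, and, as the Scale team suggests, this is best walked through with a PMR open rather than improvised:

  # release the failed drive, physically swap it, then seat the replacement
  mmchcarrier BB1RGL --release --pdisk 'e1d1s02'
  mmchcarrier BB1RGL --replace --pdisk 'e1d1s02'

  # confirm the new pdisk is ok and see whether the old entry is still draining
  mmlspdisk all | less

  # remove the drained stand-in entry once it is no longer referenced
  mmdelpdisk BB1RGL --pdisk 'e1d1s02#004' -a

No reboot should be needed; as the Scale team notes, mmdelpdisk takes effect without one, and as far as I understand it an entry left in systemDrain/adminDrain state simply lingers until its data has been drained off, after which the delete completes.
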
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From olaf.weiser at de.ibm.com Wed Sep 20 06:00:49 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 20 Sep 2017 07:00:49 +0200 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From jonathon.anderson at colorado.edu Wed Sep 20 06:13:13 2017 From: jonathon.anderson at colorado.edu (Jonathon A Anderson) Date: Wed, 20 Sep 2017 05:13:13 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , Message-ID: Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. 
I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathon.anderson at colorado.edu Wed Sep 20 06:33:14 2017 From: jonathon.anderson at colorado.edu (Jonathon A Anderson) Date: Wed, 20 Sep 2017 05:33:14 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , , Message-ID: I should have said, here are the package versions: [root at sgate1 ~]# rpm -qa | grep gpfs gpfs.gpl-4.2.2-3.noarch gpfs.docs-4.2.2-3.noarch gpfs.base-4.2.2-3.x86_64 gpfs.gplbin-3.10.0-514.26.2.el7.x86_64-4.2.2-3.x86_64 nfs-ganesha-gpfs-2.3.2-0.ibm32_2.el7.x86_64 gpfs.ext-4.2.2-3.x86_64 gpfs.msg.en_US-4.2.2-3.noarch gpfs.gskit-8.0.50-57.x86_64 gpfs.gplbin-3.10.0-327.36.3.el7.x86_64-4.2.2-3.x86_64 ________________________________________ From: Jonathon A Anderson Sent: Tuesday, September 19, 2017 11:13:13 PM To: gpfsug main discussion list Cc: varun.mittal at in.ibm.com; Mark.Bush at siriuscom.com Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. 
Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. 
Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From gangqiu at cn.ibm.com Wed Sep 20 06:58:15 2017 From: gangqiu at cn.ibm.com (Gang Qiu) Date: Wed, 20 Sep 2017 13:58:15 +0800 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: Do you set ip address for these adapters? Refer to the description of verbsRdmaCm in ?Command and Programming Reference': If RDMA CM is enabled for a node, the node will only be able to establish RDMA connections using RDMA CM to other nodes with verbsRdmaCm enabled. RDMA CM enablement requires IPoIB (IP over InfiniBand) with an active IP address for each port. Although IPv6 must be enabled, the GPFS implementation of RDMA CM does not currently support IPv6 addresses, so an IPv4 address must be used. Regards, Gang Qiu ********************************************************************************************** IBM China Systems & Technology Lab Tel: 86-10-82452193 Fax: 86-10-82452312 Moble: 132-6134-8284 Email: gangqiu at cn.ibm.com Address: Ring Bldg. No.28 Building, Zhong Guan Cun Software Park, No. 8 Dong Bei Wang West Road, ShangDi, Haidian District, Beijing 100193, P.R.China ??????????????8???????28???????????100193 ********************************************************************************************** From: "Olaf Weiser" To: gpfsug main discussion list Date: 09/20/2017 01:01 PM Subject: Re: [gpfsug-discuss] RoCE not playing ball Sent by: gpfsug-discuss-bounces at spectrumscale.org is ib_read_bw working ? just test it between the two nodes ... From: Barry Evans To: gpfsug main discussion list Date: 09/20/2017 03:21 AM Subject: [gpfsug-discuss] RoCE not playing ball Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi All, Weirdness with a RoCE interface - verbs is not playing ball and is complaining about the inet6 address not matching up: 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version >= 1.1) loaded and initialized. 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 * nspdQueues 1)). 
2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981E1 state DOWN 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 with GID c081f9feff078a26. Please check if the correct inet6 address for the corresponding IP network interface is set 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid verbsPorts defined. Anyone run into this before? I have another node imaged the *exact* same way and no dice. Have tried a variety of drivers, cards, etc, same result every time. Cheers, Barry This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=NCthMXTjizwdEVDBqoDwAfRswiFbdQVHRb4mzseFLEM&m=u155tVFn5u91gqIsTXSOSVvpbR7GQRPoVpviUDH73R0&s=63nY5ozD8mej1jefNBZjLGCkNOFD9-swr-lc7CRPbrM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From tortay at cc.in2p3.fr Wed Sep 20 09:03:54 2017 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Wed, 20 Sep 2017 10:03:54 +0200 Subject: [gpfsug-discuss] CCR cluster down for the count? 
In-Reply-To: <26ABC473-387D-4D58-9059-518E455724A9@vanderbilt.edu> References: <26ABC473-387D-4D58-9059-518E455724A9@vanderbilt.edu> Message-ID: <853ffcf7-7900-457b-0d8a-2c63886ed245@cc.in2p3.fr> On 19/09/2017 23:02, Buterbaugh, Kevin L wrote: > Hi All, > > We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. > > Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) > > I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. > > However, when I try to startup GPFS ? or run any GPFS command I get: > > /root > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /root > root at testnsd2# > > I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? > Hello, I have had the same issue multiple times. The "trick" is to execute "/usr/lpp/mmfs/bin/mmcommon startCcrMonitor" on a majority of quorum nodes (once they have the correct configuration files) to be able to start the cluster. I noticed a call to the above command in the "gpfs.gplbin" spec file in the "%postun" section (when doing RPM upgrades, if I'm not mistaken). . Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | From r.sobey at imperial.ac.uk Wed Sep 20 09:23:37 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 20 Sep 2017 08:23:37 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , Message-ID: This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. 
I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). 
>> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From douglasof at us.ibm.com Wed Sep 20 09:28:44 2017 From: douglasof at us.ibm.com (Douglas O'flaherty) Date: Wed, 20 Sep 2017 08:28:44 +0000 Subject: [gpfsug-discuss] User Meeting & SPXXL in NYC Message-ID: Reminder that the SPXXL day on IBM Spectrum Scale in New York is open to all. It is Thursday the 28th. There is also a Power day on Wednesday. For more information http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ Doug Mobile -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckrafft at de.ibm.com Wed Sep 20 11:47:35 2017 From: ckrafft at de.ibm.com (Christoph Krafft) Date: Wed, 20 Sep 2017 12:47:35 +0200 Subject: [gpfsug-discuss] WANTED: Official support statement using Spectrum Scale 4.2.x with Oracle DB v12 Message-ID: Hi folks, is anyone aware if there is now an official support statement for Spectrum Scale 4.2.x? As far as my understanding goes - we currently have an "older" official support statement for v4.1 with Oracle. Many thanks up-front for any useful hints ... :) Mit freundlichen Gr??en / Sincerely Christoph Krafft Client Technical Specialist - Power Systems, IBM Systems Certified IT Specialist @ The Open Group Phone: +49 (0) 7034 643 2171 IBM Deutschland GmbH Mobile: +49 (0) 160 97 81 86 12 Am Weiher 24 Email: ckrafft at de.ibm.com 65451 Kelsterbach Germany IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Norbert Janzen, Stefan Lutz, Nicole Reimer, Dr. Klaus Seifert, Wolfgang Wendt Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
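
Coming back to the NFS-without-SMB question earlier in this digest: pulling the suggestions from that thread together, an NFS-only setup would be attempted roughly as below from one of the CES protocol nodes. The export path and client names are the ones Jonathon posted, the option string follows the Access_Type/Protocols/Squash form used earlier in the thread, and, as the replies show, whether the userdefined auth step succeeds without SMB appears to depend on the code level installed - so this is a sketch of the intent, not a confirmed recipe:

  # declare user-defined authentication for file access (run on a CES node)
  /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined

  # create an NFSv3-only export and verify it
  /usr/lpp/mmfs/bin/mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(Access_Type=RW,Protocols=3,Squash=root_squash);dtn*.rc.int.colorado.edu(Access_Type=RW,Protocols=3,Squash=root_squash)'
  /usr/lpp/mmfs/bin/mmnfs export list
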
Name: 15225079.gif Type: image/gif Size: 1851 bytes Desc: not available URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Sep 20 14:55:28 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 20 Sep 2017 13:55:28 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: References: Message-ID: Hi All, testnsd1 and testnsd3 both had hardware issues (power supply and internal HD respectively). Given that they were 12 year old boxes, we decided to replace them with other boxes that are a mere 7 years old ? keep in mind that this is a test cluster. Disabling CCR does not work, even with the undocumented ??force? option: /var/mmfs/gen root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force mmchcluster: Unable to obtain the GPFS configuration file lock. mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. mmchcluster: Processing continues without lock protection. The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. Verifying GPFS is stopped on all nodes ... The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. 
ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: root at vmp610.vampire's password: root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. mmchcluster: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# I believe that part of the problem may be that there are 4 client nodes that were removed from the cluster without removing them from the cluster (done by another SysAdmin who was in a hurry to repurpose those machines). They?re up and pingable but not reachable by GPFS anymore, which I?m pretty sure is making things worse. Nor does Loic?s suggestion of running mmcommon work (but thanks for the suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to start the cluster up failed: /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# Thanks. Kevin On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > wrote: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. 
Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and From: "Oesterlin, Robert" > To: gpfsug main discussion list > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=mBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y&s=YJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Wed Sep 20 15:17:34 2017 From: bevans at pixitmedia.com (Barry Evans) Date: Wed, 20 Sep 2017 07:17:34 -0700 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: Yep, IP's set ok. We did try with ipv6 off to see what would happen, then turned it back on again. There are ipv6 addresses on the cards, but ipv4 is the only thing actually being used. On Tue, Sep 19, 2017 at 10:58 PM, Gang Qiu wrote: > > > > Do you set ip address for these adapters? > > Refer to the description of verbsRdmaCm in ?Command and Programming > Reference': > > If RDMA CM is enabled for a node, the node will only be able to establish > RDMA connections > using RDMA CM to other nodes with *verbsRdmaCm *enabled. RDMA CM > enablement requires > IPoIB (IP over InfiniBand) with an active IP address for each port. > Although IPv6 must be > enabled, the GPFS implementation of RDMA CM does not currently support > IPv6 addresses, so > an IPv4 address must be used. > > > > Regards, > Gang Qiu > > ************************************************************ > ********************************** > IBM China Systems & Technology Lab > Tel: 86-10-82452193 > Fax: 86-10-82452312 > Moble: 132-6134-8284 > Email: gangqiu at cn.ibm.com > Address: Ring Bldg. No.28 Building, Zhong Guan Cun Software Park, No. 8 > Dong Bei Wang West Road, ShangDi, Haidian District, Beijing 100193, > P.R.China > ??????????????8???????28???????????100193 > ************************************************************ > ********************************** > > > > From: "Olaf Weiser" > To: gpfsug main discussion list > Date: 09/20/2017 01:01 PM > Subject: Re: [gpfsug-discuss] RoCE not playing ball > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > is ib_read_bw working ? > just test it between the two nodes ... 
> > > > > From: Barry Evans > To: gpfsug main discussion list > Date: 09/20/2017 03:21 AM > Subject: [gpfsug-discuss] RoCE not playing ball > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi All, > > Weirdness with a RoCE interface - verbs is not playing ball and is > complaining about the inet6 address not matching up: > > 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes > verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version > >= 1.1) loaded and initialized. > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced > from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 > * nspdQueues 1)). > 2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981E1 state DOWN > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE > 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 > 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort > mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 > with GID c081f9feff078a26. Please check if the correct inet6 address for > the corresponding IP network interface is set > 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 > 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. > 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid > verbsPorts defined. > > > Anyone run into this before? I have another node imaged the *exact* same > way and no dice. Have tried a variety of drivers, cards, etc, same result > every time. > > Cheers, > Barry > > > > > > > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. 
Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email._______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r= > NCthMXTjizwdEVDBqoDwAfRswiFbdQVHRb4mzseFLEM&m= > u155tVFn5u91gqIsTXSOSVvpbR7GQRPoVpviUDH73R0&s= > 63nY5ozD8mej1jefNBZjLGCkNOFD9-swr-lc7CRPbrM&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Wed Sep 20 15:23:21 2017 From: bevans at pixitmedia.com (Barry Evans) Date: Wed, 20 Sep 2017 07:23:21 -0700 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: It has worked, yes, and while the issue has been present. At the moment it's not working, but I'm not entirely surprised with the amount it's been poked at. Cheers, Barry On Tue, Sep 19, 2017 at 10:00 PM, Olaf Weiser wrote: > is ib_read_bw working ? > just test it between the two nodes ... > > > > > From: Barry Evans > To: gpfsug main discussion list > Date: 09/20/2017 03:21 AM > Subject: [gpfsug-discuss] RoCE not playing ball > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi All, > > Weirdness with a RoCE interface - verbs is not playing ball and is > complaining about the inet6 address not matching up: > > 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes > verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version > >= 1.1) loaded and initialized. > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced > from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 > * nspdQueues 1)). 
> 2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981E1 state DOWN > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE > 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 > 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort > mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 > with GID c081f9feff078a26. Please check if the correct inet6 address for > the corresponding IP network interface is set > 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 > 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. > 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid > verbsPorts defined. > > > Anyone run into this before? I have another node imaged the *exact* same > way and no dice. Have tried a variety of drivers, cards, etc, same result > every time. > > Cheers, > Barry > > > > > > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email._______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. 
Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkr at lbl.gov Wed Sep 20 17:00:15 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Wed, 20 Sep 2017 09:00:15 -0700 Subject: [gpfsug-discuss] User Meeting & SPXXL in NYC In-Reply-To: References: Message-ID: Thanks Doug. If you plan to go, *do register*. GPFS Day is free, but we need to know how many will attend. Register using the link on the HPCXXL event page below. Cheers, Kristy > On Sep 20, 2017, at 1:28 AM, Douglas O'flaherty wrote: > > > Reminder that the SPXXL day on IBM Spectrum Scale in New York is open to all. It is Thursday the 28th. There is also a Power day on Wednesday. > > > For more information > http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ > > Doug > > Mobile > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Sep 20 17:27:48 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 20 Sep 2017 16:27:48 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: <20170920114844.6bf9f27b@osc.edu> References: <20170920114844.6bf9f27b@osc.edu> Message-ID: <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> Hi Ed, Thanks for the suggestion ? that?s basically what I had done yesterday after Googling and getting a hit or two on the IBM DeveloperWorks site. I?m including some output below which seems to show that I?ve got everything set up but it?s still not working. Am I missing something? We don?t use CCR on our production cluster (and this experience doesn?t make me eager to do so!), so I?m not that familiar with it... Kevin /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v grep" | sort testdellnode1: root 2583 1 0 May30 ? 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testdellnode1: root 6694 2583 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 2023 5828 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 5828 1 0 Sep18 ? 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 19356 4628 0 11:19 tty1 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 4628 1 0 Sep19 tty1 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 22149 2983 0 11:16 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 2983 1 0 Sep18 ? 00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 15685 6557 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 6557 1 0 Sep19 ? 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 29424 6512 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 6512 1 0 Sep18 ? 
00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort testdellnode1: drwxr-xr-x 2 root root 4096 Mar 3 2017 cached testdellnode1: drwxr-xr-x 2 root root 4096 Nov 10 2016 committed testdellnode1: -rw-r--r-- 1 root root 99 Nov 10 2016 ccr.nodes testdellnode1: total 12 testgateway: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testgateway: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testgateway: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes testgateway: total 12 testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 cached testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 committed testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth testnsd1: -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes testnsd1: total 8 testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached testnsd2: drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.1 testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.2 testnsd2: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd2: total 16 testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed testnsd3: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd3: -rw-r--r-- 1 root root 4 Sep 19 15:41 ccr.noauth testnsd3: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd3: total 8 testsched: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes testsched: total 12 /var/mmfs/gen root at testnsd2# more ../ccr/ccr.nodes 3,0,10.0.6.215,,testnsd3.vampire 1,0,10.0.6.213,,testnsd1.vampire 2,0,10.0.6.214,,testnsd2.vampire /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testgateway: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testsched: -rw-r--r--. 
1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/ssl/stage/genkeyData1" testnsd3: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd2: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testdellnode1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testgateway: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testsched: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 /var/mmfs/gen root at testnsd2# On Sep 20, 2017, at 10:48 AM, Edward Wahl > wrote: I've run into this before. We didn't use to use CCR. And restoring nodes for us is a major pain in the rear as we only allow one-way root SSH, so we have a number of useful little scripts to work around problems like this. Assuming that you have all the necessary files copied to the correct places, you can manually kick off CCR. I think my script does something like: (copy the encryption key info) scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor you should then see like 2 copies of it running under mmksh. Ed On Wed, 20 Sep 2017 13:55:28 +0000 "Buterbaugh, Kevin L" > wrote: Hi All, testnsd1 and testnsd3 both had hardware issues (power supply and internal HD respectively). Given that they were 12 year old boxes, we decided to replace them with other boxes that are a mere 7 years old ? keep in mind that this is a test cluster. Disabling CCR does not work, even with the undocumented ??force? option: /var/mmfs/gen root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force mmchcluster: Unable to obtain the GPFS configuration file lock. mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. mmchcluster: Processing continues without lock protection. The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. 
ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. Verifying GPFS is stopped on all nodes ... The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: root at vmp610.vampire's password: root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp610.vampire: Permission denied, please try again. 
vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. mmchcluster: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# I believe that part of the problem may be that there are 4 client nodes that were removed from the cluster without removing them from the cluster (done by another SysAdmin who was in a hurry to repurpose those machines). They?re up and pingable but not reachable by GPFS anymore, which I?m pretty sure is making things worse. Nor does Loic?s suggestion of running mmcommon work (but thanks for the suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to start the cluster up failed: /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# Thanks. Kevin On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > wrote: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and From: "Oesterlin, Robert" > To: gpfsug main discussion list > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. 
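For what it's worth, the six numbered steps quoted above boil down to roughly the following shell sketch (untested here and only a sketch: "NodeA" stands for whichever node still holds a good copy of the configuration, and the -N list simply reuses the node names from this thread):

mmshutdown -a                                  # step 2: make sure the daemon is down everywhere
mmchcluster --ccr-disable -p NodeA             # step 3: fall back to the server-based configuration repository
mmsdrrestore -a -p NodeA                       # step 4: push NodeA's mmsdrfs back out to all nodes
mmauth genkey propagate -N testnsd1,testnsd3   # step 5: redistribute the key to the rebuilt nodes
mmchcluster --ccr-enable                       # step 6: switch CCR back on
mmstartup -a                                   # then retry startup
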
Bob Oesterlin Sr Principal Storage Engineer, Nuance From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 -- Ed Wahl Ohio Supercomputer Center 614-292-9302 ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From stijn.deweirdt at ugent.be Wed Sep 20 18:48:26 2017 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Wed, 20 Sep 2017 19:48:26 +0200 Subject: [gpfsug-discuss] CCR cluster down for the count? 
In-Reply-To: <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> References: <20170920114844.6bf9f27b@osc.edu> <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> Message-ID: <1f0b2657-8ca3-7b35-95f3-7c4edb6c0818@ugent.be> hi kevin, we were hit by similar issue when we did something not so smart: we had a 5 node quorum, and we wanted to replace 1 test node with 3 more production quorum node. we however first removed the test node, and then with 4 quorum nodes we did mmshutdown for some other config modifications. when we tried to start it, we hit the same "Not enough CCR quorum nodes available" errors. also, none of the ccr commands were helpful; they also hanged, even simple ones like show etc etc. what we did in the end was the following (and some try-and-error): from the /var/adm/ras/mmsdrserv.log logfiles we guessed that we had some sort of split brain paxos cluster (some reported " ccrd: recovery complete (rc 809)", some same message with 'rc 0' and some didn't have the recovery complete on the last line(s)) * stop ccr everywhere mmshutdown -a mmdsh -N all pkill -9 -f mmccr * one by one, start the paxos cluster using mmshutdown on the quorum nodes (mmshutdown will start ccr and there is no unit or something to help with that). * the nodes will join after 3-4 minutes and report "recovery complete"; wait for it before you start another one * the trial-and-error part was that sometimes there was recovery complete with rc=809, sometimes with rc=0. in the end, once they all had same rc=0, paxos was happy again and eg mmlsconfig worked again. this left a very bad experience with CCR with us, but we want to use ces, so no real alternative (and to be honest, with odd number of quorum, we saw no more issues, everyting was smooth). in particular we were missing * unit files for all extra services that gpfs launched (mmccrmoniotr, mmsysmon); so we can monitor and start/stop them cleanly * ccr commands that work with broken paxos setup; eg to report that the paxos cluster is broken or operating in some split-brain mode. anyway, YMMV and good luck. stijn On 09/20/2017 06:27 PM, Buterbaugh, Kevin L wrote: > Hi Ed, > > Thanks for the suggestion ? that?s basically what I had done yesterday after Googling and getting a hit or two on the IBM DeveloperWorks site. I?m including some output below which seems to show that I?ve got everything set up but it?s still not working. > > Am I missing something? We don?t use CCR on our production cluster (and this experience doesn?t make me eager to do so!), so I?m not that familiar with it... > > Kevin > > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v grep" | sort > testdellnode1: root 2583 1 0 May30 ? 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testdellnode1: root 6694 2583 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 2023 5828 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 5828 1 0 Sep18 ? 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd1: root 19356 4628 0 11:19 tty1 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd1: root 4628 1 0 Sep19 tty1 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd2: root 22149 2983 0 11:16 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd2: root 2983 1 0 Sep18 ? 
00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd3: root 15685 6557 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd3: root 6557 1 0 Sep19 ? 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 29424 6512 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 6512 1 0 Sep18 ? 00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > /var/mmfs/gen > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort > testdellnode1: drwxr-xr-x 2 root root 4096 Mar 3 2017 cached > testdellnode1: drwxr-xr-x 2 root root 4096 Nov 10 2016 committed > testdellnode1: -rw-r--r-- 1 root root 99 Nov 10 2016 ccr.nodes > testdellnode1: total 12 > testgateway: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed > testgateway: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached > testgateway: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes > testgateway: total 12 > testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 cached > testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 committed > testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks > testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth > testnsd1: -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes > testnsd1: total 8 > testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached > testnsd2: drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed > testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.1 > testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.2 > testnsd2: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks > testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes > testnsd2: total 16 > testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached > testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed > testnsd3: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks > testnsd3: -rw-r--r-- 1 root root 4 Sep 19 15:41 ccr.noauth > testnsd3: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes > testnsd3: total 8 > testsched: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed > testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached > testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes > testsched: total 12 > /var/mmfs/gen > root at testnsd2# more ../ccr/ccr.nodes > 3,0,10.0.6.215,,testnsd3.vampire > 1,0,10.0.6.213,,testnsd1.vampire > 2,0,10.0.6.214,,testnsd2.vampire > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" > testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs > testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs > testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs > testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs > testgateway: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs > testsched: -rw-r--r--. 
1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" > testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/ssl/stage/genkeyData1" > testnsd3: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testnsd1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testnsd2: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testdellnode1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testgateway: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testsched: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > /var/mmfs/gen > root at testnsd2# > > On Sep 20, 2017, at 10:48 AM, Edward Wahl > wrote: > > I've run into this before. We didn't use to use CCR. And restoring nodes for > us is a major pain in the rear as we only allow one-way root SSH, so we have a > number of useful little scripts to work around problems like this. > > Assuming that you have all the necessary files copied to the correct > places, you can manually kick off CCR. > > I think my script does something like: > > (copy the encryption key info) > > scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ > > scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ > > scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ > > :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor > > you should then see like 2 copies of it running under mmksh. > > Ed > > > On Wed, 20 Sep 2017 13:55:28 +0000 > "Buterbaugh, Kevin L" > wrote: > > Hi All, > > testnsd1 and testnsd3 both had hardware issues (power supply and internal HD > respectively). Given that they were 12 year old boxes, we decided to replace > them with other boxes that are a mere 7 years old ? keep in mind that this is > a test cluster. > > Disabling CCR does not work, even with the undocumented ??force? option: > > /var/mmfs/gen > root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force > mmchcluster: Unable to obtain the GPFS configuration file lock. > mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. > mmchcluster: Processing continues without lock protection. > The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key > fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key > fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? 
The authenticity of host 'vmp608.vampire > (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp612.vampire > (10.0.21.12)' can't be established. ECDSA key fingerprint is > SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is > MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's password: > testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire > remote shell process had return code 255. testnsd1.vampire: Host key > verification failed. mmdsh: testnsd1.vampire remote shell process had return > code 255. vmp609.vampire: Host key verification failed. mmdsh: > vmp609.vampire remote shell process had return code 255. vmp608.vampire: > Host key verification failed. mmdsh: vmp608.vampire remote shell process had > return code 255. vmp612.vampire: Host key verification failed. mmdsh: > vmp612.vampire remote shell process had return code 255. > > root at vmp610.vampire's password: vmp610.vampire: > Permission denied, please try again. > > root at vmp610.vampire's password: vmp610.vampire: > Permission denied, please try again. > > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. > > Verifying GPFS is stopped on all nodes ... > The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key > fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key > fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp609.vampire > (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire > (10.0.6.213)' can't be established. ECDSA key fingerprint is > SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is > MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's password: > root at vmp610.vampire's password: > root at vmp610.vampire's password: > > testnsd3.vampire: Host key verification failed. > mmdsh: testnsd3.vampire remote shell process had return code 255. > vmp612.vampire: Host key verification failed. > mmdsh: vmp612.vampire remote shell process had return code 255. > vmp608.vampire: Host key verification failed. > mmdsh: vmp608.vampire remote shell process had return code 255. 
> vmp609.vampire: Host key verification failed. > mmdsh: vmp609.vampire remote shell process had return code 255. > testnsd1.vampire: Host key verification failed. > mmdsh: testnsd1.vampire remote shell process had return code 255. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. mmchcluster: Command failed. > Examine previous error messages to determine cause. /var/mmfs/gen > root at testnsd2# > > I believe that part of the problem may be that there are 4 client nodes that > were removed from the cluster without removing them from the cluster (done by > another SysAdmin who was in a hurry to repurpose those machines). They?re up > and pingable but not reachable by GPFS anymore, which I?m pretty sure is > making things worse. > > Nor does Loic?s suggestion of running mmcommon work (but thanks for the > suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to > start the cluster up failed: > > /var/mmfs/gen > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /var/mmfs/gen > root at testnsd2# > > Thanks. > > Kevin > > On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > > wrote: > > > Hi Kevin, > > Let's me try to understand the problem you have. What's the meaning of node > died here. Are you mean that there are some hardware/OS issue which cannot be > fixed and OS cannot be up anymore? > > I agree with Bob that you can have a try to disable CCR temporally, restore > cluster configuration and enable it again. > > Such as: > > 1. Login to a node which has proper GPFS config, e.g NodeA > 2. Shutdown daemon in all client cluster. > 3. mmchcluster --ccr-disable -p NodeA > 4. mmsdrrestore -a -p NodeA > 5. mmauth genkey propagate -N testnsd1, testnsd3 > 6. mmchcluster --ccr-enable > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale > (GPFS), then please post it to the public IBM developerWroks Forum at > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. > > If your query concerns a potential software error in Spectrum Scale (GPFS) > and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in other > countries. > > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > > "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run > across this before, and it?s because of a bug (as I recall) having to do with > CCR and > > From: "Oesterlin, Robert" > > To: gpfsug > main discussion list > > > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for > the count? 
Sent by: > gpfsug-discuss-bounces at spectrumscale.org > > ________________________________ > > > > OK ? I?ve run across this before, and it?s because of a bug (as I recall) > having to do with CCR and quorum. What I think you can do is set the cluster > to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back > up and then re-enable ccr. > > I?ll see if I can find this in one of the recent 4.2 release nodes. > > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > From: > > > on behalf of "Buterbaugh, Kevin L" > > > Reply-To: gpfsug main discussion list > > > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > > > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? > > Hi All, > > We have a small test cluster that is CCR enabled. It only had/has 3 NSD > servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while > back. I did nothing about it at the time because it was due to be life-cycled > as soon as I finished a couple of higher priority projects. > > Yesterday, testnsd1 also died, which took the whole cluster down. So now > resolving this has become higher priority? ;-) > > I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve > done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also > done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from > testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to > testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? > ssh without a password between those 3 boxes is fine. > > However, when I try to startup GPFS ? or run any GPFS command I get: > > /root > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /root > root at testnsd2# > > I?ve got to run to a meeting right now, so I hope I?m not leaving out any > crucial details here ? does anyone have an idea what I need to do? Thanks? > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 > > > > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 > > > > ? 
> Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From jonathon.anderson at colorado.edu Wed Sep 20 19:55:04 2017 From: jonathon.anderson at colorado.edu (Jonathon A Anderson) Date: Wed, 20 Sep 2017 18:55:04 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , , Message-ID: I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? 
~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... 
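Taken together, the advice in this older thread amounts to something like the sketch below, run from a CES protocol node. It is only an illustration: the export path and option names are the examples quoted above (with the unbalanced parenthesis fixed and Protocols=3 only, as suggested), --data-access-method file is the corrected flag from the follow-up, and as the errors earlier in this message show, the mmuserauth step may still refuse to run when the SMB packages are missing.

mmuserauth service create --data-access-method file --type userdefined   # "trust the NFS client" UID/GID handling, no AD/LDAP
mmnfs export add /fs_gpfs01 -c "*(Access_Type=RW,Protocols=3,Squash=no_root_squash)"   # NFSv3 only
mmnfs export list   # confirm the export was created
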
> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From ewahl at osc.edu Wed Sep 20 20:07:39 2017 From: ewahl at osc.edu (Edward Wahl) Date: Wed, 20 Sep 2017 15:07:39 -0400 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> References: <20170920114844.6bf9f27b@osc.edu> <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> Message-ID: <20170920150739.39f0a4a0@osc.edu> So who was the ccrmaster before? What is/was the quorum config? (tiebreaker disks?) what does 'mmccr check' say? Have you set DEBUG=1 and tried mmstartup to see if it teases out any more info from the error? Ed On Wed, 20 Sep 2017 16:27:48 +0000 "Buterbaugh, Kevin L" wrote: > Hi Ed, > > Thanks for the suggestion ? that?s basically what I had done yesterday after > Googling and getting a hit or two on the IBM DeveloperWorks site. 
I?m > including some output below which seems to show that I?ve got everything set > up but it?s still not working. > > Am I missing something? We don?t use CCR on our production cluster (and this > experience doesn?t make me eager to do so!), so I?m not that familiar with > it... > > Kevin > > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v > grep" | sort testdellnode1: root 2583 1 0 May30 ? > 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testdellnode1: root 6694 2583 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 2023 5828 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 5828 1 0 Sep18 ? > 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: > root 19356 4628 0 11:19 tty1 > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: > root 4628 1 0 Sep19 tty1 > 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: > root 22149 2983 0 11:16 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: > root 2983 1 0 Sep18 ? > 00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: > root 15685 6557 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: > root 6557 1 0 Sep19 ? > 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 29424 6512 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 6512 1 0 Sep18 ? > 00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor > 15 /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR > quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr > fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous > error messages to determine cause. /var/mmfs/gen root at testnsd2# mmdsh > -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort testdellnode1: > drwxr-xr-x 2 root root 4096 Mar 3 2017 cached testdellnode1: drwxr-xr-x 2 > root root 4096 Nov 10 2016 committed testdellnode1: -rw-r--r-- 1 root > root 99 Nov 10 2016 ccr.nodes testdellnode1: total 12 testgateway: > drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testgateway: drwxr-xr-x. > 2 root root 4096 Mar 3 2017 cached testgateway: -rw-r--r--. 1 root root > 99 Jun 29 2016 ccr.nodes testgateway: total 12 testnsd1: drwxr-xr-x 2 root > root 6 Sep 19 15:38 cached testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 > committed testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks > testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth testnsd1: > -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes testnsd1: total 8 > testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached testnsd2: > drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed testnsd2: -rw------- 1 > root root 4096 Sep 18 11:50 ccr.paxos.1 testnsd2: -rw------- 1 root root > 4096 Sep 18 11:50 ccr.paxos.2 testnsd2: -rw-r--r-- 1 root root 0 Jun 29 > 2016 ccr.disks testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes > testnsd2: total 16 testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached > testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed testnsd3: > -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd3: -rw-r--r-- 1 root > root 4 Sep 19 15:41 ccr.noauth testnsd3: -rw-r--r-- 1 root root 99 Jun 29 > 2016 ccr.nodes testnsd3: total 8 testsched: drwxr-xr-x. 
2 root root 4096 > Jun 29 2016 committed testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 > cached testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes > testsched: total 12 /var/mmfs/gen root at testnsd2# more ../ccr/ccr.nodes > 3,0,10.0.6.215,,testnsd3.vampire > 1,0,10.0.6.213,,testnsd1.vampire > 2,0,10.0.6.214,,testnsd2.vampire > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" > testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs > testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs > testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs > testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 > 17:43 /var/mmfs/gen/mmsdrfs testgateway: -rw-r--r--. 1 root root 20360 Aug > 25 17:43 /var/mmfs/gen/mmsdrfs testsched: -rw-r--r--. 1 root root 20360 Aug > 25 17:43 /var/mmfs/gen/mmsdrfs /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" > testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames > "md5sum /var/mmfs/ssl/stage/genkeyData1" testnsd3: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd1: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd2: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testdellnode1: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testgateway: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testsched: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 /var/mmfs/gen > root at testnsd2# > > On Sep 20, 2017, at 10:48 AM, Edward Wahl > > wrote: > > I've run into this before. We didn't use to use CCR. And restoring nodes for > us is a major pain in the rear as we only allow one-way root SSH, so we have a > number of useful little scripts to work around problems like this. > > Assuming that you have all the necessary files copied to the correct > places, you can manually kick off CCR. > > I think my script does something like: > > (copy the encryption key info) > > scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ > > scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ > > scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ > > :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor > > you should then see like 2 copies of it running under mmksh. > > Ed > > > On Wed, 20 Sep 2017 13:55:28 +0000 > "Buterbaugh, Kevin L" > > > wrote: > > Hi All, > > testnsd1 and testnsd3 both had hardware issues (power supply and internal HD > respectively). Given that they were 12 year old boxes, we decided to replace > them with other boxes that are a mere 7 years old ? keep in mind that this is > a test cluster. > > Disabling CCR does not work, even with the undocumented ??force? option: > > /var/mmfs/gen > root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force > mmchcluster: Unable to obtain the GPFS configuration file lock. > mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. > mmchcluster: Processing continues without lock protection. 
> The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key > fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key > fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp608.vampire > (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp612.vampire > (10.0.21.12)' can't be established. ECDSA key fingerprint is > SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is > MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's > password: testnsd3.vampire: Host key verification failed. mmdsh: > testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: > Host key verification failed. mmdsh: testnsd1.vampire remote shell process > had return code 255. vmp609.vampire: Host key verification failed. mmdsh: > vmp609.vampire remote shell process had return code 255. vmp608.vampire: > Host key verification failed. mmdsh: vmp608.vampire remote shell process had > return code 255. vmp612.vampire: Host key verification failed. mmdsh: > vmp612.vampire remote shell process had return code 255. > > root at vmp610.vampire's > password: vmp610.vampire: Permission denied, please try again. > > root at vmp610.vampire's > password: vmp610.vampire: Permission denied, please try again. > > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. > > Verifying GPFS is stopped on all nodes ... > The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key > fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key > fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp609.vampire > (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. 
ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire > (10.0.6.213)' can't be established. ECDSA key fingerprint is > SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is > MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's > password: > root at vmp610.vampire's > password: > root at vmp610.vampire's > password: > > testnsd3.vampire: Host key verification failed. > mmdsh: testnsd3.vampire remote shell process had return code 255. > vmp612.vampire: Host key verification failed. > mmdsh: vmp612.vampire remote shell process had return code 255. > vmp608.vampire: Host key verification failed. > mmdsh: vmp608.vampire remote shell process had return code 255. > vmp609.vampire: Host key verification failed. > mmdsh: vmp609.vampire remote shell process had return code 255. > testnsd1.vampire: Host key verification failed. > mmdsh: testnsd1.vampire remote shell process had return code 255. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. mmchcluster: Command failed. > Examine previous error messages to determine cause. /var/mmfs/gen > root at testnsd2# > > I believe that part of the problem may be that there are 4 client nodes that > were removed from the cluster without removing them from the cluster (done by > another SysAdmin who was in a hurry to repurpose those machines). They?re up > and pingable but not reachable by GPFS anymore, which I?m pretty sure is > making things worse. > > Nor does Loic?s suggestion of running mmcommon work (but thanks for the > suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to > start the cluster up failed: > > /var/mmfs/gen > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /var/mmfs/gen > root at testnsd2# > > Thanks. > > Kevin > > On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > > wrote: > > > Hi Kevin, > > Let's me try to understand the problem you have. What's the meaning of node > died here. Are you mean that there are some hardware/OS issue which cannot be > fixed and OS cannot be up anymore? > > I agree with Bob that you can have a try to disable CCR temporally, restore > cluster configuration and enable it again. > > Such as: > > 1. Login to a node which has proper GPFS config, e.g NodeA > 2. Shutdown daemon in all client cluster. > 3. mmchcluster --ccr-disable -p NodeA > 4. mmsdrrestore -a -p NodeA > 5. mmauth genkey propagate -N testnsd1, testnsd3 > 6. 
mmchcluster --ccr-enable > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale > (GPFS), then please post it to the public IBM developerWroks Forum at > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. > > If your query concerns a potential software error in Spectrum Scale (GPFS) > and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in other > countries. > > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > > "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run > across this before, and it?s because of a bug (as I recall) having to do with > CCR and > > From: "Oesterlin, Robert" > > > To: gpfsug main discussion list > > > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for > the count? Sent by: > gpfsug-discuss-bounces at spectrumscale.org > > ________________________________ > > > > OK ? I?ve run across this before, and it?s because of a bug (as I recall) > having to do with CCR and quorum. What I think you can do is set the cluster > to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back > up and then re-enable ccr. > > I?ll see if I can find this in one of the recent 4.2 release nodes. > > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > From: > > > on behalf of "Buterbaugh, Kevin L" > > > Reply-To: gpfsug main discussion list > > > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > > > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? > > Hi All, > > We have a small test cluster that is CCR enabled. It only had/has 3 NSD > servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while > back. I did nothing about it at the time because it was due to be life-cycled > as soon as I finished a couple of higher priority projects. > > Yesterday, testnsd1 also died, which took the whole cluster down. So now > resolving this has become higher priority? ;-) > > I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve > done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also > done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from > testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to > testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? > ssh without a password between those 3 boxes is fine. > > However, when I try to startup GPFS ? or run any GPFS command I get: > > /root > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /root > root at testnsd2# > > I?ve got to run to a meeting right now, so I hope I?m not leaving out any > crucial details here ? 
does anyone have an idea what I need to do? Thanks? > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu > - (615)875-9633 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at > spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at > spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 > > > > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 > > > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > -- Ed Wahl Ohio Supercomputer Center 614-292-9302 From tarak.patel at canada.ca Wed Sep 20 21:23:00 2017 From: tarak.patel at canada.ca (Patel, Tarak (SSC/SPC)) Date: Wed, 20 Sep 2017 20:23:00 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , , Message-ID: Hi, Recently we deployed 3 sets of CES nodes where we are using LDAP for authentication service. We had to create a user in ldap which was used by 'mmuserauth service create' command. Note that SMB needs to be disabled ('mmces service disable smb') if not being used before issuing 'mmuserauth service create'. By default, CES deployment enables SMB (' spectrumscale config protocols'). Tarak -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September, 2017 14:55 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." 
I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. 
mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but not > for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the > NFS client tells you". This of course only works sanely if each NFS > export is only to a set of machines in the same administrative domain > that manages their UID/GIDs. Exporting to two sets of machines that > don't coordinate their UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpi > Bv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiy > liSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ > 0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGV > srSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwC > YeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbj > XI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuv > EeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discus > s > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org 
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From chetkulk at in.ibm.com  Thu Sep 21 06:33:53 2017
From: chetkulk at in.ibm.com (Chetan R Kulkarni)
Date: Thu, 21 Sep 2017 11:03:53 +0530
Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication
In-Reply-To: 
References: <27469.1500914134@turing-police.cc.vt.edu>
Message-ID: 

Hi Jonathon,

I can configure file userdefined authentication with only NFS enabled/running on my test setup (SMB was disabled). Please check if the following steps help fix your issue:

1> remove existing file auth if any
/usr/lpp/mmfs/bin/mmuserauth service remove --data-access-method file

2> disable smb service
/usr/lpp/mmfs/bin/mmces service disable smb
/usr/lpp/mmfs/bin/mmces service list -a

3> configure userdefined file auth
/usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined

4> if the above fails, retry mmuserauth in debug mode as below and please share the error log /tmp/userdefined.log. Also share the Spectrum Scale version you are running:
export DEBUG=1; /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined > /tmp/userdefined.log 2>&1; unset DEBUG
/usr/lpp/mmfs/bin/mmdiag --version

5> if mmuserauth succeeds in step 3> above, you also need to correct your mmnfs CLI command as below. You missed typing Access_Type= and Squash= in the client definition.
mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(Access_Type=rw,Squash=root_squash);dtn*.rc.int.colorado.edu(Access_Type=rw,Squash=root_squash)'

Thanks,
Chetan.

From: Jonathon A Anderson
To: gpfsug main discussion list
Date: 09/21/2017 12:25 AM
Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication
Sent by: gpfsug-discuss-bounces at spectrumscale.org

I shouldn't need SMB for authentication if I'm only using userdefined authentication, though.
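Taken together, Chetan's steps 1> to 5> amount to roughly the following sketch, run on a CES protocol node. The export path and client patterns are the ones from Jonathon's environment, the trailing commands are just one way to verify the result, and <ces-ip> is a placeholder:

# reset file authentication and disable SMB, since only NFS is wanted
/usr/lpp/mmfs/bin/mmces service disable smb
/usr/lpp/mmfs/bin/mmuserauth service remove --data-access-method file
/usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined

# create the export with explicit Access_Type= and Squash= options
/usr/lpp/mmfs/bin/mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(Access_Type=rw,Squash=root_squash);dtn*.rc.int.colorado.edu(Access_Type=rw,Squash=root_squash)'

# verify authentication, services and the export definition
/usr/lpp/mmfs/bin/mmuserauth service list
/usr/lpp/mmfs/bin/mmces service list -a
/usr/lpp/mmfs/bin/mmnfs export list

# from an NFS client, confirm the export is visible (<ces-ip> is a placeholder)
showmount -e <ces-ip>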
HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu (rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. 
Examine previous error messages to determine cause.

I exported the NFS via /etc/exports and then ./exportfs -a. It works fine; I was able to mount the gpfs export from another machine. This was my work-around, since the Spectrum Scale tools failed to export NFSv3.

On Mon, Jul 24, 2017 at 7:35 PM, wrote:
> On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said:
>> Hi,
>> I have gpfs with 2 Nodes (redhat).
>> I am trying to create NFS share - So I would be able to mount and
>> access it from another linux machine.
>
>> While trying to create NFS (I execute the following):
>> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "*
>> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)"
>
> You can get away with little to no authentication for NFSv3, but
> not for NFSv4. Try with Protocols=3 only and
>
> mmuserauth service create --type userdefined
>
> that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS
> client tells you". This of course only works sanely if each NFS export is
> only to a set of machines in the same administrative domain that manages
> their UID/GIDs. Exporting to two sets of machines that don't coordinate
> their UID/GID space is, of course, where hilarity and hijinks ensue....
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

--
- Ilan Schwarts

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From andreas.mattsson at maxiv.lu.se  Thu Sep 21 13:09:29 2017
From: andreas.mattsson at maxiv.lu.se (Andreas Mattsson)
Date: Thu, 21 Sep 2017 12:09:29 +0000
Subject: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES
Message-ID: <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se>

Since I solved this old issue a long time ago, I thought I'd come back and report the solution in case someone else encounters similar problems in the future.

Original problem reported by users:
Copying files between folders on NFS exports from a CES server gave random timestamps on the files. Also, apart from the initially reported problem, there were issues where users sometimes couldn't change or delete files that they were owners of.

Background:
We have an Active Directory with RFC2307 posix attributes populated, and use the built-in Winbind-based AD authentication with RFC2307 ID mapping on our Spectrum Scale CES protocol servers. All our Linux clients and servers are also AD integrated, using Nslcd and nss-pam-ldapd.
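One quick way to see this mapping in practice, and the mismatch described under Cause below, could be to compare how a CES protocol node and an nslcd client resolve the same user and group. 'someuser' and 'UserGroup' are placeholder names, not accounts from our environment:

# on a CES protocol node (Winbind-based AD authentication with RFC2307 ID mapping)
/usr/lpp/mmfs/bin/mmuserauth service list
id someuser                # Winbind reports group names in lower case
getent group UserGroup     # mixed-case lookup, as stored in AD
getent group usergroup     # lower-case lookup; compare the GID numbers

# on an Nslcd/nss-pam-ldapd client, for comparison
id someuser                # Nslcd retains the case stored in AD
getent group UserGroup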
Trigger: If a user was part of a AD group with a mixed case name, and this group gave access to a folder, and the NFS mount was done using NFSv4, the behavior in my original post occurred when copying or changing files in that folder. Cause: Active Directory handle LDAP-requests case insensitive, but results are returned with case retained. Winbind and SSSD-AD converts groups and usernames to lower case. Nslcd retains case. We run NFS with managed GIDs. Managed GIDs in NFSv3 seems to be handled case insensitive, or to ignore the actual group name after it has resolved the GID-number of the group, while NFSv4 seems to handle group names case sensitive and check the actual group name for certain operations even if the GID-number matches. Don't fully understand the mechanism behind why certain file operations would work but others not, but in essence a user would be part of a group called "UserGroup" with GID-number 1234 in AD and on the client, but would be part of a group called "usergroup" with GID-number 1234 on the CES server. Any operation that's authorized on the GID-number, or a case insensitive lookup of the group name, would work. Any operation authorized by a case sensitive group lookup would fail. Three different workarounds where found to work: 1. Rename groups and users to lower case in AD 2. Change from Nslcd to either SSSD or Winbind on the clients 3. Change from NFSv4 to NFSv3 when mounting NFS Remember to clear ID-mapping caches. Regards, Andreas ___________________________________ [https://mail.google.com/mail/u/0/?ui=2&ik=b0a6f02971&view=att&th=14618fab2daf0e10&attid=0.1.1&disp=emb&zw&atsh=1] Andreas Mattsson System Engineer MAX IV Laboratory Lund University Tel: +46-706-649544 E-mail: andreas.mattsson at maxlab.lu.se ________________________________ Fr?n: gpfsug-discuss-bounces at spectrumscale.org f?r Stephen Ulmer Skickat: den 3 februari 2017 14:35:21 Till: gpfsug main discussion list ?mne: Re: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES Does the cp actually complete? As in, does it copy all of the blocks? What?s the exit code? A cp?d file should have ?new? metadata. That is, it should have it?s own dates, owners, etc. (not necessarily copied from the source file). I ran ?strace cp foo1 foo2?, and it was pretty instructive, maybe that would get you more info. On CentOS strace is in it?s own package, YMMV. -- Stephen On Feb 3, 2017, at 8:19 AM, Andreas Mattsson > wrote: That works. ?touch test100? Feb 3 14:16 test100 ?cp test100 test101? Feb 3 14:16 test100 Apr 21 2027 test101 ?touch ?r test100 test101? Feb 3 14:16 test100 Feb 3 14:16 test101 /Andreas That?s a cool one. :) What if you use the "random date" file as a time reference to touch another file (like, 'touch -r file02 file03?)? -- Stephen On Feb 3, 2017, at 7:46 AM, Andreas Mattsson > wrote: I?m having some really strange timestamp behaviour when doing file operations on NFS mounts shared via CES on spectrum scale 4.2.1.1 The NFS clients are up to date Centos and Debian machines. All Scale servers and NFS clients have correct date and time via NTP. Creating a file, for instance ?touch file00?, gives correct timestamp. Moving the file, ?mv file00 file01?, gives correct timestamp Copying the file, ?cp file01 file02?, gives a random timestamp anywhere in time, for instance Oct 12 2095 or Feb 29 1976 or something similar. This is only via NFS. Copying the file via a native gpfs-mount or via SMB gives a correct timestamp. 
Doing the same operation over NFS to other NFS-servers works correct, it is only when operating on the NFS-share from the Spectrum Scale CES the issue occurs. Have anyone seen this before? Regards, Andreas Mattsson _____________________________________________ Andreas Mattsson Systems Engineer MAX IV Laboratory Lund University P.O. Box 118, SE-221 00 Lund, Sweden Visiting address: Fotongatan 2, 225 94 Lund Mobile: +46 706 64 95 44 andreas.mattsson at maxiv.se www.maxiv.se _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From taylorm at us.ibm.com Thu Sep 21 15:33:00 2017 From: taylorm at us.ibm.com (Michael L Taylor) Date: Thu, 21 Sep 2017 07:33:00 -0700 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: Message-ID: Hi Jonathon, We were able to run this scenario successfully in our lab at the latest released 4.2.3.4. # /usr/lpp/mmfs/bin/mmdiag --version === mmdiag: version === Current GPFS build: "4.2.3.4 ". # /usr/lpp/mmfs/bin/mmces service list -a Enabled services: NFS node1.test.ibm.com: NFS is running # /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined File authentication configuration completed successfully. # rpm -qa | grep gpfs gpfs.ext-4.2.3-4.x86_64 gpfs.docs-4.2.3-4.noarch gpfs.gskit-8.0.50-75.x86_64 gpfs.gpl-4.2.3-4.noarch gpfs.msg.en_US-4.2.3-4.noarch nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 gpfs.base-4.2.3-4.x86_64 # rpm -qa | grep nfs-gan nfs-ganesha-utils-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 09/20/2017 12:07 PM Subject: gpfsug-discuss Digest, Vol 68, Issue 42 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=BpVUgvFT2Qwgw0hveEgQaHFwn2mjeQjeBrkXHX_aC0A&m=2oGcWc1xx6zOclryoU2BdJykABuIR118zXTmSAA8msU&s=7q0JMYVHMSGlUAYquNMlrDRF6BDj6-76Oc4VbXrvlHE&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: export nfs share on gpfs with no authentication (Jonathon A Anderson) ---------------------------------------------------------------------- Message: 1 Date: Wed, 20 Sep 2017 18:55:04 +0000 From: Jonathon A Anderson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Message-ID: Content-Type: text/plain; charset="us-ascii" I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. 
________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu (rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Thu Sep 21 18:09:52 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Thu, 21 Sep 2017 17:09:52 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: <20170920150739.39f0a4a0@osc.edu> References: <20170920114844.6bf9f27b@osc.edu> <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> <20170920150739.39f0a4a0@osc.edu> Message-ID: Hi All, Ralf Eberhard of IBM helped me resolve this off list. The key was to temporarily make testnsd1 and testnsd3 not be quorum nodes by making sure GPFS was down and then executing: mmchnode --nonquorum -N testnsd1,testnsd3 --force That gave me some scary messages about overriding normal GPFS quorum semantics, but nce that was done I was able to run an ?mmstartup -a? and bring up the cluster! Once it was up and I had verified things were working properly I then shut it back down so that I could rerun the mmchnode (without the ?force) to make testnsd1 and testnsd3 quorum nodes again. Thanks to all who helped me out here? Kevin On Sep 20, 2017, at 2:07 PM, Edward Wahl > wrote: So who was the ccrmaster before? What is/was the quorum config? (tiebreaker disks?) 
what does 'mmccr check' say? Have you set DEBUG=1 and tried mmstartup to see if it teases out any more info from the error? Ed On Wed, 20 Sep 2017 16:27:48 +0000 "Buterbaugh, Kevin L" > wrote: Hi Ed, Thanks for the suggestion ? that?s basically what I had done yesterday after Googling and getting a hit or two on the IBM DeveloperWorks site. I?m including some output below which seems to show that I?ve got everything set up but it?s still not working. Am I missing something? We don?t use CCR on our production cluster (and this experience doesn?t make me eager to do so!), so I?m not that familiar with it... Kevin /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v grep" | sort testdellnode1: root 2583 1 0 May30 ? 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testdellnode1: root 6694 2583 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 2023 5828 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 5828 1 0 Sep18 ? 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 19356 4628 0 11:19 tty1 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 4628 1 0 Sep19 tty1 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 22149 2983 0 11:16 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 2983 1 0 Sep18 ? 00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 15685 6557 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 6557 1 0 Sep19 ? 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 29424 6512 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 6512 1 0 Sep18 ? 00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort testdellnode1: drwxr-xr-x 2 root root 4096 Mar 3 2017 cached testdellnode1: drwxr-xr-x 2 root root 4096 Nov 10 2016 committed testdellnode1: -rw-r--r-- 1 root root 99 Nov 10 2016 ccr.nodes testdellnode1: total 12 testgateway: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testgateway: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testgateway: -rw-r--r--. 
1 root root 99 Jun 29 2016 ccr.nodes testgateway: total 12 testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 cached testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 committed testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth testnsd1: -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes testnsd1: total 8 testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached testnsd2: drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.1 testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.2 testnsd2: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd2: total 16 testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed testnsd3: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd3: -rw-r--r-- 1 root root 4 Sep 19 15:41 ccr.noauth testnsd3: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd3: total 8 testsched: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes testsched: total 12 /var/mmfs/gen root at testnsd2# more ../ccr/ccr.nodes 3,0,10.0.6.215,,testnsd3.vampire 1,0,10.0.6.213,,testnsd1.vampire 2,0,10.0.6.214,,testnsd2.vampire /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testgateway: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testsched: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/ssl/stage/genkeyData1" testnsd3: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd2: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testdellnode1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testgateway: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testsched: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 /var/mmfs/gen root at testnsd2# On Sep 20, 2017, at 10:48 AM, Edward Wahl > wrote: I've run into this before. We didn't use to use CCR. And restoring nodes for us is a major pain in the rear as we only allow one-way root SSH, so we have a number of useful little scripts to work around problems like this. Assuming that you have all the necessary files copied to the correct places, you can manually kick off CCR. 
I think my script does something like: (copy the encryption key info) scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor you should then see like 2 copies of it running under mmksh. Ed On Wed, 20 Sep 2017 13:55:28 +0000 "Buterbaugh, Kevin L" > wrote: Hi All, testnsd1 and testnsd3 both had hardware issues (power supply and internal HD respectively). Given that they were 12 year old boxes, we decided to replace them with other boxes that are a mere 7 years old ? keep in mind that this is a test cluster. Disabling CCR does not work, even with the undocumented ??force? option: /var/mmfs/gen root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force mmchcluster: Unable to obtain the GPFS configuration file lock. mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. mmchcluster: Processing continues without lock protection. The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. Verifying GPFS is stopped on all nodes ... The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. 
ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: root at vmp610.vampire's password: root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. mmchcluster: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# I believe that part of the problem may be that there are 4 client nodes that were removed from the cluster without removing them from the cluster (done by another SysAdmin who was in a hurry to repurpose those machines). They?re up and pingable but not reachable by GPFS anymore, which I?m pretty sure is making things worse. Nor does Loic?s suggestion of running mmcommon work (but thanks for the suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to start the cluster up failed: /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# Thanks. Kevin On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > wrote: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? 
I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and From: "Oesterlin, Robert" > To: gpfsug main discussion list > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. 
Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 -- Ed Wahl Ohio Supercomputer Center 614-292-9302 ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -- Ed Wahl Ohio Supercomputer Center 614-292-9302 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cfabfdb4659d249e2d20308d5005ae1ab%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415312700069585&sdata=Z59ik0w%2BaK6bV2JsDxSNt%2FsqwR1ESuqkXTQVBlRjDgw%3D&reserved=0 ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkr at lbl.gov Thu Sep 21 19:49:29 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Thu, 21 Sep 2017 11:49:29 -0700 Subject: [gpfsug-discuss] User Meeting & SPXXL in NYC In-Reply-To: References: Message-ID: Registration space is getting tight. We decided on a room reconfiguration today to make a little more room. So if you tried to register and were told it was full try again. If it fills up again and you want to register, but can?t drop me an email and I?ll see what we can do. Best, Kristy > On Sep 20, 2017, at 9:00 AM, Kristy Kallback-Rose wrote: > > Thanks Doug. > > If you plan to go, *do register*. GPFS Day is free, but we need to know how many will attend. Register using the link on the HPCXXL event page below. > > Cheers, > Kristy > >> On Sep 20, 2017, at 1:28 AM, Douglas O'flaherty > wrote: >> >> >> Reminder that the SPXXL day on IBM Spectrum Scale in New York is open to all. It is Thursday the 28th. There is also a Power day on Wednesday. 
>> >> >> For more information >> http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ >> >> Doug >> >> Mobile >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Sep 22 23:08:58 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 22 Sep 2017 22:08:58 +0000 Subject: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES In-Reply-To: <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se> References: <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se>, , Message-ID: An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Sep 22 23:10:45 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 22 Sep 2017 22:10:45 +0000 Subject: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES In-Reply-To: References: , <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se>, , Message-ID: An HTML attachment was scrubbed... URL: From bipcuds at gmail.com Sun Sep 24 19:04:59 2017 From: bipcuds at gmail.com (Keith Ball) Date: Sun, 24 Sep 2017 14:04:59 -0400 Subject: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? Message-ID: Hello All, In a recent Spectrum Scale performance study, we used zimon/mmperfmon to gather metrics. During a period of 2 months, we ended up losing data twice from the zimon database; once after the virtual disk serving both the OS files and zimon collector and DB storage was resized, and a second time after an unknown event (the loss was discovered when plotting in Grafana only went back to a certain data and time; likewise, mmperfmon query output only went back to the same time). Details: - Spectrum Scale 4.2.1.1 (on NSD servers); 4.2.1.2 on the zimon collector node and other clients - Data retention in the "raw" stratum was set to 2 months; the "domains" settings were as follows (note that we did not hit the ceiling of 60GB (1GB/file * 60 files): domains = { # this is the raw domain aggregation = 0 # aggregation factor for the raw domain is always 0. ram = "12g" # amount of RAM to be used duration = "2m" # amount of time that data with the highest precision is kept. filesize = "1g" # maximum file size files = 60 # number of files. }, { # this is the first aggregation domain that aggregates to 10 seconds aggregation = 10 ram = "800m" # amount of RAM to be used duration = "6m" # keep aggregates for 1 week. filesize = "1g" # maximum file size files = 10 # number of files. }, { # this is the second aggregation domain that aggregates to 30*10 seconds == 5 minutes aggregation = 30 ram = "800m" # amount of RAM to be used duration = "1y" # keep averages for 2 months. filesize = "1g" # maximum file size files = 5 # number of files. }, { # this is the third aggregation domain that aggregates to 24*30*10 seconds == 2 hours aggregation = 24 ram = "800m" # amount of RAM to be used duration = "2y" # filesize = "1g" # maximum file size files = 5 # number of files. } Questions: 1.) Has anyone had similar issues with losing data from zimon? 2.) Are there known circumstances where data could be lost, e.g. 
changing the aggregation domain definitions, or even simply restarting the zimon collector? 3.) Does anyone have any "best practices" for backing up the zimon database? We were taking weekly "snapshots" by shutting down the collector, and making a tarball copy of the /opt/ibm/zimon directory (but the database corruption/data loss still crept through for various reasons). In terms of debugging, we do not have Scale or zimon logs going back to the suspected dates of data loss; we do have a gpfs.snap from about a month after the last data loss - would it have any useful clues? Opening a PMR could be tricky, as it was the customer who has the support entitlement, and the environment (specifically the old cluster definitino and the zimon collector VM) was torn down. Many Thanks, Keith -- Keith D. Ball, PhD RedLine Performance Solutions, LLC web: http://www.redlineperf.com/ email: kball at redlineperf.com cell: 540-557-7851 <%28540%29%20557-7851> -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkr at lbl.gov Sun Sep 24 20:29:10 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Sun, 24 Sep 2017 12:29:10 -0700 Subject: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? In-Reply-To: References: Message-ID: Hi Keith, We have barely begun with Zimon and have not (knock, knock) run up against any loss or corruption issues with Zimon. However, getting data out of Zimon for various reasons is something I have been thinking about. I'm interested partly because of the granularity that is lost over time like with any round robin style data collection scheme. So I guess one question is whether you have considered pulling the data out to another database, looked at the SS GUI which uses a postgres db (iirc, about to take off on a flight and can't check), or looked at the Grafana bridge which would get data into OpenTsdb format, again iirc. Anyway, just some things for consideration and a request to share back whatever you find out if it's off list. Thanks, getting stink eye to go to airplane mode. More later. Cheers Kristy On Sep 24, 2017 11:05 AM, "Keith Ball" wrote: Hello All, In a recent Spectrum Scale performance study, we used zimon/mmperfmon to gather metrics. During a period of 2 months, we ended up losing data twice from the zimon database; once after the virtual disk serving both the OS files and zimon collector and DB storage was resized, and a second time after an unknown event (the loss was discovered when plotting in Grafana only went back to a certain data and time; likewise, mmperfmon query output only went back to the same time). Details: - Spectrum Scale 4.2.1.1 (on NSD servers); 4.2.1.2 on the zimon collector node and other clients - Data retention in the "raw" stratum was set to 2 months; the "domains" settings were as follows (note that we did not hit the ceiling of 60GB (1GB/file * 60 files): domains = { # this is the raw domain aggregation = 0 # aggregation factor for the raw domain is always 0. ram = "12g" # amount of RAM to be used duration = "2m" # amount of time that data with the highest precision is kept. filesize = "1g" # maximum file size files = 60 # number of files. }, { # this is the first aggregation domain that aggregates to 10 seconds aggregation = 10 ram = "800m" # amount of RAM to be used duration = "6m" # keep aggregates for 1 week. filesize = "1g" # maximum file size files = 10 # number of files. 
}, { # this is the second aggregation domain that aggregates to 30*10 seconds == 5 minutes aggregation = 30 ram = "800m" # amount of RAM to be used duration = "1y" # keep averages for 2 months. filesize = "1g" # maximum file size files = 5 # number of files. }, { # this is the third aggregation domain that aggregates to 24*30*10 seconds == 2 hours aggregation = 24 ram = "800m" # amount of RAM to be used duration = "2y" # filesize = "1g" # maximum file size files = 5 # number of files. } Questions: 1.) Has anyone had similar issues with losing data from zimon? 2.) Are there known circumstances where data could be lost, e.g. changing the aggregation domain definitions, or even simply restarting the zimon collector? 3.) Does anyone have any "best practices" for backing up the zimon database? We were taking weekly "snapshots" by shutting down the collector, and making a tarball copy of the /opt/ibm/zimon directory (but the database corruption/data loss still crept through for various reasons). In terms of debugging, we do not have Scale or zimon logs going back to the suspected dates of data loss; we do have a gpfs.snap from about a month after the last data loss - would it have any useful clues? Opening a PMR could be tricky, as it was the customer who has the support entitlement, and the environment (specifically the old cluster definitino and the zimon collector VM) was torn down. Many Thanks, Keith -- Keith D. Ball, PhD RedLine Performance Solutions, LLC web: http://www.redlineperf.com/ email: kball at redlineperf.com cell: 540-557-7851 <%28540%29%20557-7851> _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkomandu at in.ibm.com Mon Sep 25 06:26:15 2017 From: rkomandu at in.ibm.com (Ravi K Komanduri) Date: Mon, 25 Sep 2017 10:56:15 +0530 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: Message-ID: Jonathon, This requires SMB service when you are at 422 PTF2. As Mike pointed out if you upgrade to the 4.2.3-3/4 build you will no longer hit that issue With Regards, Ravi K Komanduri Email:rkomandu at in.ibm.com From: "Michael L Taylor" To: gpfsug-discuss at spectrumscale.org Date: 09/21/2017 08:03 PM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Jonathon, We were able to run this scenario successfully in our lab at the latest released 4.2.3.4. # /usr/lpp/mmfs/bin/mmdiag --version === mmdiag: version === Current GPFS build: "4.2.3.4 ". # /usr/lpp/mmfs/bin/mmces service list -a Enabled services: NFS node1.test.ibm.com: NFS is running # /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined File authentication configuration completed successfully. 
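With userdefined file authentication configured, the export command that was rejected earlier in this thread for missing authentication should no longer fail on that check; as a sketch, reusing Jonathon's path and client list (which are specific to his cluster):

/usr/lpp/mmfs/bin/mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)'
/usr/lpp/mmfs/bin/mmnfs export list    # confirm the export is now defined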
# rpm -qa | grep gpfs gpfs.ext-4.2.3-4.x86_64 gpfs.docs-4.2.3-4.noarch gpfs.gskit-8.0.50-75.x86_64 gpfs.gpl-4.2.3-4.noarch gpfs.msg.en_US-4.2.3-4.noarch nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 gpfs.base-4.2.3-4.x86_64 # rpm -qa | grep nfs-gan nfs-ganesha-utils-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 09/20/2017 12:07 PM Subject: gpfsug-discuss Digest, Vol 68, Issue 42 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=BpVUgvFT2Qwgw0hveEgQaHFwn2mjeQjeBrkXHX_aC0A&m=2oGcWc1xx6zOclryoU2BdJykABuIR118zXTmSAA8msU&s=7q0JMYVHMSGlUAYquNMlrDRF6BDj6-76Oc4VbXrvlHE&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: export nfs share on gpfs with no authentication (Jonathon A Anderson) ---------------------------------------------------------------------- Message: 1 Date: Wed, 20 Sep 2017 18:55:04 +0000 From: Jonathon A Anderson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Message-ID: Content-Type: text/plain; charset="us-ascii" I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. 
Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=ilYETqcaNr1y1ulWWDPjVg_X9pt35O1eYBTyFwJP56Y&m=VW8gJLSqT4rru6lFZXxCFp-Y3ngi6IUydv5czoG8kTE&s=deIQZQr-qfqLqW377yNysTJI8y7QJOdbokVjlnDr2d8&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Mon Sep 25 08:40:34 2017 From: john.hearns at asml.com (John Hearns) Date: Mon, 25 Sep 2017 07:40:34 +0000 Subject: [gpfsug-discuss] SPectrum Scale on AWS Message-ID: I guess this is not news on this list, however I did see a reference to SpectrumScale on The Register this morning, which linked to this paper: https://s3.amazonaws.com/quickstart-reference/ibm/spectrum/scale/latest/doc/ibm-spectrum-scale-on-the-aws-cloud.pdf The article is here https://www.theregister.co.uk/2017/09/25/storage_super_club_sandwich/ 12 Terabyte Helium drives now available. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikeowen at thinkboxsoftware.com Mon Sep 25 10:26:21 2017 From: mikeowen at thinkboxsoftware.com (Mike Owen) Date: Mon, 25 Sep 2017 10:26:21 +0100 Subject: [gpfsug-discuss] SPectrum Scale on AWS In-Reply-To: References: Message-ID: Full PR release below: https://aws.amazon.com/about-aws/whats-new/2017/09/deploy-ibm-spectrum-scale-on-the-aws-cloud-with-new-quick-start/ Posted On: Sep 13, 2017 This new Quick Start automatically deploys a highly available IBM Spectrum Scale cluster with replication on the Amazon Web Services (AWS) Cloud, into a configuration of your choice. (A small cluster can be deployed in about 25 minutes.) IBM Spectrum Scale is a flexible, software-defined storage solution that can be deployed as highly available, high-performance file storage. 
It can scale in several dimensions, including performance (bandwidth and IOPS), capacity, and number of nodes that can mount the file system. The product?s high performance and scalability helps address the needs of applications whose performance (or performance-to-capacity ratio) demands cannot be met by traditional scale-up storage systems. The IBM Spectrum Scale software is being made available through a 90-day trial license evaluation program. This Quick Start automates the deployment of IBM Spectrum Scale on AWS for users who require highly available access to a shared name space across multiple instances with good performance, without requiring an in-depth knowledge of IBM Spectrum Scale. The Quick Start deploys IBM Network Shared Disk (NSD) storage server instances and IBM Spectrum Scale compute instances into a virtual private cloud (VPC) in your AWS account. Data and metadata elements are replicated across two Availability Zones for optimal data protection. You can build a new VPC for IBM Spectrum Scale, or deploy the software into your existing VPC. The automated deployment provisions the IBM Spectrum Scale instances in Auto Scaling groups for instance scaling and management. The deployment and configuration tasks are automated by AWS CloudFormation templates that you can customize during launch. You can also use the templates as a starting point for your own implementation, by downloading them from the GitHub repository . The Quick Start includes a guide with step-by-step deployment and configuration instructions. To get started with IBM Spectrum Scale on AWS, use the following resources: - View the architecture and details - View the deployment guide - Browse and launch other AWS Quick Start reference deployments On 25 September 2017 at 08:40, John Hearns wrote: > I guess this is not news on this list, however I did see a reference to > SpectrumScale on The Register this morning, > > which linked to this paper: > > https://s3.amazonaws.com/quickstart-reference/ibm/ > spectrum/scale/latest/doc/ibm-spectrum-scale-on-the-aws-cloud.pdf > > > > The article is here https://www.theregister.co.uk/ > 2017/09/25/storage_super_club_sandwich/ > > 12 Terabyte Helium drives now available. > > > > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. Neither the sender nor the > company/group of companies he or she represents shall be liable for the > proper and complete transmission of the information contained in this > communication, or for any delay in its receipt. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Robert.Oesterlin at nuance.com Mon Sep 25 12:42:15 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 25 Sep 2017 11:42:15 +0000 Subject: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? Message-ID: <018DE6B7-ADE3-4A01-B23C-9DB668FD95DB@nuance.com> Another data point for Keith/Kristy, I?ve been using Zimon for about 18 months now, and I?ll have to admit it?s been less than robust for long-term data. The biggest issue I?ve run into is the stability of the collector process. I have it crash on a fairly regular basis, most due to memory usage. This results in data loss You can configure it in a highly-available mode that should mitigate this to some degree. However, I don?t think IBM has published any details on how reliable the data collection process is. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Kristy Kallback-Rose Reply-To: gpfsug main discussion list Date: Sunday, September 24, 2017 at 2:29 PM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? Hi Keith, We have barely begun with Zimon and have not (knock, knock) run up against any loss or corruption issues with Zimon. However, getting data out of Zimon for various reasons is something I have been thinking about. I'm interested partly because of the granularity that is lost over time like with any round robin style data collection scheme. So I guess one question is whether you have considered pulling the data out to another database, looked at the SS GUI which uses a postgres db (iirc, about to take off on a flight and can't check), or looked at the Grafana bridge which would get data into OpenTsdb format, again iirc. Anyway, just some things for consideration and a request to share back whatever you find out if it's off list. Thanks, getting stink eye to go to airplane mode. More later. Cheers Kristy On Sep 24, 2017 11:05 AM, "Keith Ball" > wrote: Hello All, In a recent Spectrum Scale performance study, we used zimon/mmperfmon to gather metrics. During a period of 2 months, we ended up losing data twice from the zimon database; once after the virtual disk serving both the OS files and zimon collector and DB storage was resized, and a second time after an unknown event (the loss was discovered when plotting in Grafana only went back to a certain data and time; likewise, mmperfmon query output only went back to the same time). Details: - Spectrum Scale 4.2.1.1 (on NSD servers); 4.2.1.2 on the zimon collector node and other clients - Data retention in the "raw" stratum was set to 2 months; the "domains" settings were as follows (note that we did not hit the ceiling of 60GB (1GB/file * 60 files): domains = { # this is the raw domain aggregation = 0 # aggregation factor for the raw domain is always 0. ram = "12g" # amount of RAM to be used duration = "2m" # amount of time that data with the highest precision is kept. filesize = "1g" # maximum file size files = 60 # number of files. }, { # this is the first aggregation domain that aggregates to 10 seconds aggregation = 10 ram = "800m" # amount of RAM to be used duration = "6m" # keep aggregates for 1 week. filesize = "1g" # maximum file size files = 10 # number of files. }, { # this is the second aggregation domain that aggregates to 30*10 seconds == 5 minutes aggregation = 30 ram = "800m" # amount of RAM to be used duration = "1y" # keep averages for 2 months. 
filesize = "1g" # maximum file size files = 5 # number of files. }, { # this is the third aggregation domain that aggregates to 24*30*10 seconds == 2 hours aggregation = 24 ram = "800m" # amount of RAM to be used duration = "2y" # filesize = "1g" # maximum file size files = 5 # number of files. } Questions: 1.) Has anyone had similar issues with losing data from zimon? 2.) Are there known circumstances where data could be lost, e.g. changing the aggregation domain definitions, or even simply restarting the zimon collector? 3.) Does anyone have any "best practices" for backing up the zimon database? We were taking weekly "snapshots" by shutting down the collector, and making a tarball copy of the /opt/ibm/zimon directory (but the database corruption/data loss still crept through for various reasons). In terms of debugging, we do not have Scale or zimon logs going back to the suspected dates of data loss; we do have a gpfs.snap from about a month after the last data loss - would it have any useful clues? Opening a PMR could be tricky, as it was the customer who has the support entitlement, and the environment (specifically the old cluster definitino and the zimon collector VM) was torn down. Many Thanks, Keith -- Keith D. Ball, PhD RedLine Performance Solutions, LLC web: http://www.redlineperf.com/ email: kball at redlineperf.com cell: 540-557-7851 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Sep 25 15:35:33 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 25 Sep 2017 14:35:33 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Message-ID: <1506350132.352.17.camel@imperial.ac.uk> Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Mon Sep 25 22:41:11 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Mon, 25 Sep 2017 21:41:11 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: <1506350132.352.17.camel@imperial.ac.uk> References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: An HTML attachment was scrubbed... 
URL: From christof.schmitt at us.ibm.com Mon Sep 25 22:41:11 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Mon, 25 Sep 2017 21:41:11 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: <1506350132.352.17.camel@imperial.ac.uk> References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 09:22:05 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 08:22:05 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 09:22:05 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 08:22:05 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: Hi Christof, thanks I?ll try it on a test cluster. 
Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 10:59:13 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 09:59:13 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: There isn?t a default ACL being applied to the export at all now, which is fine, but it differs from the behaviour in 4.2.3 PTF2. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 26 September 2017 09:22 To: gpfsug main discussion list Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? 
Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 10:59:13 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 09:59:13 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: There isn?t a default ACL being applied to the export at all now, which is fine, but it differs from the behaviour in 4.2.3 PTF2. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 26 September 2017 09:22 To: gpfsug main discussion list Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 
Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Tue Sep 26 21:49:09 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Tue, 26 Sep 2017 20:49:09 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: , <1506350132.352.17.camel@imperial.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Wed Sep 27 09:02:51 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 27 Sep 2017 08:02:51 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: , <1506350132.352.17.camel@imperial.ac.uk> Message-ID: I?m sorry, you?re right. I can only assume my brain was looking for an SID entry so when I saw Everyone:ALLOWED/FULL it didn?t process it at all. 4.2.3-4: [root at cesnode ~]# mmsmb exportacl list [testces] ACL:\Everyone:ALLOWED/FULL From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 26 September 2017 21:49 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? The default for the "export ACL" is always to allow access to "Everyone", so that the the "export ACL" does not limit access by default, but only the file system ACL. I do not have systems with these code levels at hand, could you show the difference you see between PTF2 and PTF4? Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list > Cc: "gpfsug-discuss at gpfsug.org" > Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Tue, Sep 26, 2017 2:59 AM There isn?t a default ACL being applied to the export at all now, which is fine, but it differs from the behaviour in 4.2.3 PTF2. 
Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 26 September 2017 09:22 To: gpfsug main discussion list > Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=B-AqKIRCmLBzoWAhGn7NY-ZASOX25NuP_c_ndE8gy4A&s=S06OD3mbRedYjfwETO8tUnlOjnWT7pOX8nsYX5ebIdA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenneth.waegeman at ugent.be Wed Sep 27 09:16:49 2017 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Wed, 27 Sep 2017 10:16:49 +0200 Subject: [gpfsug-discuss] el7.4 compatibility Message-ID: Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! 
Kenneth From michael.holliday at crick.ac.uk Wed Sep 27 09:25:58 2017 From: michael.holliday at crick.ac.uk (Michael Holliday) Date: Wed, 27 Sep 2017 08:25:58 +0000 Subject: [gpfsug-discuss] File Quotas vs Inode Limits Message-ID: Hi All, I'm in process of setting up quota for our users. We currently have block quotas per file set, and an inode limit for each inode space. Our users have request more transparency relating to the inode limit as as it is they can't see any information. Are there any disadvantages to implementing file quotas, and increasing the inode limits so that they will not be reached? Michael Michael Holliday HPC Systems Engineer Tel: 0203 796 3167 The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 1 Midland Road London NW1 1AT -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Sep 27 14:59:08 2017 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 27 Sep 2017 13:59:08 +0000 Subject: [gpfsug-discuss] File Quotas vs Inode Limits In-Reply-To: References: Message-ID: Actually you will get a benefit in that you can set up a callback so that users get alerted when they got over a soft quota. We also set up a fileset quota so that the callback will automatically notify users when they exceed their block and file quotas for their fileset as well. Hope that helps, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Michael Holliday Sent: Wednesday, September 27, 2017 4:26 AM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] File Quotas vs Inode Limits Note: External Email ________________________________ Hi All, I'm in process of setting up quota for our users. We currently have block quotas per file set, and an inode limit for each inode space. Our users have request more transparency relating to the inode limit as as it is they can't see any information. Are there any disadvantages to implementing file quotas, and increasing the inode limits so that they will not be reached? Michael Michael Holliday HPC Systems Engineer Tel: 0203 796 3167 The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 1 Midland Road London NW1 1AT ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... 
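A minimal sketch of the kind of setup Bryan describes, with placeholder file system, fileset, limits and script path rather than values from his cluster:

# per-fileset block and file quotas, soft limits below the hard limits
mmsetquota gpfs0:projects1 --block 10T:12T --files 5M:6M
# callback that runs a notification script whenever a soft quota is exceeded
mmaddcallback quotaNotify --command /usr/local/sbin/quota_notify.sh --event softQuotaExceeded --parms "%eventName %fsName"

The numbers are then visible through mmlsquota (for example mmlsquota -j projects1 gpfs0), which gives users rather more transparency than a bare inode limit on the inode space.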
URL: From Greg.Lehmann at csiro.au Thu Sep 28 00:44:53 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 27 Sep 2017 23:44:53 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: Message-ID: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> I guess I may as well ask about SLES 12 SP3 as well! TIA. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, 27 September 2017 6:17 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] el7.4 compatibility Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From bbanister at jumptrading.com Thu Sep 28 14:21:34 2017 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 28 Sep 2017 13:21:34 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: Please review this site: https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html Hope that helps, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au Sent: Wednesday, September 27, 2017 6:45 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] el7.4 compatibility Note: External Email ------------------------------------------------- I guess I may as well ask about SLES 12 SP3 as well! TIA. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, 27 September 2017 6:17 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] el7.4 compatibility Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From JRLang at uwyo.edu Thu Sep 28 15:18:52 2017 From: JRLang at uwyo.edu (Jeffrey R. 
Lang) Date: Thu, 28 Sep 2017 14:18:52 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: I just tired to build the GPFS GPL module against the latest version of RHEL 7.4 kernel and the build fails. The link below show that it should work. cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread kdump-kern.o: In function `GetOffset': kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' kdump-kern.o: In function `KernInit': kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' collect2: error: ld returned 1 exit status make[1]: *** [modules] Error 1 make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' make: *** [Modules] Error 1 -------------------------------------------------------- mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. -------------------------------------------------------- mmbuildgpl: Command failed. Examine previous error messages to determine cause. [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# uname -a Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux [root at bkupsvr3 ~]# mmdiag --version === mmdiag: version === Current GPFS build: "4.2.2.3 ". Built on Mar 16 2017 at 11:19:59 In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my case 514.26.2 If I'm missing something can some one point me in the right direction? -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan Banister Sent: Thursday, September 28, 2017 8:22 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Please review this site: https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html Hope that helps, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au Sent: Wednesday, September 27, 2017 6:45 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] el7.4 compatibility Note: External Email ------------------------------------------------- I guess I may as well ask about SLES 12 SP3 as well! TIA. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, 27 September 2017 6:17 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] el7.4 compatibility Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. 
Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From xhejtman at ics.muni.cz Thu Sep 28 15:22:54 2017 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Thu, 28 Sep 2017 16:22:54 +0200 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: <20170928142254.xwjvp3qwnilazer7@ics.muni.cz> You need 4.2.3.4 GPFS version and it will work. On Thu, Sep 28, 2017 at 02:18:52PM +0000, Jeffrey R. Lang wrote: > I just tired to build the GPFS GPL module against the latest version of RHEL 7.4 kernel and the build fails. The link below show that it should work. > > cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread > kdump-kern.o: In function `GetOffset': > kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' > kdump-kern.o: In function `KernInit': > kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' > collect2: error: ld returned 1 exit status > make[1]: *** [modules] Error 1 > make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' > make: *** [Modules] Error 1 > -------------------------------------------------------- > mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. > -------------------------------------------------------- > mmbuildgpl: Command failed. Examine previous error messages to determine cause. > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# uname -a > Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux > [root at bkupsvr3 ~]# mmdiag --version > > === mmdiag: version === > Current GPFS build: "4.2.2.3 ". > Built on Mar 16 2017 at 11:19:59 > > In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my case 514.26.2 > > If I'm missing something can some one point me in the right direction? > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan Banister > Sent: Thursday, September 28, 2017 8:22 AM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] el7.4 compatibility > > Please review this site: > > https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html > > Hope that helps, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au > Sent: Wednesday, September 27, 2017 6:45 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] el7.4 compatibility > > Note: External Email > ------------------------------------------------- > > I guess I may as well ask about SLES 12 SP3 as well! TIA. 
> > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman > Sent: Wednesday, 27 September 2017 6:17 PM > To: gpfsug-discuss at spectrumscale.org > Subject: [gpfsug-discuss] el7.4 compatibility > > Hi, > > Is there already some information available of gpfs (and protocols) on > el7.4 ? > > Thanks! > > Kenneth > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Luk?? Hejtm?nek From S.J.Thompson at bham.ac.uk Thu Sep 28 15:23:53 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 28 Sep 2017 14:23:53 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: The 7.4 kernels are listed as having been tested by IBM. Having said that, we have clients running 7.4 kernel and its OK, but we are 4.2.3.4efix2, so bump versions... Simon On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Jeffrey R. Lang" wrote: >I just tired to build the GPFS GPL module against the latest version of >RHEL 7.4 kernel and the build fails. The link below show that it should >work. > >cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >kdump-kern.o: In function `GetOffset': >kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >kdump-kern.o: In function `KernInit': >kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >collect2: error: ld returned 1 exit status >make[1]: *** [modules] Error 1 >make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >make: *** [Modules] Error 1 >-------------------------------------------------------- >mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >-------------------------------------------------------- >mmbuildgpl: Command failed. Examine previous error messages to determine >cause. 
>[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# uname -a >Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 >03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >[root at bkupsvr3 ~]# mmdiag --version > >=== mmdiag: version === >Current GPFS build: "4.2.2.3 ". >Built on Mar 16 2017 at 11:19:59 > >In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my >case 514.26.2 > >If I'm missing something can some one point me in the right direction? > > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >Banister >Sent: Thursday, September 28, 2017 8:22 AM >To: gpfsug main discussion list >Subject: Re: [gpfsug-discuss] el7.4 compatibility > >Please review this site: > >https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html > >Hope that helps, >-Bryan > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >Greg.Lehmann at csiro.au >Sent: Wednesday, September 27, 2017 6:45 PM >To: gpfsug-discuss at spectrumscale.org >Subject: Re: [gpfsug-discuss] el7.4 compatibility > >Note: External Email >------------------------------------------------- > >I guess I may as well ask about SLES 12 SP3 as well! TIA. > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth >Waegeman >Sent: Wednesday, 27 September 2017 6:17 PM >To: gpfsug-discuss at spectrumscale.org >Subject: [gpfsug-discuss] el7.4 compatibility > >Hi, > >Is there already some information available of gpfs (and protocols) on >el7.4 ? > >Thanks! > >Kenneth > >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > >________________________________ > >Note: This email is for the confidential use of the named addressee(s) >only and may contain proprietary, confidential or privileged information. >If you are not the intended recipient, you are hereby notified that any >review, dissemination or copying of this email is strictly prohibited, >and to please notify the sender immediately and destroy this email and >any attachments. Email transmission cannot be guaranteed to be secure or >error-free. The Company, therefore, does not make any guarantees as to >the completeness or accuracy of this email or any attachments. This email >is for informational purposes only and does not constitute a >recommendation, offer, request or solicitation of any kind to buy, sell, >subscribe, redeem or perform any type of transaction of a financial >product. 
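
For anyone else hitting the same page_offset_base link failure from mmbuildgpl on an el7.4 kernel, a rough recovery sketch is below. It assumes the node can be taken out of service briefly and that the cluster is moving to a GPFS level that handles the newer kernel (4.2.3.4 or later, as suggested in this thread); the package-name pattern is only illustrative.

# Record what the node is actually running before changing anything
uname -r                              # running kernel, e.g. 3.10.0-693.x on el7.4
cat /etc/redhat-release               # distro minor release
rpm -qa 'gpfs*'                       # installed Spectrum Scale packages
/usr/lpp/mmfs/bin/mmdiag --version    # build level the daemon reports

# After upgrading the GPFS packages to a level that supports this kernel,
# rebuild the portability layer and restart GPFS on the node
/usr/lpp/mmfs/bin/mmbuildgpl
/usr/lpp/mmfs/bin/mmshutdown
/usr/lpp/mmfs/bin/mmstartup

If mmbuildgpl still fails after the upgrade, the FAQ page linked above is the place to confirm which kernel levels have actually been tested.
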
>_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From kenneth.waegeman at ugent.be Thu Sep 28 15:36:04 2017 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Thu, 28 Sep 2017 16:36:04 +0200 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: > The 7.4 kernels are listed as having been tested by IBM. Hi, Were did you find this? > > Having said that, we have clients running 7.4 kernel and its OK, but we > are 4.2.3.4efix2, so bump versions... Do you have some information about the efix2? Is this for 7.4 ? And where should we find this :-) Thank you! Kenneth > > Simon > > On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on behalf > of Jeffrey R. Lang" JRLang at uwyo.edu> wrote: > >> I just tired to build the GPFS GPL module against the latest version of >> RHEL 7.4 kernel and the build fails. The link below show that it should >> work. >> >> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >> kdump-kern.o: In function `GetOffset': >> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >> kdump-kern.o: In function `KernInit': >> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >> collect2: error: ld returned 1 exit status >> make[1]: *** [modules] Error 1 >> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >> make: *** [Modules] Error 1 >> -------------------------------------------------------- >> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >> -------------------------------------------------------- >> mmbuildgpl: Command failed. Examine previous error messages to determine >> cause. >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# uname -a >> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 >> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >> [root at bkupsvr3 ~]# mmdiag --version >> >> === mmdiag: version === >> Current GPFS build: "4.2.2.3 ". >> Built on Mar 16 2017 at 11:19:59 >> >> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my >> case 514.26.2 >> >> If I'm missing something can some one point me in the right direction? 
>> >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >> Banister >> Sent: Thursday, September 28, 2017 8:22 AM >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] el7.4 compatibility >> >> Please review this site: >> >> https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html >> >> Hope that helps, >> -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >> Greg.Lehmann at csiro.au >> Sent: Wednesday, September 27, 2017 6:45 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] el7.4 compatibility >> >> Note: External Email >> ------------------------------------------------- >> >> I guess I may as well ask about SLES 12 SP3 as well! TIA. >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth >> Waegeman >> Sent: Wednesday, 27 September 2017 6:17 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: [gpfsug-discuss] el7.4 compatibility >> >> Hi, >> >> Is there already some information available of gpfs (and protocols) on >> el7.4 ? >> >> Thanks! >> >> Kenneth >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) >> only and may contain proprietary, confidential or privileged information. >> If you are not the intended recipient, you are hereby notified that any >> review, dissemination or copying of this email is strictly prohibited, >> and to please notify the sender immediately and destroy this email and >> any attachments. Email transmission cannot be guaranteed to be secure or >> error-free. The Company, therefore, does not make any guarantees as to >> the completeness or accuracy of this email or any attachments. This email >> is for informational purposes only and does not constitute a >> recommendation, offer, request or solicitation of any kind to buy, sell, >> subscribe, redeem or perform any type of transaction of a financial >> product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Thu Sep 28 15:45:25 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 28 Sep 2017 14:45:25 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Aren't listed as tested Sorry ... 
4.2.3.4 we have used with 7.4 as well, efix2 includes a fix for an AFM issue we have. Simon On 28/09/2017, 15:36, "kenneth.waegeman at ugent.be" wrote: > > >On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: >> The 7.4 kernels are listed as having been tested by IBM. >Hi, > >Were did you find this? >> >> Having said that, we have clients running 7.4 kernel and its OK, but we >> are 4.2.3.4efix2, so bump versions... >Do you have some information about the efix2? Is this for 7.4 ? And >where should we find this :-) > >Thank you! > >Kenneth > >> >> Simon >> >> On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on >>behalf >> of Jeffrey R. Lang" >of >> JRLang at uwyo.edu> wrote: >> >>> I just tired to build the GPFS GPL module against the latest version of >>> RHEL 7.4 kernel and the build fails. The link below show that it >>>should >>> work. >>> >>> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >>> kdump-kern.o: In function `GetOffset': >>> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >>> kdump-kern.o: In function `KernInit': >>> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >>> collect2: error: ld returned 1 exit status >>> make[1]: *** [modules] Error 1 >>> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >>> make: *** [Modules] Error 1 >>> -------------------------------------------------------- >>> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >>> -------------------------------------------------------- >>> mmbuildgpl: Command failed. Examine previous error messages to >>>determine >>> cause. >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# uname -a >>> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 >>> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >>> [root at bkupsvr3 ~]# mmdiag --version >>> >>> === mmdiag: version === >>> Current GPFS build: "4.2.2.3 ". >>> Built on Mar 16 2017 at 11:19:59 >>> >>> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my >>> case 514.26.2 >>> >>> If I'm missing something can some one point me in the right direction? >>> >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >>> Banister >>> Sent: Thursday, September 28, 2017 8:22 AM >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Please review this site: >>> >>> >>>https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.ht >>>ml >>> >>> Hope that helps, >>> -Bryan >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Greg.Lehmann at csiro.au >>> Sent: Wednesday, September 27, 2017 6:45 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Note: External Email >>> ------------------------------------------------- >>> >>> I guess I may as well ask about SLES 12 SP3 as well! TIA. 
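
Before bumping versions along those lines, it can help to capture what every node currently runs so the upgrade order is clear. A small sketch, assuming mmdsh can reach all nodes from the node where it is run (the node class all and the grep pattern are just illustrative):

# Kernel and GPFS daemon build on each node
/usr/lpp/mmfs/bin/mmdsh -N all 'uname -r; /usr/lpp/mmfs/bin/mmdiag --version | grep "GPFS build"'
# Cluster-wide minimum release level, which caps the features that can be enabled
/usr/lpp/mmfs/bin/mmlsconfig minReleaseLevel
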
>>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth >>> Waegeman >>> Sent: Wednesday, 27 September 2017 6:17 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: [gpfsug-discuss] el7.4 compatibility >>> >>> Hi, >>> >>> Is there already some information available of gpfs (and protocols) on >>> el7.4 ? >>> >>> Thanks! >>> >>> Kenneth >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> ________________________________ >>> >>> Note: This email is for the confidential use of the named addressee(s) >>> only and may contain proprietary, confidential or privileged >>>information. >>> If you are not the intended recipient, you are hereby notified that any >>> review, dissemination or copying of this email is strictly prohibited, >>> and to please notify the sender immediately and destroy this email and >>> any attachments. Email transmission cannot be guaranteed to be secure >>>or >>> error-free. The Company, therefore, does not make any guarantees as to >>> the completeness or accuracy of this email or any attachments. This >>>email >>> is for informational purposes only and does not constitute a >>> recommendation, offer, request or solicitation of any kind to buy, >>>sell, >>> subscribe, redeem or perform any type of transaction of a financial >>> product. >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From aaron.s.knister at nasa.gov Fri Sep 29 02:59:39 2017 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Fri, 29 Sep 2017 01:59:39 +0000 Subject: [gpfsug-discuss] Latest recommended 4.2 efix? Message-ID: Hi Everyone, What?s the latest recommended efix release for 4.2.3.4? I?m working on testing a 4.1 to 4.2 migration and was reminded today of some fun bugs in 4.2.3.4 for which I think there are efixes. Alternatively, any word on a 4.2.3.5 release date? -Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Sep 29 10:02:26 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 29 Sep 2017 09:02:26 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Simon, I would appreciate a heads up on that AFM issue. I upgraded to 4.2.3.4 this morning, to deal with an AFM issue, which is if a remote NFS mount goes down then an asynchronous operation such as a read can be stopped. I must admit to being not clued up on how the efixes are distributed. I downloaded the 4.2.3.4 installer for Linux yesterday. 
Should I be searching for additional fix packs on top of that (which I am in fact doing now). John H -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Thursday, September 28, 2017 4:45 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Aren't listed as tested Sorry ... 4.2.3.4 we have used with 7.4 as well, efix2 includes a fix for an AFM issue we have. Simon On 28/09/2017, 15:36, "kenneth.waegeman at ugent.be" wrote: > > >On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: >> The 7.4 kernels are listed as having been tested by IBM. >Hi, > >Were did you find this? >> >> Having said that, we have clients running 7.4 kernel and its OK, but >> we are 4.2.3.4efix2, so bump versions... >Do you have some information about the efix2? Is this for 7.4 ? And >where should we find this :-) > >Thank you! > >Kenneth > >> >> Simon >> >> On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on >>behalf of Jeffrey R. Lang" >on behalf of JRLang at uwyo.edu> wrote: >> >>> I just tired to build the GPFS GPL module against the latest version >>>of RHEL 7.4 kernel and the build fails. The link below show that it >>>should work. >>> >>> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >>> kdump-kern.o: In function `GetOffset': >>> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >>> kdump-kern.o: In function `KernInit': >>> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >>> collect2: error: ld returned 1 exit status >>> make[1]: *** [modules] Error 1 >>> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >>> make: *** [Modules] Error 1 >>> -------------------------------------------------------- >>> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >>> -------------------------------------------------------- >>> mmbuildgpl: Command failed. Examine previous error messages to >>>determine cause. >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# uname -a >>> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat >>>Sep 9 >>> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >>> [root at bkupsvr3 ~]# mmdiag --version >>> >>> === mmdiag: version === >>> Current GPFS build: "4.2.2.3 ". >>> Built on Mar 16 2017 at 11:19:59 >>> >>> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In >>> my case 514.26.2 >>> >>> If I'm missing something can some one point me in the right direction? 
>>> >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >>> Banister >>> Sent: Thursday, September 28, 2017 8:22 AM >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Please review this site: >>> >>> >>>https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fww >>>w.ibm.com%2Fsupport%2Fknowledgecenter%2Fen%2FSTXKQY%2Fgpfsclustersfaq >>>.ht&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d50 >>>67f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=nK6KEzCD62kU3njL >>>kIFKL69V3jyN836K5pHMX19tWk8%3D&reserved=0 >>>ml >>> >>> Hope that helps, >>> -Bryan >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Greg.Lehmann at csiro.au >>> Sent: Wednesday, September 27, 2017 6:45 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Note: External Email >>> ------------------------------------------------- >>> >>> I guess I may as well ask about SLES 12 SP3 as well! TIA. >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Kenneth Waegeman >>> Sent: Wednesday, 27 September 2017 6:17 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: [gpfsug-discuss] el7.4 compatibility >>> >>> Hi, >>> >>> Is there already some information available of gpfs (and protocols) >>> on >>> el7.4 ? >>> >>> Thanks! >>> >>> Kenneth >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 >>> >>> >>> ________________________________ >>> >>> Note: This email is for the confidential use of the named >>>addressee(s) only and may contain proprietary, confidential or >>>privileged information. >>> If you are not the intended recipient, you are hereby notified that >>>any review, dissemination or copying of this email is strictly >>>prohibited, and to please notify the sender immediately and destroy >>>this email and any attachments. Email transmission cannot be >>>guaranteed to be secure or error-free. The Company, therefore, does >>>not make any guarantees as to the completeness or accuracy of this >>>email or any attachments. This email is for informational purposes >>>only and does not constitute a recommendation, offer, request or >>>solicitation of any kind to buy, sell, subscribe, redeem or perform >>>any type of transaction of a financial product. 
>>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> >>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>pw%3D&reserved=0 _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> >>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>pw%3D&reserved=0 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >> sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >> rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >> 39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >> pw%3D&reserved=0 > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6pw%3D&reserved=0 -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. From r.sobey at imperial.ac.uk Fri Sep 29 10:04:49 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 29 Sep 2017 09:04:49 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Efixes (in my one time only limited experience!) come direct from IBM as a result of a PMR. Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 29 September 2017 10:02 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Simon, I would appreciate a heads up on that AFM issue. I upgraded to 4.2.3.4 this morning, to deal with an AFM issue, which is if a remote NFS mount goes down then an asynchronous operation such as a read can be stopped. 
I must admit to being not clued up on how the efixes are distributed. I downloaded the 4.2.3.4 installer for Linux yesterday. Should I be searching for additional fix packs on top of that (which I am in fact doing now). John H -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Thursday, September 28, 2017 4:45 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Aren't listed as tested Sorry ... 4.2.3.4 we have used with 7.4 as well, efix2 includes a fix for an AFM issue we have. Simon On 28/09/2017, 15:36, "kenneth.waegeman at ugent.be" wrote: > > >On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: >> The 7.4 kernels are listed as having been tested by IBM. >Hi, > >Were did you find this? >> >> Having said that, we have clients running 7.4 kernel and its OK, but >> we are 4.2.3.4efix2, so bump versions... >Do you have some information about the efix2? Is this for 7.4 ? And >where should we find this :-) > >Thank you! > >Kenneth > >> >> Simon >> >> On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on >>behalf of Jeffrey R. Lang" >on behalf of JRLang at uwyo.edu> wrote: >> >>> I just tired to build the GPFS GPL module against the latest version >>>of RHEL 7.4 kernel and the build fails. The link below show that it >>>should work. >>> >>> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >>> kdump-kern.o: In function `GetOffset': >>> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >>> kdump-kern.o: In function `KernInit': >>> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >>> collect2: error: ld returned 1 exit status >>> make[1]: *** [modules] Error 1 >>> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >>> make: *** [Modules] Error 1 >>> -------------------------------------------------------- >>> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >>> -------------------------------------------------------- >>> mmbuildgpl: Command failed. Examine previous error messages to >>>determine cause. >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# uname -a >>> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat >>>Sep 9 >>> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >>> [root at bkupsvr3 ~]# mmdiag --version >>> >>> === mmdiag: version === >>> Current GPFS build: "4.2.2.3 ". >>> Built on Mar 16 2017 at 11:19:59 >>> >>> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In >>> my case 514.26.2 >>> >>> If I'm missing something can some one point me in the right direction? 
>>> >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >>> Banister >>> Sent: Thursday, September 28, 2017 8:22 AM >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Please review this site: >>> >>> >>>https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fww >>>w.ibm.com%2Fsupport%2Fknowledgecenter%2Fen%2FSTXKQY%2Fgpfsclustersfaq >>>.ht&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d50 >>>67f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=nK6KEzCD62kU3njL >>>kIFKL69V3jyN836K5pHMX19tWk8%3D&reserved=0 >>>ml >>> >>> Hope that helps, >>> -Bryan >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Greg.Lehmann at csiro.au >>> Sent: Wednesday, September 27, 2017 6:45 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Note: External Email >>> ------------------------------------------------- >>> >>> I guess I may as well ask about SLES 12 SP3 as well! TIA. >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Kenneth Waegeman >>> Sent: Wednesday, 27 September 2017 6:17 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: [gpfsug-discuss] el7.4 compatibility >>> >>> Hi, >>> >>> Is there already some information available of gpfs (and protocols) >>> on >>> el7.4 ? >>> >>> Thanks! >>> >>> Kenneth >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 >>> >>> >>> ________________________________ >>> >>> Note: This email is for the confidential use of the named >>>addressee(s) only and may contain proprietary, confidential or >>>privileged information. >>> If you are not the intended recipient, you are hereby notified that >>>any review, dissemination or copying of this email is strictly >>>prohibited, and to please notify the sender immediately and destroy >>>this email and any attachments. Email transmission cannot be >>>guaranteed to be secure or error-free. The Company, therefore, does >>>not make any guarantees as to the completeness or accuracy of this >>>email or any attachments. This email is for informational purposes >>>only and does not constitute a recommendation, offer, request or >>>solicitation of any kind to buy, sell, subscribe, redeem or perform >>>any type of transaction of a financial product. 
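
Because efixes are delivered as one-off builds through a PMR rather than from Fix Central, it is easy to lose track of which nodes actually have one installed. Two quick checks that may help; gpfs.base is the base package name here, adjust if your packaging differs:

# The build string reported by the daemon normally includes any efix level,
# which distinguishes a plain PTF node from one running an efix build
/usr/lpp/mmfs/bin/mmdiag --version
# The release field of the installed RPM usually carries the same information
rpm -q gpfs.base --qf '%{NAME}-%{VERSION}-%{RELEASE}\n'
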
>>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> >>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>pw%3D&reserved=0 _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> >>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>pw%3D&reserved=0 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >> sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >> rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >> 39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >> pw%3D&reserved=0 > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6pw%3D&reserved=0 -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Fri Sep 29 10:39:43 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Fri, 29 Sep 2017 09:39:43 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Correct they some from IBM support. The AFM issue we have (and is fixed in the efix) is if you have client code running on the AFM cache that uses truncate. The AFM write coalescing processing does something funny with it, so the file isn't truncated and then the data you write afterwards isn't copied back to home. We found this with ABAQUS code running on our HPC nodes onto the AFM cache, I.e. 
At home, the final packed output file from ABAQUS is corrupt as its the "untruncated and then filled" version of the file (so just a big blob of empty data). I would guess that anything using truncate would see the same issue. 4.2.3.x: APAR IV99796 See IBM Flash Alert at: http://www-01.ibm.com/support/docview.wss?uid=ssg1S1010629&myns=s033&mynp=O CSTXKQY&mynp=OCSWJ00&mync=E&cm_sp=s033-_-OCSTXKQY-OCSWJ00-_-E Its remedied in efix2, of course remember that an efix has not gone through a full testing validation cycle (otherwise it would be a PTF), but we have not seen any issues in our environments running 4.2.3.4efix2. Simon On 29/09/2017, 10:04, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A" wrote: >Efixes (in my one time only limited experience!) come direct from IBM as >a result of a PMR. >Richard > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns >Sent: 29 September 2017 10:02 >To: gpfsug main discussion list >Subject: Re: [gpfsug-discuss] el7.4 compatibility > >Simon, >I would appreciate a heads up on that AFM issue. >I upgraded to 4.2.3.4 this morning, to deal with an AFM issue, which is >if a remote NFS mount goes down then an asynchronous operation such as a >read can be stopped. > >I must admit to being not clued up on how the efixes are distributed. I >downloaded the 4.2.3.4 installer for Linux yesterday. >Should I be searching for additional fix packs on top of that (which I am >in fact doing now). > >John H > > > > > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon >Thompson (IT Research Support) >Sent: Thursday, September 28, 2017 4:45 PM >To: gpfsug main discussion list >Subject: Re: [gpfsug-discuss] el7.4 compatibility > > >Aren't listed as tested > >Sorry ... >4.2.3.4 we have used with 7.4 as well, efix2 includes a fix for an AFM >issue we have. > >Simon > >On 28/09/2017, 15:36, "kenneth.waegeman at ugent.be" > wrote: > >> >> >>On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: >>> The 7.4 kernels are listed as having been tested by IBM. >>Hi, >> >>Were did you find this? >>> >>> Having said that, we have clients running 7.4 kernel and its OK, but >>> we are 4.2.3.4efix2, so bump versions... >>Do you have some information about the efix2? Is this for 7.4 ? And >>where should we find this :-) >> >>Thank you! >> >>Kenneth >> >>> >>> Simon >>> >>> On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on >>>behalf of Jeffrey R. Lang" >>on behalf of JRLang at uwyo.edu> wrote: >>> >>>> I just tired to build the GPFS GPL module against the latest version >>>>of RHEL 7.4 kernel and the build fails. The link below show that it >>>>should work. >>>> >>>> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >>>> kdump-kern.o: In function `GetOffset': >>>> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >>>> kdump-kern.o: In function `KernInit': >>>> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >>>> collect2: error: ld returned 1 exit status >>>> make[1]: *** [modules] Error 1 >>>> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >>>> make: *** [Modules] Error 1 >>>> -------------------------------------------------------- >>>> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT >>>>2017. 
>>>> -------------------------------------------------------- >>>> mmbuildgpl: Command failed. Examine previous error messages to >>>>determine cause. >>>> [root at bkupsvr3 ~]# >>>> [root at bkupsvr3 ~]# >>>> [root at bkupsvr3 ~]# >>>> [root at bkupsvr3 ~]# >>>> [root at bkupsvr3 ~]# uname -a >>>> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat >>>>Sep 9 >>>> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >>>> [root at bkupsvr3 ~]# mmdiag --version >>>> >>>> === mmdiag: version === >>>> Current GPFS build: "4.2.2.3 ". >>>> Built on Mar 16 2017 at 11:19:59 >>>> >>>> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In >>>> my case 514.26.2 >>>> >>>> If I'm missing something can some one point me in the right direction? >>>> >>>> >>>> -----Original Message----- >>>> From: gpfsug-discuss-bounces at spectrumscale.org >>>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >>>> Banister >>>> Sent: Thursday, September 28, 2017 8:22 AM >>>> To: gpfsug main discussion list >>>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>>> >>>> Please review this site: >>>> >>>> >>>>https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fww >>>>w.ibm.com%2Fsupport%2Fknowledgecenter%2Fen%2FSTXKQY%2Fgpfsclustersfaq >>>>.ht&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d50 >>>>67f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=nK6KEzCD62kU3njL >>>>kIFKL69V3jyN836K5pHMX19tWk8%3D&reserved=0 >>>>ml >>>> >>>> Hope that helps, >>>> -Bryan >>>> >>>> -----Original Message----- >>>> From: gpfsug-discuss-bounces at spectrumscale.org >>>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>>> Greg.Lehmann at csiro.au >>>> Sent: Wednesday, September 27, 2017 6:45 PM >>>> To: gpfsug-discuss at spectrumscale.org >>>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>>> >>>> Note: External Email >>>> ------------------------------------------------- >>>> >>>> I guess I may as well ask about SLES 12 SP3 as well! TIA. >>>> >>>> -----Original Message----- >>>> From: gpfsug-discuss-bounces at spectrumscale.org >>>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>>> Kenneth Waegeman >>>> Sent: Wednesday, 27 September 2017 6:17 PM >>>> To: gpfsug-discuss at spectrumscale.org >>>> Subject: [gpfsug-discuss] el7.4 compatibility >>>> >>>> Hi, >>>> >>>> Is there already some information available of gpfs (and protocols) >>>> on >>>> el7.4 ? >>>> >>>> Thanks! 
>>>> >>>> Kenneth >>>> >>>> _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>>> tqc6pw%3D&reserved=0 _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>>> tqc6pw%3D&reserved=0 >>>> >>>> >>>> ________________________________ >>>> >>>> Note: This email is for the confidential use of the named >>>>addressee(s) only and may contain proprietary, confidential or >>>>privileged information. >>>> If you are not the intended recipient, you are hereby notified that >>>>any review, dissemination or copying of this email is strictly >>>>prohibited, and to please notify the sender immediately and destroy >>>>this email and any attachments. Email transmission cannot be >>>>guaranteed to be secure or error-free. The Company, therefore, does >>>>not make any guarantees as to the completeness or accuracy of this >>>>email or any attachments. This email is for informational purposes >>>>only and does not constitute a recommendation, offer, request or >>>>solicitation of any kind to buy, sell, subscribe, redeem or perform >>>>any type of transaction of a financial product. 
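
A rough way to check whether a given level is exposed to the truncate problem described above, before and after applying the efix. The file system name (fs1), fileset name (iw1) and home path are made up for the sketch; substitute your own AFM cache fileset and its home export:

# Run on a node with the AFM cache fileset mounted
F=/gpfs/fs1/iw1/truncate-test
dd if=/dev/urandom of=$F bs=1M count=10               # write a 10 MiB file in the cache
truncate -s 0 $F                                      # truncate it, as the application would
echo "data written after truncate" >> $F              # then append a small amount of new data
/usr/lpp/mmfs/bin/mmafmctl fs1 flushPending -j iw1    # push queued updates to home
# Once the queue drains, compare the cache copy with the copy at home; on an
# affected level the home copy reportedly keeps the pre-truncate size instead
# of matching the small cache copy
ls -l $F /nfs/home-export/iw1/truncate-test
md5sum $F /nfs/home-export/iw1/truncate-test
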
>>>> _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> >>>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>>pw%3D&reserved=0 _______________________________________________ >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> >>>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>>pw%3D&reserved=0 >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>> sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>> rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>> 39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>> pw%3D&reserved=0 >> > >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.o >rg%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml >.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a39d93e96cad61fc >%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6pw%3D&reserved=0 >-- The information contained in this communication and any attachments is >confidential and may be privileged, and is for the sole use of the >intended recipient(s). Any unauthorized review, use, disclosure or >distribution is prohibited. Unless explicitly stated otherwise in the >body of this communication or the attachment thereto (if any), the >information is provided on an AS-IS basis without any express or implied >warranties or liabilities. To the extent you are relying on this >information, you are doing so at your own risk. If you are not the >intended recipient, please notify the sender immediately by replying to >this message and destroy all copies of this message and any attachments. >Neither the sender nor the company/group of companies he or she >represents shall be liable for the proper and complete transmission of >the information contained in this communication, or for any delay in its >receipt. >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From scale at us.ibm.com Fri Sep 29 13:26:51 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Fri, 29 Sep 2017 07:26:51 -0500 Subject: [gpfsug-discuss] Latest recommended 4.2 efix? In-Reply-To: References: Message-ID: There isn't a "recommended" efix as such. Generally, fixes go into the next ptf so that they go through a test cycle. 
If a customer hits a serious issue that cannot wait for the next ptf, they can request an efix be built, but since efixes do not get the same level of rigorous testing as a ptf, they are not generally recommended unless you report an issue and service determines you need it. To address your other questions: We are currently up to efix3 on 4.2.3.4 We don't announce PTF dates, because they depend upon the testing; however, you can see that we generally release a PTF roughly every 6 weeks and I believe ptf4 was out on 8/24 Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" To: "discussion, gpfsug main" Date: 09/28/2017 08:59 PM Subject: [gpfsug-discuss] Latest recommended 4.2 efix? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Everyone, What?s the latest recommended efix release for 4.2.3.4? I?m working on testing a 4.1 to 4.2 migration and was reminded today of some fun bugs in 4.2.3.4 for which I think there are efixes. Alternatively, any word on a 4.2.3.5 release date? -Aaron _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=IVcYH9EDg-UaA4Jt2GbsxN5XN1XbvejXTX0gAzNxtpM&s=9SmogyyA6QNSWxlZrpE-vBbslts0UexwJwPzp78LgKs&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From sandeep.patil at in.ibm.com Sat Sep 30 05:02:22 2017 From: sandeep.patil at in.ibm.com (Sandeep Ramesh) Date: Sat, 30 Sep 2017 09:32:22 +0530 Subject: [gpfsug-discuss] Spectrum Scale Enablement Material - 1H 2017 Message-ID: Hi Folks I was asked by Doris Conti to send the below to our Spectrum Scale User group. Below is a consolidated link that list all the enablement on Spectrum Scale/ESS that was done in 1H 2017 - which have blogs and videos from development and offering management. https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/White%20Papers%20%26%20Media Do note, Spectrum Scale developers keep blogging on the below site which is worth bookmarking: https://developer.ibm.com/storage/blog/ (as recent as 4 new blogs in Sept) Thanks Sandeep Linkedin: https://www.linkedin.com/in/sandeeprpatil Spectrum Scale Dev. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From oehmes at gmail.com Fri Sep 1 23:42:55 2017 From: oehmes at gmail.com (Sven Oehme) Date: Fri, 01 Sep 2017 22:42:55 +0000 Subject: Re: [gpfsug-discuss] Change to default for verbsRdmaMinBytes?
In-Reply-To: <20170901165625.6e4edd4c@osc.edu> References: <20170901165625.6e4edd4c@osc.edu> Message-ID: Hi Ed, yes, the default has changed for customers who had not overridden it. The reason is that many systems in the field, including all ESS systems that come pre-tuned, were manually changed from the 16k default to 8k because of better performance confirmed in multiple customer engagements and tests with various settings. We therefore changed the default to what it should be in the field, so people no longer have to set it themselves (simplification) and get the better-performing value out of the box. All of this happened as part of the communication code overhaul, which led to significant (think factors) improvements in RPC performance for RDMA and VERBS workloads. There is another round of significant enhancements coming soon that will make even more parameters obsolete or change their defaults for better out-of-the-box performance. I see that we should probably communicate these changes better. Not that I think this will have any negative effect compared to your performance with the old setting; I am actually pretty confident that you will get better performance with the new code, and setting parameters back to default on most manually tuned systems will probably make them even faster. If you have a Scale client on 4.2.3+ you really shouldn't have anything set besides maxFilesToCache, pagepool, workerThreads and potentially prefetch; if you are a protocol node, add the settings specific to an export (e.g. SMB and NFS set some special settings). Pretty much everything else these days should be left at the default so the code can pick the correct parameters. If it isn't, and you get better performance by manual tweaking, I would like to hear about it. On the communication side, the next release will eliminate another set of parameters that are now auto-set, and we plan to work on NSD next. I have presented various slides about the communication and simplification changes in various forums; the latest public non-NDA slides I presented are here --> http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf Hope this helps. Sven On Fri, Sep 1, 2017 at 1:56 PM Edward Wahl wrote: > Howdy. Just noticed this change to min RDMA packet size and I don't seem > to > see it in any patch notes. Maybe I just skipped the one where this > changed? > > mmlsconfig verbsRdmaMinBytes > verbsRdmaMinBytes 16384 > > (in case someone thinks we changed it) > > [root at proj-nsd01 ~]# mmlsconfig |grep verbs > verbsRdma enable > verbsRdma disable > verbsRdmasPerConnection 14 > verbsRdmasPerNode 1024 > verbsPorts mlx5_3/1 > verbsPorts mlx4_0 > verbsPorts mlx5_0 > verbsPorts mlx5_0 mlx5_1 > verbsPorts mlx4_1/1 > verbsPorts mlx4_1/2 > > > Oddly I also see this in config, though I've seen these kinds of things > before. > mmdiag --config |grep verbsRdmaMinBytes > verbsRdmaMinBytes 8192 > > We're on a recent efix. > Current GPFS build: "4.2.2.3 efix21 (1028007)". > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 <(614)%20292-9302> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL:
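As a rough illustration of falling back to the defaults Sven describes (the attribute names come from this thread, the node class name is only an example, and this is a sketch to adapt rather than a tuning recommendation):

  mmdiag --config | grep verbsRdmaMinBytes      # value actually in effect on this node
  mmlsconfig verbsRdmaMinBytes                  # value recorded in the cluster configuration
  mmchconfig verbsRdmaMinBytes=DEFAULT          # drop the manual override and restore the shipped default
  mmchconfig verbsRdmasPerConnection=DEFAULT -N nsdnodes   # 'nsdnodes' is a hypothetical node class

Whether a change needs a daemon restart depends on the attribute, so re-check with mmdiag --config afterwards.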
From truongv at us.ibm.com Fri Sep 1 23:56:23 2017 From: truongv at us.ibm.com (Truong Vu) Date: Fri, 1 Sep 2017 18:56:23 -0400 Subject: Re: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: Message-ID: The discrepancy between the mmlsconfig view and mmdiag has been fixed in the GPFS 4.2.3 version. Note, mmdiag reports the correct default value. Tru. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Sat Sep 2 10:35:34 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Sat, 2 Sep 2017 09:35:34 +0000 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Message-ID: Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. Pid=5134 Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From truongv at us.ibm.com Sat Sep 2 12:40:15 2017 From: truongv at us.ibm.com (Truong Vu) Date: Sat, 2 Sep 2017 07:40:15 -0400 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log In-Reply-To: References: Message-ID: The dates that have the zone abbreviation are from the scripts which use the OS date command. The daemon has its own format. This inconsistency has been address in 4.2.2. From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 09/02/2017 07:00 AM Subject: gpfsug-discuss Digest, Vol 68, Issue 4 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=wiPE5K_0qzTwdloCshNcSyamVNRJKz5WyOBal7dMz8w&s=pd3-zi8UQxVOjxOYxqbuaFSvv_71WENUBJsw0KUV3ro&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Date formats inconsistent mmfs.log (Sobey, Richard A) ---------------------------------------------------------------------- Message: 1 Date: Sat, 2 Sep 2017 09:35:34 +0000 From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Message-ID: Content-Type: text/plain; charset="us-ascii" Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. 
Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. Pid=5134 Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: < https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_pipermail_gpfsug-2Ddiscuss_attachments_20170902_4f65f336_attachment-2D0001.html&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=wiPE5K_0qzTwdloCshNcSyamVNRJKz5WyOBal7dMz8w&s=fNT71mM8obJ9rwxzm3Uzxw4mayi2pQg1u950E1raYK4&e= > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM&m=wiPE5K_0qzTwdloCshNcSyamVNRJKz5WyOBal7dMz8w&s=pd3-zi8UQxVOjxOYxqbuaFSvv_71WENUBJsw0KUV3ro&e= End of gpfsug-discuss Digest, Vol 68, Issue 4 ********************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From john.hearns at asml.com Mon Sep 4 08:43:59 2017 From: john.hearns at asml.com (John Hearns) Date: Mon, 4 Sep 2017 07:43:59 +0000 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log In-Reply-To: References: Message-ID: Richard, The date format changed at an update level. We recently updated to 4.2.3 and when you run mmchconfig release=LATEST you are prompted to confirm that the new log format can be used. I guess you might not have cut all nodes over yet on your update over the weekend? Cut and paste from the documentation: mmfsLogTimeStampISO8601={yes | no} Setting this parameter to no allows the cluster to continue running with the earlier log time stamp format. For more information, see Security mode. * Set mmfsLogTimeStampISO8061 to no if you save log information and you are not yet ready to switch to the new log time stamp format. After you complete the migration, you can change the log time stamp format at any time with the mmchconfig command. * Omit this parameter if you are ready to switch to the new format. The default value is yes From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: Saturday, September 02, 2017 11:36 AM To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. 
Pid=5134 Cheers Richard -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Sep 4 09:05:10 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 4 Sep 2017 08:05:10 +0000 Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log In-Reply-To: References: , Message-ID: Ah. I'm running 4.2.3 but haven't changed the release level. I'll get that sorted out. Thanks for the replies! Get Outlook for Android ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of John Hearns Sent: Monday, September 4, 2017 8:43:59 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Date formats inconsistent mmfs.log Richard, The date format changed at an update level. We recently updated to 4.2.3 and when you run mmchconfig release=LATEST you are prompted to confirm that the new log format can be used. I guess you might not have cut all nodes over yet on your update over the weekend? Cut and paste from the documentation: mmfsLogTimeStampISO8601={yes | no} Setting this parameter to no allows the cluster to continue running with the earlier log time stamp format. For more information, see Security mode. ? Set mmfsLogTimeStampISO8061 to no if you save log information and you are not yet ready to switch to the new log time stamp format. After you complete the migration, you can change the log time stamp format at any time with the mmchconfig command. ? Omit this parameter if you are ready to switch to the new format. The default value is yes From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: Saturday, September 02, 2017 11:36 AM To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] Date formats inconsistent mmfs.log Is there a good reason for the date formats in mmfs.log to be inconsistent? Apart from my OCD getting the better of me, it makes log analysis a bit difficult. Sat Sep 2 10:33:42.145 2017: [I] Command: successful mount gpfs Sat 2 Sep 10:33:42 BST 2017: finished mounting /dev/gpfs Sat Sep 2 10:33:42.168 2017: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol. Sat Sep 2 10:33:42.190 2017: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc. Sat 2 Sep 10:33:42 BST 2017: [I] sendRasEventToMonitor: Successfully send a filesystem event to monitor Sat 2 Sep 10:33:42 BST 2017: [I] The Spectrum Scale monitoring service is already running. 
Pid=5134 Cheers Richard -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckrafft at de.ibm.com Mon Sep 4 13:02:49 2017 From: ckrafft at de.ibm.com (Christoph Krafft) Date: Mon, 4 Sep 2017 12:02:49 +0000 Subject: [gpfsug-discuss] Looking for Use-Cases with Spectrum Scale / ESS with vRanger & VMware Message-ID: An HTML attachment was scrubbed... URL: From heiner.billich at psi.ch Mon Sep 4 17:48:20 2017 From: heiner.billich at psi.ch (Billich Heinrich Rainer (PSI)) Date: Mon, 4 Sep 2017 16:48:20 +0000 Subject: [gpfsug-discuss] Use AFM for migration of many small files Message-ID: <467FB293-D33B-46F4-BA1B-A5CB4D28DDE6@psi.ch> Hello, We use AFM prefetch to migrate data between two clusters (using NFS). This works fine with large files, say 1+GB. But we have millions of smaller files, about 1MB each. Here I see just ~150MB/s ? compare this to the 1000+MB/s we get for larger files. I assume that we would need more parallelism, does prefetch pull just one file at a time? So each file needs some or many metadata operations plus a single or just a few read and writes. Doing this sequentially adds up all the latencies of NFS+GPFS. This is my explanation. With larger files gpfs prefetch on home will help. Please can anybody comment: Is this right, does AFM prefetch handle one file at a time in a sequential manner? And is there any way to change this behavior? Or am I wrong and I need to look elsewhere to get better performance for prefetch of many smaller files? We will migrate several filesets in parallel, but still with individual filesets up to 350TB in size 150MB/s isn?t fun. Also just about 150 files/s seconds looks poor. The setup is quite new, hence there may be other places to look at. It?s all RHEL7 an spectrum scale 4.2.2-3 on the afm cache. Thank you, Heiner --, Paul Scherrer Institut Science IT Heiner Billich WHGA 106 CH 5232 Villigen PSI 056 310 36 02 https://www.psi.ch From vpuvvada at in.ibm.com Tue Sep 5 15:27:21 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Tue, 5 Sep 2017 19:57:21 +0530 Subject: [gpfsug-discuss] Use AFM for migration of many small files In-Reply-To: <467FB293-D33B-46F4-BA1B-A5CB4D28DDE6@psi.ch> References: <467FB293-D33B-46F4-BA1B-A5CB4D28DDE6@psi.ch> Message-ID: Which version of Spectrum Scale ? What is the fileset mode ? >We use AFM prefetch to migrate data between two clusters (using NFS). This works fine with large files, say 1+GB. But we have millions of smaller files, about 1MB each. Here >I see just ~150MB/s ? compare this to the 1000+MB/s we get for larger files. How was the performance measured ? 
If parallel IO is enabled, AFM uses multiple gateway nodes to prefetch the large files (if file size if more than 1GB). Performance difference between small and lager file is huge (1000MB - 150MB = 850MB) here, and generally it is not the case. How many files were present in list file for prefetch ? Could you also share full internaldump from the gateway node ? >I assume that we would need more parallelism, does prefetch pull just one file at a time? So each file needs some or many metadata operations plus a single or just a few >read and writes. Doing this sequentially adds up all the latencies of NFS+GPFS. This is my explanation. With larger files gpfs prefetch on home will help. AFM prefetches the files on multiple threads. Default flush threads for prefetch are 36 (fileset.afmNumFlushThreads (default 4) + afmNumIOFlushThreads (default 32)). >Please can anybody comment: Is this right, does AFM prefetch handle one file at a time in a sequential manner? And is there any way to change this behavior? Or am I wrong and >I need to look elsewhere to get better performance for prefetch of many smaller files? See above, AFM reads files on multiple threads parallelly. Try increasing the afmNumFlushThreads on fileset and verify if it improves the performance. ~Venkat (vpuvvada at in.ibm.com) From: "Billich Heinrich Rainer (PSI)" To: "gpfsug-discuss at spectrumscale.org" Date: 09/04/2017 10:18 PM Subject: [gpfsug-discuss] Use AFM for migration of many small files Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, We use AFM prefetch to migrate data between two clusters (using NFS). This works fine with large files, say 1+GB. But we have millions of smaller files, about 1MB each. Here I see just ~150MB/s ? compare this to the 1000+MB/s we get for larger files. I assume that we would need more parallelism, does prefetch pull just one file at a time? So each file needs some or many metadata operations plus a single or just a few read and writes. Doing this sequentially adds up all the latencies of NFS+GPFS. This is my explanation. With larger files gpfs prefetch on home will help. Please can anybody comment: Is this right, does AFM prefetch handle one file at a time in a sequential manner? And is there any way to change this behavior? Or am I wrong and I need to look elsewhere to get better performance for prefetch of many smaller files? We will migrate several filesets in parallel, but still with individual filesets up to 350TB in size 150MB/s isn?t fun. Also just about 150 files/s seconds looks poor. The setup is quite new, hence there may be other places to look at. It?s all RHEL7 an spectrum scale 4.2.2-3 on the afm cache. Thank you, Heiner --, Paul Scherrer Institut Science IT Heiner Billich WHGA 106 CH 5232 Villigen PSI 056 310 36 02 https://urldefense.proofpoint.com/v2/url?u=https-3A__www.psi.ch&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=4y79Y-3M5sHV1Fm6aUFPEDIl8W5jxVP64XSlBsAYBb4&s=eHcVdovN10-m-Qk0Ln2qvol3pkKNFwrzz2wgf1zXVXE&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=4y79Y-3M5sHV1Fm6aUFPEDIl8W5jxVP64XSlBsAYBb4&s=LbRyuSM_djs0FDXr27hPottQHAn3OGcivpyRcIDBN3U&e= -------------- next part -------------- An HTML attachment was scrubbed... 
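A sketch of what Venkat suggests, using a list file for the prefetch and a per-fileset override of the flush threads (file system name, fileset name, list file and thread count below are placeholders, and prefetch options differ slightly between releases, so check the mmafmctl man page for the level you run):

  mmafmctl fs0 prefetch -j ro_cache --list-file /tmp/smallfiles.list
  mmchfileset fs0 ro_cache -p afmNumFlushThreads=16
  mmlsfileset fs0 ro_cache --afm -L      # confirm the new value took effect

The list file is simply one fully qualified file name per line, which makes it easy to drive a large migration in batches.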
URL: From kenneth.waegeman at ugent.be Wed Sep 6 12:55:20 2017 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Wed, 6 Sep 2017 13:55:20 +0200 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: <20170901165625.6e4edd4c@osc.edu> Message-ID: Hi Sven, I see two parameters that we have set to non-default values that are not in your list of options still to configure. verbsRdmasPerConnection (256) and socketMaxListenConnections (1024) I remember we had to set socketMaxListenConnections because our cluster consist of +550 nodes. Are these settings still needed, or is this also tackled in the code? Thank you!! Cheers, Kenneth On 02/09/17 00:42, Sven Oehme wrote: > Hi Ed, > > yes the defaults for that have changed for customers who had not > overridden the default settings. the reason we did this was that many > systems in the field including all ESS systems that come pre-tuned > where manually changed to 8k from the 16k default due to better > performance that was confirmed in multiple customer engagements and > tests with various settings , therefore we change the default to what > it should be in the field so people are not bothered to set it anymore > (simplification) or get benefits by changing the default to provides > better performance. > all this happened when we did the communication code overhaul that did > lead to significant (think factors) of improved RPC performance for > RDMA and VERBS workloads. > there is another round of significant enhancements coming soon , that > will make even more parameters either obsolete or change some of the > defaults for better out of the box performance. > i see that we should probably enhance the communication of this > changes, not that i think this will have any negative effect compared > to what your performance was with the old setting i am actually pretty > confident that you get better performance with the new code, but by > setting parameters back to default on most 'manual tuned' probably > makes your system even faster. > if you have a Scale Client on 4.2.3+ you really shouldn't have > anything set beside maxfilestocache, pagepool, workerthreads and > potential prefetch , if you are a protocol node, this and settings > specific to an export (e.g. SMB, NFS set some special settings) , > pretty much everything else these days should be set to default so the > code can pick the correct parameters., if its not and you get better > performance by manual tweaking something i like to hear about it. > on the communication side in the next release will eliminate another > set of parameters that are now 'auto set' and we plan to work on NSD > next. > i presented various slides about the communication and simplicity > changes in various forums, latest public non NDA slides i presented > are here --> > http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf > > hope this helps . > > Sven > > > > On Fri, Sep 1, 2017 at 1:56 PM Edward Wahl > wrote: > > Howdy. Just noticed this change to min RDMA packet size and I > don't seem to > see it in any patch notes. Maybe I just skipped the one where > this changed? 
> > mmlsconfig verbsRdmaMinBytes > verbsRdmaMinBytes 16384 > > (in case someone thinks we changed it) > > [root at proj-nsd01 ~]# mmlsconfig |grep verbs > verbsRdma enable > verbsRdma disable > verbsRdmasPerConnection 14 > verbsRdmasPerNode 1024 > verbsPorts mlx5_3/1 > verbsPorts mlx4_0 > verbsPorts mlx5_0 > verbsPorts mlx5_0 mlx5_1 > verbsPorts mlx4_1/1 > verbsPorts mlx4_1/2 > > > Oddly I also see this in config, though I've seen these kinds of > things before. > mmdiag --config |grep verbsRdmaMinBytes > verbsRdmaMinBytes 8192 > > We're on a recent efix. > Current GPFS build: "4.2.2.3 efix21 (1028007)". > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Sep 6 13:22:41 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 6 Sep 2017 14:22:41 +0200 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: <20170901165625.6e4edd4c@osc.edu> Message-ID: An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Sep 6 13:29:44 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 6 Sep 2017 12:29:44 +0000 Subject: [gpfsug-discuss] Save the date! GPFS-UG meeting at SC17 - Sunday November 12th Message-ID: <7838054B-8A46-46A0-8A53-81E3049B4AE7@nuance.com> The 2017 Supercomputing conference is only 2 months away, and here?s a reminder to come early and attend the GPFS user group meeting. The meeting is tentatively scheduled from the afternoon of Sunday, November 12th. Exact location and times are still being discussed. If you have an interest in presenting at the user group meeting, please let us know. More details in the coming weeks. Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... URL: From damir.krstic at gmail.com Wed Sep 6 13:35:45 2017 From: damir.krstic at gmail.com (Damir Krstic) Date: Wed, 06 Sep 2017 12:35:45 +0000 Subject: [gpfsug-discuss] filesets inside of filesets Message-ID: Today we have following fileset structure on our filesystem: /projects <-- gpfs filesystem /projects/b1000 <-- b1000 is a fileset with a fileset quota applied to it I need to create a fileset or a directory inside of this project and have separate quota applied to it e.g.: /projects/b1000 (b1000 has 10TB quota applied) /projects/b1000/backup (backup has 1TB quota applied) Is this possible? I am thinking nested filesets would work if GPFS supports that. Otherwise, I was going to create a separate filesystem, create corresponding backup filesets on it and symlink them to the /projects/ directory. Thanks in advance. Damir -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Wed Sep 6 13:43:09 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Wed, 6 Sep 2017 12:43:09 +0000 Subject: [gpfsug-discuss] filesets inside of filesets In-Reply-To: References: Message-ID: Filesets in filesets are fine. BUT if you use scoped backups with TSM... 
Er Spectrum Protect, then there are restrictions on creating an IFS inside an IFS ... Simon From: > on behalf of "damir.krstic at gmail.com" > Reply-To: "gpfsug-discuss at spectrumscale.org" > Date: Wednesday, 6 September 2017 at 13:35 To: "gpfsug-discuss at spectrumscale.org" > Subject: [gpfsug-discuss] filesets inside of filesets Today we have following fileset structure on our filesystem: /projects <-- gpfs filesystem /projects/b1000 <-- b1000 is a fileset with a fileset quota applied to it I need to create a fileset or a directory inside of this project and have separate quota applied to it e.g.: /projects/b1000 (b1000 has 10TB quota applied) /projects/b1000/backup (backup has 1TB quota applied) Is this possible? I am thinking nested filesets would work if GPFS supports that. Otherwise, I was going to create a separate filesystem, create corresponding backup filesets on it and symlink them to the /projects/ directory. Thanks in advance. Damir -------------- next part -------------- An HTML attachment was scrubbed... URL: From rohwedder at de.ibm.com Wed Sep 6 13:51:47 2017 From: rohwedder at de.ibm.com (Markus Rohwedder) Date: Wed, 6 Sep 2017 14:51:47 +0200 Subject: [gpfsug-discuss] filesets inside of filesets In-Reply-To: References: Message-ID: Hello Damir, the files that belong to your fileset "backup" has a separate quota, it is not related to the quota in "b1000". There is no cumulative quota. Fileset Nesting may need other considerations as well, in some cases filesets behave different than simple directories. -> For NFSV4 ACLs, inheritance stops at the fileset boundaries -> Snapshots include the independent parent and the dependent children. Nested independent filesets are not included in a fileset snapshot. -> Export protocols like NFS or SMB will cross fileset boundaries and just treat them like a directory. Mit freundlichen Gr??en / Kind regards Dr. Markus Rohwedder Spectrum Scale GUI Development Phone: +49 7034 6430190 IBM Deutschland E-Mail: rohwedder at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina K?deritz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 From: Damir Krstic To: gpfsug main discussion list Date: 09/06/2017 02:36 PM Subject: [gpfsug-discuss] filesets inside of filesets Sent by: gpfsug-discuss-bounces at spectrumscale.org Today we have following fileset structure on our filesystem: /projects <-- gpfs filesystem /projects/b1000 <-- b1000 is a fileset with a fileset quota applied to it I need to create a fileset or a directory inside of this project and have separate quota applied to it e.g.: /projects/b1000 (b1000 has 10TB quota applied) /projects/b1000/backup (backup has 1TB quota applied) Is this possible? I am thinking nested filesets would work if GPFS supports that. Otherwise, I was going to create a separate filesystem, create corresponding backup filesets on it and symlink them to the /projects/ directory. Thanks in advance. Damir_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=5jyA3TazAAOckIeQUeIG0CJ4TG0aMWv7jDLDk3gYNkE&s=CbzPKTgh7mO6om2LTQr94LM1qfshrEdm58cJydejAfE&e= -------------- next part -------------- An HTML attachment was scrubbed... 
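For the layout Damir asks about, a nested fileset with its own quota can be sketched like this (file system device, fileset name, junction path and limits are placeholders; --inode-space new makes it an independent fileset, which is what the snapshot and scoped-backup caveats above refer to):

  mmcrfileset projects backup_b1000 --inode-space new
  mmlinkfileset projects backup_b1000 -J /projects/b1000/backup
  mmsetquota projects:backup_b1000 --block 1T:1T

A dependent fileset (created without --inode-space new) also carries its own fileset quota, so either flavour gives the separate 1TB limit on the backup directory.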
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 1B378274.gif Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From oehmes at gmail.com Wed Sep 6 14:32:40 2017 From: oehmes at gmail.com (Sven Oehme) Date: Wed, 06 Sep 2017 13:32:40 +0000 Subject: [gpfsug-discuss] Change to default for verbsRdmaMinBytes? In-Reply-To: References: <20170901165625.6e4edd4c@osc.edu> Message-ID: Hi, you still need both of them, but they are both on the list to be removed, the first is already integrated for the next major release, the 2nd we still work on. Sven On Wed, Sep 6, 2017 at 4:55 AM Kenneth Waegeman wrote: > Hi Sven, > > I see two parameters that we have set to non-default values that are not > in your list of options still to configure. > verbsRdmasPerConnection (256) and > socketMaxListenConnections (1024) > > I remember we had to set socketMaxListenConnections because our cluster > consist of +550 nodes. > > Are these settings still needed, or is this also tackled in the code? > > Thank you!! > > Cheers, > Kenneth > > > > On 02/09/17 00:42, Sven Oehme wrote: > > Hi Ed, > > yes the defaults for that have changed for customers who had not > overridden the default settings. the reason we did this was that many > systems in the field including all ESS systems that come pre-tuned where > manually changed to 8k from the 16k default due to better performance that > was confirmed in multiple customer engagements and tests with various > settings , therefore we change the default to what it should be in the > field so people are not bothered to set it anymore (simplification) or get > benefits by changing the default to provides better performance. > all this happened when we did the communication code overhaul that did > lead to significant (think factors) of improved RPC performance for RDMA > and VERBS workloads. > there is another round of significant enhancements coming soon , that will > make even more parameters either obsolete or change some of the defaults > for better out of the box performance. > i see that we should probably enhance the communication of this changes, > not that i think this will have any negative effect compared to what your > performance was with the old setting i am actually pretty confident that > you get better performance with the new code, but by setting parameters > back to default on most 'manual tuned' probably makes your system even > faster. > if you have a Scale Client on 4.2.3+ you really shouldn't have anything > set beside maxfilestocache, pagepool, workerthreads and potential prefetch > , if you are a protocol node, this and settings specific to an export > (e.g. SMB, NFS set some special settings) , pretty much everything else > these days should be set to default so the code can pick the correct > parameters., if its not and you get better performance by manual tweaking > something i like to hear about it. > on the communication side in the next release will eliminate another set > of parameters that are now 'auto set' and we plan to work on NSD next. 
> i presented various slides about the communication and simplicity changes > in various forums, latest public non NDA slides i presented are here --> > http://files.gpfsug.org/presentations/2017/Manchester/08_Research_Topics.pdf > > hope this helps . > > Sven > > > > On Fri, Sep 1, 2017 at 1:56 PM Edward Wahl < ewahl at osc.edu> > wrote: > >> Howdy. Just noticed this change to min RDMA packet size and I don't >> seem to >> see it in any patch notes. Maybe I just skipped the one where this >> changed? >> >> mmlsconfig verbsRdmaMinBytes >> verbsRdmaMinBytes 16384 >> >> (in case someone thinks we changed it) >> >> [root at proj-nsd01 ~]# mmlsconfig |grep verbs >> verbsRdma enable >> verbsRdma disable >> verbsRdmasPerConnection 14 >> verbsRdmasPerNode 1024 >> verbsPorts mlx5_3/1 >> verbsPorts mlx4_0 >> verbsPorts mlx5_0 >> verbsPorts mlx5_0 mlx5_1 >> verbsPorts mlx4_1/1 >> verbsPorts mlx4_1/2 >> >> >> Oddly I also see this in config, though I've seen these kinds of things >> before. >> mmdiag --config |grep verbsRdmaMinBytes >> verbsRdmaMinBytes 8192 >> >> We're on a recent efix. >> Current GPFS build: "4.2.2.3 efix21 (1028007)". >> >> -- >> >> Ed Wahl >> Ohio Supercomputer Center >> 614-292-9302 <%28614%29%20292-9302> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From heiner.billich at psi.ch Wed Sep 6 17:16:18 2017 From: heiner.billich at psi.ch (Billich Heinrich Rainer (PSI)) Date: Wed, 6 Sep 2017 16:16:18 +0000 Subject: [gpfsug-discuss] Use AFM for migration of many small files Message-ID: <7D6EFD03-5D74-4A7B-A0E8-2AD41B050E15@psi.ch> Hello Venkateswara, Edward, Thank you for the comments on how to speed up AFM prefetch with small files. We run 4.2.2-3 and the AFM mode is RO and we have just a single gateway, i.e. no parallel reads for large files. We will try to increase the value of afmNumFlushThreads. It wasn?t clear to me that these threads do read from home, too - at least for prefetch. First I will try a plain NFS mount and see how parallel reads of many small files scale the throughput. Next I will try AFM prefetch. I don?t do nice benchmarking, just watching dstat output. We prefetch 100?000 files in one bunch, so there is ample time to observe. The basic issue is that we get just about 45MB/s for sequential read of many 1000 files with 1MB per file on the home cluster. I.e. we read one file at a time before we switch to the next. This is no surprise. Each read takes about 20ms to complete, so at max we get 50 reads of 1MB per second. We?ve seen this on classical raid storage and on DSS/ESS systems. It?s likely just the physics of spinning disks and the fact that we do one read at a time and don?t allow any parallelism. We wait for one or two I/Os to single disks to complete before we continue With larger files prefetch jumps in and fires many reads in parallel ? To get 1?000MB/s I need to do 1?000 read/s and need to have ~20 reads in progress in parallel all the time ? we?ll see how close we get to 1?000MB/s with ?many small files?. 
Kind regards, Heiner -- Paul Scherrer Institut Science IT Heiner Billich WHGA 106 CH 5232 Villigen PSI 056 310 36 02 https://www.psi.ch From stijn.deweirdt at ugent.be Wed Sep 6 18:13:48 2017 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Wed, 6 Sep 2017 19:13:48 +0200 Subject: [gpfsug-discuss] mixed verbsRdmaSend Message-ID: Hi all, what is the expected behaviour of a mixed verbsRdmaSend setup, some nodes enabled, most disabled? We have some nodes with a very high IOPS workload, but most of the cluster of 500+ nodes does not have such a use case. We enabled verbsRdmaSend on the managers/quorum nodes (<10) and on the few (<10) clients with this workload, but not on the others (500+). It seems to work out fine, but is this an acceptable configuration? (The docs mention that enabling verbsRdmaSend on more than 100 nodes might lead to errors.) The nodes use IPoIB as the IP network, and running with verbsRdmaSend disabled on all nodes leads to an unstable cluster (TX errors, fewer than 1 error in 1M packets, on some clients, leading to GPFS expelling nodes etc.). (We still need to open a case with Mellanox to investigate further.) many thanks, stijn From gcorneau at us.ibm.com Thu Sep 7 00:30:23 2017 From: gcorneau at us.ibm.com (Glen Corneau) Date: Wed, 6 Sep 2017 18:30:23 -0500 Subject: [gpfsug-discuss] Happy 20th birthday GPFS !! Message-ID: Sorry I missed the anniversary of your conception (announcement letter) back on August 26th, so I hope you'll accept my belated congratulations on this long and exciting journey! https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=ca&infotype=an&appname=iSource&supplier=897&letternum=ENUS297-318 I remember your parent, PIOFS, as well! Ahh the fun times. ------------------ Glen Corneau Power Systems Washington Systems Center gcorneau at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 26117 bytes Desc: not available URL: From xhejtman at ics.muni.cz Thu Sep 7 16:07:20 2017 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Thu, 7 Sep 2017 17:07:20 +0200 Subject: [gpfsug-discuss] Overwriting migrated files Message-ID: <20170907150720.h3t5fowvdlibvik4@ics.muni.cz> Hello, we have files of about 100GB each. Many of these files are migrated to tape (GPFS+TSM; tape storage is an external pool, and dsmmigrate/dsmrecall are in place). These files are images from the Bacula backup system. When Bacula wants to reuse one of the images, it needs to truncate the file to 64kB and overwrite it. Is there a way to avoid recalling the whole 100GB from tape just to truncate the file? I tried a partial recall: dsmrecall -D -size=65k Vol03797 After recall processing finished, I tried to truncate the file using: dd if=/dev/zero of=Vol03797 count=0 bs=64k seek=1 which caused further recall of the whole file: $ dsmls Vol03797 IBM Spectrum Protect Command Line Space Management Client Interface Client Version 8, Release 1, Level 2.0 Client date/time: 09/07/2017 15:01:59 (c) Copyright by IBM Corporation and other(s) 1990, 2017. All Rights Reserved. ActS ResS ResB FSt FName 107380819676 10485760 31373312 m (p) Vol03797 and the ResB size kept growing towards 107380819676. After dd finished: dsmls Vol03797 IBM Spectrum Protect Command Line Space Management Client Interface Client Version 8, Release 1, Level 2.0 Client date/time: 09/07/2017 15:08:03 (c) Copyright by IBM Corporation and other(s) 1990, 2017. All Rights Reserved. ActS ResS ResB FSt FName 65536 65536 64 r Vol03797 Is there another way to truncate the file and drop the whole migrated part? -- Lukáš Hejtmánek
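One way to narrow this down is to separate the truncation from the tool issuing it: coreutils truncate performs a single ftruncate() and writes no data, and dsmls shows whether the stub is being filled back in. Treat this purely as a way to observe the behaviour, not as a confirmed workaround, since the recall may be triggered by the truncate itself (file name as in the post):

  truncate -s 64k Vol03797     # plain ftruncate(), no data written
  dsmls Vol03797               # if ResB keeps growing towards ActS, the truncate is driving the recall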
From john.hearns at asml.com Thu Sep 7 16:15:00 2017 From: john.hearns at asml.com (John Hearns) Date: Thu, 7 Sep 2017 15:15:00 +0000 Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig Message-ID: If I have an AFM setup where the home is located on a generic NFS share, let's say server:/volume/share When I come to set this up, does it make sense to run mmafmconfig on /volume/share? I can mount the share as a plain old NFS mount in order to run this operation, before I create the cache side fileset. Apologies if I am being dumb as an ox here, and I deserve to be slapped in the face with a wet fish https://en.wikipedia.org/wiki/The_Fish-Slapping_Dance -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil.wilson at metoffice.gov.uk Thu Sep 7 16:33:58 2017 From: neil.wilson at metoffice.gov.uk (Wilson, Neil) Date: Thu, 7 Sep 2017 15:33:58 +0000 Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig In-Reply-To: References: Message-ID: I think you need to configure a gateway node (use mmchnode to change an existing node class to gateway) Then use mmafmconfig to set up export server maps on the gateway node. e.g. mmafmconfig -add "mapping1" -export-map "nfsServerIP"/"GatewayNode" (double quotes not required) mmafmconfig show all Map name: mapping1 Export server map: IP/GatewayNode From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 07 September 2017 16:15 To: gpfsug main discussion list Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig If I have an AFM setup where the home is located on a generic NFS share, let's say server:/volume/share When I come to set this up, does it make sense to run mmafmconfig on /volume/share? I can mount the share as a plain old NFS mount in order to run this operation, before I create the cache side fileset. Apologies if I am being dumb as an ox here, and I deserve to be slapped in the face with a wet fish https://en.wikipedia.org/wiki/The_Fish-Slapping_Dance -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited.
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From john.hearns at asml.com Thu Sep 7 16:52:19 2017 From: john.hearns at asml.com (John Hearns) Date: Thu, 7 Sep 2017 15:52:19 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Message-ID: Firmly lining myself up for a smack round the chops with a wet haddock... I try to delete an AFM cache fileset which I created a few days ago (it has an NFS home) mmdelfileset responds that: Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL:
From janusz.malka at desy.de Thu Sep 7 20:23:36 2017 From: janusz.malka at desy.de (Malka, Janusz) Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> I had similar issue, I had to recover connection to home From: "John Hearns" To: "gpfsug main discussion list" Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock... I try to delete an AFM cache fileset which I created a few days ago (it has an NFS home) mmdelfileset responds that: Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user.
I find this reference, which is about as useful as a wet haddock: [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Thu Sep 7 22:16:34 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 7 Sep 2017 21:16:34 +0000 Subject: [gpfsug-discuss] SMB2 leases - oplocks - growing files In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Fri Sep 8 03:11:48 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 7 Sep 2017 22:11:48 -0400 Subject: [gpfsug-discuss] mmfsd write behavior Message-ID: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Hi Everyone, This is something that's come up in the past and has recently resurfaced with a project I've been working on, and that is-- it seems to me as though mmfsd never attempts to flush the cache of the block devices its writing to (looking at blktrace output seems to confirm this). Is this actually the case? I've looked at the gpl headers for linux and I don't see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or REQ_FLUSH. I'm sure there's other ways to trigger this behavior that GPFS may very well be using that I've missed. That's why I'm asking :) I figure with FPO being pushed as an HDFS replacement using commodity drives this feature has *got* to be in the code somewhere. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From oehmes at gmail.com Fri Sep 8 03:55:14 2017 From: oehmes at gmail.com (Sven Oehme) Date: Fri, 08 Sep 2017 02:55:14 +0000 Subject: [gpfsug-discuss] mmfsd write behavior In-Reply-To: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> References: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Message-ID: I am not sure what exactly you are looking for but all blockdevices are opened with O_DIRECT , we never cache anything on this layer . 
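To make the distinction in this sub-thread concrete: O_DIRECT bypasses the kernel page cache, but whether the data is on stable media still depends on the device's volatile write cache. A rough illustration from the shell (purely a sketch; /dev/sdX is a placeholder for a scratch disk, and these writes destroy whatever is on it):

# page cache bypassed, yet the drive's volatile write cache may still hold the data
dd if=/dev/zero of=/dev/sdX bs=1M count=16 oflag=direct
# O_DIRECT plus O_SYNC: each write should also force the data down to stable media
dd if=/dev/zero of=/dev/sdX bs=1M count=16 oflag=direct,sync
# query the drive's write-cache setting, or turn the cache off entirely
hdparm -W /dev/sdX
hdparm -W 0 /dev/sdX

Timing the two dd runs against a disk with its write cache enabled usually makes the difference visible.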
On Thu, Sep 7, 2017, 7:11 PM Aaron Knister wrote: > Hi Everyone, > > This is something that's come up in the past and has recently resurfaced > with a project I've been working on, and that is-- it seems to me as > though mmfsd never attempts to flush the cache of the block devices its > writing to (looking at blktrace output seems to confirm this). Is this > actually the case? I've looked at the gpl headers for linux and I don't > see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or > REQ_FLUSH. I'm sure there's other ways to trigger this behavior that > GPFS may very well be using that I've missed. That's why I'm asking :) > > I figure with FPO being pushed as an HDFS replacement using commodity > drives this feature has *got* to be in the code somewhere. > > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Fri Sep 8 04:05:42 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 7 Sep 2017 23:05:42 -0400 Subject: [gpfsug-discuss] mmfsd write behavior In-Reply-To: References: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Message-ID: Thanks Sven. I didn't think GPFS itself was caching anything on that layer, but it's my understanding that O_DIRECT isn't sufficient to force I/O to be flushed (e.g. the device itself might have a volatile caching layer). Take someone using ZFS zvol's as NSDs. I can write() all day log to that zvol (even with O_DIRECT) but there is absolutely no guarantee those writes have been committed to stable storage and aren't just sitting in RAM until an fsync() occurs (or some other bio function that causes a flush). I also don't believe writing to a SATA drive with O_DIRECT will force cache flushes of the drive's writeback cache.. although I just tested that one and it seems to actually trigger a scsi cache sync. Interesting. -Aaron On 9/7/17 10:55 PM, Sven Oehme wrote: > I am not sure what exactly you are looking for but all blockdevices are > opened with O_DIRECT , we never cache anything on this layer . > > > On Thu, Sep 7, 2017, 7:11 PM Aaron Knister > wrote: > > Hi Everyone, > > This is something that's come up in the past and has recently resurfaced > with a project I've been working on, and that is-- it seems to me as > though mmfsd never attempts to flush the cache of the block devices its > writing to (looking at blktrace output seems to confirm this). Is this > actually the case? I've looked at the gpl headers for linux and I don't > see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or > REQ_FLUSH. I'm sure there's other ways to trigger this behavior that > GPFS may very well be using that I've missed. That's why I'm asking :) > > I figure with FPO being pushed as an HDFS replacement using commodity > drives this feature has *got* to be in the code somewhere. 
> > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From aaron.s.knister at nasa.gov Fri Sep 8 04:26:02 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 7 Sep 2017 23:26:02 -0400 Subject: [gpfsug-discuss] Happy 20th birthday GPFS !! In-Reply-To: References: Message-ID: <4a9feeb2-bb9d-8c9a-e506-926d8537cada@nasa.gov> Sounds like celebratory cake is in order for the users group in a few weeks ;) On 9/6/17 7:30 PM, Glen Corneau wrote: > Sorry I missed the anniversary of your conception ?(announcement letter) > back on August 26th, so I hope you'll accept my belated congratulations > on this long and exciting journey! > > https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=ca&infotype=an&appname=iSource&supplier=897&letternum=ENUS297-318 > > I remember your parent, PIOFS, as well! ?Ahh the fun times. > ------------------ > Glen Corneau > Power Systems > Washington Systems Center > gcorneau at us.ibm.com > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From vpuvvada at in.ibm.com Fri Sep 8 06:00:46 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Fri, 8 Sep 2017 10:30:46 +0530 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" To: gpfsug main discussion list Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Sent by: gpfsug-discuss-bounces at spectrumscale.org I had similar issue, I had to recover connection to home From: "John Hearns" To: "gpfsug main discussion list" Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. 
-- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From vpuvvada at in.ibm.com Fri Sep 8 06:21:47 2017 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Fri, 8 Sep 2017 10:51:47 +0530 Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig In-Reply-To: References: Message-ID: mmafmconfig command should be run on the target path (path specified in the afmTarget option when fileset is created). If many filesets are sharing the same target (ex independent writer mode) , enable AFM once on target path. Run the command at home cluster. mmafmconifg enable afmTarget ~Venkat (vpuvvada at in.ibm.com) From: "Wilson, Neil" To: gpfsug main discussion list Date: 09/07/2017 09:04 PM Subject: Re: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig Sent by: gpfsug-discuss-bounces at spectrumscale.org I think you need to configure a gateway node (use mmchnode to change an existing node class to gateway) Then use mmafmconfig to setup export server maps on the gateway node. e.g. mmafmconfig ?add ?mapping1? ?export-map ?nfsServerIP?/?GatewayNode? (double quotes not required) mafmconfig show all Map name: mapping1 Export server map: IP/GatewayNode From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 07 September 2017 16:15 To: gpfsug main discussion list Subject: [gpfsug-discuss] AFM from generic NFS share - mmafmconfig If I have an AFM setup where the home is located on a generic NFS share, let?s say server:/volume/share When I come ot set this up does it make sense to run mmafmconfig on /volume/share ? I can mount the share as a plain old NFS mount in order to run this operation, before I create the cache side fileset. Apologies if I am being dumb as an ox here, and I deserve to be slapped in the face with a wet fish https://en.wikipedia.org/wiki/The_Fish-Slapping_Dance -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). 
Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=kKlSEJqmVE6q8Qt02JNaDLsewp13C0yRAmlfc_djRkk&s=JIbuXlCiReZx3ws5__6juuGC-sAqM74296BuyzgyNYg&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From gellis at ocf.co.uk Fri Sep 8 08:04:51 2017 From: gellis at ocf.co.uk (Georgina Ellis) Date: Fri, 8 Sep 2017 07:04:51 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: References: Message-ID: <0CBB283A-A0A9-4FC9-A1CD-9E019D74CDB9@ocf.co.uk> I am still populating your lot 2 response - it is split across 3 word docs and a whole heap of emails so easier for me to keep going - I dropped u off a lot of emails to save filling your inbox :-) Could you poke around other tenders for the portal question please? Sent from my iPhone > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > From: "Malka, Janusz" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > Content-Type: text/plain; charset="utf-8" > > I had similar issue, I had to recover connection to home > > > From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > Mmdelfileset responds that : > > Fileset obfuscated has 1 fileset snapshot(s). > > > > When I try to delete the snapshot: > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. 
> > > > I find this reference, which is about as useful as a wet haddock: > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > The advice of the gallery is sought, please. > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 2 > Date: Thu, 7 Sep 2017 21:16:34 +0000 > From: "Christof Schmitt" > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > Message-ID: > > > Content-Type: text/plain; charset="us-ascii" > > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > ********************************************** From john.hearns at asml.com Fri Sep 8 08:26:01 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 8 Sep 2017 07:26:01 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? 
Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gellis at ocf.co.uk Fri Sep 8 08:33:51 2017 From: gellis at ocf.co.uk (Georgina Ellis) Date: Fri, 8 Sep 2017 07:33:51 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: References: Message-ID: <93DCF805-F703-4ED5-A079-A44992A9268C@ocf.co.uk> Apologies All, slip of the keyboard and not a comment on GPFS! Sent from my iPhone > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > From: "Malka, Janusz" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > Content-Type: text/plain; charset="utf-8" > > I had similar issue, I had to recover connection to home > > > From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > Mmdelfileset responds that : > > Fileset obfuscated has 1 fileset snapshot(s). > > > > When I try to delete the snapshot: > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. > > > > I find this reference, which is about as useful as a wet haddock: > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > The advice of the gallery is sought, please. > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 2 > Date: Thu, 7 Sep 2017 21:16:34 +0000 > From: "Christof Schmitt" > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > Message-ID: > > > Content-Type: text/plain; charset="us-ascii" > > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > ********************************************** From Sandra.McLaughlin at astrazeneca.com Fri Sep 8 10:12:02 2017 From: Sandra.McLaughlin at astrazeneca.com (McLaughlin, Sandra M) Date: Fri, 8 Sep 2017 09:12:02 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: John, I had a period when I had to delete and remake AFM filesets rather frequently ? this always worked for me: mmunlinkfileset device fset -f mmdelsnapshot device snapname -j fset mmdelfileset device fset -f Sandra From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 08 September 2017 08:26 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. 
-- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. ________________________________ AstraZeneca UK Limited is a company incorporated in England and Wales with registered number:03674842 and its registered office at 1 Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0AA. This e-mail and its attachments are intended for the above named recipient only and may contain confidential and privileged information. If they have come to you in error, you must not copy or show them to anyone; instead, please reply to this e-mail, highlighting the error to the sender and then immediately delete the message. For information about how AstraZeneca UK Limited and its affiliates may process information, personal data and monitor communications, please see our privacy notice at www.astrazeneca.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Sep 8 11:57:14 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 8 Sep 2017 10:57:14 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: Sandra, Thankyou for the help. 
I have a support ticket outstanding, and will see what is suggested. I am sure this is a simple matter of deleting the fileset as you say! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of McLaughlin, Sandra M Sent: Friday, September 08, 2017 11:12 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? John, I had a period when I had to delete and remake AFM filesets rather frequently ? this always worked for me: mmunlinkfileset device fset -f mmdelsnapshot device snapname -j fset mmdelfileset device fset -f Sandra From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 08 September 2017 08:26 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. 
Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. ________________________________ AstraZeneca UK Limited is a company incorporated in England and Wales with registered number:03674842 and its registered office at 1 Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0AA. This e-mail and its attachments are intended for the above named recipient only and may contain confidential and privileged information. If they have come to you in error, you must not copy or show them to anyone; instead, please reply to this e-mail, highlighting the error to the sender and then immediately delete the message. For information about how AstraZeneca UK Limited and its affiliates may process information, personal data and monitor communications, please see our privacy notice at www.astrazeneca.com -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kkr at lbl.gov Fri Sep 8 11:58:05 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Fri, 8 Sep 2017 03:58:05 -0700 Subject: [gpfsug-discuss] Hold the Date - Spectrum Scale Day @ HPCXXL (Sept 2017, NYC) In-Reply-To: <6EF4187F-D8A1-4927-9E4F-4DF703DA04F5@lbl.gov> References: <6EF4187F-D8A1-4927-9E4F-4DF703DA04F5@lbl.gov> Message-ID: Hello, The agenda for the GPFS Day during HPCXXL is fairly fleshed out here: http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ See notes on registration below, which is free but required. Use the HPCXXL registration form, which has a $0 GPFS Day registration option. Hope to see some of you there. Best, Kristy > On Aug 21, 2017, at 3:33 PM, Kristy Kallback-Rose wrote: > > If you plan on attending the GPFS Day, please use the HPCXXL registration form (link to Eventbrite registration at the link below). The GPFS day is a free event, but you *must* register so we can make sure there are enough seats and food available. > > If you would like to speak or suggest a topic, please let me know. > > http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ > > The agenda is still being worked on, here are some likely topics: > > --RoadMap/Updates > --"New features - New Bugs? (Julich) > --GPFS + Openstack (CSCS) > --ORNL Update on Spider3-related GPFS work > --ANL Site Update > --File Corruption Session > > Best, > Kristy > >> On Aug 8, 2017, at 11:33 AM, Kristy Kallback-Rose > wrote: >> >> Hello, >> >> The GPFS Day of the HPCXXL conference is confirmed for Thursday, September 28th. Here is an updated URL, the agenda and registration are still being put together http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ . The GPFS Day will require registration, so we can make sure there is enough room (and coffee/food) for all attendees ?however, there will be no registration fee if you attend the GPFS Day only. >> >> I?ll send another update when the agenda is closer to settled. >> >> Cheers, >> Kristy >> >>> On Jul 7, 2017, at 3:32 PM, Kristy Kallback-Rose > wrote: >>> >>> Hello, >>> >>> More details will be provided as they become available, but just so you can make a placeholder on your calendar, there will be a Spectrum Scale Day the week of September 25th - 29th, likely Thursday, September 28th. >>> >>> This will be a part of the larger HPCXXL meeting (https://www.spxxl.org/?q=New-York-City-2017 ). You may recall this group was formerly called SPXXL and the website is in the process of transitioning to the new name (and getting a new certificate). You will be able to attend *just* the Spectrum Scale day if that is the only portion of the event you would like to attend. >>> >>> The NYC location is great for Spectrum Scale events because many IBMers, including developers, can come in from Poughkeepsie. >>> >>> More as we get closer to the date and details are settled. >>> >>> Cheers, >>> Kristy >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From hpc.ken.tw25qn at gmail.com Fri Sep 8 19:30:32 2017 From: hpc.ken.tw25qn at gmail.com (Ken Atkinson) Date: Fri, 8 Sep 2017 19:30:32 +0100 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: References: <93DCF805-F703-4ED5-A079-A44992A9268C@ocf.co.uk> Message-ID: Not on too many G&Ts Georgina? How are things. 
Ken Atkinson On 8 Sep 2017 08:33, "Georgina Ellis" wrote: Apologies All, slip of the keyboard and not a comment on GPFS! Sent from my iPhone > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > From: "Malka, Janusz" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > Content-Type: text/plain; charset="utf-8" > > I had similar issue, I had to recover connection to home > > > From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > Mmdelfileset responds that : > > Fileset obfuscated has 1 fileset snapshot(s). > > > > When I try to delete the snapshot: > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. > > > > I find this reference, which is about as useful as a wet haddock: > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2. 3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2. 3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > The advice of the gallery is sought, please. > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: > > ------------------------------ > > Message: 2 > Date: Thu, 7 Sep 2017 21:16:34 +0000 > From: "Christof Schmitt" > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > Message-ID: > > > Content-Type: text/plain; charset="us-ascii" > > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > ********************************************** _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Fri Sep 8 22:14:04 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 8 Sep 2017 17:14:04 -0400 Subject: [gpfsug-discuss] multicluster security In-Reply-To: References: <83936033-ce82-0a9b-3714-1dbea4c317db@nasa.gov> Message-ID: <529f575b-eb11-a81e-2905-ab3237d41678@nasa.gov> Interesting! Thank you for the explanation. This makes me wish GPFS had a client access model that more closely mimicked parallel NAS, specifically for this reason. That then got me wondering about pNFS support. I've not been able to find much about that but in theory Ganesha supports pNFS. Does anyone know of successful pNFS testing with GPFS and if so how one would set up such a thing? -Aaron On 08/25/2017 06:41 PM, IBM Spectrum Scale wrote: > > Hi Aaron, > > If cluster A uses the mmauth command to grant a file system read-only > access to a remote cluster B, nodes on cluster B can only mount that > file system with read-only access. But the only checking being done at > the RPC level is the TLS authentication. This should prevent non-root > users from initiating RPCs, since TLS authentication requires access > to the local cluster's private key. However, a root user on cluster B, > having access to cluster B's private key, might be able to craft RPCs > that may allow one to work around the checks which are implemented at > the file system level. > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum > Scale (GPFS), then please post it to the public IBM developerWroks > Forum at > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. > > > If your query concerns a potential software error in Spectrum Scale > (GPFS) and you have an IBM software maintenance contract please > contact 1-800-237-5511 in the United States or your local IBM Service > Center in other countries. > > The forum is informally monitored as time permits and should not be > used for priority messages to the Spectrum Scale (GPFS) team. > > Inactive hide details for Aaron Knister ---08/21/2017 11:04:06 PM---Hi > Everyone, I have a theoretical question about GPFS multiAaron Knister > ---08/21/2017 11:04:06 PM---Hi Everyone, I have a theoretical question > about GPFS multiclusters and security. 
> > From: Aaron Knister > To: gpfsug main discussion list > Date: 08/21/2017 11:04 PM > Subject: [gpfsug-discuss] multicluster security > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Hi Everyone, > > I have a theoretical question about GPFS multiclusters and security. > Let's say I have clusters A and B. Cluster A is exporting a filesystem > as read-only to cluster B. > > Where does the authorization burden lay? Meaning, does the security rely > on mmfsd in cluster B to behave itself and enforce the conditions of the > multi-cluster export? Could someone using the credentials on a > compromised node in cluster B just start sending arbitrary nsd > read/write commands to the nsds from cluster A (or something along those > lines)? Do the NSD servers in cluster A do any sort of sanity or > security checking on the I/O requests coming from cluster B to the NSDs > they're serving to exported filesystems? > > I imagine any enforcement would go out the window with shared disks in a > multi-cluster environment since a compromised node could just "dd" over > the LUNs. > > Thanks! > > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=oK_bEPbjuD7j6qLTHbe7HM4ujUlpcNYtX3tMW2QC7_w&s=BliMQ0pToLIIiO1jfyUp2Q3icewcONrcmHpsIj_hMtY&e= > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From oehmes at gmail.com Fri Sep 8 22:21:00 2017 From: oehmes at gmail.com (Sven Oehme) Date: Fri, 08 Sep 2017 21:21:00 +0000 Subject: [gpfsug-discuss] mmfsd write behavior In-Reply-To: References: <0f61621f-84d9-e249-0dd7-c1a4d50fea86@nasa.gov> Message-ID: Hi, the code assumption is that the underlying device has no volatile write cache, i was absolute sure we have that somewhere in the FAQ, but i couldn't find it, so i will talk to somebody to correct this. if i understand https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt correct one could enforce this by setting REQ_FUA, but thats not explicitly set today, at least i can't see it. i will discuss this with one of our devs who owns this code and come back. sven On Thu, Sep 7, 2017 at 8:05 PM Aaron Knister wrote: > Thanks Sven. I didn't think GPFS itself was caching anything on that > layer, but it's my understanding that O_DIRECT isn't sufficient to force > I/O to be flushed (e.g. the device itself might have a volatile caching > layer). Take someone using ZFS zvol's as NSDs. I can write() all day log > to that zvol (even with O_DIRECT) but there is absolutely no guarantee > those writes have been committed to stable storage and aren't just > sitting in RAM until an fsync() occurs (or some other bio function that > causes a flush). I also don't believe writing to a SATA drive with > O_DIRECT will force cache flushes of the drive's writeback cache.. 
> although I just tested that one and it seems to actually trigger a scsi > cache sync. Interesting. > > -Aaron > > On 9/7/17 10:55 PM, Sven Oehme wrote: > > I am not sure what exactly you are looking for but all blockdevices are > > opened with O_DIRECT , we never cache anything on this layer . > > > > > > On Thu, Sep 7, 2017, 7:11 PM Aaron Knister > > wrote: > > > > Hi Everyone, > > > > This is something that's come up in the past and has recently > resurfaced > > with a project I've been working on, and that is-- it seems to me as > > though mmfsd never attempts to flush the cache of the block devices > its > > writing to (looking at blktrace output seems to confirm this). Is > this > > actually the case? I've looked at the gpl headers for linux and I > don't > > see any sign of blkdev_fsync, blkdev_issue_flush, WRITE_FLUSH, or > > REQ_FLUSH. I'm sure there's other ways to trigger this behavior that > > GPFS may very well be using that I've missed. That's why I'm asking > :) > > > > I figure with FPO being pushed as an HDFS replacement using commodity > > drives this feature has *got* to be in the code somewhere. > > > > -Aaron > > > > -- > > Aaron Knister > > NASA Center for Climate Simulation (Code 606.2) > > Goddard Space Flight Center > > (301) 286-2776 > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Sat Sep 9 09:05:31 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Sat, 9 Sep 2017 10:05:31 +0200 Subject: [gpfsug-discuss] multicluster security In-Reply-To: <529f575b-eb11-a81e-2905-ab3237d41678@nasa.gov> References: <83936033-ce82-0a9b-3714-1dbea4c317db@nasa.gov> <529f575b-eb11-a81e-2905-ab3237d41678@nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 105 bytes Desc: not available URL: From aaron.s.knister at nasa.gov Mon Sep 11 01:43:56 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Sun, 10 Sep 2017 20:43:56 -0400 Subject: [gpfsug-discuss] tuning parameters question Message-ID: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> Hi All (but mostly Sven), I stumbled across this great gem: files.gpfsug.org/presentations/2014/UG10_GPFS_Performance_Session_v10.pdf and I'm wondering which, if any, of those tuning parameters are still relevant with the 4.2.3 code. Specifically for a CNFS cluster. I'm exporting a gpfs fs as an NFS root to 1k nodes. The boot storm is particularly ugly and the storage doesn't appear to be bottlenecked. 
I see a lot of waiters like these: Waiting 0.0009 sec since 20:41:31, monitored, thread 2881 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 26231 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 26146 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 18637 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 25013 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 27879 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 26553 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 25334 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' Waiting 0.0009 sec since 20:41:31, monitored, thread 25337 InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), reason 'waiting for LX lock' and I'm wondering if there's anything immediate one would suggest to help with that. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From aaron.s.knister at nasa.gov Mon Sep 11 01:50:39 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Sun, 10 Sep 2017 20:50:39 -0400 Subject: [gpfsug-discuss] tuning parameters question In-Reply-To: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> References: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> Message-ID: <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> As an aside, my initial attempt was to use Ganesha via CES but the performance was significantly worse than CNFS for this workload. The docs seem to suggest that CNFS performs better for metadata intensive workloads which certainly seems to fit the bill here. -Aaron On 9/10/17 8:43 PM, Aaron Knister wrote: > Hi All (but mostly Sven), > > I stumbled across this great gem: > > files.gpfsug.org/presentations/2014/UG10_GPFS_Performance_Session_v10.pdf > > and I'm wondering which, if any, of those tuning parameters are still > relevant with the 4.2.3 code. Specifically for a CNFS cluster. I'm > exporting a gpfs fs as an NFS root to 1k nodes. The boot storm is > particularly ugly and the storage doesn't appear to be bottlenecked. 
> > I see a lot of waiters like these: > > Waiting 0.0009 sec since 20:41:31, monitored, thread 2881 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 26231 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 26146 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 18637 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 25013 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 27879 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 26553 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 25334 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > Waiting 0.0009 sec since 20:41:31, monitored, thread 25337 > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > reason 'waiting for LX lock' > > and I'm wondering if there's anything immediate one would suggest to > help with that. > > -Aaron > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From stefan.dietrich at desy.de Mon Sep 11 08:40:14 2017 From: stefan.dietrich at desy.de (Dietrich, Stefan) Date: Mon, 11 Sep 2017 09:40:14 +0200 (CEST) Subject: [gpfsug-discuss] Switch from IPoIB connected mode to datagram with ESS 5.2.0? Message-ID: <743361352.9211728.1505115614463.JavaMail.zimbra@desy.de> Hello, during reading the upgrade docs for ESS 5.2.0, I noticed a change in the IPoIB mode. Now it specifies, that datagram (CONNECTED_MODE=no) instead of connected mode should be used. All earlier versions used connected mode. I am wondering about the reason for this change? Or is this only relevant for bonded IPoIB interfaces? Regards, Stefan -- ------------------------------------------------------------------------ Stefan Dietrich Deutsches Elektronen-Synchrotron (IT-Systems) Ein Forschungszentrum der Helmholtz-Gemeinschaft Notkestr. 85 phone: +49-40-8998-4696 22607 Hamburg e-mail: stefan.dietrich at desy.de Germany ------------------------------------------------------------------------ From john.hearns at asml.com Mon Sep 11 08:41:54 2017 From: john.hearns at asml.com (John Hearns) Date: Mon, 11 Sep 2017 07:41:54 +0000 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? In-Reply-To: References: <950421843.15715748.1504812216286.JavaMail.zimbra@desy.de> Message-ID: Thankyou all for advice. The ?-p? option was the fix here (thankyou to IBM support). From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of McLaughlin, Sandra M Sent: Friday, September 08, 2017 11:12 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? John, I had a period when I had to delete and remake AFM filesets rather frequently ? 
this always worked for me: mmunlinkfileset device fset -f mmdelsnapshot device snapname -j fset mmdelfileset device fset -f Sandra From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 08 September 2017 08:26 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Venkat, thankyou. I have a support ticket open on this issue, and will keep this option handy! From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Venkateswara R Puvvada Sent: Friday, September 08, 2017 7:01 AM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Snapshots created by AFM (recovery, resync or peer snapshots) cannot be deleted by user using mmdelsnapshot command directly. After recovery or resync completion they get deleted automatically. For peer snapshots deletion mmpsnasp command is used. Which version of GPFS? Try with -p (undocumented) option. mmdelsnapshot device snapname -j fset -p ~Venkat (vpuvvada at in.ibm.com) From: "Malka, Janusz" > To: gpfsug main discussion list > Date: 09/08/2017 12:53 AM Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I had similar issue, I had to recover connection to home ________________________________ From: "John Hearns" > To: "gpfsug main discussion list" > Sent: Thursday, 7 September, 2017 17:52:19 Subject: [gpfsug-discuss] Deletion of a pcache snapshot? Firmly lining myself up for a smack round the chops with a wet haddock? I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) Mmdelfileset responds that : Fileset obfuscated has 1 fileset snapshot(s). When I try to delete the snapshot: Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. I find this reference, which is about as useful as a wet haddock: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm The advice of the gallery is sought, please. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. 
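Pulling the suggestions in this thread together, the sequence that worked
looks roughly like the following (device, fileset and snapshot names are
placeholders, and -p is an undocumented flag, so use it only as advised by
IBM support):

# Find the internal pcache recovery snapshot attached to the cache fileset
mmlssnapshot fs1 -j cachefset

# Unlink the cache fileset, force-delete the recovery snapshot with the
# undocumented -p option, then delete the fileset itself
mmunlinkfileset fs1 cachefset -f
mmdelsnapshot fs1 cachefset.afm.1234 -j cachefset -p
mmdelfileset fs1 cachefset -f
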
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=VX-Y-5GtodMzrpt1D71-1OOY3hu2UTJBg45sqxTHC8I&s=AmQf6iZeaanuNkB3Ys2lR8Ajk-n2TUJ6Wbt2z2pnbKI&e= -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. ________________________________ AstraZeneca UK Limited is a company incorporated in England and Wales with registered number:03674842 and its registered office at 1 Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0AA. This e-mail and its attachments are intended for the above named recipient only and may contain confidential and privileged information. If they have come to you in error, you must not copy or show them to anyone; instead, please reply to this e-mail, highlighting the error to the sender and then immediately delete the message. For information about how AstraZeneca UK Limited and its affiliates may process information, personal data and monitor communications, please see our privacy notice at www.astrazeneca.com -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From olaf.weiser at de.ibm.com Mon Sep 11 09:11:15 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Mon, 11 Sep 2017 10:11:15 +0200 Subject: [gpfsug-discuss] tuning parameters question In-Reply-To: <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> References: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: From ed.swindelles at uconn.edu Mon Sep 11 16:49:15 2017 From: ed.swindelles at uconn.edu (Swindelles, Ed) Date: Mon, 11 Sep 2017 15:49:15 +0000 Subject: [gpfsug-discuss] UConn hiring GPFS administrator Message-ID: The University of Connecticut is hiring three full time, permanent technical positions for its HPC team on the Storrs campus. One of these positions is focused on storage administration, including a GPFS cluster. I would greatly appreciate it if you would forward this announcement to contacts of yours who may have an interest in these positions. Here are direct links to the job descriptions and applications: HPC Storage Administrator http://s.uconn.edu/3tx HPC Systems Administrator (2 positions to be filled) http://s.uconn.edu/3tw Thank you, -- Ed Swindelles Team Lead for Research Technology University of Connecticut 860-486-4522 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Mon Sep 11 23:15:10 2017 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 11 Sep 2017 18:15:10 -0400 Subject: [gpfsug-discuss] tuning parameters question In-Reply-To: References: <21830c8a-4a00-c75a-76aa-772c261c00eb@nasa.gov> <25e8995a-7b5b-05d3-016a-4941fb75dcd6@nasa.gov> Message-ID: <9de64193-c60c-8ee1-b681-6cfe3993772b@nasa.gov> Thanks, Olaf. I ended up un-setting a bunch of settings that are now auto-tuned (worker1threads, worker3threads, etc.) and just set workerthreads as you suggest. That combined with increasing maxfilestocache to above the max concurrent open file threshold of the workload got me consistently with in 1%-3% of the performance of the same storage hardware running btrfs instead of GPFS. I think that's pretty darned good considering the additional complexity GPFS has over btrfs of being a clustered filesystem. Plus I now get NFS server failover for very little effort and without having to deal with corosync or pacemaker. -Aaron On 9/11/17 4:11 AM, Olaf Weiser wrote: > Hi Aaron , > > 0,0009 s response time for your meta data IO ... seems to be a very > good/fast storage BE.. which is hard to improve.. > you can raise the parallelism a bit for accessing metadata , but if this > will help to improve your "workload" is not assured > > The worker3threads parameter specifies the number of threads to use for > inode prefetch. Usually , I would suggest, that you should not touch > single parameters any longer. By the great improvements of the last few > releases.. GPFS can calculate / retrieve the right settings > semi-automatically... > You only need to set simpler "workerThreads" .. > > But in your case , you can see, if this more specific value will help > you out . > > depending on your blocksize and average filesize .. you may see > additional improvements when tuning nfsPrefetchStrategy , which tells > GPFS to consider all IOs wihtin */N/* blockboundaries as sequential ?and > starts prefetch > > l.b.n.t. set ignoreprefetchLunCount to yes .. (if not already done) . 
> this helps GPFS to use all available workerThreads > > cheers > olaf > > > > From: Aaron Knister > To: > Date: 09/11/2017 02:50 AM > Subject: Re: [gpfsug-discuss] tuning parameters question > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------------------------------------------------ > > > > As an aside, my initial attempt was to use Ganesha via CES but the > performance was significantly worse than CNFS for this workload. The > docs seem to suggest that CNFS performs better for metadata intensive > workloads which certainly seems to fit the bill here. > > -Aaron > > On 9/10/17 8:43 PM, Aaron Knister wrote: > > Hi All (but mostly Sven), > > > > I stumbled across this great gem: > > > > files.gpfsug.org/presentations/2014/UG10_GPFS_Performance_Session_v10.pdf > > > > and I'm wondering which, if any, of those tuning parameters are still > > relevant with the 4.2.3 code. Specifically for a CNFS cluster. I'm > > exporting a gpfs fs as an NFS root to 1k nodes. The boot storm is > > particularly ugly and the storage doesn't appear to be bottlenecked. > > > > I see a lot of waiters like these: > > > > Waiting 0.0009 sec since 20:41:31, monitored, thread 2881 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 26231 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 26146 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 18637 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 25013 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 27879 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 26553 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 25334 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > Waiting 0.0009 sec since 20:41:31, monitored, thread 25337 > > InodePrefetchWorkerThread: on ThCond 0x1800635A120 (LkObjCondvar), > > reason 'waiting for LX lock' > > > > and I'm wondering if there's anything immediate one would suggest to > > help with that. 
> > > > -Aaron > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From zacekm at img.cas.cz Tue Sep 12 10:40:35 2017 From: zacekm at img.cas.cz (Michal Zacek) Date: Tue, 12 Sep 2017 11:40:35 +0200 Subject: [gpfsug-discuss] Wrong nodename after server restart Message-ID: <7e565699-b583-eeeb-c6b9-d11a39206331@img.cas.cz> Hi, I had to restart two of my gpfs servers (gpfs-n4 and gpfs-quorum) and after that I was unable to move CES IP address back with strange error "mmces address move: GPFS is down on this node". After I double checked that gpfs state is active on all nodes, I dug deeper and I think I found problem, but I don't really know how this could happen. Look at the names of nodes: [root at gpfs-n2 ~]# mmlscluster # Looks good GPFS cluster information ======================== GPFS cluster name: gpfscl1.img.local GPFS cluster id: 17792677515884116443 GPFS UID domain: img.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------------- 1 gpfs-n4.img.local 192.168.20.64 gpfs-n4.img.local quorum-manager 2 gpfs-quorum.img.local 192.168.20.60 gpfs-quorum.img.local quorum 3 gpfs-n3.img.local 192.168.20.63 gpfs-n3.img.local quorum-manager 4 tau.img.local 192.168.1.248 tau.img.local 5 gpfs-n1.img.local 192.168.20.61 gpfs-n1.img.local quorum-manager 6 gpfs-n2.img.local 192.168.20.62 gpfs-n2.img.local quorum-manager 8 whale.img.cas.cz 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# mmlsmount gpfs01 -L # not so good File system gpfs01 is mounted on 7 nodes: 192.168.20.63 gpfs-n3 192.168.20.61 gpfs-n1 192.168.20.62 gpfs-n2 192.168.1.248 tau 192.168.20.64 gpfs-n4.img.local 192.168.20.60 gpfs-quorum.img.local 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# tsctl shownodes up | tr ',' '\n' # very wrong whale.img.cas.cz.img.local tau.img.local gpfs-quorum.img.local.img.local gpfs-n1.img.local gpfs-n2.img.local gpfs-n3.img.local gpfs-n4.img.local.img.local The "tsctl shownodes up" is the reason why I'm not able to move CES address back to gpfs-n4 node, but the real problem are different nodenames. I think OS is configured correctly: [root at gpfs-n4 /]# hostname gpfs-n4 [root at gpfs-n4 /]# hostname -f gpfs-n4.img.local [root at gpfs-n4 /]# cat /etc/resolv.conf nameserver 192.168.20.30 nameserver 147.231.150.2 search img.local domain img.local [root at gpfs-n4 /]# cat /etc/hosts | grep gpfs-n4 192.168.20.64 gpfs-n4.img.local gpfs-n4 [root at gpfs-n4 /]# host gpfs-n4 gpfs-n4.img.local has address 192.168.20.64 [root at gpfs-n4 /]# host 192.168.20.64 64.20.168.192.in-addr.arpa domain name pointer gpfs-n4.img.local. Can someone help me with this. Thanks, Michal p.s. 
gpfs version: 4.2.3-2 (CentOS 7) From secretary at gpfsug.org Tue Sep 12 15:22:41 2017 From: secretary at gpfsug.org (Secretary GPFS UG) Date: Tue, 12 Sep 2017 15:22:41 +0100 Subject: [gpfsug-discuss] SS UG UK 2018 Message-ID: Dear all, A date for your diary, #SSUG18 in the UK will be taking place on April 18th & 19th 2018. Please mark it in your diaries now! We'll confirm other details (venue, agenda etc.) nearer the time, but the date is confirmed. Thanks, -- Claire O'Toole Spectrum Scale/GPFS User Group Secretary +44 (0)7508 033896 www.spectrumscaleug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Tue Sep 12 16:01:21 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 12 Sep 2017 11:01:21 -0400 Subject: [gpfsug-discuss] Wrong nodename after server restart In-Reply-To: <7e565699-b583-eeeb-c6b9-d11a39206331@img.cas.cz> References: <7e565699-b583-eeeb-c6b9-d11a39206331@img.cas.cz> Message-ID: Michal, When a node is added to a cluster that has a different domain than the rest of the nodes in the cluster, the GPFS daemons running on the various nodes can develop an inconsistent understanding of what the common suffix of all the domain names are. The symptoms you show with the "tsctl shownodes up" output, and in particular the incorrect node names of the two nodes you restarted, as seen on a node you did not restart, are consistent with this problem. I also note your cluster appears to have the necessary pre-condition to trip on this problem, whale.img.cas.cz does not share a common suffix with the other nodes in the cluster. The common suffix of the other nodes in the cluster is ".img.local". Was whale.img.cas.cz recently added to the cluster? Unfortunately, the general work-around is to recycle all the nodes at once: mmshutdown -a, followed by mmstartup -a. I hope this helps. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Michal Zacek To: gpfsug-discuss at spectrumscale.org Date: 09/12/2017 05:41 AM Subject: [gpfsug-discuss] Wrong nodename after server restart Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I had to restart two of my gpfs servers (gpfs-n4 and gpfs-quorum) and after that I was unable to move CES IP address back with strange error "mmces address move: GPFS is down on this node". After I double checked that gpfs state is active on all nodes, I dug deeper and I think I found problem, but I don't really know how this could happen. 
Look at the names of nodes: [root at gpfs-n2 ~]# mmlscluster # Looks good GPFS cluster information ======================== GPFS cluster name: gpfscl1.img.local GPFS cluster id: 17792677515884116443 GPFS UID domain: img.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------------- 1 gpfs-n4.img.local 192.168.20.64 gpfs-n4.img.local quorum-manager 2 gpfs-quorum.img.local 192.168.20.60 gpfs-quorum.img.local quorum 3 gpfs-n3.img.local 192.168.20.63 gpfs-n3.img.local quorum-manager 4 tau.img.local 192.168.1.248 tau.img.local 5 gpfs-n1.img.local 192.168.20.61 gpfs-n1.img.local quorum-manager 6 gpfs-n2.img.local 192.168.20.62 gpfs-n2.img.local quorum-manager 8 whale.img.cas.cz 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# mmlsmount gpfs01 -L # not so good File system gpfs01 is mounted on 7 nodes: 192.168.20.63 gpfs-n3 192.168.20.61 gpfs-n1 192.168.20.62 gpfs-n2 192.168.1.248 tau 192.168.20.64 gpfs-n4.img.local 192.168.20.60 gpfs-quorum.img.local 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# tsctl shownodes up | tr ',' '\n' # very wrong whale.img.cas.cz.img.local tau.img.local gpfs-quorum.img.local.img.local gpfs-n1.img.local gpfs-n2.img.local gpfs-n3.img.local gpfs-n4.img.local.img.local The "tsctl shownodes up" is the reason why I'm not able to move CES address back to gpfs-n4 node, but the real problem are different nodenames. I think OS is configured correctly: [root at gpfs-n4 /]# hostname gpfs-n4 [root at gpfs-n4 /]# hostname -f gpfs-n4.img.local [root at gpfs-n4 /]# cat /etc/resolv.conf nameserver 192.168.20.30 nameserver 147.231.150.2 search img.local domain img.local [root at gpfs-n4 /]# cat /etc/hosts | grep gpfs-n4 192.168.20.64 gpfs-n4.img.local gpfs-n4 [root at gpfs-n4 /]# host gpfs-n4 gpfs-n4.img.local has address 192.168.20.64 [root at gpfs-n4 /]# host 192.168.20.64 64.20.168.192.in-addr.arpa domain name pointer gpfs-n4.img.local. Can someone help me with this. Thanks, Michal p.s. gpfs version: 4.2.3-2 (CentOS 7) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=l_sz-tPolX87WmSf2zBhhPpggnfQJKp7-BqV8euBp7A&s=XSPGkKRMza8PhYQg8AxeKW9cOTNeCI9uph486_6Xajo&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at uk.ibm.com Tue Sep 12 16:36:06 2017 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Tue, 12 Sep 2017 15:36:06 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 68, Issue 13 In-Reply-To: Message-ID: Well George is not the only one to have replied to the list with a one to one message. ? Remember folks, this mailing list has a *lot* of people on it. Hope my message is last that forgets who is in the 'To' field. Daniel Daniel Kidger Technical Sales Specialist, IBM UK IBM Spectrum Storage Software daniel.kidger at uk.ibm.com +44 (0)7818 522266 > On 8 Sep 2017, at 19:30, Ken Atkinson wrote: > > Not on too many G&Ts Georgina? > How are things. > Ken Atkinson > > On 8 Sep 2017 08:33, "Georgina Ellis" wrote: > Apologies All, slip of the keyboard and not a comment on GPFS! 
> > Sent from my iPhone > > > On 7 Sep 2017, at 22:16, "gpfsug-discuss-request at spectrumscale.org" wrote: > > > > Send gpfsug-discuss mailing list submissions to > > gpfsug-discuss at spectrumscale.org > > > > To subscribe or unsubscribe via the World Wide Web, visit > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > or, via email, send a message with subject or body 'help' to > > gpfsug-discuss-request at spectrumscale.org > > > > You can reach the person managing the list at > > gpfsug-discuss-owner at spectrumscale.org > > > > When replying, please edit your Subject line so it is more specific > > than "Re: Contents of gpfsug-discuss digest..." > > > > > > Today's Topics: > > > > 1. Re: Deletion of a pcache snapshot? (Malka, Janusz) > > 2. Re: SMB2 leases - oplocks - growing files (Christof Schmitt) > > > > > > ---------------------------------------------------------------------- > > > > Message: 1 > > Date: Thu, 7 Sep 2017 21:23:36 +0200 (CEST) > > From: "Malka, Janusz" > > To: gpfsug main discussion list > > Subject: Re: [gpfsug-discuss] Deletion of a pcache snapshot? > > Message-ID: <950421843.15715748.1504812216286.JavaMail.zimbra at desy.de> > > Content-Type: text/plain; charset="utf-8" > > > > I had similar issue, I had to recover connection to home > > > > > > From: "John Hearns" > > To: "gpfsug main discussion list" > > Sent: Thursday, 7 September, 2017 17:52:19 > > Subject: [gpfsug-discuss] Deletion of a pcache snapshot? > > > > > > > > Firmly lining myself up for a smack round the chops with a wet haddock? > > > > I try to delete an AFM cache fileset which I create da few days ago (it has an NFS home) > > > > Mmdelfileset responds that : > > > > Fileset obfuscated has 1 fileset snapshot(s). > > > > > > > > When I try to delete the snapshot: > > > > Snapshot obfuscated.afm.1234 is an internal pcache recovery snapshot and cannot be deleted by user. > > > > > > > > I find this reference, which is about as useful as a wet haddock: > > > > [ https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm | https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/6027-3321.htm ] > > > > > > > > The advice of the gallery is sought, please. > > > > > > > > > > > > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- > > An HTML attachment was scrubbed... 
> > URL: > > > > ------------------------------ > > > > Message: 2 > > Date: Thu, 7 Sep 2017 21:16:34 +0000 > > From: "Christof Schmitt" > > To: gpfsug-discuss at spectrumscale.org > > Cc: gpfsug-discuss at spectrumscale.org > > Subject: Re: [gpfsug-discuss] SMB2 leases - oplocks - growing files > > Message-ID: > > > > > > Content-Type: text/plain; charset="us-ascii" > > > > An HTML attachment was scrubbed... > > URL: > > > > ------------------------------ > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > End of gpfsug-discuss Digest, Vol 68, Issue 13 > > ********************************************** > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.mills at nasa.gov Tue Sep 12 17:06:23 2017 From: jonathan.mills at nasa.gov (Jonathan Mills) Date: Tue, 12 Sep 2017 12:06:23 -0400 (EDT) Subject: [gpfsug-discuss] Support for SLES 12 SP3 Message-ID: SLES 12 SP3 has been released. And for what it?s worth, there does not appear to be substantial changes in either kernel or glibc as compared to SLES 12 SP2. In fact, the latest SLES 12 SP2 kernel is ?4.4.74-92.29?, while the initial SLES 12 SP3 kernel is ?4.4.73-5.1?. Given this, I wanted to ask the team at IBM: 1) have you begun looking into SLES 12 SP3 yet? 2) if so, do you have any idea when you might release a fully supported version of Spectrum Scale for SLES 12 SP3? Those of us who run SLES and are looking to deploy new infrastructure this fall would prefer to do so on the latest rev of our OS, as opposed to one that is already on life support... -- Jonathan Mills / jonathan.mills at nasa.gov NASA GSFC / NCCS HPC (606.2) Bldg 28, Rm. S230 / c. 252-412-5710 From Greg.Lehmann at csiro.au Wed Sep 13 00:12:55 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Tue, 12 Sep 2017 23:12:55 +0000 Subject: [gpfsug-discuss] Support for SLES 12 SP3 In-Reply-To: References: Message-ID: <67f390a558244c41b154a7a6a9e5efe8@exch1-cdc.nexus.csiro.au> +1. We are interested in SLES 12 SP3 too. BTW had anybody done any comparisons of SLES 12 SP2 (4.4) kernel vs RHEL 7.3 in terms of GPFS IO performance? I would think the 4.4 kernel might give it an edge. I'll probably get around to comparing them myself one day, but if anyone else has some numbers... -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Mills Sent: Wednesday, 13 September 2017 2:06 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Support for SLES 12 SP3 SLES 12 SP3 has been released. And for what it?s worth, there does not appear to be substantial changes in either kernel or glibc as compared to SLES 12 SP2. In fact, the latest SLES 12 SP2 kernel is ?4.4.74-92.29?, while the initial SLES 12 SP3 kernel is ?4.4.73-5.1?. Given this, I wanted to ask the team at IBM: 1) have you begun looking into SLES 12 SP3 yet? 2) if so, do you have any idea when you might release a fully supported version of Spectrum Scale for SLES 12 SP3? 
Those of us who run SLES and are looking to deploy new infrastructure this fall would prefer to do so on the latest rev of our OS, as opposed to one that is already on life support... -- Jonathan Mills / jonathan.mills at nasa.gov NASA GSFC / NCCS HPC (606.2) Bldg 28, Rm. S230 / c. 252-412-5710 From scale at us.ibm.com Wed Sep 13 22:33:30 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 13 Sep 2017 17:33:30 -0400 Subject: [gpfsug-discuss] Fw: Wrong nodename after server restart Message-ID: ----- Forwarded by Eric Agar/Poughkeepsie/IBM on 09/13/2017 05:32 PM ----- From: IBM Spectrum Scale/Poughkeepsie/IBM To: Michal Zacek Date: 09/13/2017 05:29 PM Subject: Re: [gpfsug-discuss] Wrong nodename after server restart Sent by: Eric Agar Hello Michal, It should not be necessary to delete whale.img.cas.cz and rename it. But, that is an option you can take, if you prefer it. If you decide to take that option, please see the last paragraph of this response. The confusion starts at the moment a node is added to the active cluster where the new node does not have the same common domain suffix as the nodes that were already in the cluster. The confusion increases when the GPFS daemons on some nodes, but not all nodes, are recycled. Doing mmshutdown -a, followed by mmstartup -a, once after the new node has been added allows all GPFS daemons on all nodes to come up at the same time and arrive at the same answer to the question, "what is the common domain suffix for all the nodes in the cluster now?" In the case of your cluster, the answer will be "the common domain suffix is the empty string" or, put another way, "there is no common domain suffix"; that is okay, as long as all the GPFS daemons come to the same conclusion. After you recycle the cluster, you can check to make sure all seems well by running "tsctl shownodes up" on every node, and make sure the answer is correct on each node. If the mmshutdown -a / mmstartup -a recycle works, the problem should not recur with the current set of nodes in the cluster. Even as individual GPFS daemons are recycled going forward, they should still understand the cluster's nodes have no common domain suffix. However, I can imagine sequences of events that would cause the issue to occur again after nodes are deleted or added to the cluster while the cluster is active. For example, if whale.img.cas.cz were to be deleted from the current cluster, that action would restore the cluster to having a common domain suffix of ".img.local", but already running GPFS daemons would not realize it. If the delete of whale occurred while the cluster was active, subsequent recycling of the GPFS daemon on just a subset of the nodes would cause the recycled daemons to understand the common domain suffix to now be ".img.local". But, daemons that had not been recycled would still think there is no common domain suffix. The confusion would occur again. On the other hand, adding and deleting nodes to/from the cluster should not cause the issue to occur again as long as the cluster continues to have the same (in this case, no) common domain suffix. If you decide to delete whale.img.case.cz, rename it to have the ".img.local" domain suffix, and add it back to the cluster, it would be best to do so after all the GPFS daemons are shut down with mmshutdown -a, but before any of the daemons are restarted with mmstartup. This would allow all the subsequent running daemons to come to the conclusion that ".img.local" is now the common domain suffix. I hope this helps. 
Regards, Eric Agar Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Michal Zacek To: IBM Spectrum Scale Date: 09/13/2017 03:42 AM Subject: Re: [gpfsug-discuss] Wrong nodename after server restart Hello yes you are correct, Whale was added two days a go. It's necessary to delete whale.img.cas.cz from cluster before mmshutdown/mmstartup? If the two domains may cause problems in the future I can rename whale (and all planed nodes) to img.local suffix. Many thanks for the prompt reply. Regards Michal Dne 12.9.2017 v 17:01 IBM Spectrum Scale napsal(a): Michal, When a node is added to a cluster that has a different domain than the rest of the nodes in the cluster, the GPFS daemons running on the various nodes can develop an inconsistent understanding of what the common suffix of all the domain names are. The symptoms you show with the "tsctl shownodes up" output, and in particular the incorrect node names of the two nodes you restarted, as seen on a node you did not restart, are consistent with this problem. I also note your cluster appears to have the necessary pre-condition to trip on this problem, whale.img.cas.cz does not share a common suffix with the other nodes in the cluster. The common suffix of the other nodes in the cluster is ".img.local". Was whale.img.cas.cz recently added to the cluster? Unfortunately, the general work-around is to recycle all the nodes at once: mmshutdown -a, followed by mmstartup -a. I hope this helps. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Michal Zacek To: gpfsug-discuss at spectrumscale.org Date: 09/12/2017 05:41 AM Subject: [gpfsug-discuss] Wrong nodename after server restart Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, I had to restart two of my gpfs servers (gpfs-n4 and gpfs-quorum) and after that I was unable to move CES IP address back with strange error "mmces address move: GPFS is down on this node". After I double checked that gpfs state is active on all nodes, I dug deeper and I think I found problem, but I don't really know how this could happen. 
Look at the names of nodes: [root at gpfs-n2 ~]# mmlscluster # Looks good GPFS cluster information ======================== GPFS cluster name: gpfscl1.img.local GPFS cluster id: 17792677515884116443 GPFS UID domain: img.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------------- 1 gpfs-n4.img.local 192.168.20.64 gpfs-n4.img.local quorum-manager 2 gpfs-quorum.img.local 192.168.20.60 gpfs-quorum.img.local quorum 3 gpfs-n3.img.local 192.168.20.63 gpfs-n3.img.local quorum-manager 4 tau.img.local 192.168.1.248 tau.img.local 5 gpfs-n1.img.local 192.168.20.61 gpfs-n1.img.local quorum-manager 6 gpfs-n2.img.local 192.168.20.62 gpfs-n2.img.local quorum-manager 8 whale.img.cas.cz 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# mmlsmount gpfs01 -L # not so good File system gpfs01 is mounted on 7 nodes: 192.168.20.63 gpfs-n3 192.168.20.61 gpfs-n1 192.168.20.62 gpfs-n2 192.168.1.248 tau 192.168.20.64 gpfs-n4.img.local 192.168.20.60 gpfs-quorum.img.local 147.231.150.108 whale.img.cas.cz [root at gpfs-n2 ~]# tsctl shownodes up | tr ',' '\n' # very wrong whale.img.cas.cz.img.local tau.img.local gpfs-quorum.img.local.img.local gpfs-n1.img.local gpfs-n2.img.local gpfs-n3.img.local gpfs-n4.img.local.img.local The "tsctl shownodes up" is the reason why I'm not able to move CES address back to gpfs-n4 node, but the real problem are different nodenames. I think OS is configured correctly: [root at gpfs-n4 /]# hostname gpfs-n4 [root at gpfs-n4 /]# hostname -f gpfs-n4.img.local [root at gpfs-n4 /]# cat /etc/resolv.conf nameserver 192.168.20.30 nameserver 147.231.150.2 search img.local domain img.local [root at gpfs-n4 /]# cat /etc/hosts | grep gpfs-n4 192.168.20.64 gpfs-n4.img.local gpfs-n4 [root at gpfs-n4 /]# host gpfs-n4 gpfs-n4.img.local has address 192.168.20.64 [root at gpfs-n4 /]# host 192.168.20.64 64.20.168.192.in-addr.arpa domain name pointer gpfs-n4.img.local. Can someone help me with this. Thanks, Michal p.s. gpfs version: 4.2.3-2 (CentOS 7) _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=l_sz-tPolX87WmSf2zBhhPpggnfQJKp7-BqV8euBp7A&s=XSPGkKRMza8PhYQg8AxeKW9cOTNeCI9uph486_6Xajo&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Michal ???ek | Information Technologies +420 296 443 128 +420 296 443 333 michal.zacek at img.cas.cz www.img.cas.cz Institute of Molecular Genetics of the ASCR, v. v. i., V?de?sk? 1083, 142 20 Prague 4, Czech Republic ID: 68378050 | VAT ID: CZ68378050 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1997 bytes Desc: not available URL: From valdis.kletnieks at vt.edu Thu Sep 14 01:18:51 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Wed, 13 Sep 2017 20:18:51 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. Message-ID: <52657.1505348331@turing-police.cc.vt.edu> So we have a number of very similar policy files that get applied for file migration etc. 
And they vary drastically in the runtime to process, apparently due to different selections on whether to do the work in parallel. Running a set of rules with 'mmapplypolicy -I defer' that look like this: RULE 'VBI_FILES_RULE' MIGRATE FROM POOL 'system' THRESHOLD(0,100,0) WEIGHT(FILE_SIZE) TO POOL 'VBI_FILES' FOR FILESET('vbi') WHERE (mb_allocated >= 8) for 10 filesets can scan 325M directory entries in 6 minutes, and sort and evaluate the policy in 3 more minutes. However, this takes a bit over 30 minutes for the scan and another 20 for sorting and policy evaluation over the same set of filesets: RULE 'VBI_FILES_RULE' LIST 'pruned_files' THRESHOLD(90,80) WEIGHT(FILE_SIZE) FOR FILESET('vbi') WHERE (mb_allocated >= 8) even though the output is essentially identical. Why is LIST so much more expensive than 'MIGRATE" with '-I defer'? I could understand if I had an expensive SHOW clause, but there isn't one here (and a different policy that I run that *does* have a big SHOW clause takes almost the same amount of time as the minimal LIST).... I'm thinking that it has *something* to do with the MIGRATE job outputting: [I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned. (...) [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned. while the LIST job says: [I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records. (...) [I] 2017-09-12 at 14:02:04.223 Policy evaluation. 0 files scanned. (Both output the same message during the 'Directory entries scanned: 0.' phase, but I suspect MIGRATE is multi-threading that part as well, as it completes much faster). What's the controlling factor in mmapplypolicy's decision whether or not to parallelize the policy? From oehmes at gmail.com Thu Sep 14 01:28:46 2017 From: oehmes at gmail.com (Sven Oehme) Date: Thu, 14 Sep 2017 00:28:46 +0000 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. In-Reply-To: <52657.1505348331@turing-police.cc.vt.edu> References: <52657.1505348331@turing-police.cc.vt.edu> Message-ID: can you please share the entire command line you are using ? also gpfs version, mmlsconfig output would help as well as if this is a shared storage filesystem or a system using local disks. thx. Sven On Wed, Sep 13, 2017 at 5:19 PM wrote: > So we have a number of very similar policy files that get applied for file > migration etc. And they vary drastically in the runtime to process, > apparently > due to different selections on whether to do the work in parallel. > > Running a set of rules with 'mmapplypolicy -I defer' that look like this: > > RULE 'VBI_FILES_RULE' MIGRATE FROM POOL 'system' > THRESHOLD(0,100,0) > WEIGHT(FILE_SIZE) > TO POOL 'VBI_FILES' > FOR FILESET('vbi') > WHERE (mb_allocated >= 8) > > for 10 filesets can scan 325M directory entries in 6 minutes, and sort and > evaluate the policy in 3 more minutes. > > However, this takes a bit over 30 minutes for the scan and another 20 for > sorting and policy evaluation over the same set of filesets: > > RULE 'VBI_FILES_RULE' LIST 'pruned_files' > THRESHOLD(90,80) > WEIGHT(FILE_SIZE) > FOR FILESET('vbi') > WHERE (mb_allocated >= 8) > > even though the output is essentially identical. Why is LIST so much more > expensive than 'MIGRATE" with '-I defer'? I could understand if I > had an > expensive SHOW clause, but there isn't one here (and a different policy > that I > run that *does* have a big SHOW clause takes almost the same amount of > time as > the minimal LIST).... 
> > I'm thinking that it has *something* to do with the MIGRATE job outputting: > > [I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 > files scanned. > (...) > [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 > records scanned. > > while the LIST job says: > > [I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records. > (...) > [I] 2017-09-12 at 14:02:04.223 Policy evaluation. 0 files scanned. > > (Both output the same message during the 'Directory entries scanned: 0.' > phase, but I suspect MIGRATE is multi-threading that part as well, as it > completes much faster). > > What's the controlling factor in mmapplypolicy's decision whether or > not to parallelize the policy? > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kh.atmane at gmail.com Thu Sep 14 13:49:55 2017 From: kh.atmane at gmail.com (atmane) Date: Thu, 14 Sep 2017 13:49:55 +0100 Subject: [gpfsug-discuss] Disk change problem in gss GNR Message-ID: dear all, I change A Disk In Gss Storage Server mmchcarrier BB1RGL --release --pdisk 'e1d1s02' mmchcarrier BB1RGL --replace --pdisk 'e1d1s02' after replace disk Now I Have 2 Discs In My Gss the first disc was well changed name = "e1d1s02" the second disk still after I use this cmd mmdelpdisk BB1RGL --pdisk e1d1s02#004 -a the disk is still in use i need to reboot the system or ?? mmlspdisk all | less pdisk: replacementPriority = 1000 name = "e1d1s02" device = "/dev/sdik,/dev/sdih" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "ok" capacity = 3000034656256 freeSpace = 1453846429696 fru = "00W1572" location = "SV30820390-1-2" WWN = "naa.5000C5008D783E37" server = "gss0-ib0" pdisk: replacementPriority = 1000 name = "e1d1s02#004" device = "" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "missing/noPath/systemDrain/adminDrain/noRGD/noVCD" capacity = 3000034656256 freeSpace = 1599875317760 fru = "00W1572" location = "" WWN = "naa.5000C50056714E83" server = "gss0-ib0" -- -- Atmane Khiredine HPC System Admin | Office National de la M?t?orologie T?l : +213 21 50 73 93 Poste 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz From makaplan at us.ibm.com Thu Sep 14 19:55:39 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 14 Sep 2017 14:55:39 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. In-Reply-To: <52657.1505348331@turing-police.cc.vt.edu> References: <52657.1505348331@turing-police.cc.vt.edu> Message-ID: Read the doc again. Specify both -g and -N options on the command line to get fully parallel directory and inode/policy scanning. I'm curious as to what you're trying to do with THRESHOLD(0,100,0) ... Perhaps premigrate everything (that matches the other conditions)? You are correct about I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned. (...) [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned. If you don't see messages like that, you did not specify both -N and -g. From: valdis.kletnieks at vt.edu To: gpfsug-discuss at spectrumscale.org Date: 09/13/2017 08:19 PM Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. Sent by: gpfsug-discuss-bounces at spectrumscale.org So we have a number of very similar policy files that get applied for file migration etc. 
And they vary drastically in the runtime to process, apparently due to different selections on whether to do the work in parallel. Running a set of rules with 'mmapplypolicy -I defer' that look like this: RULE 'VBI_FILES_RULE' MIGRATE FROM POOL 'system' THRESHOLD(0,100,0) WEIGHT(FILE_SIZE) TO POOL 'VBI_FILES' FOR FILESET('vbi') WHERE (mb_allocated >= 8) for 10 filesets can scan 325M directory entries in 6 minutes, and sort and evaluate the policy in 3 more minutes. However, this takes a bit over 30 minutes for the scan and another 20 for sorting and policy evaluation over the same set of filesets: RULE 'VBI_FILES_RULE' LIST 'pruned_files' THRESHOLD(90,80) WEIGHT(FILE_SIZE) FOR FILESET('vbi') WHERE (mb_allocated >= 8) even though the output is essentially identical. Why is LIST so much more expensive than 'MIGRATE" with '-I defer'? I could understand if I had an expensive SHOW clause, but there isn't one here (and a different policy that I run that *does* have a big SHOW clause takes almost the same amount of time as the minimal LIST).... I'm thinking that it has *something* to do with the MIGRATE job outputting: [I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned. (...) [I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned. while the LIST job says: [I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records. (...) [I] 2017-09-12 at 14:02:04.223 Policy evaluation. 0 files scanned. (Both output the same message during the 'Directory entries scanned: 0.' phase, but I suspect MIGRATE is multi-threading that part as well, as it completes much faster). What's the controlling factor in mmapplypolicy's decision whether or not to parallelize the policy? _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=SGbwD3m5mZ16_vwIFK8Ym48lwdF1tVktnSao0a_tkfA&s=sLt9AtZiZ0qZCKzuQoQuyxN76_R66jfAwQxdIY-w2m0&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Thu Sep 14 21:09:40 2017 From: valdis.kletnieks at vt.edu (valdis.kletnieks at vt.edu) Date: Thu, 14 Sep 2017 16:09:40 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. In-Reply-To: References: <52657.1505348331@turing-police.cc.vt.edu> Message-ID: <26551.1505419780@turing-police.cc.vt.edu> On Thu, 14 Sep 2017 14:55:39 -0400, "Marc A Kaplan" said: > Read the doc again. Specify both -g and -N options on the command line to > get fully parallel directory and inode/policy scanning. Yeah, figured that out, with help from somebody. :) > I'm curious as to what you're trying to do with THRESHOLD(0,100,0) ... > Perhaps premigrate everything (that matches the other conditions)? Yeah, it's actually feeding to LTFS/EE - where we premigrate everything that matches to tape. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From makaplan at us.ibm.com Thu Sep 14 22:13:59 2017 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 14 Sep 2017 17:13:59 -0400 Subject: [gpfsug-discuss] mmapplypolicy run time weirdness.. 
In-Reply-To: <26551.1505419780@turing-police.cc.vt.edu> References: <52657.1505348331@turing-police.cc.vt.edu> <26551.1505419780@turing-police.cc.vt.edu> Message-ID: BTW - we realize that mmapplypolicy -g and -N is a "gotcha" for some (many?) customer/admins -- so we're considering ways to make that easier -- but without "breaking" scripts and callbacks and what-have-yous that might depend on the current/old defaults... Always a balancing act -- considering that GPFS ne Spectrum Scale just hit its 20th birthday (by IBM reckoning) --marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil.wilson at metoffice.gov.uk Fri Sep 15 11:47:19 2017 From: neil.wilson at metoffice.gov.uk (Wilson, Neil) Date: Fri, 15 Sep 2017 10:47:19 +0000 Subject: [gpfsug-discuss] ZIMON Sensors config files... Message-ID: Hi, Does anyone know how to use "mmperfmon config update" to get the "hostname =" field in the ZImonSensors.cfg file populated with the hostname of the node that it's been installed on? By default the field is empty and for some reason on our cluster it doesn't transmit any metrics unless we put the node hostname into that field. Is there some kind of wildcard that I can set? Thanks Neil Neil Wilson Senior IT Practitioner Storage Team IT Services Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 885959 Email: neil.wilson at metoffice.gov.uk Website www.metoffice.gov.uk Our magazine Barometer is now available online at http://www.metoffice.gov.uk/barometer/ P Please consider the environment before printing this e-mail. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Sep 15 16:37:13 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 15 Sep 2017 15:37:13 +0000 Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? Message-ID: This is very probably off topic here.. I would be happy to get any responses off list. My question is has anyone here set up NFS re-export / proxy with nfs-ganesha? John Hearns -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Mon Sep 18 01:14:52 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Mon, 18 Sep 2017 00:14:52 +0000 Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? In-Reply-To: References: Message-ID: <5d1811f4d6ad4605bd2a7c7441f4dd1b@exch1-cdc.nexus.csiro.au> I am interested too, so maybe keep it on list? 
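Back on the mmapplypolicy thread: a minimal invocation that enables the fully parallel directory and inode scan Marc describes might look like the sketch below. It is an illustration only -- the file system name (gpfs01), node class (helperNodes), rule file path and work directory are placeholders, not values from the thread.

# hedged sketch -- gpfs01, helperNodes and /gpfs/gpfs01/.policytmp are made-up names
# -N : nodes (or a node class) that run the parallel scan/evaluation helpers
# -g : a global work directory on a shared file system; together with -N this
#      enables the "Parallel-piped sort and policy evaluation" path
mmapplypolicy gpfs01 -P /path/to/policy.rules \
    -N helperNodes \
    -g /gpfs/gpfs01/.policytmp \
    -I defer -L 1

Without both -N and -g, the run falls back to the serial "Sorting ... file list records" / "Policy evaluation" path that was observed for the LIST job.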
From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: Saturday, 16 September 2017 1:37 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? This is very probably off topic here.. I would be happy to get any responses off list. My question is has anyone here set up NFS re-export / proxy with nfs-ganesha? John Hearns -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.lefebvre+gpfsug at calculquebec.ca Mon Sep 18 20:16:57 2017 From: richard.lefebvre+gpfsug at calculquebec.ca (Richard Lefebvre) Date: Mon, 18 Sep 2017 15:16:57 -0400 Subject: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Message-ID: Hi I have a 3.5 GPFS system with 700+ nodes. I sometime have nodes that generate a lot of iops on the large file system but I cannot find the right tool to find which node is the source. I'm guessing under 4.2.X, there are now easy tools, but what can be done under GPFS 3.5. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Sep 18 20:27:49 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 18 Sep 2017 19:27:49 +0000 Subject: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Message-ID: <39FB5D56-A8C4-47DA-8A56-A2E453724875@nuance.com> You do realize 3.5 is out of service, correct? You should be looking at upgrading :-) Catching this is real time, when you have a large number of nodes is going to be tough. How you recognizing that the file system is overloaded? Waiters? Looking at which nodes/NSDs have the longest/largest waiters may provide a clue. You might also take a look at mmpmon ? it?s a bit difficult to use in its raw state, but it does provide some good stats on a per file system basis. But you need to track these over times to get what you need. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Richard Lefebvre Reply-To: gpfsug main discussion list Date: Monday, September 18, 2017 at 2:18 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Hi I have a 3.5 GPFS system with 700+ nodes. I sometime have nodes that generate a lot of iops on the large file system but I cannot find the right tool to find which node is the source. I'm guessing under 4.2.X, there are now easy tools, but what can be done under GPFS 3.5. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From scale at us.ibm.com Tue Sep 19 07:47:42 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 19 Sep 2017 14:47:42 +0800 Subject: [gpfsug-discuss] ZIMON Sensors config files... In-Reply-To: References: Message-ID: Hi Neil, Have you tried these steps? mmperfmon config show --config-file /tmp/a vi /tmp/a mmperfmon config update --collectors oc8757286465 --config-file /tmp/a mmperfmon config show Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Wilson, Neil" To: gpfsug main discussion list Date: 09/15/2017 06:48 PM Subject: [gpfsug-discuss] ZIMON Sensors config files... Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, Does anyone know how to use ?mmperfmon config update? to get the ?hostname =? field in the ZImonSensors.cfg file populated with the hostname of the node that it?s been installed on? By default the field is empty and for some reason on our cluster it doesn?t transmit any metrics unless we put the node hostname into that field. Is there some kind of wildcard that I can set? Thanks Neil Neil Wilson Senior IT Practitioner Storage Team IT Services Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 885959 Email: neil.wilson at metoffice.gov.uk Website www.metoffice.gov.uk Our magazine Barometer is now available online at http://www.metoffice.gov.uk/barometer/ P Please consider the environment before printing this e-mail. Thank you. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JJA1q39zaRyjClihY50646c-CyY4ZvrmpSjR1qs5rTc&s=GWOiCpEHiZ_TqlFj0AeKmjcccnez-X2rHMa5UtvGPTk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From scale at us.ibm.com Tue Sep 19 07:54:50 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 19 Sep 2017 14:54:50 +0800 Subject: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 In-Reply-To: <39FB5D56-A8C4-47DA-8A56-A2E453724875@nuance.com> References: <39FB5D56-A8C4-47DA-8A56-A2E453724875@nuance.com> Message-ID: Hi Richard, Is any of tool in https://www.ibm.com/developerworks/community/wikis/home?_escaped_fragment_=/wiki/General%2520Parallel%2520File%2520System%2520%2528GPFS%2529/page/Display%2520per%2520node%2520IO%2520statstics can help you? BTW, I agree with Bob that 3.5 is out-of-service. Without an extended service, you should consider to upgrade your cluster as soon as possible. 
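As a rough illustration of the mmpmon approach mentioned earlier in the thread (it is available on 3.5 as well as current releases), the sketch below samples cumulative per-file-system I/O counters; counters accumulate since daemon start (or the last reset request), so diffing successive samples gives the current rate. The node names in the second command are placeholders.

# parseable output (-p), 12 samples (-r), 5 seconds apart (-d, in milliseconds)
echo fs_io_s | /usr/lpp/mmfs/bin/mmpmon -p -r 12 -d 5000

# optionally gather the same counters for a small set of suspect nodes in one run
printf 'nlist add node001 node002\nfs_io_s\n' | /usr/lpp/mmfs/bin/mmpmon -p -r 12 -d 5000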
Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 09/19/2017 03:28 AM Subject: Re: [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Sent by: gpfsug-discuss-bounces at spectrumscale.org You do realize 3.5 is out of service, correct? You should be looking at upgrading :-) Catching this is real time, when you have a large number of nodes is going to be tough. How you recognizing that the file system is overloaded? Waiters? Looking at which nodes/NSDs have the longest/largest waiters may provide a clue. You might also take a look at mmpmon ? it?s a bit difficult to use in its raw state, but it does provide some good stats on a per file system basis. But you need to track these over times to get what you need. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Richard Lefebvre Reply-To: gpfsug main discussion list Date: Monday, September 18, 2017 at 2:18 PM To: gpfsug Subject: [EXTERNAL] [gpfsug-discuss] How to find which node is generating high iops in a GPFS 3.5 Hi I have a 3.5 GPFS system with 700+ nodes. I sometime have nodes that generate a lot of iops on the large file system but I cannot find the right tool to find which node is the source. I'm guessing under 4.2.X, there are now easy tools, but what can be done under GPFS 3.5. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=AYwUf61wv-Hq63KU7veQSxavdZy-e9eT9bkJFav8MVU&s=W42AQE74bvmOlw7P0D0wTqT0Rxop4KktnXeuDeGGdmk&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From rohwedder at de.ibm.com Tue Sep 19 08:42:46 2017 From: rohwedder at de.ibm.com (Markus Rohwedder) Date: Tue, 19 Sep 2017 09:42:46 +0200 Subject: [gpfsug-discuss] ZIMON Sensors config files... In-Reply-To: References: Message-ID: Hello Neil, While the description below provides a way on how to edit the hostname parameter, you should not have the need to edit the "hostname" parameter. Sensors use the hostname() call to get the hostname where the sensor is running and use this as key in the performance database, which is what you typically want to see. From the description you provide I assume you want to have a sensor running on every node that has the perfmon designation? 
There could be different issues: > In order to enable sensors on every node, you need to ensure there is no "restrict" clause in the sensor description, or the restrict clause has to be set correctly > There could be some other communication issue between sensors and collectors. Restart sensors and collectors and check the logfiles in /var/log/zimon/. You should be able to see which sensors start up and if they can connect. > Can you check if you have the perfmon designation set for the nodes where you expect data from (mmlscluster) Mit freundlichen Gr??en / Kind regards Dr. Markus Rohwedder Spectrum Scale GUI Development Phone: +49 7034 6430190 IBM Deutschland E-Mail: rohwedder at de.ibm.com Am Weiher 24 65451 Kelsterbach Germany IBM Deutschland Research & Development GmbH / Vorsitzender des Aufsichtsrats: Martina K?deritz Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 From: "IBM Spectrum Scale" To: gpfsug main discussion list Cc: gpfsug-discuss-bounces at spectrumscale.org Date: 09/19/2017 08:48 AM Subject: Re: [gpfsug-discuss] ZIMON Sensors config files... Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Neil, Have you tried these steps? mmperfmon config show --config-file /tmp/a vi /tmp/a mmperfmon config update --collectors oc8757286465 --config-file /tmp/a mmperfmon config show Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. Inactive hide details for "Wilson, Neil" ---09/15/2017 06:48:26 PM---Hi, Does anyone know how to use "mmperfmon config update" "Wilson, Neil" ---09/15/2017 06:48:26 PM---Hi, Does anyone know how to use "mmperfmon config update" to get the "hostname =" field in the ZImon From: "Wilson, Neil" To: gpfsug main discussion list Date: 09/15/2017 06:48 PM Subject: [gpfsug-discuss] ZIMON Sensors config files... Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi, Does anyone know how to use ?mmperfmon config update? to get the ?hostname =? field in the ZImonSensors.cfg file populated with the hostname of the node that it?s been installed on? By default the field is empty and for some reason on our cluster it doesn?t transmit any metrics unless we put the node hostname into that field. Is there some kind of wildcard that I can set? Thanks Neil Neil Wilson Senior IT Practitioner Storage Team IT Services Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom Tel: +44 (0)1392 885959 Email: neil.wilson at metoffice.gov.uk Website www.metoffice.gov.uk Our magazine Barometer is now available online at http://www.metoffice.gov.uk/barometer/ P Please consider the environment before printing this e-mail. Thank you. 
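A quick way to work through the points above on a 4.2.x cluster is sketched here. It assumes the standard systemd unit names for the Zimon sensors and collector and the default /var/log/zimon location; the node names are placeholders.

# which nodes carry the perfmon designation (sensors only report from these)
mmlscluster | grep -i perfmon
# add the designation where it is missing
mmchnode --perfmon -N nodeA,nodeB
# look for restrict clauses that limit where a sensor runs
mmperfmon config show | grep -i -B2 -A2 restrict
# restart sensors (on the sensor nodes) and the collector, then watch the logs
systemctl restart pmsensors
systemctl restart pmcollector
tail -f /var/log/zimon/ZIMonSensors.log    # exact log file name may vary by release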
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=JJA1q39zaRyjClihY50646c-CyY4ZvrmpSjR1qs5rTc&s=GWOiCpEHiZ_TqlFj0AeKmjcccnez-X2rHMa5UtvGPTk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=Ow2bpnoab1kboH2xuSUrbx65ALeoAAicG7csl1sV-Qc&s=qZ1XUXWfOayLSSuvcCyHQ2ZgY1mu0Zs3kmpgeVQUCYI&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 1D696444.gif Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From mnaineni at in.ibm.com Tue Sep 19 12:50:50 2017 From: mnaineni at in.ibm.com (Malahal R Naineni) Date: Tue, 19 Sep 2017 11:50:50 +0000 Subject: [gpfsug-discuss] NFS re-export with nfs-ganesha proxy? (Greg.Lehmann@csiro.au) Message-ID: An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Tue Sep 19 22:02:03 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Tue, 19 Sep 2017 21:02:03 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? Message-ID: <26ABC473-387D-4D58-9059-518E455724A9@vanderbilt.edu> Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Robert.Oesterlin at nuance.com Wed Sep 20 00:39:37 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 19 Sep 2017 23:39:37 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? Message-ID: OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Buterbaugh, Kevin L" Reply-To: gpfsug main discussion list Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Wed Sep 20 02:21:36 2017 From: bevans at pixitmedia.com (Barry Evans) Date: Tue, 19 Sep 2017 18:21:36 -0700 Subject: [gpfsug-discuss] RoCE not playing ball Message-ID: Hi All, Weirdness with a RoCE interface - verbs is not playing ball and is complaining about the inet6 address not matching up: 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version >= 1.1) loaded and initialized. 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 * nspdQueues 1)). 
2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981E1 state DOWN 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 with GID c081f9feff078a26. Please check if the correct inet6 address for the corresponding IP network interface is set 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid verbsPorts defined. Anyone run into this before? I have another node imaged the *exact* same way and no dice. Have tried a variety of drivers, cards, etc, same result every time. Cheers, Barry -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Wed Sep 20 04:07:18 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 20 Sep 2017 11:07:18 +0800 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: References: Message-ID: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. 
mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of "Buterbaugh, Kevin L" Reply-To: gpfsug main discussion list Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=mBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y&s=YJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From scale at us.ibm.com Wed Sep 20 04:33:16 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 20 Sep 2017 11:33:16 +0800 Subject: [gpfsug-discuss] Disk change problem in gss GNR In-Reply-To: References: Message-ID: Hi Atmane, In terms of this kind of disk management question, I would like to suggest to open a PMR to make IBM service help you. mmdelpdisk command would not need to reboot system to take effect. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: atmane To: "gpfsug-discuss at spectrumscale.org" Date: 09/14/2017 08:50 PM Subject: [gpfsug-discuss] Disk change problem in gss GNR Sent by: gpfsug-discuss-bounces at spectrumscale.org dear all, I change A Disk In Gss Storage Server mmchcarrier BB1RGL --release --pdisk 'e1d1s02' mmchcarrier BB1RGL --replace --pdisk 'e1d1s02' after replace disk Now I Have 2 Discs In My Gss the first disc was well changed name = "e1d1s02" the second disk still after I use this cmd mmdelpdisk BB1RGL --pdisk e1d1s02#004 -a the disk is still in use i need to reboot the system or ?? mmlspdisk all | less pdisk: replacementPriority = 1000 name = "e1d1s02" device = "/dev/sdik,/dev/sdih" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "ok" capacity = 3000034656256 freeSpace = 1453846429696 fru = "00W1572" location = "SV30820390-1-2" WWN = "naa.5000C5008D783E37" server = "gss0-ib0" pdisk: replacementPriority = 1000 name = "e1d1s02#004" device = "" recoveryGroup = "BB1RGL" declusteredArray = "DA1" state = "missing/noPath/systemDrain/adminDrain/noRGD/noVCD" capacity = 3000034656256 freeSpace = 1599875317760 fru = "00W1572" location = "" WWN = "naa.5000C50056714E83" server = "gss0-ib0" -- -- Atmane Khiredine HPC System Admin | Office National de la M?t?orologie T?l : +213 21 50 73 93 Poste 303 | Fax : +213 21 50 79 40 | E-mail : a.khiredine at meteo.dz _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIFbA&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=hQ86ctTaI7i14NrB-58_SzqSWnCR8p6b5bFxtzNcSbk&s=mthjH7ebhnNlSJl71hFjF4wZU0iygm3I9wH_Bu7_3Ds&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From olaf.weiser at de.ibm.com Wed Sep 20 06:00:49 2017 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 20 Sep 2017 07:00:49 +0200 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From jonathon.anderson at colorado.edu Wed Sep 20 06:13:13 2017 From: jonathon.anderson at colorado.edu (Jonathon A Anderson) Date: Wed, 20 Sep 2017 05:13:13 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , Message-ID: Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. 
I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathon.anderson at colorado.edu Wed Sep 20 06:33:14 2017 From: jonathon.anderson at colorado.edu (Jonathon A Anderson) Date: Wed, 20 Sep 2017 05:33:14 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , , Message-ID: I should have said, here are the package versions: [root at sgate1 ~]# rpm -qa | grep gpfs gpfs.gpl-4.2.2-3.noarch gpfs.docs-4.2.2-3.noarch gpfs.base-4.2.2-3.x86_64 gpfs.gplbin-3.10.0-514.26.2.el7.x86_64-4.2.2-3.x86_64 nfs-ganesha-gpfs-2.3.2-0.ibm32_2.el7.x86_64 gpfs.ext-4.2.2-3.x86_64 gpfs.msg.en_US-4.2.2-3.noarch gpfs.gskit-8.0.50-57.x86_64 gpfs.gplbin-3.10.0-327.36.3.el7.x86_64-4.2.2-3.x86_64 ________________________________________ From: Jonathon A Anderson Sent: Tuesday, September 19, 2017 11:13:13 PM To: gpfsug main discussion list Cc: varun.mittal at in.ibm.com; Mark.Bush at siriuscom.com Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. 
Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. 
Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From gangqiu at cn.ibm.com Wed Sep 20 06:58:15 2017 From: gangqiu at cn.ibm.com (Gang Qiu) Date: Wed, 20 Sep 2017 13:58:15 +0800 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: Do you set ip address for these adapters? Refer to the description of verbsRdmaCm in ?Command and Programming Reference': If RDMA CM is enabled for a node, the node will only be able to establish RDMA connections using RDMA CM to other nodes with verbsRdmaCm enabled. RDMA CM enablement requires IPoIB (IP over InfiniBand) with an active IP address for each port. Although IPv6 must be enabled, the GPFS implementation of RDMA CM does not currently support IPv6 addresses, so an IPv4 address must be used. Regards, Gang Qiu ********************************************************************************************** IBM China Systems & Technology Lab Tel: 86-10-82452193 Fax: 86-10-82452312 Moble: 132-6134-8284 Email: gangqiu at cn.ibm.com Address: Ring Bldg. No.28 Building, Zhong Guan Cun Software Park, No. 8 Dong Bei Wang West Road, ShangDi, Haidian District, Beijing 100193, P.R.China ??????????????8???????28???????????100193 ********************************************************************************************** From: "Olaf Weiser" To: gpfsug main discussion list Date: 09/20/2017 01:01 PM Subject: Re: [gpfsug-discuss] RoCE not playing ball Sent by: gpfsug-discuss-bounces at spectrumscale.org is ib_read_bw working ? just test it between the two nodes ... From: Barry Evans To: gpfsug main discussion list Date: 09/20/2017 03:21 AM Subject: [gpfsug-discuss] RoCE not playing ball Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi All, Weirdness with a RoCE interface - verbs is not playing ball and is complaining about the inet6 address not matching up: 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version >= 1.1) loaded and initialized. 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 * nspdQueues 1)). 
2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981E1 state DOWN 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 with GID c081f9feff078a26. Please check if the correct inet6 address for the corresponding IP network interface is set 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid verbsPorts defined. Anyone run into this before? I have another node imaged the *exact* same way and no dice. Have tried a variety of drivers, cards, etc, same result every time. Cheers, Barry This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=NCthMXTjizwdEVDBqoDwAfRswiFbdQVHRb4mzseFLEM&m=u155tVFn5u91gqIsTXSOSVvpbR7GQRPoVpviUDH73R0&s=63nY5ozD8mej1jefNBZjLGCkNOFD9-swr-lc7CRPbrM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From tortay at cc.in2p3.fr Wed Sep 20 09:03:54 2017 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Wed, 20 Sep 2017 10:03:54 +0200 Subject: [gpfsug-discuss] CCR cluster down for the count? 
In-Reply-To: <26ABC473-387D-4D58-9059-518E455724A9@vanderbilt.edu> References: <26ABC473-387D-4D58-9059-518E455724A9@vanderbilt.edu> Message-ID: <853ffcf7-7900-457b-0d8a-2c63886ed245@cc.in2p3.fr> On 19/09/2017 23:02, Buterbaugh, Kevin L wrote: > Hi All, > > We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. > > Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) > > I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. > > However, when I try to startup GPFS ? or run any GPFS command I get: > > /root > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /root > root at testnsd2# > > I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? > Hello, I have had the same issue multiple times. The "trick" is to execute "/usr/lpp/mmfs/bin/mmcommon startCcrMonitor" on a majority of quorum nodes (once they have the correct configuration files) to be able to start the cluster. I noticed a call to the above command in the "gpfs.gplbin" spec file in the "%postun" section (when doing RPM upgrades, if I'm not mistaken). . Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | From r.sobey at imperial.ac.uk Wed Sep 20 09:23:37 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 20 Sep 2017 08:23:37 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , Message-ID: This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. 
I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). 
>> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From douglasof at us.ibm.com Wed Sep 20 09:28:44 2017 From: douglasof at us.ibm.com (Douglas O'flaherty) Date: Wed, 20 Sep 2017 08:28:44 +0000 Subject: [gpfsug-discuss] User Meeting & SPXXL in NYC Message-ID: Reminder that the SPXXL day on IBM Spectrum Scale in New York is open to all. It is Thursday the 28th. There is also a Power day on Wednesday. For more information http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ Doug Mobile -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckrafft at de.ibm.com Wed Sep 20 11:47:35 2017 From: ckrafft at de.ibm.com (Christoph Krafft) Date: Wed, 20 Sep 2017 12:47:35 +0200 Subject: [gpfsug-discuss] WANTED: Official support statement using Spectrum Scale 4.2.x with Oracle DB v12 Message-ID: Hi folks, is anyone aware if there is now an official support statement for Spectrum Scale 4.2.x? As far as my understanding goes - we currently have an "older" official support statement for v4.1 with Oracle. Many thanks up-front for any useful hints ... :) Mit freundlichen Gr??en / Sincerely Christoph Krafft Client Technical Specialist - Power Systems, IBM Systems Certified IT Specialist @ The Open Group Phone: +49 (0) 7034 643 2171 IBM Deutschland GmbH Mobile: +49 (0) 160 97 81 86 12 Am Weiher 24 Email: ckrafft at de.ibm.com 65451 Kelsterbach Germany IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter Gesch?ftsf?hrung: Martina Koederitz (Vorsitzende), Norbert Janzen, Stefan Lutz, Nicole Reimer, Dr. Klaus Seifert, Wolfgang Wendt Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ecblank.gif Type: image/gif Size: 45 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 15225079.gif Type: image/gif Size: 1851 bytes Desc: not available URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Sep 20 14:55:28 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 20 Sep 2017 13:55:28 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: References: Message-ID: Hi All, testnsd1 and testnsd3 both had hardware issues (power supply and internal HD respectively). Given that they were 12 year old boxes, we decided to replace them with other boxes that are a mere 7 years old ? keep in mind that this is a test cluster. Disabling CCR does not work, even with the undocumented ??force? option: /var/mmfs/gen root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force mmchcluster: Unable to obtain the GPFS configuration file lock. mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. mmchcluster: Processing continues without lock protection. The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. Verifying GPFS is stopped on all nodes ... The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. 
ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: root at vmp610.vampire's password: root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. mmchcluster: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# I believe that part of the problem may be that there are 4 client nodes that were removed from the cluster without removing them from the cluster (done by another SysAdmin who was in a hurry to repurpose those machines). They?re up and pingable but not reachable by GPFS anymore, which I?m pretty sure is making things worse. Nor does Loic?s suggestion of running mmcommon work (but thanks for the suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to start the cluster up failed: /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# Thanks. Kevin On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > wrote: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. 
Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and From: "Oesterlin, Robert" > To: gpfsug main discussion list > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=mBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y&s=YJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Wed Sep 20 15:17:34 2017 From: bevans at pixitmedia.com (Barry Evans) Date: Wed, 20 Sep 2017 07:17:34 -0700 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: Yep, IP's set ok. We did try with ipv6 off to see what would happen, then turned it back on again. There are ipv6 addresses on the cards, but ipv4 is the only thing actually being used. On Tue, Sep 19, 2017 at 10:58 PM, Gang Qiu wrote: > > > > Do you set ip address for these adapters? > > Refer to the description of verbsRdmaCm in ?Command and Programming > Reference': > > If RDMA CM is enabled for a node, the node will only be able to establish > RDMA connections > using RDMA CM to other nodes with *verbsRdmaCm *enabled. RDMA CM > enablement requires > IPoIB (IP over InfiniBand) with an active IP address for each port. > Although IPv6 must be > enabled, the GPFS implementation of RDMA CM does not currently support > IPv6 addresses, so > an IPv4 address must be used. > > > > Regards, > Gang Qiu > > ************************************************************ > ********************************** > IBM China Systems & Technology Lab > Tel: 86-10-82452193 > Fax: 86-10-82452312 > Moble: 132-6134-8284 > Email: gangqiu at cn.ibm.com > Address: Ring Bldg. No.28 Building, Zhong Guan Cun Software Park, No. 8 > Dong Bei Wang West Road, ShangDi, Haidian District, Beijing 100193, > P.R.China > ??????????????8???????28???????????100193 > ************************************************************ > ********************************** > > > > From: "Olaf Weiser" > To: gpfsug main discussion list > Date: 09/20/2017 01:01 PM > Subject: Re: [gpfsug-discuss] RoCE not playing ball > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > is ib_read_bw working ? > just test it between the two nodes ... 
> > > > > From: Barry Evans > To: gpfsug main discussion list > Date: 09/20/2017 03:21 AM > Subject: [gpfsug-discuss] RoCE not playing ball > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi All, > > Weirdness with a RoCE interface - verbs is not playing ball and is > complaining about the inet6 address not matching up: > > 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes > verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version > >= 1.1) loaded and initialized. > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced > from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 > * nspdQueues 1)). > 2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981E1 state DOWN > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE > 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 > 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort > mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 > with GID c081f9feff078a26. Please check if the correct inet6 address for > the corresponding IP network interface is set > 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 > 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. > 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid > verbsPorts defined. > > > Anyone run into this before? I have another node imaged the *exact* same > way and no dice. Have tried a variety of drivers, cards, etc, same result > every time. > > Cheers, > Barry > > > > > > > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. 
Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email._______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug. > org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r= > NCthMXTjizwdEVDBqoDwAfRswiFbdQVHRb4mzseFLEM&m= > u155tVFn5u91gqIsTXSOSVvpbR7GQRPoVpviUDH73R0&s= > 63nY5ozD8mej1jefNBZjLGCkNOFD9-swr-lc7CRPbrM&e= > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Wed Sep 20 15:23:21 2017 From: bevans at pixitmedia.com (Barry Evans) Date: Wed, 20 Sep 2017 07:23:21 -0700 Subject: [gpfsug-discuss] RoCE not playing ball In-Reply-To: References: Message-ID: It has worked, yes, and while the issue has been present. At the moment it's not working, but I'm not entirely surprised with the amount it's been poked at. Cheers, Barry On Tue, Sep 19, 2017 at 10:00 PM, Olaf Weiser wrote: > is ib_read_bw working ? > just test it between the two nodes ... > > > > > From: Barry Evans > To: gpfsug main discussion list > Date: 09/20/2017 03:21 AM > Subject: [gpfsug-discuss] RoCE not playing ball > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi All, > > Weirdness with a RoCE interface - verbs is not playing ball and is > complaining about the inet6 address not matching up: > > 2017-09-02_07:46:01.376+0100: [I] VERBS RDMA starting with verbsRdmaCm=yes > verbsRdmaSend=no verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA library librdmacm.so (version > >= 1.1) loaded and initialized. > 2017-09-02_07:46:01.377+0100: [I] VERBS RDMA verbsRdmasPerNode reduced > from 1000 to 514 to match (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 > * nspdQueues 1)). 
> 2017-09-02_07:46:01.382+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.383+0100: [I] VERBS RDMA discover mlx4_1 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.384+0100: [I] VERBS RDMA discover mlx4_1 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981E1 state DOWN > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x268A07FFFEF981C0 state ACTIVE > 2017-09-02_07:46:01.385+0100: [I] VERBS RDMA discover mlx4_0 port 1 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFFAC106404 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[0] subnet > 0xFE80000000000000 id 0x248A070001F981C1 state ACTIVE > 2017-09-02_07:46:01.386+0100: [I] VERBS RDMA discover mlx4_0 port 2 > transport IB link ETH NUMA node 0 pkey[0] 0xFFFF gid[1] subnet > 0x0000000000000000 id 0x0000FFFF0AC20011 state ACTIVE > 2017-09-02_07:46:01.387+0100: [I] VERBS RDMA parse verbsPorts mlx4_0/1 > 2017-09-02_07:46:01.390+0100: [W] VERBS RDMA parse error verbsPort > mlx4_0/1 ignored due to interface not found for port 1 of device mlx4_0 > with GID c081f9feff078a26. Please check if the correct inet6 address for > the corresponding IP network interface is set > 2017-09-02_07:46:01.390+0100: [E] VERBS RDMA: rdma_get_cm_event err -1 > 2017-09-02_07:46:01.391+0100: [I] VERBS RDMA library librdmacm.so unloaded. > 2017-09-02_07:46:01.391+0100: [E] VERBS RDMA failed to start, no valid > verbsPorts defined. > > > Anyone run into this before? I have another node imaged the *exact* same > way and no dice. Have tried a variety of drivers, cards, etc, same result > every time. > > Cheers, > Barry > > > > > > > This email is confidential in that it is intended for the exclusive > attention of the addressee(s) indicated. If you are not the intended > recipient, this email should not be read or disclosed to any other person. > Please notify the sender immediately and delete this email from your > computer system. Any opinions expressed are not necessarily those of the > company from which this email was sent and, whilst to the best of our > knowledge no viruses or defects exist, no responsibility can be accepted > for any loss or damage arising from its receipt or subsequent use of this > email._______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. 
Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkr at lbl.gov Wed Sep 20 17:00:15 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Wed, 20 Sep 2017 09:00:15 -0700 Subject: [gpfsug-discuss] User Meeting & SPXXL in NYC In-Reply-To: References: Message-ID: Thanks Doug. If you plan to go, *do register*. GPFS Day is free, but we need to know how many will attend. Register using the link on the HPCXXL event page below. Cheers, Kristy > On Sep 20, 2017, at 1:28 AM, Douglas O'flaherty wrote: > > > Reminder that the SPXXL day on IBM Spectrum Scale in New York is open to all. It is Thursday the 28th. There is also a Power day on Wednesday. > > > For more information > http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ > > Doug > > Mobile > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Sep 20 17:27:48 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 20 Sep 2017 16:27:48 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: <20170920114844.6bf9f27b@osc.edu> References: <20170920114844.6bf9f27b@osc.edu> Message-ID: <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> Hi Ed, Thanks for the suggestion ? that?s basically what I had done yesterday after Googling and getting a hit or two on the IBM DeveloperWorks site. I?m including some output below which seems to show that I?ve got everything set up but it?s still not working. Am I missing something? We don?t use CCR on our production cluster (and this experience doesn?t make me eager to do so!), so I?m not that familiar with it... Kevin /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v grep" | sort testdellnode1: root 2583 1 0 May30 ? 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testdellnode1: root 6694 2583 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 2023 5828 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 5828 1 0 Sep18 ? 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 19356 4628 0 11:19 tty1 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 4628 1 0 Sep19 tty1 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 22149 2983 0 11:16 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 2983 1 0 Sep18 ? 00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 15685 6557 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 6557 1 0 Sep19 ? 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 29424 6512 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 6512 1 0 Sep18 ? 
00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort testdellnode1: drwxr-xr-x 2 root root 4096 Mar 3 2017 cached testdellnode1: drwxr-xr-x 2 root root 4096 Nov 10 2016 committed testdellnode1: -rw-r--r-- 1 root root 99 Nov 10 2016 ccr.nodes testdellnode1: total 12 testgateway: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testgateway: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testgateway: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes testgateway: total 12 testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 cached testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 committed testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth testnsd1: -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes testnsd1: total 8 testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached testnsd2: drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.1 testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.2 testnsd2: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd2: total 16 testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed testnsd3: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd3: -rw-r--r-- 1 root root 4 Sep 19 15:41 ccr.noauth testnsd3: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd3: total 8 testsched: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes testsched: total 12 /var/mmfs/gen root at testnsd2# more ../ccr/ccr.nodes 3,0,10.0.6.215,,testnsd3.vampire 1,0,10.0.6.213,,testnsd1.vampire 2,0,10.0.6.214,,testnsd2.vampire /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testgateway: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testsched: -rw-r--r--. 
1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/ssl/stage/genkeyData1" testnsd3: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd2: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testdellnode1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testgateway: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testsched: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 /var/mmfs/gen root at testnsd2# On Sep 20, 2017, at 10:48 AM, Edward Wahl > wrote: I've run into this before. We didn't use to use CCR. And restoring nodes for us is a major pain in the rear as we only allow one-way root SSH, so we have a number of useful little scripts to work around problems like this. Assuming that you have all the necessary files copied to the correct places, you can manually kick off CCR. I think my script does something like: (copy the encryption key info) scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor you should then see like 2 copies of it running under mmksh. Ed On Wed, 20 Sep 2017 13:55:28 +0000 "Buterbaugh, Kevin L" > wrote: Hi All, testnsd1 and testnsd3 both had hardware issues (power supply and internal HD respectively). Given that they were 12 year old boxes, we decided to replace them with other boxes that are a mere 7 years old ? keep in mind that this is a test cluster. Disabling CCR does not work, even with the undocumented ??force? option: /var/mmfs/gen root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force mmchcluster: Unable to obtain the GPFS configuration file lock. mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. mmchcluster: Processing continues without lock protection. The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. 
ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. Verifying GPFS is stopped on all nodes ... The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: root at vmp610.vampire's password: root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp610.vampire: Permission denied, please try again. 
vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. mmchcluster: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# I believe that part of the problem may be that there are 4 client nodes that were removed from the cluster without removing them from the cluster (done by another SysAdmin who was in a hurry to repurpose those machines). They?re up and pingable but not reachable by GPFS anymore, which I?m pretty sure is making things worse. Nor does Loic?s suggestion of running mmcommon work (but thanks for the suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to start the cluster up failed: /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# Thanks. Kevin On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > wrote: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and From: "Oesterlin, Robert" > To: gpfsug main discussion list > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. 
Bob Oesterlin Sr Principal Storage Engineer, Nuance From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 -- Ed Wahl Ohio Supercomputer Center 614-292-9302 ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From stijn.deweirdt at ugent.be Wed Sep 20 18:48:26 2017 From: stijn.deweirdt at ugent.be (Stijn De Weirdt) Date: Wed, 20 Sep 2017 19:48:26 +0200 Subject: [gpfsug-discuss] CCR cluster down for the count? 
In-Reply-To: <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> References: <20170920114844.6bf9f27b@osc.edu> <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> Message-ID: <1f0b2657-8ca3-7b35-95f3-7c4edb6c0818@ugent.be> hi kevin, we were hit by similar issue when we did something not so smart: we had a 5 node quorum, and we wanted to replace 1 test node with 3 more production quorum node. we however first removed the test node, and then with 4 quorum nodes we did mmshutdown for some other config modifications. when we tried to start it, we hit the same "Not enough CCR quorum nodes available" errors. also, none of the ccr commands were helpful; they also hanged, even simple ones like show etc etc. what we did in the end was the following (and some try-and-error): from the /var/adm/ras/mmsdrserv.log logfiles we guessed that we had some sort of split brain paxos cluster (some reported " ccrd: recovery complete (rc 809)", some same message with 'rc 0' and some didn't have the recovery complete on the last line(s)) * stop ccr everywhere mmshutdown -a mmdsh -N all pkill -9 -f mmccr * one by one, start the paxos cluster using mmshutdown on the quorum nodes (mmshutdown will start ccr and there is no unit or something to help with that). * the nodes will join after 3-4 minutes and report "recovery complete"; wait for it before you start another one * the trial-and-error part was that sometimes there was recovery complete with rc=809, sometimes with rc=0. in the end, once they all had same rc=0, paxos was happy again and eg mmlsconfig worked again. this left a very bad experience with CCR with us, but we want to use ces, so no real alternative (and to be honest, with odd number of quorum, we saw no more issues, everyting was smooth). in particular we were missing * unit files for all extra services that gpfs launched (mmccrmoniotr, mmsysmon); so we can monitor and start/stop them cleanly * ccr commands that work with broken paxos setup; eg to report that the paxos cluster is broken or operating in some split-brain mode. anyway, YMMV and good luck. stijn On 09/20/2017 06:27 PM, Buterbaugh, Kevin L wrote: > Hi Ed, > > Thanks for the suggestion ? that?s basically what I had done yesterday after Googling and getting a hit or two on the IBM DeveloperWorks site. I?m including some output below which seems to show that I?ve got everything set up but it?s still not working. > > Am I missing something? We don?t use CCR on our production cluster (and this experience doesn?t make me eager to do so!), so I?m not that familiar with it... > > Kevin > > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v grep" | sort > testdellnode1: root 2583 1 0 May30 ? 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testdellnode1: root 6694 2583 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 2023 5828 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 5828 1 0 Sep18 ? 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd1: root 19356 4628 0 11:19 tty1 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd1: root 4628 1 0 Sep19 tty1 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd2: root 22149 2983 0 11:16 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd2: root 2983 1 0 Sep18 ? 
00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd3: root 15685 6557 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testnsd3: root 6557 1 0 Sep19 ? 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 29424 6512 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 6512 1 0 Sep18 ? 00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > /var/mmfs/gen > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort > testdellnode1: drwxr-xr-x 2 root root 4096 Mar 3 2017 cached > testdellnode1: drwxr-xr-x 2 root root 4096 Nov 10 2016 committed > testdellnode1: -rw-r--r-- 1 root root 99 Nov 10 2016 ccr.nodes > testdellnode1: total 12 > testgateway: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed > testgateway: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached > testgateway: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes > testgateway: total 12 > testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 cached > testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 committed > testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks > testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth > testnsd1: -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes > testnsd1: total 8 > testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached > testnsd2: drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed > testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.1 > testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.2 > testnsd2: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks > testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes > testnsd2: total 16 > testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached > testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed > testnsd3: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks > testnsd3: -rw-r--r-- 1 root root 4 Sep 19 15:41 ccr.noauth > testnsd3: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes > testnsd3: total 8 > testsched: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed > testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached > testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes > testsched: total 12 > /var/mmfs/gen > root at testnsd2# more ../ccr/ccr.nodes > 3,0,10.0.6.215,,testnsd3.vampire > 1,0,10.0.6.213,,testnsd1.vampire > 2,0,10.0.6.214,,testnsd2.vampire > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" > testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs > testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs > testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs > testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs > testgateway: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs > testsched: -rw-r--r--. 
1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" > testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/ssl/stage/genkeyData1" > testnsd3: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testnsd1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testnsd2: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testdellnode1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testgateway: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testsched: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > /var/mmfs/gen > root at testnsd2# > > On Sep 20, 2017, at 10:48 AM, Edward Wahl > wrote: > > I've run into this before. We didn't use to use CCR. And restoring nodes for > us is a major pain in the rear as we only allow one-way root SSH, so we have a > number of useful little scripts to work around problems like this. > > Assuming that you have all the necessary files copied to the correct > places, you can manually kick off CCR. > > I think my script does something like: > > (copy the encryption key info) > > scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ > > scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ > > scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ > > :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor > > you should then see like 2 copies of it running under mmksh. > > Ed > > > On Wed, 20 Sep 2017 13:55:28 +0000 > "Buterbaugh, Kevin L" > wrote: > > Hi All, > > testnsd1 and testnsd3 both had hardware issues (power supply and internal HD > respectively). Given that they were 12 year old boxes, we decided to replace > them with other boxes that are a mere 7 years old ? keep in mind that this is > a test cluster. > > Disabling CCR does not work, even with the undocumented ??force? option: > > /var/mmfs/gen > root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force > mmchcluster: Unable to obtain the GPFS configuration file lock. > mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. > mmchcluster: Processing continues without lock protection. > The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key > fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key > fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? 
The authenticity of host 'vmp608.vampire > (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp612.vampire > (10.0.21.12)' can't be established. ECDSA key fingerprint is > SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is > MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's password: > testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire > remote shell process had return code 255. testnsd1.vampire: Host key > verification failed. mmdsh: testnsd1.vampire remote shell process had return > code 255. vmp609.vampire: Host key verification failed. mmdsh: > vmp609.vampire remote shell process had return code 255. vmp608.vampire: > Host key verification failed. mmdsh: vmp608.vampire remote shell process had > return code 255. vmp612.vampire: Host key verification failed. mmdsh: > vmp612.vampire remote shell process had return code 255. > > root at vmp610.vampire's password: vmp610.vampire: > Permission denied, please try again. > > root at vmp610.vampire's password: vmp610.vampire: > Permission denied, please try again. > > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. > > Verifying GPFS is stopped on all nodes ... > The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key > fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key > fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp609.vampire > (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire > (10.0.6.213)' can't be established. ECDSA key fingerprint is > SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is > MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's password: > root at vmp610.vampire's password: > root at vmp610.vampire's password: > > testnsd3.vampire: Host key verification failed. > mmdsh: testnsd3.vampire remote shell process had return code 255. > vmp612.vampire: Host key verification failed. > mmdsh: vmp612.vampire remote shell process had return code 255. > vmp608.vampire: Host key verification failed. > mmdsh: vmp608.vampire remote shell process had return code 255. 
> vmp609.vampire: Host key verification failed. > mmdsh: vmp609.vampire remote shell process had return code 255. > testnsd1.vampire: Host key verification failed. > mmdsh: testnsd1.vampire remote shell process had return code 255. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. mmchcluster: Command failed. > Examine previous error messages to determine cause. /var/mmfs/gen > root at testnsd2# > > I believe that part of the problem may be that there are 4 client nodes that > were removed from the cluster without removing them from the cluster (done by > another SysAdmin who was in a hurry to repurpose those machines). They?re up > and pingable but not reachable by GPFS anymore, which I?m pretty sure is > making things worse. > > Nor does Loic?s suggestion of running mmcommon work (but thanks for the > suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to > start the cluster up failed: > > /var/mmfs/gen > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /var/mmfs/gen > root at testnsd2# > > Thanks. > > Kevin > > On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > > wrote: > > > Hi Kevin, > > Let's me try to understand the problem you have. What's the meaning of node > died here. Are you mean that there are some hardware/OS issue which cannot be > fixed and OS cannot be up anymore? > > I agree with Bob that you can have a try to disable CCR temporally, restore > cluster configuration and enable it again. > > Such as: > > 1. Login to a node which has proper GPFS config, e.g NodeA > 2. Shutdown daemon in all client cluster. > 3. mmchcluster --ccr-disable -p NodeA > 4. mmsdrrestore -a -p NodeA > 5. mmauth genkey propagate -N testnsd1, testnsd3 > 6. mmchcluster --ccr-enable > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale > (GPFS), then please post it to the public IBM developerWroks Forum at > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. > > If your query concerns a potential software error in Spectrum Scale (GPFS) > and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in other > countries. > > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > > "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run > across this before, and it?s because of a bug (as I recall) having to do with > CCR and > > From: "Oesterlin, Robert" > > To: gpfsug > main discussion list > > > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for > the count? 
Sent by: > gpfsug-discuss-bounces at spectrumscale.org > > ________________________________ > > > > OK ? I?ve run across this before, and it?s because of a bug (as I recall) > having to do with CCR and quorum. What I think you can do is set the cluster > to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back > up and then re-enable ccr. > > I?ll see if I can find this in one of the recent 4.2 release nodes. > > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > From: > > > on behalf of "Buterbaugh, Kevin L" > > > Reply-To: gpfsug main discussion list > > > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > > > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? > > Hi All, > > We have a small test cluster that is CCR enabled. It only had/has 3 NSD > servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while > back. I did nothing about it at the time because it was due to be life-cycled > as soon as I finished a couple of higher priority projects. > > Yesterday, testnsd1 also died, which took the whole cluster down. So now > resolving this has become higher priority? ;-) > > I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve > done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also > done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from > testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to > testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? > ssh without a password between those 3 boxes is fine. > > However, when I try to startup GPFS ? or run any GPFS command I get: > > /root > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /root > root at testnsd2# > > I?ve got to run to a meeting right now, so I hope I?m not leaving out any > crucial details here ? does anyone have an idea what I need to do? Thanks? > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 > > > > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 > > > > ? 
> Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From jonathon.anderson at colorado.edu Wed Sep 20 19:55:04 2017 From: jonathon.anderson at colorado.edu (Jonathon A Anderson) Date: Wed, 20 Sep 2017 18:55:04 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , , Message-ID: I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? 
~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... 
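For anyone collecting the pieces from this thread: on a CES protocol node, the NFSv3-only, userdefined-auth sequence sketched above might look roughly like the following (the export path and client wildcard are placeholders, not taken from any particular cluster):

/usr/lpp/mmfs/bin/mmces service disable smb
/usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined
# NFSv3-only export; adjust path, client pattern and squash options for your environment
/usr/lpp/mmfs/bin/mmnfs export add /fs_gpfs01 --client '*(Access_Type=RW,Protocols=3,Squash=no_root_squash)'
/usr/lpp/mmfs/bin/mmnfs export list

As noted above, this trusts whatever UID/GID the NFS client presents, so it is only sensible when all exporting is to clients in the same administrative domain.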
> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss From ewahl at osc.edu Wed Sep 20 20:07:39 2017 From: ewahl at osc.edu (Edward Wahl) Date: Wed, 20 Sep 2017 15:07:39 -0400 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> References: <20170920114844.6bf9f27b@osc.edu> <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> Message-ID: <20170920150739.39f0a4a0@osc.edu> So who was the ccrmaster before? What is/was the quorum config? (tiebreaker disks?) what does 'mmccr check' say? Have you set DEBUG=1 and tried mmstartup to see if it teases out any more info from the error? Ed On Wed, 20 Sep 2017 16:27:48 +0000 "Buterbaugh, Kevin L" wrote: > Hi Ed, > > Thanks for the suggestion ? that?s basically what I had done yesterday after > Googling and getting a hit or two on the IBM DeveloperWorks site. 
I?m > including some output below which seems to show that I?ve got everything set > up but it?s still not working. > > Am I missing something? We don?t use CCR on our production cluster (and this > experience doesn?t make me eager to do so!), so I?m not that familiar with > it... > > Kevin > > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v > grep" | sort testdellnode1: root 2583 1 0 May30 ? > 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testdellnode1: root 6694 2583 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 2023 5828 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testgateway: root 5828 1 0 Sep18 ? > 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: > root 19356 4628 0 11:19 tty1 > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: > root 4628 1 0 Sep19 tty1 > 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: > root 22149 2983 0 11:16 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: > root 2983 1 0 Sep18 ? > 00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: > root 15685 6557 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: > root 6557 1 0 Sep19 ? > 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 29424 6512 0 11:19 ? > 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 > testsched: root 6512 1 0 Sep18 ? > 00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor > 15 /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR > quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr > fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous > error messages to determine cause. /var/mmfs/gen root at testnsd2# mmdsh > -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort testdellnode1: > drwxr-xr-x 2 root root 4096 Mar 3 2017 cached testdellnode1: drwxr-xr-x 2 > root root 4096 Nov 10 2016 committed testdellnode1: -rw-r--r-- 1 root > root 99 Nov 10 2016 ccr.nodes testdellnode1: total 12 testgateway: > drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testgateway: drwxr-xr-x. > 2 root root 4096 Mar 3 2017 cached testgateway: -rw-r--r--. 1 root root > 99 Jun 29 2016 ccr.nodes testgateway: total 12 testnsd1: drwxr-xr-x 2 root > root 6 Sep 19 15:38 cached testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 > committed testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks > testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth testnsd1: > -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes testnsd1: total 8 > testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached testnsd2: > drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed testnsd2: -rw------- 1 > root root 4096 Sep 18 11:50 ccr.paxos.1 testnsd2: -rw------- 1 root root > 4096 Sep 18 11:50 ccr.paxos.2 testnsd2: -rw-r--r-- 1 root root 0 Jun 29 > 2016 ccr.disks testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes > testnsd2: total 16 testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached > testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed testnsd3: > -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd3: -rw-r--r-- 1 root > root 4 Sep 19 15:41 ccr.noauth testnsd3: -rw-r--r-- 1 root root 99 Jun 29 > 2016 ccr.nodes testnsd3: total 8 testsched: drwxr-xr-x. 
2 root root 4096 > Jun 29 2016 committed testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 > cached testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes > testsched: total 12 /var/mmfs/gen root at testnsd2# more ../ccr/ccr.nodes > 3,0,10.0.6.215,,testnsd3.vampire > 1,0,10.0.6.213,,testnsd1.vampire > 2,0,10.0.6.214,,testnsd2.vampire > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" > testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs > testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs > testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs > testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 > 17:43 /var/mmfs/gen/mmsdrfs testgateway: -rw-r--r--. 1 root root 20360 Aug > 25 17:43 /var/mmfs/gen/mmsdrfs testsched: -rw-r--r--. 1 root root 20360 Aug > 25 17:43 /var/mmfs/gen/mmsdrfs /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" > testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs > /var/mmfs/gen > root at testnsd2# mmdsh -F /tmp/cluster.hostnames > "md5sum /var/mmfs/ssl/stage/genkeyData1" testnsd3: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd1: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd2: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testdellnode1: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 > testgateway: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testsched: > ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 /var/mmfs/gen > root at testnsd2# > > On Sep 20, 2017, at 10:48 AM, Edward Wahl > > wrote: > > I've run into this before. We didn't use to use CCR. And restoring nodes for > us is a major pain in the rear as we only allow one-way root SSH, so we have a > number of useful little scripts to work around problems like this. > > Assuming that you have all the necessary files copied to the correct > places, you can manually kick off CCR. > > I think my script does something like: > > (copy the encryption key info) > > scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ > > scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ > > scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ > > :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor > > you should then see like 2 copies of it running under mmksh. > > Ed > > > On Wed, 20 Sep 2017 13:55:28 +0000 > "Buterbaugh, Kevin L" > > > wrote: > > Hi All, > > testnsd1 and testnsd3 both had hardware issues (power supply and internal HD > respectively). Given that they were 12 year old boxes, we decided to replace > them with other boxes that are a mere 7 years old ? keep in mind that this is > a test cluster. > > Disabling CCR does not work, even with the undocumented ??force? option: > > /var/mmfs/gen > root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force > mmchcluster: Unable to obtain the GPFS configuration file lock. > mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. > mmchcluster: Processing continues without lock protection. 
> The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key > fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key > fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp608.vampire > (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp612.vampire > (10.0.21.12)' can't be established. ECDSA key fingerprint is > SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is > MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's > password: testnsd3.vampire: Host key verification failed. mmdsh: > testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: > Host key verification failed. mmdsh: testnsd1.vampire remote shell process > had return code 255. vmp609.vampire: Host key verification failed. mmdsh: > vmp609.vampire remote shell process had return code 255. vmp608.vampire: > Host key verification failed. mmdsh: vmp608.vampire remote shell process had > return code 255. vmp612.vampire: Host key verification failed. mmdsh: > vmp612.vampire remote shell process had return code 255. > > root at vmp610.vampire's > password: vmp610.vampire: Permission denied, please try again. > > root at vmp610.vampire's > password: vmp610.vampire: Permission denied, please try again. > > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. > > Verifying GPFS is stopped on all nodes ... > The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. > ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. > ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. > Are you sure you want to continue connecting (yes/no)? The authenticity of > host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key > fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key > fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you > sure you want to continue connecting (yes/no)? The authenticity of host > 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is > SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is > MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'vmp609.vampire > (10.0.21.9)' can't be established. ECDSA key fingerprint is > SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. 
ECDSA key fingerprint is > MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to > continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire > (10.0.6.213)' can't be established. ECDSA key fingerprint is > SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is > MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to > continue connecting (yes/no)? > root at vmp610.vampire's > password: > root at vmp610.vampire's > password: > root at vmp610.vampire's > password: > > testnsd3.vampire: Host key verification failed. > mmdsh: testnsd3.vampire remote shell process had return code 255. > vmp612.vampire: Host key verification failed. > mmdsh: vmp612.vampire remote shell process had return code 255. > vmp608.vampire: Host key verification failed. > mmdsh: vmp608.vampire remote shell process had return code 255. > vmp609.vampire: Host key verification failed. > mmdsh: vmp609.vampire remote shell process had return code 255. > testnsd1.vampire: Host key verification failed. > mmdsh: testnsd1.vampire remote shell process had return code 255. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied, please try again. > vmp610.vampire: Permission denied > (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire > remote shell process had return code 255. mmchcluster: Command failed. > Examine previous error messages to determine cause. /var/mmfs/gen > root at testnsd2# > > I believe that part of the problem may be that there are 4 client nodes that > were removed from the cluster without removing them from the cluster (done by > another SysAdmin who was in a hurry to repurpose those machines). They?re up > and pingable but not reachable by GPFS anymore, which I?m pretty sure is > making things worse. > > Nor does Loic?s suggestion of running mmcommon work (but thanks for the > suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to > start the cluster up failed: > > /var/mmfs/gen > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /var/mmfs/gen > root at testnsd2# > > Thanks. > > Kevin > > On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > > wrote: > > > Hi Kevin, > > Let's me try to understand the problem you have. What's the meaning of node > died here. Are you mean that there are some hardware/OS issue which cannot be > fixed and OS cannot be up anymore? > > I agree with Bob that you can have a try to disable CCR temporally, restore > cluster configuration and enable it again. > > Such as: > > 1. Login to a node which has proper GPFS config, e.g NodeA > 2. Shutdown daemon in all client cluster. > 3. mmchcluster --ccr-disable -p NodeA > 4. mmsdrrestore -a -p NodeA > 5. mmauth genkey propagate -N testnsd1, testnsd3 > 6. 
mmchcluster --ccr-enable > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale > (GPFS), then please post it to the public IBM developerWroks Forum at > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. > > If your query concerns a potential software error in Spectrum Scale (GPFS) > and you have an IBM software maintenance contract please contact > 1-800-237-5511 in the United States or your local IBM Service Center in other > countries. > > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > > "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run > across this before, and it?s because of a bug (as I recall) having to do with > CCR and > > From: "Oesterlin, Robert" > > > To: gpfsug main discussion list > > > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for > the count? Sent by: > gpfsug-discuss-bounces at spectrumscale.org > > ________________________________ > > > > OK ? I?ve run across this before, and it?s because of a bug (as I recall) > having to do with CCR and quorum. What I think you can do is set the cluster > to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back > up and then re-enable ccr. > > I?ll see if I can find this in one of the recent 4.2 release nodes. > > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > > > From: > > > on behalf of "Buterbaugh, Kevin L" > > > Reply-To: gpfsug main discussion list > > > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > > > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? > > Hi All, > > We have a small test cluster that is CCR enabled. It only had/has 3 NSD > servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while > back. I did nothing about it at the time because it was due to be life-cycled > as soon as I finished a couple of higher priority projects. > > Yesterday, testnsd1 also died, which took the whole cluster down. So now > resolving this has become higher priority? ;-) > > I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve > done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also > done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from > testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to > testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? > ssh without a password between those 3 boxes is fine. > > However, when I try to startup GPFS ? or run any GPFS command I get: > > /root > root at testnsd2# mmstartup -a > get file failed: Not enough CCR quorum nodes available (err 809) > gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 > mmstartup: Command failed. Examine previous error messages to determine cause. > /root > root at testnsd2# > > I?ve got to run to a meeting right now, so I hope I?m not leaving out any > crucial details here ? 
does anyone have an idea what I need to do? Thanks? > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu > - (615)875-9633 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at > spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at > spectrumscale.org > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 > > > > > -- > > Ed Wahl > Ohio Supercomputer Center > 614-292-9302 > > > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > -- Ed Wahl Ohio Supercomputer Center 614-292-9302 From tarak.patel at canada.ca Wed Sep 20 21:23:00 2017 From: tarak.patel at canada.ca (Patel, Tarak (SSC/SPC)) Date: Wed, 20 Sep 2017 20:23:00 +0000 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu> , , Message-ID: Hi, Recently we deployed 3 sets of CES nodes where we are using LDAP for authentication service. We had to create a user in ldap which was used by 'mmuserauth service create' command. Note that SMB needs to be disabled ('mmces service disable smb') if not being used before issuing 'mmuserauth service create'. By default, CES deployment enables SMB (' spectrumscale config protocols'). Tarak -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September, 2017 14:55 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." 
I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. 
mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but not > for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the > NFS client tells you". This of course only works sanely if each NFS > export is only to a set of machines in the same administrative domain > that manages their UID/GIDs. Exporting to two sets of machines that > don't coordinate their UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpi > Bv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiy > liSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ > 0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGV > srSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwC > YeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbj > XI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuv > EeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discus > s > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org 
http://secure-web.cisco.com/1w-ldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-fNVoZ49ioTlOwQoRbyC_MjpoBPlD3jfpV_knuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM_jYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-Vs_qLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4_MtVXKzQRwQqemODDjSa5my7zl98vobN_ui-cRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-Cl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A/http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From chetkulk at in.ibm.com Thu Sep 21 06:33:53 2017 From: chetkulk at in.ibm.com (Chetan R Kulkarni) Date: Thu, 21 Sep 2017 11:03:53 +0530 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: <27469.1500914134@turing-police.cc.vt.edu>, , Message-ID: Hi Jonathon, I can configure file userdefined authentication with only NFS enabled/running on my test setup (SMB was disabled). Please check if following steps help fix your issue: 1> remove existing file auth if any /usr/lpp/mmfs/bin/mmuserauth service remove --data-access-method file 2> disable smb service /usr/lpp/mmfs/bin/mmces service disable smb /usr/lpp/mmfs/bin/mmces service list -a 3> configure userdefined file auth /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined 4> if above fails retry mmuserauth in debug mode as below and please share error log /tmp/userdefined.log. Also share spectrum scale version you are running with. export DEBUG=1; /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined > /tmp/userdefined.log 2>&1; unset DEBUG /usr/lpp/mmfs/bin/mmdiag --version 5> if mmuserauth succeeds in step 3> above; you also need to correct your mmnfs cli command as below. You missed to type in Access_Type= and Squash= in client definition. mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu (Access_Type=rw,Squash=root_squash);dtn*.rc.int.colorado.edu (Access_Type=rw,Squash=root_squash)' Thanks, Chetan. From: Jonathon A Anderson To: gpfsug main discussion list Date: 09/21/2017 12:25 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. 
HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu (rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Varun Mittal3 Sent: Tuesday, July 25, 2017 9:44:24 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sorry a small typo: mmuserauth service create --data-access-method file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated nod]Varun Mittal3---26/07/2017 09:12:27 AM---Hi Did you try to run this command from a CES designated node ? From: Varun Mittal3/India/IBM To: gpfsug main discussion list Date: 26/07/2017 09:12 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication ________________________________ Hi Did you try to run this command from a CES designated node ? If no, then try executing the command from a CES node: mmuserauth service create --data-access-type file --type userdefined Best regards, Varun Mittal Cloud/Object Scrum @ Spectrum Scale ETZ, Pune [Inactive hide details for Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive err]Ilan Schwarts ---25/07/2017 10:22:26 AM---Hi, While trying to add the userdefined auth, I receive error that SMB From: Ilan Schwarts To: gpfsug main discussion list Date: 25/07/2017 10:22 AM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi, While trying to add the userdefined auth, I receive error that SMB service not enabled. I am currently working on a spectrum scale cluster, and i dont have the SMB package, I am waiting for it.. is there a way to export NFSv3 using the spectrum scale tools without SMB package ? [root at LH20-GPFS1 ~]# mmuserauth service create --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. 
Examine previous error messages to determine cause. I exported the NFS via /etc/exports and than ./exportfs -a .. It works fine, I was able to mount the gpfs export from another machine.. this was my work-around since the spectrum scale tools failed to export NFSv3 On Mon, Jul 24, 2017 at 7:35 PM, wrote: > On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said: >> Hi, >> I have gpfs with 2 Nodes (redhat). >> I am trying to create NFS share - So I would be able to mount and >> access it from another linux machine. > >> While trying to create NFS (I execute the following): >> [root at LH20-GPFS1 ~]# mmnfs export add /fs_gpfs01 -c "* >> Access_Type=RW,Protocols=3:4,Squash=no_root_squash)" > > You can get away with little to no authentication for NFSv3, but > not for NFSv4. Try with Protocols=3 only and > > mmuserauth service create --type userdefined > > that should get you Unix-y NFSv3 UID/GID support and "trust what the NFS > client tells you". This of course only works sanely if each NFS export is > only to a set of machines in the same administrative domain that manages their > UID/GIDs. Exporting to two sets of machines that don't coordinate their > UID/GID space is, of course, where hilarity and hijinks ensue.... > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__secure-2Dweb.cisco.com_1w-2Dldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-2DfNVoZ49ioTlOwQoRbyC-5FMjpoBPlD3jfpV-5FknuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM-5FjYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-2DVs-5FqLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4-5FMtVXKzQRwQqemODDjSa5my7zl98vobN-5Fui-2DcRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-2DCl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A_http-253A-252F-252Fgpfsug.org-252Fmailman-252Flistinfo-252Fgpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=uic-29lyJ5TCiTRi0FyznYhKJx5I7Vzu80WyYuZ4_iM&m=VqyIekg3Wtz0ukw-QSXsEXOoi5rZ0gnMeIPyFNGpllA&s=DNgplGZ30awqnvnd4Ju39pzv3rlk18Kf6NGe7iDX4Mk&e= > -- - Ilan Schwarts _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__secure-2Dweb.cisco.com_1w-2Dldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-2DfNVoZ49ioTlOwQoRbyC-5FMjpoBPlD3jfpV-5FknuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM-5FjYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-2DVs-5FqLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4-5FMtVXKzQRwQqemODDjSa5my7zl98vobN-5Fui-2DcRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-2DCl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A_http-253A-252F-252Fgpfsug.org-252Fmailman-252Flistinfo-252Fgpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=uic-29lyJ5TCiTRi0FyznYhKJx5I7Vzu80WyYuZ4_iM&m=VqyIekg3Wtz0ukw-QSXsEXOoi5rZ0gnMeIPyFNGpllA&s=DNgplGZ30awqnvnd4Ju39pzv3rlk18Kf6NGe7iDX4Mk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org 
https://urldefense.proofpoint.com/v2/url?u=http-3A__secure-2Dweb.cisco.com_1w-2Dldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-2DfNVoZ49ioTlOwQoRbyC-5FMjpoBPlD3jfpV-5FknuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM-5FjYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-2DVs-5FqLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4-5FMtVXKzQRwQqemODDjSa5my7zl98vobN-5Fui-2DcRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-2DCl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A_http-253A-252F-252Fgpfsug.org-252Fmailman-252Flistinfo-252Fgpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=uic-29lyJ5TCiTRi0FyznYhKJx5I7Vzu80WyYuZ4_iM&m=VqyIekg3Wtz0ukw-QSXsEXOoi5rZ0gnMeIPyFNGpllA&s=DNgplGZ30awqnvnd4Ju39pzv3rlk18Kf6NGe7iDX4Mk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__secure-2Dweb.cisco.com_1w-2Dldlm8bq5oYiMuHk7N1T32DW18VkxjnfkMWDjdpiBv1WJToz9PCO1zVyGvWIVP3-2DfNVoZ49ioTlOwQoRbyC-5FMjpoBPlD3jfpV-5FknuzViyRNZiyliSGH9rx5nGVvTLSPrjIwzvUIZDadCuNXgM-5FjYCVBE2RsDpg8o4LCjJv9QIZPbyHlKrkoQ0sNGXOZPYT7gxpo8sVjoxKQbOgQzkDnPMQoa2a8miTP19fLkB5HqV5cJv3U-2DVs-5FqLtyJGVsrSgLu2wQoDMxymVwm5mcRWO6MYfl4-5FMtVXKzQRwQqemODDjSa5my7zl98vobN-5Fui-2DcRwCYeVbOwEd57CjaYRzKcBu6Dbd2TmGar7JUNWVtg1dZPTv6uothD6V4g0Q0MuXZsBICzfxbjXI9WlB3Tiu3ty0oxenYrM8yxE-2DCl57VhmV4KlY18EHMFncfLtRkk9cTHtfrEjiXBROhCuvEeqhrYT6A_http-253A-252F-252Fgpfsug.org-252Fmailman-252Flistinfo-252Fgpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=uic-29lyJ5TCiTRi0FyznYhKJx5I7Vzu80WyYuZ4_iM&m=VqyIekg3Wtz0ukw-QSXsEXOoi5rZ0gnMeIPyFNGpllA&s=DNgplGZ30awqnvnd4Ju39pzv3rlk18Kf6NGe7iDX4Mk&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=uic-29lyJ5TCiTRi0FyznYhKJx5I7Vzu80WyYuZ4_iM&m=VqyIekg3Wtz0ukw-QSXsEXOoi5rZ0gnMeIPyFNGpllA&s=AliY037R_W1y8Ym6nPI1XDP2yCq47JwtTPhj9IppwOM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From andreas.mattsson at maxiv.lu.se Thu Sep 21 13:09:29 2017 From: andreas.mattsson at maxiv.lu.se (Andreas Mattsson) Date: Thu, 21 Sep 2017 12:09:29 +0000 Subject: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES In-Reply-To: References: , Message-ID: <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se> Since I solved this old issue a long time ago, I'd thought I'd come back and report the solution in case someone else encounters similar problems in the future. Original problem reported by users: Copying files between folders on NFS exports from a CES server gave random timestamps on the files. Also, apart from the initial reported problem, there where issues where users sometimes couldn't change or delete files that they where owners of. Background: We have a Active Directory with RFC2307 posix attributes populated, and use the built in Winbind-based AD authentication with RFC2307 ID mapping of our Spectrum Scale CES protocol servers. All our Linux clients and servers are also AD integrated, using Nslcd and nss-pam-ldapd. 
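(A quick way to see the name/case mismatch described in the next paragraphs is to resolve the same group on an NFS client and on a CES protocol node and compare the results; the user name below is a made-up example and the GID is the one used for illustration further down:

getent group 1234      # client (nslcd) may return "UserGroup:*:1234:" while the CES node (winbind) returns "usergroup:*:1234:"
id -Gn someuser        # compare the case of the group names each side reports

If the two sides disagree only in case, you are looking at the situation described below. Remember that any change to the ID-mapping setup usually also needs the relevant caches cleared, for example by restarting nslcd on the clients; sssd and winbind have their own cache-flushing tools.)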
Trigger: If a user was part of a AD group with a mixed case name, and this group gave access to a folder, and the NFS mount was done using NFSv4, the behavior in my original post occurred when copying or changing files in that folder. Cause: Active Directory handle LDAP-requests case insensitive, but results are returned with case retained. Winbind and SSSD-AD converts groups and usernames to lower case. Nslcd retains case. We run NFS with managed GIDs. Managed GIDs in NFSv3 seems to be handled case insensitive, or to ignore the actual group name after it has resolved the GID-number of the group, while NFSv4 seems to handle group names case sensitive and check the actual group name for certain operations even if the GID-number matches. Don't fully understand the mechanism behind why certain file operations would work but others not, but in essence a user would be part of a group called "UserGroup" with GID-number 1234 in AD and on the client, but would be part of a group called "usergroup" with GID-number 1234 on the CES server. Any operation that's authorized on the GID-number, or a case insensitive lookup of the group name, would work. Any operation authorized by a case sensitive group lookup would fail. Three different workarounds where found to work: 1. Rename groups and users to lower case in AD 2. Change from Nslcd to either SSSD or Winbind on the clients 3. Change from NFSv4 to NFSv3 when mounting NFS Remember to clear ID-mapping caches. Regards, Andreas ___________________________________ [https://mail.google.com/mail/u/0/?ui=2&ik=b0a6f02971&view=att&th=14618fab2daf0e10&attid=0.1.1&disp=emb&zw&atsh=1] Andreas Mattsson System Engineer MAX IV Laboratory Lund University Tel: +46-706-649544 E-mail: andreas.mattsson at maxlab.lu.se ________________________________ Fr?n: gpfsug-discuss-bounces at spectrumscale.org f?r Stephen Ulmer Skickat: den 3 februari 2017 14:35:21 Till: gpfsug main discussion list ?mne: Re: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES Does the cp actually complete? As in, does it copy all of the blocks? What?s the exit code? A cp?d file should have ?new? metadata. That is, it should have it?s own dates, owners, etc. (not necessarily copied from the source file). I ran ?strace cp foo1 foo2?, and it was pretty instructive, maybe that would get you more info. On CentOS strace is in it?s own package, YMMV. -- Stephen On Feb 3, 2017, at 8:19 AM, Andreas Mattsson > wrote: That works. ?touch test100? Feb 3 14:16 test100 ?cp test100 test101? Feb 3 14:16 test100 Apr 21 2027 test101 ?touch ?r test100 test101? Feb 3 14:16 test100 Feb 3 14:16 test101 /Andreas That?s a cool one. :) What if you use the "random date" file as a time reference to touch another file (like, 'touch -r file02 file03?)? -- Stephen On Feb 3, 2017, at 7:46 AM, Andreas Mattsson > wrote: I?m having some really strange timestamp behaviour when doing file operations on NFS mounts shared via CES on spectrum scale 4.2.1.1 The NFS clients are up to date Centos and Debian machines. All Scale servers and NFS clients have correct date and time via NTP. Creating a file, for instance ?touch file00?, gives correct timestamp. Moving the file, ?mv file00 file01?, gives correct timestamp Copying the file, ?cp file01 file02?, gives a random timestamp anywhere in time, for instance Oct 12 2095 or Feb 29 1976 or something similar. This is only via NFS. Copying the file via a native gpfs-mount or via SMB gives a correct timestamp. 
Doing the same operation over NFS to other NFS-servers works correct, it is only when operating on the NFS-share from the Spectrum Scale CES the issue occurs. Have anyone seen this before? Regards, Andreas Mattsson _____________________________________________ Andreas Mattsson Systems Engineer MAX IV Laboratory Lund University P.O. Box 118, SE-221 00 Lund, Sweden Visiting address: Fotongatan 2, 225 94 Lund Mobile: +46 706 64 95 44 andreas.mattsson at maxiv.se www.maxiv.se _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From taylorm at us.ibm.com Thu Sep 21 15:33:00 2017 From: taylorm at us.ibm.com (Michael L Taylor) Date: Thu, 21 Sep 2017 07:33:00 -0700 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: Message-ID: Hi Jonathon, We were able to run this scenario successfully in our lab at the latest released 4.2.3.4. # /usr/lpp/mmfs/bin/mmdiag --version === mmdiag: version === Current GPFS build: "4.2.3.4 ". # /usr/lpp/mmfs/bin/mmces service list -a Enabled services: NFS node1.test.ibm.com: NFS is running # /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined File authentication configuration completed successfully. # rpm -qa | grep gpfs gpfs.ext-4.2.3-4.x86_64 gpfs.docs-4.2.3-4.noarch gpfs.gskit-8.0.50-75.x86_64 gpfs.gpl-4.2.3-4.noarch gpfs.msg.en_US-4.2.3-4.noarch nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 gpfs.base-4.2.3-4.x86_64 # rpm -qa | grep nfs-gan nfs-ganesha-utils-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 09/20/2017 12:07 PM Subject: gpfsug-discuss Digest, Vol 68, Issue 42 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=BpVUgvFT2Qwgw0hveEgQaHFwn2mjeQjeBrkXHX_aC0A&m=2oGcWc1xx6zOclryoU2BdJykABuIR118zXTmSAA8msU&s=7q0JMYVHMSGlUAYquNMlrDRF6BDj6-76Oc4VbXrvlHE&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: export nfs share on gpfs with no authentication (Jonathon A Anderson) ---------------------------------------------------------------------- Message: 1 Date: Wed, 20 Sep 2017 18:55:04 +0000 From: Jonathon A Anderson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Message-ID: Content-Type: text/plain; charset="us-ascii" I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. 
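For the record, the NFS-only sequence that does work once the cluster is at 4.2.3.4 (per the lab verification quoted earlier in this digest) is roughly the following; the export path and client spec are only placeholders patterned on the examples in this thread, and on 4.2.2-level code the mmuserauth step still insists on SMB being enabled, which is the error quoted above:

mmces service enable NFS
mmuserauth service create --data-access-method file --type userdefined
mmnfs export add /gpfs/fs1 --client "clientA(Access_Type=RW,Protocols=3,Squash=no_root_squash)"
mmnfs export list

With userdefined authentication the NFS clients are trusted for their own UID/GID handling, so sticking to NFSv3 (Protocols=3), as suggested earlier in this thread, keeps the setup simple.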
________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu (rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Thu Sep 21 18:09:52 2017 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Thu, 21 Sep 2017 17:09:52 +0000 Subject: [gpfsug-discuss] CCR cluster down for the count? In-Reply-To: <20170920150739.39f0a4a0@osc.edu> References: <20170920114844.6bf9f27b@osc.edu> <28D10363-A8F3-439B-81DB-EB0E4E750FFD@vanderbilt.edu> <20170920150739.39f0a4a0@osc.edu> Message-ID: Hi All, Ralf Eberhard of IBM helped me resolve this off list. The key was to temporarily make testnsd1 and testnsd3 not be quorum nodes by making sure GPFS was down and then executing: mmchnode --nonquorum -N testnsd1,testnsd3 --force That gave me some scary messages about overriding normal GPFS quorum semantics, but once that was done I was able to run an 'mmstartup -a' and bring up the cluster! Once it was up and I had verified things were working properly I then shut it back down so that I could rerun the mmchnode (without the --force) to make testnsd1 and testnsd3 quorum nodes again. Thanks to all who helped me out here... Kevin On Sep 20, 2017, at 2:07 PM, Edward Wahl > wrote: So who was the ccrmaster before? What is/was the quorum config? (tiebreaker disks?) 
what does 'mmccr check' say? Have you set DEBUG=1 and tried mmstartup to see if it teases out any more info from the error? Ed On Wed, 20 Sep 2017 16:27:48 +0000 "Buterbaugh, Kevin L" > wrote: Hi Ed, Thanks for the suggestion ? that?s basically what I had done yesterday after Googling and getting a hit or two on the IBM DeveloperWorks site. I?m including some output below which seems to show that I?ve got everything set up but it?s still not working. Am I missing something? We don?t use CCR on our production cluster (and this experience doesn?t make me eager to do so!), so I?m not that familiar with it... Kevin /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ps -ef | grep mmccr | grep -v grep" | sort testdellnode1: root 2583 1 0 May30 ? 00:10:33 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testdellnode1: root 6694 2583 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 2023 5828 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testgateway: root 5828 1 0 Sep18 ? 00:00:19 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 19356 4628 0 11:19 tty1 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd1: root 4628 1 0 Sep19 tty1 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 22149 2983 0 11:16 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd2: root 2983 1 0 Sep18 ? 00:00:27 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 15685 6557 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testnsd3: root 6557 1 0 Sep19 ? 00:00:04 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 29424 6512 0 11:19 ? 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 testsched: root 6512 1 0 Sep18 ? 00:00:20 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/ccr" | sort testdellnode1: drwxr-xr-x 2 root root 4096 Mar 3 2017 cached testdellnode1: drwxr-xr-x 2 root root 4096 Nov 10 2016 committed testdellnode1: -rw-r--r-- 1 root root 99 Nov 10 2016 ccr.nodes testdellnode1: total 12 testgateway: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testgateway: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testgateway: -rw-r--r--. 
1 root root 99 Jun 29 2016 ccr.nodes testgateway: total 12 testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 cached testnsd1: drwxr-xr-x 2 root root 6 Sep 19 15:38 committed testnsd1: -rw-r--r-- 1 root root 0 Sep 19 15:39 ccr.disks testnsd1: -rw-r--r-- 1 root root 4 Sep 19 15:38 ccr.noauth testnsd1: -rw-r--r-- 1 root root 99 Sep 19 15:39 ccr.nodes testnsd1: total 8 testnsd2: drwxr-xr-x 2 root root 22 Mar 3 2017 cached testnsd2: drwxr-xr-x 2 root root 4096 Sep 18 11:49 committed testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.1 testnsd2: -rw------- 1 root root 4096 Sep 18 11:50 ccr.paxos.2 testnsd2: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd2: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd2: total 16 testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 cached testnsd3: drwxr-xr-x 2 root root 6 Sep 19 15:41 committed testnsd3: -rw-r--r-- 1 root root 0 Jun 29 2016 ccr.disks testnsd3: -rw-r--r-- 1 root root 4 Sep 19 15:41 ccr.noauth testnsd3: -rw-r--r-- 1 root root 99 Jun 29 2016 ccr.nodes testnsd3: total 8 testsched: drwxr-xr-x. 2 root root 4096 Jun 29 2016 committed testsched: drwxr-xr-x. 2 root root 4096 Mar 3 2017 cached testsched: -rw-r--r--. 1 root root 99 Jun 29 2016 ccr.nodes testsched: total 12 /var/mmfs/gen root at testnsd2# more ../ccr/ccr.nodes 3,0,10.0.6.215,,testnsd3.vampire 1,0,10.0.6.213,,testnsd1.vampire 2,0,10.0.6.214,,testnsd2.vampire /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "ls -l /var/mmfs/gen/mmsdrfs" testnsd1: -rw-r--r-- 1 root root 20360 Sep 19 15:21 /var/mmfs/gen/mmsdrfs testnsd3: -rw-r--r-- 1 root root 20360 Sep 19 15:34 /var/mmfs/gen/mmsdrfs testnsd2: -rw-r--r-- 1 root root 20360 Aug 25 17:34 /var/mmfs/gen/mmsdrfs testdellnode1: -rw-r--r-- 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testgateway: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs testsched: -rw-r--r--. 1 root root 20360 Aug 25 17:43 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/gen/mmsdrfs" testnsd1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd3: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testnsd2: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testdellnode1: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testgateway: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs testsched: 7120c79d9d767466c7629763abb7f730 /var/mmfs/gen/mmsdrfs /var/mmfs/gen root at testnsd2# mmdsh -F /tmp/cluster.hostnames "md5sum /var/mmfs/ssl/stage/genkeyData1" testnsd3: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testnsd2: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testdellnode1: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testgateway: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 testsched: ee6d345a87202a9f9d613e4862c92811 /var/mmfs/ssl/stage/genkeyData1 /var/mmfs/gen root at testnsd2# On Sep 20, 2017, at 10:48 AM, Edward Wahl > wrote: I've run into this before. We didn't use to use CCR. And restoring nodes for us is a major pain in the rear as we only allow one-way root SSH, so we have a number of useful little scripts to work around problems like this. Assuming that you have all the necessary files copied to the correct places, you can manually kick off CCR. 
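The scp fragment just below lost its destination hosts along the way; with <newnode> standing in as a placeholder for the node being restored, the copy-and-restart sequence amounts to roughly:

# run from a node that still has a good configuration (after copying the encryption key info)
scp /var/mmfs/ccr/ccr.nodes <newnode>:/var/mmfs/ccr/
scp /var/mmfs/gen/mmsdrfs <newnode>:/var/mmfs/gen/
scp /var/mmfs/ssl/stage/genkeyData1 <newnode>:/var/mmfs/ssl/stage/
ssh <newnode> /usr/lpp/mmfs/bin/mmcommon startCcrMonitor
# two mmccrmonitor processes should then be running under mmksh on <newnode>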
I think my script does something like: (copy the encryption key info) scp /var/mmfs/ccr/ccr.nodes :/var/mmfs/ccr/ scp /var/mmfs/gen/mmsdrfs :/var/mmfs/gen/ scp /var/mmfs/ssl/stage/genkeyData1 :/var/mmfs/ssl/stage/ :/usr/lpp/mmfs/bin/mmcommon startCcrMonitor you should then see like 2 copies of it running under mmksh. Ed On Wed, 20 Sep 2017 13:55:28 +0000 "Buterbaugh, Kevin L" > wrote: Hi All, testnsd1 and testnsd3 both had hardware issues (power supply and internal HD respectively). Given that they were 12 year old boxes, we decided to replace them with other boxes that are a mere 7 years old ? keep in mind that this is a test cluster. Disabling CCR does not work, even with the undocumented ??force? option: /var/mmfs/gen root at testnsd2# mmchcluster --ccr-disable -p testnsd2 -s testnsd1 --force mmchcluster: Unable to obtain the GPFS configuration file lock. mmchcluster: GPFS was unable to obtain a lock from node testnsd1.vampire. mmchcluster: Processing continues without lock protection. The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. root at vmp610.vampire's password: vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. Verifying GPFS is stopped on all nodes ... The authenticity of host 'testnsd3.vampire (10.0.6.215)' can't be established. 
ECDSA key fingerprint is SHA256:Ky1pkjsC/kvt4RA8PJuEh/W3vcxCJZplr2m1XHr+UwI. ECDSA key fingerprint is MD5:55:59:a0:2a:6e:a1:00:58:85:3d:ac:86:0e:cd:2a:8a. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp612.vampire (10.0.21.12)' can't be established. ECDSA key fingerprint is SHA256:zKXqPt8rIMZWSAYavKEuaAVIm31OGVovoWVU+dBTRPM. ECDSA key fingerprint is MD5:72:4d:fb:22:4e:b3:0e:04:37:be:16:74:ae:ea:05:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp608.vampire (10.0.21.8)' can't be established. ECDSA key fingerprint is SHA256:tvtNWN9b7/Qknb/Am8x7FzyMngi6R3f5SHBqATNtLzw. ECDSA key fingerprint is MD5:fc:4e:87:fb:09:82:cd:67:b0:7d:7f:c7:4b:83:b9:6c. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'vmp609.vampire (10.0.21.9)' can't be established. ECDSA key fingerprint is SHA256:/gX6eSp/shsRboVFcUFcNCtGSfbBIWQZ/CWjA6gb17Q. ECDSA key fingerprint is MD5:ca:4d:58:8c:91:28:25:7b:5b:b1:0d:a3:72:a3:00:bb. Are you sure you want to continue connecting (yes/no)? The authenticity of host 'testnsd1.vampire (10.0.6.213)' can't be established. ECDSA key fingerprint is SHA256:WPiTtyuyzhuv+lRRpgDjLuHpyHyk/W3+c5N9SabWvnE. ECDSA key fingerprint is MD5:26:26:2a:bf:e4:cb:1d:a8:27:35:96:ef:b5:96:e0:29. Are you sure you want to continue connecting (yes/no)? root at vmp610.vampire's password: root at vmp610.vampire's password: root at vmp610.vampire's password: testnsd3.vampire: Host key verification failed. mmdsh: testnsd3.vampire remote shell process had return code 255. vmp612.vampire: Host key verification failed. mmdsh: vmp612.vampire remote shell process had return code 255. vmp608.vampire: Host key verification failed. mmdsh: vmp608.vampire remote shell process had return code 255. vmp609.vampire: Host key verification failed. mmdsh: vmp609.vampire remote shell process had return code 255. testnsd1.vampire: Host key verification failed. mmdsh: testnsd1.vampire remote shell process had return code 255. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied, please try again. vmp610.vampire: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password). mmdsh: vmp610.vampire remote shell process had return code 255. mmchcluster: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# I believe that part of the problem may be that there are 4 client nodes that were removed from the cluster without removing them from the cluster (done by another SysAdmin who was in a hurry to repurpose those machines). They?re up and pingable but not reachable by GPFS anymore, which I?m pretty sure is making things worse. Nor does Loic?s suggestion of running mmcommon work (but thanks for the suggestion!) ? actually the mmcommon part worked, but a subsequent attempt to start the cluster up failed: /var/mmfs/gen root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. Examine previous error messages to determine cause. /var/mmfs/gen root at testnsd2# Thanks. Kevin On Sep 19, 2017, at 10:07 PM, IBM Spectrum Scale > wrote: Hi Kevin, Let's me try to understand the problem you have. What's the meaning of node died here. Are you mean that there are some hardware/OS issue which cannot be fixed and OS cannot be up anymore? 
I agree with Bob that you can have a try to disable CCR temporally, restore cluster configuration and enable it again. Such as: 1. Login to a node which has proper GPFS config, e.g NodeA 2. Shutdown daemon in all client cluster. 3. mmchcluster --ccr-disable -p NodeA 4. mmsdrrestore -a -p NodeA 5. mmauth genkey propagate -N testnsd1, testnsd3 6. mmchcluster --ccr-enable Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fcommunity%2Fforums%2Fhtml%2Fforum%3Fid%3D11111111-0000-0000-0000-000000000479&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=8OL9COHsb4M%2BZOyWta92acdO8K1Ez8HJfHbrCdDsmRs%3D&reserved=0. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Oesterlin, Robert" ---09/20/2017 07:39:55 AM---OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and From: "Oesterlin, Robert" > To: gpfsug main discussion list > Date: 09/20/2017 07:39 AM Subject: Re: [gpfsug-discuss] CCR cluster down for the count? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ OK ? I?ve run across this before, and it?s because of a bug (as I recall) having to do with CCR and quorum. What I think you can do is set the cluster to non-ccr (mmchcluster ?ccr-disable) with all the nodes down, bring it back up and then re-enable ccr. I?ll see if I can find this in one of the recent 4.2 release nodes. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Tuesday, September 19, 2017 at 4:03 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count? Hi All, We have a small test cluster that is CCR enabled. It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients. testnsd3 died a while back. I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects. Yesterday, testnsd1 also died, which took the whole cluster down. So now resolving this has become higher priority? ;-) I took two other boxes and set them up as testnsd1 and 3, respectively. I?ve done a ?mmsdrrestore -p testnsd2 -R /usr/bin/scp? on both of them. I?ve also done a "mmccr setup -F? and copied the ccr.disks and ccr.nodes files from testnsd2 to them. And I?ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3. In case it?s not obvious from the above, networking is fine ? ssh without a password between those 3 boxes is fine. However, when I try to startup GPFS ? or run any GPFS command I get: /root root at testnsd2# mmstartup -a get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmstartup: Command failed. 
Examine previous error messages to determine cause. /root root at testnsd2# I?ve got to run to a meeting right now, so I hope I?m not leaving out any crucial details here ? does anyone have an idea what I need to do? Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DIbxtjdkPAM2Sbon4Lbbi4w%26m%3DmBSa534LB4C2zN59ZsJSlginQqfcrutinpAPYNDqU_Y%26s%3DYJEapknqzE2d9kwZzZuu6gEW0DzBoM-o94pXGEeCfuI%26e&data=02%7C01%7CKevin.Buterbaugh%40Vanderbilt.Edu%7C745cfeaac7264124bb8c08d5003f162a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415193316350738&sdata=oQ4u%2BdyyYLY7HzaOqRPEGjUVhi7AQF%2BvbvnWA4bhuXE%3D&reserved=0= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C494f0469ec084568b39608d4ffd4b8c2%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636414736486816768&sdata=kBvEL7Kp2JMGuLIL4NX3UV7h3emaayQSbHr8O1F2CXc%3D&reserved=0 -- Ed Wahl Ohio Supercomputer Center 614-292-9302 ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -- Ed Wahl Ohio Supercomputer Center 614-292-9302 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cfabfdb4659d249e2d20308d5005ae1ab%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636415312700069585&sdata=Z59ik0w%2BaK6bV2JsDxSNt%2FsqwR1ESuqkXTQVBlRjDgw%3D&reserved=0 ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkr at lbl.gov Thu Sep 21 19:49:29 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Thu, 21 Sep 2017 11:49:29 -0700 Subject: [gpfsug-discuss] User Meeting & SPXXL in NYC In-Reply-To: References: Message-ID: Registration space is getting tight. We decided on a room reconfiguration today to make a little more room. So if you tried to register and were told it was full try again. If it fills up again and you want to register, but can?t drop me an email and I?ll see what we can do. Best, Kristy > On Sep 20, 2017, at 9:00 AM, Kristy Kallback-Rose wrote: > > Thanks Doug. > > If you plan to go, *do register*. GPFS Day is free, but we need to know how many will attend. Register using the link on the HPCXXL event page below. > > Cheers, > Kristy > >> On Sep 20, 2017, at 1:28 AM, Douglas O'flaherty > wrote: >> >> >> Reminder that the SPXXL day on IBM Spectrum Scale in New York is open to all. It is Thursday the 28th. There is also a Power day on Wednesday. 
>> >> >> For more information >> http://hpcxxl.org/summer-2017-meeting-september-24-29-new-york-city/ >> >> Doug >> >> Mobile >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Sep 22 23:08:58 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 22 Sep 2017 22:08:58 +0000 Subject: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES In-Reply-To: <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se> References: <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se>, , Message-ID: An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Sep 22 23:10:45 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 22 Sep 2017 22:10:45 +0000 Subject: [gpfsug-discuss] Strange timestamp behaviour on NFS via CES In-Reply-To: References: , <10541a8ed07149ecafdbe9ac03b807b8@maxiv.lu.se>, , Message-ID: An HTML attachment was scrubbed... URL: From bipcuds at gmail.com Sun Sep 24 19:04:59 2017 From: bipcuds at gmail.com (Keith Ball) Date: Sun, 24 Sep 2017 14:04:59 -0400 Subject: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? Message-ID: Hello All, In a recent Spectrum Scale performance study, we used zimon/mmperfmon to gather metrics. During a period of 2 months, we ended up losing data twice from the zimon database; once after the virtual disk serving both the OS files and zimon collector and DB storage was resized, and a second time after an unknown event (the loss was discovered when plotting in Grafana only went back to a certain data and time; likewise, mmperfmon query output only went back to the same time). Details: - Spectrum Scale 4.2.1.1 (on NSD servers); 4.2.1.2 on the zimon collector node and other clients - Data retention in the "raw" stratum was set to 2 months; the "domains" settings were as follows (note that we did not hit the ceiling of 60GB (1GB/file * 60 files): domains = { # this is the raw domain aggregation = 0 # aggregation factor for the raw domain is always 0. ram = "12g" # amount of RAM to be used duration = "2m" # amount of time that data with the highest precision is kept. filesize = "1g" # maximum file size files = 60 # number of files. }, { # this is the first aggregation domain that aggregates to 10 seconds aggregation = 10 ram = "800m" # amount of RAM to be used duration = "6m" # keep aggregates for 1 week. filesize = "1g" # maximum file size files = 10 # number of files. }, { # this is the second aggregation domain that aggregates to 30*10 seconds == 5 minutes aggregation = 30 ram = "800m" # amount of RAM to be used duration = "1y" # keep averages for 2 months. filesize = "1g" # maximum file size files = 5 # number of files. }, { # this is the third aggregation domain that aggregates to 24*30*10 seconds == 2 hours aggregation = 24 ram = "800m" # amount of RAM to be used duration = "2y" # filesize = "1g" # maximum file size files = 5 # number of files. } Questions: 1.) Has anyone had similar issues with losing data from zimon? 2.) Are there known circumstances where data could be lost, e.g. 
changing the aggregation domain definitions, or even simply restarting the zimon collector? 3.) Does anyone have any "best practices" for backing up the zimon database? We were taking weekly "snapshots" by shutting down the collector, and making a tarball copy of the /opt/ibm/zimon directory (but the database corruption/data loss still crept through for various reasons). In terms of debugging, we do not have Scale or zimon logs going back to the suspected dates of data loss; we do have a gpfs.snap from about a month after the last data loss - would it have any useful clues? Opening a PMR could be tricky, as it was the customer who has the support entitlement, and the environment (specifically the old cluster definitino and the zimon collector VM) was torn down. Many Thanks, Keith -- Keith D. Ball, PhD RedLine Performance Solutions, LLC web: http://www.redlineperf.com/ email: kball at redlineperf.com cell: 540-557-7851 <%28540%29%20557-7851> -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkr at lbl.gov Sun Sep 24 20:29:10 2017 From: kkr at lbl.gov (Kristy Kallback-Rose) Date: Sun, 24 Sep 2017 12:29:10 -0700 Subject: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? In-Reply-To: References: Message-ID: Hi Keith, We have barely begun with Zimon and have not (knock, knock) run up against any loss or corruption issues with Zimon. However, getting data out of Zimon for various reasons is something I have been thinking about. I'm interested partly because of the granularity that is lost over time like with any round robin style data collection scheme. So I guess one question is whether you have considered pulling the data out to another database, looked at the SS GUI which uses a postgres db (iirc, about to take off on a flight and can't check), or looked at the Grafana bridge which would get data into OpenTsdb format, again iirc. Anyway, just some things for consideration and a request to share back whatever you find out if it's off list. Thanks, getting stink eye to go to airplane mode. More later. Cheers Kristy On Sep 24, 2017 11:05 AM, "Keith Ball" wrote: Hello All, In a recent Spectrum Scale performance study, we used zimon/mmperfmon to gather metrics. During a period of 2 months, we ended up losing data twice from the zimon database; once after the virtual disk serving both the OS files and zimon collector and DB storage was resized, and a second time after an unknown event (the loss was discovered when plotting in Grafana only went back to a certain data and time; likewise, mmperfmon query output only went back to the same time). Details: - Spectrum Scale 4.2.1.1 (on NSD servers); 4.2.1.2 on the zimon collector node and other clients - Data retention in the "raw" stratum was set to 2 months; the "domains" settings were as follows (note that we did not hit the ceiling of 60GB (1GB/file * 60 files): domains = { # this is the raw domain aggregation = 0 # aggregation factor for the raw domain is always 0. ram = "12g" # amount of RAM to be used duration = "2m" # amount of time that data with the highest precision is kept. filesize = "1g" # maximum file size files = 60 # number of files. }, { # this is the first aggregation domain that aggregates to 10 seconds aggregation = 10 ram = "800m" # amount of RAM to be used duration = "6m" # keep aggregates for 1 week. filesize = "1g" # maximum file size files = 10 # number of files. 
}, { # this is the second aggregation domain that aggregates to 30*10 seconds == 5 minutes aggregation = 30 ram = "800m" # amount of RAM to be used duration = "1y" # keep averages for 2 months. filesize = "1g" # maximum file size files = 5 # number of files. }, { # this is the third aggregation domain that aggregates to 24*30*10 seconds == 2 hours aggregation = 24 ram = "800m" # amount of RAM to be used duration = "2y" # filesize = "1g" # maximum file size files = 5 # number of files. } Questions: 1.) Has anyone had similar issues with losing data from zimon? 2.) Are there known circumstances where data could be lost, e.g. changing the aggregation domain definitions, or even simply restarting the zimon collector? 3.) Does anyone have any "best practices" for backing up the zimon database? We were taking weekly "snapshots" by shutting down the collector, and making a tarball copy of the /opt/ibm/zimon directory (but the database corruption/data loss still crept through for various reasons). In terms of debugging, we do not have Scale or zimon logs going back to the suspected dates of data loss; we do have a gpfs.snap from about a month after the last data loss - would it have any useful clues? Opening a PMR could be tricky, as it was the customer who has the support entitlement, and the environment (specifically the old cluster definitino and the zimon collector VM) was torn down. Many Thanks, Keith -- Keith D. Ball, PhD RedLine Performance Solutions, LLC web: http://www.redlineperf.com/ email: kball at redlineperf.com cell: 540-557-7851 <%28540%29%20557-7851> _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkomandu at in.ibm.com Mon Sep 25 06:26:15 2017 From: rkomandu at in.ibm.com (Ravi K Komanduri) Date: Mon, 25 Sep 2017 10:56:15 +0530 Subject: [gpfsug-discuss] export nfs share on gpfs with no authentication In-Reply-To: References: Message-ID: Jonathon, This requires SMB service when you are at 422 PTF2. As Mike pointed out if you upgrade to the 4.2.3-3/4 build you will no longer hit that issue With Regards, Ravi K Komanduri Email:rkomandu at in.ibm.com From: "Michael L Taylor" To: gpfsug-discuss at spectrumscale.org Date: 09/21/2017 08:03 PM Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Jonathon, We were able to run this scenario successfully in our lab at the latest released 4.2.3.4. # /usr/lpp/mmfs/bin/mmdiag --version === mmdiag: version === Current GPFS build: "4.2.3.4 ". # /usr/lpp/mmfs/bin/mmces service list -a Enabled services: NFS node1.test.ibm.com: NFS is running # /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined File authentication configuration completed successfully. 
# rpm -qa | grep gpfs gpfs.ext-4.2.3-4.x86_64 gpfs.docs-4.2.3-4.noarch gpfs.gskit-8.0.50-75.x86_64 gpfs.gpl-4.2.3-4.noarch gpfs.msg.en_US-4.2.3-4.noarch nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 gpfs.base-4.2.3-4.x86_64 # rpm -qa | grep nfs-gan nfs-ganesha-utils-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-2.3.2-0.ibm47.el7.x86_64 nfs-ganesha-gpfs-2.3.2-0.ibm47.el7.x86_64 From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 09/20/2017 12:07 PM Subject: gpfsug-discuss Digest, Vol 68, Issue 42 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=BpVUgvFT2Qwgw0hveEgQaHFwn2mjeQjeBrkXHX_aC0A&m=2oGcWc1xx6zOclryoU2BdJykABuIR118zXTmSAA8msU&s=7q0JMYVHMSGlUAYquNMlrDRF6BDj6-76Oc4VbXrvlHE&e= or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: export nfs share on gpfs with no authentication (Jonathon A Anderson) ---------------------------------------------------------------------- Message: 1 Date: Wed, 20 Sep 2017 18:55:04 +0000 From: Jonathon A Anderson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Message-ID: Content-Type: text/plain; charset="us-ascii" I shouldn't need SMB for authentication if I'm only using userdefined authentication, though. ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Sobey, Richard A Sent: Wednesday, September 20, 2017 2:23:37 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication This sounded familiar to a problem I had to do with SMB and NFS. I've looked, and it's a different problem, but at the time I had this response. "That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running." I suspect the last paragraph is relevant in your case. HTH Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathon A Anderson Sent: 20 September 2017 06:13 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] export nfs share on gpfs with no authentication Returning to this thread cause I'm having the same issue as Ilan, above. I'm working on setting up CES in our environment after finally getting a blocking bugfix applied. I'm making it further now, but I'm getting an error when I try to create my export: --- [root at sgate2 ~]# mmnfs export add /gpfs/summit/scratch --client 'login*.rc.int.colorado.edu(rw,root_squash);dtn*.rc.int.colorado.edu(rw,root_squash)' mmcesfuncs.sh: Current authentication: none is invalid. This operation can not be completed without correct Authentication configuration. 
Configure authentication using: mmuserauth mmnfs export add: Command failed. Examine previous error messages to determine cause. --- When I try to configure mmuserauth, I get an error about not having SMB active; but I don't want to configure SMB, only NFS. --- [root at sgate2 ~]# /usr/lpp/mmfs/bin/mmuserauth service create --data-access-method file --type userdefined : SMB service not enabled. Enable SMB service first. mmcesuserauthcrservice: Command failed. Examine previous error messages to determine cause. --- How can I configure NFS exports with mmnfs without having to enable SMB? ~jonathon _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=ilYETqcaNr1y1ulWWDPjVg_X9pt35O1eYBTyFwJP56Y&m=VW8gJLSqT4rru6lFZXxCFp-Y3ngi6IUydv5czoG8kTE&s=deIQZQr-qfqLqW377yNysTJI8y7QJOdbokVjlnDr2d8&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Mon Sep 25 08:40:34 2017 From: john.hearns at asml.com (John Hearns) Date: Mon, 25 Sep 2017 07:40:34 +0000 Subject: [gpfsug-discuss] SPectrum Scale on AWS Message-ID: I guess this is not news on this list, however I did see a reference to SpectrumScale on The Register this morning, which linked to this paper: https://s3.amazonaws.com/quickstart-reference/ibm/spectrum/scale/latest/doc/ibm-spectrum-scale-on-the-aws-cloud.pdf The article is here https://www.theregister.co.uk/2017/09/25/storage_super_club_sandwich/ 12 Terabyte Helium drives now available. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikeowen at thinkboxsoftware.com Mon Sep 25 10:26:21 2017 From: mikeowen at thinkboxsoftware.com (Mike Owen) Date: Mon, 25 Sep 2017 10:26:21 +0100 Subject: [gpfsug-discuss] SPectrum Scale on AWS In-Reply-To: References: Message-ID: Full PR release below: https://aws.amazon.com/about-aws/whats-new/2017/09/deploy-ibm-spectrum-scale-on-the-aws-cloud-with-new-quick-start/ Posted On: Sep 13, 2017 This new Quick Start automatically deploys a highly available IBM Spectrum Scale cluster with replication on the Amazon Web Services (AWS) Cloud, into a configuration of your choice. (A small cluster can be deployed in about 25 minutes.) IBM Spectrum Scale is a flexible, software-defined storage solution that can be deployed as highly available, high-performance file storage. 
It can scale in several dimensions, including performance (bandwidth and IOPS), capacity, and number of nodes that can mount the file system. The product?s high performance and scalability helps address the needs of applications whose performance (or performance-to-capacity ratio) demands cannot be met by traditional scale-up storage systems. The IBM Spectrum Scale software is being made available through a 90-day trial license evaluation program. This Quick Start automates the deployment of IBM Spectrum Scale on AWS for users who require highly available access to a shared name space across multiple instances with good performance, without requiring an in-depth knowledge of IBM Spectrum Scale. The Quick Start deploys IBM Network Shared Disk (NSD) storage server instances and IBM Spectrum Scale compute instances into a virtual private cloud (VPC) in your AWS account. Data and metadata elements are replicated across two Availability Zones for optimal data protection. You can build a new VPC for IBM Spectrum Scale, or deploy the software into your existing VPC. The automated deployment provisions the IBM Spectrum Scale instances in Auto Scaling groups for instance scaling and management. The deployment and configuration tasks are automated by AWS CloudFormation templates that you can customize during launch. You can also use the templates as a starting point for your own implementation, by downloading them from the GitHub repository . The Quick Start includes a guide with step-by-step deployment and configuration instructions. To get started with IBM Spectrum Scale on AWS, use the following resources: - View the architecture and details - View the deployment guide - Browse and launch other AWS Quick Start reference deployments On 25 September 2017 at 08:40, John Hearns wrote: > I guess this is not news on this list, however I did see a reference to > SpectrumScale on The Register this morning, > > which linked to this paper: > > https://s3.amazonaws.com/quickstart-reference/ibm/ > spectrum/scale/latest/doc/ibm-spectrum-scale-on-the-aws-cloud.pdf > > > > The article is here https://www.theregister.co.uk/ > 2017/09/25/storage_super_club_sandwich/ > > 12 Terabyte Helium drives now available. > > > > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. Neither the sender nor the > company/group of companies he or she represents shall be liable for the > proper and complete transmission of the information contained in this > communication, or for any delay in its receipt. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Robert.Oesterlin at nuance.com Mon Sep 25 12:42:15 2017 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 25 Sep 2017 11:42:15 +0000 Subject: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? Message-ID: <018DE6B7-ADE3-4A01-B23C-9DB668FD95DB@nuance.com> Another data point for Keith/Kristy, I?ve been using Zimon for about 18 months now, and I?ll have to admit it?s been less than robust for long-term data. The biggest issue I?ve run into is the stability of the collector process. I have it crash on a fairly regular basis, most due to memory usage. This results in data loss You can configure it in a highly-available mode that should mitigate this to some degree. However, I don?t think IBM has published any details on how reliable the data collection process is. Bob Oesterlin Sr Principal Storage Engineer, Nuance From: on behalf of Kristy Kallback-Rose Reply-To: gpfsug main discussion list Date: Sunday, September 24, 2017 at 2:29 PM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Experience with zimon database stability, and best practices for backup? Hi Keith, We have barely begun with Zimon and have not (knock, knock) run up against any loss or corruption issues with Zimon. However, getting data out of Zimon for various reasons is something I have been thinking about. I'm interested partly because of the granularity that is lost over time like with any round robin style data collection scheme. So I guess one question is whether you have considered pulling the data out to another database, looked at the SS GUI which uses a postgres db (iirc, about to take off on a flight and can't check), or looked at the Grafana bridge which would get data into OpenTsdb format, again iirc. Anyway, just some things for consideration and a request to share back whatever you find out if it's off list. Thanks, getting stink eye to go to airplane mode. More later. Cheers Kristy On Sep 24, 2017 11:05 AM, "Keith Ball" > wrote: Hello All, In a recent Spectrum Scale performance study, we used zimon/mmperfmon to gather metrics. During a period of 2 months, we ended up losing data twice from the zimon database; once after the virtual disk serving both the OS files and zimon collector and DB storage was resized, and a second time after an unknown event (the loss was discovered when plotting in Grafana only went back to a certain data and time; likewise, mmperfmon query output only went back to the same time). Details: - Spectrum Scale 4.2.1.1 (on NSD servers); 4.2.1.2 on the zimon collector node and other clients - Data retention in the "raw" stratum was set to 2 months; the "domains" settings were as follows (note that we did not hit the ceiling of 60GB (1GB/file * 60 files): domains = { # this is the raw domain aggregation = 0 # aggregation factor for the raw domain is always 0. ram = "12g" # amount of RAM to be used duration = "2m" # amount of time that data with the highest precision is kept. filesize = "1g" # maximum file size files = 60 # number of files. }, { # this is the first aggregation domain that aggregates to 10 seconds aggregation = 10 ram = "800m" # amount of RAM to be used duration = "6m" # keep aggregates for 1 week. filesize = "1g" # maximum file size files = 10 # number of files. }, { # this is the second aggregation domain that aggregates to 30*10 seconds == 5 minutes aggregation = 30 ram = "800m" # amount of RAM to be used duration = "1y" # keep averages for 2 months. 
filesize = "1g" # maximum file size files = 5 # number of files. }, { # this is the third aggregation domain that aggregates to 24*30*10 seconds == 2 hours aggregation = 24 ram = "800m" # amount of RAM to be used duration = "2y" # filesize = "1g" # maximum file size files = 5 # number of files. } Questions: 1.) Has anyone had similar issues with losing data from zimon? 2.) Are there known circumstances where data could be lost, e.g. changing the aggregation domain definitions, or even simply restarting the zimon collector? 3.) Does anyone have any "best practices" for backing up the zimon database? We were taking weekly "snapshots" by shutting down the collector, and making a tarball copy of the /opt/ibm/zimon directory (but the database corruption/data loss still crept through for various reasons). In terms of debugging, we do not have Scale or zimon logs going back to the suspected dates of data loss; we do have a gpfs.snap from about a month after the last data loss - would it have any useful clues? Opening a PMR could be tricky, as it was the customer who has the support entitlement, and the environment (specifically the old cluster definitino and the zimon collector VM) was torn down. Many Thanks, Keith -- Keith D. Ball, PhD RedLine Performance Solutions, LLC web: http://www.redlineperf.com/ email: kball at redlineperf.com cell: 540-557-7851 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Sep 25 15:35:33 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 25 Sep 2017 14:35:33 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Message-ID: <1506350132.352.17.camel@imperial.ac.uk> Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Mon Sep 25 22:41:11 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Mon, 25 Sep 2017 21:41:11 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: <1506350132.352.17.camel@imperial.ac.uk> References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: An HTML attachment was scrubbed... 
URL: From christof.schmitt at us.ibm.com Mon Sep 25 22:41:11 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Mon, 25 Sep 2017 21:41:11 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: <1506350132.352.17.camel@imperial.ac.uk> References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 09:22:05 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 08:22:05 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 09:22:05 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 08:22:05 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: Hi Christof, thanks I?ll try it on a test cluster. 
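Before retesting, it is worth confirming which PTF is actually active on the CES nodes, since the fix Christof points to only applies once 4.2.3 PTF4 is installed. A quick check along these lines (package names can vary slightly by platform) may help:

  # Daemon build level on the node
  mmdiag --version
  # Installed Scale packages, including the separately shipped SMB stack
  rpm -qa | grep -i gpfs
  # Re-list the export ACL to see how the entry is reported at this level
  mmsmb exportacl list neuroscience2 --viewsids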
Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 10:59:13 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 09:59:13 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: There isn?t a default ACL being applied to the export at all now, which is fine, but it differs from the behaviour in 4.2.3 PTF2. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 26 September 2017 09:22 To: gpfsug main discussion list Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? 
Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Sep 26 10:59:13 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 26 Sep 2017 09:59:13 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: <1506350132.352.17.camel@imperial.ac.uk> Message-ID: There isn?t a default ACL being applied to the export at all now, which is fine, but it differs from the behaviour in 4.2.3 PTF2. Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 26 September 2017 09:22 To: gpfsug main discussion list Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 
Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Tue Sep 26 21:49:09 2017 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Tue, 26 Sep 2017 20:49:09 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: , <1506350132.352.17.camel@imperial.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Wed Sep 27 09:02:51 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Wed, 27 Sep 2017 08:02:51 +0000 Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? In-Reply-To: References: , <1506350132.352.17.camel@imperial.ac.uk> Message-ID: I?m sorry, you?re right. I can only assume my brain was looking for an SID entry so when I saw Everyone:ALLOWED/FULL it didn?t process it at all. 4.2.3-4: [root at cesnode ~]# mmsmb exportacl list [testces] ACL:\Everyone:ALLOWED/FULL From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 26 September 2017 21:49 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? The default for the "export ACL" is always to allow access to "Everyone", so that the the "export ACL" does not limit access by default, but only the file system ACL. I do not have systems with these code levels at hand, could you show the difference you see between PTF2 and PTF4? Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list > Cc: "gpfsug-discuss at gpfsug.org" > Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Tue, Sep 26, 2017 2:59 AM There isn?t a default ACL being applied to the export at all now, which is fine, but it differs from the behaviour in 4.2.3 PTF2. 
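If it is specifically the --SID option that is broken at this level, the usage text quoted above also accepts a positional Name, so addressing the entry by name may work as an interim step. This is an untested sketch: the export name is the one from the example above, and the exact spelling of the ACE (for example Everyone, shown with a leading backslash in some listings) has to be taken from the local output.

  # See how the entry is named with and without SID translation
  mmsmb exportacl list neuroscience2
  mmsmb exportacl list neuroscience2 --viewsids
  # Try addressing the ACE by name instead of by --SID
  mmsmb exportacl remove neuroscience2 Everyone --access ALLOWED --permissions FULL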
Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 26 September 2017 09:22 To: gpfsug main discussion list > Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Hi Christof, thanks I?ll try it on a test cluster. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 25 September 2017 22:41 To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? 4.2.3 PTF4 seems to have a fix for this area. Can you try again with that PTF installed? Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) ----- Original message ----- From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at gpfsug.org" > Cc: Subject: [gpfsug-discuss] mmsmb exportacl remove - syntax changed? Date: Mon, Sep 25, 2017 7:35 AM Hi all, This used to work (removing a SID from an ACL), but doesn't any more. Looks like a bug unless I'm being stupid. [root at cesnode~]# mmsmb exportacl list neuroscience2 --viewsids [neuroscience2] REVISION:1 ACL:S-1-1-0:ALLOWED/FULL [root at cesnode ~] mmsmb exportacl remove neuroscience2 --SID S-1-1-0 mmsmb exportacl remove: Incorrect option: --sid Usage: mmsmb exportacl remove ExportName {Name | --user UserName | --group GroupName | --system SystemName | --SID SID} [--access Access] [--permissions Permissions] [--viewsddl] [--viewsids] [-h|--help] where: Access is one of ALLOWED, DENIED Permissions is one of FULL, CHANGE, READ or any combination of RWXDPO I've tried lower case SID i.e --sid, and specifying --access ALLOWED and --permissions FULL. Omitting the --SID argument entirely simply results in GPFS telling me I must specify a Name or an SID. [root at cesnode~]# mmsmb exportacl remove neuroscience2 --access ALLOWED --permissions FULL [E] The mmsmb exportacl remove command requires a Name or SID. Can anyone see my mistake? Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=c3uTUNFbPTWWcTNUNVVejlQ0xdnhBAfQdouTBlVgnjc&s=hsWeRhH-BhTEaAlrTPbJGlwCV-5Ui7t03Zcec9kywOA&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=5Nn7eUPeYe291x8f39jKybESLKv_W_XtkTkS8fTR-NI&m=B-AqKIRCmLBzoWAhGn7NY-ZASOX25NuP_c_ndE8gy4A&s=S06OD3mbRedYjfwETO8tUnlOjnWT7pOX8nsYX5ebIdA&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenneth.waegeman at ugent.be Wed Sep 27 09:16:49 2017 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Wed, 27 Sep 2017 10:16:49 +0200 Subject: [gpfsug-discuss] el7.4 compatibility Message-ID: Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! 
Kenneth From michael.holliday at crick.ac.uk Wed Sep 27 09:25:58 2017 From: michael.holliday at crick.ac.uk (Michael Holliday) Date: Wed, 27 Sep 2017 08:25:58 +0000 Subject: [gpfsug-discuss] File Quotas vs Inode Limits Message-ID: Hi All, I'm in process of setting up quota for our users. We currently have block quotas per file set, and an inode limit for each inode space. Our users have request more transparency relating to the inode limit as as it is they can't see any information. Are there any disadvantages to implementing file quotas, and increasing the inode limits so that they will not be reached? Michael Michael Holliday HPC Systems Engineer Tel: 0203 796 3167 The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 1 Midland Road London NW1 1AT -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Wed Sep 27 14:59:08 2017 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 27 Sep 2017 13:59:08 +0000 Subject: [gpfsug-discuss] File Quotas vs Inode Limits In-Reply-To: References: Message-ID: Actually you will get a benefit in that you can set up a callback so that users get alerted when they got over a soft quota. We also set up a fileset quota so that the callback will automatically notify users when they exceed their block and file quotas for their fileset as well. Hope that helps, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Michael Holliday Sent: Wednesday, September 27, 2017 4:26 AM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] File Quotas vs Inode Limits Note: External Email ________________________________ Hi All, I'm in process of setting up quota for our users. We currently have block quotas per file set, and an inode limit for each inode space. Our users have request more transparency relating to the inode limit as as it is they can't see any information. Are there any disadvantages to implementing file quotas, and increasing the inode limits so that they will not be reached? Michael Michael Holliday HPC Systems Engineer Tel: 0203 796 3167 The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 1 Midland Road London NW1 1AT ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... 
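To make the callback suggestion concrete, a per-fileset quota plus a soft-quota callback could be wired up roughly as below. This is a sketch rather than a tested recipe: the file system name (gpfs01), fileset name (projects), limits and notification script are placeholders, and the event name and parameter list should be checked against the mmaddcallback documentation for the installed release.

  # Per-fileset block and file quotas (soft:hard)
  mmsetquota gpfs01:projects --block 8T:10T --files 5M:6M
  # Check the fileset's current usage against those limits
  mmlsquota -j projects gpfs01

  # Callback that runs a notification script when a soft quota is exceeded
  mmaddcallback softQuotaAlert \
      --command /usr/local/sbin/quota_notify.sh \
      --event softQuotaExceeded \
      --parms "%eventName %fsName"

  # Fileset inode limits stay visible to admins via:
  mmlsfileset gpfs01 -L

With file quotas in place, users can see their own numbers through mmlsquota, which addresses the transparency point, while the inode-space limit can be left high enough that it is never the first thing they hit.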
URL: From Greg.Lehmann at csiro.au Thu Sep 28 00:44:53 2017 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 27 Sep 2017 23:44:53 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: Message-ID: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> I guess I may as well ask about SLES 12 SP3 as well! TIA. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, 27 September 2017 6:17 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] el7.4 compatibility Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From bbanister at jumptrading.com Thu Sep 28 14:21:34 2017 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 28 Sep 2017 13:21:34 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: Please review this site: https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html Hope that helps, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au Sent: Wednesday, September 27, 2017 6:45 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] el7.4 compatibility Note: External Email ------------------------------------------------- I guess I may as well ask about SLES 12 SP3 as well! TIA. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, 27 September 2017 6:17 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] el7.4 compatibility Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From JRLang at uwyo.edu Thu Sep 28 15:18:52 2017 From: JRLang at uwyo.edu (Jeffrey R. 
Lang) Date: Thu, 28 Sep 2017 14:18:52 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: I just tired to build the GPFS GPL module against the latest version of RHEL 7.4 kernel and the build fails. The link below show that it should work. cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread kdump-kern.o: In function `GetOffset': kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' kdump-kern.o: In function `KernInit': kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' collect2: error: ld returned 1 exit status make[1]: *** [modules] Error 1 make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' make: *** [Modules] Error 1 -------------------------------------------------------- mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. -------------------------------------------------------- mmbuildgpl: Command failed. Examine previous error messages to determine cause. [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# [root at bkupsvr3 ~]# uname -a Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux [root at bkupsvr3 ~]# mmdiag --version === mmdiag: version === Current GPFS build: "4.2.2.3 ". Built on Mar 16 2017 at 11:19:59 In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my case 514.26.2 If I'm missing something can some one point me in the right direction? -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan Banister Sent: Thursday, September 28, 2017 8:22 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Please review this site: https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html Hope that helps, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au Sent: Wednesday, September 27, 2017 6:45 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] el7.4 compatibility Note: External Email ------------------------------------------------- I guess I may as well ask about SLES 12 SP3 as well! TIA. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, 27 September 2017 6:17 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] el7.4 compatibility Hi, Is there already some information available of gpfs (and protocols) on el7.4 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. 
Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From xhejtman at ics.muni.cz Thu Sep 28 15:22:54 2017 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Thu, 28 Sep 2017 16:22:54 +0200 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: <20170928142254.xwjvp3qwnilazer7@ics.muni.cz> You need 4.2.3.4 GPFS version and it will work. On Thu, Sep 28, 2017 at 02:18:52PM +0000, Jeffrey R. Lang wrote: > I just tired to build the GPFS GPL module against the latest version of RHEL 7.4 kernel and the build fails. The link below show that it should work. > > cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread > kdump-kern.o: In function `GetOffset': > kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' > kdump-kern.o: In function `KernInit': > kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' > collect2: error: ld returned 1 exit status > make[1]: *** [modules] Error 1 > make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' > make: *** [Modules] Error 1 > -------------------------------------------------------- > mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. > -------------------------------------------------------- > mmbuildgpl: Command failed. Examine previous error messages to determine cause. > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# > [root at bkupsvr3 ~]# uname -a > Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux > [root at bkupsvr3 ~]# mmdiag --version > > === mmdiag: version === > Current GPFS build: "4.2.2.3 ". > Built on Mar 16 2017 at 11:19:59 > > In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my case 514.26.2 > > If I'm missing something can some one point me in the right direction? > > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan Banister > Sent: Thursday, September 28, 2017 8:22 AM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] el7.4 compatibility > > Please review this site: > > https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html > > Hope that helps, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au > Sent: Wednesday, September 27, 2017 6:45 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] el7.4 compatibility > > Note: External Email > ------------------------------------------------- > > I guess I may as well ask about SLES 12 SP3 as well! TIA. 
> > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman > Sent: Wednesday, 27 September 2017 6:17 PM > To: gpfsug-discuss at spectrumscale.org > Subject: [gpfsug-discuss] el7.4 compatibility > > Hi, > > Is there already some information available of gpfs (and protocols) on > el7.4 ? > > Thanks! > > Kenneth > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Luk?? Hejtm?nek From S.J.Thompson at bham.ac.uk Thu Sep 28 15:23:53 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 28 Sep 2017 14:23:53 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: The 7.4 kernels are listed as having been tested by IBM. Having said that, we have clients running 7.4 kernel and its OK, but we are 4.2.3.4efix2, so bump versions... Simon On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Jeffrey R. Lang" wrote: >I just tired to build the GPFS GPL module against the latest version of >RHEL 7.4 kernel and the build fails. The link below show that it should >work. > >cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >kdump-kern.o: In function `GetOffset': >kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >kdump-kern.o: In function `KernInit': >kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >collect2: error: ld returned 1 exit status >make[1]: *** [modules] Error 1 >make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >make: *** [Modules] Error 1 >-------------------------------------------------------- >mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >-------------------------------------------------------- >mmbuildgpl: Command failed. Examine previous error messages to determine >cause. 
>[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# >[root at bkupsvr3 ~]# uname -a >Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 >03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >[root at bkupsvr3 ~]# mmdiag --version > >=== mmdiag: version === >Current GPFS build: "4.2.2.3 ". >Built on Mar 16 2017 at 11:19:59 > >In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my >case 514.26.2 > >If I'm missing something can some one point me in the right direction? > > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >Banister >Sent: Thursday, September 28, 2017 8:22 AM >To: gpfsug main discussion list >Subject: Re: [gpfsug-discuss] el7.4 compatibility > >Please review this site: > >https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html > >Hope that helps, >-Bryan > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >Greg.Lehmann at csiro.au >Sent: Wednesday, September 27, 2017 6:45 PM >To: gpfsug-discuss at spectrumscale.org >Subject: Re: [gpfsug-discuss] el7.4 compatibility > >Note: External Email >------------------------------------------------- > >I guess I may as well ask about SLES 12 SP3 as well! TIA. > >-----Original Message----- >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth >Waegeman >Sent: Wednesday, 27 September 2017 6:17 PM >To: gpfsug-discuss at spectrumscale.org >Subject: [gpfsug-discuss] el7.4 compatibility > >Hi, > >Is there already some information available of gpfs (and protocols) on >el7.4 ? > >Thanks! > >Kenneth > >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > >________________________________ > >Note: This email is for the confidential use of the named addressee(s) >only and may contain proprietary, confidential or privileged information. >If you are not the intended recipient, you are hereby notified that any >review, dissemination or copying of this email is strictly prohibited, >and to please notify the sender immediately and destroy this email and >any attachments. Email transmission cannot be guaranteed to be secure or >error-free. The Company, therefore, does not make any guarantees as to >the completeness or accuracy of this email or any attachments. This email >is for informational purposes only and does not constitute a >recommendation, offer, request or solicitation of any kind to buy, sell, >subscribe, redeem or perform any type of transaction of a financial >product. 
>_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss From kenneth.waegeman at ugent.be Thu Sep 28 15:36:04 2017 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Thu, 28 Sep 2017 16:36:04 +0200 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> Message-ID: <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: > The 7.4 kernels are listed as having been tested by IBM. Hi, Were did you find this? > > Having said that, we have clients running 7.4 kernel and its OK, but we > are 4.2.3.4efix2, so bump versions... Do you have some information about the efix2? Is this for 7.4 ? And where should we find this :-) Thank you! Kenneth > > Simon > > On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on behalf > of Jeffrey R. Lang" JRLang at uwyo.edu> wrote: > >> I just tired to build the GPFS GPL module against the latest version of >> RHEL 7.4 kernel and the build fails. The link below show that it should >> work. >> >> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >> kdump-kern.o: In function `GetOffset': >> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >> kdump-kern.o: In function `KernInit': >> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >> collect2: error: ld returned 1 exit status >> make[1]: *** [modules] Error 1 >> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >> make: *** [Modules] Error 1 >> -------------------------------------------------------- >> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >> -------------------------------------------------------- >> mmbuildgpl: Command failed. Examine previous error messages to determine >> cause. >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# >> [root at bkupsvr3 ~]# uname -a >> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 >> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >> [root at bkupsvr3 ~]# mmdiag --version >> >> === mmdiag: version === >> Current GPFS build: "4.2.2.3 ". >> Built on Mar 16 2017 at 11:19:59 >> >> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my >> case 514.26.2 >> >> If I'm missing something can some one point me in the right direction? 
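On the build failure quoted above: the undefined page_offset_base reference generally means the GPL source shipped with that Scale level does not yet know the newer RHEL 7.4 kernel, which matches the advice in this thread to move to 4.2.3.4 or later and then rebuild the portability layer on each node. A rough sketch of the check-and-rebuild step, assuming the updated Scale packages are already installed:

  # Record what is running and what is installed
  uname -r
  mmdiag --version
  # Rebuild the portability layer against the running kernel
  /usr/lpp/mmfs/bin/mmbuildgpl
  # Until the newer Scale level is rolled out, one common stopgap is to hold
  # the node on a kernel level the FAQ lists as tested, for example by
  # excluding kernel updates in yum (exclude=kernel* in /etc/yum.conf),
  # site policy permitting.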
>> >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >> Banister >> Sent: Thursday, September 28, 2017 8:22 AM >> To: gpfsug main discussion list >> Subject: Re: [gpfsug-discuss] el7.4 compatibility >> >> Please review this site: >> >> https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html >> >> Hope that helps, >> -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >> Greg.Lehmann at csiro.au >> Sent: Wednesday, September 27, 2017 6:45 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] el7.4 compatibility >> >> Note: External Email >> ------------------------------------------------- >> >> I guess I may as well ask about SLES 12 SP3 as well! TIA. >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth >> Waegeman >> Sent: Wednesday, 27 September 2017 6:17 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: [gpfsug-discuss] el7.4 compatibility >> >> Hi, >> >> Is there already some information available of gpfs (and protocols) on >> el7.4 ? >> >> Thanks! >> >> Kenneth >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) >> only and may contain proprietary, confidential or privileged information. >> If you are not the intended recipient, you are hereby notified that any >> review, dissemination or copying of this email is strictly prohibited, >> and to please notify the sender immediately and destroy this email and >> any attachments. Email transmission cannot be guaranteed to be secure or >> error-free. The Company, therefore, does not make any guarantees as to >> the completeness or accuracy of this email or any attachments. This email >> is for informational purposes only and does not constitute a >> recommendation, offer, request or solicitation of any kind to buy, sell, >> subscribe, redeem or perform any type of transaction of a financial >> product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Thu Sep 28 15:45:25 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Thu, 28 Sep 2017 14:45:25 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Aren't listed as tested Sorry ... 
4.2.3.4 we have used with 7.4 as well, efix2 includes a fix for an AFM issue we have. Simon On 28/09/2017, 15:36, "kenneth.waegeman at ugent.be" wrote: > > >On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: >> The 7.4 kernels are listed as having been tested by IBM. >Hi, > >Were did you find this? >> >> Having said that, we have clients running 7.4 kernel and its OK, but we >> are 4.2.3.4efix2, so bump versions... >Do you have some information about the efix2? Is this for 7.4 ? And >where should we find this :-) > >Thank you! > >Kenneth > >> >> Simon >> >> On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on >>behalf >> of Jeffrey R. Lang" >of >> JRLang at uwyo.edu> wrote: >> >>> I just tired to build the GPFS GPL module against the latest version of >>> RHEL 7.4 kernel and the build fails. The link below show that it >>>should >>> work. >>> >>> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >>> kdump-kern.o: In function `GetOffset': >>> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >>> kdump-kern.o: In function `KernInit': >>> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >>> collect2: error: ld returned 1 exit status >>> make[1]: *** [modules] Error 1 >>> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >>> make: *** [Modules] Error 1 >>> -------------------------------------------------------- >>> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >>> -------------------------------------------------------- >>> mmbuildgpl: Command failed. Examine previous error messages to >>>determine >>> cause. >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# uname -a >>> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 >>> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >>> [root at bkupsvr3 ~]# mmdiag --version >>> >>> === mmdiag: version === >>> Current GPFS build: "4.2.2.3 ". >>> Built on Mar 16 2017 at 11:19:59 >>> >>> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In my >>> case 514.26.2 >>> >>> If I'm missing something can some one point me in the right direction? >>> >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >>> Banister >>> Sent: Thursday, September 28, 2017 8:22 AM >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Please review this site: >>> >>> >>>https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.ht >>>ml >>> >>> Hope that helps, >>> -Bryan >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Greg.Lehmann at csiro.au >>> Sent: Wednesday, September 27, 2017 6:45 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Note: External Email >>> ------------------------------------------------- >>> >>> I guess I may as well ask about SLES 12 SP3 as well! TIA. 
>>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth >>> Waegeman >>> Sent: Wednesday, 27 September 2017 6:17 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: [gpfsug-discuss] el7.4 compatibility >>> >>> Hi, >>> >>> Is there already some information available of gpfs (and protocols) on >>> el7.4 ? >>> >>> Thanks! >>> >>> Kenneth >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >>> ________________________________ >>> >>> Note: This email is for the confidential use of the named addressee(s) >>> only and may contain proprietary, confidential or privileged >>>information. >>> If you are not the intended recipient, you are hereby notified that any >>> review, dissemination or copying of this email is strictly prohibited, >>> and to please notify the sender immediately and destroy this email and >>> any attachments. Email transmission cannot be guaranteed to be secure >>>or >>> error-free. The Company, therefore, does not make any guarantees as to >>> the completeness or accuracy of this email or any attachments. This >>>email >>> is for informational purposes only and does not constitute a >>> recommendation, offer, request or solicitation of any kind to buy, >>>sell, >>> subscribe, redeem or perform any type of transaction of a financial >>> product. >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > From aaron.s.knister at nasa.gov Fri Sep 29 02:59:39 2017 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Fri, 29 Sep 2017 01:59:39 +0000 Subject: [gpfsug-discuss] Latest recommended 4.2 efix? Message-ID: Hi Everyone, What?s the latest recommended efix release for 4.2.3.4? I?m working on testing a 4.1 to 4.2 migration and was reminded today of some fun bugs in 4.2.3.4 for which I think there are efixes. Alternatively, any word on a 4.2.3.5 release date? -Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at asml.com Fri Sep 29 10:02:26 2017 From: john.hearns at asml.com (John Hearns) Date: Fri, 29 Sep 2017 09:02:26 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Simon, I would appreciate a heads up on that AFM issue. I upgraded to 4.2.3.4 this morning, to deal with an AFM issue, which is if a remote NFS mount goes down then an asynchronous operation such as a read can be stopped. I must admit to being not clued up on how the efixes are distributed. I downloaded the 4.2.3.4 installer for Linux yesterday. 
Should I be searching for additional fix packs on top of that (which I am in fact doing now). John H -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Thursday, September 28, 2017 4:45 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Aren't listed as tested Sorry ... 4.2.3.4 we have used with 7.4 as well, efix2 includes a fix for an AFM issue we have. Simon On 28/09/2017, 15:36, "kenneth.waegeman at ugent.be" wrote: > > >On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: >> The 7.4 kernels are listed as having been tested by IBM. >Hi, > >Were did you find this? >> >> Having said that, we have clients running 7.4 kernel and its OK, but >> we are 4.2.3.4efix2, so bump versions... >Do you have some information about the efix2? Is this for 7.4 ? And >where should we find this :-) > >Thank you! > >Kenneth > >> >> Simon >> >> On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on >>behalf of Jeffrey R. Lang" >on behalf of JRLang at uwyo.edu> wrote: >> >>> I just tired to build the GPFS GPL module against the latest version >>>of RHEL 7.4 kernel and the build fails. The link below show that it >>>should work. >>> >>> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >>> kdump-kern.o: In function `GetOffset': >>> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >>> kdump-kern.o: In function `KernInit': >>> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >>> collect2: error: ld returned 1 exit status >>> make[1]: *** [modules] Error 1 >>> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >>> make: *** [Modules] Error 1 >>> -------------------------------------------------------- >>> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >>> -------------------------------------------------------- >>> mmbuildgpl: Command failed. Examine previous error messages to >>>determine cause. >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# uname -a >>> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat >>>Sep 9 >>> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >>> [root at bkupsvr3 ~]# mmdiag --version >>> >>> === mmdiag: version === >>> Current GPFS build: "4.2.2.3 ". >>> Built on Mar 16 2017 at 11:19:59 >>> >>> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In >>> my case 514.26.2 >>> >>> If I'm missing something can some one point me in the right direction? 
>>> >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >>> Banister >>> Sent: Thursday, September 28, 2017 8:22 AM >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Please review this site: >>> >>> >>>https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fww >>>w.ibm.com%2Fsupport%2Fknowledgecenter%2Fen%2FSTXKQY%2Fgpfsclustersfaq >>>.ht&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d50 >>>67f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=nK6KEzCD62kU3njL >>>kIFKL69V3jyN836K5pHMX19tWk8%3D&reserved=0 >>>ml >>> >>> Hope that helps, >>> -Bryan >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Greg.Lehmann at csiro.au >>> Sent: Wednesday, September 27, 2017 6:45 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Note: External Email >>> ------------------------------------------------- >>> >>> I guess I may as well ask about SLES 12 SP3 as well! TIA. >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Kenneth Waegeman >>> Sent: Wednesday, 27 September 2017 6:17 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: [gpfsug-discuss] el7.4 compatibility >>> >>> Hi, >>> >>> Is there already some information available of gpfs (and protocols) >>> on >>> el7.4 ? >>> >>> Thanks! >>> >>> Kenneth >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 >>> >>> >>> ________________________________ >>> >>> Note: This email is for the confidential use of the named >>>addressee(s) only and may contain proprietary, confidential or >>>privileged information. >>> If you are not the intended recipient, you are hereby notified that >>>any review, dissemination or copying of this email is strictly >>>prohibited, and to please notify the sender immediately and destroy >>>this email and any attachments. Email transmission cannot be >>>guaranteed to be secure or error-free. The Company, therefore, does >>>not make any guarantees as to the completeness or accuracy of this >>>email or any attachments. This email is for informational purposes >>>only and does not constitute a recommendation, offer, request or >>>solicitation of any kind to buy, sell, subscribe, redeem or perform >>>any type of transaction of a financial product. 
>>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> >>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>pw%3D&reserved=0 _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> >>>https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >>>sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >>>rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >>>39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >>>pw%3D&reserved=0 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpf >> sug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hea >> rns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a >> 39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6 >> pw%3D&reserved=0 > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Ytqc6pw%3D&reserved=0 -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. Neither the sender nor the company/group of companies he or she represents shall be liable for the proper and complete transmission of the information contained in this communication, or for any delay in its receipt. From r.sobey at imperial.ac.uk Fri Sep 29 10:04:49 2017 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 29 Sep 2017 09:04:49 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Efixes (in my one time only limited experience!) come direct from IBM as a result of a PMR. Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of John Hearns Sent: 29 September 2017 10:02 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Simon, I would appreciate a heads up on that AFM issue. I upgraded to 4.2.3.4 this morning, to deal with an AFM issue, which is if a remote NFS mount goes down then an asynchronous operation such as a read can be stopped. 
I must admit to being not clued up on how the efixes are distributed. I downloaded the 4.2.3.4 installer for Linux yesterday. Should I be searching for additional fix packs on top of that (which I am in fact doing now). John H -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson (IT Research Support) Sent: Thursday, September 28, 2017 4:45 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] el7.4 compatibility Aren't listed as tested Sorry ... 4.2.3.4 we have used with 7.4 as well, efix2 includes a fix for an AFM issue we have. Simon On 28/09/2017, 15:36, "kenneth.waegeman at ugent.be" wrote: > > >On 28/09/17 16:23, Simon Thompson (IT Research Support) wrote: >> The 7.4 kernels are listed as having been tested by IBM. >Hi, > >Were did you find this? >> >> Having said that, we have clients running 7.4 kernel and its OK, but >> we are 4.2.3.4efix2, so bump versions... >Do you have some information about the efix2? Is this for 7.4 ? And >where should we find this :-) > >Thank you! > >Kenneth > >> >> Simon >> >> On 28/09/2017, 15:18, "gpfsug-discuss-bounces at spectrumscale.org on >>behalf of Jeffrey R. Lang" >on behalf of JRLang at uwyo.edu> wrote: >> >>> I just tired to build the GPFS GPL module against the latest version >>>of RHEL 7.4 kernel and the build fails. The link below show that it >>>should work. >>> >>> cc kdump.o kdump-kern.o kdump-kern-dwarfs.o -o kdump -lpthread >>> kdump-kern.o: In function `GetOffset': >>> kdump-kern.c:(.text+0x9): undefined reference to `page_offset_base' >>> kdump-kern.o: In function `KernInit': >>> kdump-kern.c:(.text+0x58): undefined reference to `page_offset_base' >>> collect2: error: ld returned 1 exit status >>> make[1]: *** [modules] Error 1 >>> make[1]: Leaving directory `/usr/lpp/mmfs/src/gpl-linux' >>> make: *** [Modules] Error 1 >>> -------------------------------------------------------- >>> mmbuildgpl: Building GPL module failed at Thu Sep 28 08:12:14 MDT 2017. >>> -------------------------------------------------------- >>> mmbuildgpl: Command failed. Examine previous error messages to >>>determine cause. >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# >>> [root at bkupsvr3 ~]# uname -a >>> Linux bkupsvr3.arcc.uwyo.edu 3.10.0-693.2.2.el7.x86_64 #1 SMP Sat >>>Sep 9 >>> 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux >>> [root at bkupsvr3 ~]# mmdiag --version >>> >>> === mmdiag: version === >>> Current GPFS build: "4.2.2.3 ". >>> Built on Mar 16 2017 at 11:19:59 >>> >>> In order to use GPFS with RHEL 7.4 I have to use a 7.3 kernel. In >>> my case 514.26.2 >>> >>> If I'm missing something can some one point me in the right direction? 
>>> >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan >>> Banister >>> Sent: Thursday, September 28, 2017 8:22 AM >>> To: gpfsug main discussion list >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Please review this site: >>> >>> >>>https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fww >>>w.ibm.com%2Fsupport%2Fknowledgecenter%2Fen%2FSTXKQY%2Fgpfsclustersfaq >>>.ht&data=01%7C01%7Cjohn.hearns%40asml.com%7C1c91f855bc124c31f81a08d50 >>>67f949a%7Caf73baa8f5944eb2a39d93e96cad61fc%7C1&sdata=nK6KEzCD62kU3njL >>>kIFKL69V3jyN836K5pHMX19tWk8%3D&reserved=0 >>>ml >>> >>> Hope that helps, >>> -Bryan >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Greg.Lehmann at csiro.au >>> Sent: Wednesday, September 27, 2017 6:45 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: Re: [gpfsug-discuss] el7.4 compatibility >>> >>> Note: External Email >>> ------------------------------------------------- >>> >>> I guess I may as well ask about SLES 12 SP3 as well! TIA. >>> >>> -----Original Message----- >>> From: gpfsug-discuss-bounces at spectrumscale.org >>> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of >>> Kenneth Waegeman >>> Sent: Wednesday, 27 September 2017 6:17 PM >>> To: gpfsug-discuss at spectrumscale.org >>> Subject: [gpfsug-discuss] el7.4 compatibility >>> >>> Hi, >>> >>> Is there already some information available of gpfs (and protocols) >>> on >>> el7.4 ? >>> >>> Thanks! >>> >>> Kenneth >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgp >>> fsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=01%7C01%7Cjohn.h >>> earns%40asml.com%7C1c91f855bc124c31f81a08d5067f949a%7Caf73baa8f5944e >>> b2a39d93e96cad61fc%7C1&sdata=NFMrLpW8bakClzmF4zCC%2BUb2oi04Qw3N6cc2Y >>> tqc6pw%3D&reserved=0 >>> >>> >>> ________________________________ >>> >>> Note: This email is for the confidential use of the named >>>addressee(s) only and may contain proprietary, confidential or >>>privileged information. >>> If you are not the intended recipient, you are hereby notified that >>>any review, dissemination or copying of this email is strictly >>>prohibited, and to please notify the sender immediately and destroy >>>this email and any attachments. Email transmission cannot be >>>guaranteed to be secure or error-free. The Company, therefore, does >>>not make any guarantees as to the completeness or accuracy of this >>>email or any attachments. This email is for informational purposes >>>only and does not constitute a recommendation, offer, request or >>>solicitation of any kind to buy, sell, subscribe, redeem or perform >>>any type of transaction of a financial product. 
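A rough sketch of the checks involved in the build failure above (the kernel versions are the ones mentioned in this thread; the grubby menu entry path is only an assumed example -- consult the Spectrum Scale FAQ for the kernel levels actually tested with your release):

    # Confirm the installed Scale build and the running kernel
    /usr/lpp/mmfs/bin/mmdiag --version   # reports "Current GPFS build", e.g. 4.2.2.3
    uname -r                             # e.g. 3.10.0-693.2.2.el7.x86_64 (a 7.4 kernel)

    # If that pairing is not listed as tested in the FAQ, boot a supported 7.3
    # kernel (assumed path below, per the 514.26.2 level cited in the thread)
    # and rebuild the portability layer
    grubby --set-default=/boot/vmlinuz-3.10.0-514.26.2.el7.x86_64
    reboot
    /usr/lpp/mmfs/bin/mmbuildgpl         # should then compile and link cleanly
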
From S.J.Thompson at bham.ac.uk Fri Sep 29 10:39:43 2017 From: S.J.Thompson at bham.ac.uk (Simon Thompson (IT Research Support)) Date: Fri, 29 Sep 2017 09:39:43 +0000 Subject: [gpfsug-discuss] el7.4 compatibility In-Reply-To: References: <0285ed0191fa43ac9b1f1b3e36a1f015@exch1-cdc.nexus.csiro.au> <087bdf18-5763-0154-8515-7b9d04e5b302@ugent.be> Message-ID: Correct, they come from IBM support. The AFM issue we have (and is fixed in the efix) is if you have client code running on the AFM cache that uses truncate. The AFM write coalescing processing does something funny with it, so the file isn't truncated and then the data you write afterwards isn't copied back to home. We found this with ABAQUS code running on our HPC nodes onto the AFM cache, i.e. 
at home, the final packed output file from ABAQUS is corrupt as it's the "untruncated and then filled" version of the file (so just a big blob of empty data). I would guess that anything using truncate would see the same issue. 4.2.3.x: APAR IV99796. See IBM Flash Alert at: http://www-01.ibm.com/support/docview.wss?uid=ssg1S1010629&myns=s033&mynp=OCSTXKQY&mynp=OCSWJ00&mync=E&cm_sp=s033-_-OCSTXKQY-OCSWJ00-_-E It's remedied in efix2; of course, remember that an efix has not gone through a full testing validation cycle (otherwise it would be a PTF), but we have not seen any issues in our environments running 4.2.3.4efix2. Simon 
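A rough sketch of how the truncate behaviour could be checked on a test fileset (all paths, the device name "fsname", and the fileset name are placeholders; mmafmctl flushPending is used here only to push queued cache operations to home before comparing copies):

    # on the AFM cache fileset: write, truncate, then write again
    cd /gpfs/cache/fileset1                      # placeholder cache path
    dd if=/dev/zero of=trunc_test bs=1M count=8  # create an 8 MiB file
    truncate -s 0 trunc_test                     # shrink it to zero
    echo "data written after truncate" >> trunc_test

    # flush queued operations to home, then compare cache and home copies
    mmafmctl fsname flushPending -j fileset1     # placeholder device and fileset names
    md5sum /gpfs/cache/fileset1/trunc_test
    md5sum /nfs/home-export/fileset1/trunc_test  # on affected 4.2.3.x levels the home
                                                 # copy can remain the old, unshrunk file
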
From scale at us.ibm.com Fri Sep 29 13:26:51 2017 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Fri, 29 Sep 2017 07:26:51 -0500 Subject: [gpfsug-discuss] Latest recommended 4.2 efix? In-Reply-To: References: Message-ID: There isn't a "recommended" efix as such. Generally, fixes go into the next ptf so that they go through a test cycle. 
If a customer hits a serious issue that cannot wait for the next ptf, they can request an efix be built, but since efixes do not get the same level of rigorous testing as a ptf, they are not generally recommended unless you report an issue and service determines you need it. To address your other questions: We are currently up to efix3 on 4.2.3.4. We don't announce PTF dates, because they depend upon the testing; however, you can see that we generally release a PTF roughly every 6 weeks, and I believe ptf4 was out on 8/24. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" To: "discussion, gpfsug main" Date: 09/28/2017 08:59 PM Subject: [gpfsug-discuss] Latest recommended 4.2 efix? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Everyone, What's the latest recommended efix release for 4.2.3.4? I'm working on testing a 4.1 to 4.2 migration and was reminded today of some fun bugs in 4.2.3.4 for which I think there are efixes. Alternatively, any word on a 4.2.3.5 release date? -Aaron _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=IVcYH9EDg-UaA4Jt2GbsxN5XN1XbvejXTX0gAzNxtpM&s=9SmogyyA6QNSWxlZrpE-vBbslts0UexwJwPzp78LgKs&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From sandeep.patil at in.ibm.com Sat Sep 30 05:02:22 2017 From: sandeep.patil at in.ibm.com (Sandeep Ramesh) Date: Sat, 30 Sep 2017 09:32:22 +0530 Subject: [gpfsug-discuss] Spectrum Scale Enablement Material - 1H 2017 Message-ID: Hi Folks, I was asked by Doris Conti to send the below to our Spectrum Scale User group. Below is a consolidated link that lists all the enablement on Spectrum Scale/ESS that was done in 1H 2017, which includes blogs and videos from development and offering management. https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/White%20Papers%20%26%20Media Do note, Spectrum Scale developers keep blogging on the below site, which is worth bookmarking: https://developer.ibm.com/storage/blog/ (as recent as 4 new blogs in Sept) Thanks Sandeep Linkedin: https://www.linkedin.com/in/sandeeprpatil Spectrum Scale Dev. -------------- next part -------------- An HTML attachment was scrubbed... URL: