[gpfsug-discuss] gpfsug-discuss Digest, Vol 48, Issue 2
Danny Alexander Calderon Rodriguez
dacalder at co.ibm.com
Mon Jan 4 16:50:45 GMT 2016
Hi Danny,
Can you be a bit more specific about which resources get exhausted? Are you
talking about operating system resources or Spectrum Scale resources
(file cache or pagepool)?
When you migrate the files (I assume with the policy engine), did you specify
which nodes do the migration (-N hostnames), or did you just run mmapplypolicy
without any options?
Can you post either your entire mmlsconfig, or at least the output of:
for i in maxFilesToCache pagepool maxStatCache nsdMinWorkerThreads nsdMaxWorkerThreads worker1Threads; do mmlsconfig $i ; done
mmlsfs, mmlsnsd and mmlscluster output might be useful too.
Hi Sven
Sure.
The resources that get exhausted are CPU and RAM. I can see that while the
system is doing the pool migration, the SMB service is down (or rather so slow
that it is effectively down).
When I migrated the files I ran tests with one node, with two nodes (the NSD
nodes), and with all nodes (NSD and protocol nodes).
Here is the output:
maxFilesToCache 1000000
pagepool 20G
maxStatCache 1000
nsdMinWorkerThreads 8
nsdMinWorkerThreads 1 [cesNodes]
nsdMaxWorkerThreads 512
nsdMaxWorkerThreads 2 [cesNodes]
worker1Threads 48
worker1Threads 800 [cesNodes]
mmlsfs (there are two file systems, one for the CES shared root and another
for data):
File system attributes for /dev/datafs:
=======================================
flag                     value                     description
------------------------ ------------------------- -----------------------------------
 -f                      8192                      Minimum fragment size in bytes
 -i                      4096                      Inode size in bytes
 -I                      16384                     Indirect block size in bytes
 -m                      1                         Default number of metadata replicas
 -M                      2                         Maximum number of metadata replicas
 -r                      1                         Default number of data replicas
 -R                      2                         Maximum number of data replicas
 -j                      cluster                   Block allocation type
 -D                      nfs4                      File locking semantics in effect
 -k                      nfs4                      ACL semantics in effect
 -n                      32                        Estimated number of nodes that will mount file system
 -B                      262144                    Block size
 -Q                      none                      Quotas accounting enabled
                         none                      Quotas enforced
                         none                      Default quotas enabled
 --perfileset-quota      No                        Per-fileset quota enforcement
 --filesetdf             No                        Fileset df enabled?
 -V                      15.01 (4.2.0.0)           File system version
 --create-time           Wed Dec 23 09:31:07 2015  File system creation time
 -z                      No                        Is DMAPI enabled?
 -L                      4194304                   Logfile size
 -E                      Yes                       Exact mtime mount option
 -S                      No                        Suppress atime mount option
 -K                      whenpossible              Strict replica allocation option
 --fastea                Yes                       Fast external attributes enabled?
 --encryption            No                        Encryption enabled?
 --inode-limit           55325440                  Maximum number of inodes in all inode spaces
 --log-replicas          0                         Number of log replicas
 --is4KAligned           Yes                       is4KAligned?
 --rapid-repair          Yes                       rapidRepair enabled?
 --write-cache-threshold 0                         HAWC Threshold (max 65536)
 -P                      system;T12TB;T26TB        Disk storage pools in file system
 -d                      nsd2;nsd3;nsd4;nsd5;nsd6;nsd7;nsd8;nsd9;nsd16;nsd17;nsd18;nsd19;nsd20;nsd15;nsd21;nsd10;nsd11;nsd12;nsd13;nsd14  Disks in file system
 -A                      yes                       Automatic mount option
 -o                      none                      Additional mount options
 -T                      /datafs                   Default mount point
 --mount-priority        0                         Mount priority
File system attributes for /dev/sharerfs:
=========================================
flag                     value                     description
------------------------ ------------------------- -----------------------------------
 -f                      8192                      Minimum fragment size in bytes
 -i                      4096                      Inode size in bytes
 -I                      16384                     Indirect block size in bytes
 -m                      1                         Default number of metadata replicas
 -M                      2                         Maximum number of metadata replicas
 -r                      1                         Default number of data replicas
 -R                      2                         Maximum number of data replicas
 -j                      scatter                   Block allocation type
 -D                      nfs4                      File locking semantics in effect
 -k                      nfs4                      ACL semantics in effect
 -n                      100                       Estimated number of nodes that will mount file system
 -B                      262144                    Block size
 -Q                      none                      Quotas accounting enabled
                         none                      Quotas enforced
                         none                      Default quotas enabled
 --perfileset-quota      No                        Per-fileset quota enforcement
 --filesetdf             No                        Fileset df enabled?
 -V                      15.01 (4.2.0.0)           File system version
 --create-time           Tue Dec 22 17:19:33 2015  File system creation time
 -z                      No                        Is DMAPI enabled?
 -L                      4194304                   Logfile size
 -E                      Yes                       Exact mtime mount option
 -S                      No                        Suppress atime mount option
 -K                      whenpossible              Strict replica allocation option
 --fastea                Yes                       Fast external attributes enabled?
 --encryption            No                        Encryption enabled?
 --inode-limit           102656                    Maximum number of inodes
 --log-replicas          0                         Number of log replicas
 --is4KAligned           Yes                       is4KAligned?
 --rapid-repair          Yes                       rapidRepair enabled?
 --write-cache-threshold 0                         HAWC Threshold (max 65536)
 -P                      system                    Disk storage pools in file system
 -d                      nsd1                      Disks in file system
 -A                      yes                       Automatic mount option
 -o                      none                      Additional mount options
 -T                      /sharedr                  Default mount point
 --mount-priority        0                         Mount priority
mmlsnsd
File system Disk name NSD servers
---------------------------------------------------------------------------
datafs nsd2 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd3 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd4 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd5 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd6 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd7 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd8 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd9 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd15 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd16 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd17 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd18 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd19 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd20 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd21 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd10 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd11 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd12 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd13 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd14 NSDSERV01_Daemon,NSDSERV02_Daemon
sharerfs nsd1 NSDSERV01_Daemon,NSDSERV02_Daemon
mmlscluster
GPFS cluster information
========================
GPFS cluster name: spectrum_syc.localdomain
GPFS cluster id: 2719632319013564592
GPFS UID domain: spectrum_syc.localdomain
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
Repository type: CCR
 Node  Daemon node name    IP address     Admin node name     Designation
-----------------------------------------------------------------------
   1   NSDSERV01_Daemon    172.19.20.61   NSDSERV01_Daemon    quorum-manager-perfmon
   2   NSDSERV02_Daemon    172.19.20.62   NSDSERV02_Daemon    quorum-manager-perfmon
   3   PROTSERV01_Daemon   172.19.20.63   PROTSERV01_Daemon   quorum-manager-perfmon
   4   PROTSERV02_Daemon   172.19.20.64   PROTSERV02_Daemon   manager-perfmon
And the mmdf output:
disk                disk size  failure holds    holds              free GB             free GB
name                    in GB    group metadata data        in full blocks        in fragments
--------------- ------------- -------- -------- ----- -------------------- -------------------
Disks in storage pool: system (Maximum disk size allowed is 4.1 TB)
nsd2                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd3                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd4                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd5                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd6                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd7                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd8                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd9                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
                -------------                         -------------------- -------------------
(pool total)             3200                                 3120 ( 97%)             1 ( 0%)
Disks in storage pool: T12TB (Maximum disk size allowed is 4.1 TB)
nsd14                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
nsd13                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
nsd12                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
nsd11                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
nsd10                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
nsd15                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
                -------------                         -------------------- -------------------
(pool total)             3000                                 2974 ( 99%)             1 ( 0%)
Disks in storage pool: T26TB (Maximum disk size allowed is 8.2 TB)
nsd21                     500        3 No       Yes            500 (100%)             1 ( 0%)
nsd20                     500        3 No       Yes            500 (100%)             1 ( 0%)
nsd19                     500        3 No       Yes            500 (100%)             1 ( 0%)
nsd18                     500        3 No       Yes            500 (100%)             1 ( 0%)
nsd17                     500        3 No       Yes            500 (100%)             1 ( 0%)
nsd16                     500        3 No       Yes            500 (100%)             1 ( 0%)
                -------------                         -------------------- -------------------
(pool total)             3000                                 3000 (100%)             1 ( 0%)
                =============                         ==================== ===================
(data)                   9200                                 9093 ( 99%)             2 ( 0%)
(metadata)               3200                                 3120 ( 97%)             1 ( 0%)
                =============                         ==================== ===================
(total)                  9200                                 9093 ( 99%)             2 ( 0%)
Inode Information
-----------------
Total number of used inodes in all Inode spaces: 284090
Total number of free inodes in all Inode spaces: 20318278
Total number of allocated inodes in all Inode spaces: 20602368
Total of Maximum number of inodes in all Inode spaces: 55325440
Thanks
Danny Alexander Calderon R
Client Technical Specialist - CTS
Storage
STG Colombia
Phone: 57-1-6281956
Mobile: 57- 318 352 9258
Carrera 53 Número 100-25
Bogotá, Colombia
From: gpfsug-discuss-request at spectrumscale.org
To: gpfsug-discuss at spectrumscale.org
Date: 01/03/2016 05:18 PM
Subject: gpfsug-discuss Digest, Vol 48, Issue 2
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Today's Topics:
1. Resource exhausted by Pool Migration
(Danny Alexander Calderon Rodriguez)
2. Re: Resource exhausted by Pool Migration (Sven Oehme)
3. metadata replication question
(Simon Thompson (Research Computing - IT Services))
4. Re: metadata replication question (Barry Evans)
5. Re: metadata replication question
(Simon Thompson (Research Computing - IT Services))
----------------------------------------------------------------------
Message: 1
Date: Sun, 3 Jan 2016 15:55:59 +0000
From: "Danny Alexander Calderon Rodriguez" <dacalder at co.ibm.com>
To: gpfsug-discuss at spectrumscale.org
Subject: [gpfsug-discuss] Resource exhausted by Pool Migration
Message-ID: <201601031556.u03Futfw007019 at d24av01.br.ibm.com>
Content-Type: text/plain; charset="utf-8"
Hi all,
I currently have a 4.2 Spectrum Scale cluster with protocol services. We are
managing small files (32 KB to 140 KB). When I try to migrate some files
(120,000 files), the system resources of all nodes are exhausted and the
protocol nodes stop serving clients.
I want to ask if there is any way to limit the resources consumed during the
migration?
Thanks to all
Sent from IBM Verse
------------------------------
Message: 2
Date: Sun, 3 Jan 2016 08:42:27 -0800
From: Sven Oehme <oehmes at gmail.com>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] Resource exhausted by Pool Migration
Message-ID:
<CALssuR2z4nzTxcTN0nJTF0u0OXSiCekcQ+DHUzMfTnGUnLU58g at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi Danny,
Can you be a bit more specific about which resources get exhausted? Are you
talking about operating system resources or Spectrum Scale resources
(file cache or pagepool)?
When you migrate the files (I assume with the policy engine), did you specify
which nodes do the migration (-N hostnames), or did you just run mmapplypolicy
without any options?
Can you post either your entire mmlsconfig, or at least the output of:
for i in maxFilesToCache pagepool maxStatCache nsdMinWorkerThreads nsdMaxWorkerThreads worker1Threads; do mmlsconfig $i ; done
mmlsfs, mmlsnsd and mmlscluster output might be useful too.
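For illustration only (not a tested recommendation), a throttled policy run could
look something like the sketch below. The file system name, policy file path, node
names and numbers are placeholders, and the --qos / mmchqos part only applies if
QoS is available and enabled on your 4.2 cluster:

  # Run the migration only on the NSD servers (keep the CES nodes out of it) and
  # reduce per-node parallelism: -m is threads per node (default 24), -B is the
  # number of files per work bucket (default 100).
  mmapplypolicy yourfs -P /tmp/migrate.pol \
      -N nsdserver1,nsdserver2 \
      -m 8 -B 50 --qos maintenance

  # Optionally cap the IOPS the QoS maintenance class may consume per pool
  # (the numbers are made up; tune them for your storage):
  mmchqos yourfs --enable pool=system,maintenance=2000IOPS,other=unlimited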
sven
On Sun, Jan 3, 2016 at 7:55 AM, Danny Alexander Calderon Rodriguez <
dacalder at co.ibm.com> wrote:
> Hi all,
>
> I currently have a 4.2 Spectrum Scale cluster with protocol services. We are
> managing small files (32 KB to 140 KB). When I try to migrate some files
> (120,000 files), the system resources of all nodes are exhausted and the
> protocol nodes stop serving clients.
>
> I want to ask if there is any way to limit the resources consumed during the
> migration?
>
>
> Thanks to all
>
>
>
> Sent from IBM Verse
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
------------------------------
Message: 3
Date: Sun, 3 Jan 2016 21:56:26 +0000
From: "Simon Thompson (Research Computing - IT Services)"
<S.J.Thompson at bham.ac.uk>
To: "gpfsug-discuss at spectrumscale.org"
<gpfsug-discuss at spectrumscale.org>
Subject: [gpfsug-discuss] metadata replication question
Message-ID:
<CF45EE16DEF2FE4B9AA7FF2B6EE26545D16BD321 at EX13.adf.bham.ac.uk>
Content-Type: text/plain; charset="us-ascii"
I currently have 4 NSD servers in a cluster, two pairs in two data centres.
Data and metadata replication is currently set to 2, with metadata sitting
on SAS drives in a Storwize array. I also have a VM floating between the two
data centres to guarantee quorum in one of them in the event of split brain.
I'd like to add some SSD for metadata.
Should I:
Add RAID 1 SSD to the Storwize?
Add local SSD to the NSD servers?
If I did the second, should I:
add SSD to each NSD server (not RAID 1), set each in a different failure
group, and set metadata replication to 4?
add SSD to each NSD server as RAID 1, using the same failure group for each
data centre pair?
add SSD to each NSD server (not RAID 1), using the same failure group for
each data centre pair?
Or something else entirely?
What I want to survive is a split data centre situation or the failure of a
single NSD server at any point...
Thoughts? Comments?
I'm thinking the first of the NSD-local options uses 4 writes, as does the
second, but each NSD server then has a local copy of the metadata; if its
SSD fails, it should be able to get the metadata from its local partner pair
anyway (with readlocalreplica)?
I'd like a cost-competitive solution that gives faster performance than the
current SAS drives.
I was also thinking I might add an SSD to each NSD server for a system.log
pool for HAWC as well...
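To make the "same failure group per data centre pair" idea concrete, a rough
sketch of what the NSD stanzas might look like (device paths, server names,
NSD names and the file system name below are made up, and this assumes
metadata replication stays at 2 so each DC holds one copy):

  # ssd_metadata.stanza - local SSD in each NSD server, metadata only,
  # one failure group per data centre
  %nsd: nsd=ssd_dc1_srv1 device=/dev/ssd0 servers=nsdserver1 usage=metadataOnly failureGroup=10 pool=system
  %nsd: nsd=ssd_dc1_srv2 device=/dev/ssd0 servers=nsdserver2 usage=metadataOnly failureGroup=10 pool=system
  %nsd: nsd=ssd_dc2_srv3 device=/dev/ssd0 servers=nsdserver3 usage=metadataOnly failureGroup=20 pool=system
  %nsd: nsd=ssd_dc2_srv4 device=/dev/ssd0 servers=nsdserver4 usage=metadataOnly failureGroup=20 pool=system

  mmcrnsd -F ssd_metadata.stanza
  mmadddisk gpfsfs -F ssd_metadata.stanza   # 'gpfsfs' is a placeholder device name
  mmlsfs gpfsfs -m                          # confirm default metadata replicas is still 2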
Thanks
Simon
------------------------------
Message: 4
Date: Sun, 3 Jan 2016 22:10:21 +0000
From: Barry Evans <bevans at pixitmedia.com>
To: gpfsug-discuss at spectrumscale.org
Subject: Re: [gpfsug-discuss] metadata replication question
Message-ID: <56899C4D.4050907 at pixitmedia.com>
Content-Type: text/plain; charset="windows-1252"; Format="flowed"
Can all 4 NSD servers see all existing Storwize arrays across both DCs?
Cheers,
Barry
On 03/01/2016 21:56, Simon Thompson (Research Computing - IT Services)
wrote:
> I currently have 4 NSD servers in a cluster, two pairs in two data centres.
> Data and metadata replication is currently set to 2, with metadata sitting
> on SAS drives in a Storwize array. I also have a VM floating between the two
> data centres to guarantee quorum in one of them in the event of split brain.
>
> I'd like to add some SSD for metadata.
>
> Should I:
>
> Add RAID 1 SSD to the Storwize?
>
> Add local SSD to the NSD servers?
>
> If I did the second, should I:
> add SSD to each NSD server (not RAID 1), set each in a different failure
> group, and set metadata replication to 4?
> add SSD to each NSD server as RAID 1, using the same failure group for each
> data centre pair?
> add SSD to each NSD server (not RAID 1), using the same failure group for
> each data centre pair?
>
> Or something else entirely?
>
> What I want to survive is a split data centre situation or the failure of a
> single NSD server at any point...
>
> Thoughts? Comments?
>
> I'm thinking the first of the NSD-local options uses 4 writes, as does the
> second, but each NSD server then has a local copy of the metadata; if its
> SSD fails, it should be able to get the metadata from its local partner pair
> anyway (with readlocalreplica)?
>
> I'd like a cost-competitive solution that gives faster performance than the
> current SAS drives.
>
> I was also thinking I might add an SSD to each NSD server for a system.log
> pool for HAWC as well...
>
> Thanks
>
> Simon
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Barry Evans
Technical Director & Co-Founder
Pixit Media
Mobile: +44 (0)7950 666 248
http://www.pixitmedia.com
------------------------------
Message: 5
Date: Sun, 3 Jan 2016 22:18:24 +0000
From: "Simon Thompson (Research Computing - IT Services)"
<S.J.Thompson at bham.ac.uk>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] metadata replication question
Message-ID:
<CF45EE16DEF2FE4B9AA7FF2B6EE26545D16BD363 at EX13.adf.bham.ac.uk>
Content-Type: text/plain; charset="us-ascii"
Yes, there is an extended SAN in place. The failure groups for the storage are
different in each DC, so we guarantee that data replication keeps one copy
per DC.
Simon
________________________________________
From: gpfsug-discuss-bounces at spectrumscale.org
[gpfsug-discuss-bounces at spectrumscale.org] on behalf of Barry Evans
[bevans at pixitmedia.com]
Sent: 03 January 2016 22:10
To: gpfsug-discuss at spectrumscale.org
Subject: Re: [gpfsug-discuss] metadata replication question
Can all 4 NSD servers see all existing Storwize arrays across both DCs?
Cheers,
Barry
On 03/01/2016 21:56, Simon Thompson (Research Computing - IT Services)
wrote:
I currently have 4 NSD servers in a cluster, two pairs in two data centres.
Data and metadata replication is currently set to 2, with metadata sitting
on SAS drives in a Storwize array. I also have a VM floating between the two
data centres to guarantee quorum in one of them in the event of split brain.
I'd like to add some SSD for metadata.
Should I:
Add RAID 1 SSD to the Storwize?
Add local SSD to the NSD servers?
If I did the second, should I:
add SSD to each NSD server (not RAID 1), set each in a different failure
group, and set metadata replication to 4?
add SSD to each NSD server as RAID 1, using the same failure group for each
data centre pair?
add SSD to each NSD server (not RAID 1), using the same failure group for
each data centre pair?
Or something else entirely?
What I want to survive is a split data centre situation or the failure of a
single NSD server at any point...
Thoughts? Comments?
I'm thinking the first of the NSD-local options uses 4 writes, as does the
second, but each NSD server then has a local copy of the metadata; if its
SSD fails, it should be able to get the metadata from its local partner pair
anyway (with readlocalreplica)?
I'd like a cost-competitive solution that gives faster performance than the
current SAS drives.
I was also thinking I might add an SSD to each NSD server for a system.log
pool for HAWC as well...
Thanks
Simon
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Barry Evans
Technical Director & Co-Founder
Pixit Media
Mobile: +44 (0)7950 666 248
http://www.pixitmedia.com
------------------------------
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
End of gpfsug-discuss Digest, Vol 48, Issue 2
*********************************************