[gpfsug-discuss] gpfsug-discuss Digest, Vol 48, Issue 2
Danny Alexander Calderon Rodriguez
dacalder at co.ibm.com
Mon Jan 4 16:50:45 GMT 2016
Hi Danny,
Can you be a bit more specific about which resources get exhausted? Are you
talking about operating system resources or Spectrum Scale resources
(file cache or pagepool)?
When you migrate the files (I assume with the policy engine), did you specify
which nodes do the migration (-N hostnames), or did you just run mmapplypolicy
without any options?
Can you post either your entire mmlsconfig, or at least the output of:
for i in maxFilesToCache pagepool maxStatCache nsdMinWorkerThreads nsdMaxWorkerThreads worker1Threads; do mmlsconfig $i ; done
mmlsfs, mmlsnsd and mmlscluster output might be useful too.
Hi Sven
Sure.
The resources that get exhausted are CPU and RAM. I can see that while the
system is doing the pool migration, the SMB service is down (or rather so slow
that it is effectively down).
When I migrated the files I ran tests with one node, with two nodes (the NSD
nodes), and with all nodes (NSD and protocol nodes).
Here is the output:
maxFilesToCache 1000000
pagepool 20G
maxStatCache 1000
nsdMinWorkerThreads 8
nsdMinWorkerThreads 1 [cesNodes]
nsdMaxWorkerThreads 512
nsdMaxWorkerThreads 2 [cesNodes]
worker1Threads 48
worker1Threads 800 [cesNodes]
mmlsfs (there are two file systems, one for the CES shared root and another
for data):
File system attributes for /dev/datafs:
=======================================
flag                     value                     description
------------------------ ------------------------- -----------------------------------
 -f                      8192                      Minimum fragment size in bytes
 -i                      4096                      Inode size in bytes
 -I                      16384                     Indirect block size in bytes
 -m                      1                         Default number of metadata replicas
 -M                      2                         Maximum number of metadata replicas
 -r                      1                         Default number of data replicas
 -R                      2                         Maximum number of data replicas
 -j                      cluster                   Block allocation type
 -D                      nfs4                      File locking semantics in effect
 -k                      nfs4                      ACL semantics in effect
 -n                      32                        Estimated number of nodes that will mount file system
 -B                      262144                    Block size
 -Q                      none                      Quotas accounting enabled
                         none                      Quotas enforced
                         none                      Default quotas enabled
 --perfileset-quota      No                        Per-fileset quota enforcement
 --filesetdf             No                        Fileset df enabled?
 -V                      15.01 (4.2.0.0)           File system version
 --create-time           Wed Dec 23 09:31:07 2015  File system creation time
 -z                      No                        Is DMAPI enabled?
 -L                      4194304                   Logfile size
 -E                      Yes                       Exact mtime mount option
 -S                      No                        Suppress atime mount option
 -K                      whenpossible              Strict replica allocation option
 --fastea                Yes                       Fast external attributes enabled?
 --encryption            No                        Encryption enabled?
 --inode-limit           55325440                  Maximum number of inodes in all inode spaces
 --log-replicas          0                         Number of log replicas
 --is4KAligned           Yes                       is4KAligned?
 --rapid-repair          Yes                       rapidRepair enabled?
 --write-cache-threshold 0                         HAWC Threshold (max 65536)
 -P                      system;T12TB;T26TB        Disk storage pools in file system
 -d                      nsd2;nsd3;nsd4;nsd5;nsd6;nsd7;nsd8;nsd9;nsd16;nsd17;nsd18;nsd19;nsd20;nsd15;nsd21;nsd10;nsd11;nsd12;nsd13;nsd14  Disks in file system
 -A                      yes                       Automatic mount option
 -o                      none                      Additional mount options
 -T                      /datafs                   Default mount point
 --mount-priority        0                         Mount priority
File system attributes for /dev/sharerfs:
=========================================
flag                     value                     description
------------------------ ------------------------- -----------------------------------
 -f                      8192                      Minimum fragment size in bytes
 -i                      4096                      Inode size in bytes
 -I                      16384                     Indirect block size in bytes
 -m                      1                         Default number of metadata replicas
 -M                      2                         Maximum number of metadata replicas
 -r                      1                         Default number of data replicas
 -R                      2                         Maximum number of data replicas
 -j                      scatter                   Block allocation type
 -D                      nfs4                      File locking semantics in effect
 -k                      nfs4                      ACL semantics in effect
 -n                      100                       Estimated number of nodes that will mount file system
 -B                      262144                    Block size
 -Q                      none                      Quotas accounting enabled
                         none                      Quotas enforced
                         none                      Default quotas enabled
 --perfileset-quota      No                        Per-fileset quota enforcement
 --filesetdf             No                        Fileset df enabled?
 -V                      15.01 (4.2.0.0)           File system version
 --create-time           Tue Dec 22 17:19:33 2015  File system creation time
 -z                      No                        Is DMAPI enabled?
 -L                      4194304                   Logfile size
 -E                      Yes                       Exact mtime mount option
 -S                      No                        Suppress atime mount option
 -K                      whenpossible              Strict replica allocation option
 --fastea                Yes                       Fast external attributes enabled?
 --encryption            No                        Encryption enabled?
 --inode-limit           102656                    Maximum number of inodes
 --log-replicas          0                         Number of log replicas
 --is4KAligned           Yes                       is4KAligned?
 --rapid-repair          Yes                       rapidRepair enabled?
 --write-cache-threshold 0                         HAWC Threshold (max 65536)
 -P                      system                    Disk storage pools in file system
 -d                      nsd1                      Disks in file system
 -A                      yes                       Automatic mount option
 -o                      none                      Additional mount options
 -T                      /sharedr                  Default mount point
 --mount-priority        0                         Mount priority
mmlsnsd
File system Disk name NSD servers
---------------------------------------------------------------------------
datafs nsd2 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd3 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd4 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd5 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd6 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd7 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd8 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd9 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd15 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd16 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd17 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd18 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd19 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd20 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd21 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd10 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd11 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd12 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd13 NSDSERV01_Daemon,NSDSERV02_Daemon
datafs nsd14 NSDSERV01_Daemon,NSDSERV02_Daemon
sharerfs nsd1 NSDSERV01_Daemon,NSDSERV02_Daemon
mmlscluster
GPFS cluster information
========================
GPFS cluster name: spectrum_syc.localdomain
GPFS cluster id: 2719632319013564592
GPFS UID domain: spectrum_syc.localdomain
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
Repository type: CCR
 Node  Daemon node name    IP address     Admin node name     Designation
-----------------------------------------------------------------------
   1   NSDSERV01_Daemon    172.19.20.61   NSDSERV01_Daemon    quorum-manager-perfmon
   2   NSDSERV02_Daemon    172.19.20.62   NSDSERV02_Daemon    quorum-manager-perfmon
   3   PROTSERV01_Daemon   172.19.20.63   PROTSERV01_Daemon   quorum-manager-perfmon
   4   PROTSERV02_Daemon   172.19.20.64   PROTSERV02_Daemon   manager-perfmon
And the mmdf output:
disk                disk size  failure holds    holds              free GB             free GB
name                    in GB    group metadata data        in full blocks        in fragments
--------------- ------------- -------- -------- ----- -------------------- -------------------
Disks in storage pool: system (Maximum disk size allowed is 4.1 TB)
nsd2                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd3                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd4                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd5                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd6                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd7                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd8                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
nsd9                      400        1 Yes      Yes            390 ( 97%)             1 ( 0%)
                -------------                         -------------------- -------------------
(pool total)             3200                                 3120 ( 97%)             1 ( 0%)
Disks in storage pool: T12TB (Maximum disk size allowed is 4.1 TB)
nsd14                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
nsd13                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
nsd12                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
nsd11                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
nsd10                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
nsd15                     500        2 No       Yes            496 ( 99%)             1 ( 0%)
                -------------                         -------------------- -------------------
(pool total)             3000                                 2974 ( 99%)             1 ( 0%)
Disks in storage pool: T26TB (Maximum disk size allowed is 8.2 TB)
nsd21                     500        3 No       Yes            500 (100%)             1 ( 0%)
nsd20                     500        3 No       Yes            500 (100%)             1 ( 0%)
nsd19                     500        3 No       Yes            500 (100%)             1 ( 0%)
nsd18                     500        3 No       Yes            500 (100%)             1 ( 0%)
nsd17                     500        3 No       Yes            500 (100%)             1 ( 0%)
nsd16                     500        3 No       Yes            500 (100%)             1 ( 0%)
                -------------                         -------------------- -------------------
(pool total)             3000                                 3000 (100%)             1 ( 0%)
                =============                         ==================== ===================
(data)                   9200                                 9093 ( 99%)             2 ( 0%)
(metadata)               3200                                 3120 ( 97%)             1 ( 0%)
                =============                         ==================== ===================
(total)                  9200                                 9093 ( 99%)             2 ( 0%)
Inode Information
-----------------
Total number of used inodes in all Inode spaces: 284090
Total number of free inodes in all Inode spaces: 20318278
Total number of allocated inodes in all Inode spaces: 20602368
Total of Maximum number of inodes in all Inode spaces: 55325440
Thanks
Danny Alexander Calderon R
Client Technical Specialist - CTS
Storage
STG Colombia
Phone: 57-1-6281956
Mobile: 57- 318 352 9258
Carrera 53 Número 100-25
Bogotá, Colombia
From: gpfsug-discuss-request at spectrumscale.org
To: gpfsug-discuss at spectrumscale.org
Date: 01/03/2016 05:18 PM
Subject: gpfsug-discuss Digest, Vol 48, Issue 2
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Today's Topics:
1. Resource exhausted by Pool Migration
(Danny Alexander Calderon Rodriguez)
2. Re: Resource exhausted by Pool Migration (Sven Oehme)
3. metadata replication question
(Simon Thompson (Research Computing - IT Services))
4. Re: metadata replication question (Barry Evans)
5. Re: metadata replication question
(Simon Thompson (Research Computing - IT Services))
----------------------------------------------------------------------
Message: 1
Date: Sun, 3 Jan 2016 15:55:59 +0000
From: "Danny Alexander Calderon Rodriguez" <dacalder at co.ibm.com>
To: gpfsug-discuss at spectrumscale.org
Subject: [gpfsug-discuss] Resource exhausted by Pool Migration
Message-ID: <201601031556.u03Futfw007019 at d24av01.br.ibm.com>
Content-Type: text/plain; charset="utf-8"
Hi all,
I currently have a 4.2 Spectrum Scale cluster with protocol services. We are
managing small files (32 KB to 140 KB). When I try to migrate some files
(120,000 files), the system resources of all nodes are exhausted and the
protocol nodes stop serving clients.
I want to ask if there is any way to limit the resources consumed during the
migration?
Thanks to all
Sent from IBM Verse
------------------------------
Message: 2
Date: Sun, 3 Jan 2016 08:42:27 -0800
From: Sven Oehme <oehmes at gmail.com>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] Resource exhausted by Pool Migration
Message-ID:
<CALssuR2z4nzTxcTN0nJTF0u0OXSiCekcQ+DHUzMfTnGUnLU58g at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi Danny,
Can you be a bit more specific about which resources get exhausted? Are you
talking about operating system resources or Spectrum Scale resources
(file cache or pagepool)?
When you migrate the files (I assume with the policy engine), did you specify
which nodes do the migration (-N hostnames), or did you just run mmapplypolicy
without any options?
Can you post either your entire mmlsconfig, or at least the output of:
for i in maxFilesToCache pagepool maxStatCache nsdMinWorkerThreads nsdMaxWorkerThreads worker1Threads; do mmlsconfig $i ; done
mmlsfs, mmlsnsd and mmlscluster output might be useful too.
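For illustration only (not a tested recommendation), a throttled policy run could
look something like the sketch below. The file system name, policy file path, node
names and numbers are placeholders, and the --qos / mmchqos part only applies if
QoS is available and enabled on your 4.2 cluster:

  # Run the migration only on the NSD servers (keep the CES nodes out of it) and
  # reduce per-node parallelism: -m is threads per node (default 24), -B is the
  # number of files per work bucket (default 100).
  mmapplypolicy yourfs -P /tmp/migrate.pol \
      -N nsdserver1,nsdserver2 \
      -m 8 -B 50 --qos maintenance

  # Optionally cap the IOPS the QoS maintenance class may consume per pool
  # (the numbers are made up; tune them for your storage):
  mmchqos yourfs --enable pool=system,maintenance=2000IOPS,other=unlimited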
sven
On Sun, Jan 3, 2016 at 7:55 AM, Danny Alexander Calderon Rodriguez <
dacalder at co.ibm.com> wrote:
> Hi all,
>
> I currently have a 4.2 Spectrum Scale cluster with protocol services. We are
> managing small files (32 KB to 140 KB). When I try to migrate some files
> (120,000 files), the system resources of all nodes are exhausted and the
> protocol nodes stop serving clients.
>
> I want to ask if there is any way to limit the resources consumed during the
> migration?
>
>
> Thanks to all
>
>
>
> Sent from IBM Verse
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
------------------------------
Message: 3
Date: Sun, 3 Jan 2016 21:56:26 +0000
From: "Simon Thompson (Research Computing - IT Services)"
<S.J.Thompson at bham.ac.uk>
To: "gpfsug-discuss at spectrumscale.org"
<gpfsug-discuss at spectrumscale.org>
Subject: [gpfsug-discuss] metadata replication question
Message-ID:
<CF45EE16DEF2FE4B9AA7FF2B6EE26545D16BD321 at EX13.adf.bham.ac.uk>
Content-Type: text/plain; charset="us-ascii"
I currently have 4 NSD servers in a cluster, two pairs in two data centres.
Data and metadata replication is currently set to 2, with metadata sitting
on SAS drives in a Storwize array. I also have a VM floating between the two
data centres to guarantee quorum in one of them in the event of split brain.
I'd like to add some SSD for metadata.
Should I:
Add RAID 1 SSD to the Storwize?
Add local SSD to the NSD servers?
If I did the second, should I:
add SSD to each NSD server (not RAID 1), set each in a different failure
group, and set metadata replication to 4?
add SSD to each NSD server as RAID 1, using the same failure group for each
data centre pair?
add SSD to each NSD server (not RAID 1), using the same failure group for
each data centre pair?
Or something else entirely?
What I want to survive is a split data centre situation or the failure of a
single NSD server at any point...
Thoughts? Comments?
I'm thinking the first of the NSD-local options uses 4 writes, as does the
second, but each NSD server then has a local copy of the metadata; if its
SSD fails, it should be able to get the metadata from its local partner pair
anyway (with readlocalreplica)?
I'd like a cost-competitive solution that gives faster performance than the
current SAS drives.
I was also thinking I might add an SSD to each NSD server for a system.log
pool for HAWC as well...
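To make the "same failure group per data centre pair" idea concrete, a rough
sketch of what the NSD stanzas might look like (device paths, server names,
NSD names and the file system name below are made up, and this assumes
metadata replication stays at 2 so each DC holds one copy):

  # ssd_metadata.stanza - local SSD in each NSD server, metadata only,
  # one failure group per data centre
  %nsd: nsd=ssd_dc1_srv1 device=/dev/ssd0 servers=nsdserver1 usage=metadataOnly failureGroup=10 pool=system
  %nsd: nsd=ssd_dc1_srv2 device=/dev/ssd0 servers=nsdserver2 usage=metadataOnly failureGroup=10 pool=system
  %nsd: nsd=ssd_dc2_srv3 device=/dev/ssd0 servers=nsdserver3 usage=metadataOnly failureGroup=20 pool=system
  %nsd: nsd=ssd_dc2_srv4 device=/dev/ssd0 servers=nsdserver4 usage=metadataOnly failureGroup=20 pool=system

  mmcrnsd -F ssd_metadata.stanza
  mmadddisk gpfsfs -F ssd_metadata.stanza   # 'gpfsfs' is a placeholder device name
  mmlsfs gpfsfs -m                          # confirm default metadata replicas is still 2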
Thanks
Simon
------------------------------
Message: 4
Date: Sun, 3 Jan 2016 22:10:21 +0000
From: Barry Evans <bevans at pixitmedia.com>
To: gpfsug-discuss at spectrumscale.org
Subject: Re: [gpfsug-discuss] metadata replication question
Message-ID: <56899C4D.4050907 at pixitmedia.com>
Content-Type: text/plain; charset="windows-1252"; Format="flowed"
Can all 4 NSD servers see all existing Storwize arrays across both DCs?
Cheers,
Barry
On 03/01/2016 21:56, Simon Thompson (Research Computing - IT Services)
wrote:
> I currently have 4 NSD servers in a cluster, two pairs in two data centres.
> Data and metadata replication is currently set to 2, with metadata sitting
> on SAS drives in a Storwize array. I also have a VM floating between the two
> data centres to guarantee quorum in one of them in the event of split brain.
>
> I'd like to add some SSD for metadata.
>
> Should I:
>
> Add RAID 1 SSD to the Storwize?
>
> Add local SSD to the NSD servers?
>
> If I did the second, should I:
> add SSD to each NSD server (not RAID 1), set each in a different failure
> group, and set metadata replication to 4?
> add SSD to each NSD server as RAID 1, using the same failure group for each
> data centre pair?
> add SSD to each NSD server (not RAID 1), using the same failure group for
> each data centre pair?
>
> Or something else entirely?
>
> What I want to survive is a split data centre situation or the failure of a
> single NSD server at any point...
>
> Thoughts? Comments?
>
> I'm thinking the first of the NSD-local options uses 4 writes, as does the
> second, but each NSD server then has a local copy of the metadata; if its
> SSD fails, it should be able to get the metadata from its local partner pair
> anyway (with readlocalreplica)?
>
> I'd like a cost-competitive solution that gives faster performance than the
> current SAS drives.
>
> I was also thinking I might add an SSD to each NSD server for a system.log
> pool for HAWC as well...
>
> Thanks
>
> Simon
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Barry Evans
Technical Director & Co-Founder
Pixit Media
Mobile: +44 (0)7950 666 248
http://www.pixitmedia.com
------------------------------
Message: 5
Date: Sun, 3 Jan 2016 22:18:24 +0000
From: "Simon Thompson (Research Computing - IT Services)"
<S.J.Thompson at bham.ac.uk>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] metadata replication question
Message-ID:
<CF45EE16DEF2FE4B9AA7FF2B6EE26545D16BD363 at EX13.adf.bham.ac.uk>
Content-Type: text/plain; charset="us-ascii"
Yes, there is an extended SAN in place. The failure groups for the storage are
different in each DC, so we guarantee that data replication keeps one copy
per DC.
Simon
________________________________________
From: gpfsug-discuss-bounces at spectrumscale.org
[gpfsug-discuss-bounces at spectrumscale.org] on behalf of Barry Evans
[bevans at pixitmedia.com]
Sent: 03 January 2016 22:10
To: gpfsug-discuss at spectrumscale.org
Subject: Re: [gpfsug-discuss] metadata replication question
Can all 4 NSD servers see all existing Storwize arrays across both DCs?
Cheers,
Barry
On 03/01/2016 21:56, Simon Thompson (Research Computing - IT Services)
wrote:
I currently have 4 NSD servers in a cluster, two pairs in two data centres.
Data and metadata replication is currently set to 2, with metadata sitting
on SAS drives in a Storwize array. I also have a VM floating between the two
data centres to guarantee quorum in one of them in the event of split brain.
I'd like to add some SSD for metadata.
Should I:
Add RAID 1 SSD to the Storwize?
Add local SSD to the NSD servers?
If I did the second, should I:
add SSD to each NSD server (not RAID 1), set each in a different failure
group, and set metadata replication to 4?
add SSD to each NSD server as RAID 1, using the same failure group for each
data centre pair?
add SSD to each NSD server (not RAID 1), using the same failure group for
each data centre pair?
Or something else entirely?
What I want to survive is a split data centre situation or the failure of a
single NSD server at any point...
Thoughts? Comments?
I'm thinking the first of the NSD-local options uses 4 writes, as does the
second, but each NSD server then has a local copy of the metadata; if its
SSD fails, it should be able to get the metadata from its local partner pair
anyway (with readlocalreplica)?
I'd like a cost-competitive solution that gives faster performance than the
current SAS drives.
I was also thinking I might add an SSD to each NSD server for a system.log
pool for HAWC as well...
Thanks
Simon
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Barry Evans
Technical Director & Co-Founder
Pixit Media
Mobile: +44 (0)7950 666 248
http://www.pixitmedia.com
------------------------------
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
End of gpfsug-discuss Digest, Vol 48, Issue 2
*********************************************