From u.sibiller at science-computing.de Tue Jun 1 15:26:10 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Tue, 1 Jun 2021 16:26:10 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota Message-ID: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> Hi, I experience some strangeness that I fail to understand completely. I have a fileset that got copied (rsynced) from one cluster to another. The reported size (mmrepquota) of the source filesystem is 800G (and due to data and metadata replication being set to 2 this effectively means 400G). After syncing the data to the destination the size there is ~457GB. $ mmrepquota -j srcfilesys | grep fileset srcfileset FILESET 800 800 800 0 none | 863 0 0 0 none $ mmrepquota -j dstfilesys | grep fileset fileset root FILESET 457 400 400 0 none | 853 0 0 0 none (note: on the destination filesystem we have set ignoreReplicationOnStatfs yes IgnoreReplicaSpaceOnStat yes ignoreReplicationForQuota yes so there's no need to do divisions ) (note2: the destination had 10 files fewer. These were small leftover .nfs* files: $ du --block-size 1 /srcfilesys/fileset/.nfs* 1024 /srvfilesys/fileset//.nfs0000000000f8a57800000505 1024 /srvfilesys/fileset//.nfs0000000002808f4d000000cb 1024 /srvfilesys/fileset//.nfs0000000002af44db00005509 1024 /srvfilesys/fileset//.nfs00000000034eb9270000072a 1024 /srvfilesys/fileset//.nfs0000000003a9b48300002974 1024 /srvfilesys/fileset//.nfs0000000003d10f990000028a $ du --apparent-size --block-size 1 /srcfilesys/fileset/.nfs* 524 /srvfilesys/fileset//.nfs0000000000f8a57800000505 524 /srvfilesys/fileset//.nfs0000000002808f4d000000cb 524 /srvfilesys/fileset//.nfs0000000002af44db00005509 524 /srvfilesys/fileset//.nfs00000000034eb9270000072a 524 /srvfilesys/fileset//.nfs0000000003a9b48300002974 524 /srvfilesys/fileset//.nfs0000000003d10f990000028a ) While trying to understand what's going on here I found this on the source file system (which is valid for all files, with different numbers of course): $ du --block-size 1 /srcfilesys/fileset/filename 65536 /srcfilesys/fileset/filename $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename 3994 /srcfilesys/fileset/filename $ stat /srcfilesys/fileset/filename File: '/srcfilesys/fileset/filename' Size: 3994 Blocks: 128 IO Block: 1048576 regular file Device: 2ah/42d Inode: 23266095 Links: 1 Access: (0660/-rw-rw----) Uid: (73018/ cpunnoo) Gid: (50070/ dc-rti) Context: system_u:object_r:unlabeled_t:s0 Access: 2021-05-12 20:10:13.814459305 +0200 Modify: 2020-07-16 11:08:41.631006000 +0200 Change: 2020-07-16 11:08:41.630896177 +0200 Birth: - If I sum up the disk usage of the first du I end up with 799.986G in total, which matches the mmrepquota output. If I add up the disk usage of the second du I end up at 456.569G, _which matches the mmrepquota output on the destination system_. So on the source filesystem the quota seems to add up the allocated size (the du without --apparent-size), while on the destination filesystem the quota value matches the sum of the apparent sizes.
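(For reference, the two totals were obtained by summing the respective du output over the whole fileset. A minimal sketch of how such sums can be reproduced with GNU du -- not necessarily the exact invocation used here, and the path is just a placeholder:
$ du -s --block-size 1 /srcfilesys/fileset                    # total allocated bytes
$ du -s --apparent-size --block-size 1 /srcfilesys/fileset    # total apparent bytes
Dividing each total by 2^30 gives the GiB figures quoted above.)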
Running the dus on the destination filesystem reports other numbers: $ du --block-size 1 /dstfilesys/fileset/filename 8192 /dstfilesys/fileset/filename $ du --apparent-size --block-size 1 /dstfilesys/fileset/filename 3994 /dstfilesys/fileset/filename $ stat /dstfilesys/fileset/filename File: /dstfilesys/fileset/filename Size: 3994 Blocks: 16 IO Block: 4194304 regular file Device: 3dh/61d Inode: 2166358719 Links: 1 Access: (0660/-rw-rw----) Uid: (73018/ cpunnoo) Gid: (50070/ dc-rti) Access: 2021-05-29 07:52:56.069440382 +0200 Modify: 2020-07-16 11:08:41.631006000 +0200 Change: 2021-05-12 20:10:13.970443145 +0200 Birth: - Summing them up shows almost identical numbers for both dus: 467528 467527 which I really do not get at all... So is there an explanation of how mmrepquota and du and du --apparent-size are related? Uli PS: Some more details: The source filesystem is RHEL7 with gpfs.base 5.0.5-5: flag value description ------------------- ------------------------ ----------------------------------- -f 32768 Minimum fragment (subblock) size in bytes -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes -m 2 Default number of metadata replicas -M 2 Maximum number of metadata replicas -r 2 Default number of data replicas -R 2 Maximum number of data replicas -j scatter Block allocation type -D nfs4 File locking semantics in effect -k all ACL semantics in effect -n 32 Estimated number of nodes that will mount file system -B 1048576 Block size -Q user;group;fileset Quotas accounting enabled user;group;fileset Quotas enforced none Default quotas enabled --perfileset-quota No Per-fileset quota enforcement --filesetdf Yes Fileset df enabled? -V 16.00 (4.2.2.0) Current file system version 14.10 (4.1.0.4) Original file system version --create-time Tue Feb 3 11:46:10 2015 File system creation time -z No Is DMAPI enabled? -L 4194304 Logfile size -E Yes Exact mtime mount option -S No Suppress atime mount option -K whenpossible Strict replica allocation option --fastea Yes Fast external attributes enabled? --encryption No Encryption enabled? --inode-limit 179217920 Maximum number of inodes in all inode spaces --log-replicas 0 Number of log replicas --is4KAligned Yes is4KAligned? --rapid-repair No rapidRepair enabled? --write-cache-threshold 0 HAWC Threshold (max 65536) --subblocks-per-full-block 32 Number of subblocks per full block -P system Disk storage pools in file system --file-audit-log No File Audit Logging enabled? --maintenance-mode No Maintenance Mode enabled? 
-d NSDDCS1N01;NSDDCS1N02;NSDDCS1N03;NSDDCS1N04;NSDDCS1N05;NSDDCS1N06;NSDDCS1N07;NSDDCS1N08;NSDDCS1N09;NSDDCS1N10;NSDDCS1N11;NSDDCS1N12;NSDNECE54001N01;NSDNECE54001N02; -d NSDNECE54001N03;NSDNECE54001N04;NSDNECE54001N05;NSDNECE54001N06;NSDNECE54001N07;NSDNECE54001N08;NSDNECE54001N09;NSDNECE54001N10;NSDNECE54001N11;NSDNECE54001N12;DESC1 Disks in file system -A yes Automatic mount option -o none Additional mount options -T /srcfilesys Default mount point --mount-priority 0 Mount priority The destination cluster is RHEL8 with gpfs.base-5.1.0-3.x86_64: flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes -m 2 Default number of metadata replicas -M 2 Maximum number of metadata replicas -r 2 Default number of data replicas -R 2 Maximum number of data replicas -j scatter Block allocation type -D nfs4 File locking semantics in effect -k all ACL semantics in effect -n 32 Estimated number of nodes that will mount file system -B 4194304 Block size -Q user;group;fileset Quotas accounting enabled none Quotas enforced none Default quotas enabled --perfileset-quota yes Per-fileset quota enforcement --filesetdf yes Fileset df enabled? -V 23.00 (5.0.5.0) File system version --create-time Tue May 11 16:51:27 2021 File system creation time -z no Is DMAPI enabled? -L 33554432 Logfile size -E yes Exact mtime mount option -S relatime Suppress atime mount option -K whenpossible Strict replica allocation option --fastea yes Fast external attributes enabled? --encryption no Encryption enabled? --inode-limit 105259008 Maximum number of inodes in all inode spaces --log-replicas 0 Number of log replicas --is4KAligned yes is4KAligned? --rapid-repair yes rapidRepair enabled? --write-cache-threshold 0 HAWC Threshold (max 65536) --subblocks-per-full-block 512 Number of subblocks per full block -P system Disk storage pools in file system --file-audit-log no File Audit Logging enabled? --maintenance-mode no Maintenance Mode enabled? -d RG001VS021;RG002VS021;RG001VS022;RG002VS022 Disks in file system -A yes Automatic mount option -o none Additional mount options -T /dstfilesys Default mount point --mount-priority 0 Mount priority -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From tortay at cc.in2p3.fr Tue Jun 1 15:56:42 2021 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Tue, 1 Jun 2021 16:56:42 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> Message-ID: <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> On 6/1/21 4:26 PM, Ulrich Sibiller wrote: [...] > ) > > While trying to understand what's going on here I found this on the > source file system (which is valid for all files, with different number > of course): > > $ du --block-size 1 /srcfilesys/fileset/filename > 65536?? /srcfilesys/fileset/filename > > $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename > 3994??? 
/srcfilesys/fileset/filename > > $ stat /srcfilesys/fileset/filename > File: '/srcfilesys/fileset/filename' > Size: 3994 Blocks: 128 IO Block: 1048576 regular file > Device: 2ah/42d Inode: 23266095 Links: 1 > Access: (0660/-rw-rw----) Uid: (73018/ cpunnoo) Gid: (50070/ dc-rti) > Context: system_u:object_r:unlabeled_t:s0 > Access: 2021-05-12 20:10:13.814459305 +0200 > Modify: 2020-07-16 11:08:41.631006000 +0200 > Change: 2020-07-16 11:08:41.630896177 +0200 > Birth: - > Hello, This looks like the sub-block overhead. If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). The sub-block is the minimum disk allocation for files (if the file content is too large to be kept in the inode, when that is supported on the specific GPFS filesystem). The "Blocks" value displayed by "stat" is in 512 bytes unit, so 128*512 = 65536 (which is consistent with "du"): two 32 kiB sub-blocks due to data replication. The "--apparent-size" option to "du" uses the user visible size not the actual disk usage (per the man page), so 3994 is also consistent w/ "stat" output. AFAIK, GPFS space quotas count the sub-blocks not the apparent sizes, so again this would be consistent with the overhead. Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases sparse files can also explain the differences. Loïc. -- | Loïc Tortay - IN2P3 Computing Centre | From krajaram at geocomputing.net Tue Jun 1 17:08:51 2021 From: krajaram at geocomputing.net (Kumaran Rajaram) Date: Tue, 1 Jun 2021 16:08:51 +0000 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> Message-ID: Hi, >> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. -T /srcfilesys Default mount point -V 16.00 (4.2.2.0) Current file system version 14.10 (4.1.0.4) Original file system version --create-time Tue Feb 3 11:46:10 2015 File system creation time -B 1048576 Block size -f 32768 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 32 Number of subblocks per full block The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. -T /dstfilesys Default mount point -V 23.00 (5.0.5.0) File system version --create-time Tue May 11 16:51:27 2021 File system creation time -B 4194304 Block size -f 8192 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 512 Number of subblocks per full block Hope this helps, -Kums -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Loic Tortay Sent: Tuesday, June 1, 2021 10:57 AM To: gpfsug main discussion list ; Ulrich Sibiller ; gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] du --apparent-size and quota On 6/1/21 4:26 PM, Ulrich Sibiller wrote: [...] > ) > > While trying to understand what's going on here I found this on the > source file system (which is valid for all files, with different > number of course): > > $ du --block-size 1 /srcfilesys/fileset/filename > 65536 /srcfilesys/fileset/filename > > $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename > 3994 /srcfilesys/fileset/filename > > $ stat /srcfilesys/fileset/filename > File: '/srcfilesys/fileset/filename' > Size: 3994 Blocks: 128 IO Block: 1048576 regular file > Device: 2ah/42d Inode: 23266095 Links: 1 > Access: (0660/-rw-rw----) Uid: (73018/ cpunnoo) Gid: (50070/ dc-rti) > Context: system_u:object_r:unlabeled_t:s0 > Access: 2021-05-12 20:10:13.814459305 +0200 > Modify: 2020-07-16 11:08:41.631006000 +0200 > Change: 2020-07-16 11:08:41.630896177 +0200 > Birth: - > Hello, This looks like the sub-block overhead. If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). The sub-block is the minimum disk allocation for files (if the file content is too large to be kept in the inode, when that is supported on the specific GPFS filesystem). The "Blocks" value displayed by "stat" is in 512 bytes unit, so 128*512 = 65536 (which is consistent with "du"): two 32 kiB sub-blocks due to data replication.
The "--apparent-size" option to "du" uses the user visible size not the actual disk usage (per the man page), so 3994 is also consistent w/ "stat" output. AFAIK, GPFS space quotas count the sub-blocks not the apparent sizes, so again this would be consistent with the overhead. Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From krajaram at geocomputing.net Tue Jun 1 17:08:51 2021 From: krajaram at geocomputing.net (Kumaran Rajaram) Date: Tue, 1 Jun 2021 16:08:51 +0000 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> Message-ID: Hi, >> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. -T /srcfilesys Default mount point -V 16.00 (4.2.2.0) Current file system version 14.10 (4.1.0.4) Original file system version --create-time Tue Feb 3 11:46:10 2015 File system creation time -B 1048576 Block size -f 32768 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 32 Number of subblocks per full block The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. -T /dstfilesys Default mount point -V 23.00 (5.0.5.0) File system version --create-time Tue May 11 16:51:27 2021 File system creation time -B 4194304 Block size -f 8192 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 512 Number of subblocks per full block Hope this helps, -Kums -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Loic Tortay Sent: Tuesday, June 1, 2021 10:57 AM To: gpfsug main discussion list ; Ulrich Sibiller ; gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] du --apparent-size and quota On 6/1/21 4:26 PM, Ulrich Sibiller wrote: [...] > ) > > While trying to understand what's going on here I found this on the > source file system (which is valid for all files, with different > number of course): > > $ du --block-size 1 /srcfilesys/fileset/filename > 65536?? /srcfilesys/fileset/filename > > $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename > 3994??? /srcfilesys/fileset/filename > > $ stat /srcfilesys/fileset/filename > ? File: ?/srcfilesys/fileset/filename? > ? Size: 3994??????????? Blocks: 128??????? IO Block: 1048576 regular > file > Device: 2ah/42d Inode: 23266095??? Links: 1 > Access: (0660/-rw-rw----)? Uid: (73018/ cpunnoo)?? Gid: (50070/? > dc-rti) > Context: system_u:object_r:unlabeled_t:s0 > Access: 2021-05-12 20:10:13.814459305 +0200 > Modify: 2020-07-16 11:08:41.631006000 +0200 > Change: 2020-07-16 11:08:41.630896177 +0200 > ?Birth: - > Hello, This looks like the sub-block overhead. If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). 
The sub-block is the minimum disk allocation for files (if the file content is too large to be kept in the inode, when that is supported on the specific GPFS filesystem). The "Blocks" value displayed by "stat" is in 512 bytes unit, so 128*512 = 65536 (which is consistent with "du"): two 32 kiB sub-blocks due to data replication. The "--apparent-size" option to "du" uses the user visible size not the actual disk usage (per the man page), so 3994 is also consistent w/ "stat" output. AFAIK, GPFS space quotas count the sub-blocks not the apparent sizes, so again this would be consistent with the overhead. Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From damir.krstic at gmail.com Tue Jun 1 17:48:26 2021 From: damir.krstic at gmail.com (Damir Krstic) Date: Tue, 1 Jun 2021 11:48:26 -0500 Subject: [gpfsug-discuss] CVE-2021-29740 Message-ID: IBM posted a security bulletin for the spectrum scale (CVE-2021-29740). Not a lot of detail provided in that bulletin. Has anyone installed this fix? Does anyone have more information about it? Thanks, Damir -------------- next part -------------- An HTML attachment was scrubbed... URL: From knop at us.ibm.com Wed Jun 2 04:51:57 2021 From: knop at us.ibm.com (Felipe Knop) Date: Wed, 2 Jun 2021 03:51:57 +0000 Subject: [gpfsug-discuss] CVE-2021-29740 In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Wed Jun 2 11:16:09 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 2 Jun 2021 12:16:09 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> Message-ID: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). > > Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. > > -T /srcfilesys Default mount point > -V 16.00 (4.2.2.0) Current file system version > 14.10 (4.1.0.4) Original file system version > --create-time Tue Feb 3 11:46:10 2015 File system creation time > -B 1048576 Block size > -f 32768 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 32 Number of subblocks per full block > > > The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. > > > -T /dstfilesys Default mount point > -V 23.00 (5.0.5.0) File system version > --create-time Tue May 11 16:51:27 2021 File system creation time > -B 4194304 Block size > -f 8192 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect a lower disk usage instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a ~13% increase! 
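(To spell out the per-file arithmetic from Loic's and Kumaran's mails for the 3994-byte example file -- my reading of their explanation, so corrections welcome:
source: 1 MiB block / 32 subblocks = 32 KiB subblock; 1 subblock x 2 replicas = 64 KiB allocated (= Blocks: 128 x 512 = 65536)
destination: 4 MiB block / 512 subblocks = 8 KiB subblock; 1 subblock x 2 replicas = 16 KiB allocated, of which stat/du report only one replica (Blocks: 16 x 512 = 8192) because IgnoreReplicaSpaceOnStat is set
So per small file the allocation should go down on the new filesystem, not up.)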
> Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases sparse files can also explain the differences. My rsync is using -AHS, so this should not be relevant here. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From jonathan.buzzard at strath.ac.uk Wed Jun 2 12:09:33 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 2 Jun 2021 12:09:33 +0100 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: On 02/06/2021 11:16, Ulrich Sibiller wrote: [SNIP] > > My rsync is using -AHS, so this should not be relevant here. > I wonder have you done more than one rsync? If so are you using --delete? If not and the source fileset has changed then you will be accumulating files at the destination and it would explain the larger size. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From u.sibiller at science-computing.de Wed Jun 2 12:44:08 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 2 Jun 2021 13:44:08 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: On 6/2/21 1:09 PM, Jonathan Buzzard wrote: >> My rsync is using -AHS, so this should not be relevant here. > > I wonder have you done more than one rsync? If so are you using --delete? > > If not and the source fileset has changed then you will be accumulating > files at the destination and it would explain the larger size. Yes, of course I have been using --delete (and --delete-excluded) ;-) Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From dugan at bu.edu Wed Jun 2 13:22:55 2021 From: dugan at bu.edu (Dugan, Michael J) Date: Wed, 2 Jun 2021 12:22:55 +0000 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: Do you have sparse files on the first filesystem? Since the second filesystem has a larger blocksize than the first one, the copied file may not be sparse on the second filesystem. I think gpfs only supports holes that line up with a full filesystem block. --Mike Dugan -- Michael J.
Dugan Manager of Systems Programming and Administration Research Computing Services | IS&T | Boston University 617-358-0030 dugan at bu.edu http://www.bu.edu/tech ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Ulrich Sibiller Sent: Wednesday, June 2, 2021 7:44 AM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] du --apparent-size and quota On 6/2/21 1:09 PM, Jonathan Buzzard wrote: >> My rsync is using -AHS, so this should not be relevant here. > > I wonder have you done more than one rsync? If so are you using --delete? > > If not and the source fileset has changed then you will be accumulating > files at the destination and it would explain the larger size. Yes, of course I have been using -delete (and -delete-excluded) ;-) Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Wed Jun 2 15:12:52 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 2 Jun 2021 22:12:52 +0800 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: Hi, The data and metadata replications are 2 on both source and destination filesystems, so from: $ mmrepquota -j srcfilesys | grep fileset srcfileset FILESET 800 800 800 0 none | 863 0 0 0 none $ mmrepquota -j dstfilesys | grep fileset fileset root FILESET 457 400 400 0 none | 853 0 0 0 none the quota data should be changed from 800G to 457G (or 400G to 228.5G), after "rsync -AHS". Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Ulrich Sibiller To: Kumaran Rajaram , gpfsug main discussion list , "gpfsug-discuss at gpfsug.org" Date: 06/02/2021 06:16 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] du --apparent-size and quota Sent by: gpfsug-discuss-bounces at spectrumscale.org On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). 
> > Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. > > -T /srcfilesys Default mount point > > -V 16.00 (4.2.2.0) Current file system version > > 14.10 (4.1.0.4) Original file system version > > --create-time Tue Feb 3 11:46:10 2015 File system creation time > > -B 1048576 Block size > > -f 32768 Minimum fragment (subblock) size in bytes > > --subblocks-per-full-block 32 Number of subblocks per full block > > The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. > > -T /dstfilesys Default mount point > > -V 23.00 (5.0.5.0) File system version > > --create-time Tue May 11 16:51:27 2021 File system creation time > > -B 4194304 Block size > > -f 8192 Minimum fragment (subblock) size in bytes > > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect a lower disk usage instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a ~13% increase! > Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases sparse files can also explain the differences. My rsync is using -AHS, so this should not be relevant here. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From juergen.hannappel at desy.de Wed Jun 2 16:26:07 2021 From: juergen.hannappel at desy.de (Hannappel, Juergen) Date: Wed, 2 Jun 2021 17:26:07 +0200 (CEST) Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: <191496938.44228914.1622647567054.JavaMail.zimbra@desy.de> Hi, mmrepquota reports without the --block-size parameter the size in units of 1KiB, so (if no ill-advised copy-paste editing confuses us) we are not talking about 400GiB but 400KiB. With just 863 files (from the inode part of the repquota output) and therefore 0.5KiB/file on average that could be explained by the sub-block size(although many files should vanish in the inodes). If it's 400GiB in 863 files with 500MiB/File the subblock overhead would not matter at all! > From: "IBM Spectrum Scale" > To: "gpfsug main discussion list" > Cc: gpfsug-discuss-bounces at spectrumscale.org, gpfsug-discuss at gpfsug.org > Sent: Wednesday, 2 June, 2021 16:12:52 > Subject: Re: [gpfsug-discuss] du --apparent-size and quota > Hi, > The data and metadata replications are 2 on both source and destination > filesystems, so from: > $ mmrepquota -j srcfilesys | grep fileset > srcfileset FILESET 800 800 800 0 none | 863 0 0 > 0 none > $ mmrepquota -j dstfilesys | grep fileset > fileset root FILESET 457 400 400 0 none | 853 0 > 0 0 none > the quota data should be changed from 800G to 457G (or 400G to 228.5G), after > "rsync -AHS". > Regards, The Spectrum Scale (GPFS) team > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale (GPFS), > then please post it to the public IBM developerWroks Forum at [ > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > | > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > ] . > If your query concerns a potential software error in Spectrum Scale (GPFS) and > you have an IBM software maintenance contract please contact 1-800-237-5511 in > the United States or your local IBM Service Center in other countries. > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > Ulrich Sibiller ---06/02/2021 06:16:22 PM---On 6/1/21 6:08 PM, Kumaran Rajaram > wrote: >>> If I'm not mistaken even with SS5 created filesystems, > From: Ulrich Sibiller > To: Kumaran Rajaram , gpfsug main discussion list > , "gpfsug-discuss at gpfsug.org" > > Date: 06/02/2021 06:16 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] du --apparent-size and quota > Sent by: gpfsug-discuss-bounces at spectrumscale.org > On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size > >>> implies 32 kiB sub blocks (32 sub-blocks). >> Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x > > which supports only 32 sub-blocks per block. 
> > -T /srcfilesys Default mount point > > -V 16.00 (4.2.2.0) Current file system version > > 14.10 (4.1.0.4) Original file system version > > --create-time Tue Feb 3 11:46:10 2015 File system creation time > > -B 1048576 Block size > > -f 32768 Minimum fragment (subblock) size in bytes > > --subblocks-per-full-block 32 Number of subblocks per full block > > The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. > > -T /dstfilesys Default mount point > > -V 23.00 (5.0.5.0) File system version > > --create-time Tue May 11 16:51:27 2021 File system creation time > > -B 4194304 Block size > > -f 8192 Minimum fragment (subblock) size in bytes > > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect a lower disk usage instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a ~13% increase! > > Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases sparse files can also explain the differences. > My rsync is using -AHS, so this should not be relevant here. > Uli > -- > Science + Computing AG > Vorstandsvorsitzender/Chairman of the board of management: > Dr. Martin Matzke > Vorstand/Board of Management: > Matthias Schempp, Sabine Hohenstein > Vorsitzender des Aufsichtsrats/ > Chairman of the Supervisory Board: > Philippe Miltin > Aufsichtsrat/Supervisory Board: > Martin Wibbe, Ursula Morgenstern > Sitz/Registered Office: Tuebingen > Registergericht/Registration Court: Stuttgart > Registernummer/Commercial Register No.: HRB 382196 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1711 bytes Desc: S/MIME Cryptographic Signature URL: From u.sibiller at science-computing.de Wed Jun 2 16:56:25 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 2 Jun 2021 17:56:25 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: <2ca3f69c-ae50-1bc4-4dd2-58e42f983105@science-computing.de> On 6/2/21 4:12 PM, IBM Spectrum Scale wrote: > The data and metadata replications are 2 on both source and destination filesystems, so from: > > $ mmrepquota -j srcfilesys | grep fileset > srcfileset FILESET 800 800 800 0 none | 863 0 0 0 none > > $ mmrepquota -j dstfilesys | grep fileset > fileset root FILESET 457 400 400 0 none | 853 0 0 0 none > > the quota data should be changed from 800G to 457G (or 400G to 228.5G), after "rsync -AHS". Why? Did you notice that on the dstfilesys we have ignoreReplicationOnStatfs yes IgnoreReplicaSpaceOnStat yes ignoreReplicationForQuota yes while the srcfilesys has ignoreReplicaSpaceOnStat 0 ignoreReplicationForQuota 0 ignoreReplicationOnStatfs 0 ? Changing the quota limit to 457 on the dstfilesys will surely help for the user but I still would like to understand why that happens? Losing > 10% of space when migrating to a newer filesystem is not something you'd expect. dstfilesys is ~6PB, so this means we lose more than 600TB, which is a serious issue I'd like to understand in detail (and maybe take countermeasures). > Do you have sparse files on the first filesystem? Since the second filesystem > has a larger blocksize than the first one, the copied file may not be sparse on the > second filesystem. I think gpfs only supports holes that line up with a full filesystem block.
Maybe that's an issue, but I a) use rsync -S so I guess the sparse files will be handled in the most compatible way b) have no idea how to check this reliably > mmrepquota reports without the --block-size parameter the size in units of 1KiB, so (if no ill-advised copy-paste editing confuses us) we are not talking about 400GiB but 400KiB. > With just 863 files (from the inode part of the repquota output) and therefore 0.5KiB/file on average that could be explained by the sub-block size(although many files should vanish in the inodes). > If it's 400GiB in 863 files with 500MiB/File the subblock overhead would not matter at all! Upps, you are right in assuming a copy-and-paste accident, I had called mmrepquota with --block-size G. So the values we are talking about are really GiB, not KiB. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From jonathan.buzzard at strath.ac.uk Fri Jun 4 10:12:15 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Fri, 4 Jun 2021 10:12:15 +0100 Subject: [gpfsug-discuss] CVE-2021-29740 In-Reply-To: References: Message-ID: <6aae1c6e-d46b-2fdc-daa6-be8d92882cb4@strath.ac.uk> On 01/06/2021 17:48, Damir Krstic wrote: > IBM posted a security bulletin for the spectrum?scale (CVE-2021-29740). > Not a lot of detail provided in that bulletin. Has anyone installed this > fix? Does anyone have more information about it? > Anyone know how quickly Lenovo are at putting up security fixes like this? Two days in and there is still nothing to download, which in the current security threat environment we are all operating in is bordering on unacceptable. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From leonardo.sala at psi.ch Mon Jun 7 13:46:57 2021 From: leonardo.sala at psi.ch (Leonardo Sala) Date: Mon, 7 Jun 2021 14:46:57 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster Message-ID: Hallo, we do have multiple bare-metal GPFS clusters with infiniband fabric, and I am actually considering adding some VMs in the mix, to perform admin tasks (so that the bare metal servers do not need passwordless ssh keys) and quorum nodes. Has anybody tried this? What could be the drawbacks / issues at GPFS level? Thanks a lot for the insights! cheers leo -- Paul Scherrer Institut Dr. Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjdoherty at yahoo.com Mon Jun 7 14:28:51 2021 From: jjdoherty at yahoo.com (Jim Doherty) Date: Mon, 7 Jun 2021 13:28:51 +0000 (UTC) Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: <468451058.2156544.1623072531179@mail.yahoo.com> Hello,? I have seen people do this to move manager node traffic off of the NSD servers,? 
it is one way to help scale the cluster as the manager RPC traffic doesn't need to contend with the NSD servers for bandwidth.? ? ?If you want the nodes to be able to participate in disk maintenance? (mmfsck,? ? mmrestripefs)? make sure they have enough pagepool? as a small pagepool could impact the performance of these operations.?? Jim Doherty? On Monday, June 7, 2021, 08:55:49 AM EDT, Leonardo Sala wrote: Hallo, we do have multiple bare-metal GPFS clusters with infiniband fabric, and I am actually considering adding some VMs in the mix, to perform admin tasks (so that the bare metal servers do not need passwordless ssh keys) and quorum nodes. Has anybody tried this? What could be the drawbacks / issues at GPFS level? Thanks a lot for the insights! cheers leo -- Paul Scherrer Institut Dr. Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Paul.Sanchez at deshaw.com Mon Jun 7 14:36:00 2021 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Mon, 7 Jun 2021 13:36:00 +0000 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: Hi Leo, We use VMs for Spectrum Scale all of the time (including VM-based NAS clusters that span multiple sites) and all of the cloud-based offerings do as well, so it?s pretty clearly a thing that people are using. (Note: all of my experience is on Ethernet fabrics, so keep that in mind when I?m discussing networking.) But you?re right that there are a few pitfalls, such as? 1. Licensing. The traditional PVU license model discouraged adding machines to clusters and encouraged the concentration of server roles in a way that didn?t align with best practices. If you?re on capacity based licensing then this issue is moot. (We?ve been in that model for ages, and so consequently we have years of experience with GPFS and VMs. But with PVUs we probably wouldn?t have gone this way.) 2. Virtualized networking can be flaky. In particular, I?ve found SR-IOV to be unreliable. Suddenly in the middle of a TCP session you might see GPFS complain about ?Unexpected data in message. Header dump: cccccccc cccc cccc?? from a VM whose virtual network interface has gone awry and necessitates a reboot, and which can leave corrupted data on disk when this happens, requiring you to offline mmfsck and/or spelunk through a damaged filesystem and backups to recover. Based on this, I would recommend the following: a. Do NOT use SR-IOV. If you?re using KVM then just stick with virtio (vnet and bridge interfaces). b. DO enable all of the checksum protection you can get on the cluster (e.g. nsdCksumTraditional=yes). This can act as a backstop against network reliability issues and in practice on modern machines doesn?t appear to be as big of a performance hit as it once was. (I?d recommend this for everyone honestly.) c. Think about increasing your replication factor if you?re running filesystems with only one copy of data/metadata. One of the strengths of GPFS is its support for replication, both as a throughput scaling mechanism and for redundancy, and that redundancy can buy you a lot of forgiveness if things go wrong. 3. 
Sizing. Do not be too stingy with RAM and CPU allocations for your guest nodes. Scale is excellent at multithreading for things like parallel inode scan, prefetching, etc, and remember that your quorum nodes will be token managers by default unless you assign the manager roles elsewhere, and may need to have enough RAM to support their share of the token serving workload. A stable cluster is one in which the servers aren?t thrashing for a lack of resources. Others may have additional experience and best practices to share, which would be great since I don?t see this trend going away any time soon. Good luck, Paul From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Leonardo Sala Sent: Monday, June 7, 2021 08:47 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster This message was sent by an external party. Hallo, we do have multiple bare-metal GPFS clusters with infiniband fabric, and I am actually considering adding some VMs in the mix, to perform admin tasks (so that the bare metal servers do not need passwordless ssh keys) and quorum nodes. Has anybody tried this? What could be the drawbacks / issues at GPFS level? Thanks a lot for the insights! cheers leo -- Paul Scherrer Institut Dr. Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From wallyd at us.ibm.com Mon Jun 7 16:03:13 2021 From: wallyd at us.ibm.com (Wally Dietrich) Date: Mon, 7 Jun 2021 15:03:13 +0000 Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale Message-ID: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> Hi. Is there documentation about tuning DB2 to perform well when using Spectrum Scale file systems? I'm interested in tuning both DB2 and Spectrum Scale for high performance. I'm using a stretch cluster for Disaster Recover (DR). I've found a document, but the last update was in 2013 and GPFS has changed considerably since then. Wally Dietrich wallyd at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Mon Jun 7 16:24:12 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 7 Jun 2021 16:24:12 +0100 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: On 07/06/2021 13:46, Leonardo Sala wrote: > > Hallo, > > we do have multiple bare-metal GPFS clusters with infiniband fabric, and > I am actually considering adding some VMs in the mix, to perform admin > tasks (so that the bare metal servers do not need passwordless ssh keys) > and quorum nodes. Has anybody tried this? What could be the drawbacks / > issues at GPFS level? > Unless you are doing some sort of pass through of Infiniband adapters to the VM's you will need to create an Infiniband/Ethernet router. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From janfrode at tanso.net Mon Jun 7 20:49:02 2021 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Mon, 7 Jun 2021 21:49:02 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: I?ve done this a few times. 
Once with IPoIB as daemon network, and then created a separate routed network on the hypervisor to bridge (?) between VM and IPoIB network. Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor: ???????? To give the VMs access to the daemon network, we need create an internal network for the VMs, that is then routed into the IPoIB network on the hypervisor. ~~~ # cat < routed34.xml routed34 EOF # virsh net-define routed34.xml Network routed34 defined from routed34.xml # virsh net-start routed34 Network routed34 started # virsh net-autostart routed34 Network routed34 marked as autostarted # virsh net-list --all Name State Autostart Persistent ---------------------------------------------------------- default active yes yes routed34 active yes yes ~~~ ????????- I see no issue with it ? but beware that the FAQ lists some required tunings if the VM is to host desconly disks (paniconiohang?)? -jf man. 7. jun. 2021 kl. 14:55 skrev Leonardo Sala : > Hallo, > > we do have multiple bare-metal GPFS clusters with infiniband fabric, and I > am actually considering adding some VMs in the mix, to perform admin tasks > (so that the bare metal servers do not need passwordless ssh keys) and > quorum nodes. Has anybody tried this? What could be the drawbacks / issues > at GPFS level? > > Thanks a lot for the insights! > > cheers > > leo > > -- > Paul Scherrer Institut > Dr. Leonardo Sala > Group Leader High Performance Computing > Deputy Section Head Science IT > Science IT > WHGA/036 > Forschungstrasse 111 > 5232 Villigen PSI > Switzerland > > Phone: +41 56 310 3369leonardo.sala at psi.chwww.psi.ch > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From heinrich.billich at id.ethz.ch Tue Jun 8 10:04:19 2021 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Tue, 8 Jun 2021 09:04:19 +0000 Subject: [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule Message-ID: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch> Hello, A policy run with ?-I defer? and a placement rule did almost double the metadata usage of a filesystem. This filled the metadata disks to a critical level. I would like to understand if this is to be expected and ?as designed? or if I face some issue or bug. ?I hope a subsequent run of ?mmrstripefs -p? will reduce the metadata usage again. Thank you I want to move all data to a new storage pool and did run a policy like RULE 'migrate_to_Data' ? MIGRATE ??? WEIGHT(0) ? TO POOL 'Data' for each fileset with ? mmapplypolicy -I defer Next I want to actually move the data with ? mmrestripefs -p After the policy run metadata usage increased from 2.06TiB to 3.53TiB and filled the available metadata space by >90%. This is somewhat surprising. Will the following ?run of ?mmrestripefs -p? reduce the usage again, when the files are not illplaced any more? The number of used Inodes did not change noticeably during the policy run. Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Thank you, Heiner Some details # mmlsfs? fsxxxx -f -i -B -I -m -M -r -R -V flag??????????????? value??????????????????? description ------------------- ------------------------ ----------------------------------- -f???????????????? 8192???????????????????? 
Minimum fragment (subblock) size in bytes (system pool) ??????????????????? 32768??????????????????? Minimum fragment (subblock) size in bytes (other pools) -i???????????????? 4096???????????????????? Inode size in bytes -B???????????????? 1048576????????????????? Block size (system pool) ??????????????????? 4194304????????????????? Block size (other pools) -I???????????????? 32768??????????????????? Indirect block size in bytes -m???????????????? 1??????????????????????? Default number of metadata replicas -M???????????????? 2??????????????????????? Maximum number of metadata replicas -r???????????????? 1?????????????????????? ?Default number of data replicas -R???????????????? 2??????????????????????? Maximum number of data replicas -V???????????????? 23.00 (5.0.5.0)????????? Current file system version ??????????????????? 19.01 (5.0.1.0)????????? Original file system version Inode Information ----------------- Total number of used inodes in all Inode spaces:????????? 398502837 Total number of free inodes in all Inode spaces:?????????? 94184267 Total number of allocated inodes in all Inode spaces:???? 492687104 Total of Maximum number of inodes in all Inode spaces:??? 916122880 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5254 bytes Desc: not available URL: From scale at us.ibm.com Wed Jun 9 13:54:36 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 9 Jun 2021 20:54:36 +0800 Subject: [gpfsug-discuss] =?utf-8?q?Metadata_usage_almost_doubled_after_po?= =?utf-8?q?licy_run=09with_migration_rule?= In-Reply-To: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch> References: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch> Message-ID: Hi Billich, >Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Basically a migration policy run with -I defer would just simply mark the files as illPlaced which would not cause metadata extension for such files (e.g., inode size is fixed after file system creation). Instead, I'm just wondering about your placement rules, which are existing rules or newly installed rules? Which could set EAs to newly created files and may cause increased metadata size. Also any new EAs are inserted for files? Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 2021/06/08 05:19 PM Subject: [EXTERNAL] [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, A policy run with ?-I defer? and a placement rule did almost double the metadata usage of a filesystem. 
This filled the metadata disks to a critical level. I would like to understand if this is to be expected and ?as designed? or if I face some issue or bug. ?I hope a subsequent run of ?mmrstripefs -p? will reduce the metadata usage again. Thank you I want to move all data to a new storage pool and did run a policy like RULE 'migrate_to_Data' ? MIGRATE ??? WEIGHT(0) ? TO POOL 'Data' for each fileset with ? mmapplypolicy -I defer Next I want to actually move the data with ? mmrestripefs -p After the policy run metadata usage increased from 2.06TiB to 3.53TiB and filled the available metadata space by >90%. This is somewhat surprising. Will the following ?run of ?mmrestripefs -p? reduce the usage again, when the files are not illplaced any more? The number of used Inodes did not change noticeably during the policy run. Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Thank you, Heiner Some details # mmlsfs? fsxxxx -f -i -B -I -m -M -r -R -V flag??????????????? value??????????????????? description ------------------- ------------------------ ----------------------------------- -f???????????????? 8192???????????????????? Minimum fragment (subblock) size in bytes (system pool) ??????????????????? 32768??????????????????? Minimum fragment (subblock) size in bytes (other pools) -i???????????????? 4096???????????????????? Inode size in bytes -B???????????????? 1048576????????????????? Block size (system pool) ??????????????????? 4194304????????????????? Block size (other pools) -I???????????????? 32768??????????????????? Indirect block size in bytes -m???????????????? 1??????????????????????? Default number of metadata replicas -M???????????????? 2??????????????????????? Maximum number of metadata replicas -r???????????????? 1?????????????????????? ?Default number of data replicas -R???????????????? 2??????????????????????? Maximum number of data replicas -V???????????????? 23.00 (5.0.5.0)????????? Current file system version ??????????????????? 19.01 (5.0.1.0)????????? Original file system version Inode Information ----------------- Total number of used inodes in all Inode spaces:????????? 398502837 Total number of free inodes in all Inode spaces:?????????? 94184267 Total number of allocated inodes in all Inode spaces:???? 492687104 Total of Maximum number of inodes in all Inode spaces:??? 916122880 [attachment "smime.p7s" deleted by Hai Zhong HZ Zhou/China/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From jonathan.buzzard at strath.ac.uk Wed Jun 9 21:28:07 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 9 Jun 2021 21:28:07 +0100 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin Message-ID: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> So you need to apply a kernel update and that means a new gpfs.gplbin :-( So after going around the houses with several different approaches on this I have finally settled on what IMHO is a most elegant method of ensuring the right gpfs.gplbin version is installed for the kernel that is running and thought I would share it. 
This is assuming you don't like the look of the compile it option IBM introduced. You may well not want compilers installed on nodes for example, or you just think compiling the module on hundreds of nodes is suboptimal. This exact version works for RHEL and it's derivatives. Modify for your preferred distribution. It also assumes you have a repository setup with the relevant gpfs.gplbin package. The basics are to use the "ExecStartPre" option of a unit file in systemd. So because you don't want to be modifying the unit file provided by IBM something like the following mkdir -p /etc/systemd/system/gpfs.service.d echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf systemctl daemon-reload How it works is that the %v is a special systemd variable which is the same as "uname -r". So before it attempts to start gpfs, it attempts to install the gpfs.gplbin RPM for the kernel you are running on. If already installed this is harmless and if it's not installed it gets installed. How you set that up on your system is up to you, xCAT postscript, RPM package, or a configuration management solution all work. I have gone for a very minimal RPM I call gpfs.helper We then abuse the queuing system on the HPC cluster to schedule a "admin" priority job that runs as soon as the node becomes free, which does a yum update and then restarts the node. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From scale at us.ibm.com Thu Jun 10 11:29:13 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Thu, 10 Jun 2021 18:29:13 +0800 Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale In-Reply-To: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> References: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> Message-ID: Hi Wally, I don't see a dedicated document for DB2 from Scale document sets, however, usually the workloads of database are doing direct I/O, so those documentation sections in Scale for direct I/O should be good to review. Here I have a list about tunings for direct I/O for your reference. https://www.ibm.com/docs/zh/spectrum-scale/5.1.1?topic=fpo-configuration-tuning-database-workload s https://www.ibm.com/docs/zh/spectrum-scale/5.1.1?topic=applications-considerations-use-direct-io-o-direct https://www.ibm.com/docs/zh/spectrum-scale/5.1.1?topic=mfs-using-direct-io-file-in-gpfs-file-system Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Wally Dietrich To: "gpfsug-discuss at spectrumscale.org" Date: 2021/06/07 11:03 PM Subject: [EXTERNAL] [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi. 
Is there documentation about tuning DB2 to perform well when using Spectrum Scale file systems? I'm interested in tuning both DB2 and Spectrum Scale for high performance. I'm using a stretch cluster for Disaster Recover (DR). I've found a document, but the last update was in 2013 and GPFS has changed considerably since then. Wally Dietrich wallyd at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From jjdoherty at yahoo.com Thu Jun 10 14:42:18 2021 From: jjdoherty at yahoo.com (Jim Doherty) Date: Thu, 10 Jun 2021 13:42:18 +0000 (UTC) Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale In-Reply-To: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> References: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> Message-ID: <1697721993.3166627.1623332538820@mail.yahoo.com> I think I found the document you are talking about. In general I believe most of it still applies. I can make the following comments on it about Spectrum Scale: 1 - There was an effort to simplify Spectrum Scale tuning, and tuning of worker1Threads should be replaced by tuning workerThreads instead. Setting workerThreads, will auto-tune about 20 different Spectrum Scale configuration parameters (including worker1Threads) behind the scene. 2 - The Spectrum Scale pagepool parameter defaults to 1Gig now, but the most important thing is to make sure that you can fit all the IO into the pagepool. So if you have 512 threads * 1 MB you will need 1/2 Gig just to do disk IO, but if you use 4MB that becomes 512 * 4 = 2Gig just for disk IO. I would recommend setting the pagepool to 2x the size of this if you are using direct IO so 1 Gig or 4 Gig for the example sizes I just mentioned. 3 - One consideration that is important is sizing the initial DB2 database size correctly, and when the tablespace needs to grow, make sure it grows enough to avoid constantly increasing the tablespace. The act of growing a database throws GPFS into buffered IO which can be slower than directIO. If you need the database to grow all the time, I would avoid using direct IO and use a larger GPFS pagepool to allow it cache data. Using directIO is the better solution. Jim Doherty On Monday, June 7, 2021, 11:03:26 AM EDT, Wally Dietrich wrote: Hi. Is there documentation about tuning DB2 to perform well when using Spectrum Scale file systems? I'm interested in tuning both DB2 and Spectrum Scale for high performance. I'm using a stretch cluster for Disaster Recover (DR). I've found a document, but the last update was in 2013 and GPFS has changed considerably since then. Wally Dietrichwallyd at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skylar2 at uw.edu Thu Jun 10 14:47:33 2021 From: skylar2 at uw.edu (Skylar Thompson) Date: Thu, 10 Jun 2021 06:47:33 -0700 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> Message-ID: <20210610134733.lvfh2at7rtjoceuk@thargelion> Thanks, Jonathan, I've been thinking about how to manage this as well and like it more than version-locking the kernel. On Wed, Jun 09, 2021 at 09:28:07PM +0100, Jonathan Buzzard wrote: > > So you need to apply a kernel update and that means a new gpfs.gplbin :-( So > after going around the houses with several different approaches on this I > have finally settled on what IMHO is a most elegant method of ensuring the > right gpfs.gplbin version is installed for the kernel that is running and > thought I would share it. > > This is assuming you don't like the look of the compile it option IBM > introduced. You may well not want compilers installed on nodes for example, > or you just think compiling the module on hundreds of nodes is suboptimal. > > This exact version works for RHEL and it's derivatives. Modify for your > preferred distribution. It also assumes you have a repository setup with the > relevant gpfs.gplbin package. > > The basics are to use the "ExecStartPre" option of a unit file in systemd. > So because you don't want to be modifying the unit file provided by IBM > something like the following > > mkdir -p /etc/systemd/system/gpfs.service.d > echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install > gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf > systemctl daemon-reload > > How it works is that the %v is a special systemd variable which is the same > as "uname -r". So before it attempts to start gpfs, it attempts to install > the gpfs.gplbin RPM for the kernel you are running on. If already installed > this is harmless and if it's not installed it gets installed. > > How you set that up on your system is up to you, xCAT postscript, RPM > package, or a configuration management solution all work. I have gone for a > very minimal RPM I call gpfs.helper > > We then abuse the queuing system on the HPC cluster to schedule a "admin" > priority job that runs as soon as the node becomes free, which does a yum > update and then restarts the node. > > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His From novosirj at rutgers.edu Thu Jun 10 15:00:49 2021 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Thu, 10 Jun 2021 14:00:49 +0000 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <20210610134733.lvfh2at7rtjoceuk@thargelion> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk>, <20210610134733.lvfh2at7rtjoceuk@thargelion> Message-ID: <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> The problem with not version locking the kernel, however, is that you really need to know that the kernel you are going to is going to support the GPFS version that you are going to be running. 
Typically that only becomes a problem when you cross a minor release boundary on RHEL-derivatives, but I think not always. But I suppose you can just try this on something first just to make sure, or handle it at the repository level, or something else. Sent from my iPhone > On Jun 10, 2021, at 09:48, Skylar Thompson wrote: > > ?Thanks, Jonathan, I've been thinking about how to manage this as well and > like it more than version-locking the kernel. > >> On Wed, Jun 09, 2021 at 09:28:07PM +0100, Jonathan Buzzard wrote: >> >> So you need to apply a kernel update and that means a new gpfs.gplbin :-( So >> after going around the houses with several different approaches on this I >> have finally settled on what IMHO is a most elegant method of ensuring the >> right gpfs.gplbin version is installed for the kernel that is running and >> thought I would share it. >> >> This is assuming you don't like the look of the compile it option IBM >> introduced. You may well not want compilers installed on nodes for example, >> or you just think compiling the module on hundreds of nodes is suboptimal. >> >> This exact version works for RHEL and it's derivatives. Modify for your >> preferred distribution. It also assumes you have a repository setup with the >> relevant gpfs.gplbin package. >> >> The basics are to use the "ExecStartPre" option of a unit file in systemd. >> So because you don't want to be modifying the unit file provided by IBM >> something like the following >> >> mkdir -p /etc/systemd/system/gpfs.service.d >> echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install >> gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf >> systemctl daemon-reload >> >> How it works is that the %v is a special systemd variable which is the same >> as "uname -r". So before it attempts to start gpfs, it attempts to install >> the gpfs.gplbin RPM for the kernel you are running on. If already installed >> this is harmless and if it's not installed it gets installed. >> >> How you set that up on your system is up to you, xCAT postscript, RPM >> package, or a configuration management solution all work. I have gone for a >> very minimal RPM I call gpfs.helper >> >> We then abuse the queuing system on the HPC cluster to schedule a "admin" >> priority job that runs as soon as the node becomes free, which does a yum >> update and then restarts the node. >> >> >> JAB. >> >> -- >> Jonathan A. Buzzard Tel: +44141-5483420 >> HPC System Administrator, ARCHIE-WeSt. >> University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Thu Jun 10 17:13:46 2021 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Thu, 10 Jun 2021 16:13:46 +0000 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> <20210610134733.lvfh2at7rtjoceuk@thargelion> <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> Message-ID: We manage kernel updates pretty carefully ... not least because there is a good chance MOFED will also break at the same time. We do have a similar systemd unit that tries to install from our local repos, then tries to build locally. Simon ?On 10/06/2021, 15:01, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Ryan Novosielski" wrote: The problem with not version locking the kernel, however, is that you really need to know that the kernel you are going to is going to support the GPFS version that you are going to be running. Typically that only becomes a problem when you cross a minor release boundary on RHEL-derivatives, but I think not always. But I suppose you can just try this on something first just to make sure, or handle it at the repository level, or something else. Sent from my iPhone > On Jun 10, 2021, at 09:48, Skylar Thompson wrote: > > Thanks, Jonathan, I've been thinking about how to manage this as well and > like it more than version-locking the kernel. > >> On Wed, Jun 09, 2021 at 09:28:07PM +0100, Jonathan Buzzard wrote: >> >> So you need to apply a kernel update and that means a new gpfs.gplbin :-( So >> after going around the houses with several different approaches on this I >> have finally settled on what IMHO is a most elegant method of ensuring the >> right gpfs.gplbin version is installed for the kernel that is running and >> thought I would share it. >> >> This is assuming you don't like the look of the compile it option IBM >> introduced. You may well not want compilers installed on nodes for example, >> or you just think compiling the module on hundreds of nodes is suboptimal. >> >> This exact version works for RHEL and it's derivatives. Modify for your >> preferred distribution. It also assumes you have a repository setup with the >> relevant gpfs.gplbin package. >> >> The basics are to use the "ExecStartPre" option of a unit file in systemd. >> So because you don't want to be modifying the unit file provided by IBM >> something like the following >> >> mkdir -p /etc/systemd/system/gpfs.service.d >> echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install >> gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf >> systemctl daemon-reload >> >> How it works is that the %v is a special systemd variable which is the same >> as "uname -r". So before it attempts to start gpfs, it attempts to install >> the gpfs.gplbin RPM for the kernel you are running on. If already installed >> this is harmless and if it's not installed it gets installed. 
>> >> How you set that up on your system is up to you, xCAT postscript, RPM >> package, or a configuration management solution all work. I have gone for a >> very minimal RPM I call gpfs.helper >> >> We then abuse the queuing system on the HPC cluster to schedule a "admin" >> priority job that runs as soon as the node becomes free, which does a yum >> update and then restarts the node. >> >> >> JAB. >> >> -- >> Jonathan A. Buzzard Tel: +44141-5483420 >> HPC System Administrator, ARCHIE-WeSt. >> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathan.buzzard at strath.ac.uk Thu Jun 10 19:08:47 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 10 Jun 2021 19:08:47 +0100 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> <20210610134733.lvfh2at7rtjoceuk@thargelion> <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> Message-ID: On 10/06/2021 15:00, Ryan Novosielski wrote:> > The problem with not version locking the kernel, however, is that you > really need to know that the kernel you are going to is going to > support the GPFS version that you are going to be running. Typically > that only becomes a problem when you cross a minor release boundary > on RHEL-derivatives, but I think not always. But I suppose you can > just try this on something first just to make sure, or handle it at > the repository level, or something else. > Well *everything* comes from a local repo mirror for all the GPFS nodes so I can control what goes in and when. I use a VM for building the gpfs.gplbin in advance and then test it on a single node before the main roll out. I would note that I read the actual release notes and then make a judgment on whether the kernel update actually makes it to my local mirror. It could be a just a bug fix, or the security issue might for example be in a driver which is not relevant to the platform I am managing. WiFi and Bluetooth drivers are examples from the past. The issue I found is you do a "yum update" and new kernel gets pulled in, and/or a new GPFS version. However the matching gpfs.gplbin is now missing and I wanted an automated process of insuring the right version of gpfs.gplbin is installed for whatever kernel happens to be running. Noting that this could also be at install time, which partly why I went with the gpfs.helper RPM. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG From luis.bolinches at fi.ibm.com Thu Jun 10 21:35:43 2021 From: luis.bolinches at fi.ibm.com (Luis Bolinches) Date: Thu, 10 Jun 2021 20:35:43 +0000 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From Achim.Rehor at de.ibm.com Fri Jun 11 16:26:43 2021 From: Achim.Rehor at de.ibm.com (Achim Rehor) Date: Fri, 11 Jun 2021 17:26:43 +0200 Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale In-Reply-To: <1697721993.3166627.1623332538820@mail.yahoo.com> References: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> <1697721993.3166627.1623332538820@mail.yahoo.com> Message-ID: one additional noticable change, that comes in Spectrum Scale 5.0.4.2+ and is an enhancement to what Jim just touched below. Direct IO of databases is often doing small IO into huge files. Even with very fast backend, the amount of IOs doing 4k or 64k IOs limits the bandwidth because of the sheer amount of IO. Having seen this issue, we added a feature to Spectrum Scale, that batches small IO per timeslot, in order to lessen the number of IO against the backend, and thus improving write performance. the new feature is tuned by the dioSmallSeqWriteBatching = yes[no] and will batch all smaller IO, that is dioSmallSeqWriteThreshold = [65536] or smaller in size , and dump it to disk avery aioSyncDelay = 10 (usec). That is, if the system recognizes 3 or more small Direct IOs and dioSmallSeqWriteThreshold is set, it will gather all these IOs within aioSyncDelay and do just one IO (per FS Blocksize) instead of hundreds of small IOs. For certain use cases this can dramatically improve performance. see https://www.spectrumscaleug.org/wp-content/uploads/2020/04/SSSD20DE-Spectrum-Scale-Performance-Enhancements-for-Direct-IO.pdf by Olaf Weiser Mit freundlichen Gr??en / Kind regards Achim Rehor Remote Technical Support Engineer Storage IBM Systems Storage Support - EMEA Storage Competence Center (ESCC) Spectrum Scale / Elastic Storage Server ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-170-4521194 E-Mail: Achim.Rehor at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Sebastian Krause Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Agnes Heftberger, Norbert Janzen, Markus Koerner, Christian Noll, Nicole Reimer Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940 gpfsug-discuss-bounces at spectrumscale.org wrote on 10/06/2021 15:42:18: > From: Jim Doherty > To: "gpfsug-discuss at spectrumscale.org" > Date: 10/06/2021 15:42 > Subject: [EXTERNAL] Re: [gpfsug-discuss] DB2 (not DB2 PureScale) and > Spectrum Scale > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > I think I found the document you are talking about. In general I > believe most of it still applies. I can make the following comments > on it about Spectrum Scale: 1 - There was an effort to simplify > Spectrum Scale tuning, and tuning of worker1Threads ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ZjQcmQRYFpfptBannerEnd > I think I found the document you are talking about. 
In general I > believe most of it still applies. I can make the following comments > on it about Spectrum Scale: > 1 - There was an effort to simplify Spectrum Scale tuning, and > tuning of worker1Threads should be replaced by tuning workerThreads > instead. Setting workerThreads, will auto-tune about 20 different > Spectrum Scale configuration parameters (including worker1Threads) > behind the scene. > 2 - The Spectrum Scale pagepool parameter defaults to 1Gig now, but > the most important thing is to make sure that you can fit all the IO > into the pagepool. So if you have 512 threads * 1 MB you will need > 1/2 Gig just to do disk IO, but if you use 4MB that becomes 512 * 4 > = 2Gig just for disk IO. I would recommend setting the pagepool to > 2x the size of this if you are using direct IO so 1 Gig or 4 Gig for > the example sizes I just mentioned. > 3 - One consideration that is important is sizing the initial DB2 > database size correctly, and when the tablespace needs to grow, make > sure it grows enough to avoid constantly increasing the tablespace. > The act of growing a database throws GPFS into buffered IO which can > be slower than directIO. If you need the database to grow all the > time, I would avoid using direct IO and use a larger GPFS pagepool > to allow it cache data. Using directIO is the better solution. > > Jim Doherty > > On Monday, June 7, 2021, 11:03:26 AM EDT, Wally Dietrich > wrote: > > Hi. Is there documentation about tuning DB2 to perform well when > using Spectrum Scale file systems? I'm interested in tuning both DB2 > and Spectrum Scale for high performance. I'm using a stretch cluster > for Disaster Recover (DR). I've found a document, but the last > update was in 2013 and GPFS has changed considerably since then. > > Wally Dietrich > wallyd at us.ibm.com > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > INVALID URI REMOVED > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=RGTETs2tk0Kz_VOpznDVDkqChhnfLapOTkxLvgmR2- > M&m=0w6BrJDJDqZrylo3ICWwqF7uFCQ5smwrDGjZm8xpKjU&s=7CZY0jIPCvfodrfNQoZlx3N2Dh9n7m-5mQkP5zhzI- > I&e= From leonardo.sala at psi.ch Thu Jun 17 07:35:45 2021 From: leonardo.sala at psi.ch (Leonardo Sala) Date: Thu, 17 Jun 2021 08:35:45 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: Hallo everybody thanks for the feedback! So, what it is suggested is to create on the VM (in my case hosted on vSphere, with only one NIC) a secondary IP within the IPoIP range, and create a route for that IP range to go over the public IP (and create a similar route on my bare-metal servers, so that the VM IPoIB IPs are reached over the public network) - is that correct? The only other options would be to ditch IPoIB as daemon network, right? What happens if some nodes have access to the daemon network over IPoIB, and other not - GPFS goes back to public ip cluster wide, or else? Thanks again! regards leo Paul Scherrer Institut Dr. 
Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch On 07.06.21 21:49, Jan-Frode Myklebust wrote: > > I?ve done this a few times. Once with IPoIB as daemon network, and > then created a separate routed network on the hypervisor to bridge (?) > between VM and IPoIB network. > > Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor: > ???????? > > To give the VMs access to the daemon network, we need create an > internal network for the VMs, that is then routed into the IPoIB > network on the hypervisor. > > ~~~ > # cat < routed34.xml > > routed34 > > > ? > > > > > > EOF > # virsh net-define routed34.xml > Network routed34 defined from routed34.xml > > # virsh net-start routed34 > Network routed34 started > > # virsh net-autostart routed34 > Network routed34 marked as autostarted > > # virsh net-list --all > ?Name ? ? ? ? ? ? State ? ? ?Autostart ? ? Persistent > ---------------------------------------------------------- > ?default ? ? ? ? ? ?active ? ? yes ? ? ? ? ? yes > ?routed34 ? ? ? ? ? active ? ? yes ? ? ? ? ? yes > > ~~~ > > ????????- > > > I see no issue with it ? but beware that the FAQ lists some required > tunings if the VM is to host desconly disks (paniconiohang?)? > > > > ? -jf > > > man. 7. jun. 2021 kl. 14:55 skrev Leonardo Sala >: > > Hallo, > > we do have multiple bare-metal GPFS clusters with infiniband > fabric, and I am actually considering adding some VMs in the mix, > to perform admin tasks (so that the bare metal servers do not need > passwordless ssh keys) and quorum nodes. Has anybody tried this? > What could be the drawbacks / issues at GPFS level? > > Thanks a lot for the insights! > > cheers > > leo > > -- > Paul Scherrer Institut > Dr. Leonardo Sala > Group Leader High Performance Computing > Deputy Section Head Science IT > Science IT > WHGA/036 > Forschungstrasse 111 > 5232 Villigen PSI > Switzerland > > Phone: +41 56 310 3369 > leonardo.sala at psi.ch > www.psi.ch > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Thu Jun 17 09:29:42 2021 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 17 Jun 2021 10:29:42 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: *All* nodes needs to be able to communicate on the daemon network. If they don't have access to this network, they can't join the cluster. It doesn't need to be same subnet, it can be routed. But they all have to be able to reach each other. If you use IPoIB, you likely need something to route between the IPoIB network and the outside world to reach the IP you have on your VM. I don't think you will be able to use an IP address in the IPoIB range for your VM, unless your vmware hypervisor is connected to the IB fabric, and can bridge it.. (doubt that's possible). I've seen some customers avoid using IPoIB, and rather mix an ethernet for daemon network, and dedicate the infiniband network to RDMA. 
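If you go that mixed route, the Scale side of it is small. A rough sketch only -- the node class and the mlx5_0 port name below are made up, so check what ibstat reports on your own HCAs before copying anything, and I believe the verbs settings only take effect after the daemon is restarted on those nodes:

# daemon traffic stays on the ethernet addresses registered in mmlscluster,
# so the VMs need nothing special -- plain TCP/IP reachability is enough.
# Enable RDMA for data transfer only on the IB-connected bare-metal nodes
# ("ib_nodes" is a made-up node class, substitute your own):
mmchconfig verbsPorts="mlx5_0/1" -N ib_nodes
mmchconfig verbsRdma=enable -N ib_nodes
# verify
mmlsconfig verbsRdma verbsPorts

That keeps the fabric purely for RDMA, and the VMs never have to see it at all.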
-jf On Thu, Jun 17, 2021 at 8:35 AM Leonardo Sala wrote: > Hallo everybody > > thanks for the feedback! So, what it is suggested is to create on the VM > (in my case hosted on vSphere, with only one NIC) a secondary IP within the > IPoIP range, and create a route for that IP range to go over the public IP > (and create a similar route on my bare-metal servers, so that the VM IPoIB > IPs are reached over the public network) - is that correct? > > The only other options would be to ditch IPoIB as daemon network, right? > What happens if some nodes have access to the daemon network over IPoIB, > and other not - GPFS goes back to public ip cluster wide, or else? > > Thanks again! > > regards > > leo > > Paul Scherrer Institut > Dr. Leonardo Sala > Group Leader High Performance Computing > Deputy Section Head Science IT > Science IT > WHGA/036 > Forschungstrasse 111 > 5232 Villigen PSI > Switzerland > > Phone: +41 56 310 3369leonardo.sala at psi.chwww.psi.ch > > On 07.06.21 21:49, Jan-Frode Myklebust wrote: > > > I?ve done this a few times. Once with IPoIB as daemon network, and then > created a separate routed network on the hypervisor to bridge (?) between > VM and IPoIB network. > > Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor: > ???????? > > To give the VMs access to the daemon network, we need create an internal > network for the VMs, that is then routed into the IPoIB network on the > hypervisor. > > ~~~ > # cat < routed34.xml > > routed34 > > > > > > > > > EOF > # virsh net-define routed34.xml > Network routed34 defined from routed34.xml > > # virsh net-start routed34 > Network routed34 started > > # virsh net-autostart routed34 > Network routed34 marked as autostarted > > # virsh net-list --all > Name State Autostart Persistent > ---------------------------------------------------------- > default active yes yes > routed34 active yes yes > > ~~~ > > ????????- > > > I see no issue with it ? but beware that the FAQ lists some required > tunings if the VM is to host desconly disks (paniconiohang?)? > > > > -jf > > > man. 7. jun. 2021 kl. 14:55 skrev Leonardo Sala : > >> Hallo, >> >> we do have multiple bare-metal GPFS clusters with infiniband fabric, and >> I am actually considering adding some VMs in the mix, to perform admin >> tasks (so that the bare metal servers do not need passwordless ssh keys) >> and quorum nodes. Has anybody tried this? What could be the drawbacks / >> issues at GPFS level? >> >> Thanks a lot for the insights! >> >> cheers >> >> leo >> >> -- >> Paul Scherrer Institut >> Dr. Leonardo Sala >> Group Leader High Performance Computing >> Deputy Section Head Science IT >> Science IT >> WHGA/036 >> Forschungstrasse 111 >> 5232 Villigen PSI >> Switzerland >> >> Phone: +41 56 310 3369leonardo.sala at psi.chwww.psi.ch >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From heinrich.billich at id.ethz.ch Thu Jun 17 12:53:03 2021 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Thu, 17 Jun 2021 11:53:03 +0000 Subject: [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule In-Reply-To: References: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch> Message-ID: <853DB9D7-9A3D-494F-88E4-BF448903C13E@id.ethz.ch> Hello, Thank you for your response. I opened a case with IBM and what we found is ? as I understand: If you change the storage pool of a file which has copy in a snapshot the ?inode is dublicated (copy on write) ? the data pool is part of the inode and its preserved in the snapshot, the snapshot get?s its own inode version. So even if the file?s blocks actually did move to storage pool B the snapshot still shows the previous storage pool A. Once the snapshots get deleted the additional metadata space is freed. Probably backup software does save the storage pool, too. Hence the snapshot must preserve the original value. You can easily verify with mmlsattr that the snapshot version and the plain version show different storage pools. I saw a bout 4500 bytes extra space required for each inode when I did run the migration rule which changed the storage pool. Kind regards, Heiner From: on behalf of IBM Spectrum Scale Reply to: gpfsug main discussion list Date: Wednesday, 9 June 2021 at 14:55 To: gpfsug main discussion list Cc: "gpfsug-discuss-bounces at spectrumscale.org" Subject: Re: [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule Hi Billich, >Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Basically a migration policy run with -I defer would just simply mark the files as illPlaced which would not cause metadata extension for such files(e.g., inode size is fixed after file system creation). Instead, I'm just wondering about your placement rules, which are existing rules or newly installed rules? Which could set EAs to newly created files and may cause increased metadata size. Also any new EAs are inserted for files? Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Billich Heinrich Rainer (ID SD)" ---2021/06/08 05:19:32 PM--- Hello, From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 2021/06/08 05:19 PM Subject: [EXTERNAL] [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, A policy run with ?-I defer? and a placement rule did almost double the metadata usage of a filesystem. This filled the metadata disks to a critical level. I would like to understand if this is to be expected and ?as designed? or if I face some issue or bug. 
I hope a subsequent run of ?mmrstripefs -p? will reduce the metadata usage again. Thank you I want to move all data to a new storage pool and did run a policy like RULE 'migrate_to_Data' MIGRATE WEIGHT(0) TO POOL 'Data' for each fileset with mmapplypolicy -I defer Next I want to actually move the data with mmrestripefs -p After the policy run metadata usage increased from 2.06TiB to 3.53TiB and filled the available metadata space by >90%. This is somewhat surprising. Will the following run of ?mmrestripefs -p? reduce the usage again, when the files are not illplaced any more? The number of used Inodes did not change noticeably during the policy run. Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Thank you, Heiner Some details # mmlsfs fsxxxx -f -i -B -I -m -M -r -R -V flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes (system pool) 32768 Minimum fragment (subblock) size in bytes (other pools) -i 4096 Inode size in bytes -B 1048576 Block size (system pool) 4194304 Block size (other pools) -I 32768 Indirect block size in bytes -m 1 Default number of metadata replicas -M 2 Maximum number of metadata replicas -r 1 Default number of data replicas -R 2 Maximum number of data replicas -V 23.00 (5.0.5.0) Current file system version 19.01 (5.0.1.0) Original file system version Inode Information ----------------- Total number of used inodes in all Inode spaces: 398502837 Total number of free inodes in all Inode spaces: 94184267 Total number of allocated inodes in all Inode spaces: 492687104 Total of Maximum number of inodes in all Inode spaces: 916122880[attachment "smime.p7s" deleted by Hai Zhong HZ Zhou/China/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5254 bytes Desc: not available URL: From jonathan.buzzard at strath.ac.uk Thu Jun 17 13:15:32 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 17 Jun 2021 13:15:32 +0100 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: <8fae4157-049b-e23c-0d69-c07be77d1f5b@strath.ac.uk> On 17/06/2021 09:29, Jan-Frode Myklebust wrote: > *All* nodes needs to be able to communicate on the daemon network. If > they don't have access to this network, they can't join the cluster. Not strictly true. TL;DR if all your NSD/master nodes are both Ethernet and Infiniband connected then you will be able to join the node to the network. Doing so is not advisable however as you will then start experiencing node evictions left right and centre. > It doesn't need to be same subnet, it can be routed. But they all have to > be able to reach each other. If you use IPoIB, you likely need something > to route between the IPoIB network and the outside?world to reach the IP > you have on your VM. 
I don't think you will be able to use an IP address > in the IPoIB range for your VM, unless your vmware hypervisor is > connected to the IB fabric, and can bridge it.. (doubt that's possible). ESXi and pretty much ever other hypervisor worth their salt has been able to do PCI pass through since forever. So wack a Infiniband card in your ESXi node, pass it through to the VM and the jobs a goodun. However it is something a lot of people are completely unaware of, including Infiniband/Omnipath vendors. Conversation goes can I run my fabric manager on a VM in ESXi rather than burn the planet on dedicated nodes for the job. Response comes back the fabric is not supported on ESXi, which shows utter ignorance on behalf of the fabric vendor. > I've seen some customers avoid using IPoIB, and rather mix an ethernet > for daemon network, and dedicate the infiniband network to RDMA. > What's the point of RDMA for GPFS, lower CPU overhead? For my mind it creates a lot of inflexibility. If your next cluster uses a different fabric migration is now a whole bunch more complicated. It's also a "minority sport" so something to be avoided unless there is a compelling reason not to. In general you need a machine to act as a gateway between the Ethernet and Infiniband fabrics. The configuration for this is minimal, the following works just fine on RHEL7 and it's derivatives, though you will need to change your interface names to suite enable the kernel to forward IPv4 packets sysctl -w net.ipv4.ip_forward=1 echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf tell the firewall to forward packets between the Ethernet and Infiniband interfaces iptables -A FORWARD -i eth0 -o ib0 -j ACCEPT iptables -A FORWARD -i ib0 -o eth0 -j ACCEPT echo "-P INPUT ACCEPT -P FORWARD ACCEPT -P OUTPUT ACCEPT -A FORWARD -i eth0 -o ib0 -j ACCEPT -A FORWARD -i ib0 -o eth0 -j ACCEPT" > /etc/sysconfig/iptables enable and start the firewall systemctl enable --now firewalld However this approach has "issues", as you now have a single point of failure on your system. TL;DR if the gateway goes away for any reason node ejections abound, so you can't restart it to apply security updates. On our system it is mainly a plain Ethernet (minimum 10Gbps) GPFS fabric using plain TCP/IP. However the teaching HPC cluster nodes only have 1Gbps Ethernet and 40Gbps Infiniband (they where kept from a previous system that used Lustre over Infiniband), so the storage goes over Infiniband and we hooked a spare port on the ConnectX-4 cards on the DSS-G nodes to the Infiniband fabric. So the Ethernet/Infiniband gateway is only used as the nodes chat to one another. Further when a teaching node responds on the daemon network to a compute node it actually goes out the ethernet network of the node. You could fix that but it's complicated configuration. This leads to the option of running a pair of nodes that will route between the networks and then running keepalived on the ethernet side to provide redundancy using VRRP to shift the gateway IP between the two nodes. You might be able to do the same for the Infiniband I have never tried, but in general it unnecessary IMHO. I initially wanted to run this on the DSS-G nodes themselves because the amount of bridged traffic is tiny, 110 days since my gateway was last rebooted have produced a bit under 16GB of forwarded traffic. The DSS-G nodes are ideally placed to do the routing having loads of redundant Ethernet connectivity. 
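The keepalived side of it is only a handful of lines per router node. A sketch only -- the interface name and virtual IP are made up, and the second node of the pair gets state BACKUP and a lower priority:

# sketch, run on the first router node of the pair
cat > /etc/keepalived/keepalived.conf <<'EOF'
vrrp_instance gpfs_gw {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    virtual_ipaddress {
        192.168.10.254/24
    }
}
EOF
systemctl enable --now keepalived

Both nodes carry the same sysctl and iptables forwarding configuration as above, and whichever one currently holds the VRRP address does the forwarding.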
However it turns out running keepalived on the DSS-G nodes is not allowed :-( So I still have a single point of failure on the system and debating what to do next. Given RHEL8 has removed the driver support for the Intel Quickpath Infiniband cards a wholesale upgrade to 10Gbps Ethenet is looking attractive. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From pbasmaji at us.ibm.com Fri Jun 18 07:50:04 2021 From: pbasmaji at us.ibm.com (IBM Storage) Date: Fri, 18 Jun 2021 02:50:04 -0400 (EDT) Subject: [gpfsug-discuss] Don't miss out! Get your Spectrum Scale t-shirt by June 21 Message-ID: <1136599776589.1135676995690.1087424846.0.290250JL.2002@scheduler.constantcontact.com> Limited edition shirt available Don't Miss Out! Get your limited-edition IBM Spectrum Scale t-shirt by Tuesday, June 21 Hi Spectrum Scale User Group! First, thank you for being a valued member of the independent IBM Spectrum Scale User Group, and supporting your peers in the technical community. It's been a long time since we've gathered in person, and we hope that will change soon. I'm writing to tell you that due to COVID, we have limited-edition Spectrum Scale t-shirts available now through Tuesday, June 21, and I want to invite you to place your order directly below. After that time, we will no longer be able to distribute them directly to you. That's why I'm asking you to distribute this email in your organization before June 21 so we can get this stock into the hands of your teams, our users, customers and partners while there's still time! Only individual orders can be accepted, and up to 10 colleagues per company can receive t-shirts, if they claim them by this Tuesday. (Unfortunatey, government-owned entitles (GOEs) cannot participate.) Send My T-Shirt Send My T-Shirt If you have questions, please contact me by replying to this email. Thank you, Best regards, Peter M Basmajian Product Marketing and Advocacy IBM Storage *Terms and conditions apply. See website for details. IBM Storage | 425 Market Street, San Francisco, CA 94105 Unsubscribe gpfsug-discuss at spectrumscale.org Update Profile | Our Privacy Policy | Constant Contact Data Notice Sent by pbasmaji at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From oluwasijibomi.saula at ndsu.edu Tue Jun 22 16:17:16 2021 From: oluwasijibomi.saula at ndsu.edu (Saula, Oluwasijibomi) Date: Tue, 22 Jun 2021 15:17:16 +0000 Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? Message-ID: Hello, While reviewing AMS software suite for installation, I noticed this remark (https://www.scm.com/doc/Installation/Additional_Information_and_Known_Issues.html#gpfs-file-system): ----- GPFS file system Starting with AMS2019, the KF sub-system (used for handling binary files such as ADF?s TAPE* files) has been rewritten to use memory-mapped files. The mmap() system call implementation is file-system dependent and, unfortunately, it is not equally efficient in different file systems. The memory-mapped files implementation in GPFS is extremely inefficient. Therefore the users should avoid using a GPFS for scratch files -------- Is this claim true? Are there caveats to this statement, if true? 
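For anyone who wants to measure this on their own system, something like fio's mmap engine, run once against a GPFS directory and once against local scratch, should show the gap if there is one. The path below is made up:

# repeat with --directory pointed at local scratch for comparison
fio --name=mmap-test --directory=/gpfs/scratch/mmaptest --ioengine=mmap \
    --rw=randread --bs=4k --size=2g --numjobs=4 --runtime=60 \
    --time_based --group_reporting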
Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Jun 22 16:55:54 2021 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Tue, 22 Jun 2021 15:55:54 +0000 Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? Message-ID: <14CCBDE7-AB03-456B-806B-6AD1A8270A7D@bham.ac.uk> There certainly *were* issues. See for example: http://files.gpfsug.org/presentations/2018/London/6_GPFSUG_EBI.pdf And the follow on IBM talk on the same day: http://files.gpfsug.org/presentations/2018/London/6_MMAP_V2.pdf And also from this year: https://www.spectrumscaleug.org/event/ssugdigital-spectrum-scale-expert-talks-update-on-performance-enhancements-in-spectrum-scale/ So it may have been true. If it still is, maybe, but it will depend on your GPFS code. Simon From: on behalf of "Saula, Oluwasijibomi" Reply to: "gpfsug-discuss at spectrumscale.org" Date: Tuesday, 22 June 2021 at 16:17 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? Hello, While reviewing AMS software suite for installation, I noticed this remark (https://www.scm.com/doc/Installation/Additional_Information_and_Known_Issues.html#gpfs-file-system): ----- GPFS file system Starting with AMS2019, the KF sub-system (used for handling binary files such as ADF?s TAPE* files) has been rewritten to use memory-mapped files. The mmap() system call implementation is file-system dependent and, unfortunately, it is not equally efficient in different file systems. The memory-mapped files implementation in GPFS is extremely inefficient. Therefore the users should avoid using a GPFS for scratch files -------- Is this claim true? Are there caveats to this statement, if true? Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] -------------- next part -------------- An HTML attachment was scrubbed... URL: From oluwasijibomi.saula at ndsu.edu Tue Jun 22 17:26:52 2021 From: oluwasijibomi.saula at ndsu.edu (Saula, Oluwasijibomi) Date: Tue, 22 Jun 2021 16:26:52 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 113, Issue 19 In-Reply-To: References: Message-ID: Simon, Thanks for the quick response and related information! We are at least at v5.0.5 so we shouldn't see much exposure to this issue then. 
Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of gpfsug-discuss-request at spectrumscale.org Sent: Tuesday, June 22, 2021 10:56 AM To: gpfsug-discuss at spectrumscale.org Subject: gpfsug-discuss Digest, Vol 113, Issue 19 Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. GPFS bad at memory-mapped files? (Saula, Oluwasijibomi) 2. Re: GPFS bad at memory-mapped files? (Simon Thompson) ---------------------------------------------------------------------- Message: 1 Date: Tue, 22 Jun 2021 15:17:16 +0000 From: "Saula, Oluwasijibomi" To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? Message-ID: Content-Type: text/plain; charset="windows-1252" Hello, While reviewing AMS software suite for installation, I noticed this remark (https://www.scm.com/doc/Installation/Additional_Information_and_Known_Issues.html#gpfs-file-system): ----- GPFS file system Starting with AMS2019, the KF sub-system (used for handling binary files such as ADF?s TAPE* files) has been rewritten to use memory-mapped files. The mmap() system call implementation is file-system dependent and, unfortunately, it is not equally efficient in different file systems. The memory-mapped files implementation in GPFS is extremely inefficient. Therefore the users should avoid using a GPFS for scratch files -------- Is this claim true? Are there caveats to this statement, if true? Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 2 Date: Tue, 22 Jun 2021 15:55:54 +0000 From: Simon Thompson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS bad at memory-mapped files? Message-ID: <14CCBDE7-AB03-456B-806B-6AD1A8270A7D at bham.ac.uk> Content-Type: text/plain; charset="utf-8" There certainly *were* issues. See for example: http://files.gpfsug.org/presentations/2018/London/6_GPFSUG_EBI.pdf And the follow on IBM talk on the same day: http://files.gpfsug.org/presentations/2018/London/6_MMAP_V2.pdf And also from this year: https://www.spectrumscaleug.org/event/ssugdigital-spectrum-scale-expert-talks-update-on-performance-enhancements-in-spectrum-scale/ So it may have been true. If it still is, maybe, but it will depend on your GPFS code. Simon From: on behalf of "Saula, Oluwasijibomi" Reply to: "gpfsug-discuss at spectrumscale.org" Date: Tuesday, 22 June 2021 at 16:17 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? 
Hello, While reviewing AMS software suite for installation, I noticed this remark (https://www.scm.com/doc/Installation/Additional_Information_and_Known_Issues.html#gpfs-file-system): ----- GPFS file system Starting with AMS2019, the KF sub-system (used for handling binary files such as ADF?s TAPE* files) has been rewritten to use memory-mapped files. The mmap() system call implementation is file-system dependent and, unfortunately, it is not equally efficient in different file systems. The memory-mapped files implementation in GPFS is extremely inefficient. Therefore the users should avoid using a GPFS for scratch files -------- Is this claim true? Are there caveats to this statement, if true? Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 113, Issue 19 *********************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Wed Jun 23 11:10:28 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 23 Jun 2021 12:10:28 +0200 Subject: [gpfsug-discuss] mmbackup with own policy Message-ID: Hallo, mmbackup offers -P to specify an own policy. Unfortunately I cannot seem to find documentation how that policy has to look like. I mean, if I grab the policy generated automatically by mmbackup it looks like this: --------------------------------------------------------- /* Auto-generated GPFS policy rules file * Generated on Sat May 29 15:10:46 2021 */ /* Server rules for backup server 1 *** back5_2 *** */ RULE EXTERNAL LIST 'mmbackup.1.back5_2' EXEC '/net/gpfso/fs1/.mmbackupCfg/BAexecScript.gpfsofs1' OPTS '"/net/gpfso/fs1/.mmbackupShadow.1.back5_2.filesys.update" "-servername=back5_2" "-auditlogname=/net/gpfso/fs1/mmbackup.audit.gpfsofs1.back5_2" "NONE"' RULE 'BackupRule' LIST 'mmbackup.1.back5_2' DIRECTORIES_PLUS SHOW(VARCHAR(MODIFICATION_TIME) || ' ' || VARCHAR(CHANGE_TIME) || ' ' || VARCHAR(FILE_SIZE) || ' ' || VARCHAR(FILESET_NAME) || ' ' || 'resdnt' ) WHERE ( NOT ( (PATH_NAME LIKE '/%/.mmbackup%') OR (PATH_NAME LIKE '/%/.mmLockDir' AND MODE LIKE 'd%') OR (PATH_NAME LIKE '/%/.mmLockDir/%') OR (MODE LIKE 's%') ) ) AND (MISC_ATTRIBUTES LIKE '%u%') AND ... --------------------------------------------------------- If I want use an own policy what of all that is required for mmbackup to find the information it needs? Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From yeep at robust.my Wed Jun 23 12:08:20 2021 From: yeep at robust.my (T.A. Yeep) Date: Wed, 23 Jun 2021 19:08:20 +0800 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: Message-ID: Hi Dr. 
Martin, You can refer to the Administrator Guide > Chapter 30 > Policies for automating file management, or access via the link below. If you downloaded a PDF, it starts with page 487. https://www.ibm.com/docs/en/spectrum-scale/5.1.0?topic=management-policy-rules There a quite a number of examples in that chapter too which can help you establish a good understanding of how to write one yourself. On Wed, Jun 23, 2021 at 6:10 PM Ulrich Sibiller < u.sibiller at science-computing.de> wrote: > Hallo, > > mmbackup offers -P to specify an own policy. Unfortunately I cannot seem > to find documentation how > that policy has to look like. > > I mean, if I grab the policy generated automatically by mmbackup it looks > like this: > > --------------------------------------------------------- > /* Auto-generated GPFS policy rules file > * Generated on Sat May 29 15:10:46 2021 > */ > > /* Server rules for backup server 1 > *** back5_2 *** > */ > RULE EXTERNAL LIST 'mmbackup.1.back5_2' EXEC > '/net/gpfso/fs1/.mmbackupCfg/BAexecScript.gpfsofs1' > OPTS '"/net/gpfso/fs1/.mmbackupShadow.1.back5_2.filesys.update" > "-servername=back5_2" > "-auditlogname=/net/gpfso/fs1/mmbackup.audit.gpfsofs1.back5_2" "NONE"' > RULE 'BackupRule' LIST 'mmbackup.1.back5_2' DIRECTORIES_PLUS > SHOW(VARCHAR(MODIFICATION_TIME) || ' ' || VARCHAR(CHANGE_TIME) || ' > ' || > VARCHAR(FILE_SIZE) || ' ' || VARCHAR(FILESET_NAME) || > ' ' || 'resdnt' ) > WHERE > ( > NOT > ( (PATH_NAME LIKE '/%/.mmbackup%') OR > (PATH_NAME LIKE '/%/.mmLockDir' AND MODE LIKE 'd%') OR > (PATH_NAME LIKE '/%/.mmLockDir/%') OR > (MODE LIKE 's%') > ) > ) > AND > (MISC_ATTRIBUTES LIKE '%u%') > AND > ... > --------------------------------------------------------- > > > If I want use an own policy what of all that is required for mmbackup to > find the information it needs? > > Uli > -- > Science + Computing AG > Vorstandsvorsitzender/Chairman of the board of management: > Dr. Martin Matzke > Vorstand/Board of Management: > Matthias Schempp, Sabine Hohenstein > Vorsitzender des Aufsichtsrats/ > Chairman of the Supervisory Board: > Philippe Miltin > Aufsichtsrat/Supervisory Board: > Martin Wibbe, Ursula Morgenstern > Sitz/Registered Office: Tuebingen > Registergericht/Registration Court: Stuttgart > Registernummer/Commercial Register No.: HRB 382196 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Best regards *T.A. Yeep* -------------- next part -------------- An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Wed Jun 23 12:31:46 2021 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 23 Jun 2021 11:31:46 +0000 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Wed Jun 23 13:19:59 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 23 Jun 2021 14:19:59 +0200 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: Message-ID: On 6/23/21 1:08 PM, T.A. Yeep wrote: > You can refer to the Administrator Guide > Chapter 30 > Policies for automating file management, or > access via the link below. If you downloaded a?PDF, it starts with page?487. > https://www.ibm.com/docs/en/spectrum-scale/5.1.0?topic=management-policy-rules > > > There a quite a number of examples in that chapter too which can help you establish a good > understanding of how to write?one yourself. 
Thanks, I know how to write policies. I just had the impression that regarding mmbackup the policy has to follow certain rules to satisfy mmbackup requirements. Kind regards, Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From u.sibiller at science-computing.de Wed Jun 23 15:15:53 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 23 Jun 2021 16:15:53 +0200 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: Message-ID: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de> On 6/23/21 1:31 PM, Frederick Stock wrote: > The only requirement for your own backup policy is that it finds the files you want to back up and > skips those that you do not want to back up.? It is no different than any policy that you would use > with the GPFS policy engine. Have you ever sucessfully done this? Let me explain again: With an own policy I know what I want to see in the output and for an arbitrary external rule I know where the called script resides and what parameters it expects. mmbackup however creates a helper script (BAexecScript.) on the fly and calls that from its autogenerated policy via the external rule line. I assume I need the BAexecScript to make mmbackup behave like it should. Is this wrong? Update: after playing around for quite some time it looks the rule must have a special format, as shown below. Replace the placeholders like this: the name of the Backupserver as defined in dsm.sys and on the mmbackup command line via --tsm-servers as shown in mmlsfs all_local (without /dev/) as shown in mmlsfs -T -------------------------------------------------------------------------------------------- RULE EXTERNAL LIST 'mmbackup.1.' EXEC '/.mmbackupCfg/BAexecScript.' OPTS '"/.mmbackupShadow.1..filesys.update" "-servername=" "-auditlogname=/mmbackup.audit.." "NONE"' RULE 'BackupRule' LIST 'mmbackup.1.' DIRECTORIES_PLUS SHOW(VARCHAR(MODIFICATION_TIME) || ' ' || VARCHAR(CHANGE_TIME) || ' ' || VARCHAR(FILE_SIZE) || ' ' || VARCHAR(FILESET_NAME) || ' ' || 'resdnt' ) WHERE ( KB_ALLOCATED > 1024 ) -------------------------------------------------------------------------------------------- Call it like this: /usr/lpp/mmfs/bin/mmbackup --tsm-servers -P As this non-trivial it should be mentioned in the documentation! Uli -- Dipl.-Inf. Ulrich Sibiller science + computing ag System Administration Hagellocher Weg 73 Hotline +49 7071 9457 681 72070 Tuebingen, Germany https://atos.net/de/deutschland/sc -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. 
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From wsawdon at us.ibm.com Wed Jun 23 22:52:48 2021 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Wed, 23 Jun 2021 21:52:48 +0000 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de> References: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de>, Message-ID: An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Thu Jun 24 11:24:30 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Thu, 24 Jun 2021 12:24:30 +0200 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de> Message-ID: <73732efb-28bd-d7bd-b8b5-b1aace37f533@science-computing.de> On 6/23/21 11:52 PM, Wayne Sawdon wrote: > At a higher level what are you trying to do? Include some directories and exclude others? Use a > different backup server? What do you need that is not there? The current situation is that a customer insists on using different management classes on the TSM server for big files than for small files. We have setup additional server stanzas _big and _small representing the management classes. servername _small ... inclexcl .../dsm.inclexcl.small ... # cat .../dsm.inclexcl.small include ... smallfile ("smallfile" is the management class) Now we need to run mmbackup against the _small and restrict ist to only treat the small files and ignore the bigger ones. As we cannot determine them by name or path we need to use a policy (.. WHERE KB_ALLOCATED < something) > mmbackup is tied to Spectrum Protect (formerly known as TSM) and gets its include/excludes from the > TSM option files. It constructs a policy to list all of the files & directories and includes > attributes such as mtime and ctime. It then compares this list of files to the "shadow database" > which is a copy of what the TSM database has. This comparison produces 3 lists of files: new files & > files that have the data changed, existing files that only have attributes changed and a list of > files that were deleted. Each list is sent to Spectrum Protect to either backup the data, or to > update the metadata or to mark the file as deleted. As Spectrum Protect acknowledges the operation > on each file, we update the shadow database to keep it current. Exactly. I am aware of that. > So writing a new policy file for mmbackup is not really as simple as it seems. I don't? think you > can change the record format on the list of files. And don't override the encoding on special > characters. And I'm sure there are other Gotchas as well. That's just what I wanted to express with my previous mails. It is not as simple as it seems AND it is not documented. We want to use all that fancy shadow file management that mmbackup comes with because it is sophisticated nowadays and generally works flawlessly. We do not want to reinvent the mmbackup wheel. So for the current situation having a simple way to (partly) replace the WHERE clause would be of great help. I acknowledge that offering that in a general way could get complicated for the developers. 
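A sketch of how the two runs could be wired together, purely illustrative: the server names tsmserver_small/tsmserver_big, the device name gpfsofs1, the 1024 KB cut-off and the policy paths are placeholders, and each policy file is assumed to follow the auto-generated template posted earlier in this thread (with the matching -servername= and shadow file name baked into its EXTERNAL LIST rule).

# one mmbackup run per management class / dsm.sys server stanza
# policy.small ends in:  WHERE ( KB_ALLOCATED <= 1024 )
# policy.big   ends in:  WHERE ( KB_ALLOCATED >  1024 )
/usr/lpp/mmfs/bin/mmbackup gpfsofs1 --tsm-servers tsmserver_small -P /path/to/policy.small
/usr/lpp/mmfs/bin/mmbackup gpfsofs1 --tsm-servers tsmserver_big   -P /path/to/policy.big

Because the shadow database name includes the server name, each stanza keeps its own shadow file, so the two runs should not step on each other.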
Having a documentation how to write a policy that matches what mmbackup without -P is doing is the first step the improve the situation. My posted policy currently works but it is questionable how long that will be the case. Once mmbackup changes its internal behaviour it will break.. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From chair at spectrumscale.org Mon Jun 28 09:09:39 2021 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Mon, 28 Jun 2021 09:09:39 +0100 Subject: [gpfsug-discuss] SSUG::Digital: Spectrum Scale Container Native Storage Access (CNSA) Message-ID: <> An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: meeting.ics Type: text/calendar Size: 2338 bytes Desc: not available URL: From ewahl at osc.edu Mon Jun 28 21:00:35 2021 From: ewahl at osc.edu (Wahl, Edward) Date: Mon, 28 Jun 2021 20:00:35 +0000 Subject: [gpfsug-discuss] GUI refresh task error In-Reply-To: References: <72d50b96-c6a3-f075-8f47-84bf2346f0ae@docum.org> <975f874a066c4ba6a45c62f9b280efa2@postbank.de> Message-ID: Curious if this was ever fixed or someone has an APAR # ? I'm still running into it on 5.0.5.6 Ed Wahl OSC -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Stef Coene Sent: Thursday, July 16, 2020 9:47 AM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GUI refresh task error Ok, thanx for the answer. I will wait for the fix. Stef On 2020-07-16 15:25, Roland Schuemann wrote: > Hi Stef, > > we already recognized this error too and opened a PMR/Case at IBM. > You can set this task to inactive, but this is not persistent. After gui restart it comes again. > > This was the answer from IBM Support. >>>>>>>>>>>>>>>>>> > This will be fixed in the next release of 5.0.5.2, right now there is no work-around but will not cause issue besides the cosmetic task failed message. > Is this OK for you? >>>>>>>>>>>>>>>>>> > > So we ignore (Gui is still degraded) it and wait for the fix. > > Kind regards > Roland Sch?mann > > > Freundliche Gr??e / Kind regards > Roland Sch?mann > > ____________________________________________ > > Roland Sch?mann > Infrastructure Engineering (BTE) > CIO PB Germany > > Deutsche Bank I Technology, Data and Innovation Postbank Systems AG > > > -----Urspr?ngliche Nachricht----- > Von: gpfsug-discuss-bounces at spectrumscale.org > Im Auftrag von Stef Coene > Gesendet: Donnerstag, 16. Juli 2020 15:14 > An: gpfsug main discussion list > Betreff: [gpfsug-discuss] GUI refresh task error > > Hi, > > On brand new 5.0.5 cluster we have the following errors on all nodes: > "The following GUI refresh task(s) failed: WATCHFOLDER" > > It also says > "Failure reason: Command mmwatch all functional --list-clustered-status > failed" > > Running mmwatch manually gives: > mmwatch: The Clustered Watch Folder function is only available in the IBM Spectrum Scale Advanced Edition or the Data Management Edition. > mmwatch: Command failed. Examine previous error messages to determine cause. > > How can I get rid of this error? 
> > I tried to disable the task with: > chtask WATCHFOLDER --inactive > EFSSG1811C The task with the name WATCHFOLDER is already not scheduled. > > > Stef > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug- > discuss__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_ > gi9hZJP8mT$ Die Europ?ische Kommission hat unter > https://urldefense.com/v3/__http://ec.europa.eu/consumers/odr/__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9m0qpNP9$ eine Europ?ische Online-Streitbeilegungsplattform (OS-Plattform) errichtet. Verbraucher k?nnen die OS-Plattform f?r die au?ergerichtliche Beilegung von Streitigkeiten aus Online-Vertr?gen mit in der EU niedergelassenen Unternehmen nutzen. > > Informationen (einschlie?lich Pflichtangaben) zu einzelnen, innerhalb der EU t?tigen Gesellschaften und Zweigniederlassungen des Konzerns Deutsche Bank finden Sie unter https://urldefense.com/v3/__https://www.deutsche-bank.de/Pflichtangaben__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9sgMU2R_$ . Diese E-Mail enth?lt vertrauliche und/ oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese E-Mail. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser E-Mail ist nicht gestattet. > > The European Commission has established a European online dispute resolution platform (OS platform) under https://urldefense.com/v3/__http://ec.europa.eu/consumers/odr/__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9m0qpNP9$ . Consumers may use the OS platform to resolve disputes arising from online contracts with providers established in the EU. > > Please refer to https://urldefense.com/v3/__https://www.db.com/disclosures__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9nXBvg8r$ for information (including mandatory corporate particulars) on selected Deutsche Bank branches and group companies registered or incorporated in the European Union. This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug- > discuss__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_ > gi9hZJP8mT$ > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9hZJP8mT$ From jonathan.buzzard at strath.ac.uk Tue Jun 29 14:46:45 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Tue, 29 Jun 2021 14:46:45 +0100 Subject: [gpfsug-discuss] PVU question Message-ID: Hum, it would appear there are gaps in IBM's PVU table. Specifically I am looking at using a Pentium G4620 in a server https://ark.intel.com/content/www/us/en/ark/products/97460/intel-pentium-processor-g4620-3m-cache-3-70-ghz.html It's dual core with ECC memory support all in a socket 1151. 
While a low spec, it would be an upgrade from the Xeon E3113 currently in use and more than adequate for the job. A quad core CPU would more than double the PVU for no performance gain, so I am not keen to go there. The only reason for the upgrade is that the hardware is now getting on and finding spares on eBay is getting hard (it's a Dell PowerEdge R300). However it doesn't fit anywhere in the PVU table https://www.ibm.com/software/passportadvantage/pvu_licensing_for_customers.html It's not a Xeon, it's not a Core, it's not AMD and it's not single core.
It won't be in a laptop, desktop or workstation so that rules out that PVU calculation. Does that mean zero PVU :-) or it's not supported or what? Customer support where hopeless in answering my query. Then again IBM think I need GDPR stickers for returning a memory DIMM. JAB -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Tue Jun 1 15:26:10 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Tue, 1 Jun 2021 16:26:10 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota Message-ID: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> Hi, I experience some strangeness that I fail to understand completely. I have a fileset that got copied (rsynced) from one cluster to another. The reported size (mmrepquota) of the source filesystem is 800G (and due to data and metadata replication being set to 2 this effectively means 400G). After syncing the data to the destination the size there is ~457GB. $ mmrepquota -j srcfilesys | grep fileset srcfileset FILESET 800 800 800 0 none | 863 0 0 0 none $ mmrepquota -j dstfilesys | grep fileset fileset root FILESET 457 400 400 0 none | 853 0 0 0 none (note: on the destination filesystem we have set ignoreReplicationOnStatfs yes IgnoreReplicaSpaceOnStat yes ignoreReplicationForQuota yes so there's no need to to divisions ) (note2: the destination hat 10 files less. These where small leftover .nfs* files: $ du --block-size 1 /srcfilesys/fileset/.nfs* 1024 /srvfilesys/fileset//.nfs0000000000f8a57800000505 1024 /srvfilesys/fileset//.nfs0000000002808f4d000000cb 1024 /srvfilesys/fileset//.nfs0000000002af44db00005509 1024 /srvfilesys/fileset//.nfs00000000034eb9270000072a 1024 /srvfilesys/fileset//.nfs0000000003a9b48300002974 1024 /srvfilesys/fileset//.nfs0000000003d10f990000028a $ du --apparent-size --block-size 1 /srcfilesys/fileset/.nfs* 524 /srvfilesys/fileset//.nfs0000000000f8a57800000505 524 /srvfilesys/fileset//.nfs0000000002808f4d000000cb 524 /srvfilesys/fileset//.nfs0000000002af44db00005509 524 /srvfilesys/fileset//.nfs00000000034eb9270000072a 524 /srvfilesys/fileset//.nfs0000000003a9b48300002974 524 /srvfilesys/fileset//.nfs0000000003d10f990000028a ) While trying to understand what's going on here I found this on the source file system (which is valid for all files, with different number of course): $ du --block-size 1 /srcfilesys/fileset/filename 65536 /srcfilesys/fileset/filename $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename 3994 /srcfilesys/fileset/filename $ stat /srcfilesys/fileset/filename File: ?/srcfilesys/fileset/filename? Size: 3994 Blocks: 128 IO Block: 1048576 regular file Device: 2ah/42d Inode: 23266095 Links: 1 Access: (0660/-rw-rw----) Uid: (73018/ cpunnoo) Gid: (50070/ dc-rti) Context: system_u:object_r:unlabeled_t:s0 Access: 2021-05-12 20:10:13.814459305 +0200 Modify: 2020-07-16 11:08:41.631006000 +0200 Change: 2020-07-16 11:08:41.630896177 +0200 Birth: - If I sum up the disk usage of the first du I end up with 799.986G in total which matches the mmrepquota output. If I add up the disk usage of the second du I end up at 456.569G _which matches the mmrepquota output on the destination system_. 
So on the source filesystem the quota seems to add up the apparent size while on the destination filesystem the quota value is the sum of the du without --apparent-size. Running the dus on the destination filesystem reports other numbers: $ du --block-size 1 /dstfilesys/fileset/filename 8192 /dstfilesys/fileset/filename $ du --apparent-size --block-size 1 /dstfilesys/fileset/filename 3994 /dstfilesys/fileset/filename $ stat /dstfilesys/fileset/filename File: /dstfilesys/fileset/filename Size: 3994 Blocks: 16 IO Block: 4194304 regular file Device: 3dh/61d Inode: 2166358719 Links: 1 Access: (0660/-rw-rw----) Uid: (73018/ cpunnoo) Gid: (50070/ dc-rti) Access: 2021-05-29 07:52:56.069440382 +0200 Modify: 2020-07-16 11:08:41.631006000 +0200 Change: 2021-05-12 20:10:13.970443145 +0200 Birth: - Summing them up shows almost identical numbers for both dus: 467528 467527 which I really do not get at all... So is there an explanation of how mmrepquota and du and du --apparent-size are related? Uli PS: Some more details: The source filesystem is RHEL7 with gpfs.base 5.0.5-5: flag value description ------------------- ------------------------ ----------------------------------- -f 32768 Minimum fragment (subblock) size in bytes -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes -m 2 Default number of metadata replicas -M 2 Maximum number of metadata replicas -r 2 Default number of data replicas -R 2 Maximum number of data replicas -j scatter Block allocation type -D nfs4 File locking semantics in effect -k all ACL semantics in effect -n 32 Estimated number of nodes that will mount file system -B 1048576 Block size -Q user;group;fileset Quotas accounting enabled user;group;fileset Quotas enforced none Default quotas enabled --perfileset-quota No Per-fileset quota enforcement --filesetdf Yes Fileset df enabled? -V 16.00 (4.2.2.0) Current file system version 14.10 (4.1.0.4) Original file system version --create-time Tue Feb 3 11:46:10 2015 File system creation time -z No Is DMAPI enabled? -L 4194304 Logfile size -E Yes Exact mtime mount option -S No Suppress atime mount option -K whenpossible Strict replica allocation option --fastea Yes Fast external attributes enabled? --encryption No Encryption enabled? --inode-limit 179217920 Maximum number of inodes in all inode spaces --log-replicas 0 Number of log replicas --is4KAligned Yes is4KAligned? --rapid-repair No rapidRepair enabled? --write-cache-threshold 0 HAWC Threshold (max 65536) --subblocks-per-full-block 32 Number of subblocks per full block -P system Disk storage pools in file system --file-audit-log No File Audit Logging enabled? --maintenance-mode No Maintenance Mode enabled? 
-d NSDDCS1N01;NSDDCS1N02;NSDDCS1N03;NSDDCS1N04;NSDDCS1N05;NSDDCS1N06;NSDDCS1N07;NSDDCS1N08;NSDDCS1N09;NSDDCS1N10;NSDDCS1N11;NSDDCS1N12;NSDNECE54001N01;NSDNECE54001N02; -d NSDNECE54001N03;NSDNECE54001N04;NSDNECE54001N05;NSDNECE54001N06;NSDNECE54001N07;NSDNECE54001N08;NSDNECE54001N09;NSDNECE54001N10;NSDNECE54001N11;NSDNECE54001N12;DESC1 Disks in file system -A yes Automatic mount option -o none Additional mount options -T /srcfilesys Default mount point --mount-priority 0 Mount priority The destination cluster is RHEL8 with gpfs.base-5.1.0-3.x86_64: flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes -m 2 Default number of metadata replicas -M 2 Maximum number of metadata replicas -r 2 Default number of data replicas -R 2 Maximum number of data replicas -j scatter Block allocation type -D nfs4 File locking semantics in effect -k all ACL semantics in effect -n 32 Estimated number of nodes that will mount file system -B 4194304 Block size -Q user;group;fileset Quotas accounting enabled none Quotas enforced none Default quotas enabled --perfileset-quota yes Per-fileset quota enforcement --filesetdf yes Fileset df enabled? -V 23.00 (5.0.5.0) File system version --create-time Tue May 11 16:51:27 2021 File system creation time -z no Is DMAPI enabled? -L 33554432 Logfile size -E yes Exact mtime mount option -S relatime Suppress atime mount option -K whenpossible Strict replica allocation option --fastea yes Fast external attributes enabled? --encryption no Encryption enabled? --inode-limit 105259008 Maximum number of inodes in all inode spaces --log-replicas 0 Number of log replicas --is4KAligned yes is4KAligned? --rapid-repair yes rapidRepair enabled? --write-cache-threshold 0 HAWC Threshold (max 65536) --subblocks-per-full-block 512 Number of subblocks per full block -P system Disk storage pools in file system --file-audit-log no File Audit Logging enabled? --maintenance-mode no Maintenance Mode enabled? -d RG001VS021;RG002VS021;RG001VS022;RG002VS022 Disks in file system -A yes Automatic mount option -o none Additional mount options -T /dstfilesys Default mount point --mount-priority 0 Mount priority -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From tortay at cc.in2p3.fr Tue Jun 1 15:56:42 2021 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Tue, 1 Jun 2021 16:56:42 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> Message-ID: <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> On 6/1/21 4:26 PM, Ulrich Sibiller wrote: [...] > ) > > While trying to understand what's going on here I found this on the > source file system (which is valid for all files, with different number > of course): > > $ du --block-size 1 /srcfilesys/fileset/filename > 65536?? /srcfilesys/fileset/filename > > $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename > 3994??? 
/srcfilesys/fileset/filename > > $ stat /srcfilesys/fileset/filename > ? File: ?/srcfilesys/fileset/filename? > ? Size: 3994??????????? Blocks: 128??????? IO Block: 1048576 regular file > Device: 2ah/42d Inode: 23266095??? Links: 1 > Access: (0660/-rw-rw----)? Uid: (73018/ cpunnoo)?? Gid: (50070/? dc-rti) > Context: system_u:object_r:unlabeled_t:s0 > Access: 2021-05-12 20:10:13.814459305 +0200 > Modify: 2020-07-16 11:08:41.631006000 +0200 > Change: 2020-07-16 11:08:41.630896177 +0200 > ?Birth: - > Hello, This looks like the sub-block overhead. If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). The sub-block is the minimum disk allocation for files (if the file content is too large to be kept in the inode, when that is supported on the specific GPFS filesystem). The "Blocks" value displayed by "stat" is in 512 bytes unit, so 128*512 = 65536 (which is consistent with "du"): two 32 kiB sub-blocks due to data replication. The "--apparent-size" option to "du" uses the user visible size not the actual disk usage (per the man page), so 3994 is also consistent w/ "stat" output. AFAIK, GPFS space quotas count the sub-blocks not the apparent sizes, so again this would be consistent with the overhead. Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | From tortay at cc.in2p3.fr Tue Jun 1 15:56:42 2021 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Tue, 1 Jun 2021 16:56:42 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> Message-ID: <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> On 6/1/21 4:26 PM, Ulrich Sibiller wrote: [...] > ) > > While trying to understand what's going on here I found this on the > source file system (which is valid for all files, with different number > of course): > > $ du --block-size 1 /srcfilesys/fileset/filename > 65536?? /srcfilesys/fileset/filename > > $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename > 3994??? /srcfilesys/fileset/filename > > $ stat /srcfilesys/fileset/filename > ? File: ?/srcfilesys/fileset/filename? > ? Size: 3994??????????? Blocks: 128??????? IO Block: 1048576 regular file > Device: 2ah/42d Inode: 23266095??? Links: 1 > Access: (0660/-rw-rw----)? Uid: (73018/ cpunnoo)?? Gid: (50070/? dc-rti) > Context: system_u:object_r:unlabeled_t:s0 > Access: 2021-05-12 20:10:13.814459305 +0200 > Modify: 2020-07-16 11:08:41.631006000 +0200 > Change: 2020-07-16 11:08:41.630896177 +0200 > ?Birth: - > Hello, This looks like the sub-block overhead. If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). The sub-block is the minimum disk allocation for files (if the file content is too large to be kept in the inode, when that is supported on the specific GPFS filesystem). The "Blocks" value displayed by "stat" is in 512 bytes unit, so 128*512 = 65536 (which is consistent with "du"): two 32 kiB sub-blocks due to data replication. The "--apparent-size" option to "du" uses the user visible size not the actual disk usage (per the man page), so 3994 is also consistent w/ "stat" output. 
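To reproduce those numbers from the shell, a quick check along these lines (GNU coreutils assumed; the path is the example file from the original post):

# apparent size (what du --apparent-size sums) vs. allocated size (what plain du and the quota see)
f=/srcfilesys/fileset/filename
apparent=$(stat -c %s "$f")                                # 3994 bytes in the example
allocated=$(( $(stat -c %b "$f") * $(stat -c %B "$f") ))   # 128 * 512 = 65536, one 32 KiB sub-block per data replica
echo "apparent=$apparent allocated=$allocated"

# the same two views summed over the whole fileset, in bytes
du -sB1 --apparent-size /srcfilesys/fileset                # adds up to ~456.569G in this thread
du -sB1 /srcfilesys/fileset                                # adds up to ~799.986G in this thread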
AFAIK, GPFS space quotas count the sub-blocks not the apparent sizes, so again this would be consistent with the overhead. Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | From krajaram at geocomputing.net Tue Jun 1 17:08:51 2021 From: krajaram at geocomputing.net (Kumaran Rajaram) Date: Tue, 1 Jun 2021 16:08:51 +0000 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> Message-ID: Hi, >> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. -T /srcfilesys Default mount point -V 16.00 (4.2.2.0) Current file system version 14.10 (4.1.0.4) Original file system version --create-time Tue Feb 3 11:46:10 2015 File system creation time -B 1048576 Block size -f 32768 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 32 Number of subblocks per full block The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. -T /dstfilesys Default mount point -V 23.00 (5.0.5.0) File system version --create-time Tue May 11 16:51:27 2021 File system creation time -B 4194304 Block size -f 8192 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 512 Number of subblocks per full block Hope this helps, -Kums -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Loic Tortay Sent: Tuesday, June 1, 2021 10:57 AM To: gpfsug main discussion list ; Ulrich Sibiller ; gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] du --apparent-size and quota On 6/1/21 4:26 PM, Ulrich Sibiller wrote: [...] > ) > > While trying to understand what's going on here I found this on the > source file system (which is valid for all files, with different > number of course): > > $ du --block-size 1 /srcfilesys/fileset/filename > 65536?? /srcfilesys/fileset/filename > > $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename > 3994??? /srcfilesys/fileset/filename > > $ stat /srcfilesys/fileset/filename > ? File: ?/srcfilesys/fileset/filename? > ? Size: 3994??????????? Blocks: 128??????? IO Block: 1048576 regular > file > Device: 2ah/42d Inode: 23266095??? Links: 1 > Access: (0660/-rw-rw----)? Uid: (73018/ cpunnoo)?? Gid: (50070/? > dc-rti) > Context: system_u:object_r:unlabeled_t:s0 > Access: 2021-05-12 20:10:13.814459305 +0200 > Modify: 2020-07-16 11:08:41.631006000 +0200 > Change: 2020-07-16 11:08:41.630896177 +0200 > ?Birth: - > Hello, This looks like the sub-block overhead. If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). The sub-block is the minimum disk allocation for files (if the file content is too large to be kept in the inode, when that is supported on the specific GPFS filesystem). The "Blocks" value displayed by "stat" is in 512 bytes unit, so 128*512 = 65536 (which is consistent with "du"): two 32 kiB sub-blocks due to data replication. 
The "--apparent-size" option to "du" uses the user visible size not the actual disk usage (per the man page), so 3994 is also consistent w/ "stat" output. AFAIK, GPFS space quotas count the sub-blocks not the apparent sizes, so again this would be consistent with the overhead. Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From krajaram at geocomputing.net Tue Jun 1 17:08:51 2021 From: krajaram at geocomputing.net (Kumaran Rajaram) Date: Tue, 1 Jun 2021 16:08:51 +0000 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> Message-ID: Hi, >> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. -T /srcfilesys Default mount point -V 16.00 (4.2.2.0) Current file system version 14.10 (4.1.0.4) Original file system version --create-time Tue Feb 3 11:46:10 2015 File system creation time -B 1048576 Block size -f 32768 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 32 Number of subblocks per full block The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. -T /dstfilesys Default mount point -V 23.00 (5.0.5.0) File system version --create-time Tue May 11 16:51:27 2021 File system creation time -B 4194304 Block size -f 8192 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 512 Number of subblocks per full block Hope this helps, -Kums -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Loic Tortay Sent: Tuesday, June 1, 2021 10:57 AM To: gpfsug main discussion list ; Ulrich Sibiller ; gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] du --apparent-size and quota On 6/1/21 4:26 PM, Ulrich Sibiller wrote: [...] > ) > > While trying to understand what's going on here I found this on the > source file system (which is valid for all files, with different > number of course): > > $ du --block-size 1 /srcfilesys/fileset/filename > 65536?? /srcfilesys/fileset/filename > > $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename > 3994??? /srcfilesys/fileset/filename > > $ stat /srcfilesys/fileset/filename > ? File: ?/srcfilesys/fileset/filename? > ? Size: 3994??????????? Blocks: 128??????? IO Block: 1048576 regular > file > Device: 2ah/42d Inode: 23266095??? Links: 1 > Access: (0660/-rw-rw----)? Uid: (73018/ cpunnoo)?? Gid: (50070/? > dc-rti) > Context: system_u:object_r:unlabeled_t:s0 > Access: 2021-05-12 20:10:13.814459305 +0200 > Modify: 2020-07-16 11:08:41.631006000 +0200 > Change: 2020-07-16 11:08:41.630896177 +0200 > ?Birth: - > Hello, This looks like the sub-block overhead. If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). 
The sub-block is the minimum disk allocation for files (if the file content is too large to be kept in the inode, when that is supported on the specific GPFS filesystem). The "Blocks" value displayed by "stat" is in 512 bytes unit, so 128*512 = 65536 (which is consistent with "du"): two 32 kiB sub-blocks due to data replication. The "--apparent-size" option to "du" uses the user visible size not the actual disk usage (per the man page), so 3994 is also consistent w/ "stat" output. AFAIK, GPFS space quotas count the sub-blocks not the apparent sizes, so again this would be consistent with the overhead. Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From damir.krstic at gmail.com Tue Jun 1 17:48:26 2021 From: damir.krstic at gmail.com (Damir Krstic) Date: Tue, 1 Jun 2021 11:48:26 -0500 Subject: [gpfsug-discuss] CVE-2021-29740 Message-ID: IBM posted a security bulletin for the spectrum scale (CVE-2021-29740). Not a lot of detail provided in that bulletin. Has anyone installed this fix? Does anyone have more information about it? Thanks, Damir -------------- next part -------------- An HTML attachment was scrubbed... URL: From knop at us.ibm.com Wed Jun 2 04:51:57 2021 From: knop at us.ibm.com (Felipe Knop) Date: Wed, 2 Jun 2021 03:51:57 +0000 Subject: [gpfsug-discuss] CVE-2021-29740 In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Wed Jun 2 11:16:09 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 2 Jun 2021 12:16:09 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> Message-ID: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). > > Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. > > -T /srcfilesys Default mount point > -V 16.00 (4.2.2.0) Current file system version > 14.10 (4.1.0.4) Original file system version > --create-time Tue Feb 3 11:46:10 2015 File system creation time > -B 1048576 Block size > -f 32768 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 32 Number of subblocks per full block > > > The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. > > > -T /dstfilesys Default mount point > -V 23.00 (5.0.5.0) File system version > --create-time Tue May 11 16:51:27 2021 File system creation time > -B 4194304 Block size > -f 8192 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect a lower disk usage instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a ~13% increase! 
> Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. My rsync is using -AHS, so this should not be relevant here. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From u.sibiller at science-computing.de Wed Jun 2 11:16:09 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 2 Jun 2021 12:16:09 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> Message-ID: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). > > Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. > > -T /srcfilesys Default mount point > -V 16.00 (4.2.2.0) Current file system version > 14.10 (4.1.0.4) Original file system version > --create-time Tue Feb 3 11:46:10 2015 File system creation time > -B 1048576 Block size > -f 32768 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 32 Number of subblocks per full block > > > The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. > > > -T /dstfilesys Default mount point > -V 23.00 (5.0.5.0) File system version > --create-time Tue May 11 16:51:27 2021 File system creation time > -B 4194304 Block size > -f 8192 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect a lower disk usage instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a ~13% increase! > Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. My rsync is using -AHS, so this should not be relevant here. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. 
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From jonathan.buzzard at strath.ac.uk Wed Jun 2 12:09:33 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 2 Jun 2021 12:09:33 +0100 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: On 02/06/2021 11:16, Ulrich Sibiller wrote: [SNIP] > > My rsync is using -AHS, so this should not be relevant here. > I wonder have you done more than one rsync? If so are you using --delete? If not and the source fileset has changed then you will be accumulating files at the destination and it would explain the larger size. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From u.sibiller at science-computing.de Wed Jun 2 12:44:08 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 2 Jun 2021 13:44:08 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: On 6/2/21 1:09 PM, Jonathan Buzzard wrote: >> My rsync is using -AHS, so this should not be relevant here. > > I wonder have you done more than one rsync? If so are you using --delete? > > If not and the source fileset has changed then you will be accumulating > files at the destination and it would explain the larger size. Yes, of course I have been using -delete (and -delete-excluded) ;-) Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From dugan at bu.edu Wed Jun 2 13:22:55 2021 From: dugan at bu.edu (Dugan, Michael J) Date: Wed, 2 Jun 2021 12:22:55 +0000 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> , Message-ID: Do you have sparse files on the first filesystem? Since the second filesystem has a larger blocksize than the first one, the copied file may not be sparse on the second filesystem. I think gpfs only supports holes that line up will a full filesystem block. --Mike Dugan -- Michael J. 
Dugan Manager of Systems Programming and Administration Research Computing Services | IS&T | Boston University 617-358-0030 dugan at bu.edu http://www.bu.edu/tech ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Ulrich Sibiller Sent: Wednesday, June 2, 2021 7:44 AM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] du --apparent-size and quota On 6/2/21 1:09 PM, Jonathan Buzzard wrote: >> My rsync is using -AHS, so this should not be relevant here. > > I wonder have you done more than one rsync? If so are you using --delete? > > If not and the source fileset has changed then you will be accumulating > files at the destination and it would explain the larger size. Yes, of course I have been using -delete (and -delete-excluded) ;-) Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Wed Jun 2 15:12:52 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 2 Jun 2021 22:12:52 +0800 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: Hi, The data and metadata replications are 2 on both source and destination filesystems, so from: $ mmrepquota -j srcfilesys | grep fileset srcfileset FILESET 800 800 800 0 none | 863 0 0 0 none $ mmrepquota -j dstfilesys | grep fileset fileset root FILESET 457 400 400 0 none | 853 0 0 0 none the quota data should be changed from 800G to 457G (or 400G to 228.5G), after "rsync -AHS". Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Ulrich Sibiller To: Kumaran Rajaram , gpfsug main discussion list , "gpfsug-discuss at gpfsug.org" Date: 06/02/2021 06:16 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] du --apparent-size and quota Sent by: gpfsug-discuss-bounces at spectrumscale.org On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). 
> > Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. > > -T /srcfilesys Default mount point > -V 16.00 (4.2.2.0) Current file system version > 14.10 (4.1.0.4) Original file system version > --create-time Tue Feb 3 11:46:10 2015 File system creation time > -B 1048576 Block size > -f 32768 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 32 Number of subblocks per full block > > > The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. > > > -T /dstfilesys Default mount point > -V 23.00 (5.0.5.0) File system version > --create-time Tue May 11 16:51:27 2021 File system creation time > -B 4194304 Block size > -f 8192 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect a lower disk usage instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a ~13% increase! > Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. My rsync is using -AHS, so this should not be relevant here. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From scale at us.ibm.com Wed Jun 2 15:12:52 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 2 Jun 2021 22:12:52 +0800 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: Hi, The data and metadata replications are 2 on both source and destination filesystems, so from: $ mmrepquota -j srcfilesys | grep fileset srcfileset FILESET 800 800 800 0 none | 863 0 0 0 none $ mmrepquota -j dstfilesys | grep fileset fileset root FILESET 457 400 400 0 none | 853 0 0 0 none the quota data should be changed from 800G to 457G (or 400G to 228.5G), after "rsync -AHS". Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. 
If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Ulrich Sibiller To: Kumaran Rajaram , gpfsug main discussion list , "gpfsug-discuss at gpfsug.org" Date: 06/02/2021 06:16 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] du --apparent-size and quota Sent by: gpfsug-discuss-bounces at spectrumscale.org On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). > > Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. > > -T /srcfilesys Default mount point > -V 16.00 (4.2.2.0) Current file system version > 14.10 (4.1.0.4) Original file system version > --create-time Tue Feb 3 11:46:10 2015 File system creation time > -B 1048576 Block size > -f 32768 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 32 Number of subblocks per full block > > > The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. > > > -T /dstfilesys Default mount point > -V 23.00 (5.0.5.0) File system version > --create-time Tue May 11 16:51:27 2021 File system creation time > -B 4194304 Block size > -f 8192 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect a lower disk usage instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a ~13% increase! > Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. My rsync is using -AHS, so this should not be relevant here. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From juergen.hannappel at desy.de Wed Jun 2 16:26:07 2021 From: juergen.hannappel at desy.de (Hannappel, Juergen) Date: Wed, 2 Jun 2021 17:26:07 +0200 (CEST) Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: <191496938.44228914.1622647567054.JavaMail.zimbra@desy.de> Hi, mmrepquota reports without the --block-size parameter the size in units of 1KiB, so (if no ill-advised copy-paste editing confuses us) we are not talking about 400GiB but 400KiB. With just 863 files (from the inode part of the repquota output) and therefore 0.5KiB/file on average that could be explained by the sub-block size(although many files should vanish in the inodes). If it's 400GiB in 863 files with 500MiB/File the subblock overhead would not matter at all! > From: "IBM Spectrum Scale" > To: "gpfsug main discussion list" > Cc: gpfsug-discuss-bounces at spectrumscale.org, gpfsug-discuss at gpfsug.org > Sent: Wednesday, 2 June, 2021 16:12:52 > Subject: Re: [gpfsug-discuss] du --apparent-size and quota > Hi, > The data and metadata replications are 2 on both source and destination > filesystems, so from: > $ mmrepquota -j srcfilesys | grep fileset > srcfileset FILESET 800 800 800 0 none | 863 0 0 > 0 none > $ mmrepquota -j dstfilesys | grep fileset > fileset root FILESET 457 400 400 0 none | 853 0 > 0 0 none > the quota data should be changed from 800G to 457G (or 400G to 228.5G), after > "rsync -AHS". > Regards, The Spectrum Scale (GPFS) team > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale (GPFS), > then please post it to the public IBM developerWroks Forum at [ > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > | > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > ] . > If your query concerns a potential software error in Spectrum Scale (GPFS) and > you have an IBM software maintenance contract please contact 1-800-237-5511 in > the United States or your local IBM Service Center in other countries. > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > Ulrich Sibiller ---06/02/2021 06:16:22 PM---On 6/1/21 6:08 PM, Kumaran Rajaram > wrote: >>> If I'm not mistaken even with SS5 created filesystems, > From: Ulrich Sibiller > To: Kumaran Rajaram , gpfsug main discussion list > , "gpfsug-discuss at gpfsug.org" > > Date: 06/02/2021 06:16 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] du --apparent-size and quota > Sent by: gpfsug-discuss-bounces at spectrumscale.org > On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size > >>> implies 32 kiB sub blocks (32 sub-blocks). >> Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x > > which supports only 32 sub-blocks per block. 
> > -T /srcfilesys Default mount point > > -V 16.00 (4.2.2.0) Current file system version > > 14.10 (4.1.0.4) Original file system version > > --create-time Tue Feb 3 11:46:10 2015 File system creation time > > -B 1048576 Block size > > -f 32768 Minimum fragment (subblock) size in bytes > > --subblocks-per-full-block 32 Number of subblocks per full block >> The /dstfilesys was created with GPFS version 5.x which support greater than 32 >> subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with > > 8KiB subblock size since file-system blocksize is 4MiB. > > -T /dstfilesys Default mount point > > -V 23.00 (5.0.5.0) File system version > > --create-time Tue May 11 16:51:27 2021 File system creation time > > -B 4194304 Block size > > -f 8192 Minimum fragment (subblock) size in bytes > > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect > a lower disk usage > instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a > ~13% increase! >> Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, >> are not handled by "rsync" unless you specify "-H") and in some cases spare > > files can also explain the differences. > My rsync is using -AHS, so this should not be relevant here. > Uli > -- > Science + Computing AG > Vorstandsvorsitzender/Chairman of the board of management: > Dr. Martin Matzke > Vorstand/Board of Management: > Matthias Schempp, Sabine Hohenstein > Vorsitzender des Aufsichtsrats/ > Chairman of the Supervisory Board: > Philippe Miltin > Aufsichtsrat/Supervisory Board: > Martin Wibbe, Ursula Morgenstern > Sitz/Registered Office: Tuebingen > Registergericht/Registration Court: Stuttgart > Registernummer/Commercial Register No.: HRB 382196 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > [ http://gpfsug.org/mailman/listinfo/gpfsug-discuss | > http://gpfsug.org/mailman/listinfo/gpfsug-discuss ] > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1711 bytes Desc: S/MIME Cryptographic Signature URL: From u.sibiller at science-computing.de Wed Jun 2 16:56:25 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 2 Jun 2021 17:56:25 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: <2ca3f69c-ae50-1bc4-4dd2-58e42f983105@science-computing.de> On 6/2/21 4:12 PM, IBM Spectrum Scale wrote: > The data and metadata replications are 2 on both source and destination filesystems, so from: > > $ mmrepquota -j srcfilesys | grep fileset > srcfileset FILESET 800 800 800 0 none | 863 0 0 > 0 none > > $ mmrepquota -j dstfilesys | grep fileset > fileset root FILESET 457 400 400 0 none | 853 0 > 0 0 none > > the quota data should be changed from 800G to 457G (or 400G to 228.5G), after "rsync -AHS". Why? Did you notice that on the dstfilesys we have ignoreReplicationOnStatfs yes IgnoreReplicaSpaceOnStat yes ignoreReplicationForQuota yes while the srcfilesys has ignoreReplicaSpaceOnStat 0 ignoreReplicationForQuota 0 ignoreReplicationOnStatfs 0 ? Changing the quota limit to 457 on the dstfilesys will surely help for the user but I still would like to understand why that happens? Losing > 10% of space when migrating to a newer filesystem is not something you'd expect. dstfilesys is ~6PB, so this means we lose more than 600TB, which is a serious issue I'd like to understand in detail (and maybe take countermeasures). > Do you have sparse files on the first filesystem? Since the second filesystem > has a larger blocksize than the first one, the copied file may not be sparse on the > second filesystem. I think gpfs only supports holes that line up will a full filesystem > block.
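A quick way to check this is to compare allocated bytes with the apparent size per file; files whose allocation (net of replication) is smaller than their apparent size are sparse. This is only a sketch, using the path and the data replication factor of 2 from this thread; everything else is illustrative:

~~~
# Sketch: flag files that are sparse on the source, i.e. allocated bytes < apparent size.
# %b = st_blocks in 512-byte units, %s = st_size in bytes, %p = path (GNU find).
# On srcfilesys stat counts both data replicas (ignoreReplicaSpaceOnStat=0), hence the /2.
find /srcfilesys/fileset -type f -printf '%b %s %p\n' | \
    awk '($1 * 512) / 2 < $2 {print $3}'

# For comparison with du vs. du --apparent-size, the per-fileset totals:
find /srcfilesys/fileset -type f -printf '%b %s\n' | \
    awk '{alloc += $1 * 512; app += $2}
         END {printf "allocated %.1f GiB, apparent %.1f GiB\n", alloc/2^30, app/2^30}'
~~~

If that list comes back empty, sparse files cannot explain the difference.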
Maybe that's an issue, but I a) use rsync -S so I guess the sparse files will be handled in the most compatible way b) have no idea how to check this reliably > mmrepquota reports without the --block-size parameter the size in units of 1KiB, so (if no ill-advised copy-paste editing confuses us) we are not talking about 400GiB but 400KiB. > With just 863 files (from the inode part of the repquota output) and therefore 0.5KiB/file on average that could be explained by the sub-block size(although many files should vanish in the inodes). > If it's 400GiB in 863 files with 500MiB/File the subblock overhead would not matter at all! Upps, you are right in assuming a copy-and-paste accident, I had called mmrepquota with --block-size G. So the values we are talking about are really GiB, not KiB. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From jonathan.buzzard at strath.ac.uk Fri Jun 4 10:12:15 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Fri, 4 Jun 2021 10:12:15 +0100 Subject: [gpfsug-discuss] CVE-2021-29740 In-Reply-To: References: Message-ID: <6aae1c6e-d46b-2fdc-daa6-be8d92882cb4@strath.ac.uk> On 01/06/2021 17:48, Damir Krstic wrote: > IBM posted a security bulletin for the spectrum?scale (CVE-2021-29740). > Not a lot of detail provided in that bulletin. Has anyone installed this > fix? Does anyone have more information about it? > Anyone know how quickly Lenovo are at putting up security fixes like this? Two days in and there is still nothing to download, which in the current security threat environment we are all operating in is bordering on unacceptable. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From leonardo.sala at psi.ch Mon Jun 7 13:46:57 2021 From: leonardo.sala at psi.ch (Leonardo Sala) Date: Mon, 7 Jun 2021 14:46:57 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster Message-ID: Hallo, we do have multiple bare-metal GPFS clusters with infiniband fabric, and I am actually considering adding some VMs in the mix, to perform admin tasks (so that the bare metal servers do not need passwordless ssh keys) and quorum nodes. Has anybody tried this? What could be the drawbacks / issues at GPFS level? Thanks a lot for the insights! cheers leo -- Paul Scherrer Institut Dr. Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjdoherty at yahoo.com Mon Jun 7 14:28:51 2021 From: jjdoherty at yahoo.com (Jim Doherty) Date: Mon, 7 Jun 2021 13:28:51 +0000 (UTC) Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: <468451058.2156544.1623072531179@mail.yahoo.com> Hello,? I have seen people do this to move manager node traffic off of the NSD servers,? 
it is one way to help scale the cluster as the manager RPC traffic doesn't need to contend with the NSD servers for bandwidth.? ? ?If you want the nodes to be able to participate in disk maintenance? (mmfsck,? ? mmrestripefs)? make sure they have enough pagepool? as a small pagepool could impact the performance of these operations.?? Jim Doherty? On Monday, June 7, 2021, 08:55:49 AM EDT, Leonardo Sala wrote: Hallo, we do have multiple bare-metal GPFS clusters with infiniband fabric, and I am actually considering adding some VMs in the mix, to perform admin tasks (so that the bare metal servers do not need passwordless ssh keys) and quorum nodes. Has anybody tried this? What could be the drawbacks / issues at GPFS level? Thanks a lot for the insights! cheers leo -- Paul Scherrer Institut Dr. Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Paul.Sanchez at deshaw.com Mon Jun 7 14:36:00 2021 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Mon, 7 Jun 2021 13:36:00 +0000 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: Hi Leo, We use VMs for Spectrum Scale all of the time (including VM-based NAS clusters that span multiple sites) and all of the cloud-based offerings do as well, so it?s pretty clearly a thing that people are using. (Note: all of my experience is on Ethernet fabrics, so keep that in mind when I?m discussing networking.) But you?re right that there are a few pitfalls, such as? 1. Licensing. The traditional PVU license model discouraged adding machines to clusters and encouraged the concentration of server roles in a way that didn?t align with best practices. If you?re on capacity based licensing then this issue is moot. (We?ve been in that model for ages, and so consequently we have years of experience with GPFS and VMs. But with PVUs we probably wouldn?t have gone this way.) 2. Virtualized networking can be flaky. In particular, I?ve found SR-IOV to be unreliable. Suddenly in the middle of a TCP session you might see GPFS complain about ?Unexpected data in message. Header dump: cccccccc cccc cccc?? from a VM whose virtual network interface has gone awry and necessitates a reboot, and which can leave corrupted data on disk when this happens, requiring you to offline mmfsck and/or spelunk through a damaged filesystem and backups to recover. Based on this, I would recommend the following: a. Do NOT use SR-IOV. If you?re using KVM then just stick with virtio (vnet and bridge interfaces). b. DO enable all of the checksum protection you can get on the cluster (e.g. nsdCksumTraditional=yes). This can act as a backstop against network reliability issues and in practice on modern machines doesn?t appear to be as big of a performance hit as it once was. (I?d recommend this for everyone honestly.) c. Think about increasing your replication factor if you?re running filesystems with only one copy of data/metadata. One of the strengths of GPFS is its support for replication, both as a throughput scaling mechanism and for redundancy, and that redundancy can buy you a lot of forgiveness if things go wrong. 3. 
Sizing. Do not be too stingy with RAM and CPU allocations for your guest nodes. Scale is excellent at multithreading for things like parallel inode scan, prefetching, etc, and remember that your quorum nodes will be token managers by default unless you assign the manager roles elsewhere, and may need to have enough RAM to support their share of the token serving workload. A stable cluster is one in which the servers aren?t thrashing for a lack of resources. Others may have additional experience and best practices to share, which would be great since I don?t see this trend going away any time soon. Good luck, Paul From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Leonardo Sala Sent: Monday, June 7, 2021 08:47 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster This message was sent by an external party. Hallo, we do have multiple bare-metal GPFS clusters with infiniband fabric, and I am actually considering adding some VMs in the mix, to perform admin tasks (so that the bare metal servers do not need passwordless ssh keys) and quorum nodes. Has anybody tried this? What could be the drawbacks / issues at GPFS level? Thanks a lot for the insights! cheers leo -- Paul Scherrer Institut Dr. Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From wallyd at us.ibm.com Mon Jun 7 16:03:13 2021 From: wallyd at us.ibm.com (Wally Dietrich) Date: Mon, 7 Jun 2021 15:03:13 +0000 Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale Message-ID: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> Hi. Is there documentation about tuning DB2 to perform well when using Spectrum Scale file systems? I'm interested in tuning both DB2 and Spectrum Scale for high performance. I'm using a stretch cluster for Disaster Recover (DR). I've found a document, but the last update was in 2013 and GPFS has changed considerably since then. Wally Dietrich wallyd at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Mon Jun 7 16:24:12 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 7 Jun 2021 16:24:12 +0100 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: On 07/06/2021 13:46, Leonardo Sala wrote: > > Hallo, > > we do have multiple bare-metal GPFS clusters with infiniband fabric, and > I am actually considering adding some VMs in the mix, to perform admin > tasks (so that the bare metal servers do not need passwordless ssh keys) > and quorum nodes. Has anybody tried this? What could be the drawbacks / > issues at GPFS level? > Unless you are doing some sort of pass through of Infiniband adapters to the VM's you will need to create an Infiniband/Ethernet router. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From janfrode at tanso.net Mon Jun 7 20:49:02 2021 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Mon, 7 Jun 2021 21:49:02 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: I?ve done this a few times. 
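For reference, the XML body of routed34.xml in the example further down was stripped when the mail was archived; a libvirt routed-network definition of roughly this shape is what such a setup uses. The name routed34 and the bond0 device come from the example below, while the bridge name and addresses here are purely illustrative:

~~~
# Illustrative reconstruction only -- the original XML was lost in the archive.
# Assumes bond0 is the IP-over-IB bond on the hypervisor and that
# 192.168.34.0/24 is a free internal subnet for the VMs.
cat <<EOF > routed34.xml
<network>
  <name>routed34</name>
  <forward mode='route' dev='bond0'/>
  <bridge name='virbr34' stp='on' delay='0'/>
  <ip address='192.168.34.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.34.10' end='192.168.34.100'/>
    </dhcp>
  </ip>
</network>
EOF
~~~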
Once with IPoIB as daemon network, and then created a separate routed network on the hypervisor to bridge (?) between VM and IPoIB network. Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor: ???????? To give the VMs access to the daemon network, we need create an internal network for the VMs, that is then routed into the IPoIB network on the hypervisor. ~~~ # cat < routed34.xml routed34 EOF # virsh net-define routed34.xml Network routed34 defined from routed34.xml # virsh net-start routed34 Network routed34 started # virsh net-autostart routed34 Network routed34 marked as autostarted # virsh net-list --all Name State Autostart Persistent ---------------------------------------------------------- default active yes yes routed34 active yes yes ~~~ ????????- I see no issue with it ? but beware that the FAQ lists some required tunings if the VM is to host desconly disks (paniconiohang?)? -jf man. 7. jun. 2021 kl. 14:55 skrev Leonardo Sala : > Hallo, > > we do have multiple bare-metal GPFS clusters with infiniband fabric, and I > am actually considering adding some VMs in the mix, to perform admin tasks > (so that the bare metal servers do not need passwordless ssh keys) and > quorum nodes. Has anybody tried this? What could be the drawbacks / issues > at GPFS level? > > Thanks a lot for the insights! > > cheers > > leo > > -- > Paul Scherrer Institut > Dr. Leonardo Sala > Group Leader High Performance Computing > Deputy Section Head Science IT > Science IT > WHGA/036 > Forschungstrasse 111 > 5232 Villigen PSI > Switzerland > > Phone: +41 56 310 3369leonardo.sala at psi.chwww.psi.ch > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From heinrich.billich at id.ethz.ch Tue Jun 8 10:04:19 2021 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Tue, 8 Jun 2021 09:04:19 +0000 Subject: [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule Message-ID: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch> Hello, A policy run with ?-I defer? and a placement rule did almost double the metadata usage of a filesystem. This filled the metadata disks to a critical level. I would like to understand if this is to be expected and ?as designed? or if I face some issue or bug. ?I hope a subsequent run of ?mmrstripefs -p? will reduce the metadata usage again. Thank you I want to move all data to a new storage pool and did run a policy like RULE 'migrate_to_Data' ? MIGRATE ??? WEIGHT(0) ? TO POOL 'Data' for each fileset with ? mmapplypolicy -I defer Next I want to actually move the data with ? mmrestripefs -p After the policy run metadata usage increased from 2.06TiB to 3.53TiB and filled the available metadata space by >90%. This is somewhat surprising. Will the following ?run of ?mmrestripefs -p? reduce the usage again, when the files are not illplaced any more? The number of used Inodes did not change noticeably during the policy run. Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Thank you, Heiner Some details # mmlsfs? fsxxxx -f -i -B -I -m -M -r -R -V flag??????????????? value??????????????????? description ------------------- ------------------------ ----------------------------------- -f???????????????? 8192???????????????????? 
Minimum fragment (subblock) size in bytes (system pool) ??????????????????? 32768??????????????????? Minimum fragment (subblock) size in bytes (other pools) -i???????????????? 4096???????????????????? Inode size in bytes -B???????????????? 1048576????????????????? Block size (system pool) ??????????????????? 4194304????????????????? Block size (other pools) -I???????????????? 32768??????????????????? Indirect block size in bytes -m???????????????? 1??????????????????????? Default number of metadata replicas -M???????????????? 2??????????????????????? Maximum number of metadata replicas -r???????????????? 1?????????????????????? ?Default number of data replicas -R???????????????? 2??????????????????????? Maximum number of data replicas -V???????????????? 23.00 (5.0.5.0)????????? Current file system version ??????????????????? 19.01 (5.0.1.0)????????? Original file system version Inode Information ----------------- Total number of used inodes in all Inode spaces:????????? 398502837 Total number of free inodes in all Inode spaces:?????????? 94184267 Total number of allocated inodes in all Inode spaces:???? 492687104 Total of Maximum number of inodes in all Inode spaces:??? 916122880 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5254 bytes Desc: not available URL: From scale at us.ibm.com Wed Jun 9 13:54:36 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 9 Jun 2021 20:54:36 +0800 Subject: [gpfsug-discuss] =?utf-8?q?Metadata_usage_almost_doubled_after_po?= =?utf-8?q?licy_run=09with_migration_rule?= In-Reply-To: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch> References: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch> Message-ID: Hi Billich, >Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Basically a migration policy run with -I defer would just simply mark the files as illPlaced which would not cause metadata extension for such files (e.g., inode size is fixed after file system creation). Instead, I'm just wondering about your placement rules, which are existing rules or newly installed rules? Which could set EAs to newly created files and may cause increased metadata size. Also any new EAs are inserted for files? Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 2021/06/08 05:19 PM Subject: [EXTERNAL] [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, A policy run with ?-I defer? and a placement rule did almost double the metadata usage of a filesystem. 
This filled the metadata disks to a critical level. I would like to understand if this is to be expected and ?as designed? or if I face some issue or bug. ?I hope a subsequent run of ?mmrstripefs -p? will reduce the metadata usage again. Thank you I want to move all data to a new storage pool and did run a policy like RULE 'migrate_to_Data' ? MIGRATE ??? WEIGHT(0) ? TO POOL 'Data' for each fileset with ? mmapplypolicy -I defer Next I want to actually move the data with ? mmrestripefs -p After the policy run metadata usage increased from 2.06TiB to 3.53TiB and filled the available metadata space by >90%. This is somewhat surprising. Will the following ?run of ?mmrestripefs -p? reduce the usage again, when the files are not illplaced any more? The number of used Inodes did not change noticeably during the policy run. Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Thank you, Heiner Some details # mmlsfs? fsxxxx -f -i -B -I -m -M -r -R -V flag??????????????? value??????????????????? description ------------------- ------------------------ ----------------------------------- -f???????????????? 8192???????????????????? Minimum fragment (subblock) size in bytes (system pool) ??????????????????? 32768??????????????????? Minimum fragment (subblock) size in bytes (other pools) -i???????????????? 4096???????????????????? Inode size in bytes -B???????????????? 1048576????????????????? Block size (system pool) ??????????????????? 4194304????????????????? Block size (other pools) -I???????????????? 32768??????????????????? Indirect block size in bytes -m???????????????? 1??????????????????????? Default number of metadata replicas -M???????????????? 2??????????????????????? Maximum number of metadata replicas -r???????????????? 1?????????????????????? ?Default number of data replicas -R???????????????? 2??????????????????????? Maximum number of data replicas -V???????????????? 23.00 (5.0.5.0)????????? Current file system version ??????????????????? 19.01 (5.0.1.0)????????? Original file system version Inode Information ----------------- Total number of used inodes in all Inode spaces:????????? 398502837 Total number of free inodes in all Inode spaces:?????????? 94184267 Total number of allocated inodes in all Inode spaces:???? 492687104 Total of Maximum number of inodes in all Inode spaces:??? 916122880 [attachment "smime.p7s" deleted by Hai Zhong HZ Zhou/China/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From jonathan.buzzard at strath.ac.uk Wed Jun 9 21:28:07 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 9 Jun 2021 21:28:07 +0100 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin Message-ID: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> So you need to apply a kernel update and that means a new gpfs.gplbin :-( So after going around the houses with several different approaches on this I have finally settled on what IMHO is a most elegant method of ensuring the right gpfs.gplbin version is installed for the kernel that is running and thought I would share it. 
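The end result is nothing more than a small systemd drop-in for gpfs.service. Shown here expanded for readability, it is equivalent to what the commands further down create; the package name gpfs.gplbin-%v assumes you publish one gpfs.gplbin RPM per kernel release in your repository:

~~~
# /etc/systemd/system/gpfs.service.d/install-module.conf
[Service]
ExecStartPre=-/usr/bin/yum --assumeyes install gpfs.gplbin-%v
~~~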
This is assuming you don't like the look of the compile it option IBM introduced. You may well not want compilers installed on nodes for example, or you just think compiling the module on hundreds of nodes is suboptimal. This exact version works for RHEL and it's derivatives. Modify for your preferred distribution. It also assumes you have a repository setup with the relevant gpfs.gplbin package. The basics are to use the "ExecStartPre" option of a unit file in systemd. So because you don't want to be modifying the unit file provided by IBM something like the following mkdir -p /etc/systemd/system/gpfs.service.d echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf systemctl daemon-reload How it works is that the %v is a special systemd variable which is the same as "uname -r". So before it attempts to start gpfs, it attempts to install the gpfs.gplbin RPM for the kernel you are running on. If already installed this is harmless and if it's not installed it gets installed. How you set that up on your system is up to you, xCAT postscript, RPM package, or a configuration management solution all work. I have gone for a very minimal RPM I call gpfs.helper We then abuse the queuing system on the HPC cluster to schedule a "admin" priority job that runs as soon as the node becomes free, which does a yum update and then restarts the node. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From scale at us.ibm.com Thu Jun 10 11:29:13 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Thu, 10 Jun 2021 18:29:13 +0800 Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale In-Reply-To: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> References: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> Message-ID: Hi Wally, I don't see a dedicated document for DB2 from Scale document sets, however, usually the workloads of database are doing direct I/O, so those documentation sections in Scale for direct I/O should be good to review. Here I have a list about tunings for direct I/O for your reference. https://www.ibm.com/docs/zh/spectrum-scale/5.1.1?topic=fpo-configuration-tuning-database-workload s https://www.ibm.com/docs/zh/spectrum-scale/5.1.1?topic=applications-considerations-use-direct-io-o-direct https://www.ibm.com/docs/zh/spectrum-scale/5.1.1?topic=mfs-using-direct-io-file-in-gpfs-file-system Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Wally Dietrich To: "gpfsug-discuss at spectrumscale.org" Date: 2021/06/07 11:03 PM Subject: [EXTERNAL] [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi. 
Is there documentation about tuning DB2 to perform well when using Spectrum Scale file systems? I'm interested in tuning both DB2 and Spectrum Scale for high performance. I'm using a stretch cluster for Disaster Recover (DR). I've found a document, but the last update was in 2013 and GPFS has changed considerably since then. Wally Dietrich wallyd at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From jjdoherty at yahoo.com Thu Jun 10 14:42:18 2021 From: jjdoherty at yahoo.com (Jim Doherty) Date: Thu, 10 Jun 2021 13:42:18 +0000 (UTC) Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale In-Reply-To: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> References: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> Message-ID: <1697721993.3166627.1623332538820@mail.yahoo.com> I think I found the document you are talking about. In general I believe most of it still applies. I can make the following comments on it about Spectrum Scale: 1 - There was an effort to simplify Spectrum Scale tuning, and tuning of worker1Threads should be replaced by tuning workerThreads instead. Setting workerThreads, will auto-tune about 20 different Spectrum Scale configuration parameters (including worker1Threads) behind the scene. 2 - The Spectrum Scale pagepool parameter defaults to 1Gig now, but the most important thing is to make sure that you can fit all the IO into the pagepool. So if you have 512 threads * 1 MB you will need 1/2 Gig just to do disk IO, but if you use 4MB that becomes 512 * 4 = 2Gig just for disk IO. I would recommend setting the pagepool to 2x the size of this if you are using direct IO so 1 Gig or 4 Gig for the example sizes I just mentioned. 3 - One consideration that is important is sizing the initial DB2 database size correctly, and when the tablespace needs to grow, make sure it grows enough to avoid constantly increasing the tablespace. The act of growing a database throws GPFS into buffered IO which can be slower than directIO. If you need the database to grow all the time, I would avoid using direct IO and use a larger GPFS pagepool to allow it cache data. Using directIO is the better solution. Jim Doherty On Monday, June 7, 2021, 11:03:26 AM EDT, Wally Dietrich wrote: Hi. Is there documentation about tuning DB2 to perform well when using Spectrum Scale file systems? I'm interested in tuning both DB2 and Spectrum Scale for high performance. I'm using a stretch cluster for Disaster Recover (DR). I've found a document, but the last update was in 2013 and GPFS has changed considerably since then. Wally Dietrichwallyd at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skylar2 at uw.edu Thu Jun 10 14:47:33 2021 From: skylar2 at uw.edu (Skylar Thompson) Date: Thu, 10 Jun 2021 06:47:33 -0700 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> Message-ID: <20210610134733.lvfh2at7rtjoceuk@thargelion> Thanks, Jonathan, I've been thinking about how to manage this as well and like it more than version-locking the kernel. On Wed, Jun 09, 2021 at 09:28:07PM +0100, Jonathan Buzzard wrote: > > So you need to apply a kernel update and that means a new gpfs.gplbin :-( So > after going around the houses with several different approaches on this I > have finally settled on what IMHO is a most elegant method of ensuring the > right gpfs.gplbin version is installed for the kernel that is running and > thought I would share it. > > This is assuming you don't like the look of the compile it option IBM > introduced. You may well not want compilers installed on nodes for example, > or you just think compiling the module on hundreds of nodes is suboptimal. > > This exact version works for RHEL and it's derivatives. Modify for your > preferred distribution. It also assumes you have a repository setup with the > relevant gpfs.gplbin package. > > The basics are to use the "ExecStartPre" option of a unit file in systemd. > So because you don't want to be modifying the unit file provided by IBM > something like the following > > mkdir -p /etc/systemd/system/gpfs.service.d > echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install > gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf > systemctl daemon-reload > > How it works is that the %v is a special systemd variable which is the same > as "uname -r". So before it attempts to start gpfs, it attempts to install > the gpfs.gplbin RPM for the kernel you are running on. If already installed > this is harmless and if it's not installed it gets installed. > > How you set that up on your system is up to you, xCAT postscript, RPM > package, or a configuration management solution all work. I have gone for a > very minimal RPM I call gpfs.helper > > We then abuse the queuing system on the HPC cluster to schedule a "admin" > priority job that runs as soon as the node becomes free, which does a yum > update and then restarts the node. > > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His From novosirj at rutgers.edu Thu Jun 10 15:00:49 2021 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Thu, 10 Jun 2021 14:00:49 +0000 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <20210610134733.lvfh2at7rtjoceuk@thargelion> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk>, <20210610134733.lvfh2at7rtjoceuk@thargelion> Message-ID: <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> The problem with not version locking the kernel, however, is that you really need to know that the kernel you are going to is going to support the GPFS version that you are going to be running. 
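One way to guard against that is a pre-flight check after the yum update but before the reboot: only restart into the new kernel if a matching gpfs.gplbin package is installed or available. A sketch, assuming the gpfs.gplbin-<kernel release> naming used earlier in this thread and a local repository that carries the packages:

~~~
#!/bin/bash
# Sketch: after "yum update", check that the newest installed kernel has a
# matching gpfs.gplbin before the node is rebooted into it.
latest_kernel=$(rpm -q kernel --last | head -n1 | awk '{print $1}' | sed 's/^kernel-//')
if rpm -q "gpfs.gplbin-${latest_kernel}" >/dev/null 2>&1 || \
   yum --quiet list available "gpfs.gplbin-${latest_kernel}" >/dev/null 2>&1; then
    echo "gpfs.gplbin for ${latest_kernel} present or available - safe to reboot"
else
    echo "no gpfs.gplbin for ${latest_kernel} - hold the reboot" >&2
    exit 1
fi
~~~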
Typically that only becomes a problem when you cross a minor release boundary on RHEL-derivatives, but I think not always. But I suppose you can just try this on something first just to make sure, or handle it at the repository level, or something else. Sent from my iPhone > On Jun 10, 2021, at 09:48, Skylar Thompson wrote: > > ?Thanks, Jonathan, I've been thinking about how to manage this as well and > like it more than version-locking the kernel. > >> On Wed, Jun 09, 2021 at 09:28:07PM +0100, Jonathan Buzzard wrote: >> >> So you need to apply a kernel update and that means a new gpfs.gplbin :-( So >> after going around the houses with several different approaches on this I >> have finally settled on what IMHO is a most elegant method of ensuring the >> right gpfs.gplbin version is installed for the kernel that is running and >> thought I would share it. >> >> This is assuming you don't like the look of the compile it option IBM >> introduced. You may well not want compilers installed on nodes for example, >> or you just think compiling the module on hundreds of nodes is suboptimal. >> >> This exact version works for RHEL and it's derivatives. Modify for your >> preferred distribution. It also assumes you have a repository setup with the >> relevant gpfs.gplbin package. >> >> The basics are to use the "ExecStartPre" option of a unit file in systemd. >> So because you don't want to be modifying the unit file provided by IBM >> something like the following >> >> mkdir -p /etc/systemd/system/gpfs.service.d >> echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install >> gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf >> systemctl daemon-reload >> >> How it works is that the %v is a special systemd variable which is the same >> as "uname -r". So before it attempts to start gpfs, it attempts to install >> the gpfs.gplbin RPM for the kernel you are running on. If already installed >> this is harmless and if it's not installed it gets installed. >> >> How you set that up on your system is up to you, xCAT postscript, RPM >> package, or a configuration management solution all work. I have gone for a >> very minimal RPM I call gpfs.helper >> >> We then abuse the queuing system on the HPC cluster to schedule a "admin" >> priority job that runs as soon as the node becomes free, which does a yum >> update and then restarts the node. >> >> >> JAB. >> >> -- >> Jonathan A. Buzzard Tel: +44141-5483420 >> HPC System Administrator, ARCHIE-WeSt. >> University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Thu Jun 10 17:13:46 2021 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Thu, 10 Jun 2021 16:13:46 +0000 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> <20210610134733.lvfh2at7rtjoceuk@thargelion> <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> Message-ID: We manage kernel updates pretty carefully ... not least because there is a good chance MOFED will also break at the same time. We do have a similar systemd unit that tries to install from our local repos, then tries to build locally. Simon ?On 10/06/2021, 15:01, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Ryan Novosielski" wrote: The problem with not version locking the kernel, however, is that you really need to know that the kernel you are going to is going to support the GPFS version that you are going to be running. Typically that only becomes a problem when you cross a minor release boundary on RHEL-derivatives, but I think not always. But I suppose you can just try this on something first just to make sure, or handle it at the repository level, or something else. Sent from my iPhone > On Jun 10, 2021, at 09:48, Skylar Thompson wrote: > > Thanks, Jonathan, I've been thinking about how to manage this as well and > like it more than version-locking the kernel. > >> On Wed, Jun 09, 2021 at 09:28:07PM +0100, Jonathan Buzzard wrote: >> >> So you need to apply a kernel update and that means a new gpfs.gplbin :-( So >> after going around the houses with several different approaches on this I >> have finally settled on what IMHO is a most elegant method of ensuring the >> right gpfs.gplbin version is installed for the kernel that is running and >> thought I would share it. >> >> This is assuming you don't like the look of the compile it option IBM >> introduced. You may well not want compilers installed on nodes for example, >> or you just think compiling the module on hundreds of nodes is suboptimal. >> >> This exact version works for RHEL and it's derivatives. Modify for your >> preferred distribution. It also assumes you have a repository setup with the >> relevant gpfs.gplbin package. >> >> The basics are to use the "ExecStartPre" option of a unit file in systemd. >> So because you don't want to be modifying the unit file provided by IBM >> something like the following >> >> mkdir -p /etc/systemd/system/gpfs.service.d >> echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install >> gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf >> systemctl daemon-reload >> >> How it works is that the %v is a special systemd variable which is the same >> as "uname -r". So before it attempts to start gpfs, it attempts to install >> the gpfs.gplbin RPM for the kernel you are running on. If already installed >> this is harmless and if it's not installed it gets installed. 
>> >> How you set that up on your system is up to you, xCAT postscript, RPM >> package, or a configuration management solution all work. I have gone for a >> very minimal RPM I call gpfs.helper >> >> We then abuse the queuing system on the HPC cluster to schedule a "admin" >> priority job that runs as soon as the node becomes free, which does a yum >> update and then restarts the node. >> >> >> JAB. >> >> -- >> Jonathan A. Buzzard Tel: +44141-5483420 >> HPC System Administrator, ARCHIE-WeSt. >> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathan.buzzard at strath.ac.uk Thu Jun 10 19:08:47 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 10 Jun 2021 19:08:47 +0100 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> <20210610134733.lvfh2at7rtjoceuk@thargelion> <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> Message-ID: On 10/06/2021 15:00, Ryan Novosielski wrote:> > The problem with not version locking the kernel, however, is that you > really need to know that the kernel you are going to is going to > support the GPFS version that you are going to be running. Typically > that only becomes a problem when you cross a minor release boundary > on RHEL-derivatives, but I think not always. But I suppose you can > just try this on something first just to make sure, or handle it at > the repository level, or something else. > Well *everything* comes from a local repo mirror for all the GPFS nodes so I can control what goes in and when. I use a VM for building the gpfs.gplbin in advance and then test it on a single node before the main roll out. I would note that I read the actual release notes and then make a judgment on whether the kernel update actually makes it to my local mirror. It could be a just a bug fix, or the security issue might for example be in a driver which is not relevant to the platform I am managing. WiFi and Bluetooth drivers are examples from the past. The issue I found is you do a "yum update" and new kernel gets pulled in, and/or a new GPFS version. However the matching gpfs.gplbin is now missing and I wanted an automated process of insuring the right version of gpfs.gplbin is installed for whatever kernel happens to be running. Noting that this could also be at install time, which partly why I went with the gpfs.helper RPM. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG From luis.bolinches at fi.ibm.com Thu Jun 10 21:35:43 2021 From: luis.bolinches at fi.ibm.com (Luis Bolinches) Date: Thu, 10 Jun 2021 20:35:43 +0000 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From Achim.Rehor at de.ibm.com Fri Jun 11 16:26:43 2021 From: Achim.Rehor at de.ibm.com (Achim Rehor) Date: Fri, 11 Jun 2021 17:26:43 +0200 Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale In-Reply-To: <1697721993.3166627.1623332538820@mail.yahoo.com> References: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> <1697721993.3166627.1623332538820@mail.yahoo.com> Message-ID: one additional noticable change, that comes in Spectrum Scale 5.0.4.2+ and is an enhancement to what Jim just touched below. Direct IO of databases is often doing small IO into huge files. Even with very fast backend, the amount of IOs doing 4k or 64k IOs limits the bandwidth because of the sheer amount of IO. Having seen this issue, we added a feature to Spectrum Scale, that batches small IO per timeslot, in order to lessen the number of IO against the backend, and thus improving write performance. the new feature is tuned by the dioSmallSeqWriteBatching = yes[no] and will batch all smaller IO, that is dioSmallSeqWriteThreshold = [65536] or smaller in size , and dump it to disk avery aioSyncDelay = 10 (usec). That is, if the system recognizes 3 or more small Direct IOs and dioSmallSeqWriteThreshold is set, it will gather all these IOs within aioSyncDelay and do just one IO (per FS Blocksize) instead of hundreds of small IOs. For certain use cases this can dramatically improve performance. see https://www.spectrumscaleug.org/wp-content/uploads/2020/04/SSSD20DE-Spectrum-Scale-Performance-Enhancements-for-Direct-IO.pdf by Olaf Weiser Mit freundlichen Gr??en / Kind regards Achim Rehor Remote Technical Support Engineer Storage IBM Systems Storage Support - EMEA Storage Competence Center (ESCC) Spectrum Scale / Elastic Storage Server ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-170-4521194 E-Mail: Achim.Rehor at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Sebastian Krause Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Agnes Heftberger, Norbert Janzen, Markus Koerner, Christian Noll, Nicole Reimer Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940 gpfsug-discuss-bounces at spectrumscale.org wrote on 10/06/2021 15:42:18: > From: Jim Doherty > To: "gpfsug-discuss at spectrumscale.org" > Date: 10/06/2021 15:42 > Subject: [EXTERNAL] Re: [gpfsug-discuss] DB2 (not DB2 PureScale) and > Spectrum Scale > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > I think I found the document you are talking about. In general I > believe most of it still applies. I can make the following comments > on it about Spectrum Scale: 1 - There was an effort to simplify > Spectrum Scale tuning, and tuning of worker1Threads ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ZjQcmQRYFpfptBannerEnd > I think I found the document you are talking about. 
In general I > believe most of it still applies. I can make the following comments > on it about Spectrum Scale: > 1 - There was an effort to simplify Spectrum Scale tuning, and > tuning of worker1Threads should be replaced by tuning workerThreads > instead. Setting workerThreads, will auto-tune about 20 different > Spectrum Scale configuration parameters (including worker1Threads) > behind the scene. > 2 - The Spectrum Scale pagepool parameter defaults to 1Gig now, but > the most important thing is to make sure that you can fit all the IO > into the pagepool. So if you have 512 threads * 1 MB you will need > 1/2 Gig just to do disk IO, but if you use 4MB that becomes 512 * 4 > = 2Gig just for disk IO. I would recommend setting the pagepool to > 2x the size of this if you are using direct IO so 1 Gig or 4 Gig for > the example sizes I just mentioned. > 3 - One consideration that is important is sizing the initial DB2 > database size correctly, and when the tablespace needs to grow, make > sure it grows enough to avoid constantly increasing the tablespace. > The act of growing a database throws GPFS into buffered IO which can > be slower than directIO. If you need the database to grow all the > time, I would avoid using direct IO and use a larger GPFS pagepool > to allow it cache data. Using directIO is the better solution. > > Jim Doherty > > On Monday, June 7, 2021, 11:03:26 AM EDT, Wally Dietrich > wrote: > > Hi. Is there documentation about tuning DB2 to perform well when > using Spectrum Scale file systems? I'm interested in tuning both DB2 > and Spectrum Scale for high performance. I'm using a stretch cluster > for Disaster Recover (DR). I've found a document, but the last > update was in 2013 and GPFS has changed considerably since then. > > Wally Dietrich > wallyd at us.ibm.com > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > INVALID URI REMOVED > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=RGTETs2tk0Kz_VOpznDVDkqChhnfLapOTkxLvgmR2- > M&m=0w6BrJDJDqZrylo3ICWwqF7uFCQ5smwrDGjZm8xpKjU&s=7CZY0jIPCvfodrfNQoZlx3N2Dh9n7m-5mQkP5zhzI- > I&e= From leonardo.sala at psi.ch Thu Jun 17 07:35:45 2021 From: leonardo.sala at psi.ch (Leonardo Sala) Date: Thu, 17 Jun 2021 08:35:45 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: Hallo everybody thanks for the feedback! So, what it is suggested is to create on the VM (in my case hosted on vSphere, with only one NIC) a secondary IP within the IPoIP range, and create a route for that IP range to go over the public IP (and create a similar route on my bare-metal servers, so that the VM IPoIB IPs are reached over the public network) - is that correct? The only other options would be to ditch IPoIB as daemon network, right? What happens if some nodes have access to the daemon network over IPoIB, and other not - GPFS goes back to public ip cluster wide, or else? Thanks again! regards leo Paul Scherrer Institut Dr. 
Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch On 07.06.21 21:49, Jan-Frode Myklebust wrote: > > I?ve done this a few times. Once with IPoIB as daemon network, and > then created a separate routed network on the hypervisor to bridge (?) > between VM and IPoIB network. > > Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor: > ???????? > > To give the VMs access to the daemon network, we need create an > internal network for the VMs, that is then routed into the IPoIB > network on the hypervisor. > > ~~~ > # cat < routed34.xml > > routed34 > > > ? > > > > > > EOF > # virsh net-define routed34.xml > Network routed34 defined from routed34.xml > > # virsh net-start routed34 > Network routed34 started > > # virsh net-autostart routed34 > Network routed34 marked as autostarted > > # virsh net-list --all > ?Name ? ? ? ? ? ? State ? ? ?Autostart ? ? Persistent > ---------------------------------------------------------- > ?default ? ? ? ? ? ?active ? ? yes ? ? ? ? ? yes > ?routed34 ? ? ? ? ? active ? ? yes ? ? ? ? ? yes > > ~~~ > > ????????- > > > I see no issue with it ? but beware that the FAQ lists some required > tunings if the VM is to host desconly disks (paniconiohang?)? > > > > ? -jf > > > man. 7. jun. 2021 kl. 14:55 skrev Leonardo Sala >: > > Hallo, > > we do have multiple bare-metal GPFS clusters with infiniband > fabric, and I am actually considering adding some VMs in the mix, > to perform admin tasks (so that the bare metal servers do not need > passwordless ssh keys) and quorum nodes. Has anybody tried this? > What could be the drawbacks / issues at GPFS level? > > Thanks a lot for the insights! > > cheers > > leo > > -- > Paul Scherrer Institut > Dr. Leonardo Sala > Group Leader High Performance Computing > Deputy Section Head Science IT > Science IT > WHGA/036 > Forschungstrasse 111 > 5232 Villigen PSI > Switzerland > > Phone: +41 56 310 3369 > leonardo.sala at psi.ch > www.psi.ch > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Thu Jun 17 09:29:42 2021 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 17 Jun 2021 10:29:42 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: *All* nodes needs to be able to communicate on the daemon network. If they don't have access to this network, they can't join the cluster. It doesn't need to be same subnet, it can be routed. But they all have to be able to reach each other. If you use IPoIB, you likely need something to route between the IPoIB network and the outside world to reach the IP you have on your VM. I don't think you will be able to use an IP address in the IPoIB range for your VM, unless your vmware hypervisor is connected to the IB fabric, and can bridge it.. (doubt that's possible). I've seen some customers avoid using IPoIB, and rather mix an ethernet for daemon network, and dedicate the infiniband network to RDMA. 
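For completeness, since the XML body in the routed34 example quoted earlier was stripped by the list's HTML-to-text conversion, a libvirt routed network definition of that kind would look roughly as follows (the bond0 device, bridge name and subnet here are assumptions, not the original values):

~~~
# cat routed34.xml
<network>
  <name>routed34</name>
  <!-- route guest traffic out via the IP-over-IB bond on the hypervisor -->
  <forward mode='route' dev='bond0'/>
  <bridge name='virbr34' stp='on' delay='0'/>
  <!-- subnet handed to the VMs; must be routable from the IPoIB network -->
  <ip address='10.34.0.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='10.34.0.10' end='10.34.0.100'/>
    </dhcp>
  </ip>
</network>
# virsh net-define routed34.xml
~~~

The hypervisor additionally needs IPv4 forwarding enabled, and the bare-metal nodes (or their gateway) need a static route for the VM subnet pointing back at the hypervisor's IPoIB address.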
-jf On Thu, Jun 17, 2021 at 8:35 AM Leonardo Sala wrote: > Hallo everybody > > thanks for the feedback! So, what it is suggested is to create on the VM > (in my case hosted on vSphere, with only one NIC) a secondary IP within the > IPoIP range, and create a route for that IP range to go over the public IP > (and create a similar route on my bare-metal servers, so that the VM IPoIB > IPs are reached over the public network) - is that correct? > > The only other options would be to ditch IPoIB as daemon network, right? > What happens if some nodes have access to the daemon network over IPoIB, > and other not - GPFS goes back to public ip cluster wide, or else? > > Thanks again! > > regards > > leo > > Paul Scherrer Institut > Dr. Leonardo Sala > Group Leader High Performance Computing > Deputy Section Head Science IT > Science IT > WHGA/036 > Forschungstrasse 111 > 5232 Villigen PSI > Switzerland > > Phone: +41 56 310 3369leonardo.sala at psi.chwww.psi.ch > > On 07.06.21 21:49, Jan-Frode Myklebust wrote: > > > I?ve done this a few times. Once with IPoIB as daemon network, and then > created a separate routed network on the hypervisor to bridge (?) between > VM and IPoIB network. > > Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor: > ???????? > > To give the VMs access to the daemon network, we need create an internal > network for the VMs, that is then routed into the IPoIB network on the > hypervisor. > > ~~~ > # cat < routed34.xml > > routed34 > > > > > > > > > EOF > # virsh net-define routed34.xml > Network routed34 defined from routed34.xml > > # virsh net-start routed34 > Network routed34 started > > # virsh net-autostart routed34 > Network routed34 marked as autostarted > > # virsh net-list --all > Name State Autostart Persistent > ---------------------------------------------------------- > default active yes yes > routed34 active yes yes > > ~~~ > > ????????- > > > I see no issue with it ? but beware that the FAQ lists some required > tunings if the VM is to host desconly disks (paniconiohang?)? > > > > -jf > > > man. 7. jun. 2021 kl. 14:55 skrev Leonardo Sala : > >> Hallo, >> >> we do have multiple bare-metal GPFS clusters with infiniband fabric, and >> I am actually considering adding some VMs in the mix, to perform admin >> tasks (so that the bare metal servers do not need passwordless ssh keys) >> and quorum nodes. Has anybody tried this? What could be the drawbacks / >> issues at GPFS level? >> >> Thanks a lot for the insights! >> >> cheers >> >> leo >> >> -- >> Paul Scherrer Institut >> Dr. Leonardo Sala >> Group Leader High Performance Computing >> Deputy Section Head Science IT >> Science IT >> WHGA/036 >> Forschungstrasse 111 >> 5232 Villigen PSI >> Switzerland >> >> Phone: +41 56 310 3369leonardo.sala at psi.chwww.psi.ch >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From heinrich.billich at id.ethz.ch Thu Jun 17 12:53:03 2021 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Thu, 17 Jun 2021 11:53:03 +0000 Subject: [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule In-Reply-To: References: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch> Message-ID: <853DB9D7-9A3D-494F-88E4-BF448903C13E@id.ethz.ch> Hello, Thank you for your response. I opened a case with IBM and what we found is ? as I understand: If you change the storage pool of a file which has copy in a snapshot the ?inode is dublicated (copy on write) ? the data pool is part of the inode and its preserved in the snapshot, the snapshot get?s its own inode version. So even if the file?s blocks actually did move to storage pool B the snapshot still shows the previous storage pool A. Once the snapshots get deleted the additional metadata space is freed. Probably backup software does save the storage pool, too. Hence the snapshot must preserve the original value. You can easily verify with mmlsattr that the snapshot version and the plain version show different storage pools. I saw a bout 4500 bytes extra space required for each inode when I did run the migration rule which changed the storage pool. Kind regards, Heiner From: on behalf of IBM Spectrum Scale Reply to: gpfsug main discussion list Date: Wednesday, 9 June 2021 at 14:55 To: gpfsug main discussion list Cc: "gpfsug-discuss-bounces at spectrumscale.org" Subject: Re: [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule Hi Billich, >Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Basically a migration policy run with -I defer would just simply mark the files as illPlaced which would not cause metadata extension for such files(e.g., inode size is fixed after file system creation). Instead, I'm just wondering about your placement rules, which are existing rules or newly installed rules? Which could set EAs to newly created files and may cause increased metadata size. Also any new EAs are inserted for files? Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Billich Heinrich Rainer (ID SD)" ---2021/06/08 05:19:32 PM--- Hello, From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 2021/06/08 05:19 PM Subject: [EXTERNAL] [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello, A policy run with ?-I defer? and a placement rule did almost double the metadata usage of a filesystem. This filled the metadata disks to a critical level. I would like to understand if this is to be expected and ?as designed? or if I face some issue or bug. 
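As a concrete sketch of the mmlsattr check mentioned further up, comparing the live file with its copy under the snapshot directory shows the two pool assignments (paths and the snapshot name are placeholders, assuming a global snapshot under .snapshots):

# pool assignment of the live, migrated file
mmlsattr -L /fsxxxx/fileset/somefile
# pool assignment of the same file as preserved in an older snapshot
mmlsattr -L /fsxxxx/.snapshots/snap_20210601/fileset/somefile

The "storage pool name:" line of the two outputs differs once the live file has been assigned to the new pool, which is the copy-on-write inode duplication described above.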
I hope a subsequent run of ?mmrstripefs -p? will reduce the metadata usage again. Thank you I want to move all data to a new storage pool and did run a policy like RULE 'migrate_to_Data' MIGRATE WEIGHT(0) TO POOL 'Data' for each fileset with mmapplypolicy -I defer Next I want to actually move the data with mmrestripefs -p After the policy run metadata usage increased from 2.06TiB to 3.53TiB and filled the available metadata space by >90%. This is somewhat surprising. Will the following run of ?mmrestripefs -p? reduce the usage again, when the files are not illplaced any more? The number of used Inodes did not change noticeably during the policy run. Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Thank you, Heiner Some details # mmlsfs fsxxxx -f -i -B -I -m -M -r -R -V flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes (system pool) 32768 Minimum fragment (subblock) size in bytes (other pools) -i 4096 Inode size in bytes -B 1048576 Block size (system pool) 4194304 Block size (other pools) -I 32768 Indirect block size in bytes -m 1 Default number of metadata replicas -M 2 Maximum number of metadata replicas -r 1 Default number of data replicas -R 2 Maximum number of data replicas -V 23.00 (5.0.5.0) Current file system version 19.01 (5.0.1.0) Original file system version Inode Information ----------------- Total number of used inodes in all Inode spaces: 398502837 Total number of free inodes in all Inode spaces: 94184267 Total number of allocated inodes in all Inode spaces: 492687104 Total of Maximum number of inodes in all Inode spaces: 916122880[attachment "smime.p7s" deleted by Hai Zhong HZ Zhou/China/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5254 bytes Desc: not available URL: From jonathan.buzzard at strath.ac.uk Thu Jun 17 13:15:32 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 17 Jun 2021 13:15:32 +0100 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: <8fae4157-049b-e23c-0d69-c07be77d1f5b@strath.ac.uk> On 17/06/2021 09:29, Jan-Frode Myklebust wrote: > *All* nodes needs to be able to communicate on the daemon network. If > they don't have access to this network, they can't join the cluster. Not strictly true. TL;DR if all your NSD/master nodes are both Ethernet and Infiniband connected then you will be able to join the node to the network. Doing so is not advisable however as you will then start experiencing node evictions left right and centre. > It doesn't need to be same subnet, it can be routed. But they all have to > be able to reach each other. If you use IPoIB, you likely need something > to route between the IPoIB network and the outside?world to reach the IP > you have on your VM. 
I don't think you will be able to use an IP address > in the IPoIB range for your VM, unless your vmware hypervisor is > connected to the IB fabric, and can bridge it.. (doubt that's possible). ESXi and pretty much ever other hypervisor worth their salt has been able to do PCI pass through since forever. So wack a Infiniband card in your ESXi node, pass it through to the VM and the jobs a goodun. However it is something a lot of people are completely unaware of, including Infiniband/Omnipath vendors. Conversation goes can I run my fabric manager on a VM in ESXi rather than burn the planet on dedicated nodes for the job. Response comes back the fabric is not supported on ESXi, which shows utter ignorance on behalf of the fabric vendor. > I've seen some customers avoid using IPoIB, and rather mix an ethernet > for daemon network, and dedicate the infiniband network to RDMA. > What's the point of RDMA for GPFS, lower CPU overhead? For my mind it creates a lot of inflexibility. If your next cluster uses a different fabric migration is now a whole bunch more complicated. It's also a "minority sport" so something to be avoided unless there is a compelling reason not to. In general you need a machine to act as a gateway between the Ethernet and Infiniband fabrics. The configuration for this is minimal, the following works just fine on RHEL7 and it's derivatives, though you will need to change your interface names to suite enable the kernel to forward IPv4 packets sysctl -w net.ipv4.ip_forward=1 echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf tell the firewall to forward packets between the Ethernet and Infiniband interfaces iptables -A FORWARD -i eth0 -o ib0 -j ACCEPT iptables -A FORWARD -i ib0 -o eth0 -j ACCEPT echo "-P INPUT ACCEPT -P FORWARD ACCEPT -P OUTPUT ACCEPT -A FORWARD -i eth0 -o ib0 -j ACCEPT -A FORWARD -i ib0 -o eth0 -j ACCEPT" > /etc/sysconfig/iptables enable and start the firewall systemctl enable --now firewalld However this approach has "issues", as you now have a single point of failure on your system. TL;DR if the gateway goes away for any reason node ejections abound, so you can't restart it to apply security updates. On our system it is mainly a plain Ethernet (minimum 10Gbps) GPFS fabric using plain TCP/IP. However the teaching HPC cluster nodes only have 1Gbps Ethernet and 40Gbps Infiniband (they where kept from a previous system that used Lustre over Infiniband), so the storage goes over Infiniband and we hooked a spare port on the ConnectX-4 cards on the DSS-G nodes to the Infiniband fabric. So the Ethernet/Infiniband gateway is only used as the nodes chat to one another. Further when a teaching node responds on the daemon network to a compute node it actually goes out the ethernet network of the node. You could fix that but it's complicated configuration. This leads to the option of running a pair of nodes that will route between the networks and then running keepalived on the ethernet side to provide redundancy using VRRP to shift the gateway IP between the two nodes. You might be able to do the same for the Infiniband I have never tried, but in general it unnecessary IMHO. I initially wanted to run this on the DSS-G nodes themselves because the amount of bridged traffic is tiny, 110 days since my gateway was last rebooted have produced a bit under 16GB of forwarded traffic. The DSS-G nodes are ideally placed to do the routing having loads of redundant Ethernet connectivity. 
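The keepalived side of that gateway pair needs very little configuration. A minimal sketch for the node that normally holds the gateway address might look like this (interface name, virtual router id, password and the virtual gateway IP are all assumptions):

# /etc/keepalived/keepalived.conf on the primary gateway
vrrp_instance GPFS_GW {
    state MASTER                # the second gateway uses state BACKUP
    interface eth0              # Ethernet side that carries the gateway IP
    virtual_router_id 42
    priority 150                # give the backup node a lower priority, e.g. 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass gpfsgw
    }
    virtual_ipaddress {
        192.168.10.254/24       # address the Ethernet-only nodes route through
    }
}

Both gateways keep the sysctl and iptables forwarding rules above in place permanently; VRRP only moves the address, so rebooting one gateway for patching no longer takes the daemon-network route down.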
However it turns out running keepalived on the DSS-G nodes is not allowed :-( So I still have a single point of failure on the system and debating what to do next. Given RHEL8 has removed the driver support for the Intel Quickpath Infiniband cards a wholesale upgrade to 10Gbps Ethenet is looking attractive. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From pbasmaji at us.ibm.com Fri Jun 18 07:50:04 2021 From: pbasmaji at us.ibm.com (IBM Storage) Date: Fri, 18 Jun 2021 02:50:04 -0400 (EDT) Subject: [gpfsug-discuss] Don't miss out! Get your Spectrum Scale t-shirt by June 21 Message-ID: <1136599776589.1135676995690.1087424846.0.290250JL.2002@scheduler.constantcontact.com> Limited edition shirt available Don't Miss Out! Get your limited-edition IBM Spectrum Scale t-shirt by Tuesday, June 21 Hi Spectrum Scale User Group! First, thank you for being a valued member of the independent IBM Spectrum Scale User Group, and supporting your peers in the technical community. It's been a long time since we've gathered in person, and we hope that will change soon. I'm writing to tell you that due to COVID, we have limited-edition Spectrum Scale t-shirts available now through Tuesday, June 21, and I want to invite you to place your order directly below. After that time, we will no longer be able to distribute them directly to you. That's why I'm asking you to distribute this email in your organization before June 21 so we can get this stock into the hands of your teams, our users, customers and partners while there's still time! Only individual orders can be accepted, and up to 10 colleagues per company can receive t-shirts, if they claim them by this Tuesday. (Unfortunatey, government-owned entitles (GOEs) cannot participate.) Send My T-Shirt Send My T-Shirt If you have questions, please contact me by replying to this email. Thank you, Best regards, Peter M Basmajian Product Marketing and Advocacy IBM Storage *Terms and conditions apply. See website for details. IBM Storage | 425 Market Street, San Francisco, CA 94105 Unsubscribe gpfsug-discuss at spectrumscale.org Update Profile | Our Privacy Policy | Constant Contact Data Notice Sent by pbasmaji at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From oluwasijibomi.saula at ndsu.edu Tue Jun 22 16:17:16 2021 From: oluwasijibomi.saula at ndsu.edu (Saula, Oluwasijibomi) Date: Tue, 22 Jun 2021 15:17:16 +0000 Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? Message-ID: Hello, While reviewing AMS software suite for installation, I noticed this remark (https://www.scm.com/doc/Installation/Additional_Information_and_Known_Issues.html#gpfs-file-system): ----- GPFS file system Starting with AMS2019, the KF sub-system (used for handling binary files such as ADF?s TAPE* files) has been rewritten to use memory-mapped files. The mmap() system call implementation is file-system dependent and, unfortunately, it is not equally efficient in different file systems. The memory-mapped files implementation in GPFS is extremely inefficient. Therefore the users should avoid using a GPFS for scratch files -------- Is this claim true? Are there caveats to this statement, if true? 
Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Jun 22 16:55:54 2021 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Tue, 22 Jun 2021 15:55:54 +0000 Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? Message-ID: <14CCBDE7-AB03-456B-806B-6AD1A8270A7D@bham.ac.uk> There certainly *were* issues. See for example: http://files.gpfsug.org/presentations/2018/London/6_GPFSUG_EBI.pdf And the follow on IBM talk on the same day: http://files.gpfsug.org/presentations/2018/London/6_MMAP_V2.pdf And also from this year: https://www.spectrumscaleug.org/event/ssugdigital-spectrum-scale-expert-talks-update-on-performance-enhancements-in-spectrum-scale/ So it may have been true. If it still is, maybe, but it will depend on your GPFS code. Simon From: on behalf of "Saula, Oluwasijibomi" Reply to: "gpfsug-discuss at spectrumscale.org" Date: Tuesday, 22 June 2021 at 16:17 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? Hello, While reviewing AMS software suite for installation, I noticed this remark (https://www.scm.com/doc/Installation/Additional_Information_and_Known_Issues.html#gpfs-file-system): ----- GPFS file system Starting with AMS2019, the KF sub-system (used for handling binary files such as ADF?s TAPE* files) has been rewritten to use memory-mapped files. The mmap() system call implementation is file-system dependent and, unfortunately, it is not equally efficient in different file systems. The memory-mapped files implementation in GPFS is extremely inefficient. Therefore the users should avoid using a GPFS for scratch files -------- Is this claim true? Are there caveats to this statement, if true? Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] -------------- next part -------------- An HTML attachment was scrubbed... URL: From oluwasijibomi.saula at ndsu.edu Tue Jun 22 17:26:52 2021 From: oluwasijibomi.saula at ndsu.edu (Saula, Oluwasijibomi) Date: Tue, 22 Jun 2021 16:26:52 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 113, Issue 19 In-Reply-To: References: Message-ID: Simon, Thanks for the quick response and related information! We are at least at v5.0.5 so we shouldn't see much exposure to this issue then. 
Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of gpfsug-discuss-request at spectrumscale.org Sent: Tuesday, June 22, 2021 10:56 AM To: gpfsug-discuss at spectrumscale.org Subject: gpfsug-discuss Digest, Vol 113, Issue 19 Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. GPFS bad at memory-mapped files? (Saula, Oluwasijibomi) 2. Re: GPFS bad at memory-mapped files? (Simon Thompson) ---------------------------------------------------------------------- Message: 1 Date: Tue, 22 Jun 2021 15:17:16 +0000 From: "Saula, Oluwasijibomi" To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? Message-ID: Content-Type: text/plain; charset="windows-1252" Hello, While reviewing AMS software suite for installation, I noticed this remark (https://www.scm.com/doc/Installation/Additional_Information_and_Known_Issues.html#gpfs-file-system): ----- GPFS file system Starting with AMS2019, the KF sub-system (used for handling binary files such as ADF?s TAPE* files) has been rewritten to use memory-mapped files. The mmap() system call implementation is file-system dependent and, unfortunately, it is not equally efficient in different file systems. The memory-mapped files implementation in GPFS is extremely inefficient. Therefore the users should avoid using a GPFS for scratch files -------- Is this claim true? Are there caveats to this statement, if true? Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 2 Date: Tue, 22 Jun 2021 15:55:54 +0000 From: Simon Thompson To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS bad at memory-mapped files? Message-ID: <14CCBDE7-AB03-456B-806B-6AD1A8270A7D at bham.ac.uk> Content-Type: text/plain; charset="utf-8" There certainly *were* issues. See for example: http://files.gpfsug.org/presentations/2018/London/6_GPFSUG_EBI.pdf And the follow on IBM talk on the same day: http://files.gpfsug.org/presentations/2018/London/6_MMAP_V2.pdf And also from this year: https://www.spectrumscaleug.org/event/ssugdigital-spectrum-scale-expert-talks-update-on-performance-enhancements-in-spectrum-scale/ So it may have been true. If it still is, maybe, but it will depend on your GPFS code. Simon From: on behalf of "Saula, Oluwasijibomi" Reply to: "gpfsug-discuss at spectrumscale.org" Date: Tuesday, 22 June 2021 at 16:17 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? 
Hello, While reviewing AMS software suite for installation, I noticed this remark (https://www.scm.com/doc/Installation/Additional_Information_and_Known_Issues.html#gpfs-file-system): ----- GPFS file system Starting with AMS2019, the KF sub-system (used for handling binary files such as ADF?s TAPE* files) has been rewritten to use memory-mapped files. The mmap() system call implementation is file-system dependent and, unfortunately, it is not equally efficient in different file systems. The memory-mapped files implementation in GPFS is extremely inefficient. Therefore the users should avoid using a GPFS for scratch files -------- Is this claim true? Are there caveats to this statement, if true? Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 113, Issue 19 *********************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Wed Jun 23 11:10:28 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 23 Jun 2021 12:10:28 +0200 Subject: [gpfsug-discuss] mmbackup with own policy Message-ID: Hallo, mmbackup offers -P to specify an own policy. Unfortunately I cannot seem to find documentation how that policy has to look like. I mean, if I grab the policy generated automatically by mmbackup it looks like this: --------------------------------------------------------- /* Auto-generated GPFS policy rules file * Generated on Sat May 29 15:10:46 2021 */ /* Server rules for backup server 1 *** back5_2 *** */ RULE EXTERNAL LIST 'mmbackup.1.back5_2' EXEC '/net/gpfso/fs1/.mmbackupCfg/BAexecScript.gpfsofs1' OPTS '"/net/gpfso/fs1/.mmbackupShadow.1.back5_2.filesys.update" "-servername=back5_2" "-auditlogname=/net/gpfso/fs1/mmbackup.audit.gpfsofs1.back5_2" "NONE"' RULE 'BackupRule' LIST 'mmbackup.1.back5_2' DIRECTORIES_PLUS SHOW(VARCHAR(MODIFICATION_TIME) || ' ' || VARCHAR(CHANGE_TIME) || ' ' || VARCHAR(FILE_SIZE) || ' ' || VARCHAR(FILESET_NAME) || ' ' || 'resdnt' ) WHERE ( NOT ( (PATH_NAME LIKE '/%/.mmbackup%') OR (PATH_NAME LIKE '/%/.mmLockDir' AND MODE LIKE 'd%') OR (PATH_NAME LIKE '/%/.mmLockDir/%') OR (MODE LIKE 's%') ) ) AND (MISC_ATTRIBUTES LIKE '%u%') AND ... --------------------------------------------------------- If I want use an own policy what of all that is required for mmbackup to find the information it needs? Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From yeep at robust.my Wed Jun 23 12:08:20 2021 From: yeep at robust.my (T.A. Yeep) Date: Wed, 23 Jun 2021 19:08:20 +0800 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: Message-ID: Hi Dr. 
Martin, You can refer to the Administrator Guide > Chapter 30 > Policies for automating file management, or access via the link below. If you downloaded a PDF, it starts with page 487. https://www.ibm.com/docs/en/spectrum-scale/5.1.0?topic=management-policy-rules There a quite a number of examples in that chapter too which can help you establish a good understanding of how to write one yourself. On Wed, Jun 23, 2021 at 6:10 PM Ulrich Sibiller < u.sibiller at science-computing.de> wrote: > Hallo, > > mmbackup offers -P to specify an own policy. Unfortunately I cannot seem > to find documentation how > that policy has to look like. > > I mean, if I grab the policy generated automatically by mmbackup it looks > like this: > > --------------------------------------------------------- > /* Auto-generated GPFS policy rules file > * Generated on Sat May 29 15:10:46 2021 > */ > > /* Server rules for backup server 1 > *** back5_2 *** > */ > RULE EXTERNAL LIST 'mmbackup.1.back5_2' EXEC > '/net/gpfso/fs1/.mmbackupCfg/BAexecScript.gpfsofs1' > OPTS '"/net/gpfso/fs1/.mmbackupShadow.1.back5_2.filesys.update" > "-servername=back5_2" > "-auditlogname=/net/gpfso/fs1/mmbackup.audit.gpfsofs1.back5_2" "NONE"' > RULE 'BackupRule' LIST 'mmbackup.1.back5_2' DIRECTORIES_PLUS > SHOW(VARCHAR(MODIFICATION_TIME) || ' ' || VARCHAR(CHANGE_TIME) || ' > ' || > VARCHAR(FILE_SIZE) || ' ' || VARCHAR(FILESET_NAME) || > ' ' || 'resdnt' ) > WHERE > ( > NOT > ( (PATH_NAME LIKE '/%/.mmbackup%') OR > (PATH_NAME LIKE '/%/.mmLockDir' AND MODE LIKE 'd%') OR > (PATH_NAME LIKE '/%/.mmLockDir/%') OR > (MODE LIKE 's%') > ) > ) > AND > (MISC_ATTRIBUTES LIKE '%u%') > AND > ... > --------------------------------------------------------- > > > If I want use an own policy what of all that is required for mmbackup to > find the information it needs? > > Uli > -- > Science + Computing AG > Vorstandsvorsitzender/Chairman of the board of management: > Dr. Martin Matzke > Vorstand/Board of Management: > Matthias Schempp, Sabine Hohenstein > Vorsitzender des Aufsichtsrats/ > Chairman of the Supervisory Board: > Philippe Miltin > Aufsichtsrat/Supervisory Board: > Martin Wibbe, Ursula Morgenstern > Sitz/Registered Office: Tuebingen > Registergericht/Registration Court: Stuttgart > Registernummer/Commercial Register No.: HRB 382196 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Best regards *T.A. Yeep* -------------- next part -------------- An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Wed Jun 23 12:31:46 2021 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 23 Jun 2021 11:31:46 +0000 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Wed Jun 23 13:19:59 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 23 Jun 2021 14:19:59 +0200 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: Message-ID: On 6/23/21 1:08 PM, T.A. Yeep wrote: > You can refer to the Administrator Guide > Chapter 30 > Policies for automating file management, or > access via the link below. If you downloaded a?PDF, it starts with page?487. > https://www.ibm.com/docs/en/spectrum-scale/5.1.0?topic=management-policy-rules > > > There a quite a number of examples in that chapter too which can help you establish a good > understanding of how to write?one yourself. 
Thanks, I know how to write policies. I just had the impression that regarding mmbackup the policy has to follow certain rules to satisfy mmbackup requirements. Kind regards, Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From u.sibiller at science-computing.de Wed Jun 23 15:15:53 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 23 Jun 2021 16:15:53 +0200 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: Message-ID: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de> On 6/23/21 1:31 PM, Frederick Stock wrote: > The only requirement for your own backup policy is that it finds the files you want to back up and > skips those that you do not want to back up.? It is no different than any policy that you would use > with the GPFS policy engine. Have you ever sucessfully done this? Let me explain again: With an own policy I know what I want to see in the output and for an arbitrary external rule I know where the called script resides and what parameters it expects. mmbackup however creates a helper script (BAexecScript.) on the fly and calls that from its autogenerated policy via the external rule line. I assume I need the BAexecScript to make mmbackup behave like it should. Is this wrong? Update: after playing around for quite some time it looks the rule must have a special format, as shown below. Replace the placeholders like this: the name of the Backupserver as defined in dsm.sys and on the mmbackup command line via --tsm-servers as shown in mmlsfs all_local (without /dev/) as shown in mmlsfs -T -------------------------------------------------------------------------------------------- RULE EXTERNAL LIST 'mmbackup.1.' EXEC '/.mmbackupCfg/BAexecScript.' OPTS '"/.mmbackupShadow.1..filesys.update" "-servername=" "-auditlogname=/mmbackup.audit.." "NONE"' RULE 'BackupRule' LIST 'mmbackup.1.' DIRECTORIES_PLUS SHOW(VARCHAR(MODIFICATION_TIME) || ' ' || VARCHAR(CHANGE_TIME) || ' ' || VARCHAR(FILE_SIZE) || ' ' || VARCHAR(FILESET_NAME) || ' ' || 'resdnt' ) WHERE ( KB_ALLOCATED > 1024 ) -------------------------------------------------------------------------------------------- Call it like this: /usr/lpp/mmfs/bin/mmbackup --tsm-servers -P As this non-trivial it should be mentioned in the documentation! Uli -- Dipl.-Inf. Ulrich Sibiller science + computing ag System Administration Hagellocher Weg 73 Hotline +49 7071 9457 681 72070 Tuebingen, Germany https://atos.net/de/deutschland/sc -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. 
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From wsawdon at us.ibm.com Wed Jun 23 22:52:48 2021 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Wed, 23 Jun 2021 21:52:48 +0000 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de> References: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de>, Message-ID: An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Thu Jun 24 11:24:30 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Thu, 24 Jun 2021 12:24:30 +0200 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de> Message-ID: <73732efb-28bd-d7bd-b8b5-b1aace37f533@science-computing.de> On 6/23/21 11:52 PM, Wayne Sawdon wrote: > At a higher level what are you trying to do? Include some directories and exclude others? Use a > different backup server? What do you need that is not there? The current situation is that a customer insists on using different management classes on the TSM server for big files than for small files. We have setup additional server stanzas _big and _small representing the management classes. servername _small ... inclexcl .../dsm.inclexcl.small ... # cat .../dsm.inclexcl.small include ... smallfile ("smallfile" is the management class) Now we need to run mmbackup against the _small and restrict ist to only treat the small files and ignore the bigger ones. As we cannot determine them by name or path we need to use a policy (.. WHERE KB_ALLOCATED < something) > mmbackup is tied to Spectrum Protect (formerly known as TSM) and gets its include/excludes from the > TSM option files. It constructs a policy to list all of the files & directories and includes > attributes such as mtime and ctime. It then compares this list of files to the "shadow database" > which is a copy of what the TSM database has. This comparison produces 3 lists of files: new files & > files that have the data changed, existing files that only have attributes changed and a list of > files that were deleted. Each list is sent to Spectrum Protect to either backup the data, or to > update the metadata or to mark the file as deleted. As Spectrum Protect acknowledges the operation > on each file, we update the shadow database to keep it current. Exactly. I am aware of that. > So writing a new policy file for mmbackup is not really as simple as it seems. I don't? think you > can change the record format on the list of files. And don't override the encoding on special > characters. And I'm sure there are other Gotchas as well. That's just what I wanted to express with my previous mails. It is not as simple as it seems AND it is not documented. We want to use all that fancy shadow file management that mmbackup comes with because it is sophisticated nowadays and generally works flawlessly. We do not want to reinvent the mmbackup wheel. So for the current situation having a simple way to (partly) replace the WHERE clause would be of great help. I acknowledge that offering that in a general way could get complicated for the developers. 
Having a documentation how to write a policy that matches what mmbackup without -P is doing is the first step the improve the situation. My posted policy currently works but it is questionable how long that will be the case. Once mmbackup changes its internal behaviour it will break.. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From chair at spectrumscale.org Mon Jun 28 09:09:39 2021 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Mon, 28 Jun 2021 09:09:39 +0100 Subject: [gpfsug-discuss] SSUG::Digital: Spectrum Scale Container Native Storage Access (CNSA) Message-ID: <> An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: meeting.ics Type: text/calendar Size: 2338 bytes Desc: not available URL: From ewahl at osc.edu Mon Jun 28 21:00:35 2021 From: ewahl at osc.edu (Wahl, Edward) Date: Mon, 28 Jun 2021 20:00:35 +0000 Subject: [gpfsug-discuss] GUI refresh task error In-Reply-To: References: <72d50b96-c6a3-f075-8f47-84bf2346f0ae@docum.org> <975f874a066c4ba6a45c62f9b280efa2@postbank.de> Message-ID: Curious if this was ever fixed or someone has an APAR # ? I'm still running into it on 5.0.5.6 Ed Wahl OSC -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Stef Coene Sent: Thursday, July 16, 2020 9:47 AM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GUI refresh task error Ok, thanx for the answer. I will wait for the fix. Stef On 2020-07-16 15:25, Roland Schuemann wrote: > Hi Stef, > > we already recognized this error too and opened a PMR/Case at IBM. > You can set this task to inactive, but this is not persistent. After gui restart it comes again. > > This was the answer from IBM Support. >>>>>>>>>>>>>>>>>> > This will be fixed in the next release of 5.0.5.2, right now there is no work-around but will not cause issue besides the cosmetic task failed message. > Is this OK for you? >>>>>>>>>>>>>>>>>> > > So we ignore (Gui is still degraded) it and wait for the fix. > > Kind regards > Roland Sch?mann > > > Freundliche Gr??e / Kind regards > Roland Sch?mann > > ____________________________________________ > > Roland Sch?mann > Infrastructure Engineering (BTE) > CIO PB Germany > > Deutsche Bank I Technology, Data and Innovation Postbank Systems AG > > > -----Urspr?ngliche Nachricht----- > Von: gpfsug-discuss-bounces at spectrumscale.org > Im Auftrag von Stef Coene > Gesendet: Donnerstag, 16. Juli 2020 15:14 > An: gpfsug main discussion list > Betreff: [gpfsug-discuss] GUI refresh task error > > Hi, > > On brand new 5.0.5 cluster we have the following errors on all nodes: > "The following GUI refresh task(s) failed: WATCHFOLDER" > > It also says > "Failure reason: Command mmwatch all functional --list-clustered-status > failed" > > Running mmwatch manually gives: > mmwatch: The Clustered Watch Folder function is only available in the IBM Spectrum Scale Advanced Edition or the Data Management Edition. > mmwatch: Command failed. Examine previous error messages to determine cause. > > How can I get rid of this error? 
> > I tried to disable the task with: > chtask WATCHFOLDER --inactive > EFSSG1811C The task with the name WATCHFOLDER is already not scheduled. > > > Stef > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug- > discuss__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_ > gi9hZJP8mT$ Die Europ?ische Kommission hat unter > https://urldefense.com/v3/__http://ec.europa.eu/consumers/odr/__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9m0qpNP9$ eine Europ?ische Online-Streitbeilegungsplattform (OS-Plattform) errichtet. Verbraucher k?nnen die OS-Plattform f?r die au?ergerichtliche Beilegung von Streitigkeiten aus Online-Vertr?gen mit in der EU niedergelassenen Unternehmen nutzen. > > Informationen (einschlie?lich Pflichtangaben) zu einzelnen, innerhalb der EU t?tigen Gesellschaften und Zweigniederlassungen des Konzerns Deutsche Bank finden Sie unter https://urldefense.com/v3/__https://www.deutsche-bank.de/Pflichtangaben__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9sgMU2R_$ . Diese E-Mail enth?lt vertrauliche und/ oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese E-Mail. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser E-Mail ist nicht gestattet. > > The European Commission has established a European online dispute resolution platform (OS platform) under https://urldefense.com/v3/__http://ec.europa.eu/consumers/odr/__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9m0qpNP9$ . Consumers may use the OS platform to resolve disputes arising from online contracts with providers established in the EU. > > Please refer to https://urldefense.com/v3/__https://www.db.com/disclosures__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9nXBvg8r$ for information (including mandatory corporate particulars) on selected Deutsche Bank branches and group companies registered or incorporated in the European Union. This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug- > discuss__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_ > gi9hZJP8mT$ > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9hZJP8mT$ From jonathan.buzzard at strath.ac.uk Tue Jun 29 14:46:45 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Tue, 29 Jun 2021 14:46:45 +0100 Subject: [gpfsug-discuss] PVU question Message-ID: Hum, it would appear there are gaps in IBM's PVU table. Specifically I am looking at using a Pentium G4620 in a server https://ark.intel.com/content/www/us/en/ark/products/97460/intel-pentium-processor-g4620-3m-cache-3-70-ghz.html It's dual core with ECC memory support all in a socket 1151. 
While a low spec it would be an upgrade from the Xeon E3113 currently in use and more than adequate for the job. A quad code CPU would more than double the PVU for no performance gain so I am not keen to go there. The only reason for the upgrade is the hardware is now getting on and finding spares on eBay is now getting hard (it's a Dell PowerEdge R300). However it doesn't fit anywhere in the PVU table https://www.ibm.com/software/passportadvantage/pvu_licensing_for_customers.html It's not a Xeon, it's not a Core, it's not AMD and it's not single core. It won't be in a laptop, desktop or workstation so that rules out that PVU calculation. Does that mean zero PVU :-) or it's not supported or what? Customer support where hopeless in answering my query. Then again IBM think I need GDPR stickers for returning a memory DIMM. JAB -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From scale at us.ibm.com Tue Jun 29 15:41:14 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 29 Jun 2021 10:41:14 -0400 Subject: [gpfsug-discuss] PVU question In-Reply-To: References: Message-ID: My suggestion for this question is that it should be directed to your IBM sales team and not the Spectrum Scale support team. My reading of the information you provided is that your processor counts as 2 cores. As for the PVU value my guess is that at a minimum it is 50 but again that should be a question for your IBM sales team. One other option is to switch from processor based licensing for Scale to storage (TB) based licensing. I think one of the reasons for storage based licensing was to avoid issues like the one you are raising. Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Jonathan Buzzard To: gpfsug main discussion list Date: 06/29/2021 09:47 AM Subject: [EXTERNAL] [gpfsug-discuss] PVU question Sent by: gpfsug-discuss-bounces at spectrumscale.org Hum, it would appear there are gaps in IBM's PVU table. Specifically I am looking at using a Pentium G4620 in a server https://ark.intel.com/content/www/us/en/ark/products/97460/intel-pentium-processor-g4620-3m-cache-3-70-ghz.html It's dual core with ECC memory support all in a socket 1151. While a low spec it would be an upgrade from the Xeon E3113 currently in use and more than adequate for the job. A quad code CPU would more than double the PVU for no performance gain so I am not keen to go there. The only reason for the upgrade is the hardware is now getting on and finding spares on eBay is now getting hard (it's a Dell PowerEdge R300). However it doesn't fit anywhere in the PVU table https://www.ibm.com/software/passportadvantage/pvu_licensing_for_customers.html It's not a Xeon, it's not a Core, it's not AMD and it's not single core. 
It won't be in a laptop, desktop or workstation so that rules out that PVU calculation. Does that mean zero PVU :-) or it's not supported or what? Customer support where hopeless in answering my query. Then again IBM think I need GDPR stickers for returning a memory DIMM. JAB -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Tue Jun 1 15:26:10 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Tue, 1 Jun 2021 16:26:10 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota Message-ID: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> Hi, I experience some strangeness that I fail to understand completely. I have a fileset that got copied (rsynced) from one cluster to another. The reported size (mmrepquota) of the source filesystem is 800G (and due to data and metadata replication being set to 2 this effectively means 400G). After syncing the data to the destination the size there is ~457GB. $ mmrepquota -j srcfilesys | grep fileset srcfileset FILESET 800 800 800 0 none | 863 0 0 0 none $ mmrepquota -j dstfilesys | grep fileset fileset root FILESET 457 400 400 0 none | 853 0 0 0 none (note: on the destination filesystem we have set ignoreReplicationOnStatfs yes IgnoreReplicaSpaceOnStat yes ignoreReplicationForQuota yes so there's no need to to divisions ) (note2: the destination hat 10 files less. These where small leftover .nfs* files: $ du --block-size 1 /srcfilesys/fileset/.nfs* 1024 /srvfilesys/fileset//.nfs0000000000f8a57800000505 1024 /srvfilesys/fileset//.nfs0000000002808f4d000000cb 1024 /srvfilesys/fileset//.nfs0000000002af44db00005509 1024 /srvfilesys/fileset//.nfs00000000034eb9270000072a 1024 /srvfilesys/fileset//.nfs0000000003a9b48300002974 1024 /srvfilesys/fileset//.nfs0000000003d10f990000028a $ du --apparent-size --block-size 1 /srcfilesys/fileset/.nfs* 524 /srvfilesys/fileset//.nfs0000000000f8a57800000505 524 /srvfilesys/fileset//.nfs0000000002808f4d000000cb 524 /srvfilesys/fileset//.nfs0000000002af44db00005509 524 /srvfilesys/fileset//.nfs00000000034eb9270000072a 524 /srvfilesys/fileset//.nfs0000000003a9b48300002974 524 /srvfilesys/fileset//.nfs0000000003d10f990000028a ) While trying to understand what's going on here I found this on the source file system (which is valid for all files, with different number of course): $ du --block-size 1 /srcfilesys/fileset/filename 65536 /srcfilesys/fileset/filename $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename 3994 /srcfilesys/fileset/filename $ stat /srcfilesys/fileset/filename File: ?/srcfilesys/fileset/filename? Size: 3994 Blocks: 128 IO Block: 1048576 regular file Device: 2ah/42d Inode: 23266095 Links: 1 Access: (0660/-rw-rw----) Uid: (73018/ cpunnoo) Gid: (50070/ dc-rti) Context: system_u:object_r:unlabeled_t:s0 Access: 2021-05-12 20:10:13.814459305 +0200 Modify: 2020-07-16 11:08:41.631006000 +0200 Change: 2020-07-16 11:08:41.630896177 +0200 Birth: - If I sum up the disk usage of the first du I end up with 799.986G in total which matches the mmrepquota output. If I add up the disk usage of the second du I end up at 456.569G _which matches the mmrepquota output on the destination system_. 
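For reference, the two sums can be reproduced in a single pass with GNU find and awk (a rough sketch only: it counts regular files, ignores directories and hard links, and the path is illustrative):

# %b = allocated 512-byte blocks (what plain "du" counts)
# %s = apparent size in bytes (what "du --apparent-size" counts)
$ find /srcfilesys/fileset -type f -printf '%b %s\n' | awk '{alloc += $1 * 512; app += $2} END {printf "allocated %.3f GiB, apparent %.3f GiB\n", alloc/2^30, app/2^30}'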
So on the source filesystem the quota seems to add up the apparent size while on the destination filesystem the quota value is the sum of the du without --apparent-size. Running the dus on the destination filesystem reports other numbers: $ du --block-size 1 /dstfilesys/fileset/filename 8192 /dstfilesys/fileset/filename $ du --apparent-size --block-size 1 /dstfilesys/fileset/filename 3994 /dstfilesys/fileset/filename $ stat /dstfilesys/fileset/filename File: /dstfilesys/fileset/filename Size: 3994 Blocks: 16 IO Block: 4194304 regular file Device: 3dh/61d Inode: 2166358719 Links: 1 Access: (0660/-rw-rw----) Uid: (73018/ cpunnoo) Gid: (50070/ dc-rti) Access: 2021-05-29 07:52:56.069440382 +0200 Modify: 2020-07-16 11:08:41.631006000 +0200 Change: 2021-05-12 20:10:13.970443145 +0200 Birth: - Summing them up shows almost identical numbers for both dus: 467528 467527 which I really do not get at all... So is there an explanation of how mmrepquota and du and du --apparent-size are related? Uli PS: Some more details: The source filesystem is RHEL7 with gpfs.base 5.0.5-5: flag value description ------------------- ------------------------ ----------------------------------- -f 32768 Minimum fragment (subblock) size in bytes -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes -m 2 Default number of metadata replicas -M 2 Maximum number of metadata replicas -r 2 Default number of data replicas -R 2 Maximum number of data replicas -j scatter Block allocation type -D nfs4 File locking semantics in effect -k all ACL semantics in effect -n 32 Estimated number of nodes that will mount file system -B 1048576 Block size -Q user;group;fileset Quotas accounting enabled user;group;fileset Quotas enforced none Default quotas enabled --perfileset-quota No Per-fileset quota enforcement --filesetdf Yes Fileset df enabled? -V 16.00 (4.2.2.0) Current file system version 14.10 (4.1.0.4) Original file system version --create-time Tue Feb 3 11:46:10 2015 File system creation time -z No Is DMAPI enabled? -L 4194304 Logfile size -E Yes Exact mtime mount option -S No Suppress atime mount option -K whenpossible Strict replica allocation option --fastea Yes Fast external attributes enabled? --encryption No Encryption enabled? --inode-limit 179217920 Maximum number of inodes in all inode spaces --log-replicas 0 Number of log replicas --is4KAligned Yes is4KAligned? --rapid-repair No rapidRepair enabled? --write-cache-threshold 0 HAWC Threshold (max 65536) --subblocks-per-full-block 32 Number of subblocks per full block -P system Disk storage pools in file system --file-audit-log No File Audit Logging enabled? --maintenance-mode No Maintenance Mode enabled? 
-d NSDDCS1N01;NSDDCS1N02;NSDDCS1N03;NSDDCS1N04;NSDDCS1N05;NSDDCS1N06;NSDDCS1N07;NSDDCS1N08;NSDDCS1N09;NSDDCS1N10;NSDDCS1N11;NSDDCS1N12;NSDNECE54001N01;NSDNECE54001N02; -d NSDNECE54001N03;NSDNECE54001N04;NSDNECE54001N05;NSDNECE54001N06;NSDNECE54001N07;NSDNECE54001N08;NSDNECE54001N09;NSDNECE54001N10;NSDNECE54001N11;NSDNECE54001N12;DESC1 Disks in file system -A yes Automatic mount option -o none Additional mount options -T /srcfilesys Default mount point --mount-priority 0 Mount priority The destination cluster is RHEL8 with gpfs.base-5.1.0-3.x86_64: flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes -m 2 Default number of metadata replicas -M 2 Maximum number of metadata replicas -r 2 Default number of data replicas -R 2 Maximum number of data replicas -j scatter Block allocation type -D nfs4 File locking semantics in effect -k all ACL semantics in effect -n 32 Estimated number of nodes that will mount file system -B 4194304 Block size -Q user;group;fileset Quotas accounting enabled none Quotas enforced none Default quotas enabled --perfileset-quota yes Per-fileset quota enforcement --filesetdf yes Fileset df enabled? -V 23.00 (5.0.5.0) File system version --create-time Tue May 11 16:51:27 2021 File system creation time -z no Is DMAPI enabled? -L 33554432 Logfile size -E yes Exact mtime mount option -S relatime Suppress atime mount option -K whenpossible Strict replica allocation option --fastea yes Fast external attributes enabled? --encryption no Encryption enabled? --inode-limit 105259008 Maximum number of inodes in all inode spaces --log-replicas 0 Number of log replicas --is4KAligned yes is4KAligned? --rapid-repair yes rapidRepair enabled? --write-cache-threshold 0 HAWC Threshold (max 65536) --subblocks-per-full-block 512 Number of subblocks per full block -P system Disk storage pools in file system --file-audit-log no File Audit Logging enabled? --maintenance-mode no Maintenance Mode enabled? -d RG001VS021;RG002VS021;RG001VS022;RG002VS022 Disks in file system -A yes Automatic mount option -o none Additional mount options -T /dstfilesys Default mount point --mount-priority 0 Mount priority -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From tortay at cc.in2p3.fr Tue Jun 1 15:56:42 2021 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Tue, 1 Jun 2021 16:56:42 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> Message-ID: <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> On 6/1/21 4:26 PM, Ulrich Sibiller wrote: [...] > ) > > While trying to understand what's going on here I found this on the > source file system (which is valid for all files, with different number > of course): > > $ du --block-size 1 /srcfilesys/fileset/filename > 65536?? /srcfilesys/fileset/filename > > $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename > 3994??? 
/srcfilesys/fileset/filename > > $ stat /srcfilesys/fileset/filename > ? File: ?/srcfilesys/fileset/filename? > ? Size: 3994??????????? Blocks: 128??????? IO Block: 1048576 regular file > Device: 2ah/42d Inode: 23266095??? Links: 1 > Access: (0660/-rw-rw----)? Uid: (73018/ cpunnoo)?? Gid: (50070/? dc-rti) > Context: system_u:object_r:unlabeled_t:s0 > Access: 2021-05-12 20:10:13.814459305 +0200 > Modify: 2020-07-16 11:08:41.631006000 +0200 > Change: 2020-07-16 11:08:41.630896177 +0200 > ?Birth: - > Hello, This looks like the sub-block overhead. If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). The sub-block is the minimum disk allocation for files (if the file content is too large to be kept in the inode, when that is supported on the specific GPFS filesystem). The "Blocks" value displayed by "stat" is in 512 bytes unit, so 128*512 = 65536 (which is consistent with "du"): two 32 kiB sub-blocks due to data replication. The "--apparent-size" option to "du" uses the user visible size not the actual disk usage (per the man page), so 3994 is also consistent w/ "stat" output. AFAIK, GPFS space quotas count the sub-blocks not the apparent sizes, so again this would be consistent with the overhead. Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | From tortay at cc.in2p3.fr Tue Jun 1 15:56:42 2021 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Tue, 1 Jun 2021 16:56:42 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> Message-ID: <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> On 6/1/21 4:26 PM, Ulrich Sibiller wrote: [...] > ) > > While trying to understand what's going on here I found this on the > source file system (which is valid for all files, with different number > of course): > > $ du --block-size 1 /srcfilesys/fileset/filename > 65536?? /srcfilesys/fileset/filename > > $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename > 3994??? /srcfilesys/fileset/filename > > $ stat /srcfilesys/fileset/filename > ? File: ?/srcfilesys/fileset/filename? > ? Size: 3994??????????? Blocks: 128??????? IO Block: 1048576 regular file > Device: 2ah/42d Inode: 23266095??? Links: 1 > Access: (0660/-rw-rw----)? Uid: (73018/ cpunnoo)?? Gid: (50070/? dc-rti) > Context: system_u:object_r:unlabeled_t:s0 > Access: 2021-05-12 20:10:13.814459305 +0200 > Modify: 2020-07-16 11:08:41.631006000 +0200 > Change: 2020-07-16 11:08:41.630896177 +0200 > ?Birth: - > Hello, This looks like the sub-block overhead. If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). The sub-block is the minimum disk allocation for files (if the file content is too large to be kept in the inode, when that is supported on the specific GPFS filesystem). The "Blocks" value displayed by "stat" is in 512 bytes unit, so 128*512 = 65536 (which is consistent with "du"): two 32 kiB sub-blocks due to data replication. The "--apparent-size" option to "du" uses the user visible size not the actual disk usage (per the man page), so 3994 is also consistent w/ "stat" output. 
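Spelling that out with the figures quoted above (plain shell arithmetic, purely for illustration):

$ echo $(( 1048576 / 32 ))      # 32768-byte sub-block (1 MiB block / 32 sub-blocks)
$ echo $(( 2 * 32768 ))         # 65536 bytes with 2 data replicas, matching the du output
$ echo $(( 2 * 32768 / 512 ))   # 128, matching the "Blocks:" field shown by stat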
AFAIK, GPFS space quotas count the sub-blocks not the apparent sizes, so again this would be consistent with the overhead. Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | From krajaram at geocomputing.net Tue Jun 1 17:08:51 2021 From: krajaram at geocomputing.net (Kumaran Rajaram) Date: Tue, 1 Jun 2021 16:08:51 +0000 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> Message-ID: Hi, >> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. -T /srcfilesys Default mount point -V 16.00 (4.2.2.0) Current file system version 14.10 (4.1.0.4) Original file system version --create-time Tue Feb 3 11:46:10 2015 File system creation time -B 1048576 Block size -f 32768 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 32 Number of subblocks per full block The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. -T /dstfilesys Default mount point -V 23.00 (5.0.5.0) File system version --create-time Tue May 11 16:51:27 2021 File system creation time -B 4194304 Block size -f 8192 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 512 Number of subblocks per full block Hope this helps, -Kums -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Loic Tortay Sent: Tuesday, June 1, 2021 10:57 AM To: gpfsug main discussion list ; Ulrich Sibiller ; gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] du --apparent-size and quota On 6/1/21 4:26 PM, Ulrich Sibiller wrote: [...] > ) > > While trying to understand what's going on here I found this on the > source file system (which is valid for all files, with different > number of course): > > $ du --block-size 1 /srcfilesys/fileset/filename > 65536?? /srcfilesys/fileset/filename > > $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename > 3994??? /srcfilesys/fileset/filename > > $ stat /srcfilesys/fileset/filename > ? File: ?/srcfilesys/fileset/filename? > ? Size: 3994??????????? Blocks: 128??????? IO Block: 1048576 regular > file > Device: 2ah/42d Inode: 23266095??? Links: 1 > Access: (0660/-rw-rw----)? Uid: (73018/ cpunnoo)?? Gid: (50070/? > dc-rti) > Context: system_u:object_r:unlabeled_t:s0 > Access: 2021-05-12 20:10:13.814459305 +0200 > Modify: 2020-07-16 11:08:41.631006000 +0200 > Change: 2020-07-16 11:08:41.630896177 +0200 > ?Birth: - > Hello, This looks like the sub-block overhead. If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). The sub-block is the minimum disk allocation for files (if the file content is too large to be kept in the inode, when that is supported on the specific GPFS filesystem). The "Blocks" value displayed by "stat" is in 512 bytes unit, so 128*512 = 65536 (which is consistent with "du"): two 32 kiB sub-blocks due to data replication. 
The "--apparent-size" option to "du" uses the user visible size not the actual disk usage (per the man page), so 3994 is also consistent w/ "stat" output. AFAIK, GPFS space quotas count the sub-blocks not the apparent sizes, so again this would be consistent with the overhead. Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From krajaram at geocomputing.net Tue Jun 1 17:08:51 2021 From: krajaram at geocomputing.net (Kumaran Rajaram) Date: Tue, 1 Jun 2021 16:08:51 +0000 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> Message-ID: Hi, >> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. -T /srcfilesys Default mount point -V 16.00 (4.2.2.0) Current file system version 14.10 (4.1.0.4) Original file system version --create-time Tue Feb 3 11:46:10 2015 File system creation time -B 1048576 Block size -f 32768 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 32 Number of subblocks per full block The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. -T /dstfilesys Default mount point -V 23.00 (5.0.5.0) File system version --create-time Tue May 11 16:51:27 2021 File system creation time -B 4194304 Block size -f 8192 Minimum fragment (subblock) size in bytes --subblocks-per-full-block 512 Number of subblocks per full block Hope this helps, -Kums -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Loic Tortay Sent: Tuesday, June 1, 2021 10:57 AM To: gpfsug main discussion list ; Ulrich Sibiller ; gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] du --apparent-size and quota On 6/1/21 4:26 PM, Ulrich Sibiller wrote: [...] > ) > > While trying to understand what's going on here I found this on the > source file system (which is valid for all files, with different > number of course): > > $ du --block-size 1 /srcfilesys/fileset/filename > 65536?? /srcfilesys/fileset/filename > > $ du --apparent-size --block-size 1 /srcfilesys/fileset/filename > 3994??? /srcfilesys/fileset/filename > > $ stat /srcfilesys/fileset/filename > ? File: ?/srcfilesys/fileset/filename? > ? Size: 3994??????????? Blocks: 128??????? IO Block: 1048576 regular > file > Device: 2ah/42d Inode: 23266095??? Links: 1 > Access: (0660/-rw-rw----)? Uid: (73018/ cpunnoo)?? Gid: (50070/? > dc-rti) > Context: system_u:object_r:unlabeled_t:s0 > Access: 2021-05-12 20:10:13.814459305 +0200 > Modify: 2020-07-16 11:08:41.631006000 +0200 > Change: 2020-07-16 11:08:41.630896177 +0200 > ?Birth: - > Hello, This looks like the sub-block overhead. If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). 
The sub-block is the minimum disk allocation for files (if the file content is too large to be kept in the inode, when that is supported on the specific GPFS filesystem). The "Blocks" value displayed by "stat" is in 512 bytes unit, so 128*512 = 65536 (which is consistent with "du"): two 32 kiB sub-blocks due to data replication. The "--apparent-size" option to "du" uses the user visible size not the actual disk usage (per the man page), so 3994 is also consistent w/ "stat" output. AFAIK, GPFS space quotas count the sub-blocks not the apparent sizes, so again this would be consistent with the overhead. Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. Lo?c. -- | Lo?c Tortay - IN2P3 Computing Centre | _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From damir.krstic at gmail.com Tue Jun 1 17:48:26 2021 From: damir.krstic at gmail.com (Damir Krstic) Date: Tue, 1 Jun 2021 11:48:26 -0500 Subject: [gpfsug-discuss] CVE-2021-29740 Message-ID: IBM posted a security bulletin for the spectrum scale (CVE-2021-29740). Not a lot of detail provided in that bulletin. Has anyone installed this fix? Does anyone have more information about it? Thanks, Damir -------------- next part -------------- An HTML attachment was scrubbed... URL: From knop at us.ibm.com Wed Jun 2 04:51:57 2021 From: knop at us.ibm.com (Felipe Knop) Date: Wed, 2 Jun 2021 03:51:57 +0000 Subject: [gpfsug-discuss] CVE-2021-29740 In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Wed Jun 2 11:16:09 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 2 Jun 2021 12:16:09 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> Message-ID: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). > > Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. > > -T /srcfilesys Default mount point > -V 16.00 (4.2.2.0) Current file system version > 14.10 (4.1.0.4) Original file system version > --create-time Tue Feb 3 11:46:10 2015 File system creation time > -B 1048576 Block size > -f 32768 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 32 Number of subblocks per full block > > > The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. > > > -T /dstfilesys Default mount point > -V 23.00 (5.0.5.0) File system version > --create-time Tue May 11 16:51:27 2021 File system creation time > -B 4194304 Block size > -f 8192 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect a lower disk usage instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a ~13% increase! 
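For comparison, the minimum allocation unit (sub-block size) on each filesystem, taken from the two mmlsfs outputs above (plain shell arithmetic, illustration only):

$ echo $(( 1048576 / 32 ))    # srcfilesys: 32768-byte sub-block
$ echo $(( 4194304 / 512 ))   # dstfilesys: 8192-byte sub-block

So, on paper, a small file should need less space per replica on the destination, not more, which is what makes the increase surprising.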
> Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. My rsync is using -AHS, so this should not be relevant here. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From u.sibiller at science-computing.de Wed Jun 2 11:16:09 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 2 Jun 2021 12:16:09 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> Message-ID: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). > > Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. > > -T /srcfilesys Default mount point > -V 16.00 (4.2.2.0) Current file system version > 14.10 (4.1.0.4) Original file system version > --create-time Tue Feb 3 11:46:10 2015 File system creation time > -B 1048576 Block size > -f 32768 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 32 Number of subblocks per full block > > > The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. > > > -T /dstfilesys Default mount point > -V 23.00 (5.0.5.0) File system version > --create-time Tue May 11 16:51:27 2021 File system creation time > -B 4194304 Block size > -f 8192 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect a lower disk usage instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a ~13% increase! > Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. My rsync is using -AHS, so this should not be relevant here. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. 
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From jonathan.buzzard at strath.ac.uk Wed Jun 2 12:09:33 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 2 Jun 2021 12:09:33 +0100 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: On 02/06/2021 11:16, Ulrich Sibiller wrote: [SNIP] > > My rsync is using -AHS, so this should not be relevant here. > I wonder have you done more than one rsync? If so are you using --delete? If not and the source fileset has changed then you will be accumulating files at the destination and it would explain the larger size. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From u.sibiller at science-computing.de Wed Jun 2 12:44:08 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 2 Jun 2021 13:44:08 +0200 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: On 6/2/21 1:09 PM, Jonathan Buzzard wrote: >> My rsync is using -AHS, so this should not be relevant here. > > I wonder have you done more than one rsync? If so are you using --delete? > > If not and the source fileset has changed then you will be accumulating > files at the destination and it would explain the larger size. Yes, of course I have been using -delete (and -delete-excluded) ;-) Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From dugan at bu.edu Wed Jun 2 13:22:55 2021 From: dugan at bu.edu (Dugan, Michael J) Date: Wed, 2 Jun 2021 12:22:55 +0000 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> , Message-ID: Do you have sparse files on the first filesystem? Since the second filesystem has a larger blocksize than the first one, the copied file may not be sparse on the second filesystem. I think gpfs only supports holes that line up will a full filesystem block. --Mike Dugan -- Michael J. 
Dugan Manager of Systems Programming and Administration Research Computing Services | IS&T | Boston University 617-358-0030 dugan at bu.edu http://www.bu.edu/tech ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Ulrich Sibiller Sent: Wednesday, June 2, 2021 7:44 AM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] du --apparent-size and quota On 6/2/21 1:09 PM, Jonathan Buzzard wrote: >> My rsync is using -AHS, so this should not be relevant here. > > I wonder have you done more than one rsync? If so are you using --delete? > > If not and the source fileset has changed then you will be accumulating > files at the destination and it would explain the larger size. Yes, of course I have been using -delete (and -delete-excluded) ;-) Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Wed Jun 2 15:12:52 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 2 Jun 2021 22:12:52 +0800 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: Hi, The data and metadata replications are 2 on both source and destination filesystems, so from: $ mmrepquota -j srcfilesys | grep fileset srcfileset FILESET 800 800 800 0 none | 863 0 0 0 none $ mmrepquota -j dstfilesys | grep fileset fileset root FILESET 457 400 400 0 none | 853 0 0 0 none the quota data should be changed from 800G to 457G (or 400G to 228.5G), after "rsync -AHS". Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Ulrich Sibiller To: Kumaran Rajaram , gpfsug main discussion list , "gpfsug-discuss at gpfsug.org" Date: 06/02/2021 06:16 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] du --apparent-size and quota Sent by: gpfsug-discuss-bounces at spectrumscale.org On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). 
> > Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. > > -T /srcfilesys Default mount point > -V 16.00 (4.2.2.0) Current file system version > 14.10 (4.1.0.4) Original file system version > --create-time Tue Feb 3 11:46:10 2015 File system creation time > -B 1048576 Block size > -f 32768 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 32 Number of subblocks per full block > > > The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. > > > -T /dstfilesys Default mount point > -V 23.00 (5.0.5.0) File system version > --create-time Tue May 11 16:51:27 2021 File system creation time > -B 4194304 Block size > -f 8192 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect a lower disk usage instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a ~13% increase! > Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. My rsync is using -AHS, so this should not be relevant here. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From scale at us.ibm.com Wed Jun 2 15:12:52 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 2 Jun 2021 22:12:52 +0800 Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: Hi, The data and metadata replications are 2 on both source and destination filesystems, so from: $ mmrepquota -j srcfilesys | grep fileset srcfileset FILESET 800 800 800 0 none | 863 0 0 0 none $ mmrepquota -j dstfilesys | grep fileset fileset root FILESET 457 400 400 0 none | 853 0 0 0 none the quota data should be changed from 800G to 457G (or 400G to 228.5G), after "rsync -AHS". Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. 
If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Ulrich Sibiller To: Kumaran Rajaram , gpfsug main discussion list , "gpfsug-discuss at gpfsug.org" Date: 06/02/2021 06:16 PM Subject: [EXTERNAL] Re: [gpfsug-discuss] du --apparent-size and quota Sent by: gpfsug-discuss-bounces at spectrumscale.org On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size implies 32 kiB sub blocks (32 sub-blocks). > > Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x which supports only 32 sub-blocks per block. > > -T /srcfilesys Default mount point > -V 16.00 (4.2.2.0) Current file system version > 14.10 (4.1.0.4) Original file system version > --create-time Tue Feb 3 11:46:10 2015 File system creation time > -B 1048576 Block size > -f 32768 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 32 Number of subblocks per full block > > > The /dstfilesys was created with GPFS version 5.x which support greater than 32 subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with 8KiB subblock size since file-system blocksize is 4MiB. > > > -T /dstfilesys Default mount point > -V 23.00 (5.0.5.0) File system version > --create-time Tue May 11 16:51:27 2021 File system creation time > -B 4194304 Block size > -f 8192 Minimum fragment (subblock) size in bytes > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect a lower disk usage instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a ~13% increase! > Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, are not handled by "rsync" unless you specify "-H") and in some cases spare files can also explain the differences. My rsync is using -AHS, so this should not be relevant here. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From juergen.hannappel at desy.de Wed Jun 2 16:26:07 2021 From: juergen.hannappel at desy.de (Hannappel, Juergen) Date: Wed, 2 Jun 2021 17:26:07 +0200 (CEST) Subject: [gpfsug-discuss] du --apparent-size and quota In-Reply-To: References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de> Message-ID: <191496938.44228914.1622647567054.JavaMail.zimbra@desy.de> Hi, mmrepquota reports without the --block-size parameter the size in units of 1KiB, so (if no ill-advised copy-paste editing confuses us) we are not talking about 400GiB but 400KiB. With just 863 files (from the inode part of the repquota output) and therefore 0.5KiB/file on average that could be explained by the sub-block size(although many files should vanish in the inodes). If it's 400GiB in 863 files with 500MiB/File the subblock overhead would not matter at all! > From: "IBM Spectrum Scale" > To: "gpfsug main discussion list" > Cc: gpfsug-discuss-bounces at spectrumscale.org, gpfsug-discuss at gpfsug.org > Sent: Wednesday, 2 June, 2021 16:12:52 > Subject: Re: [gpfsug-discuss] du --apparent-size and quota > Hi, > The data and metadata replications are 2 on both source and destination > filesystems, so from: > $ mmrepquota -j srcfilesys | grep fileset > srcfileset FILESET 800 800 800 0 none | 863 0 0 > 0 none > $ mmrepquota -j dstfilesys | grep fileset > fileset root FILESET 457 400 400 0 none | 853 0 > 0 0 none > the quota data should be changed from 800G to 457G (or 400G to 228.5G), after > "rsync -AHS". > Regards, The Spectrum Scale (GPFS) team > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale (GPFS), > then please post it to the public IBM developerWroks Forum at [ > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > | > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > ] . > If your query concerns a potential software error in Spectrum Scale (GPFS) and > you have an IBM software maintenance contract please contact 1-800-237-5511 in > the United States or your local IBM Service Center in other countries. > The forum is informally monitored as time permits and should not be used for > priority messages to the Spectrum Scale (GPFS) team. > Ulrich Sibiller ---06/02/2021 06:16:22 PM---On 6/1/21 6:08 PM, Kumaran Rajaram > wrote: >>> If I'm not mistaken even with SS5 created filesystems, > From: Ulrich Sibiller > To: Kumaran Rajaram , gpfsug main discussion list > , "gpfsug-discuss at gpfsug.org" > > Date: 06/02/2021 06:16 PM > Subject: [EXTERNAL] Re: [gpfsug-discuss] du --apparent-size and quota > Sent by: gpfsug-discuss-bounces at spectrumscale.org > On 6/1/21 6:08 PM, Kumaran Rajaram wrote: >>>> If I'm not mistaken even with SS5 created filesystems, 1 MiB FS block size > >>> implies 32 kiB sub blocks (32 sub-blocks). >> Just to add: The /srcfilesys seemed to have been created with GPFS version 4.x > > which supports only 32 sub-blocks per block. 
> > -T /srcfilesys Default mount point > > -V 16.00 (4.2.2.0) Current file system version > > 14.10 (4.1.0.4) Original file system version > > --create-time Tue Feb 3 11:46:10 2015 File system creation time > > -B 1048576 Block size > > -f 32768 Minimum fragment (subblock) size in bytes > > --subblocks-per-full-block 32 Number of subblocks per full block >> The /dstfilesys was created with GPFS version 5.x which support greater than 32 >> subblocks per block. /dstfilesys does have 512 subblocks-per-full-block with > > 8KiB subblock size since file-system blocksize is 4MiB. > > -T /dstfilesys Default mount point > > -V 23.00 (5.0.5.0) File system version > > --create-time Tue May 11 16:51:27 2021 File system creation time > > -B 4194304 Block size > > -f 8192 Minimum fragment (subblock) size in bytes > > --subblocks-per-full-block 512 Number of subblocks per full block > Well, from the higher flexibility in terms of the number of subblocks I'd expect > a lower disk usage > instead of a higher one. Is this a wrong assumption? From 400G to 457G it's a > ~13% increase! >> Beside the overhead, hard-links in the source FS (which, if I'm not mistaken, >> are not handled by "rsync" unless you specify "-H") and in some cases spare > > files can also explain the differences. > My rsync is using -AHS, so this should not be relevant here. > Uli > -- > Science + Computing AG > Vorstandsvorsitzender/Chairman of the board of management: > Dr. Martin Matzke > Vorstand/Board of Management: > Matthias Schempp, Sabine Hohenstein > Vorsitzender des Aufsichtsrats/ > Chairman of the Supervisory Board: > Philippe Miltin > Aufsichtsrat/Supervisory Board: > Martin Wibbe, Ursula Morgenstern > Sitz/Registered Office: Tuebingen > Registergericht/Registration Court: Stuttgart > Registernummer/Commercial Register No.: HRB 382196 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > [ http://gpfsug.org/mailman/listinfo/gpfsug-discuss | > http://gpfsug.org/mailman/listinfo/gpfsug-discuss ] > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
From u.sibiller at science-computing.de Wed Jun 2 16:56:25 2021
From: u.sibiller at science-computing.de (Ulrich Sibiller)
Date: Wed, 2 Jun 2021 17:56:25 +0200
Subject: Re: [gpfsug-discuss] du --apparent-size and quota
In-Reply-To:
References: <4745fcd1-0cc0-ff45-686f-45e0f3f7a118@science-computing.de> <33158ffa-eb3d-99ea-727c-d461188e8fe9@cc.in2p3.fr> <9358df69-a67e-9fdc-b8c8-01809d143eb7@science-computing.de>
Message-ID: <2ca3f69c-ae50-1bc4-4dd2-58e42f983105@science-computing.de>

On 6/2/21 4:12 PM, IBM Spectrum Scale wrote:
> The data and metadata replications are 2 on both source and destination filesystems, so from:
>
> $ mmrepquota -j srcfilesys | grep fileset
> srcfileset FILESET 800 800 800 0 none | 863 0 0 0 none
>
> $ mmrepquota -j dstfilesys | grep fileset
> fileset root FILESET 457 400 400 0 none | 853 0 0 0 none
>
> the quota data should be changed from 800G to 457G (or 400G to 228.5G), after "rsync -AHS".

Why? Did you notice that on the dstfilesys we have
ignoreReplicationOnStatfs yes
IgnoreReplicaSpaceOnStat yes
ignoreReplicationForQuota yes
while the srcfilesys has
ignoreReplicaSpaceOnStat 0
ignoreReplicationForQuota 0
ignoreReplicationOnStatfs 0
?

Changing the quota limit to 457 on the dstfilesys will surely help for the user, but I still would like to understand why that happens. Losing > 10% of space when migrating to a newer filesystem is not something you'd expect. dstfilesys is ~6PB, so this means we lose more than 600TB, which is a serious issue I'd like to understand in detail (and maybe take countermeasures).

> Do you have sparse files on the first filesystem? Since the second filesystem
> has a larger blocksize than the first one, the copied file may not be sparse on the
> second filesystem. I think gpfs only supports holes that line up with a full filesystem
> block.
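One rough way to look for candidate sparse files is to compare allocated and apparent sizes per file, e.g. with GNU find and awk (the path is illustrative and this is only a heuristic, not a reliable check):

$ find /srcfilesys/fileset -type f -printf '%b %s %p\n' | awk '$1 * 512 < $2'   # prints files whose allocated bytes are below the apparent size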
Maybe that's an issue, but I a) use rsync -S so I guess the sparse files will be handled in the most compatible way b) have no idea how to check this reliably > mmrepquota reports without the --block-size parameter the size in units of 1KiB, so (if no ill-advised copy-paste editing confuses us) we are not talking about 400GiB but 400KiB. > With just 863 files (from the inode part of the repquota output) and therefore 0.5KiB/file on average that could be explained by the sub-block size(although many files should vanish in the inodes). > If it's 400GiB in 863 files with 500MiB/File the subblock overhead would not matter at all! Upps, you are right in assuming a copy-and-paste accident, I had called mmrepquota with --block-size G. So the values we are talking about are really GiB, not KiB. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From jonathan.buzzard at strath.ac.uk Fri Jun 4 10:12:15 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Fri, 4 Jun 2021 10:12:15 +0100 Subject: [gpfsug-discuss] CVE-2021-29740 In-Reply-To: References: Message-ID: <6aae1c6e-d46b-2fdc-daa6-be8d92882cb4@strath.ac.uk> On 01/06/2021 17:48, Damir Krstic wrote: > IBM posted a security bulletin for the spectrum?scale (CVE-2021-29740). > Not a lot of detail provided in that bulletin. Has anyone installed this > fix? Does anyone have more information about it? > Anyone know how quickly Lenovo are at putting up security fixes like this? Two days in and there is still nothing to download, which in the current security threat environment we are all operating in is bordering on unacceptable. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From leonardo.sala at psi.ch Mon Jun 7 13:46:57 2021 From: leonardo.sala at psi.ch (Leonardo Sala) Date: Mon, 7 Jun 2021 14:46:57 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster Message-ID: Hallo, we do have multiple bare-metal GPFS clusters with infiniband fabric, and I am actually considering adding some VMs in the mix, to perform admin tasks (so that the bare metal servers do not need passwordless ssh keys) and quorum nodes. Has anybody tried this? What could be the drawbacks / issues at GPFS level? Thanks a lot for the insights! cheers leo -- Paul Scherrer Institut Dr. Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjdoherty at yahoo.com Mon Jun 7 14:28:51 2021 From: jjdoherty at yahoo.com (Jim Doherty) Date: Mon, 7 Jun 2021 13:28:51 +0000 (UTC) Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: <468451058.2156544.1623072531179@mail.yahoo.com> Hello,? I have seen people do this to move manager node traffic off of the NSD servers,? 
it is one way to help scale the cluster as the manager RPC traffic doesn't need to contend with the NSD servers for bandwidth.? ? ?If you want the nodes to be able to participate in disk maintenance? (mmfsck,? ? mmrestripefs)? make sure they have enough pagepool? as a small pagepool could impact the performance of these operations.?? Jim Doherty? On Monday, June 7, 2021, 08:55:49 AM EDT, Leonardo Sala wrote: Hallo, we do have multiple bare-metal GPFS clusters with infiniband fabric, and I am actually considering adding some VMs in the mix, to perform admin tasks (so that the bare metal servers do not need passwordless ssh keys) and quorum nodes. Has anybody tried this? What could be the drawbacks / issues at GPFS level? Thanks a lot for the insights! cheers leo -- Paul Scherrer Institut Dr. Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Paul.Sanchez at deshaw.com Mon Jun 7 14:36:00 2021 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Mon, 7 Jun 2021 13:36:00 +0000 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: Hi Leo, We use VMs for Spectrum Scale all of the time (including VM-based NAS clusters that span multiple sites) and all of the cloud-based offerings do as well, so it?s pretty clearly a thing that people are using. (Note: all of my experience is on Ethernet fabrics, so keep that in mind when I?m discussing networking.) But you?re right that there are a few pitfalls, such as? 1. Licensing. The traditional PVU license model discouraged adding machines to clusters and encouraged the concentration of server roles in a way that didn?t align with best practices. If you?re on capacity based licensing then this issue is moot. (We?ve been in that model for ages, and so consequently we have years of experience with GPFS and VMs. But with PVUs we probably wouldn?t have gone this way.) 2. Virtualized networking can be flaky. In particular, I?ve found SR-IOV to be unreliable. Suddenly in the middle of a TCP session you might see GPFS complain about ?Unexpected data in message. Header dump: cccccccc cccc cccc?? from a VM whose virtual network interface has gone awry and necessitates a reboot, and which can leave corrupted data on disk when this happens, requiring you to offline mmfsck and/or spelunk through a damaged filesystem and backups to recover. Based on this, I would recommend the following: a. Do NOT use SR-IOV. If you?re using KVM then just stick with virtio (vnet and bridge interfaces). b. DO enable all of the checksum protection you can get on the cluster (e.g. nsdCksumTraditional=yes). This can act as a backstop against network reliability issues and in practice on modern machines doesn?t appear to be as big of a performance hit as it once was. (I?d recommend this for everyone honestly.) c. Think about increasing your replication factor if you?re running filesystems with only one copy of data/metadata. One of the strengths of GPFS is its support for replication, both as a throughput scaling mechanism and for redundancy, and that redundancy can buy you a lot of forgiveness if things go wrong. 3. 
Sizing. Do not be too stingy with RAM and CPU allocations for your guest nodes. Scale is excellent at multithreading for things like parallel inode scan, prefetching, etc, and remember that your quorum nodes will be token managers by default unless you assign the manager roles elsewhere, and may need to have enough RAM to support their share of the token serving workload. A stable cluster is one in which the servers aren?t thrashing for a lack of resources. Others may have additional experience and best practices to share, which would be great since I don?t see this trend going away any time soon. Good luck, Paul From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Leonardo Sala Sent: Monday, June 7, 2021 08:47 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster This message was sent by an external party. Hallo, we do have multiple bare-metal GPFS clusters with infiniband fabric, and I am actually considering adding some VMs in the mix, to perform admin tasks (so that the bare metal servers do not need passwordless ssh keys) and quorum nodes. Has anybody tried this? What could be the drawbacks / issues at GPFS level? Thanks a lot for the insights! cheers leo -- Paul Scherrer Institut Dr. Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From wallyd at us.ibm.com Mon Jun 7 16:03:13 2021 From: wallyd at us.ibm.com (Wally Dietrich) Date: Mon, 7 Jun 2021 15:03:13 +0000 Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale Message-ID: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> Hi. Is there documentation about tuning DB2 to perform well when using Spectrum Scale file systems? I'm interested in tuning both DB2 and Spectrum Scale for high performance. I'm using a stretch cluster for Disaster Recover (DR). I've found a document, but the last update was in 2013 and GPFS has changed considerably since then. Wally Dietrich wallyd at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Mon Jun 7 16:24:12 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 7 Jun 2021 16:24:12 +0100 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: On 07/06/2021 13:46, Leonardo Sala wrote: > > Hallo, > > we do have multiple bare-metal GPFS clusters with infiniband fabric, and > I am actually considering adding some VMs in the mix, to perform admin > tasks (so that the bare metal servers do not need passwordless ssh keys) > and quorum nodes. Has anybody tried this? What could be the drawbacks / > issues at GPFS level? > Unless you are doing some sort of pass through of Infiniband adapters to the VM's you will need to create an Infiniband/Ethernet router. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From janfrode at tanso.net Mon Jun 7 20:49:02 2021 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Mon, 7 Jun 2021 21:49:02 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: I?ve done this a few times. 
Once with IPoIB as daemon network, and then created a separate routed network on the hypervisor to bridge (?) between VM and IPoIB network.

Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor:
--------

To give the VMs access to the daemon network, we need to create an internal network for the VMs, that is then routed into the IPoIB network on the hypervisor.

~~~
# cat < routed34.xml routed34 EOF
# virsh net-define routed34.xml
Network routed34 defined from routed34.xml

# virsh net-start routed34
Network routed34 started

# virsh net-autostart routed34
Network routed34 marked as autostarted

# virsh net-list --all
 Name       State    Autostart   Persistent
----------------------------------------------------------
 default    active   yes         yes
 routed34   active   yes         yes
~~~
--------

I see no issue with it, but beware that the FAQ lists some required tunings if the VM is to host descOnly disks (panicOnIOHang?).

  -jf

On Mon, 7 Jun 2021 at 14:55, Leonardo Sala wrote: > Hallo, > > we do have multiple bare-metal GPFS clusters with infiniband fabric, and I > am actually considering adding some VMs in the mix, to perform admin tasks > (so that the bare metal servers do not need passwordless ssh keys) and > quorum nodes. Has anybody tried this? What could be the drawbacks / issues > at GPFS level? > > Thanks a lot for the insights! > > cheers > > leo > > -- > Paul Scherrer Institut > Dr. Leonardo Sala > Group Leader High Performance Computing > Deputy Section Head Science IT > Science IT > WHGA/036 > Forschungstrasse 111 > 5232 Villigen PSI > Switzerland > > Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: 

From heinrich.billich at id.ethz.ch Tue Jun 8 10:04:19 2021 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Tue, 8 Jun 2021 09:04:19 +0000 Subject: [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule Message-ID: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch>

Hello,

A policy run with '-I defer' and a placement rule did almost double the metadata usage of a filesystem. This filled the metadata disks to a critical level. I would like to understand if this is to be expected and 'as designed' or if I face some issue or bug. I hope a subsequent run of 'mmrestripefs -p' will reduce the metadata usage again. Thank you

I want to move all data to a new storage pool and did run a policy like

RULE 'migrate_to_Data'
  MIGRATE
  WEIGHT(0)
  TO POOL 'Data'

for each fileset with

  mmapplypolicy -I defer

Next I want to actually move the data with

  mmrestripefs -p

After the policy run metadata usage increased from 2.06TiB to 3.53TiB and filled the available metadata space by >90%. This is somewhat surprising. Will the following run of 'mmrestripefs -p' reduce the usage again, when the files are not illplaced any more?

The number of used Inodes did not change noticeably during the policy run. Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size

Thank you,

Heiner

Some details

# mmlsfs fsxxxx -f -i -B -I -m -M -r -R -V
flag                value                    description
------------------- ------------------------ -----------------------------------
 -f                 8192                     Minimum fragment (subblock) size in bytes (system pool)
                    32768                    Minimum fragment (subblock) size in bytes (other pools)
 -i                 4096                     Inode size in bytes
 -B                 1048576                  Block size (system pool)
                    4194304                  Block size (other pools)
 -I                 32768                    Indirect block size in bytes
 -m                 1                        Default number of metadata replicas
 -M                 2                        Maximum number of metadata replicas
 -r                 1                        Default number of data replicas
 -R                 2                        Maximum number of data replicas
 -V                 23.00 (5.0.5.0)          Current file system version
                    19.01 (5.0.1.0)          Original file system version

Inode Information
-----------------
Total number of used inodes in all Inode spaces:          398502837
Total number of free inodes in all Inode spaces:           94184267
Total number of allocated inodes in all Inode spaces:     492687104
Total of Maximum number of inodes in all Inode spaces:    916122880

-------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5254 bytes Desc: not available URL: 

From scale at us.ibm.com Wed Jun 9 13:54:36 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Wed, 9 Jun 2021 20:54:36 +0800 Subject: [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule In-Reply-To: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch> References: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch> Message-ID:

Hi Billich,

>Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size

Basically, a migration policy run with -I defer simply marks the files as illPlaced, which does not cause metadata extension for such files (e.g., inode size is fixed after file system creation). Instead, I'm wondering about your placement rules: are they existing rules or newly installed rules? Those could set EAs on newly created files and may cause increased metadata size. Also, are any new EAs inserted for files?

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 2021/06/08 05:19 PM Subject: [EXTERNAL] [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hello, A policy run with '-I defer' and a placement rule did almost double the metadata usage of a filesystem.
This filled the metadata disks to a critical level. I would like to understand if this is to be expected and ?as designed? or if I face some issue or bug. ?I hope a subsequent run of ?mmrstripefs -p? will reduce the metadata usage again. Thank you I want to move all data to a new storage pool and did run a policy like RULE 'migrate_to_Data' ? MIGRATE ??? WEIGHT(0) ? TO POOL 'Data' for each fileset with ? mmapplypolicy -I defer Next I want to actually move the data with ? mmrestripefs -p After the policy run metadata usage increased from 2.06TiB to 3.53TiB and filled the available metadata space by >90%. This is somewhat surprising. Will the following ?run of ?mmrestripefs -p? reduce the usage again, when the files are not illplaced any more? The number of used Inodes did not change noticeably during the policy run. Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Thank you, Heiner Some details # mmlsfs? fsxxxx -f -i -B -I -m -M -r -R -V flag??????????????? value??????????????????? description ------------------- ------------------------ ----------------------------------- -f???????????????? 8192???????????????????? Minimum fragment (subblock) size in bytes (system pool) ??????????????????? 32768??????????????????? Minimum fragment (subblock) size in bytes (other pools) -i???????????????? 4096???????????????????? Inode size in bytes -B???????????????? 1048576????????????????? Block size (system pool) ??????????????????? 4194304????????????????? Block size (other pools) -I???????????????? 32768??????????????????? Indirect block size in bytes -m???????????????? 1??????????????????????? Default number of metadata replicas -M???????????????? 2??????????????????????? Maximum number of metadata replicas -r???????????????? 1?????????????????????? ?Default number of data replicas -R???????????????? 2??????????????????????? Maximum number of data replicas -V???????????????? 23.00 (5.0.5.0)????????? Current file system version ??????????????????? 19.01 (5.0.1.0)????????? Original file system version Inode Information ----------------- Total number of used inodes in all Inode spaces:????????? 398502837 Total number of free inodes in all Inode spaces:?????????? 94184267 Total number of allocated inodes in all Inode spaces:???? 492687104 Total of Maximum number of inodes in all Inode spaces:??? 916122880 [attachment "smime.p7s" deleted by Hai Zhong HZ Zhou/China/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From jonathan.buzzard at strath.ac.uk Wed Jun 9 21:28:07 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 9 Jun 2021 21:28:07 +0100 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin Message-ID: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> So you need to apply a kernel update and that means a new gpfs.gplbin :-( So after going around the houses with several different approaches on this I have finally settled on what IMHO is a most elegant method of ensuring the right gpfs.gplbin version is installed for the kernel that is running and thought I would share it. 
This is assuming you don't like the look of the compile it option IBM introduced. You may well not want compilers installed on nodes for example, or you just think compiling the module on hundreds of nodes is suboptimal. This exact version works for RHEL and it's derivatives. Modify for your preferred distribution. It also assumes you have a repository setup with the relevant gpfs.gplbin package. The basics are to use the "ExecStartPre" option of a unit file in systemd. So because you don't want to be modifying the unit file provided by IBM something like the following mkdir -p /etc/systemd/system/gpfs.service.d echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf systemctl daemon-reload How it works is that the %v is a special systemd variable which is the same as "uname -r". So before it attempts to start gpfs, it attempts to install the gpfs.gplbin RPM for the kernel you are running on. If already installed this is harmless and if it's not installed it gets installed. How you set that up on your system is up to you, xCAT postscript, RPM package, or a configuration management solution all work. I have gone for a very minimal RPM I call gpfs.helper We then abuse the queuing system on the HPC cluster to schedule a "admin" priority job that runs as soon as the node becomes free, which does a yum update and then restarts the node. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From scale at us.ibm.com Thu Jun 10 11:29:13 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Thu, 10 Jun 2021 18:29:13 +0800 Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale In-Reply-To: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> References: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> Message-ID: Hi Wally, I don't see a dedicated document for DB2 from Scale document sets, however, usually the workloads of database are doing direct I/O, so those documentation sections in Scale for direct I/O should be good to review. Here I have a list about tunings for direct I/O for your reference. https://www.ibm.com/docs/zh/spectrum-scale/5.1.1?topic=fpo-configuration-tuning-database-workload s https://www.ibm.com/docs/zh/spectrum-scale/5.1.1?topic=applications-considerations-use-direct-io-o-direct https://www.ibm.com/docs/zh/spectrum-scale/5.1.1?topic=mfs-using-direct-io-file-in-gpfs-file-system Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Wally Dietrich To: "gpfsug-discuss at spectrumscale.org" Date: 2021/06/07 11:03 PM Subject: [EXTERNAL] [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi. 
Is there documentation about tuning DB2 to perform well when using Spectrum Scale file systems? I'm interested in tuning both DB2 and Spectrum Scale for high performance. I'm using a stretch cluster for Disaster Recover (DR). I've found a document, but the last update was in 2013 and GPFS has changed considerably since then. Wally Dietrich wallyd at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From jjdoherty at yahoo.com Thu Jun 10 14:42:18 2021 From: jjdoherty at yahoo.com (Jim Doherty) Date: Thu, 10 Jun 2021 13:42:18 +0000 (UTC) Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale In-Reply-To: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> References: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> Message-ID: <1697721993.3166627.1623332538820@mail.yahoo.com> I think I found the document you are talking about. In general I believe most of it still applies. I can make the following comments on it about Spectrum Scale: 1 - There was an effort to simplify Spectrum Scale tuning, and tuning of worker1Threads should be replaced by tuning workerThreads instead. Setting workerThreads, will auto-tune about 20 different Spectrum Scale configuration parameters (including worker1Threads) behind the scene. 2 - The Spectrum Scale pagepool parameter defaults to 1Gig now, but the most important thing is to make sure that you can fit all the IO into the pagepool. So if you have 512 threads * 1 MB you will need 1/2 Gig just to do disk IO, but if you use 4MB that becomes 512 * 4 = 2Gig just for disk IO. I would recommend setting the pagepool to 2x the size of this if you are using direct IO so 1 Gig or 4 Gig for the example sizes I just mentioned. 3 - One consideration that is important is sizing the initial DB2 database size correctly, and when the tablespace needs to grow, make sure it grows enough to avoid constantly increasing the tablespace. The act of growing a database throws GPFS into buffered IO which can be slower than directIO. If you need the database to grow all the time, I would avoid using direct IO and use a larger GPFS pagepool to allow it cache data. Using directIO is the better solution. Jim Doherty On Monday, June 7, 2021, 11:03:26 AM EDT, Wally Dietrich wrote: Hi. Is there documentation about tuning DB2 to perform well when using Spectrum Scale file systems? I'm interested in tuning both DB2 and Spectrum Scale for high performance. I'm using a stretch cluster for Disaster Recover (DR). I've found a document, but the last update was in 2013 and GPFS has changed considerably since then. Wally Dietrichwallyd at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skylar2 at uw.edu Thu Jun 10 14:47:33 2021 From: skylar2 at uw.edu (Skylar Thompson) Date: Thu, 10 Jun 2021 06:47:33 -0700 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> Message-ID: <20210610134733.lvfh2at7rtjoceuk@thargelion> Thanks, Jonathan, I've been thinking about how to manage this as well and like it more than version-locking the kernel. On Wed, Jun 09, 2021 at 09:28:07PM +0100, Jonathan Buzzard wrote: > > So you need to apply a kernel update and that means a new gpfs.gplbin :-( So > after going around the houses with several different approaches on this I > have finally settled on what IMHO is a most elegant method of ensuring the > right gpfs.gplbin version is installed for the kernel that is running and > thought I would share it. > > This is assuming you don't like the look of the compile it option IBM > introduced. You may well not want compilers installed on nodes for example, > or you just think compiling the module on hundreds of nodes is suboptimal. > > This exact version works for RHEL and it's derivatives. Modify for your > preferred distribution. It also assumes you have a repository setup with the > relevant gpfs.gplbin package. > > The basics are to use the "ExecStartPre" option of a unit file in systemd. > So because you don't want to be modifying the unit file provided by IBM > something like the following > > mkdir -p /etc/systemd/system/gpfs.service.d > echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install > gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf > systemctl daemon-reload > > How it works is that the %v is a special systemd variable which is the same > as "uname -r". So before it attempts to start gpfs, it attempts to install > the gpfs.gplbin RPM for the kernel you are running on. If already installed > this is harmless and if it's not installed it gets installed. > > How you set that up on your system is up to you, xCAT postscript, RPM > package, or a configuration management solution all work. I have gone for a > very minimal RPM I call gpfs.helper > > We then abuse the queuing system on the HPC cluster to schedule a "admin" > priority job that runs as soon as the node becomes free, which does a yum > update and then restarts the node. > > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His From novosirj at rutgers.edu Thu Jun 10 15:00:49 2021 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Thu, 10 Jun 2021 14:00:49 +0000 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <20210610134733.lvfh2at7rtjoceuk@thargelion> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk>, <20210610134733.lvfh2at7rtjoceuk@thargelion> Message-ID: <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> The problem with not version locking the kernel, however, is that you really need to know that the kernel you are going to is going to support the GPFS version that you are going to be running. 
Typically that only becomes a problem when you cross a minor release boundary on RHEL-derivatives, but I think not always. But I suppose you can just try this on something first just to make sure, or handle it at the repository level, or something else. Sent from my iPhone > On Jun 10, 2021, at 09:48, Skylar Thompson wrote: > > ?Thanks, Jonathan, I've been thinking about how to manage this as well and > like it more than version-locking the kernel. > >> On Wed, Jun 09, 2021 at 09:28:07PM +0100, Jonathan Buzzard wrote: >> >> So you need to apply a kernel update and that means a new gpfs.gplbin :-( So >> after going around the houses with several different approaches on this I >> have finally settled on what IMHO is a most elegant method of ensuring the >> right gpfs.gplbin version is installed for the kernel that is running and >> thought I would share it. >> >> This is assuming you don't like the look of the compile it option IBM >> introduced. You may well not want compilers installed on nodes for example, >> or you just think compiling the module on hundreds of nodes is suboptimal. >> >> This exact version works for RHEL and it's derivatives. Modify for your >> preferred distribution. It also assumes you have a repository setup with the >> relevant gpfs.gplbin package. >> >> The basics are to use the "ExecStartPre" option of a unit file in systemd. >> So because you don't want to be modifying the unit file provided by IBM >> something like the following >> >> mkdir -p /etc/systemd/system/gpfs.service.d >> echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install >> gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf >> systemctl daemon-reload >> >> How it works is that the %v is a special systemd variable which is the same >> as "uname -r". So before it attempts to start gpfs, it attempts to install >> the gpfs.gplbin RPM for the kernel you are running on. If already installed >> this is harmless and if it's not installed it gets installed. >> >> How you set that up on your system is up to you, xCAT postscript, RPM >> package, or a configuration management solution all work. I have gone for a >> very minimal RPM I call gpfs.helper >> >> We then abuse the queuing system on the HPC cluster to schedule a "admin" >> priority job that runs as soon as the node becomes free, which does a yum >> update and then restarts the node. >> >> >> JAB. >> >> -- >> Jonathan A. Buzzard Tel: +44141-5483420 >> HPC System Administrator, ARCHIE-WeSt. >> University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From S.J.Thompson at bham.ac.uk Thu Jun 10 17:13:46 2021 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Thu, 10 Jun 2021 16:13:46 +0000 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> <20210610134733.lvfh2at7rtjoceuk@thargelion> <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> Message-ID: We manage kernel updates pretty carefully ... not least because there is a good chance MOFED will also break at the same time. We do have a similar systemd unit that tries to install from our local repos, then tries to build locally. Simon ?On 10/06/2021, 15:01, "gpfsug-discuss-bounces at spectrumscale.org on behalf of Ryan Novosielski" wrote: The problem with not version locking the kernel, however, is that you really need to know that the kernel you are going to is going to support the GPFS version that you are going to be running. Typically that only becomes a problem when you cross a minor release boundary on RHEL-derivatives, but I think not always. But I suppose you can just try this on something first just to make sure, or handle it at the repository level, or something else. Sent from my iPhone > On Jun 10, 2021, at 09:48, Skylar Thompson wrote: > > Thanks, Jonathan, I've been thinking about how to manage this as well and > like it more than version-locking the kernel. > >> On Wed, Jun 09, 2021 at 09:28:07PM +0100, Jonathan Buzzard wrote: >> >> So you need to apply a kernel update and that means a new gpfs.gplbin :-( So >> after going around the houses with several different approaches on this I >> have finally settled on what IMHO is a most elegant method of ensuring the >> right gpfs.gplbin version is installed for the kernel that is running and >> thought I would share it. >> >> This is assuming you don't like the look of the compile it option IBM >> introduced. You may well not want compilers installed on nodes for example, >> or you just think compiling the module on hundreds of nodes is suboptimal. >> >> This exact version works for RHEL and it's derivatives. Modify for your >> preferred distribution. It also assumes you have a repository setup with the >> relevant gpfs.gplbin package. >> >> The basics are to use the "ExecStartPre" option of a unit file in systemd. >> So because you don't want to be modifying the unit file provided by IBM >> something like the following >> >> mkdir -p /etc/systemd/system/gpfs.service.d >> echo -e "[Service]\nExecStartPre=-/usr/bin/yum --assumeyes install >> gpfs.gplbin-%v" >/etc/systemd/system/gpfs.service.d/install-module.conf >> systemctl daemon-reload >> >> How it works is that the %v is a special systemd variable which is the same >> as "uname -r". So before it attempts to start gpfs, it attempts to install >> the gpfs.gplbin RPM for the kernel you are running on. If already installed >> this is harmless and if it's not installed it gets installed. 
>> >> How you set that up on your system is up to you, xCAT postscript, RPM >> package, or a configuration management solution all work. I have gone for a >> very minimal RPM I call gpfs.helper >> >> We then abuse the queuing system on the HPC cluster to schedule a "admin" >> priority job that runs as soon as the node becomes free, which does a yum >> update and then restarts the node. >> >> >> JAB. >> >> -- >> Jonathan A. Buzzard Tel: +44141-5483420 >> HPC System Administrator, ARCHIE-WeSt. >> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > -- Skylar Thompson (skylar2 at u.washington.edu) > -- Genome Sciences Department (UW Medicine), System Administrator > -- Foege Building S046, (206)-685-7354 > -- Pronouns: He/Him/His > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jonathan.buzzard at strath.ac.uk Thu Jun 10 19:08:47 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 10 Jun 2021 19:08:47 +0100 Subject: [gpfsug-discuss] GPFS systemd and gpfs.gplbin In-Reply-To: <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> References: <59a55182-efd4-348b-76c9-52152e21a097@strath.ac.uk> <20210610134733.lvfh2at7rtjoceuk@thargelion> <8AABA7D0-C204-4CA0-ADDA-6BF29068FED2@rutgers.edu> Message-ID: On 10/06/2021 15:00, Ryan Novosielski wrote:> > The problem with not version locking the kernel, however, is that you > really need to know that the kernel you are going to is going to > support the GPFS version that you are going to be running. Typically > that only becomes a problem when you cross a minor release boundary > on RHEL-derivatives, but I think not always. But I suppose you can > just try this on something first just to make sure, or handle it at > the repository level, or something else. > Well *everything* comes from a local repo mirror for all the GPFS nodes so I can control what goes in and when. I use a VM for building the gpfs.gplbin in advance and then test it on a single node before the main roll out. I would note that I read the actual release notes and then make a judgment on whether the kernel update actually makes it to my local mirror. It could be a just a bug fix, or the security issue might for example be in a driver which is not relevant to the platform I am managing. WiFi and Bluetooth drivers are examples from the past. The issue I found is you do a "yum update" and new kernel gets pulled in, and/or a new GPFS version. However the matching gpfs.gplbin is now missing and I wanted an automated process of insuring the right version of gpfs.gplbin is installed for whatever kernel happens to be running. Noting that this could also be at install time, which partly why I went with the gpfs.helper RPM. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG From luis.bolinches at fi.ibm.com Thu Jun 10 21:35:43 2021 From: luis.bolinches at fi.ibm.com (Luis Bolinches) Date: Thu, 10 Jun 2021 20:35:43 +0000 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From Achim.Rehor at de.ibm.com Fri Jun 11 16:26:43 2021 From: Achim.Rehor at de.ibm.com (Achim Rehor) Date: Fri, 11 Jun 2021 17:26:43 +0200 Subject: [gpfsug-discuss] DB2 (not DB2 PureScale) and Spectrum Scale In-Reply-To: <1697721993.3166627.1623332538820@mail.yahoo.com> References: <9961B754-EA42-4B2C-AA6D-3CAE5F4805AE@us.ibm.com> <1697721993.3166627.1623332538820@mail.yahoo.com> Message-ID: one additional noticable change, that comes in Spectrum Scale 5.0.4.2+ and is an enhancement to what Jim just touched below. Direct IO of databases is often doing small IO into huge files. Even with very fast backend, the amount of IOs doing 4k or 64k IOs limits the bandwidth because of the sheer amount of IO. Having seen this issue, we added a feature to Spectrum Scale, that batches small IO per timeslot, in order to lessen the number of IO against the backend, and thus improving write performance. the new feature is tuned by the dioSmallSeqWriteBatching = yes[no] and will batch all smaller IO, that is dioSmallSeqWriteThreshold = [65536] or smaller in size , and dump it to disk avery aioSyncDelay = 10 (usec). That is, if the system recognizes 3 or more small Direct IOs and dioSmallSeqWriteThreshold is set, it will gather all these IOs within aioSyncDelay and do just one IO (per FS Blocksize) instead of hundreds of small IOs. For certain use cases this can dramatically improve performance. see https://www.spectrumscaleug.org/wp-content/uploads/2020/04/SSSD20DE-Spectrum-Scale-Performance-Enhancements-for-Direct-IO.pdf by Olaf Weiser Mit freundlichen Gr??en / Kind regards Achim Rehor Remote Technical Support Engineer Storage IBM Systems Storage Support - EMEA Storage Competence Center (ESCC) Spectrum Scale / Elastic Storage Server ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Am Weiher 24 65451 Kelsterbach Phone: +49-170-4521194 E-Mail: Achim.Rehor at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Sebastian Krause Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Agnes Heftberger, Norbert Janzen, Markus Koerner, Christian Noll, Nicole Reimer Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940 gpfsug-discuss-bounces at spectrumscale.org wrote on 10/06/2021 15:42:18: > From: Jim Doherty > To: "gpfsug-discuss at spectrumscale.org" > Date: 10/06/2021 15:42 > Subject: [EXTERNAL] Re: [gpfsug-discuss] DB2 (not DB2 PureScale) and > Spectrum Scale > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > I think I found the document you are talking about. In general I > believe most of it still applies. I can make the following comments > on it about Spectrum Scale: 1 - There was an effort to simplify > Spectrum Scale tuning, and tuning of worker1Threads ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ZjQcmQRYFpfptBannerEnd > I think I found the document you are talking about. 
In general I > believe most of it still applies. I can make the following comments > on it about Spectrum Scale: > 1 - There was an effort to simplify Spectrum Scale tuning, and > tuning of worker1Threads should be replaced by tuning workerThreads > instead. Setting workerThreads, will auto-tune about 20 different > Spectrum Scale configuration parameters (including worker1Threads) > behind the scene. > 2 - The Spectrum Scale pagepool parameter defaults to 1Gig now, but > the most important thing is to make sure that you can fit all the IO > into the pagepool. So if you have 512 threads * 1 MB you will need > 1/2 Gig just to do disk IO, but if you use 4MB that becomes 512 * 4 > = 2Gig just for disk IO. I would recommend setting the pagepool to > 2x the size of this if you are using direct IO so 1 Gig or 4 Gig for > the example sizes I just mentioned. > 3 - One consideration that is important is sizing the initial DB2 > database size correctly, and when the tablespace needs to grow, make > sure it grows enough to avoid constantly increasing the tablespace. > The act of growing a database throws GPFS into buffered IO which can > be slower than directIO. If you need the database to grow all the > time, I would avoid using direct IO and use a larger GPFS pagepool > to allow it cache data. Using directIO is the better solution. > > Jim Doherty > > On Monday, June 7, 2021, 11:03:26 AM EDT, Wally Dietrich > wrote: > > Hi. Is there documentation about tuning DB2 to perform well when > using Spectrum Scale file systems? I'm interested in tuning both DB2 > and Spectrum Scale for high performance. I'm using a stretch cluster > for Disaster Recover (DR). I've found a document, but the last > update was in 2013 and GPFS has changed considerably since then. > > Wally Dietrich > wallyd at us.ibm.com > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > INVALID URI REMOVED > u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=RGTETs2tk0Kz_VOpznDVDkqChhnfLapOTkxLvgmR2- > M&m=0w6BrJDJDqZrylo3ICWwqF7uFCQ5smwrDGjZm8xpKjU&s=7CZY0jIPCvfodrfNQoZlx3N2Dh9n7m-5mQkP5zhzI- > I&e= From leonardo.sala at psi.ch Thu Jun 17 07:35:45 2021 From: leonardo.sala at psi.ch (Leonardo Sala) Date: Thu, 17 Jun 2021 08:35:45 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: Hallo everybody thanks for the feedback! So, what it is suggested is to create on the VM (in my case hosted on vSphere, with only one NIC) a secondary IP within the IPoIP range, and create a route for that IP range to go over the public IP (and create a similar route on my bare-metal servers, so that the VM IPoIB IPs are reached over the public network) - is that correct? The only other options would be to ditch IPoIB as daemon network, right? What happens if some nodes have access to the daemon network over IPoIB, and other not - GPFS goes back to public ip cluster wide, or else? Thanks again! regards leo Paul Scherrer Institut Dr. 
Leonardo Sala Group Leader High Performance Computing Deputy Section Head Science IT Science IT WHGA/036 Forschungstrasse 111 5232 Villigen PSI Switzerland Phone: +41 56 310 3369 leonardo.sala at psi.ch www.psi.ch On 07.06.21 21:49, Jan-Frode Myklebust wrote: > > I?ve done this a few times. Once with IPoIB as daemon network, and > then created a separate routed network on the hypervisor to bridge (?) > between VM and IPoIB network. > > Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor: > ???????? > > To give the VMs access to the daemon network, we need create an > internal network for the VMs, that is then routed into the IPoIB > network on the hypervisor. > > ~~~ > # cat < routed34.xml > > routed34 > > > ? > > > > > > EOF > # virsh net-define routed34.xml > Network routed34 defined from routed34.xml > > # virsh net-start routed34 > Network routed34 started > > # virsh net-autostart routed34 > Network routed34 marked as autostarted > > # virsh net-list --all > ?Name ? ? ? ? ? ? State ? ? ?Autostart ? ? Persistent > ---------------------------------------------------------- > ?default ? ? ? ? ? ?active ? ? yes ? ? ? ? ? yes > ?routed34 ? ? ? ? ? active ? ? yes ? ? ? ? ? yes > > ~~~ > > ????????- > > > I see no issue with it ? but beware that the FAQ lists some required > tunings if the VM is to host desconly disks (paniconiohang?)? > > > > ? -jf > > > man. 7. jun. 2021 kl. 14:55 skrev Leonardo Sala >: > > Hallo, > > we do have multiple bare-metal GPFS clusters with infiniband > fabric, and I am actually considering adding some VMs in the mix, > to perform admin tasks (so that the bare metal servers do not need > passwordless ssh keys) and quorum nodes. Has anybody tried this? > What could be the drawbacks / issues at GPFS level? > > Thanks a lot for the insights! > > cheers > > leo > > -- > Paul Scherrer Institut > Dr. Leonardo Sala > Group Leader High Performance Computing > Deputy Section Head Science IT > Science IT > WHGA/036 > Forschungstrasse 111 > 5232 Villigen PSI > Switzerland > > Phone: +41 56 310 3369 > leonardo.sala at psi.ch > www.psi.ch > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Thu Jun 17 09:29:42 2021 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 17 Jun 2021 10:29:42 +0200 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: *All* nodes needs to be able to communicate on the daemon network. If they don't have access to this network, they can't join the cluster. It doesn't need to be same subnet, it can be routed. But they all have to be able to reach each other. If you use IPoIB, you likely need something to route between the IPoIB network and the outside world to reach the IP you have on your VM. I don't think you will be able to use an IP address in the IPoIB range for your VM, unless your vmware hypervisor is connected to the IB fabric, and can bridge it.. (doubt that's possible). I've seen some customers avoid using IPoIB, and rather mix an ethernet for daemon network, and dedicate the infiniband network to RDMA. 
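(To make the routed option concrete, an illustrative sketch with made-up addresses: 10.10.0.0/24 as the IPoIB daemon network, 192.168.1.0/24 as the VM's network, and a dual-homed host at 192.168.1.1 / 10.10.0.1 doing the forwarding:

# on the dual-homed router, allow forwarding between the two networks
sysctl -w net.ipv4.ip_forward=1

# on the VM, reach the IPoIB daemon network via the router
ip route add 10.10.0.0/24 via 192.168.1.1

# on the IPoIB-connected nodes, a route back to the VM's network
ip route add 192.168.1.0/24 via 10.10.0.1

With that in place every node, including the VM, can reach every other node's daemon address, which is the requirement stated above.)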
-jf On Thu, Jun 17, 2021 at 8:35 AM Leonardo Sala wrote: > Hallo everybody > > thanks for the feedback! So, what it is suggested is to create on the VM > (in my case hosted on vSphere, with only one NIC) a secondary IP within the > IPoIP range, and create a route for that IP range to go over the public IP > (and create a similar route on my bare-metal servers, so that the VM IPoIB > IPs are reached over the public network) - is that correct? > > The only other options would be to ditch IPoIB as daemon network, right? > What happens if some nodes have access to the daemon network over IPoIB, > and other not - GPFS goes back to public ip cluster wide, or else? > > Thanks again! > > regards > > leo > > Paul Scherrer Institut > Dr. Leonardo Sala > Group Leader High Performance Computing > Deputy Section Head Science IT > Science IT > WHGA/036 > Forschungstrasse 111 > 5232 Villigen PSI > Switzerland > > Phone: +41 56 310 3369leonardo.sala at psi.chwww.psi.ch > > On 07.06.21 21:49, Jan-Frode Myklebust wrote: > > > I?ve done this a few times. Once with IPoIB as daemon network, and then > created a separate routed network on the hypervisor to bridge (?) between > VM and IPoIB network. > > Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor: > ???????? > > To give the VMs access to the daemon network, we need create an internal > network for the VMs, that is then routed into the IPoIB network on the > hypervisor. > > ~~~ > # cat < routed34.xml > > routed34 > > > > > > > > > EOF > # virsh net-define routed34.xml > Network routed34 defined from routed34.xml > > # virsh net-start routed34 > Network routed34 started > > # virsh net-autostart routed34 > Network routed34 marked as autostarted > > # virsh net-list --all > Name State Autostart Persistent > ---------------------------------------------------------- > default active yes yes > routed34 active yes yes > > ~~~ > > ????????- > > > I see no issue with it ? but beware that the FAQ lists some required > tunings if the VM is to host desconly disks (paniconiohang?)? > > > > -jf > > > man. 7. jun. 2021 kl. 14:55 skrev Leonardo Sala : > >> Hallo, >> >> we do have multiple bare-metal GPFS clusters with infiniband fabric, and >> I am actually considering adding some VMs in the mix, to perform admin >> tasks (so that the bare metal servers do not need passwordless ssh keys) >> and quorum nodes. Has anybody tried this? What could be the drawbacks / >> issues at GPFS level? >> >> Thanks a lot for the insights! >> >> cheers >> >> leo >> >> -- >> Paul Scherrer Institut >> Dr. Leonardo Sala >> Group Leader High Performance Computing >> Deputy Section Head Science IT >> Science IT >> WHGA/036 >> Forschungstrasse 111 >> 5232 Villigen PSI >> Switzerland >> >> Phone: +41 56 310 3369leonardo.sala at psi.chwww.psi.ch >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From heinrich.billich at id.ethz.ch Thu Jun 17 12:53:03 2021 From: heinrich.billich at id.ethz.ch (Billich Heinrich Rainer (ID SD)) Date: Thu, 17 Jun 2021 11:53:03 +0000 Subject: [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule In-Reply-To: References: <139C805E-7D5D-4975-979C-58DF97061380@id.ethz.ch> Message-ID: <853DB9D7-9A3D-494F-88E4-BF448903C13E@id.ethz.ch>

Hello,

Thank you for your response. I opened a case with IBM and what we found is, as I understand it: if you change the storage pool of a file which has a copy in a snapshot, the inode is duplicated (copy on write). The data pool is part of the inode and it's preserved in the snapshot, so the snapshot gets its own inode version. Even if the file's blocks actually did move to storage pool B, the snapshot still shows the previous storage pool A. Once the snapshots get deleted the additional metadata space is freed.

Probably backup software does save the storage pool, too. Hence the snapshot must preserve the original value. You can easily verify with mmlsattr that the snapshot version and the plain version show different storage pools.

I saw about 4500 bytes extra space required for each inode when I did run the migration rule which changed the storage pool.

Kind regards,

Heiner

From: on behalf of IBM Spectrum Scale Reply to: gpfsug main discussion list Date: Wednesday, 9 June 2021 at 14:55 To: gpfsug main discussion list Cc: "gpfsug-discuss-bounces at spectrumscale.org" Subject: Re: [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule

Hi Billich,

>Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size

Basically, a migration policy run with -I defer simply marks the files as illPlaced, which does not cause metadata extension for such files (e.g., inode size is fixed after file system creation). Instead, I'm wondering about your placement rules: are they existing rules or newly installed rules? Those could set EAs on newly created files and may cause increased metadata size. Also, are any new EAs inserted for files?

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

"Billich Heinrich Rainer (ID SD)" ---2021/06/08 05:19:32 PM--- Hello, From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 2021/06/08 05:19 PM Subject: [EXTERNAL] [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hello, A policy run with '-I defer' and a placement rule did almost double the metadata usage of a filesystem.
I hope a subsequent run of ?mmrstripefs -p? will reduce the metadata usage again. Thank you I want to move all data to a new storage pool and did run a policy like RULE 'migrate_to_Data' MIGRATE WEIGHT(0) TO POOL 'Data' for each fileset with mmapplypolicy -I defer Next I want to actually move the data with mmrestripefs -p After the policy run metadata usage increased from 2.06TiB to 3.53TiB and filled the available metadata space by >90%. This is somewhat surprising. Will the following run of ?mmrestripefs -p? reduce the usage again, when the files are not illplaced any more? The number of used Inodes did not change noticeably during the policy run. Or maybe illplaced files use larger inodes? Looks like for each used inode we increased by about 4k: 400M inodes, 1.6T increase in size Thank you, Heiner Some details # mmlsfs fsxxxx -f -i -B -I -m -M -r -R -V flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes (system pool) 32768 Minimum fragment (subblock) size in bytes (other pools) -i 4096 Inode size in bytes -B 1048576 Block size (system pool) 4194304 Block size (other pools) -I 32768 Indirect block size in bytes -m 1 Default number of metadata replicas -M 2 Maximum number of metadata replicas -r 1 Default number of data replicas -R 2 Maximum number of data replicas -V 23.00 (5.0.5.0) Current file system version 19.01 (5.0.1.0) Original file system version Inode Information ----------------- Total number of used inodes in all Inode spaces: 398502837 Total number of free inodes in all Inode spaces: 94184267 Total number of allocated inodes in all Inode spaces: 492687104 Total of Maximum number of inodes in all Inode spaces: 916122880[attachment "smime.p7s" deleted by Hai Zhong HZ Zhou/China/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5254 bytes Desc: not available URL: From jonathan.buzzard at strath.ac.uk Thu Jun 17 13:15:32 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 17 Jun 2021 13:15:32 +0100 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: <8fae4157-049b-e23c-0d69-c07be77d1f5b@strath.ac.uk> On 17/06/2021 09:29, Jan-Frode Myklebust wrote: > *All* nodes needs to be able to communicate on the daemon network. If > they don't have access to this network, they can't join the cluster. Not strictly true. TL;DR if all your NSD/master nodes are both Ethernet and Infiniband connected then you will be able to join the node to the network. Doing so is not advisable however as you will then start experiencing node evictions left right and centre. > It doesn't need to be same subnet, it can be routed. But they all have to > be able to reach each other. If you use IPoIB, you likely need something > to route between the IPoIB network and the outside?world to reach the IP > you have on your VM. 
From jonathan.buzzard at strath.ac.uk Thu Jun 17 13:15:32 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 17 Jun 2021 13:15:32 +0100 Subject: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster In-Reply-To: References: Message-ID: <8fae4157-049b-e23c-0d69-c07be77d1f5b@strath.ac.uk>

On 17/06/2021 09:29, Jan-Frode Myklebust wrote:
> *All* nodes needs to be able to communicate on the daemon network. If
> they don't have access to this network, they can't join the cluster.

Not strictly true. TL;DR if all your NSD/master nodes are both Ethernet and Infiniband connected then you will be able to join the node to the network. Doing so is not advisable however as you will then start experiencing node evictions left right and centre.

> It doesn't need to be same subnet, it can be routed. But they all have to
> be able to reach each other. If you use IPoIB, you likely need something
> to route between the IPoIB network and the outside world to reach the IP
> you have on your VM. I don't think you will be able to use an IP address
> in the IPoIB range for your VM, unless your vmware hypervisor is
> connected to the IB fabric, and can bridge it.. (doubt that's possible).

ESXi and pretty much every other hypervisor worth its salt has been able to do PCI pass-through since forever. So whack an Infiniband card in your ESXi node, pass it through to the VM and the job's a goodun. However it is something a lot of people are completely unaware of, including Infiniband/Omnipath vendors. Conversation goes: can I run my fabric manager on a VM in ESXi rather than burn the planet on dedicated nodes for the job? Response comes back: the fabric is not supported on ESXi, which shows utter ignorance on behalf of the fabric vendor.

> I've seen some customers avoid using IPoIB, and rather mix an ethernet
> for daemon network, and dedicate the infiniband network to RDMA.
> What's the point of RDMA for GPFS, lower CPU overhead?

To my mind it creates a lot of inflexibility. If your next cluster uses a different fabric, migration is now a whole bunch more complicated. It's also a "minority sport" so something to be avoided unless there is a compelling reason not to.

In general you need a machine to act as a gateway between the Ethernet and Infiniband fabrics. The configuration for this is minimal; the following works just fine on RHEL7 and its derivatives, though you will need to change your interface names to suit.

Enable the kernel to forward IPv4 packets:

sysctl -w net.ipv4.ip_forward=1
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf

Tell the firewall to forward packets between the Ethernet and Infiniband interfaces:

iptables -A FORWARD -i eth0 -o ib0 -j ACCEPT
iptables -A FORWARD -i ib0 -o eth0 -j ACCEPT
echo "-P INPUT ACCEPT -P FORWARD ACCEPT -P OUTPUT ACCEPT -A FORWARD -i eth0 -o ib0 -j ACCEPT -A FORWARD -i ib0 -o eth0 -j ACCEPT" > /etc/sysconfig/iptables

Enable and start the iptables service (firewalld would ignore /etc/sysconfig/iptables):

systemctl enable --now iptables

However this approach has "issues", as you now have a single point of failure on your system. TL;DR if the gateway goes away for any reason node ejections abound, so you can't restart it to apply security updates.

On our system it is mainly a plain Ethernet (minimum 10Gbps) GPFS fabric using plain TCP/IP. However the teaching HPC cluster nodes only have 1Gbps Ethernet and 40Gbps Infiniband (they were kept from a previous system that used Lustre over Infiniband), so the storage goes over Infiniband and we hooked a spare port on the ConnectX-4 cards on the DSS-G nodes to the Infiniband fabric. So the Ethernet/Infiniband gateway is only used as the nodes chat to one another. Further, when a teaching node responds on the daemon network to a compute node it actually goes out the Ethernet network of the node. You could fix that but it's complicated configuration.

This leads to the option of running a pair of nodes that will route between the networks and then running keepalived on the Ethernet side to provide redundancy, using VRRP to shift the gateway IP between the two nodes. You might be able to do the same for the Infiniband side, I have never tried, but in general it is unnecessary IMHO.

I initially wanted to run this on the DSS-G nodes themselves because the amount of bridged traffic is tiny; 110 days since my gateway was last rebooted have produced a bit under 16GB of forwarded traffic. The DSS-G nodes are ideally placed to do the routing, having loads of redundant Ethernet connectivity.
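As a rough sketch of what that keepalived/VRRP setup could look like on the Ethernet side, the interface name, virtual router ID and gateway address below are made-up examples, and the second gateway node would run the same configuration with a lower priority:

yum install -y keepalived
cat > /etc/keepalived/keepalived.conf <<'EOF'
vrrp_instance GPFS_GATEWAY {
    state BACKUP              # run both nodes as BACKUP and let priority decide
    interface eth0            # Ethernet-side interface carrying the gateway IP
    virtual_router_id 51
    priority 150              # e.g. 100 on the second gateway node
    advert_int 1
    nopreempt
    virtual_ipaddress {
        192.0.2.254/24        # the IP the Infiniband-only nodes use as next hop
    }
}
EOF
systemctl enable --now keepalived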
However it turns out running keepalived on the DSS-G nodes is not allowed :-( So I still have a single point of failure on the system and debating what to do next. Given RHEL8 has removed the driver support for the Intel Quickpath Infiniband cards a wholesale upgrade to 10Gbps Ethenet is looking attractive. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From pbasmaji at us.ibm.com Fri Jun 18 07:50:04 2021 From: pbasmaji at us.ibm.com (IBM Storage) Date: Fri, 18 Jun 2021 02:50:04 -0400 (EDT) Subject: [gpfsug-discuss] Don't miss out! Get your Spectrum Scale t-shirt by June 21 Message-ID: <1136599776589.1135676995690.1087424846.0.290250JL.2002@scheduler.constantcontact.com> Limited edition shirt available Don't Miss Out! Get your limited-edition IBM Spectrum Scale t-shirt by Tuesday, June 21 Hi Spectrum Scale User Group! First, thank you for being a valued member of the independent IBM Spectrum Scale User Group, and supporting your peers in the technical community. It's been a long time since we've gathered in person, and we hope that will change soon. I'm writing to tell you that due to COVID, we have limited-edition Spectrum Scale t-shirts available now through Tuesday, June 21, and I want to invite you to place your order directly below. After that time, we will no longer be able to distribute them directly to you. That's why I'm asking you to distribute this email in your organization before June 21 so we can get this stock into the hands of your teams, our users, customers and partners while there's still time! Only individual orders can be accepted, and up to 10 colleagues per company can receive t-shirts, if they claim them by this Tuesday. (Unfortunatey, government-owned entitles (GOEs) cannot participate.) Send My T-Shirt Send My T-Shirt If you have questions, please contact me by replying to this email. Thank you, Best regards, Peter M Basmajian Product Marketing and Advocacy IBM Storage *Terms and conditions apply. See website for details. IBM Storage | 425 Market Street, San Francisco, CA 94105 Unsubscribe gpfsug-discuss at spectrumscale.org Update Profile | Our Privacy Policy | Constant Contact Data Notice Sent by pbasmaji at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From oluwasijibomi.saula at ndsu.edu Tue Jun 22 16:17:16 2021 From: oluwasijibomi.saula at ndsu.edu (Saula, Oluwasijibomi) Date: Tue, 22 Jun 2021 15:17:16 +0000 Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? Message-ID: Hello, While reviewing AMS software suite for installation, I noticed this remark (https://www.scm.com/doc/Installation/Additional_Information_and_Known_Issues.html#gpfs-file-system): ----- GPFS file system Starting with AMS2019, the KF sub-system (used for handling binary files such as ADF?s TAPE* files) has been rewritten to use memory-mapped files. The mmap() system call implementation is file-system dependent and, unfortunately, it is not equally efficient in different file systems. The memory-mapped files implementation in GPFS is extremely inefficient. Therefore the users should avoid using a GPFS for scratch files -------- Is this claim true? Are there caveats to this statement, if true? 
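One way to put the claim to the test on a particular Spectrum Scale installation is to compare mmap-based random reads against ordinary buffered reads with fio on the same directory; the scratch path and sizes below are made-up values and fio has to be installed on the client node:

cd /gpfs/scratch/testdir        # hypothetical directory on the GPFS filesystem
# random 4k reads through mmap
fio --name=mmap-read --ioengine=mmap --rw=randread --bs=4k --size=2g --runtime=60 --time_based
# the same workload through plain pread() for comparison
fio --name=psync-read --ioengine=psync --rw=randread --bs=4k --size=2g --runtime=60 --time_based
# A large gap between the two runs suggests the mmap path is the bottleneck at
# this code level; repeating the test on a local disk gives a further baseline.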
Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Jun 22 16:55:54 2021 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Tue, 22 Jun 2021 15:55:54 +0000 Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? Message-ID: <14CCBDE7-AB03-456B-806B-6AD1A8270A7D@bham.ac.uk> There certainly *were* issues. See for example: http://files.gpfsug.org/presentations/2018/London/6_GPFSUG_EBI.pdf And the follow on IBM talk on the same day: http://files.gpfsug.org/presentations/2018/London/6_MMAP_V2.pdf And also from this year: https://www.spectrumscaleug.org/event/ssugdigital-spectrum-scale-expert-talks-update-on-performance-enhancements-in-spectrum-scale/ So it may have been true. If it still is, maybe, but it will depend on your GPFS code. Simon From: on behalf of "Saula, Oluwasijibomi" Reply to: "gpfsug-discuss at spectrumscale.org" Date: Tuesday, 22 June 2021 at 16:17 To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] GPFS bad at memory-mapped files? Hello, While reviewing AMS software suite for installation, I noticed this remark (https://www.scm.com/doc/Installation/Additional_Information_and_Known_Issues.html#gpfs-file-system): ----- GPFS file system Starting with AMS2019, the KF sub-system (used for handling binary files such as ADF?s TAPE* files) has been rewritten to use memory-mapped files. The mmap() system call implementation is file-system dependent and, unfortunately, it is not equally efficient in different file systems. The memory-mapped files implementation in GPFS is extremely inefficient. Therefore the users should avoid using a GPFS for scratch files -------- Is this claim true? Are there caveats to this statement, if true? Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu [cid:image001.gif at 01D57DE0.91C300C0] -------------- next part -------------- An HTML attachment was scrubbed... URL: From oluwasijibomi.saula at ndsu.edu Tue Jun 22 17:26:52 2021 From: oluwasijibomi.saula at ndsu.edu (Saula, Oluwasijibomi) Date: Tue, 22 Jun 2021 16:26:52 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 113, Issue 19 In-Reply-To: References: Message-ID: Simon, Thanks for the quick response and related information! We are at least at v5.0.5 so we shouldn't see much exposure to this issue then. 
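Since the mmap improvements are tied to the code level, it can be worth confirming what a cluster and filesystem are actually running before drawing conclusions; a quick check might look like this (the filesystem name is an example):

/usr/lpp/mmfs/bin/mmdiag --version            # GPFS daemon version on this node
/usr/lpp/mmfs/bin/mmlsconfig minReleaseLevel  # cluster-wide minimum release level
/usr/lpp/mmfs/bin/mmlsfs gpfs01 -V            # filesystem format version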
Thanks, Oluwasijibomi (Siji) Saula HPC Systems Administrator / Information Technology Research 2 Building 220B / Fargo ND 58108-6050 p: 701.231.7749 / www.ndsu.edu

________________________________
From: gpfsug-discuss-bounces at spectrumscale.org on behalf of gpfsug-discuss-request at spectrumscale.org
Sent: Tuesday, June 22, 2021 10:56 AM
To: gpfsug-discuss at spectrumscale.org
Subject: gpfsug-discuss Digest, Vol 113, Issue 19
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From u.sibiller at science-computing.de Wed Jun 23 11:10:28 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 23 Jun 2021 12:10:28 +0200 Subject: [gpfsug-discuss] mmbackup with own policy Message-ID:

Hello,

mmbackup offers -P to specify an own policy. Unfortunately I cannot seem to find documentation of how that policy has to look. I mean, if I grab the policy generated automatically by mmbackup it looks like this:

---------------------------------------------------------
/* Auto-generated GPFS policy rules file
 * Generated on Sat May 29 15:10:46 2021
 */

/* Server rules for backup server 1
 *** back5_2 ***
 */
RULE EXTERNAL LIST 'mmbackup.1.back5_2' EXEC '/net/gpfso/fs1/.mmbackupCfg/BAexecScript.gpfsofs1' OPTS '"/net/gpfso/fs1/.mmbackupShadow.1.back5_2.filesys.update" "-servername=back5_2" "-auditlogname=/net/gpfso/fs1/mmbackup.audit.gpfsofs1.back5_2" "NONE"'
RULE 'BackupRule' LIST 'mmbackup.1.back5_2' DIRECTORIES_PLUS
     SHOW(VARCHAR(MODIFICATION_TIME) || ' ' || VARCHAR(CHANGE_TIME) || ' ' || VARCHAR(FILE_SIZE) || ' ' || VARCHAR(FILESET_NAME) || ' ' || 'resdnt' )
     WHERE
     (
       NOT
       ( (PATH_NAME LIKE '/%/.mmbackup%') OR
         (PATH_NAME LIKE '/%/.mmLockDir' AND MODE LIKE 'd%') OR
         (PATH_NAME LIKE '/%/.mmLockDir/%') OR
         (MODE LIKE 's%')
       )
     )
     AND
     (MISC_ATTRIBUTES LIKE '%u%')
     AND
...
---------------------------------------------------------

If I want to use my own policy, what of all that is required for mmbackup to find the information it needs?

Uli
-- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196

From yeep at robust.my Wed Jun 23 12:08:20 2021 From: yeep at robust.my (T.A. Yeep) Date: Wed, 23 Jun 2021 19:08:20 +0800 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: Message-ID: Hi Dr.
Martin, You can refer to the Administrator Guide > Chapter 30 > Policies for automating file management, or access via the link below. If you downloaded a PDF, it starts with page 487. https://www.ibm.com/docs/en/spectrum-scale/5.1.0?topic=management-policy-rules There a quite a number of examples in that chapter too which can help you establish a good understanding of how to write one yourself. On Wed, Jun 23, 2021 at 6:10 PM Ulrich Sibiller < u.sibiller at science-computing.de> wrote: > Hallo, > > mmbackup offers -P to specify an own policy. Unfortunately I cannot seem > to find documentation how > that policy has to look like. > > I mean, if I grab the policy generated automatically by mmbackup it looks > like this: > > --------------------------------------------------------- > /* Auto-generated GPFS policy rules file > * Generated on Sat May 29 15:10:46 2021 > */ > > /* Server rules for backup server 1 > *** back5_2 *** > */ > RULE EXTERNAL LIST 'mmbackup.1.back5_2' EXEC > '/net/gpfso/fs1/.mmbackupCfg/BAexecScript.gpfsofs1' > OPTS '"/net/gpfso/fs1/.mmbackupShadow.1.back5_2.filesys.update" > "-servername=back5_2" > "-auditlogname=/net/gpfso/fs1/mmbackup.audit.gpfsofs1.back5_2" "NONE"' > RULE 'BackupRule' LIST 'mmbackup.1.back5_2' DIRECTORIES_PLUS > SHOW(VARCHAR(MODIFICATION_TIME) || ' ' || VARCHAR(CHANGE_TIME) || ' > ' || > VARCHAR(FILE_SIZE) || ' ' || VARCHAR(FILESET_NAME) || > ' ' || 'resdnt' ) > WHERE > ( > NOT > ( (PATH_NAME LIKE '/%/.mmbackup%') OR > (PATH_NAME LIKE '/%/.mmLockDir' AND MODE LIKE 'd%') OR > (PATH_NAME LIKE '/%/.mmLockDir/%') OR > (MODE LIKE 's%') > ) > ) > AND > (MISC_ATTRIBUTES LIKE '%u%') > AND > ... > --------------------------------------------------------- > > > If I want use an own policy what of all that is required for mmbackup to > find the information it needs? > > Uli > -- > Science + Computing AG > Vorstandsvorsitzender/Chairman of the board of management: > Dr. Martin Matzke > Vorstand/Board of Management: > Matthias Schempp, Sabine Hohenstein > Vorsitzender des Aufsichtsrats/ > Chairman of the Supervisory Board: > Philippe Miltin > Aufsichtsrat/Supervisory Board: > Martin Wibbe, Ursula Morgenstern > Sitz/Registered Office: Tuebingen > Registergericht/Registration Court: Stuttgart > Registernummer/Commercial Register No.: HRB 382196 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Best regards *T.A. Yeep* -------------- next part -------------- An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Wed Jun 23 12:31:46 2021 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 23 Jun 2021 11:31:46 +0000 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From u.sibiller at science-computing.de Wed Jun 23 13:19:59 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 23 Jun 2021 14:19:59 +0200 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: Message-ID: On 6/23/21 1:08 PM, T.A. Yeep wrote: > You can refer to the Administrator Guide > Chapter 30 > Policies for automating file management, or > access via the link below. If you downloaded a?PDF, it starts with page?487. > https://www.ibm.com/docs/en/spectrum-scale/5.1.0?topic=management-policy-rules > > > There a quite a number of examples in that chapter too which can help you establish a good > understanding of how to write?one yourself. 
Thanks, I know how to write policies. I just had the impression that regarding mmbackup the policy has to follow certain rules to satisfy mmbackup requirements.

Kind regards, Uli
-- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196

From u.sibiller at science-computing.de Wed Jun 23 15:15:53 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Wed, 23 Jun 2021 16:15:53 +0200 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: Message-ID: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de>

On 6/23/21 1:31 PM, Frederick Stock wrote:
> The only requirement for your own backup policy is that it finds the files you want to back up and
> skips those that you do not want to back up. It is no different than any policy that you would use
> with the GPFS policy engine.

Have you ever successfully done this? Let me explain again: with my own policy I know what I want to see in the output, and for an arbitrary external rule I know where the called script resides and what parameters it expects. mmbackup however creates a helper script (BAexecScript.<fsname>) on the fly and calls that from its autogenerated policy via the external rule line. I assume I need the BAexecScript to make mmbackup behave like it should. Is this wrong?

Update: after playing around for quite some time it looks like the rule must have a special format, as shown below. Replace the placeholders like this:

<server>     - the name of the backup server as defined in dsm.sys and on the mmbackup command line via --tsm-servers
<fsname>     - as shown in mmlsfs all_local (without /dev/)
<mountpoint> - as shown in mmlsfs <fsname> -T

--------------------------------------------------------------------------------------------
RULE EXTERNAL LIST 'mmbackup.1.<server>' EXEC '<mountpoint>/.mmbackupCfg/BAexecScript.<fsname>' OPTS '"<mountpoint>/.mmbackupShadow.1.<server>.filesys.update" "-servername=<server>" "-auditlogname=<mountpoint>/mmbackup.audit.<fsname>.<server>" "NONE"'
RULE 'BackupRule' LIST 'mmbackup.1.<server>' DIRECTORIES_PLUS
     SHOW(VARCHAR(MODIFICATION_TIME) || ' ' || VARCHAR(CHANGE_TIME) || ' ' || VARCHAR(FILE_SIZE) || ' ' || VARCHAR(FILESET_NAME) || ' ' || 'resdnt' )
     WHERE ( KB_ALLOCATED > 1024 )
--------------------------------------------------------------------------------------------

Call it like this:

/usr/lpp/mmfs/bin/mmbackup <mountpoint> --tsm-servers <server> -P <policyfile>

As this is non-trivial it should be mentioned in the documentation!

Uli
--
Dipl.-Inf. Ulrich Sibiller science + computing ag System Administration Hagellocher Weg 73 Hotline +49 7071 9457 681 72070 Tuebingen, Germany https://atos.net/de/deutschland/sc
-- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr.
Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196

From wsawdon at us.ibm.com Wed Jun 23 22:52:48 2021 From: wsawdon at us.ibm.com (Wayne Sawdon) Date: Wed, 23 Jun 2021 21:52:48 +0000 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de> References: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de>, Message-ID: An HTML attachment was scrubbed... URL:

From u.sibiller at science-computing.de Thu Jun 24 11:24:30 2021 From: u.sibiller at science-computing.de (Ulrich Sibiller) Date: Thu, 24 Jun 2021 12:24:30 +0200 Subject: [gpfsug-discuss] mmbackup with own policy In-Reply-To: References: <9125eaa3-5b1e-9fb7-8747-7ce6ef3e528a@science-computing.de> Message-ID: <73732efb-28bd-d7bd-b8b5-b1aace37f533@science-computing.de>

On 6/23/21 11:52 PM, Wayne Sawdon wrote:
> At a higher level what are you trying to do? Include some directories and exclude others? Use a
> different backup server? What do you need that is not there?

The current situation is that a customer insists on using different management classes on the TSM server for big files than for small files. We have set up additional server stanzas <server>_big and <server>_small representing the management classes.

servername <server>_small
...
inclexcl .../dsm.inclexcl.small
...

# cat .../dsm.inclexcl.small
include ... smallfile

("smallfile" is the management class)

Now we need to run mmbackup against the <server>_small stanza and restrict it to only treat the small files and ignore the bigger ones. As we cannot determine them by name or path we need to use a policy (.. WHERE KB_ALLOCATED < something).

> mmbackup is tied to Spectrum Protect (formerly known as TSM) and gets its include/excludes from the
> TSM option files. It constructs a policy to list all of the files & directories and includes
> attributes such as mtime and ctime. It then compares this list of files to the "shadow database"
> which is a copy of what the TSM database has. This comparison produces 3 lists of files: new files &
> files that have the data changed, existing files that only have attributes changed and a list of
> files that were deleted. Each list is sent to Spectrum Protect to either backup the data, or to
> update the metadata or to mark the file as deleted. As Spectrum Protect acknowledges the operation
> on each file, we update the shadow database to keep it current.

Exactly. I am aware of that.

> So writing a new policy file for mmbackup is not really as simple as it seems. I don't think you
> can change the record format on the list of files. And don't override the encoding on special
> characters. And I'm sure there are other Gotchas as well.

That's just what I wanted to express with my previous mails. It is not as simple as it seems AND it is not documented. We want to use all that fancy shadow file management that mmbackup comes with because it is sophisticated nowadays and generally works flawlessly. We do not want to reinvent the mmbackup wheel. So for the current situation having a simple way to (partly) replace the WHERE clause would be of great help. I acknowledge that offering that in a general way could get complicated for the developers.
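A concrete sketch of that small/big split, following the rule format posted earlier in this thread: the server stanza name, mount point, filesystem name, policy file path and the 1 MiB threshold are all made-up values, and because the rule mirrors mmbackup internals it may need adjusting after a code upgrade. The "big" run would use a second policy with the complementary WHERE ( KB_ALLOCATED > 1024 ) clause and the _big stanza so that every file is covered exactly once.

# hypothetical policy for the "small files" run
cat > /root/mmbackup-small.pol <<'EOF'
RULE EXTERNAL LIST 'mmbackup.1.tsm1_small' EXEC '/gpfs/fs1/.mmbackupCfg/BAexecScript.fs1' OPTS '"/gpfs/fs1/.mmbackupShadow.1.tsm1_small.filesys.update" "-servername=tsm1_small" "-auditlogname=/gpfs/fs1/mmbackup.audit.fs1.tsm1_small" "NONE"'
RULE 'BackupSmall' LIST 'mmbackup.1.tsm1_small' DIRECTORIES_PLUS
     SHOW(VARCHAR(MODIFICATION_TIME) || ' ' || VARCHAR(CHANGE_TIME) || ' ' || VARCHAR(FILE_SIZE) || ' ' || VARCHAR(FILESET_NAME) || ' ' || 'resdnt' )
     WHERE ( KB_ALLOCATED <= 1024 )
EOF
# back up only the small files via the management class behind the tsm1_small stanza
/usr/lpp/mmfs/bin/mmbackup /gpfs/fs1 --tsm-servers tsm1_small -P /root/mmbackup-small.pol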
Having a documentation how to write a policy that matches what mmbackup without -P is doing is the first step the improve the situation. My posted policy currently works but it is questionable how long that will be the case. Once mmbackup changes its internal behaviour it will break.. Uli -- Science + Computing AG Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein Vorsitzender des Aufsichtsrats/ Chairman of the Supervisory Board: Philippe Miltin Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern Sitz/Registered Office: Tuebingen Registergericht/Registration Court: Stuttgart Registernummer/Commercial Register No.: HRB 382196 From chair at spectrumscale.org Mon Jun 28 09:09:39 2021 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Mon, 28 Jun 2021 09:09:39 +0100 Subject: [gpfsug-discuss] SSUG::Digital: Spectrum Scale Container Native Storage Access (CNSA) Message-ID: <> An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: meeting.ics Type: text/calendar Size: 2338 bytes Desc: not available URL: From ewahl at osc.edu Mon Jun 28 21:00:35 2021 From: ewahl at osc.edu (Wahl, Edward) Date: Mon, 28 Jun 2021 20:00:35 +0000 Subject: [gpfsug-discuss] GUI refresh task error In-Reply-To: References: <72d50b96-c6a3-f075-8f47-84bf2346f0ae@docum.org> <975f874a066c4ba6a45c62f9b280efa2@postbank.de> Message-ID: Curious if this was ever fixed or someone has an APAR # ? I'm still running into it on 5.0.5.6 Ed Wahl OSC -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Stef Coene Sent: Thursday, July 16, 2020 9:47 AM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GUI refresh task error Ok, thanx for the answer. I will wait for the fix. Stef On 2020-07-16 15:25, Roland Schuemann wrote: > Hi Stef, > > we already recognized this error too and opened a PMR/Case at IBM. > You can set this task to inactive, but this is not persistent. After gui restart it comes again. > > This was the answer from IBM Support. >>>>>>>>>>>>>>>>>> > This will be fixed in the next release of 5.0.5.2, right now there is no work-around but will not cause issue besides the cosmetic task failed message. > Is this OK for you? >>>>>>>>>>>>>>>>>> > > So we ignore (Gui is still degraded) it and wait for the fix. > > Kind regards > Roland Sch?mann > > > Freundliche Gr??e / Kind regards > Roland Sch?mann > > ____________________________________________ > > Roland Sch?mann > Infrastructure Engineering (BTE) > CIO PB Germany > > Deutsche Bank I Technology, Data and Innovation Postbank Systems AG > > > -----Urspr?ngliche Nachricht----- > Von: gpfsug-discuss-bounces at spectrumscale.org > Im Auftrag von Stef Coene > Gesendet: Donnerstag, 16. Juli 2020 15:14 > An: gpfsug main discussion list > Betreff: [gpfsug-discuss] GUI refresh task error > > Hi, > > On brand new 5.0.5 cluster we have the following errors on all nodes: > "The following GUI refresh task(s) failed: WATCHFOLDER" > > It also says > "Failure reason: Command mmwatch all functional --list-clustered-status > failed" > > Running mmwatch manually gives: > mmwatch: The Clustered Watch Folder function is only available in the IBM Spectrum Scale Advanced Edition or the Data Management Edition. > mmwatch: Command failed. Examine previous error messages to determine cause. > > How can I get rid of this error? 
> > I tried to disable the task with: > chtask WATCHFOLDER --inactive > EFSSG1811C The task with the name WATCHFOLDER is already not scheduled. > > > Stef > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug- > discuss__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_ > gi9hZJP8mT$ Die Europ?ische Kommission hat unter > https://urldefense.com/v3/__http://ec.europa.eu/consumers/odr/__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9m0qpNP9$ eine Europ?ische Online-Streitbeilegungsplattform (OS-Plattform) errichtet. Verbraucher k?nnen die OS-Plattform f?r die au?ergerichtliche Beilegung von Streitigkeiten aus Online-Vertr?gen mit in der EU niedergelassenen Unternehmen nutzen. > > Informationen (einschlie?lich Pflichtangaben) zu einzelnen, innerhalb der EU t?tigen Gesellschaften und Zweigniederlassungen des Konzerns Deutsche Bank finden Sie unter https://urldefense.com/v3/__https://www.deutsche-bank.de/Pflichtangaben__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9sgMU2R_$ . Diese E-Mail enth?lt vertrauliche und/ oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese E-Mail. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser E-Mail ist nicht gestattet. > > The European Commission has established a European online dispute resolution platform (OS platform) under https://urldefense.com/v3/__http://ec.europa.eu/consumers/odr/__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9m0qpNP9$ . Consumers may use the OS platform to resolve disputes arising from online contracts with providers established in the EU. > > Please refer to https://urldefense.com/v3/__https://www.db.com/disclosures__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9nXBvg8r$ for information (including mandatory corporate particulars) on selected Deutsche Bank branches and group companies registered or incorporated in the European Union. This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug- > discuss__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_ > gi9hZJP8mT$ > _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss__;!!KGKeukY!iZZSS4baXvM4hp_EgmAlElMFeU23jbACq1CMPtkf-Q5ShrsQv_gi9hZJP8mT$ From jonathan.buzzard at strath.ac.uk Tue Jun 29 14:46:45 2021 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Tue, 29 Jun 2021 14:46:45 +0100 Subject: [gpfsug-discuss] PVU question Message-ID: Hum, it would appear there are gaps in IBM's PVU table. Specifically I am looking at using a Pentium G4620 in a server https://ark.intel.com/content/www/us/en/ark/products/97460/intel-pentium-processor-g4620-3m-cache-3-70-ghz.html It's dual core with ECC memory support all in a socket 1151. 
While a low spec it would be an upgrade from the Xeon E3113 currently in use and more than adequate for the job. A quad core CPU would more than double the PVU for no performance gain so I am not keen to go there. The only reason for the upgrade is the hardware is now getting on and finding spares on eBay is now getting hard (it's a Dell PowerEdge R300). However it doesn't fit anywhere in the PVU table https://www.ibm.com/software/passportadvantage/pvu_licensing_for_customers.html It's not a Xeon, it's not a Core, it's not AMD and it's not single core. It won't be in a laptop, desktop or workstation so that rules out that PVU calculation. Does that mean zero PVU :-) or it's not supported or what? Customer support were hopeless in answering my query. Then again IBM think I need GDPR stickers for returning a memory DIMM. JAB -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From scale at us.ibm.com Tue Jun 29 15:41:14 2021 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Tue, 29 Jun 2021 10:41:14 -0400 Subject: [gpfsug-discuss] PVU question In-Reply-To: References: Message-ID:

My suggestion for this question is that it should be directed to your IBM sales team and not the Spectrum Scale support team. My reading of the information you provided is that your processor counts as 2 cores. As for the PVU value my guess is that at a minimum it is 50 but again that should be a question for your IBM sales team. One other option is to switch from processor based licensing for Scale to storage (TB) based licensing. I think one of the reasons for storage based licensing was to avoid issues like the one you are raising.

Regards, The Spectrum Scale (GPFS) team
------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From: Jonathan Buzzard To: gpfsug main discussion list Date: 06/29/2021 09:47 AM Subject: [EXTERNAL] [gpfsug-discuss] PVU question Sent by: gpfsug-discuss-bounces at spectrumscale.org

Hum, it would appear there are gaps in IBM's PVU table. Specifically I am looking at using a Pentium G4620 in a server https://ark.intel.com/content/www/us/en/ark/products/97460/intel-pentium-processor-g4620-3m-cache-3-70-ghz.html It's dual core with ECC memory support all in a socket 1151. While a low spec it would be an upgrade from the Xeon E3113 currently in use and more than adequate for the job. A quad core CPU would more than double the PVU for no performance gain so I am not keen to go there. The only reason for the upgrade is the hardware is now getting on and finding spares on eBay is now getting hard (it's a Dell PowerEdge R300). However it doesn't fit anywhere in the PVU table https://www.ibm.com/software/passportadvantage/pvu_licensing_for_customers.html It's not a Xeon, it's not a Core, it's not AMD and it's not single core.
It won't be in a laptop, desktop or workstation so that rules out that PVU calculation. Does that mean zero PVU :-) or it's not supported or what? Customer support were hopeless in answering my query. Then again IBM think I need GDPR stickers for returning a memory DIMM. JAB -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL:
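For what it is worth, if the two assumptions in the reply above hold (the G4620 counts as 2 cores and the applicable rating is 50 PVU per core, both of which would need confirming with the IBM sales team), the sizing arithmetic for such a node is simply:

# hypothetical PVU sizing, to be confirmed with IBM
# 2 cores x 50 PVU/core = 100 PVU per server
# licensing N such servers: N x 100 PVU in total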