From taylorm at us.ibm.com Mon Aug 1 17:42:19 2016 From: taylorm at us.ibm.com (Michael L Taylor) Date: Mon, 1 Aug 2016 09:42:19 -0700 Subject: [gpfsug-discuss] Spectrum Scale 4.2.1 Released In-Reply-To: References: Message-ID: Thanks for sharing Bob. Since some folks asked previously, if you go to the 4.2.1 FAQ PDF version there will be change bars on the left for what changed in FAQ from previous version as well as a FAQ July updates table near the top to quickly highlight the changes from last FAQ. http://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.pdf?view=kc Also, two short blogs on the 4.2.1 release on the Storage Community might be of interest: http://storagecommunity.org/easyblog -------------- next part -------------- An HTML attachment was scrubbed... URL: From raot at bnl.gov Mon Aug 1 19:36:15 2016 From: raot at bnl.gov (Tejas Rao) Date: Mon, 1 Aug 2016 14:36:15 -0400 Subject: [gpfsug-discuss] HAWC (Highly available write cache) Message-ID: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> I have enabled write cache (HAWC) by running the below commands. The recovery logs are supposedly placed in the replicated system metadata pool (SSDs). I do not have a "system.log" pool as it is only needed if recovery logs are stored on the client nodes. mmchfs gpfs01 --write-cache-threshold 64K mmchfs gpfs01 -L 1024M mmchconfig logPingPongSector=no I have recycled the daemon on all nodes in the cluster (including the NSD nodes). I still see small synchronous writes (4K) from the clients going to the data drives (data pool). I am checking this by looking at "mmdiag --iohist" output. Should they not be going to the system pool? Do I need to do something else? How can I confirm that HAWC is working as advertised? Thanks. From oehmes at gmail.com Mon Aug 1 19:49:37 2016 From: oehmes at gmail.com (Sven Oehme) Date: Mon, 1 Aug 2016 11:49:37 -0700 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> Message-ID: when you say 'synchronous write' what do you mean by that ? if you are talking about using direct i/o (O_DIRECT flag), they don't leverage HAWC data path, its by design. sven On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao wrote: > I have enabled write cache (HAWC) by running the below commands. The > recovery logs are supposedly placed in the replicated system metadata pool > (SSDs). I do not have a "system.log" pool as it is only needed if recovery > logs are stored on the client nodes. > > mmchfs gpfs01 --write-cache-threshold 64K > mmchfs gpfs01 -L 1024M > mmchconfig logPingPongSector=no > > I have recycled the daemon on all nodes in the cluster (including the NSD > nodes). > > I still see small synchronous writes (4K) from the clients going to the > data drives (data pool). I am checking this by looking at "mmdiag --iohist" > output. Should they not be going to the system pool? > > Do I need to do something else? How can I confirm that HAWC is working as > advertised? > > Thanks. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... 
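One common source of direct I/O on a GPFS client is qemu/KVM with cache=none, which opens the image file with O_DIRECT and therefore bypasses HAWC by the rule Sven describes; cache=writeback or cache=writethrough goes through the page cache instead. A quick way to check what cache mode guests are using, assuming libvirt-managed guests (the guest name vm01 is a placeholder):

# per-guest disk cache setting (look for cache='none' in the driver element)
virsh dumpxml vm01 | grep "<driver"

# or scan the running qemu processes for their cache= settings
ps -eo args | grep [q]emu | grep -o "cache=[a-z]*" | sort | uniq -c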
URL: From raot at bnl.gov Mon Aug 1 20:05:52 2016 From: raot at bnl.gov (Tejas Rao) Date: Mon, 1 Aug 2016 15:05:52 -0400 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> Message-ID: <5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> In my case GPFS storage is used to store VM images (KVM) and hence the small IO. I always see lots of small 4K writes and the GPFS filesystem block size is 8MB. I thought the reason for the small writes is that the linux kernel requests GPFS to initiate a periodic sync which by default is every 5 seconds and can be controlled by "vm.dirty_writeback_centisecs". I thought HAWC would help in such cases and would harden (coalesce) the small writes in the "system" pool and would flush to the "data" pool in larger block size. Note - I am not doing direct i/o explicitly. On 8/1/2016 14:49, Sven Oehme wrote: > when you say 'synchronous write' what do you mean by that ? > if you are talking about using direct i/o (O_DIRECT flag), they don't > leverage HAWC data path, its by design. > > sven > > On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao > wrote: > > I have enabled write cache (HAWC) by running the below commands. > The recovery logs are supposedly placed in the replicated system > metadata pool (SSDs). I do not have a "system.log" pool as it is > only needed if recovery logs are stored on the client nodes. > > mmchfs gpfs01 --write-cache-threshold 64K > mmchfs gpfs01 -L 1024M > mmchconfig logPingPongSector=no > > I have recycled the daemon on all nodes in the cluster (including > the NSD nodes). > > I still see small synchronous writes (4K) from the clients going > to the data drives (data pool). I am checking this by looking at > "mmdiag --iohist" output. Should they not be going to the system pool? > > Do I need to do something else? How can I confirm that HAWC is > working as advertised? > > Thanks. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From dhildeb at us.ibm.com Mon Aug 1 20:50:09 2016 From: dhildeb at us.ibm.com (Dean Hildebrand) Date: Mon, 1 Aug 2016 12:50:09 -0700 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: <5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> <5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> Message-ID: Hi Tejas, Do you know the workload in the VM? The workload which enters into HAWC may or may not be the same as the workload that eventually goes into the data pool....it all depends on whether the 4KB writes entering HAWC can be coalesced or not. For example, sequential 4KB writes can all be coalesced into a single large chunk. So 4KB writes into HAWC will convert into 8MB writes to data pool (in your system). But random 4KB writes into HAWC may end up being 4KB writes into the data pool if there are no adjoining 4KB writes (i.e., if 4KB blocks are all dispersed, they can't be coalesced). The goal of HAWC though, whether the 4KB blocks are coalesced or not, is to reduce app latency by ensuring that writing the blocks back to the data pool is done in the background. 
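(One way to see that difference in practice is to drive a sequential and a random small-synchronous-write job against the filesystem and watch "mmdiag --iohist" on the client and the NSD servers while each one runs. The sketch below assumes fio is installed and uses a made-up test directory; the writes must stay at or below the configured 64K write-cache-threshold to be HAWC candidates.)

# sequential 4K writes, fdatasync after each one (good coalescing candidates)
fio --name=seq4k --directory=/gpfs/gpfs01/hawctest --rw=write --bs=4k --size=256m --fdatasync=1

# random 4K writes, fdatasync after each one (much harder to coalesce)
fio --name=rand4k --directory=/gpfs/gpfs01/hawctest --rw=randwrite --bs=4k --size=256m --fdatasync=1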
So while 4KB blocks may still be hitting the data pool, hopefully the application is seeing the latency of your presumably lower latency system pool. Dean From: Tejas Rao To: gpfsug main discussion list Date: 08/01/2016 12:06 PM Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache) Sent by: gpfsug-discuss-bounces at spectrumscale.org In my case GPFS storage is used to store VM images (KVM) and hence the small IO. I always see lots of small 4K writes and the GPFS filesystem block size is 8MB. I thought the reason for the small writes is that the linux kernel requests GPFS to initiate a periodic sync which by default is every 5 seconds and can be controlled by "vm.dirty_writeback_centisecs". I thought HAWC would help in such cases and would harden (coalesce) the small writes in the "system" pool and would flush to the "data" pool in larger block size. Note - I am not doing direct i/o explicitly. On 8/1/2016 14:49, Sven Oehme wrote: when you say 'synchronous write' what do you mean by that ?? if you are talking about using direct i/o (O_DIRECT flag), they don't leverage HAWC data path, its by design. sven On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao wrote: I have enabled write cache (HAWC) by running the below commands. The recovery logs are supposedly placed in the replicated system metadata pool (SSDs). I do not have a "system.log" pool as it is only needed if recovery logs are stored on the client nodes. mmchfs gpfs01 --write-cache-threshold 64K mmchfs gpfs01 -L 1024M mmchconfig logPingPongSector=no I have recycled the daemon on all nodes in the cluster (including the NSD nodes). I still see small synchronous writes (4K) from the clients going to the data drives (data pool). I am checking this by looking at "mmdiag --iohist" output. Should they not be going to the system pool? Do I need to do something else? How can I confirm that HAWC is working as advertised? Thanks. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From raot at bnl.gov Mon Aug 1 21:42:06 2016 From: raot at bnl.gov (Tejas Rao) Date: Mon, 1 Aug 2016 16:42:06 -0400 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> <5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> Message-ID: <04707e32-83fc-f42d-10cf-99139c136371@bnl.gov> I am not 100% sure what the workload of the VMs is. We have 100's of VMs all used differently, so the workload is rather mixed. I do see 4K writes going to "system" pool, they are tagged as "logData" in 'mmdiag --iohist'. But I also see 4K writes going to the data drives, so it looks like everything is not getting coalesced and these are random writes. Could these 4k writes labelled as "logData" be the writes going to HAWC log files? On 8/1/2016 15:50, Dean Hildebrand wrote: > > Hi Tejas, > > Do you know the workload in the VM? 
> > The workload which enters into HAWC may or may not be the same as the > workload that eventually goes into the data pool....it all depends on > whether the 4KB writes entering HAWC can be coalesced or not. For > example, sequential 4KB writes can all be coalesced into a single > large chunk. So 4KB writes into HAWC will convert into 8MB writes to > data pool (in your system). But random 4KB writes into HAWC may end up > being 4KB writes into the data pool if there are no adjoining 4KB > writes (i.e., if 4KB blocks are all dispersed, they can't be > coalesced). The goal of HAWC though, whether the 4KB blocks are > coalesced or not, is to reduce app latency by ensuring that writing > the blocks back to the data pool is done in the background. So while > 4KB blocks may still be hitting the data pool, hopefully the > application is seeing the latency of your presumably lower latency > system pool. > > Dean > > > Inactive hide details for Tejas Rao ---08/01/2016 12:06:15 PM---In my > case GPFS storage is used to store VM images (KVM) and heTejas Rao > ---08/01/2016 12:06:15 PM---In my case GPFS storage is used to store > VM images (KVM) and hence the small IO. > > From: Tejas Rao > To: gpfsug main discussion list > Date: 08/01/2016 12:06 PM > Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache) > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > In my case GPFS storage is used to store VM images (KVM) and hence the > small IO. > > I always see lots of small 4K writes and the GPFS filesystem block > size is 8MB. I thought the reason for the small writes is that the > linux kernel requests GPFS to initiate a periodic sync which by > default is every 5 seconds and can be controlled by > "vm.dirty_writeback_centisecs". > > I thought HAWC would help in such cases and would harden (coalesce) > the small writes in the "system" pool and would flush to the "data" > pool in larger block size. > > Note - I am not doing direct i/o explicitly. > > > > On 8/1/2016 14:49, Sven Oehme wrote: > > when you say 'synchronous write' what do you mean by that ? > if you are talking about using direct i/o (O_DIRECT flag), > they don't leverage HAWC data path, its by design. > > sven > > On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao <_raot at bnl.gov_ > > wrote: > I have enabled write cache (HAWC) by running the below > commands. The recovery logs are supposedly placed in the > replicated system metadata pool (SSDs). I do not have a > "system.log" pool as it is only needed if recovery logs > are stored on the client nodes. > > mmchfs gpfs01 --write-cache-threshold 64K > mmchfs gpfs01 -L 1024M > mmchconfig logPingPongSector=no > > I have recycled the daemon on all nodes in the cluster > (including the NSD nodes). > > I still see small synchronous writes (4K) from the clients > going to the data drives (data pool). I am checking this > by looking at "mmdiag --iohist" output. Should they not be > going to the system pool? > > Do I need to do something else? How can I confirm that > HAWC is working as advertised? > > Thanks. 
> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at _spectrumscale.org_ > _ > __http://gpfsug.org/mailman/listinfo/gpfsug-discuss_ > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > _http://gpfsug.org/mailman/listinfo/gpfsug-discuss_ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 105 bytes Desc: not available URL: From dhildeb at us.ibm.com Mon Aug 1 21:55:28 2016 From: dhildeb at us.ibm.com (Dean Hildebrand) Date: Mon, 1 Aug 2016 13:55:28 -0700 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: <04707e32-83fc-f42d-10cf-99139c136371@bnl.gov> References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov><5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> <04707e32-83fc-f42d-10cf-99139c136371@bnl.gov> Message-ID: Hi Tejas, Yes, most likely those 4k writes are the HAWC writes...hopefully those 4KB writes have a lower latency than the 4k writes to your data pool so you are realizing the benefits. Dean From: Tejas Rao To: gpfsug main discussion list Date: 08/01/2016 01:42 PM Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache) Sent by: gpfsug-discuss-bounces at spectrumscale.org I am not 100% sure what the workload of the VMs is. We have 100's of VMs all used differently, so the workload is rather mixed. I do see 4K writes going to "system" pool, they are tagged as "logData" in 'mmdiag --iohist'. But I also see 4K writes going to the data drives, so it looks like everything is not getting coalesced and these are random writes. Could these 4k writes labelled as "logData" be the writes going to HAWC log files? On 8/1/2016 15:50, Dean Hildebrand wrote: Hi Tejas, Do you know the workload in the VM? The workload which enters into HAWC may or may not be the same as the workload that eventually goes into the data pool....it all depends on whether the 4KB writes entering HAWC can be coalesced or not. For example, sequential 4KB writes can all be coalesced into a single large chunk. So 4KB writes into HAWC will convert into 8MB writes to data pool (in your system). But random 4KB writes into HAWC may end up being 4KB writes into the data pool if there are no adjoining 4KB writes (i.e., if 4KB blocks are all dispersed, they can't be coalesced). The goal of HAWC though, whether the 4KB blocks are coalesced or not, is to reduce app latency by ensuring that writing the blocks back to the data pool is done in the background. So while 4KB blocks may still be hitting the data pool, hopefully the application is seeing the latency of your presumably lower latency system pool. Dean Inactive hide details for Tejas Rao ---08/01/2016 12:06:15 PM---In my case GPFS storage is used to store VM images (KVM) and heTejas Rao ---08/01/2016 12:06:15 PM---In my case GPFS storage is used to store VM images (KVM) and hence the small IO. 
From: Tejas Rao To: gpfsug main discussion list Date: 08/01/2016 12:06 PM Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache) Sent by: gpfsug-discuss-bounces at spectrumscale.org In my case GPFS storage is used to store VM images (KVM) and hence the small IO. I always see lots of small 4K writes and the GPFS filesystem block size is 8MB. I thought the reason for the small writes is that the linux kernel requests GPFS to initiate a periodic sync which by default is every 5 seconds and can be controlled by "vm.dirty_writeback_centisecs". I thought HAWC would help in such cases and would harden (coalesce) the small writes in the "system" pool and would flush to the "data" pool in larger block size. Note - I am not doing direct i/o explicitly. On 8/1/2016 14:49, Sven Oehme wrote: when you say 'synchronous write' what do you mean by that ? if you are talking about using direct i/o (O_DIRECT flag), they don't leverage HAWC data path, its by design. sven On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao wrote: I have enabled write cache (HAWC) by running the below commands. The recovery logs are supposedly placed in the replicated system metadata pool (SSDs). I do not have a "system.log" pool as it is only needed if recovery logs are stored on the client nodes. mmchfs gpfs01 --write-cache-threshold 64K mmchfs gpfs01 -L 1024M mmchconfig logPingPongSector=no I have recycled the daemon on all nodes in the cluster (including the NSD nodes). I still see small synchronous writes (4K) from the clients going to the data drives (data pool). I am checking this by looking at "mmdiag --iohist" output. Should they not be going to the system pool? Do I need to do something else? How can I confirm that HAWC is working as advertised? Thanks. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Greg.Lehmann at csiro.au Wed Aug 3 06:06:32 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 3 Aug 2016 05:06:32 +0000 Subject: [gpfsug-discuss] SS 4.2.1.0 upgrade pain Message-ID: <04fbf3c0ae40468d912293821905197d@exch1-cdc.nexus.csiro.au> On Debian I am seeing this when trying to upgrade: mmshutdown dpkg -I gpfs.base_4.2.1-0_amd64.deb gpfs.docs_4.2.1-0_all.deb gpfs.ext_4.2.1-0_amd64.deb gpfs.gpl_4.2.1-0_all.deb gpfs.gskit_8.0.50-57_amd64.deb gpfs.msg.en-us_4.2.1-0_all.deb (Reading database ... 65194 files and directories currently installed.) Preparing to replace gpfs.base 4.1.0-6 (using gpfs.base_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.base ... 
Preparing to replace gpfs.docs 4.1.0-6 (using gpfs.docs_4.2.1-0_all.deb) ... Unpacking replacement gpfs.docs ... Preparing to replace gpfs.ext 4.1.0-6 (using gpfs.ext_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.ext ... Etc. Unpacking replacement gpfs.gpl ... Preparing to replace gpfs.gskit 8.0.50-32 (using gpfs.gskit_8.0.50-57_amd64.deb) ... Unpacking replacement gpfs.gskit ... Preparing to replace gpfs.msg.en-us 4.1.0-6 (using gpfs.msg.en-us_4.2.1-0_all.deb) ... Unpacking replacement gpfs.msg.en-us ... Setting up gpfs.base (4.2.1-0) ... At which point it hangs. A ps shows this: ps -ef | grep mm root 21269 1 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 21276 21150 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmsysmoncontrol start root 21363 1 0 14:18 ? 00:00:00 /usr/lpp/mmfs/bin/mmsdrserv 1191 10 10 /var/adm/ras/mmsdrserv.log 128 yes root 22485 21276 0 14:18 pts/0 00:00:00 python /usr/lpp/mmfs/bin/mmsysmon.py root 22486 22485 0 14:18 pts/0 00:00:00 /bin/sh -c /usr/lpp/mmfs/bin/mmlsmgr -c root 22488 22486 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmlsmgr -c root 24420 22488 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmcommon linkCommand hadoop1-12-cdc-ib2.it.csiro.au /var/mmfs/tmp/nodefile.mmlsmgr.22488 mmlsmgr -c root 24439 24420 0 14:18 pts/0 00:00:00 /usr/bin/perl /usr/lpp/mmfs/bin/mmdsh -svL gpfs-07-cdc-ib2.san.csiro.au /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c root 24446 24439 0 14:18 pts/0 00:00:00 /usr/bin/ssh gpfs-07-cdc-ib2.san.csiro.au -n -l root /bin/ksh -c ' LANG=en_US.UTF-8 LC_ALL= LC_COLLATE= LC_TYPE= LC_MONETARY= LC_NUMERIC= LC_TIME= LC_MESSAGES= MMMODE=lc environmentType=lc2 GPFS_rshPath=/usr/bin/ssh GPFS_rcpPath=/usr/bin/scp mmScriptTrace= GPFSCMDPORTRANGE=0 GPFS_CIM_MSG_FORMAT= /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c ' root 24546 21269 0 14:23 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 24548 24455 0 14:23 pts/1 00:00:00 grep mm It is trying to connect with ssh to one of my nsd servers, that it does not have permission to? I am guessing that is where the hang is. Anybody else seen this? I have a workaround - remove from cluster before the update, but this is a bit of extra work I can do without. I have not had to this for previous versions starting with 4.1.0.0. Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Wed Aug 3 08:32:43 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 3 Aug 2016 07:32:43 +0000 Subject: [gpfsug-discuss] SS 4.2.1.0 upgrade pain In-Reply-To: <04fbf3c0ae40468d912293821905197d@exch1-cdc.nexus.csiro.au> References: <04fbf3c0ae40468d912293821905197d@exch1-cdc.nexus.csiro.au> Message-ID: <663114b24b0b403aa076a83791f32c58@exch1-cdc.nexus.csiro.au> And I am seeing the same behaviour on a SLES 12 SP1 update from 4.2.04 to 4.2.1.0. 
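One way to confirm where the postinst is stuck is to run the same calls it is making by hand, outside of dpkg (a sketch based on the ps output above, using the hostname shown there):

# does passwordless root ssh to that NSD server work from this node at all?
ssh gpfs-07-cdc-ib2.san.csiro.au date

# does the command the postinst wraps hang the same way on its own?
/usr/lpp/mmfs/bin/mmlsmgr -c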
From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au Sent: Wednesday, 3 August 2016 3:07 PM To: gpfsug-discuss at spectrumscale.org Subject: [ExternalEmail] [gpfsug-discuss] SS 4.2.1.0 upgrade pain On Debian I am seeing this when trying to upgrade: mmshutdown dpkg -I gpfs.base_4.2.1-0_amd64.deb gpfs.docs_4.2.1-0_all.deb gpfs.ext_4.2.1-0_amd64.deb gpfs.gpl_4.2.1-0_all.deb gpfs.gskit_8.0.50-57_amd64.deb gpfs.msg.en-us_4.2.1-0_all.deb (Reading database ... 65194 files and directories currently installed.) Preparing to replace gpfs.base 4.1.0-6 (using gpfs.base_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.base ... Preparing to replace gpfs.docs 4.1.0-6 (using gpfs.docs_4.2.1-0_all.deb) ... Unpacking replacement gpfs.docs ... Preparing to replace gpfs.ext 4.1.0-6 (using gpfs.ext_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.ext ... Etc. Unpacking replacement gpfs.gpl ... Preparing to replace gpfs.gskit 8.0.50-32 (using gpfs.gskit_8.0.50-57_amd64.deb) ... Unpacking replacement gpfs.gskit ... Preparing to replace gpfs.msg.en-us 4.1.0-6 (using gpfs.msg.en-us_4.2.1-0_all.deb) ... Unpacking replacement gpfs.msg.en-us ... Setting up gpfs.base (4.2.1-0) ... At which point it hangs. A ps shows this: ps -ef | grep mm root 21269 1 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 21276 21150 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmsysmoncontrol start root 21363 1 0 14:18 ? 00:00:00 /usr/lpp/mmfs/bin/mmsdrserv 1191 10 10 /var/adm/ras/mmsdrserv.log 128 yes root 22485 21276 0 14:18 pts/0 00:00:00 python /usr/lpp/mmfs/bin/mmsysmon.py root 22486 22485 0 14:18 pts/0 00:00:00 /bin/sh -c /usr/lpp/mmfs/bin/mmlsmgr -c root 22488 22486 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmlsmgr -c root 24420 22488 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmcommon linkCommand hadoop1-12-cdc-ib2.it.csiro.au /var/mmfs/tmp/nodefile.mmlsmgr.22488 mmlsmgr -c root 24439 24420 0 14:18 pts/0 00:00:00 /usr/bin/perl /usr/lpp/mmfs/bin/mmdsh -svL gpfs-07-cdc-ib2.san.csiro.au /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c root 24446 24439 0 14:18 pts/0 00:00:00 /usr/bin/ssh gpfs-07-cdc-ib2.san.csiro.au -n -l root /bin/ksh -c ' LANG=en_US.UTF-8 LC_ALL= LC_COLLATE= LC_TYPE= LC_MONETARY= LC_NUMERIC= LC_TIME= LC_MESSAGES= MMMODE=lc environmentType=lc2 GPFS_rshPath=/usr/bin/ssh GPFS_rcpPath=/usr/bin/scp mmScriptTrace= GPFSCMDPORTRANGE=0 GPFS_CIM_MSG_FORMAT= /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c ' root 24546 21269 0 14:23 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 24548 24455 0 14:23 pts/1 00:00:00 grep mm It is trying to connect with ssh to one of my nsd servers, that it does not have permission to? I am guessing that is where the hang is. Anybody else seen this? I have a workaround - remove from cluster before the update, but this is a bit of extra work I can do without. I have not had to this for previous versions starting with 4.1.0.0. Greg -------------- next part -------------- An HTML attachment was scrubbed... 
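The remove-from-cluster workaround could look roughly like the following for a plain client node (a sketch, untested here; the node and package names are taken from the output above, and a node with quorum, manager or NSD server roles would need more care):

# on the node being upgraded
mmshutdown

# from a node that stays in the cluster
mmdelnode -N hadoop1-12-cdc-ib2.it.csiro.au

# back on the node, upgrade the packages while it is outside the cluster
dpkg -i gpfs.base_4.2.1-0_amd64.deb gpfs.docs_4.2.1-0_all.deb gpfs.ext_4.2.1-0_amd64.deb gpfs.gpl_4.2.1-0_all.deb gpfs.gskit_8.0.50-57_amd64.deb gpfs.msg.en-us_4.2.1-0_all.deb

# rebuild the portability layer for the running kernel
/usr/lpp/mmfs/bin/mmbuildgpl

# re-add it and bring GPFS back up (run from an existing cluster node)
mmaddnode -N hadoop1-12-cdc-ib2.it.csiro.au
mmchlicense client --accept -N hadoop1-12-cdc-ib2.it.csiro.au
mmstartup -N hadoop1-12-cdc-ib2.it.csiro.au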
URL: From kenneth.waegeman at ugent.be Wed Aug 3 09:54:30 2016 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Wed, 3 Aug 2016 10:54:30 +0200 Subject: [gpfsug-discuss] Upgrade from 4.1.1 to 4.2.1 Message-ID: <57A1B146.9070505@ugent.be> Hi, In the upgrade procedure (prerequisites) of 4.2.1, I read: "If you are coming from 4.1.1-X, you must first upgrade to 4.2.0-0. You may use this 4.2.1-0 package to perform a First Time Install or to upgrade from an existing 4.2.0-X level." What does this mean exactly. Should we just install the 4.2.0 rpms first, and then the 4.2.1 rpms, or should we install the 4.2.0 rpms, start up gpfs, bring gpfs down again and then do the 4.2.1 rpms? But if we re-install a 4.1.1 node, we can immediately install 4.2.1 ? Thanks! Kenneth From bbanister at jumptrading.com Wed Aug 3 15:53:52 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 3 Aug 2016 14:53:52 +0000 Subject: [gpfsug-discuss] Upgrade from 4.1.1 to 4.2.1 In-Reply-To: <57A1B146.9070505@ugent.be> References: <57A1B146.9070505@ugent.be> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB062B3718@CHI-EXCHANGEW1.w2k.jumptrading.com> Your first process is correct. Install the 4.2.0-0 rpms first, then install the 4.2.1 rpms after. -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, August 03, 2016 3:55 AM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Upgrade from 4.1.1 to 4.2.1 Hi, In the upgrade procedure (prerequisites) of 4.2.1, I read: "If you are coming from 4.1.1-X, you must first upgrade to 4.2.0-0. You may use this 4.2.1-0 package to perform a First Time Install or to upgrade from an existing 4.2.0-X level." What does this mean exactly. Should we just install the 4.2.0 rpms first, and then the 4.2.1 rpms, or should we install the 4.2.0 rpms, start up gpfs, bring gpfs down again and then do the 4.2.1 rpms? But if we re-install a 4.1.1 node, we can immediately install 4.2.1 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From pinto at scinet.utoronto.ca Wed Aug 3 17:22:27 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 12:22:27 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? Message-ID: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? 
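(For concreteness, the kind of setup being described might look like the following; a sketch with made-up names and limits, using the 4.x mmsetquota syntax:)

# per-user and per-group block quotas on the same filesystem
mmsetquota gpfs01 --user user1 --block 500G:550G
mmsetquota gpfs01 --group group2 --block 1000G:1100G

# check what is currently charged where
mmlsquota -u user1 gpfs01
mmlsquota -g group2 gpfs01
mmrepquota -u -g gpfs01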
What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From oehmes at gmail.com Wed Aug 3 17:35:39 2016 From: oehmes at gmail.com (Sven Oehme) Date: Wed, 3 Aug 2016 09:35:39 -0700 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto wrote: > Suppose I want to set both USR and GRP quotas for a user, however GRP is > not the primary group. Will gpfs enforce the secondary group quota for that > user? > > What I mean is, if the user keeps writing files with secondary group as > the attribute, and that overall group quota is reached, will that user be > stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of > Toronto. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Aug 3 17:41:24 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 12:41:24 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: <20160803124124.21815zz1w4exmuus@support.scinet.utoronto.ca> Quoting "Sven Oehme" : > Hi, > > quotas are only counted against primary group > > sven Thanks Sven I kind of suspected, but needed an independent confirmation. Jaime > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: > >> Suppose I want to set both USR and GRP quotas for a user, however GRP is >> not the primary group. Will gpfs enforce the secondary group quota for that >> user? >> >> What I mean is, if the user keeps writing files with secondary group as >> the attribute, and that overall group quota is reached, will that user be >> stopped by gpfs? 
>> >> Thanks >> Jaime >> >> >> >> >> ************************************ >> TELL US ABOUT YOUR SUCCESS STORIES >> http://www.scinethpc.ca/testimonials >> ************************************ >> --- >> Jaime Pinto >> SciNet HPC Consortium - Compute/Calcul Canada >> www.scinet.utoronto.ca - www.computecanada.org >> University of Toronto >> 256 McCaul Street, Room 235 >> Toronto, ON, M5T1W5 >> P: 416-978-2755 >> C: 416-505-1477 >> >> ---------------------------------------------------------------- >> This message was sent using IMP at SciNet Consortium, University of >> Toronto. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From jonathan at buzzard.me.uk Wed Aug 3 17:44:01 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Wed, 3 Aug 2016 17:44:01 +0100 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> On 03/08/16 17:22, Jaime Pinto wrote: > Suppose I want to set both USR and GRP quotas for a user, however GRP is > not the primary group. Will gpfs enforce the secondary group quota for > that user? Nope that's not how POSIX schematics work for group quotas. As far as I can tell only your primary group is used for group quotas. It basically makes group quotas in Unix a waste of time in my opinion. At least I have never come across a real world scenario where they work in a useful manner. > What I mean is, if the user keeps writing files with secondary group as > the attribute, and that overall group quota is reached, will that user > be stopped by gpfs? > File sets are the answer to your problems, but retrospectively applying them to a file system is a pain. You create a file set for a directory and can then apply a quota to the file set. Even better you can apply per file set user and group quotas. So if file set A has a 1TB quota you could limit user X to 100GB in the file set, but outside the file set they could have a different quota or even no quota. Only issue is a limit of ~10,000 file sets per file system JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From pinto at scinet.utoronto.ca Wed Aug 3 17:55:43 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 12:55:43 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: <20160803125543.11831ypcdi8i189b@support.scinet.utoronto.ca> I guess I have a bit of a puzzle to solve, combining quotas on filesets, paths and USR/GRP attributes So much for the "standard" built-in linux account creation script, in which by default every new user is created with primary GID=UID, doesn't really help any of us. Jaime Quoting "Jonathan Buzzard" : > On 03/08/16 17:22, Jaime Pinto wrote: >> Suppose I want to set both USR and GRP quotas for a user, however GRP is >> not the primary group. Will gpfs enforce the secondary group quota for >> that user? 
> > Nope that's not how POSIX schematics work for group quotas. As far as I > can tell only your primary group is used for group quotas. It basically > makes group quotas in Unix a waste of time in my opinion. At least I > have never come across a real world scenario where they work in a > useful manner. > >> What I mean is, if the user keeps writing files with secondary group as >> the attribute, and that overall group quota is reached, will that user >> be stopped by gpfs? >> > > File sets are the answer to your problems, but retrospectively applying > them to a file system is a pain. You create a file set for a directory > and can then apply a quota to the file set. Even better you can apply > per file set user and group quotas. So if file set A has a 1TB quota > you could limit user X to 100GB in the file set, but outside the file > set they could have a different quota or even no quota. > > Only issue is a limit of ~10,000 file sets per file system > > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 19:06:34 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 18:06:34 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: Hi Sven, Wait - am I misunderstanding something here? Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Thanks for clarifying? Kevin On Aug 3, 2016, at 11:35 AM, Sven Oehme > wrote: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Aug 3 19:30:08 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 14:30:08 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> Quoting "Buterbaugh, Kevin L" : > Hi Sven, > > Wait - am I misunderstanding something here? Let?s say that I have > ?user1? who has primary group ?group1? and secondary group ?group2?. > And let?s say that they write to a directory where the bit on the > directory forces all files created in that directory to have group2 > associated with them. Are you saying that those files still count > against group1?s group quota??? > > Thanks for clarifying? > > Kevin Not really, My interpretation is that all files written with group2 will count towards the quota on that group. However any users with group2 as the primary group will be prevented from writing any further when the group2 quota is reached. However the culprit user1 with primary group as group1 won't be detected by gpfs, and can just keep going on writing group2 files. As far as the individual user quota, it doesn't matter: group1 or group2 it will be counted towards the usage of that user. It would be interesting if the behavior was more as expected. I just checked with my Lustre counter-parts and they tell me whichever secondary group is hit first, however many there may be, the user will be stopped. The problem then becomes identifying which of the secondary groups hit the limit for that user. Jaime > > On Aug 3, 2016, at 11:35 AM, Sven Oehme > > wrote: > > Hi, > > quotas are only counted against primary group > > sven > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > > wrote: > Suppose I want to set both USR and GRP quotas for a user, however > GRP is not the primary group. Will gpfs enforce the secondary group > quota for that user? > > What I mean is, if the user keeps writing files with secondary group > as the attribute, and that overall group quota is reached, will > that user be stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 19:34:21 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 18:34:21 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? 
In-Reply-To: <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> Message-ID: <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> Hi Jaime / Sven, If Jaime?s interpretation is correct about user1 continuing to be able to write to ?group2? files even though that group is at their hard limit, then that?s a bug that needs fixing. I haven?t tested that myself, and we?re in a downtime right now so I?m a tad bit busy, but if I need to I?ll test it on our test cluster later this week. Kevin On Aug 3, 2016, at 1:30 PM, Jaime Pinto > wrote: Quoting "Buterbaugh, Kevin L" >: Hi Sven, Wait - am I misunderstanding something here? Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Thanks for clarifying? Kevin Not really, My interpretation is that all files written with group2 will count towards the quota on that group. However any users with group2 as the primary group will be prevented from writing any further when the group2 quota is reached. However the culprit user1 with primary group as group1 won't be detected by gpfs, and can just keep going on writing group2 files. As far as the individual user quota, it doesn't matter: group1 or group2 it will be counted towards the usage of that user. It would be interesting if the behavior was more as expected. I just checked with my Lustre counter-parts and they tell me whichever secondary group is hit first, however many there may be, the user will be stopped. The problem then becomes identifying which of the secondary groups hit the limit for that user. Jaime On Aug 3, 2016, at 11:35 AM, Sven Oehme > wrote: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Wed Aug 3 19:46:54 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Wed, 3 Aug 2016 19:46:54 +0100 Subject: [gpfsug-discuss] quota on secondary groups for a user? 
In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: On 03/08/16 19:06, Buterbaugh, Kevin L wrote: > Hi Sven, > > Wait - am I misunderstanding something here? Let?s say that I have > ?user1? who has primary group ?group1? and secondary group ?group2?. > And let?s say that they write to a directory where the bit on the > directory forces all files created in that directory to have group2 > associated with them. Are you saying that those files still count > against group1?s group quota??? > Yeah, but bastard user from hell over here then does chgrp group1 myevilfile.txt and your set group id bit becomes irrelevant because it is only ever indicative. In fact there is nothing that guarantees the set group id bit is honored because there is nothing stopping the user or a program coming in immediately after the file is created and changing that. Not pointing fingers at the OSX SMB client when Unix extensions are active on a Samba server in any way there. As such Unix group quotas are in the real world a total waste of space. This is if you ask me why XFS and Lustre have project quotas and GPFS has file sets. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 19:55:01 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 18:55:01 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: JAB, The set group id bit is tangential to my point. I expect GPFS to count any files a user owns against their user quota. If they are a member of multiple groups then I also expect it to count it against the group quota of whatever group is associated with that file. I.e., if they do a chgrp then GPFS should subtract from one group and add to another. Kevin On Aug 3, 2016, at 1:46 PM, Jonathan Buzzard > wrote: On 03/08/16 19:06, Buterbaugh, Kevin L wrote: Hi Sven, Wait - am I misunderstanding something here? Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Yeah, but bastard user from hell over here then does chgrp group1 myevilfile.txt and your set group id bit becomes irrelevant because it is only ever indicative. In fact there is nothing that guarantees the set group id bit is honored because there is nothing stopping the user or a program coming in immediately after the file is created and changing that. Not pointing fingers at the OSX SMB client when Unix extensions are active on a Samba server in any way there. As such Unix group quotas are in the real world a total waste of space. This is if you ask me why XFS and Lustre have project quotas and GPFS has file sets. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
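One way to test that expectation directly is to write a file, flip its group, and compare the group report before and after (a sketch; the path, names and file size are examples):

# write a file that lands under group1, then change its group
dd if=/dev/zero of=/gpfs/gpfs01/quotatest/file1 bs=1M count=8
mmrepquota -g gpfs01
chgrp group2 /gpfs/gpfs01/quotatest/file1
mmrepquota -g gpfs01

# if the in-memory accounting looks stale, reconcile it against what is on disk
mmcheckquota gpfs01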
URL: From jonathan at buzzard.me.uk Wed Aug 3 20:13:09 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Wed, 3 Aug 2016 20:13:09 +0100 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> Message-ID: <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> On 03/08/16 19:34, Buterbaugh, Kevin L wrote: > Hi Jaime / Sven, > > If Jaime?s interpretation is correct about user1 continuing to be able > to write to ?group2? files even though that group is at their hard > limit, then that?s a bug that needs fixing. I haven?t tested that > myself, and we?re in a downtime right now so I?m a tad bit busy, but if > I need to I?ll test it on our test cluster later this week. > Even if Jamie's interpretation is wrong it shows the other massive failure of group quotas under Unix and why they are not fit for purpose in the real world. So bufh here can deliberately or accidentally do a denial of service on other users and tracking down the offending user is a right pain in the backside. The point of being able to change group ownership on a file is to indicate the massive weakness of the whole group quota system, and why in my experience nobody actually uses it, and "project" quota options have been implemented in many "enterprise" Unix file systems. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 20:18:11 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 19:18:11 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> Message-ID: <6B06DA37-321E-4730-A3D1-61E41E4C6187@vanderbilt.edu> JAB, Our scratch filesystem uses user and group quotas. It started out as a traditional scratch filesystem but then we decided (for better or worse) to allow groups to purchase quota on it (and we don?t purge it, as many sites do). We have many users in multiple groups, so if this is not working right it?s a potential issue for us. But you?re right, I?m a nobody? Kevin On Aug 3, 2016, at 2:13 PM, Jonathan Buzzard > wrote: On 03/08/16 19:34, Buterbaugh, Kevin L wrote: Hi Jaime / Sven, If Jaime?s interpretation is correct about user1 continuing to be able to write to ?group2? files even though that group is at their hard limit, then that?s a bug that needs fixing. I haven?t tested that myself, and we?re in a downtime right now so I?m a tad bit busy, but if I need to I?ll test it on our test cluster later this week. Even if Jamie's interpretation is wrong it shows the other massive failure of group quotas under Unix and why they are not fit for purpose in the real world. So bufh here can deliberately or accidentally do a denial of service on other users and tracking down the offending user is a right pain in the backside. 
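For what it is worth, hunting the offender down usually comes back to the per-user report plus a metadata scan, along these lines (a sketch; the filesystem name, GID and output prefix are examples and the policy rule is untested):

# per-user usage report for the filesystem
mmrepquota -u gpfs01

# list files owned by the suspect group (GID 1002 here), showing owner and allocated KB
cat > /tmp/bygroup.pol <<'EOF'
RULE EXTERNAL LIST 'g2files' EXEC ''
RULE 'g2' LIST 'g2files' SHOW(varchar(USER_ID) || ' ' || varchar(KB_ALLOCATED)) WHERE GROUP_ID = 1002
EOF
mmapplypolicy gpfs01 -P /tmp/bygroup.pol -I defer -f /tmp/g2scan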
The point of being able to change group ownership on a file is to indicate the massive weakness of the whole group quota system, and why in my experience nobody actually uses it, and "project" quota options have been implemented in many "enterprise" Unix file systems. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at gmail.com Wed Aug 3 21:32:32 2016 From: oehmes at gmail.com (Sven Oehme) Date: Wed, 3 Aug 2016 13:32:32 -0700 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <6B06DA37-321E-4730-A3D1-61E41E4C6187@vanderbilt.edu> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> <6B06DA37-321E-4730-A3D1-61E41E4C6187@vanderbilt.edu> Message-ID: i can't contribute much to the usefulness of tracking primary or secondary group. depending on who you ask you get a 50/50 answer why its great or broken either way. Jonathan explanation was correct, we only track/enforce primary groups , we don't do anything with secondary groups in regards to quotas. if there is 'doubt' of correct quotation of files on the disk in the filesystem one could always run mmcheckquota, its i/o intensive but will match quota usage of the in memory 'assumption' and update it from the actual data thats stored on disk. sven On Wed, Aug 3, 2016 at 12:18 PM, Buterbaugh, Kevin L < Kevin.Buterbaugh at vanderbilt.edu> wrote: > JAB, > > Our scratch filesystem uses user and group quotas. It started out as a > traditional scratch filesystem but then we decided (for better or worse) to > allow groups to purchase quota on it (and we don?t purge it, as many sites > do). > > We have many users in multiple groups, so if this is not working right > it?s a potential issue for us. But you?re right, I?m a nobody? > > Kevin > > On Aug 3, 2016, at 2:13 PM, Jonathan Buzzard > wrote: > > On 03/08/16 19:34, Buterbaugh, Kevin L wrote: > > Hi Jaime / Sven, > > If Jaime?s interpretation is correct about user1 continuing to be able > to write to ?group2? files even though that group is at their hard > limit, then that?s a bug that needs fixing. I haven?t tested that > myself, and we?re in a downtime right now so I?m a tad bit busy, but if > I need to I?ll test it on our test cluster later this week. > > > Even if Jamie's interpretation is wrong it shows the other massive failure > of group quotas under Unix and why they are not fit for purpose in the real > world. > > So bufh here can deliberately or accidentally do a denial of service on > other users and tracking down the offending user is a right pain in the > backside. > > The point of being able to change group ownership on a file is to indicate > the massive weakness of the whole group quota system, and why in my > experience nobody actually uses it, and "project" quota options have been > implemented in many "enterprise" Unix file systems. > > JAB. > > -- > Jonathan A. 
Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and > Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Thu Aug 4 00:03:47 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 3 Aug 2016 23:03:47 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803125543.11831ypcdi8i189b@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> <20160803125543.11831ypcdi8i189b@support.scinet.utoronto.ca> Message-ID: <762ff4f5796c4992b3bceb23b26fdbf3@exch1-cdc.nexus.csiro.au> The GID selection rules for account creation are Linux distribution specific. It sounds like you are familiar with Red Hat, where I think this idea of GID=UID started. sles12sp1-brc:/dev/disk/by-uuid # useradd testout sles12sp1-brc:/dev/disk/by-uuid # grep testout /etc/passwd testout:x:1001:100::/home/testout:/bin/bash sles12sp1-brc:/dev/disk/by-uuid # grep 100 /etc/group users:x:100: sles12sp1-brc:/dev/disk/by-uuid # Cheers, Greg -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jaime Pinto Sent: Thursday, 4 August 2016 2:56 AM To: gpfsug main discussion list ; Jonathan Buzzard Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? I guess I have a bit of a puzzle to solve, combining quotas on filesets, paths and USR/GRP attributes So much for the "standard" built-in linux account creation script, in which by default every new user is created with primary GID=UID, doesn't really help any of us. Jaime Quoting "Jonathan Buzzard" : > On 03/08/16 17:22, Jaime Pinto wrote: >> Suppose I want to set both USR and GRP quotas for a user, however GRP >> is not the primary group. Will gpfs enforce the secondary group quota >> for that user? > > Nope that's not how POSIX schematics work for group quotas. As far as > I can tell only your primary group is used for group quotas. It > basically makes group quotas in Unix a waste of time in my opinion. At > least I have never come across a real world scenario where they work > in a useful manner. > >> What I mean is, if the user keeps writing files with secondary group >> as the attribute, and that overall group quota is reached, will that >> user be stopped by gpfs? >> > > File sets are the answer to your problems, but retrospectively > applying them to a file system is a pain. You create a file set for a > directory and can then apply a quota to the file set. Even better you > can apply per file set user and group quotas. So if file set A has a > 1TB quota you could limit user X to 100GB in the file set, but outside > the file set they could have a different quota or even no quota. > > Only issue is a limit of ~10,000 file sets per file system > > > JAB. > > -- > Jonathan A. 
Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Greg.Lehmann at csiro.au Thu Aug 4 03:41:55 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Thu, 4 Aug 2016 02:41:55 +0000 Subject: [gpfsug-discuss] 4.2.1 documentation Message-ID: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> I see only 4 pdfs now with slightly different titles to the previous 5 pdfs available with 4.2.0. Just checking there are only supposed to be 4 now? Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenneth.waegeman at ugent.be Thu Aug 4 09:13:29 2016 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Thu, 4 Aug 2016 10:13:29 +0200 Subject: [gpfsug-discuss] 4.2.1 documentation In-Reply-To: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> References: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> Message-ID: <57A2F929.8000003@ugent.be> This is new, it is explained how they are merged at http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1xx_soc.htm Cheers! K On 04/08/16 04:41, Greg.Lehmann at csiro.au wrote: > > I see only 4 pdfs now with slightly different titles to the previous 5 > pdfs available with 4.2.0. Just checking there are only supposed to be > 4 now? > > Greg > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Thu Aug 4 09:13:51 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Thu, 4 Aug 2016 08:13:51 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: 1000 isn't it?! We've always worked on that assumption. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard Sent: 03 August 2016 17:44 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? in the file set, but outside the file set they could have a different quota or even no quota. Only issue is a limit of ~10,000 file sets per file system JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From r.sobey at imperial.ac.uk Thu Aug 4 09:17:01 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Thu, 4 Aug 2016 08:17:01 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? 
In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: Ah. Dependent vs independent. (10,000 and 1000 respectively). -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 04 August 2016 09:14 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? 1000 isn't it?! We've always worked on that assumption. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard Sent: 03 August 2016 17:44 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? in the file set, but outside the file set they could have a different quota or even no quota. Only issue is a limit of ~10,000 file sets per file system JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From st.graf at fz-juelich.de Thu Aug 4 09:20:42 2016 From: st.graf at fz-juelich.de (Stephan Graf) Date: Thu, 4 Aug 2016 10:20:42 +0200 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: <57A2FADA.1060508@fz-juelich.de> Hi! I have tested it with dependent filesets in GPFS 4.1.1.X and there the limit is 10.000. Stephan On 08/04/16 10:13, Sobey, Richard A wrote: > 1000 isn't it?! We've always worked on that assumption. > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard > Sent: 03 August 2016 17:44 > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? > in the file set, but outside the file set they could have a different quota or even no quota. > > Only issue is a limit of ~10,000 file sets per file system > > > JAB. > -- Stephan Graf Juelich Supercomputing Centre Institute for Advanced Simulation Forschungszentrum Juelich GmbH 52425 Juelich, Germany Phone: +49-2461-61-6578 Fax: +49-2461-61-6656 E-mail: st.graf at fz-juelich.de WWW: http://www.fz-juelich.de/jsc/ ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. 
Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ From daniel.kidger at uk.ibm.com Thu Aug 4 09:22:36 2016 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Thu, 4 Aug 2016 08:22:36 +0000 Subject: [gpfsug-discuss] 4.2.1 documentation In-Reply-To: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> Message-ID: Yes they have been re arranged. My observation is that the Admin and Advanced Admin have merged into one PDFs, and the DMAPI manual is now a chapter of the new Programming guide (along with the complete set of man pages which have moved out of the Admin guide). Table 3 on page 26 of the Concepts, Planning and Install guide describes these change. IMHO The new format is much better as all Admin is in one place not two. ps. I couldn't find in the programming guide a chapter yet on Light Weight Events. Anyone in product development care to comment? :-) Daniel IBM Spectrum Storage Software +44 (0)7818 522266 Sent from my iPad using IBM Verse On 4 Aug 2016, 03:42:21, Greg.Lehmann at csiro.au wrote: From: Greg.Lehmann at csiro.au To: gpfsug-discuss at spectrumscale.org Cc: Date: 4 Aug 2016 03:42:21 Subject: [gpfsug-discuss] 4.2.1 documentation I see only 4 pdfs now with slightly different titles to the previous 5 pdfs available with 4.2.0. Just checking there are only supposed to be 4 now? GregUnless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Thu Aug 4 16:59:31 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Thu, 04 Aug 2016 11:59:31 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> Message-ID: <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> Since there were inconsistencies in the responses, I decided to rig a couple of accounts/groups on our LDAP to test "My interpretation", and determined that I was wrong. When Kevin mentioned it would mean a bug I had to double-check: If a user hits the hard quota or exceeds the grace period on the soft quota on any of the secondary groups that user will be stopped from further writing to those groups as well, just as in the primary group. I hope this clears the waters a bit. I still have to solve my puzzle. Thanks everyone for the feedback. Jaime Quoting "Jaime Pinto" : > Quoting "Buterbaugh, Kevin L" : > >> Hi Sven, >> >> Wait - am I misunderstanding something here? Let?s say that I have >> ?user1? who has primary group ?group1? and secondary group >> ?group2?. And let?s say that they write to a directory where the >> bit on the directory forces all files created in that directory to >> have group2 associated with them. Are you saying that those >> files still count against group1?s group quota??? >> >> Thanks for clarifying? >> >> Kevin > > Not really, > > My interpretation is that all files written with group2 will count > towards the quota on that group. However any users with group2 as the > primary group will be prevented from writing any further when the > group2 quota is reached. 
However the culprit user1 with primary group > as group1 won't be detected by gpfs, and can just keep going on writing > group2 files. > > As far as the individual user quota, it doesn't matter: group1 or > group2 it will be counted towards the usage of that user. > > It would be interesting if the behavior was more as expected. I just > checked with my Lustre counter-parts and they tell me whichever > secondary group is hit first, however many there may be, the user will > be stopped. The problem then becomes identifying which of the secondary > groups hit the limit for that user. > > Jaime > > >> >> On Aug 3, 2016, at 11:35 AM, Sven Oehme >> > wrote: >> >> Hi, >> >> quotas are only counted against primary group >> >> sven >> >> >> On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto >> > wrote: >> Suppose I want to set both USR and GRP quotas for a user, however >> GRP is not the primary group. Will gpfs enforce the secondary group >> quota for that user? >> >> What I mean is, if the user keeps writing files with secondary >> group as the attribute, and that overall group quota is reached, >> will that user be stopped by gpfs? >> >> Thanks >> Jaime >> >> >> >> >> ************************************ >> TELL US ABOUT YOUR SUCCESS STORIES >> http://www.scinethpc.ca/testimonials >> ************************************ >> --- >> Jaime Pinto >> SciNet HPC Consortium - Compute/Calcul Canada >> www.scinet.utoronto.ca - >> www.computecanada.org >> University of Toronto >> 256 McCaul Street, Room 235 >> Toronto, ON, M5T1W5 >> P: 416-978-2755 >> C: 416-505-1477 >> > > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Thu Aug 4 17:08:30 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Thu, 4 Aug 2016 16:08:30 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> Message-ID: <7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu> Hi Jaime, Thank you sooooo much for doing this and reporting back the results! They?re in line with what I would expect to happen. I was going to test this as well, but we have had to extend our downtime until noontime tomorrow, so I haven?t had a chance to do so yet. Now I don?t have to? ;-) Kevin On Aug 4, 2016, at 10:59 AM, Jaime Pinto > wrote: Since there were inconsistencies in the responses, I decided to rig a couple of accounts/groups on our LDAP to test "My interpretation", and determined that I was wrong. 
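For anyone who wants to repeat that kind of check on a test cluster, a minimal sketch (recent 4.x mmsetquota syntax, older releases use mmedquota; user1, group2, gpfs01 and the paths are placeholders, and user1 has group2 only as a secondary group):

# give group2 a small hard limit
mmsetquota gpfs01 --group group2 --block 64M:64M

# write as user1 with group2 active via sg; dd should fail with
# "Disk quota exceeded" (EDQUOT) once the limit is hit
su - user1 -c 'sg group2 -c "dd if=/dev/zero of=/gpfs/gpfs01/scratch/blob bs=1M count=128"'

mmlsquota -g group2 gpfs01      # group2 should now show as over its limit
# if the accounting ever looks stale, mmcheckquota gpfs01 rescans it (I/O heavy)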
When Kevin mentioned it would mean a bug I had to double-check: If a user hits the hard quota or exceeds the grace period on the soft quota on any of the secondary groups that user will be stopped from further writing to those groups as well, just as in the primary group. I hope this clears the waters a bit. I still have to solve my puzzle. Thanks everyone for the feedback. Jaime Quoting "Jaime Pinto" >: Quoting "Buterbaugh, Kevin L" >: Hi Sven, Wait - am I misunderstanding something here? Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Thanks for clarifying? Kevin Not really, My interpretation is that all files written with group2 will count towards the quota on that group. However any users with group2 as the primary group will be prevented from writing any further when the group2 quota is reached. However the culprit user1 with primary group as group1 won't be detected by gpfs, and can just keep going on writing group2 files. As far as the individual user quota, it doesn't matter: group1 or group2 it will be counted towards the usage of that user. It would be interesting if the behavior was more as expected. I just checked with my Lustre counter-parts and they tell me whichever secondary group is hit first, however many there may be, the user will be stopped. The problem then becomes identifying which of the secondary groups hit the limit for that user. Jaime On Aug 3, 2016, at 11:35 AM, Sven Oehme > wrote: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Thu Aug 4 17:34:09 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Thu, 04 Aug 2016 12:34:09 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> <7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu> Message-ID: <20160804123409.18403cy3iz123gxt@support.scinet.utoronto.ca> OK More info: Users can apply the 'sg group1' or 'sq group2' command from a shell or script to switch the group mask from that point on, and dodge the quota that may have been exceeded on a group. However, as the group owner or other member of the group on the limit, I could not find a tool they can use on their own to find out who is(are) the largest user(s); 'du' takes too long, and some users don't give read permissions on their directories. As part of the puzzle solution I have to come up with a root wrapper that can make the contents of the mmrepquota report available to them. Jaime Quoting "Buterbaugh, Kevin L" : > Hi Jaime, > > Thank you sooooo much for doing this and reporting back the results! > They?re in line with what I would expect to happen. I was going > to test this as well, but we have had to extend our downtime until > noontime tomorrow, so I haven?t had a chance to do so yet. Now I > don?t have to? ;-) > > Kevin > > On Aug 4, 2016, at 10:59 AM, Jaime Pinto > > wrote: > > Since there were inconsistencies in the responses, I decided to rig > a couple of accounts/groups on our LDAP to test "My interpretation", > and determined that I was wrong. When Kevin mentioned it would mean > a bug I had to double-check: > > If a user hits the hard quota or exceeds the grace period on the > soft quota on any of the secondary groups that user will be stopped > from further writing to those groups as well, just as in the primary > group. > > I hope this clears the waters a bit. I still have to solve my puzzle. > > Thanks everyone for the feedback. > Jaime > > > > Quoting "Jaime Pinto" > >: > > Quoting "Buterbaugh, Kevin L" > >: > > Hi Sven, > > Wait - am I misunderstanding something here? Let?s say that I have > ?user1? who has primary group ?group1? and secondary group > ?group2?. And let?s say that they write to a directory where the > bit on the directory forces all files created in that directory to > have group2 associated with them. Are you saying that those files > still count against group1?s group quota??? > > Thanks for clarifying? > > Kevin > > Not really, > > My interpretation is that all files written with group2 will count > towards the quota on that group. However any users with group2 as the > primary group will be prevented from writing any further when the > group2 quota is reached. However the culprit user1 with primary group > as group1 won't be detected by gpfs, and can just keep going on writing > group2 files. > > As far as the individual user quota, it doesn't matter: group1 or > group2 it will be counted towards the usage of that user. > > It would be interesting if the behavior was more as expected. 
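One possible shape for that wrapper, assuming it is published through sudo so that SUDO_USER identifies the real caller (untested sketch; the script path, filesystem name and the three-header-line assumption are placeholders):

#!/bin/bash
# /usr/local/sbin/mygrpquota - print only the mmrepquota -g lines whose
# first column is one of the calling user's groups
fs=gpfs01
caller=${SUDO_USER:-$USER}
pattern=$(id -Gn "$caller" | tr ' ' '|')
/usr/lpp/mmfs/bin/mmrepquota -g "$fs" | awk -v pat="^(${pattern})$" 'NR<=3 || $1 ~ pat'

Users would then run something like "sudo mygrpquota", with a sudoers entry restricted to that one script.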
I just > checked with my Lustre counter-parts and they tell me whichever > secondary group is hit first, however many there may be, the user will > be stopped. The problem then becomes identifying which of the secondary > groups hit the limit for that user. > > Jaime > > > > On Aug 3, 2016, at 11:35 AM, Sven Oehme > > > wrote: > > Hi, > > quotas are only counted against primary group > > sven > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > > > wrote: > Suppose I want to set both USR and GRP quotas for a user, however > GRP is not the primary group. Will gpfs enforce the secondary group > quota for that user? > > What I mean is, if the user keeps writing files with secondary > group as the attribute, and that overall group quota is reached, > will that user be stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 10 22:00:26 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 10 Aug 2016 21:00:26 +0000 Subject: [gpfsug-discuss] User group meeting at SC16? Message-ID: Hi All, Just got an e-mail from DDN announcing that they are holding their user group meeting at SC16 on Monday afternoon like they always do, which is prompting me to inquire if IBM is going to be holding a meeting at SC16? Last year in Austin the IBM meeting was on Sunday afternoon, which worked out great as far as I was concerned. Thanks? ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Aug 10 22:04:11 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 10 Aug 2016 21:04:11 +0000 Subject: [gpfsug-discuss] User group meeting at SC16? In-Reply-To: References: Message-ID: <95126B16-B4DB-4406-862B-AA81E37F04E6@nuance.com> We're still trying to schedule that - The thinking right now is staying where last year. (Sunday afternoon) There is never a perfect time at these sorts of event - bound to step on something! If anyone has feedback (positive or negative) - let us know. Look for a formal announcement in early September. Bob Oesterlin GPFS-UG Co-Principal Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Buterbaugh, Kevin L" Reply-To: gpfsug main discussion list Date: Wednesday, August 10, 2016 at 4:00 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] User group meeting at SC16? Hi All, Just got an e-mail from DDN announcing that they are holding their user group meeting at SC16 on Monday afternoon like they always do, which is prompting me to inquire if IBM is going to be holding a meeting at SC16? Last year in Austin the IBM meeting was on Sunday afternoon, which worked out great as far as I was concerned. Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From malone12 at illinois.edu Wed Aug 10 22:43:15 2016 From: malone12 at illinois.edu (Maloney, John Daniel) Date: Wed, 10 Aug 2016 21:43:15 +0000 Subject: [gpfsug-discuss] User group meeting at SC16? Message-ID: <4AD486D7-D452-465A-85EC-1BDDE2C5DCFD@illinois.edu> Hi Bob, Thanks for the update! The couple storage folks from NCSA going to SC16 won?t be available Sunday (I?m not able to get in until Monday morning). Agree completely there is never a perfect time, just giving our feedback. Thanks again, J.D. Maloney Storage Engineer | Storage Enabling Technologies Group National Center for Supercomputing Applications (NCSA) From: > on behalf of "Oesterlin, Robert" > Reply-To: gpfsug main discussion list > Date: Wednesday, August 10, 2016 at 4:04 PM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] User group meeting at SC16? We're still trying to schedule that - The thinking right now is staying where last year. (Sunday afternoon) There is never a perfect time at these sorts of event - bound to step on something! If anyone has feedback (positive or negative) - let us know. Look for a formal announcement in early September. Bob Oesterlin GPFS-UG Co-Principal Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Wednesday, August 10, 2016 at 4:00 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] User group meeting at SC16? Hi All, Just got an e-mail from DDN announcing that they are holding their user group meeting at SC16 on Monday afternoon like they always do, which is prompting me to inquire if IBM is going to be holding a meeting at SC16? Last year in Austin the IBM meeting was on Sunday afternoon, which worked out great as far as I was concerned. Thanks? ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Thu Aug 11 05:47:17 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 11 Aug 2016 00:47:17 -0400 Subject: [gpfsug-discuss] GPFS and SELinux Message-ID: Hi Everyone, I'm passing this along on behalf of one of our security guys. Just wondering what feedback/thoughts others have on the topic. Current IBM guidance on GPFS and SELinux indicates that the default context for services (initrc_t) is insufficient for GPFS operations. See: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General+Parallel+File+System+(GPFS)/page/Using+GPFS+with+SElinux That part is true (by design), but IBM goes further to say use runcon out of rc.local and configure the gpfs service to not start via init. I believe these latter two (rc.local/runcon and no-init) can be addressed, relatively trivially, through the application of a small selinux policy. Ideally, I would hope for IBM to develop, test, and send out the policy, but I'm happy to offer the following suggestions. I believe "a)" could be developed in a relatively short period of time. "b)" would take more time, effort and experience. a) consider SELinux context transition. As an example, consider: https://github.com/TresysTechnology/refpolicy/tree/master/policy/modules/services (specifically, the ssh components) On a normal centOS/RHEL system sshd has the file context of sshd_exec_t, and runs under sshd_t Referencing ssh.te, you see several references to sshd_exec_t in: domtrans_pattern init_daemon_domain daemontools_service_domain (and so on) These configurations allow init to fire sshd off, setting its runtime context to sshd_t, based on the file context of sshd_exec_t. This should be duplicable for the gpfs daemon, altho I note it seems to be fired through a layer of abstraction in mmstartup. A simple policy that allows INIT to transition GPFS to unconfined_t would go a long way towards easing integration. b) file contexts of gpfs_daemon_t and gpfs_util_t, perhaps, that when executed, would pick up a context of gpfs_t? Which then could be mapped through standard SELinux policy to allow access to configuration files (gpfs_etc_t?), block devices, etc? I admit, in b, I am speculating heavily. -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From janfrode at tanso.net Thu Aug 11 10:54:27 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 11 Aug 2016 11:54:27 +0200 Subject: [gpfsug-discuss] GPFS and SELinux In-Reply-To: References: Message-ID: I believe the runcon part is no longer necessary, at least on my RHEL7 based systems mmfsd is running unconfined by default: [root at flexscale01 ~]# ps -efZ|grep mmfsd unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 root 18018 17709 0 aug.05 ? 00:24:53 /usr/lpp/mmfs/bin/mmfsd and I've never seen any problems with that for base GPFS. I suspect doing a proper selinux domain for GPFS will be quite close to unconfined, so maybe not worth the effort... -jf On Thu, Aug 11, 2016 at 6:47 AM, Aaron Knister wrote: > Hi Everyone, > > I'm passing this along on behalf of one of our security guys. Just > wondering what feedback/thoughts others have on the topic. 
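To make option (a) concrete, an untested sketch of such a transition module, built with the refpolicy devel Makefile (needs the selinux-policy-devel package; the type gpfs_exec_t and the module name are made up for illustration, and depending on how mmstartup/runmmfs exec things the label may need to go on a parent script rather than mmfsd itself):

cat > gpfs_local.te <<'EOF'
policy_module(gpfs_local, 1.0)

require {
        type initrc_t;
        type unconfined_t;
}

# new entrypoint label for the GPFS daemon binary
type gpfs_exec_t;
files_type(gpfs_exec_t)

# allow init scripts (initrc_t) to transition anything labelled
# gpfs_exec_t into unconfined_t when it is exec'd
domtrans_pattern(initrc_t, gpfs_exec_t, unconfined_t)
EOF

make -f /usr/share/selinux/devel/Makefile gpfs_local.pp
semodule -i gpfs_local.pp
semanage fcontext -a -t gpfs_exec_t '/usr/lpp/mmfs/bin/mmfsd'
restorecon -v /usr/lpp/mmfs/bin/mmfsd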
> > > Current IBM guidance on GPFS and SELinux indicates that the default > context for services (initrc_t) is insufficient for GPFS operations. > > See: > https://www.ibm.com/developerworks/community/wikis/home? > lang=en#!/wiki/General+Parallel+File+System+(GPFS)/ > page/Using+GPFS+with+SElinux > > > That part is true (by design), but IBM goes further to say use runcon > out of rc.local and configure the gpfs service to not start via init. > > I believe these latter two (rc.local/runcon and no-init) can be > addressed, relatively trivially, through the application of a small > selinux policy. > > Ideally, I would hope for IBM to develop, test, and send out the policy, > but I'm happy to offer the following suggestions. I believe "a)" could > be developed in a relatively short period of time. "b)" would take more > time, effort and experience. > > a) consider SELinux context transition. > > As an example, consider: > https://github.com/TresysTechnology/refpolicy/tree/master/ > policy/modules/services > > > (specifically, the ssh components) > > On a normal centOS/RHEL system sshd has the file context of sshd_exec_t, > and runs under sshd_t > > Referencing ssh.te, you see several references to sshd_exec_t in: > domtrans_pattern > init_daemon_domain > daemontools_service_domain > (and so on) > > These configurations allow init to fire sshd off, setting its runtime > context to sshd_t, based on the file context of sshd_exec_t. > > This should be duplicable for the gpfs daemon, altho I note it seems to > be fired through a layer of abstraction in mmstartup. > > A simple policy that allows INIT to transition GPFS to unconfined_t > would go a long way towards easing integration. > > b) file contexts of gpfs_daemon_t and gpfs_util_t, perhaps, that when > executed, would pick up a context of gpfs_t? Which then could be mapped > through standard SELinux policy to allow access to configuration files > (gpfs_etc_t?), block devices, etc? > > I admit, in b, I am speculating heavily. > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From douglasof at us.ibm.com Fri Aug 12 20:40:27 2016 From: douglasof at us.ibm.com (Douglas O'flaherty) Date: Fri, 12 Aug 2016 19:40:27 +0000 Subject: [gpfsug-discuss] HPCwire Readers Choice Message-ID: Reminder... Get your stories in today! To view this email in your browser, click here. Last Call for Readers' Choice Award Nominations! Deadline: Friday, August 12th at 11:50pm! Only 3 days left until nominations for the 2016 HPCwire Readers' Choice Awards come to a close! Be sure to submit your picks for the best in HPC and make your voice heard before it's too late! These annual awards are a way for our community to recognize the best and brightest innovators within the global HPC community. Time is running out for you to nominate what you think are the greatest achievements in HPC for 2016, so cast your ballot today! 
The 2016 Categories Include the Following: * Best Use of HPC Application in Life Sciences * Best Use of HPC Application in Manufacturing * Best Use of HPC Application in Energy (previously 'Oil and Gas') * Best Use of HPC in Automotive * Best Use of HPC in Financial Services * Best Use of HPC in Entertainment * Best Use of HPC in the Cloud * Best Use of High Performance Data Analytics * Best Implementation of Energy-Efficient HPC * Best HPC Server Product or Technology * Best HPC Storage Product or Technology * Best HPC Software Product or Technology * Best HPC Visualization Product or Technology * Best HPC Interconnect Product or Technology * Best HPC Cluster Solution or Technology * Best Data-Intensive System (End-User Focused) * Best HPC Collaboration Between Government & Industry * Best HPC Collaboration Between Academia & Industry * Top Supercomputing Achievement * Top 5 New Products or Technologies to Watch * Top 5 Vendors to Watch * Workforce Diversity Leadership Award * Outstanding Leadership in HPC Nominations are accepted from readers, users, vendors - virtually anyone who is connected to the HPC community and is a reader of HPCwire. Nominations will close on August 12, 2016 at 11:59pm. Make your voice heard! Help tell the story of HPC in 2016 by submitting your nominations for the HPCwire Readers' Choice Awards now! Nominations close on August 12, 2016. All nominations are subject to review by the editors of HPCwire with only the most relevant being accepted. Voting begins August 22, 2015. The final presentation of these prestigious and highly anticipated awards to each organization's leading executives will take place live during SC '16 in Salt Lake City, UT. The finalist(s) in each category who receive the most votes will win this year's awards. Open to HPCwire readers only. HPCwire Subscriber Services This email was sent to lwestoby at us.ibm.com. You are receiving this email message as an HPCwire subscriber. To forward this email to a friend, click here. Unsubscribe from this list. Copyright ? 2016 Tabor Communications Inc. All rights reserved. 8445 Camino Santa Fe San Diego, California 92121 P: 858.625.0070 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 40078 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 5880 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Mon Aug 15 10:59:34 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 15 Aug 2016 09:59:34 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? Message-ID: Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they're on different versions? Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Aug 15 12:22:31 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 15 Aug 2016 11:22:31 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: References: Message-ID: In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Sobey, Richard A" Reply-To: gpfsug main discussion list Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems? Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they?re on different versions? Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Mon Aug 15 13:45:25 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Mon, 15 Aug 2016 12:45:25 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: References: Message-ID: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> Richard, I will second what Bob said with one caveat ? on one occasion we had an issue with our multi-cluster setup because the PTF?s were incompatible. However, that was clearly documented in the release notes, which we obviously hadn?t read carefully enough. 
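For reference, the node-at-a-time procedure usually looks roughly like this (a sketch only; the exact package list and the portability-layer rebuild step depend on the release and distro):

mmgetstate -a                 # check quorum will survive losing this node
mmshutdown -N nsd01           # stop GPFS on the node being updated
rpm -Uvh gpfs.*.rpm           # apply the PTF packages extracted from the update
# rebuild the portability layer for the running kernel
# (mmbuildgpl on 4.1+, the make Autoconfig/World/InstallImages steps on 3.5)
mmstartup -N nsd01
mmgetstate -N nsd01           # wait for 'active' before moving to the next node
mmdiag --version              # run on the node to confirm the new build level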
While we generally do rolling upgrades over a two to three week period, we have run for months with clients at differing PTF levels. HTHAL? Kevin On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert > wrote: In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Sobey, Richard A" > Reply-To: gpfsug main discussion list > Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" > Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems? Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they?re on different versions? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Aug 15 13:58:47 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 15 Aug 2016 12:58:47 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> References: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> Message-ID: Thanks Kevin and Bob. PTF = minor version? I can?t think what it might stand for. Something Time Fix? Point in time fix? From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Buterbaugh, Kevin L Sent: 15 August 2016 13:45 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Minor GPFS versions coexistence problems? Richard, I will second what Bob said with one caveat ? on one occasion we had an issue with our multi-cluster setup because the PTF?s were incompatible. However, that was clearly documented in the release notes, which we obviously hadn?t read carefully enough. While we generally do rolling upgrades over a two to three week period, we have run for months with clients at differing PTF levels. HTHAL? Kevin On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert > wrote: In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Sobey, Richard A" > Reply-To: gpfsug main discussion list > Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" > Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems? Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they?re on different versions? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jamiedavis at us.ibm.com Mon Aug 15 14:02:13 2016 From: jamiedavis at us.ibm.com (James Davis) Date: Mon, 15 Aug 2016 13:02:13 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: References: , <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> Message-ID: An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Aug 15 14:05:01 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 15 Aug 2016 13:05:01 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? Message-ID: <28479088-C492-4441-A761-F49E1556E13E@nuance.com> PTF = Program Temporary Fix. IBM-Speak for a fix for a particular problem. Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Sobey, Richard A" Reply-To: gpfsug main discussion list Date: Monday, August 15, 2016 at 7:58 AM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Minor GPFS versions coexistence problems? Thanks Kevin and Bob. PTF = minor version? I can?t think what it might stand for. Something Time Fix? Point in time fix? From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Buterbaugh, Kevin L Sent: 15 August 2016 13:45 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Minor GPFS versions coexistence problems? Richard, I will second what Bob said with one caveat ? on one occasion we had an issue with our multi-cluster setup because the PTF?s were incompatible. However, that was clearly documented in the release notes, which we obviously hadn?t read carefully enough. While we generally do rolling upgrades over a two to three week period, we have run for months with clients at differing PTF levels. HTHAL? Kevin On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert > wrote: In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Sobey, Richard A" > Reply-To: gpfsug main discussion list > Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" > Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems? Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they?re on different versions? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From kdball at us.ibm.com Mon Aug 15 15:12:07 2016 From: kdball at us.ibm.com (Keith D Ball) Date: Mon, 15 Aug 2016 14:12:07 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 55, Issue 16 In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
URL: From jake.carroll at uq.edu.au Mon Aug 15 22:08:58 2016 From: jake.carroll at uq.edu.au (Jake Carroll) Date: Mon, 15 Aug 2016 21:08:58 +0000 Subject: [gpfsug-discuss] More on AFM cache chaining Message-ID: <94AB3BCD-B551-4F3E-9128-65B582A4ABC6@uq.edu.au> Hi there. In the spirit of a conversation a friend showed me a couple of weeks ago from Radhika Parameswaran and Luke Raimbach, we?re doing something similar to Luke (kind of), or at least attempting it, in regards to cache chaining. We?ve got a large research storage platform in Brisbane, Queensland, Australia and we?re trying to leverage a few different modes of operation. Currently: Cache A (IW) connects to what would be a Home (B) which then is effectively an NFS mount to (C) a DMF based NFS export. To a point, this works. It kind of allows us to use ?home? as the ultimate sink, and data migration in and out of DMF seems to be working nicely when GPFS pulls things from (B) which don?t appear to currently be in (A) due to policy, or a HWM was hit (thus emptying cache). We?ve tested it as far out as the data ONLY being offline in tape media inside (C) and it still works, cleanly coming back to (A) within a very reasonable time-frame. ? We hit ?problem 1? which is in and around NFS v4 ACL?s which aren?t surfacing or mapping correctly (as we?d expect). I guess this might be the caveat of trying to backend the cache to a home and have it sitting inside DMF (over an NFS Export) for surfacing of the data for clients. Where we?d like to head: We haven?t seen it yet, but as Luke and Radhika were discussing last month, we really liked the idea of an IW Cache (A, where instruments dump huge data) which then via AFM ends up at (B) (might also be technically ?home? but IW) which is then also a function of (C) which might also be another cache that sits next to a HPC platform for reading and writing data into quickly and out of in parallel. We like the idea of chained caches because it gives us extremely flexibility in the premise of our ?Data anywhere? fabric. We appreciate that this has some challenges, in that we know if you?ve got multiple IW scenarios the last write will always win ? this we can control with workload guidelines. But we?d like to add our voices to this idea of having caches chained all the way back to some point such that data is being pulled all the way from C --> B --> A and along the way, inflection points of IO might be written and read at point C and point B AND point A such that everyone would see the distribution and consistent data in the end. We?re also working on surfacing data via object and file simultaneously for different needs. This is coming along relatively well, but we?re still learning about where and where this does not make sense so far. A moving target, from how it all appears on the surface. Some might say that is effectively asking for a globally eventually (always) consistent filesystem within Scale?. Anyway ? just some thoughts. Regards, -jc -------------- next part -------------- An HTML attachment was scrubbed... 
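For readers who have not set up the building blocks being chained here, a minimal sketch of one IW cache fileset backed by an NFS-exported home (fileset, export and junction names are placeholders; this on its own is a single cache-to-home hop, not the multi-level A -> B -> C chain described above):

# on the cache cluster: independent-writer AFM fileset pointing at the home export
mmcrfileset gpfs01 instr_cache --inode-space=new \
    -p afmtarget=nfs://homenas/export/projects/instr -p afmmode=iw
mmlinkfileset gpfs01 instr_cache -J /gpfs/gpfs01/instr_cache

# check the cache/home relationship and queue state
mmafmctl gpfs01 getstate -j instr_cache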
URL: From aaron.s.knister at nasa.gov Tue Aug 16 03:22:17 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 15 Aug 2016 22:22:17 -0400 Subject: [gpfsug-discuss] mmfsadm test pit Message-ID: I just discovered this interesting gem poking at mmfsadm: test pit fsname list|suspend|status|resume|stop [jobId] There have been times where I've kicked off a restripe and either intentionally or accidentally ctrl-c'd it only to realize that many times it's disappeared into the ether and is still running. The only way I've known so far to stop it is with a chgmgr. A far more painful instance happened when I ran a rebalance on an fs w/more than 31 nsds using more than 31 pit workers and hit *that* fun APAR which locked up access for a single filesystem to all 3.5k nodes. We spent 48 hours round the clock rebooting nodes as jobs drained to clear it up. I would have killed in that instance for a way to cancel the PIT job (the chmgr trick didn't work). It looks like you might actually be able to do this with mmfsadm, although how wise this is, I do not know (kinda curious about that). Here's an example. I kicked off a restripe and then ctrl-c'd it on a client node. Then ran these commands from the fs manager: root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_RUNNING progress 0.00 debug: statusListP D40E2C70 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal stop 785979015170 debug: statusListP 0 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_STOPPING progress 4.01 debug: statusListP D4013E70 ... some time passes ... root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list debug: statusListP 0 Interesting. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From volobuev at us.ibm.com Tue Aug 16 16:21:13 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Tue, 16 Aug 2016 08:21:13 -0700 Subject: [gpfsug-discuss] 4.2.1 documentation In-Reply-To: References: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> Message-ID: Light Weight Event support is not fully baked yet, and thus not documented. It's getting there. yuri From: "Daniel Kidger" To: "gpfsug main discussion list" , Cc: "gpfsug-discuss" Date: 08/04/2016 01:23 AM Subject: Re: [gpfsug-discuss] 4.2.1 documentation Sent by: gpfsug-discuss-bounces at spectrumscale.org Yes they have been re arranged. My observation is that the Admin and Advanced Admin have merged into one PDFs, and the DMAPI manual is now a chapter of the new Programming guide (along with the complete set of man pages which have moved out of the Admin guide). Table 3 on page 26 of the Concepts, Planning and Install guide describes these change. IMHO The new format is much better as all Admin is in one place not two. ps. I couldn't find in the programming guide a chapter yet on Light Weight Events. Anyone in product development care to comment? :-) Daniel IBM Spectrum Storage Software +44 (0)7818 522266 Sent from my iPad using IBM Verse On 4 Aug 2016, 03:42:21, Greg.Lehmann at csiro.au wrote: From: Greg.Lehmann at csiro.au To: gpfsug-discuss at spectrumscale.org Cc: Date: 4 Aug 2016 03:42:21 Subject: [gpfsug-discuss] 4.2.1 documentation I see only 4 pdfs now with slightly different titles to the previous 5 pdfs available with 4.2.0. Just checking there are only supposed to be 4 now? 
Greg Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From volobuev at us.ibm.com Tue Aug 16 16:42:33 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Tue, 16 Aug 2016 08:42:33 -0700 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160804123409.18403cy3iz123gxt@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca><20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca><20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca><7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu> <20160804123409.18403cy3iz123gxt@support.scinet.utoronto.ca> Message-ID: This is a long discussion thread, touching on several related subjects, but as far as the original "secondary groups" question, things are quite simple. A file in a Unix file system has an owning user and an owning group. Those are two IDs that are stored in the inode on disk, and those IDs are used to charge the corresponding user and group quotas. Exactly how the owning GID gets set is an entirely separate question. It may be the current user's primary group, or a secondary group, or a result of chown, etc. To GPFS code it doesn't matter what supplementary GIDs a given thread has in its security context for the purposes of charging group quota, the only thing that matters is the GID in the file inode. yuri From: "Jaime Pinto" To: "gpfsug main discussion list" , "Buterbaugh, Kevin L" , Date: 08/04/2016 09:34 AM Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? Sent by: gpfsug-discuss-bounces at spectrumscale.org OK More info: Users can apply the 'sg group1' or 'sq group2' command from a shell or script to switch the group mask from that point on, and dodge the quota that may have been exceeded on a group. However, as the group owner or other member of the group on the limit, I could not find a tool they can use on their own to find out who is(are) the largest user(s); 'du' takes too long, and some users don't give read permissions on their directories. As part of the puzzle solution I have to come up with a root wrapper that can make the contents of the mmrepquota report available to them. Jaime Quoting "Buterbaugh, Kevin L" : > Hi Jaime, > > Thank you sooooo much for doing this and reporting back the results! > They?re in line with what I would expect to happen. I was going > to test this as well, but we have had to extend our downtime until > noontime tomorrow, so I haven?t had a chance to do so yet. Now I > don?t have to? ;-) > > Kevin > > On Aug 4, 2016, at 10:59 AM, Jaime Pinto > > wrote: > > Since there were inconsistencies in the responses, I decided to rig > a couple of accounts/groups on our LDAP to test "My interpretation", > and determined that I was wrong. 
When Kevin mentioned it would mean > a bug I had to double-check: > > If a user hits the hard quota or exceeds the grace period on the > soft quota on any of the secondary groups that user will be stopped > from further writing to those groups as well, just as in the primary > group. > > I hope this clears the waters a bit. I still have to solve my puzzle. > > Thanks everyone for the feedback. > Jaime > > > > Quoting "Jaime Pinto" > >: > > Quoting "Buterbaugh, Kevin L" > >: > > Hi Sven, > > Wait - am I misunderstanding something here? Let?s say that I have > ?user1? who has primary group ?group1? and secondary group > ?group2?. And let?s say that they write to a directory where the > bit on the directory forces all files created in that directory to > have group2 associated with them. Are you saying that those files > still count against group1?s group quota??? > > Thanks for clarifying? > > Kevin > > Not really, > > My interpretation is that all files written with group2 will count > towards the quota on that group. However any users with group2 as the > primary group will be prevented from writing any further when the > group2 quota is reached. However the culprit user1 with primary group > as group1 won't be detected by gpfs, and can just keep going on writing > group2 files. > > As far as the individual user quota, it doesn't matter: group1 or > group2 it will be counted towards the usage of that user. > > It would be interesting if the behavior was more as expected. I just > checked with my Lustre counter-parts and they tell me whichever > secondary group is hit first, however many there may be, the user will > be stopped. The problem then becomes identifying which of the secondary > groups hit the limit for that user. > > Jaime > > > > On Aug 3, 2016, at 11:35 AM, Sven Oehme > > > wrote: > > Hi, > > quotas are only counted against primary group > > sven > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > < mailto:pinto at scinet.utoronto.ca>> > wrote: > Suppose I want to set both USR and GRP quotas for a user, however > GRP is not the primary group. Will gpfs enforce the secondary group > quota for that user? > > What I mean is, if the user keeps writing files with secondary > group as the attribute, and that overall group quota is reached, > will that user be stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca< http://www.scinet.utoronto.ca/> - > www.computecanada.org< http://www.computecanada.org/> > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. 
> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Robert.Oesterlin at nuance.com Tue Aug 16 16:59:13 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 16 Aug 2016 15:59:13 +0000 Subject: [gpfsug-discuss] Attending IBM Edge? Sessions of note and possible meet-up Message-ID: <29EA4D63-8885-42C5-876C-D68EB9E1CFDE@nuance.com> For those of you on the mailing list attending the IBM Edge conference in September, there will be at least one NDA session on Spectrum Scale and its future directions. I've heard that there will be a session on licensing as well. (always a hot topic). I have a couple of talks: Spectrum Scale with Transparent Cloud Tiering and on Spectrum Scale with Spectrum Control. I'll try and organize some sort of informal meetup one of the nights - thoughts on when would be welcome. Probably not Tuesday night, as that's the entertainment night. :-) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfosburg at mdanderson.org Tue Aug 16 17:13:17 2016 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Tue, 16 Aug 2016 16:13:17 +0000 Subject: [gpfsug-discuss] Attending IBM Edge? 
Sessions of note and possible meet-up In-Reply-To: <29EA4D63-8885-42C5-876C-D68EB9E1CFDE@nuance.com> References: <29EA4D63-8885-42C5-876C-D68EB9E1CFDE@nuance.com> Message-ID: <57c145ab-4207-7550-af57-ff07d6ac8f2d@mdanderson.org> I am speaking: SNP-2408 : Implementing a Research Storage Environment Using IBM Spectrum Software at MD Anderson Cancer Center Program : Enabling Cognitive IT with Storage and Software Defined Solutions Track : Building Oceans of Data Session Type : Breakout Session Date/Time : Tue, 20-Sep, 05:00 PM-06:00 PM Location : MGM Grand - Room 104 Presenter(s):Jonathan Fosburgh, UT MD Anderson This is primarily dealing with Scale and Archive, and also includes Protect. -- Jonathan Fosburgh Principal Application Systems Analyst Storage Team IT Operations jfosburg at mdanderson.org (713) 745-9346 On 08/16/2016 10:59 AM, Oesterlin, Robert wrote: For those of you on the mailing list attending the IBM Edge conference in September, there will be at least one NDA session on Spectrum Scale and its future directions. I've heard that there will be a session on licensing as well. (always a hot topic). I have a couple of talks: Spectrum Scale with Transparent Cloud Tiering and on Spectrum Scale with Spectrum Control. I'll try and organize some sort of informal meetup one of the nights - thoughts on when would be welcome. Probably not Tuesday night, as that's the entertainment night. :-) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Tue Aug 16 22:09:35 2016 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 16 Aug 2016 17:09:35 -0400 Subject: [gpfsug-discuss] mmfsadm test pit In-Reply-To: References: Message-ID: I was surprised to read that Ctrl-C did not really kill restripe. It's supposed to! If it doesn't that's a bug. I ran this by my expert within IBM and he wrote to me: First of all a "PIT job" such as restripe, deldisk, delsnapshot, and such should be easy to stop by ^C the management program that started them. The SG manager daemon holds open a socket to the client program for the purposes of sending command output, progress updates, error messages and the like. The PIT code checks this socket periodically and aborts the PIT process cleanly if the socket is closed. If this cleanup doesn't occur, it is a bug and should be worth reporting. However, there's no exact guarantee on how quickly each thread on the SG mgr will notice and then how quickly the helper nodes can be stopped and so forth. 
The interval between socket checks depends among other things on how long it takes to process each file, if there are a few very large files, the delay can be significant. In the limiting case, where most of the FS storage is contained in a few files, this mechanism doesn't work [elided] well. So it can be quite involved and slow sometimes to wrap up a PIT operation. The simplest way to determine if the command has really stopped is with the mmdiag --commands issued on the SG manager node. This shows running commands with the command line, start time, socket, flags, etc. After ^Cing the client program, the entry here should linger for a while, then go away. When it exits you'll see an entry in the GPFS log file where it fails with err 50. If this doesn't stop the command after a while, it is worth looking into. If the command wasn't issued on the SG mgr node and you can't find the where the client command is running, the socket is still a useful hint. While tedious, it should be possible to trace this socket back to node where that command was originally run using netstat or equivalent. Poking around inside a GPFS internaldump will also provide clues; there should be an outstanding sgmMsgSGClientCmd command listed in the dump tscomm section. Once you find it, just 'kill `pidof mmrestripefs` or similar. I'd like to warn the OP away from mmfsadm test pit. These commands are of course unsupported and unrecommended for any purpose (even internal test and development purposes, as far as I know). You are definitely working without a net there. When I was improving the integration between PIT and snapshot quiesce a few years ago, I looked into this and couldn't figure out how to (easily) make these stop and resume commands safe to use, so as far as I know they remain unsafe. The list command, however, is probably fairly okay; but it would probably be better to use mmfsadm saferdump pit. From: Aaron Knister To: Date: 08/15/2016 10:49 PM Subject: [gpfsug-discuss] mmfsadm test pit Sent by: gpfsug-discuss-bounces at spectrumscale.org I just discovered this interesting gem poking at mmfsadm: test pit fsname list|suspend|status|resume|stop [jobId] There have been times where I've kicked off a restripe and either intentionally or accidentally ctrl-c'd it only to realize that many times it's disappeared into the ether and is still running. The only way I've known so far to stop it is with a chgmgr. A far more painful instance happened when I ran a rebalance on an fs w/more than 31 nsds using more than 31 pit workers and hit *that* fun APAR which locked up access for a single filesystem to all 3.5k nodes. We spent 48 hours round the clock rebooting nodes as jobs drained to clear it up. I would have killed in that instance for a way to cancel the PIT job (the chmgr trick didn't work). It looks like you might actually be able to do this with mmfsadm, although how wise this is, I do not know (kinda curious about that). Here's an example. I kicked off a restripe and then ctrl-c'd it on a client node. Then ran these commands from the fs manager: root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_RUNNING progress 0.00 debug: statusListP D40E2C70 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal stop 785979015170 debug: statusListP 0 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_STOPPING progress 4.01 debug: statusListP D4013E70 ... some time passes ... 
root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list debug: statusListP 0 Interesting. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 16 22:55:19 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 16 Aug 2016 17:55:19 -0400 Subject: [gpfsug-discuss] mmfsadm test pit In-Reply-To: References: Message-ID: Thanks Marc! That's incredibly helpful info. I'll uh, not use the test pit command :) -Aaron On 8/16/16 5:09 PM, Marc A Kaplan wrote: > I was surprised to read that Ctrl-C did not really kill restripe. It's > supposed to! If it doesn't that's a bug. > > I ran this by my expert within IBM and he wrote to me: > > First of all a "PIT job" such as restripe, deldisk, delsnapshot, and > such should be easy to stop by ^C the management program that started > them. The SG manager daemon holds open a socket to the client program > for the purposes of sending command output, progress updates, error > messages and the like. The PIT code checks this socket periodically and > aborts the PIT process cleanly if the socket is closed. If this cleanup > doesn't occur, it is a bug and should be worth reporting. However, > there's no exact guarantee on how quickly each thread on the SG mgr will > notice and then how quickly the helper nodes can be stopped and so > forth. The interval between socket checks depends among other things on > how long it takes to process each file, if there are a few very large > files, the delay can be significant. In the limiting case, where most > of the FS storage is contained in a few files, this mechanism doesn't > work [elided] well. So it can be quite involved and slow sometimes to > wrap up a PIT operation. > > The simplest way to determine if the command has really stopped is with > the mmdiag --commands issued on the SG manager node. This shows running > commands with the command line, start time, socket, flags, etc. After > ^Cing the client program, the entry here should linger for a while, then > go away. When it exits you'll see an entry in the GPFS log file where > it fails with err 50. If this doesn't stop the command after a while, > it is worth looking into. > > If the command wasn't issued on the SG mgr node and you can't find the > where the client command is running, the socket is still a useful hint. > While tedious, it should be possible to trace this socket back to node > where that command was originally run using netstat or equivalent. > Poking around inside a GPFS internaldump will also provide clues; there > should be an outstanding sgmMsgSGClientCmd command listed in the dump > tscomm section. Once you find it, just 'kill `pidof mmrestripefs` or > similar. > > I'd like to warn the OP away from mmfsadm test pit. These commands are > of course unsupported and unrecommended for any purpose (even internal > test and development purposes, as far as I know). You are definitely > working without a net there. When I was improving the integration > between PIT and snapshot quiesce a few years ago, I looked into this and > couldn't figure out how to (easily) make these stop and resume commands > safe to use, so as far as I know they remain unsafe. 
The list command, > however, is probably fairly okay; but it would probably be better to use > mmfsadm saferdump pit. > > > > > > From: Aaron Knister > To: > Date: 08/15/2016 10:49 PM > Subject: [gpfsug-discuss] mmfsadm test pit > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------------------------------------------------ > > > > I just discovered this interesting gem poking at mmfsadm: > > test pit fsname list|suspend|status|resume|stop [jobId] > > There have been times where I've kicked off a restripe and either > intentionally or accidentally ctrl-c'd it only to realize that many > times it's disappeared into the ether and is still running. The only way > I've known so far to stop it is with a chgmgr. > > A far more painful instance happened when I ran a rebalance on an fs > w/more than 31 nsds using more than 31 pit workers and hit *that* fun > APAR which locked up access for a single filesystem to all 3.5k nodes. > We spent 48 hours round the clock rebooting nodes as jobs drained to > clear it up. I would have killed in that instance for a way to cancel > the PIT job (the chmgr trick didn't work). It looks like you might > actually be able to do this with mmfsadm, although how wise this is, I > do not know (kinda curious about that). > > Here's an example. I kicked off a restripe and then ctrl-c'd it on a > client node. Then ran these commands from the fs manager: > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list > JobId 785979015170 PitJobStatus PIT_JOB_RUNNING progress 0.00 > debug: statusListP D40E2C70 > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal stop > 785979015170 > debug: statusListP 0 > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list > JobId 785979015170 PitJobStatus PIT_JOB_STOPPING progress 4.01 > debug: statusListP D4013E70 > > ... some time passes ... > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list > debug: statusListP 0 > > Interesting. > > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From aaron.s.knister at nasa.gov Wed Aug 17 02:46:39 2016 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Wed, 17 Aug 2016 01:46:39 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? Message-ID: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. 
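For reference, the sort of sampling I've been doing looks roughly like this (just a sketch; intervals are arbitrary and paths assume a default install):

echo nsd_ds | /usr/lpp/mmfs/bin/mmpmon -p        # NSD server-side per-disk counters, including request wait times
/usr/lpp/mmfs/bin/mmdiag --iohist | tail -50     # recent I/O history with per-I/O service times
collectl -sD -i 10                               # per-disk utilization and iowait on the NSD servers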
I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Aug 17 12:45:04 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 17 Aug 2016 11:45:04 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> References: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> Message-ID: <7BFE2D50-9AA9-4A78-A05A-08D5DEB0A2E1@nuance.com> Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" Reply-To: gpfsug main discussion list Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From volobuev at us.ibm.com Wed Aug 17 21:34:57 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Wed, 17 Aug 2016 13:34:57 -0700 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <7BFE2D50-9AA9-4A78-A05A-08D5DEB0A2E1@nuance.com> References: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> <7BFE2D50-9AA9-4A78-A05A-08D5DEB0A2E1@nuance.com> Message-ID: Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri From: "Oesterlin, Robert" To: gpfsug main discussion list , Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" Reply-To: gpfsug main discussion list Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. 
An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From SAnderson at convergeone.com Wed Aug 17 22:11:25 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Wed, 17 Aug 2016 21:11:25 +0000 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Message-ID: <1471468285737.63407@convergeone.com> ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 [sig] [RH_CertifiedSysAdmin_CMYK] [Linux on IBM Power Systems - Sales 2016] [IBM Spectrum Storage - Sales 2016] NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 14134 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.jpg Type: image/jpeg Size: 2593 bytes Desc: image003.jpg URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image005.png Type: image/png Size: 11635 bytes Desc: image005.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image007.png Type: image/png Size: 11505 bytes Desc: image007.png URL: From YARD at il.ibm.com Thu Aug 18 00:11:52 2016 From: YARD at il.ibm.com (Yaron Daniel) Date: Thu, 18 Aug 2016 02:11:52 +0300 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive In-Reply-To: <1471468285737.63407@convergeone.com> References: <1471468285737.63407@convergeone.com> Message-ID: Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? 
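If CES protocol nodes are already configured on the 4.2 side, something like this (just a sketch; output depends on which services you enabled) will show what is running and how file authentication / ID mapping is currently set up there:

mmces node list
mmces service list -a
mmuserauth service list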
Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services - Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 14134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 2593 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11635 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11505 bytes Desc: not available URL: From SAnderson at convergeone.com Thu Aug 18 02:51:38 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Thu, 18 Aug 2016 01:51:38 +0000 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive In-Reply-To: References: <1471468285737.63407@convergeone.com>, Message-ID: <1471485097896.49269@convergeone.com> ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? 
Regards ________________________________ Yaron Daniel 94 Em Ha'Moshavot Rd [cid:_1_0DDE2A700DDE24DC007F6D32C2258012] Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 [sig] [RH_CertifiedSysAdmin_CMYK] [Linux on IBM Power Systems - Sales 2016] [IBM Spectrum Storage - Sales 2016] NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 1851 bytes Desc: ATT00001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00002.png Type: image/png Size: 14134 bytes Desc: ATT00002.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00003.jpg Type: image/jpeg Size: 2593 bytes Desc: ATT00003.jpg URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00004.png Type: image/png Size: 11635 bytes Desc: ATT00004.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00005.png Type: image/png Size: 11505 bytes Desc: ATT00005.png URL: From YARD at il.ibm.com Thu Aug 18 04:56:50 2016 From: YARD at il.ibm.com (Yaron Daniel) Date: Thu, 18 Aug 2016 06:56:50 +0300 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive In-Reply-To: <1471485097896.49269@convergeone.com> References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com> Message-ID: So - the procedure you are asking related to Samba. 
Please check at redhat Site the process of upgrade Samba - u will need to backup the tdb files and restore them. But pay attention that the Samba ids will remain the same after moving to CES - please review the Authentication Section. Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services - Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: gpfsug main discussion list Date: 08/18/2016 04:52 AM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 14134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 2593 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11635 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11505 bytes Desc: not available URL: From Robert.Oesterlin at nuance.com Thu Aug 18 15:47:25 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 18 Aug 2016 14:47:25 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? Message-ID: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> Done. Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) ID: 93260 Headline: Give sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics Submitted on: 18 Aug 2016, 10:46 AM Eastern Time (ET) Brand: Servers and Systems Software Product: Spectrum Scale (formerly known as GPFS) - Public RFEs Link: http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid 507-269-0413 From: on behalf of Yuri L Volobuev Reply-To: gpfsug main discussion list Date: Wednesday, August 17, 2016 at 3:34 PM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri [nactive hide details for "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---]"Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latenc From: "Oesterlin, Robert" To: gpfsug main discussion list , Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. 
In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" Reply-To: gpfsug main discussion list Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: From bbanister at jumptrading.com Thu Aug 18 16:00:21 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 18 Aug 2016 15:00:21 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> References: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> Great stuff? I added my vote, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Oesterlin, Robert Sent: Thursday, August 18, 2016 9:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Done. 
Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) ID: 93260 Headline: Give sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics Submitted on: 18 Aug 2016, 10:46 AM Eastern Time (ET) Brand: Servers and Systems Software Product: Spectrum Scale (formerly known as GPFS) - Public RFEs Link: http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid 507-269-0413 From: > on behalf of Yuri L Volobuev > Reply-To: gpfsug main discussion list > Date: Wednesday, August 17, 2016 at 3:34 PM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri [nactive hide details for "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---]"Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latenc From: "Oesterlin, Robert" > To: gpfsug main discussion list >, Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" > Reply-To: gpfsug main discussion list > Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. 
What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: From mimarsh2 at vt.edu Thu Aug 18 16:15:38 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Thu, 18 Aug 2016 11:15:38 -0400 Subject: [gpfsug-discuss] NSD Server BIOS setting - snoop mode Message-ID: All, Is there any best practice or recommendation for the Snoop Mode memory setting for NSD Servers? Default is Early Snoop. On compute nodes, I am using Cluster On Die, which creates 2 NUMA nodes per processor. This setup has 2 x 16-core Broadwell processors in each NSD server. Brian -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmcpheeters at anl.gov Thu Aug 18 16:14:11 2016 From: gmcpheeters at anl.gov (McPheeters, Gordon) Date: Thu, 18 Aug 2016 15:14:11 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <97F08A04-D7C4-4985-840F-DC026E8606F4@anl.gov> Got my vote - thanks Robert. Gordon McPheeters ALCF Storage (630) 252-6430 gmcpheeters at anl.gov On Aug 18, 2016, at 10:00 AM, Bryan Banister > wrote: Great stuff? 
I added my vote, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Oesterlin, Robert Sent: Thursday, August 18, 2016 9:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Done. Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) ID: 93260 Headline: Give sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics Submitted on: 18 Aug 2016, 10:46 AM Eastern Time (ET) Brand: Servers and Systems Software Product: Spectrum Scale (formerly known as GPFS) - Public RFEs Link: http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid 507-269-0413 From: > on behalf of Yuri L Volobuev > Reply-To: gpfsug main discussion list > Date: Wednesday, August 17, 2016 at 3:34 PM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latenc From: "Oesterlin, Robert" > To: gpfsug main discussion list >, Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" > Reply-To: gpfsug main discussion list > Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. 
We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Thu Aug 18 18:50:12 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 18 Aug 2016 10:50:12 -0700 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive In-Reply-To: <1471485097896.49269@convergeone.com> References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com> Message-ID: Samba as supported in Spectrum Scale uses the "autorid" module for creating internal id mappings (see man idmap_autorid for some details). Officially supported are also methods to retrieve id mappings from an external server: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_adfofile.htm The earlier email states that they have a " .tdb backend for id mapping on their current server. ". How exactly is that configured in Samba? Which Samba version is used here? So the plan is to upgrade the cluster, and then switch to the Samba version provided with CES? Should the same id mappings be used? 
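For anyone gathering that information on an existing Samba 3.x node ahead of a migration, a rough sketch of the commands involved -- the grep pattern is only illustrative, and exact flags can vary between Samba builds:

smbd -V                                   # which Samba version is actually running
testparm -s 2>/dev/null | grep -i idmap   # the idmap configuration as Samba itself parses it
net idmap dump > idmap-dump.txt           # export the existing id mappings, as used later in this thread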
Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/17/2016 06:52 PM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From SAnderson at convergeone.com Thu Aug 18 19:11:02 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Thu, 18 Aug 2016 18:11:02 +0000 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive In-Reply-To: References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com> Message-ID: Correct. We are upgrading their existing configuration and want to switch to CES provided Samba. They are using Samba 3.6.24 currently on RHEL 6.6. 
Here is the head of the smb.conf file: =================================================== [global] workgroup = SL1 netbios name = SLTLTFSEE server string = LTFSEE Server realm = removed.ORG security = ads encrypt passwords = yes default = global browseable = no socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 idmap config * : backend = tdb idmap config * : range = 1000000-9000000 template shell = /bash/bin writable = yes allow trusted domains = yes client ntlmv2 auth = yes auth methods = guest sam winbind passdb backend = tdbsam groupdb:backend = tdb interfaces = eth1 lo username map = /etc/samba/smbusers map to guest = bad uid guest account = nobody ===================================================== Does that make sense? Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: Thursday, August 18, 2016 11:50 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive Samba as supported in Spectrum Scale uses the "autorid" module for creating internal id mappings (see man idmap_autorid for some details). Officially supported are also methods to retrieve id mappings from an external server: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_adfofile.htm The earlier email states that they have a " .tdb backend for id mapping on their current server. ". How exactly is that configured in Samba? Which Samba version is used here? So the plan is to upgrade the cluster, and then switch to the Samba version provided with CES? Should the same id mappings be used? Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/17/2016 06:52 PM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. 
I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. From Kevin.Buterbaugh at Vanderbilt.Edu Thu Aug 18 20:05:03 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Thu, 18 Aug 2016 19:05:03 +0000 Subject: [gpfsug-discuss] Please ignore - debugging an issue Message-ID: Please ignore. I am working with the list admins on an issue and need to send an e-mail to the list to duplicate the problem. I apologize that this necessitates this e-mail to the list. Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Thu Aug 18 20:43:50 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 18 Aug 2016 12:43:50 -0700 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive In-Reply-To: References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com> Message-ID: There are a few points to consider here: CES uses Samba in cluster mode with ctdb. That means that the tdb database is shared through ctdb on all protocol nodes, and the internal format is slightly different since it contains additional information for tracking the cross-node status of the individual records. Spectrum Scale officially supports the autorid module for internal id mapping. That approach is different than the older idmap_tdb since it basically only has one record per AD domain, and not one record per user or group. This is known to scale better in environments where many users and groups require id mappings. The downside is that data from idmap_tdb cannot be directly imported. 
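One way to gauge the practical impact before committing to either approach is to compare a few of the old tdb mappings against what the new winbind configuration computes for the same accounts. A sketch, assuming the CES-shipped tools live under /usr/lpp/mmfs/bin and using placeholder names:

/usr/lpp/mmfs/bin/wbinfo --name-to-sid 'DOMAIN\someuser'    # resolve the account to its SID
/usr/lpp/mmfs/bin/wbinfo --sid-to-uid S-1-5-21-placeholder  # the UID the new mapping assigns to that SID
/usr/lpp/mmfs/bin/wbinfo -i 'DOMAIN\someuser'               # full passwd-style entry, including primary group

If the computed UIDs differ from the ones already stored on disk, either a chown pass or one of the migration options below may be needed.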
While not officially supported Spectrum Scale also ships the idmap_tdb module. You could configure authentication and internal id mapping on Spectrum Scale, and then overwrite the config manually to use the old idmap module (the idmap-range-size is required, but not relevant later on): mmuserauth service create ... --idmap-range 1000000-9000000 --idmap-range-size 100000 /usr/lpp/mmfs/bin/net conf setparm global 'idmap config * : backend' tdb mmdsh -N CesNodes systemctl restart gpfs-winbind mmdsh -N CesNodes /usr/lpp/mmfs/bin/net cache flush With the old Samba, export the idmap data to a file: net idmap dump > idmap-dump.txt And on a node running CES Samba import that data, and remove any old cached entries: /usr/lpp/mmfs/bin/net idmap restore idmap-dump.txt mmdsh -N CesNodes /usr/lpp/mmfs/bin/net cache flush Just to be clear: This is untested and if there is a problem with the id mapping in that configuration, it will likely be pointed to the unsupported configuration. The way to request this as an official feature would be through a RFE, although i cannot say whether that would be picked up by product management. Another option would be creating the id mappings in the Active Directory records or in a external LDAP server based on the old mappings, and point the CES Samba to that data. That would again be a supported configuration. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: Christof Schmitt/Tucson/IBM at IBMUS Cc: gpfsug main discussion list Date: 08/18/2016 11:11 AM Subject: RE: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive Correct. We are upgrading their existing configuration and want to switch to CES provided Samba. They are using Samba 3.6.24 currently on RHEL 6.6. Here is the head of the smb.conf file: =================================================== [global] workgroup = SL1 netbios name = SLTLTFSEE server string = LTFSEE Server realm = removed.ORG security = ads encrypt passwords = yes default = global browseable = no socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 idmap config * : backend = tdb idmap config * : range = 1000000-9000000 template shell = /bash/bin writable = yes allow trusted domains = yes client ntlmv2 auth = yes auth methods = guest sam winbind passdb backend = tdbsam groupdb:backend = tdb interfaces = eth1 lo username map = /etc/samba/smbusers map to guest = bad uid guest account = nobody ===================================================== Does that make sense? Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: Thursday, August 18, 2016 11:50 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive Samba as supported in Spectrum Scale uses the "autorid" module for creating internal id mappings (see man idmap_autorid for some details). Officially supported are also methods to retrieve id mappings from an external server: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_adfofile.htm The earlier email states that they have a " .tdb backend for id mapping on their current server. ". How exactly is that configured in Samba? Which Samba version is used here? 
So the plan is to upgrade the cluster, and then switch to the Samba version provided with CES? Should the same id mappings be used? Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/17/2016 06:52 PM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. 
If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. From jez.tucker at gpfsug.org Thu Aug 18 20:57:00 2016 From: jez.tucker at gpfsug.org (Jez Tucker) Date: Thu, 18 Aug 2016 20:57:00 +0100 Subject: [gpfsug-discuss] If you are experiencing mail stuck in spam / bounces Message-ID: Hi all As the discussion group is a mailing list, it is possible that members can experience the list traffic being interpreted as spam. In such instances, you may experience better results if you whitelist the mailing list addresses or create a 'Not Spam' filter (E.G. gmail) gpfsug-discuss at spectrumscale.org gpfsug-discuss at gpfsug.org You can test that you can receive a response from the mailing list server by sending an email to: gpfsug-discuss-request at spectrumscale.org with the subject of: help Should you experience further trouble, please ping us at: gpfsug-discuss-owner at spectrumscale.org All the best, Jez From aaron.s.knister at nasa.gov Fri Aug 19 05:12:26 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 19 Aug 2016 00:12:26 -0400 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> References: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> Message-ID: <140fab1a-e043-5c20-eb1f-d5ef7e91d89d@nasa.gov> Figured I'd throw in my "me too!" as well. We have ~3500 nodes and 60 gpfs server nodes and we've done several rounds of rolling upgrades starting with 3.5.0.19 -> 3.5.0.24. We've had the cluster with a mix of both versions for quite some time (We're actually in that state right now as it would happen and have been for several months). I've not seen any issue with it. Of course, as Richard alluded to, its good to check the release notes :) -Aaron On 8/15/16 8:45 AM, Buterbaugh, Kevin L wrote: > Richard, > > I will second what Bob said with one caveat ? on one occasion we had an > issue with our multi-cluster setup because the PTF?s were incompatible. > However, that was clearly documented in the release notes, which we > obviously hadn?t read carefully enough. > > While we generally do rolling upgrades over a two to three week period, > we have run for months with clients at differing PTF levels. HTHAL? > > Kevin > >> On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert >> > wrote: >> >> In general, yes, it's common practice to do the 'rolling upgrades'. If >> I had to do my whole cluster at once, with an outage, I'd probably >> never upgrade. :) >> >> >> Bob Oesterlin >> Sr Storage Engineer, Nuance HPC Grid >> >> >> *From: *> > on behalf of >> "Sobey, Richard A" > > >> *Reply-To: *gpfsug main discussion list >> > > >> *Date: *Monday, August 15, 2016 at 4:59 AM >> *To: *"'gpfsug-discuss at spectrumscale.org >> '" >> > > >> *Subject: *[EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence >> problems? >> >> Hi all, >> >> If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to >> 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger >> it over a few days, perhaps up to 2 weeks or will I run into problems >> if they?re on different versions? >> >> Cheers >> >> Richard >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ? 
> Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu > - (615)875-9633 > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From aaron.s.knister at nasa.gov Fri Aug 19 05:13:06 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 19 Aug 2016 00:13:06 -0400 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: <140fab1a-e043-5c20-eb1f-d5ef7e91d89d@nasa.gov> References: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> <140fab1a-e043-5c20-eb1f-d5ef7e91d89d@nasa.gov> Message-ID: <70e33e6d-cd6b-5a5e-1e2d-f0ad16def5f4@nasa.gov> Oops... I meant Kevin, not Richard. On 8/19/16 12:12 AM, Aaron Knister wrote: > Figured I'd throw in my "me too!" as well. We have ~3500 nodes and 60 > gpfs server nodes and we've done several rounds of rolling upgrades > starting with 3.5.0.19 -> 3.5.0.24. We've had the cluster with a mix of > both versions for quite some time (We're actually in that state right > now as it would happen and have been for several months). I've not seen > any issue with it. Of course, as Richard alluded to, its good to check > the release notes :) > > -Aaron > > On 8/15/16 8:45 AM, Buterbaugh, Kevin L wrote: >> Richard, >> >> I will second what Bob said with one caveat ? on one occasion we had an >> issue with our multi-cluster setup because the PTF?s were incompatible. >> However, that was clearly documented in the release notes, which we >> obviously hadn?t read carefully enough. >> >> While we generally do rolling upgrades over a two to three week period, >> we have run for months with clients at differing PTF levels. HTHAL? >> >> Kevin >> >>> On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert >>> > >>> wrote: >>> >>> In general, yes, it's common practice to do the 'rolling upgrades'. If >>> I had to do my whole cluster at once, with an outage, I'd probably >>> never upgrade. :) >>> >>> >>> Bob Oesterlin >>> Sr Storage Engineer, Nuance HPC Grid >>> >>> >>> *From: *>> > on behalf of >>> "Sobey, Richard A" >> > >>> *Reply-To: *gpfsug main discussion list >>> >> > >>> *Date: *Monday, August 15, 2016 at 4:59 AM >>> *To: *"'gpfsug-discuss at spectrumscale.org >>> '" >>> >> > >>> *Subject: *[EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence >>> problems? >>> >>> Hi all, >>> >>> If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to >>> 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger >>> it over a few days, perhaps up to 2 weeks or will I run into problems >>> if they?re on different versions? >>> >>> Cheers >>> >>> Richard >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ? 
>> Kevin Buterbaugh - Senior System Administrator >> Vanderbilt University - Advanced Computing Center for Research and >> Education >> Kevin.Buterbaugh at vanderbilt.edu >> - (615)875-9633 >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bdeluca at gmail.com Fri Aug 19 05:15:00 2016 From: bdeluca at gmail.com (Ben De Luca) Date: Fri, 19 Aug 2016 07:15:00 +0300 Subject: [gpfsug-discuss] If you are experiencing mail stuck in spam / bounces In-Reply-To: References: Message-ID: Hey Jez, Its because the mailing list doesn't have an SPF record in your DNS, being neutral is a good way to be picked up as spam. On 18 August 2016 at 22:57, Jez Tucker wrote: > Hi all > > As the discussion group is a mailing list, it is possible that members can > experience the list traffic being interpreted as spam. > > > In such instances, you may experience better results if you whitelist the > mailing list addresses or create a 'Not Spam' filter (E.G. gmail) > > gpfsug-discuss at spectrumscale.org > > gpfsug-discuss at gpfsug.org > > > You can test that you can receive a response from the mailing list server by > sending an email to: gpfsug-discuss-request at spectrumscale.org with the > subject of: help > > > Should you experience further trouble, please ping us at: > gpfsug-discuss-owner at spectrumscale.org > > > All the best, > > > Jez > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jez.tucker at gpfsug.org Fri Aug 19 08:51:20 2016 From: jez.tucker at gpfsug.org (Jez Tucker) Date: Fri, 19 Aug 2016 08:51:20 +0100 Subject: [gpfsug-discuss] If you are experiencing mail stuck in spam / bounces In-Reply-To: References: Message-ID: <0c9d81b2-ac41-b6a5-e4f1-a816558711b7@gpfsug.org> Hi Yes, we looked at that some time ago and I recall we had an issues with setting up the SPF. However, probably a good time as any to look at it again. I'll ping Arif and Simon and they can look at their respective domains. Jez On 19/08/16 05:15, Ben De Luca wrote: > Hey Jez, > Its because the mailing list doesn't have an SPF record in your > DNS, being neutral is a good way to be picked up as spam. > > > > On 18 August 2016 at 22:57, Jez Tucker wrote: >> Hi all >> >> As the discussion group is a mailing list, it is possible that members can >> experience the list traffic being interpreted as spam. >> >> >> In such instances, you may experience better results if you whitelist the >> mailing list addresses or create a 'Not Spam' filter (E.G. 
gmail) >> >> gpfsug-discuss at spectrumscale.org >> >> gpfsug-discuss at gpfsug.org >> >> >> You can test that you can receive a response from the mailing list server by >> sending an email to: gpfsug-discuss-request at spectrumscale.org with the >> subject of: help >> >> >> Should you experience further trouble, please ping us at: >> gpfsug-discuss-owner at spectrumscale.org >> >> >> All the best, >> >> >> Jez >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From aaron.s.knister at nasa.gov Fri Aug 19 23:06:57 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 19 Aug 2016 18:06:57 -0400 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <97F08A04-D7C4-4985-840F-DC026E8606F4@anl.gov> References: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> <97F08A04-D7C4-4985-840F-DC026E8606F4@anl.gov> Message-ID: <5ca238de-bb95-2854-68bd-36d1b8df2810@nasa.gov> Thanks everyone! I also have a PMR open for this, so hopefully the RFE gets some traction. On 8/18/16 11:14 AM, McPheeters, Gordon wrote: > Got my vote - thanks Robert. > > > Gordon McPheeters > ALCF Storage > (630) 252-6430 > gmcpheeters at anl.gov > > > >> On Aug 18, 2016, at 10:00 AM, Bryan Banister >> > wrote: >> >> Great stuff? I added my vote, >> -Bryan >> >> *From:* gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] *On >> Behalf Of *Oesterlin, Robert >> *Sent:* Thursday, August 18, 2016 9:47 AM >> *To:* gpfsug main discussion list >> *Subject:* Re: [gpfsug-discuss] Monitor NSD server queue? >> >> Done. >> >> Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) >> >> ID: 93260 >> Headline: Give sysadmin insight >> into the inner workings of the NSD server machinery, in particular the >> queue dynamics >> Submitted on: 18 Aug 2016, 10:46 AM Eastern >> Time (ET) >> Brand: Servers and Systems >> Software >> Product: Spectrum Scale (formerly >> known as GPFS) - Public RFEs >> >> Link: >> http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 >> >> >> Bob Oesterlin >> Sr Storage Engineer, Nuance HPC Grid >> 507-269-0413 >> >> >> *From: *> > on behalf of Yuri L >> Volobuev > >> *Reply-To: *gpfsug main discussion list >> > > >> *Date: *Wednesday, August 17, 2016 at 3:34 PM >> *To: *gpfsug main discussion list > > >> *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? >> >> >> Unfortunately, at the moment there's no safe mechanism to show the >> usage statistics for different NSD queues. "mmfsadm saferdump nsd" as >> implemented doesn't acquire locks when parsing internal data >> structures. Now, NSD data structures are fairly static, as much things >> go, so the risk of following a stale pointer and hitting a segfault >> isn't particularly significant. I don't think I remember ever seeing >> mmfsd crash with NSD dump code on the stack. That said, this isn't >> code that's tested and known to be safe for production use. I haven't >> seen a case myself where an mmfsd thread gets stuck running this dump >> command, either, but Bob has. If that condition ever reoccurs, I'd be >> interested in seeing debug data. 
>> >> I agree that there's value in giving a sysadmin insight into the inner >> workings of the NSD server machinery, in particular the queue >> dynamics. mmdiag should be enhanced to allow this. That'd be a very >> reasonable (and doable) RFE. >> >> yuri >> >> "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron >> You did a perfect job of explaining a situation I've run into time >> after time - high latenc >> >> From: "Oesterlin, Robert" > > >> To: gpfsug main discussion list > >, >> Date: 08/17/2016 04:45 AM >> Subject: Re: [gpfsug-discuss] Monitor NSD server queue? >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> >> ------------------------------------------------------------------------ >> >> >> >> >> Hi Aaron >> >> You did a perfect job of explaining a situation I've run into time >> after time - high latency on the disk subsystem causing a backup in >> the NSD queues. I was doing what you suggested not to do - "mmfsadm >> saferdump nsd' and looking at the queues. In my case 'mmfsadm >> saferdump" would usually work or hang, rather than kill mmfsd. But - >> the hang usually resulted it a tied up thread in mmfsd, so that's no >> good either. >> >> I wish I had better news - this is the only way I've found to get >> visibility to these queues. IBM hasn't seen fit to gives us a way to >> safely look at these. I personally think it's a bug that we can't >> safely dump these structures, as they give insight as to what's >> actually going on inside the NSD server. >> >> Yuri, Sven - thoughts? >> >> >> Bob Oesterlin >> Sr Storage Engineer, Nuance HPC Grid >> >> >> >> *From: *> > on behalf of >> "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" >> >* >> Reply-To: *gpfsug main discussion list >> > >* >> Date: *Tuesday, August 16, 2016 at 8:46 PM* >> To: *gpfsug main discussion list > >* >> Subject: *[EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? >> >> Hi Everyone, >> >> We ran into a rather interesting situation over the past week. We had >> a job that was pounding the ever loving crap out of one of our >> filesystems (called dnb02) doing about 15GB/s of reads. We had other >> jobs experience a slowdown on a different filesystem (called dnb41) >> that uses entirely separate backend storage. What I can't figure out >> is why this other filesystem was affected. I've checked IB bandwidth >> and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth >> congestion, looked at the mmpmon nsd_ds counters (including disk >> request wait time), and checked out the disk iowait values from >> collectl. I simply can't account for the slowdown on the other >> filesystem. The only thing I can think of is the high latency on >> dnb02's NSDs caused the mmfsd NSD queues to back up. >> >> Here's my question-- how can I monitor the state of th NSD queues? I >> can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the >> queues and their status. I'm just not sure calling saferdump NSD every >> 10 seconds to monitor this data is going to end well. I've seen >> saferdump NSD cause mmfsd to die and that's from a task we only run >> every 6 hours that calls saferdump NSD. >> >> Any thoughts/ideas here would be great. >> >> Thanks! 
>> >> -Aaron_______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> ------------------------------------------------------------------------ >> >> Note: This email is for the confidential use of the named addressee(s) >> only and may contain proprietary, confidential or privileged >> information. If you are not the intended recipient, you are hereby >> notified that any review, dissemination or copying of this email is >> strictly prohibited, and to please notify the sender immediately and >> destroy this email and any attachments. Email transmission cannot be >> guaranteed to be secure or error-free. The Company, therefore, does >> not make any guarantees as to the completeness or accuracy of this >> email or any attachments. This email is for informational purposes >> only and does not constitute a recommendation, offer, request or >> solicitation of any kind to buy, sell, subscribe, redeem or perform >> any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From r.sobey at imperial.ac.uk Mon Aug 22 12:59:16 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 22 Aug 2016 11:59:16 +0000 Subject: [gpfsug-discuss] CES and mmuserauth command Message-ID: Hi all, We're just about to start testing a new CES 4.2.0 cluster and at the stage of "joining" the cluster to our AD. What's the bare minimum we need to get going with this? My Windows guy (who is more Linux but whatever) has suggested the following: mmuserauth service create --type ad --data-access-method file --netbios-name store --user-name USERNAME --password --enable-nfs-kerberos --enable-kerberos --servers list,of,servers --idmap-range-size 1000000 --idmap-range 3000000 - 3500000 --unixmap-domains 'DOMAIN(500 - 2000000)' He has also asked what the following is: --idmap-role ??? --idmap-range-size ?? All our LDAP GID/UIDs are coming from a system outside of GPFS so do we leave this blank, or say master Or, now I've re-read and mmuserauth page, is this purely for when you have AFM relationships and one GPFS cluster (the subordinate / the second cluster) gets its UIDs and GIDs from another GPFS cluster (the master / the first one)? For idmap-range-size is this essentially the highest number of users and groups you can have defined within Spectrum Scale? (I love how I'm using GPFS and SS interchangeably.. forgive me!) Many thanks Richard Richard Sobey Storage Area Network (SAN) Analyst Technical Operations, ICT Imperial College London South Kensington 403, City & Guilds Building London SW7 2AZ Tel: +44 (0)20 7594 6915 Email: r.sobey at imperial.ac.uk http://www.imperial.ac.uk/admin-services/ict/ -------------- next part -------------- An HTML attachment was scrubbed... 
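Purely as an illustrative starting point (not a validated recipe): folding in the points raised further down this thread -- AD accepts only a single --servers entry, --enable-kerberos is documented for LDAP only, and a stand-alone cluster takes --idmap-role master -- the invocation might reduce to something like the following, with the netbios name, bind account, domain controller and ranges as placeholders to check against the mmuserauth documentation:

mmuserauth service create --type ad --data-access-method file \
    --netbios-name store --user-name USERNAME --password \
    --servers dc1.example.com \
    --idmap-role master \
    --idmap-range-size 1000000 --idmap-range 3000000-3500000 \
    --unixmap-domains 'DOMAIN(500-2000000)' \
    --enable-nfs-kerberos

mmuserauth service list   # confirm what actually got configured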
URL: From r.sobey at imperial.ac.uk Mon Aug 22 14:28:01 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 22 Aug 2016 13:28:01 +0000 Subject: [gpfsug-discuss] CES mmsmb options Message-ID: Related to my previous question in so far as it's to do with CES, what's this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static... for example log size / location / dmapi support? I'm surely missing something obvious. It's SS 4.2.0 btw. Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From taylorm at us.ibm.com Tue Aug 23 00:30:10 2016 From: taylorm at us.ibm.com (Michael L Taylor) Date: Mon, 22 Aug 2016 16:30:10 -0700 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: Looks like there is a per export and a global listing. These are values that can be set per export : /usr/lpp/mmfs/bin/mmsmb export change --key-info supported Supported smb options with allowed values: admin users = any // any valid user browseable = yes, no comment = any // A free text description of the export. csc policy = manual, disable, documents, programs fileid:algorithm = fsname, hostname, fsname_nodirs, fsname_norootdir gpfs:leases = yes, no gpfs:recalls = yes, no gpfs:sharemodes = yes, no gpfs:syncio = yes, no hide unreadable = yes, no oplocks = yes, no posix locking = yes, no read only = yes, no smb encrypt = auto, default, mandatory, disabled syncops:onclose = yes, no These are the values that are set globally: /usr/lpp/mmfs/bin/mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Tue Aug 23 03:23:40 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Mon, 22 Aug 2016 22:23:40 -0400 Subject: [gpfsug-discuss] GPFS FPO Message-ID: Does anyone have any experiences to share (good or bad) about setting up and utilizing FPO for hadoop compute on top of GPFS? -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 23 03:37:00 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 22 Aug 2016 22:37:00 -0400 Subject: [gpfsug-discuss] GPFS FPO In-Reply-To: References: Message-ID: Yes, indeed. Note that these are my personal opinions. It seems to work quite well and it's not terribly hard to set up or get running. That said, if you've got a traditional HPC cluster with reasonably good bandwidth (and especially if your data is already on the HPC cluster) I wouldn't bother with FPO and just use something like magpie (https://github.com/LLNL/magpie) to run your hadoopy workload on GPFS on your traditional HPC cluster. I believe FPO (and by extension data locality) is important when the available bandwidth between your clients and servers/disks (in a traditional GPFS environment) is less than the bandwidth available within a node (e.g. between your local disks and the host CPU). -Aaron On 8/22/16 10:23 PM, Brian Marshall wrote: > Does anyone have any experiences to share (good or bad) about setting up > and utilizing FPO for hadoop compute on top of GPFS? 
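For anyone who does go the FPO route rather than running the analytics workload against a conventional pool, the locality behaviour is driven mostly by a handful of pool attributes in the stanza file used at file system or disk creation time. A rough sketch from memory -- treat the attribute names and values as assumptions to verify against the FPO documentation for your release:

%pool:
  pool=fpodata
  blockSize=2M
  usage=dataOnly
  layoutMap=cluster
  allowWriteAffinity=yes
  writeAffinityDepth=1
  blockGroupFactor=128

%nsd:
  nsd=node1_sda
  device=/dev/sda
  servers=node1
  usage=dataOnly
  pool=fpodata

Roughly speaking, allowWriteAffinity=yes is what makes it an FPO pool (writes land on the node doing the writing), while writeAffinityDepth and blockGroupFactor control where replicas go and how many consecutive blocks are kept together.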
> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From mimarsh2 at vt.edu Tue Aug 23 12:56:22 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Tue, 23 Aug 2016 07:56:22 -0400 Subject: [gpfsug-discuss] GPFS FPO In-Reply-To: References: Message-ID: Aaron, Do you have experience running this on native GPFS? The docs say Lustre and any NFS filesystem. Thanks, Brian On Aug 22, 2016 10:37 PM, "Aaron Knister" wrote: > Yes, indeed. Note that these are my personal opinions. > > It seems to work quite well and it's not terribly hard to set up or get > running. That said, if you've got a traditional HPC cluster with reasonably > good bandwidth (and especially if your data is already on the HPC cluster) > I wouldn't bother with FPO and just use something like magpie ( > https://github.com/LLNL/magpie) to run your hadoopy workload on GPFS on > your traditional HPC cluster. I believe FPO (and by extension data > locality) is important when the available bandwidth between your clients > and servers/disks (in a traditional GPFS environment) is less than the > bandwidth available within a node (e.g. between your local disks and the > host CPU). > > -Aaron > > On 8/22/16 10:23 PM, Brian Marshall wrote: > >> Does anyone have any experiences to share (good or bad) about setting up >> and utilizing FPO for hadoop compute on top of GPFS? >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Tue Aug 23 13:15:24 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 23 Aug 2016 14:15:24 +0200 Subject: [gpfsug-discuss] CES and mmuserauth command In-Reply-To: References: Message-ID: Sorry to see no authoritative answers yet.. I'm doing lots of CES installations, but have not quite yet gotten the full understanding of this.. Simple stuff first: --servers You can only have one with AD. --enable-kerberos shouldn't be used, as that's only for LDAP according to the documentation. Guess kerberos is implied with AD. --idmap-role -- I've been using "master". Man-page says ID map role of a stand?alone or singular system deployment must be selected "master" What the idmap options seems to be doing is configure the idmap options for Samba. Maybe best explained by: https://wiki.samba.org/index.php/Idmap_config_ad Your suggested options will then give you the samba idmap configuration: idmap config * : rangesize = 1000000 idmap config * : range = 3000000-3500000 idmap config * : read only = no idmap:cache = no idmap config * : backend = autorid idmap config DOMAIN : schema_mode = rfc2307 idmap config DOMAIN : range = 500-2000000 idmap config DOMAIN : backend = ad Most likely you want to replace DOMAIN by your AD domain name.. 
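Once the mmuserauth command has been run, a quick sanity check that this is the configuration actually in effect, and that a known AD account maps to the expected UID/GID, might look like this (the /usr/lpp/mmfs/bin paths for the CES-shipped tools and the account name are assumptions/placeholders):

/usr/lpp/mmfs/bin/mmuserauth service list          # what CES believes is configured
/usr/lpp/mmfs/bin/net conf list | grep -i idmap    # the Samba idmap settings that were generated
/usr/lpp/mmfs/bin/wbinfo -i 'DOMAIN\someuser'      # UID/GID winbind resolves for a real account
id 'DOMAIN\someuser'                               # only meaningful if winbind is wired into nsswitch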
So the --idmap options sets some defaults, that you probably won't care about, since all your users are likely covered by the specific "idmap config DOMAIN" config. Hope this helps somewhat, now I'll follow up with something I'm wondering myself...: Is the netbios name just a name, without any connection to anything in AD? Is the --user-name/--password a one-time used account that's only necessary when executing the mmuserauth command, or will it also be for communication between CES and AD while the services are running? -jf On Mon, Aug 22, 2016 at 1:59 PM, Sobey, Richard A wrote: > Hi all, > > > > We?re just about to start testing a new CES 4.2.0 cluster and at the stage > of ?joining? the cluster to our AD. What?s the bare minimum we need to get > going with this? My Windows guy (who is more Linux but whatever) has > suggested the following: > > > > mmuserauth service create --type ad --data-access-method file > > --netbios-name store --user-name USERNAME --password > > --enable-nfs-kerberos --enable-kerberos > > --servers list,of,servers > > --idmap-range-size 1000000 --idmap-range 3000000 - 3500000 > --unixmap-domains 'DOMAIN(500 - 2000000)' > > > > He has also asked what the following is: > > > > --idmap-role ??? > > --idmap-range-size ?? > > > > All our LDAP GID/UIDs are coming from a system outside of GPFS so do we > leave this blank, or say master Or, now I?ve re-read and mmuserauth page, > is this purely for when you have AFM relationships and one GPFS cluster > (the subordinate / the second cluster) gets its UIDs and GIDs from another > GPFS cluster (the master / the first one)? > > > > For idmap-range-size is this essentially the highest number of users and > groups you can have defined within Spectrum Scale? (I love how I?m using > GPFS and SS interchangeably.. forgive me!) > > > > Many thanks > > > > Richard > > > > > > Richard Sobey > > Storage Area Network (SAN) Analyst > Technical Operations, ICT > Imperial College London > South Kensington > 403, City & Guilds Building > London SW7 2AZ > Tel: +44 (0)20 7594 6915 > Email: r.sobey at imperial.ac.uk > http://www.imperial.ac.uk/admin-services/ict/ > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Tue Aug 23 14:58:17 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 23 Aug 2016 13:58:17 +0000 Subject: [gpfsug-discuss] Odd entries in quota listing Message-ID: In one of my file systems, I have some odd entries that seem to not be associated with a user - any ideas on the cause or how to track these down? This is a snippet from mmprepquota: Block Limits | File Limits Name type KB quota limit in_doubt grace | files quota limit in_doubt grace 2751555824 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 2348898617 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 2348895209 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 1610682073 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 536964752 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 403325529 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid -------------- next part -------------- An HTML attachment was scrubbed... 
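A couple of low-risk checks that may help narrow down where rows like these come from (the mount point is a placeholder; the numeric ID is taken from the listing above):

getent passwd 2751555824                        # does the ID resolve to any account the OS knows about?
find /gpfs/gpfs01 -uid 2751555824 -ls | head    # any files still owned by that ID? (a policy LIST rule would be faster on a large file system)

If nothing resolves and no files turn up, the rows are most likely left-over per-UID quota accounting for users that no longer exist, or for IDs that never existed in the first place -- see the replies that follow.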
URL: From jonathan at buzzard.me.uk Tue Aug 23 15:06:50 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 23 Aug 2016 15:06:50 +0100 Subject: [gpfsug-discuss] Odd entries in quota listing In-Reply-To: References: Message-ID: <1471961210.30100.88.camel@buzzard.phy.strath.ac.uk> On Tue, 2016-08-23 at 13:58 +0000, Oesterlin, Robert wrote: > In one of my file systems, I have some odd entries that seem to not be > associated with a user - any ideas on the cause or how to track these > down? This is a snippet from mmprepquota: > > > > Block Limits > | File Limits > > Name type KB quota limit in_doubt > grace | files quota limit in_doubt grace > > 2751555824 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 2348898617 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 2348895209 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 1610682073 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 536964752 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 403325529 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > I am guessing they are quotas that have been set for users that are now deleted. GPFS stores the quota for a user under their UID, and deleting the user and all their data is not enough to remove the entry from the quota reporting, you also have to delete their quota. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From Robert.Oesterlin at nuance.com Tue Aug 23 15:10:22 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 23 Aug 2016 14:10:22 +0000 Subject: [gpfsug-discuss] Odd entries in quota listing Message-ID: <93B0F53A-4ECD-4527-A67D-DD6C9B00F8E7@nuance.com> Well - good idea, but these large numbers in no way reflect valid ID numbers in our environment. Wondering how they got there? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of Jonathan Buzzard Reply-To: gpfsug main discussion list Date: Tuesday, August 23, 2016 at 9:06 AM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] Re: [gpfsug-discuss] Odd entries in quota listing I am guessing they are quotas that have been set for users that are now deleted. GPFS stores the quota for a user under their UID, and deleting the user and all their data is not enough to remove the entry from the quota reporting, you also have to delete their quota. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Tue Aug 23 15:16:05 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 23 Aug 2016 15:16:05 +0100 Subject: [gpfsug-discuss] Odd entries in quota listing In-Reply-To: <93B0F53A-4ECD-4527-A67D-DD6C9B00F8E7@nuance.com> References: <93B0F53A-4ECD-4527-A67D-DD6C9B00F8E7@nuance.com> Message-ID: <1471961765.30100.90.camel@buzzard.phy.strath.ac.uk> On Tue, 2016-08-23 at 14:10 +0000, Oesterlin, Robert wrote: > Well - good idea, but these large numbers in no way reflect valid ID > numbers in our environment. Wondering how they got there? > I was guessing generating UID's from Windows RID's? Alternatively some script generated them automatically and the UID's are bogus. You can create a quota for any random UID and GPFS won't complain. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. 
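If Windows-derived IDs are a plausible source, one quick way to test that theory on a node where winbind is (or was) in use -- the SID shown is a placeholder:

wbinfo --uid-to-sid 2751555824             # does the numeric ID map back to an AD SID?
wbinfo --sid-to-name S-1-5-21-placeholder  # and if so, which account owns that SID?

If the IDs map to nothing, the other explanation above -- something having set quotas against bogus UIDs, which GPFS accepts without complaint -- becomes the more likely one.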
From aaron.s.knister at nasa.gov Wed Aug 24 17:43:56 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Wed, 24 Aug 2016 12:43:56 -0400 Subject: [gpfsug-discuss] GPFS FPO In-Reply-To: References: Message-ID: <6f5a7284-c910-bbda-5e53-7f78e4289ad9@nasa.gov> To tell you the truth, I don't. It's on my radar but I haven't done it yet. I *have* run hadoop on GPFS w/o magpie though and on only a couple of nodes was able to pound 1GB/s out to GPFS w/ the terasort benchmark. I know our GPFS FS can go much faster than that but java was cpu-bound as it often seems to be. -Aaron On 8/23/16 7:56 AM, Brian Marshall wrote: > Aaron, > > Do you have experience running this on native GPFS? The docs say Lustre > and any NFS filesystem. > > Thanks, > Brian > > > On Aug 22, 2016 10:37 PM, "Aaron Knister" > wrote: > > Yes, indeed. Note that these are my personal opinions. > > It seems to work quite well and it's not terribly hard to set up or > get running. That said, if you've got a traditional HPC cluster with > reasonably good bandwidth (and especially if your data is already on > the HPC cluster) I wouldn't bother with FPO and just use something > like magpie (https://github.com/LLNL/magpie > ) to run your hadoopy workload on > GPFS on your traditional HPC cluster. I believe FPO (and by > extension data locality) is important when the available bandwidth > between your clients and servers/disks (in a traditional GPFS > environment) is less than the bandwidth available within a node > (e.g. between your local disks and the host CPU). > > -Aaron > > On 8/22/16 10:23 PM, Brian Marshall wrote: > > Does anyone have any experiences to share (good or bad) about > setting up > and utilizing FPO for hadoop compute on top of GPFS? > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From SAnderson at convergeone.com Thu Aug 25 17:32:48 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Thu, 25 Aug 2016 16:32:48 +0000 Subject: [gpfsug-discuss] mmcessmbchconfig command Message-ID: <1472142769455.35752@convergeone.com> ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bbanister at jumptrading.com Thu Aug 25 17:47:00 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 25 Aug 2016 16:47:00 +0000 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: <1472142769455.35752@convergeone.com> References: <1472142769455.35752@convergeone.com> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com> My general rule is that if there isn?t a man page or ?-h? option to explain the usage of the command, then it isn?t meant to be run by an user administrator. I wish that the commands that should never be run by a user admin (or without direction from IBM support) would be put in a different directory that clearly indicated they are for internal GPFS use. RFE worthy? Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Shaun Anderson Sent: Thursday, August 25, 2016 11:33 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] mmcessmbchconfig command ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Thu Aug 25 17:50:20 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 25 Aug 2016 16:50:20 +0000 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <1472142769455.35752@convergeone.com> <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB0630BFD5@CHI-EXCHANGEW1.w2k.jumptrading.com> I realize this was totally tangential to your question. Sorry I can?t help with the syntax, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan Banister Sent: Thursday, August 25, 2016 11:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmcessmbchconfig command My general rule is that if there isn?t a man page or ?-h? option to explain the usage of the command, then it isn?t meant to be run by an user administrator. 
I wish that the commands that should never be run by a user admin (or without direction from IBM support) would be put in a different directory that clearly indicated they are for internal GPFS use. RFE worthy? Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Shaun Anderson Sent: Thursday, August 25, 2016 11:33 AM To: gpfsug main discussion list > Subject: [gpfsug-discuss] mmcessmbchconfig command ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From taylorm at us.ibm.com Thu Aug 25 17:55:44 2016 From: taylorm at us.ibm.com (Michael L Taylor) Date: Thu, 25 Aug 2016 09:55:44 -0700 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: References: Message-ID: Not sure where mmcessmbchconfig command is coming from? mmsmb is the proper CLI syntax [root at smaug-vm1 installer]# /usr/lpp/mmfs/bin/mmsmb Usage: mmsmb export Administer SMB exports. mmsmb exportacl Administer SMB export ACLs. mmsmb config Administer SMB global configuration. [root at smaug-vm1 installer]# /usr/lpp/mmfs/bin/mmsmb export -h Usage: mmsmb export list List SMB exports. mmsmb export add Add SMB exports. mmsmb export change Change SMB exports. mmsmb export remove Remove SMB exports. 
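As a quick illustration of the same CLI (the share name and path below are made up, not taken from the original post, and the directory is assumed to exist already), creating and then verifying an export looks like this:

    # Hypothetical example: export an existing GPFS directory over SMB
    /usr/lpp/mmfs/bin/mmsmb export add projects /gpfs/gpfs01/projects
    /usr/lpp/mmfs/bin/mmsmb export list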
[root at smaug-vm1 installer]# man mmsmb
http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_mmsmb.htm

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mweil at wustl.edu Thu Aug 25 19:50:52 2016
From: mweil at wustl.edu (Matt Weil)
Date: Thu, 25 Aug 2016 13:50:52 -0500
Subject: [gpfsug-discuss] Backup on object stores
Message-ID: <5cc4ae43-2d0f-e548-b256-84f1890fe2d3@wustl.edu>

Hello all,

Just brainstorming here, mainly, but I want to know how you are all approaching this. Do you replicate using GPFS and forget about backups?

> https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adv_osbackup.htm

This seems good for a full recovery, but what if I just lost one object? It seems that if the objectizer is in use, then both Tivoli and space management can be used on the file.

Thanks in advance for your responses.

Matt

________________________________
The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited.
If you have received this email in error, please immediately notify the sender via telephone or return mail. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Greg.Lehmann at csiro.au Fri Aug 26 00:14:57 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Thu, 25 Aug 2016 23:14:57 +0000 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <1472142769455.35752@convergeone.com> <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <156b078bfb2d48d8b77d5250dba7e928@exch1-cdc.nexus.csiro.au> I agree with an RFE. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan Banister Sent: Friday, 26 August 2016 2:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmcessmbchconfig command My general rule is that if there isn?t a man page or ?-h? option to explain the usage of the command, then it isn?t meant to be run by an user administrator. I wish that the commands that should never be run by a user admin (or without direction from IBM support) would be put in a different directory that clearly indicated they are for internal GPFS use. RFE worthy? Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Shaun Anderson Sent: Thursday, August 25, 2016 11:33 AM To: gpfsug main discussion list > Subject: [gpfsug-discuss] mmcessmbchconfig command ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From syi at ca.ibm.com Fri Aug 26 00:15:46 2016 From: syi at ca.ibm.com (Yi Sun) Date: Thu, 25 Aug 2016 19:15:46 -0400 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: References: Message-ID: You may check mmsmb command, not sure if it is what you look for. https://www.ibm.com/support/knowledgecenter/STXKQY_4.1.1/com.ibm.spectrum.scale.v4r11.adm.doc/bl1adm_mmsmb.htm#mmsmb ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- From: Shaun Anderson To: gpfsug main discussion list Subject: [gpfsug-discuss] mmcessmbchconfig command Message-ID: <1472142769455.35752 at convergeone.com> Content-Type: text/plain; charset="iso-8859-1" ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Aug 26 00:49:12 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 25 Aug 2016 19:49:12 -0400 Subject: [gpfsug-discuss] CES and mmuserauth command In-Reply-To: References: Message-ID: To clarify and expand on some of these: --servers takes the AD Domain Controller that is contacted first during configuration. Later and during normal operations the list of DCs is retrieved from DNS and the fastest (or closest one according to the AD sites) is used. The initially one used does not have a special role. --idmap-role allows dedicating one cluster as a master, and a second cluster (e.g. a AFM replication target) as "subordinate". Only the master will allocate idmap ranges which can then be imported to the subordiate to have consistent id mappings. --idmap-range-size and --idmap-range are used for the internal idmap allocation which is used for every domain that is not explicitly using another domain. "man idmap_autorid" explains the approach taken. As long as the default does not overlap with any other ids, that can be used. The "netbios" name is used to create the machine account for the cluster when joining the AD domain. That is how the AD administrator will identify the CES cluster. It is also important in SMB deployments when Kerberos should be used with SMB: The same names as the netbios name has to be defined in DNS for the public CES IP addresses. When the name matches, then SMB clients can acquire a Kerberos ticket from AD to establish a SMB connection. When joinging the AD domain, --user-name, --password and --server are only used to initially identify and logon to the AD and to create the machine account for the cluster. Once that is done, that information is no longer used, and e.g. the account from --user-name could be deleted, the password changed or the specified DC could be removed from the domain (as long as other DCs are remaining). Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Jan-Frode Myklebust To: gpfsug main discussion list Date: 08/23/2016 08:15 AM Subject: Re: [gpfsug-discuss] CES and mmuserauth command Sent by: gpfsug-discuss-bounces at spectrumscale.org Sorry to see no authoritative answers yet.. I'm doing lots of CES installations, but have not quite yet gotten the full understanding of this.. 
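A quick way to see where that line is drawn on a given cluster is to compare the full configuration with the admin-changeable subset. Both commands below appear elsewhere in this thread and only list information; they do not change anything:

    # List every Samba option currently in effect, including internal ones
    /usr/lpp/mmfs/bin/mmsmb config list
    # List only the options an administrator is allowed to change
    /usr/lpp/mmfs/bin/mmsmb config change --key-info supported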
Simple stuff first: --servers You can only have one with AD. --enable-kerberos shouldn't be used, as that's only for LDAP according to the documentation. Guess kerberos is implied with AD. --idmap-role -- I've been using "master". Man-page says ID map role of a stand?alone or singular system deployment must be selected "master" What the idmap options seems to be doing is configure the idmap options for Samba. Maybe best explained by: https://wiki.samba.org/index.php/Idmap_config_ad Your suggested options will then give you the samba idmap configuration: idmap config * : rangesize = 1000000 idmap config * : range = 3000000-3500000 idmap config * : read only = no idmap:cache = no idmap config * : backend = autorid idmap config DOMAIN : schema_mode = rfc2307 idmap config DOMAIN : range = 500-2000000 idmap config DOMAIN : backend = ad Most likely you want to replace DOMAIN by your AD domain name.. So the --idmap options sets some defaults, that you probably won't care about, since all your users are likely covered by the specific "idmap config DOMAIN" config. Hope this helps somewhat, now I'll follow up with something I'm wondering myself...: Is the netbios name just a name, without any connection to anything in AD? Is the --user-name/--password a one-time used account that's only necessary when executing the mmuserauth command, or will it also be for communication between CES and AD while the services are running? -jf On Mon, Aug 22, 2016 at 1:59 PM, Sobey, Richard A wrote: Hi all, We?re just about to start testing a new CES 4.2.0 cluster and at the stage of ?joining? the cluster to our AD. What?s the bare minimum we need to get going with this? My Windows guy (who is more Linux but whatever) has suggested the following: mmuserauth service create --type ad --data-access-method file --netbios-name store --user-name USERNAME --password --enable-nfs-kerberos --enable-kerberos --servers list,of,servers --idmap-range-size 1000000 --idmap-range 3000000 - 3500000 --unixmap-domains 'DOMAIN(500 - 2000000)' He has also asked what the following is: --idmap-role ??? --idmap-range-size ?? All our LDAP GID/UIDs are coming from a system outside of GPFS so do we leave this blank, or say master Or, now I?ve re-read and mmuserauth page, is this purely for when you have AFM relationships and one GPFS cluster (the subordinate / the second cluster) gets its UIDs and GIDs from another GPFS cluster (the master / the first one)? For idmap-range-size is this essentially the highest number of users and groups you can have defined within Spectrum Scale? (I love how I?m using GPFS and SS interchangeably.. forgive me!) 
Many thanks Richard Richard Sobey Storage Area Network (SAN) Analyst Technical Operations, ICT Imperial College London South Kensington 403, City & Guilds Building London SW7 2AZ Tel: +44 (0)20 7594 6915 Email: r.sobey at imperial.ac.uk http://www.imperial.ac.uk/admin-services/ict/ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From christof.schmitt at us.ibm.com Fri Aug 26 00:49:12 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 25 Aug 2016 19:49:12 -0400 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: <1472142769455.35752@convergeone.com> References: <1472142769455.35752@convergeone.com> Message-ID: The mmcessmb* commands are scripts that are run from the corresponding mmsmb subcommands. mmsmb is documented and should be used instead of calling the mmcesmb* scripts directly. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/25/2016 12:33 PM Subject: [gpfsug-discuss] mmcessmbchconfig command Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From christof.schmitt at us.ibm.com Fri Aug 26 00:52:50 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 25 Aug 2016 19:52:50 -0400 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: The options listed in " mmsmb config change --key-info supported" are supported to be changed by administrator of the cluster. "mmsmb config list" lists the whole Samba config, including the options that are set internally. We do not want to support any random Samba configuration, hence the line between "supported" option and everything else. If there is a usecase that requires other Samba options than the ones listed as "supported", one way forward would be opening a RFE that describes the usecase and the Samba option to support it. 
Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 08/22/2016 09:28 AM Subject: [gpfsug-discuss] CES mmsmb options Sent by: gpfsug-discuss-bounces at spectrumscale.org Related to my previous question in so far as it?s to do with CES, what?s this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static? for example log size / location / dmapi support? I?m surely missing something obvious. It?s SS 4.2.0 btw. Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From gaurang.tapase at in.ibm.com Fri Aug 26 08:53:12 2016 From: gaurang.tapase at in.ibm.com (Gaurang Tapase) Date: Fri, 26 Aug 2016 13:23:12 +0530 Subject: [gpfsug-discuss] Blogs and publications on Spectrum Scale Message-ID: Hello, On Request from Bob Oesterlin, we post these links on User Group - Here are the latest publications and Blogs on Spectrum Scale. We encourage the User Group to follow the Spectrum Scale blogs on the http://storagecommunity.org or the Usergroup admin to register the email group of the feeds. A total of 25 recent Blogs on IBM Spectrum Scale by developers IBM Spectrum Scale Security IBM Spectrum Scale: Security Blog Series http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series , Spectrum Scale Security Blog Series: Introduction, http://storagecommunity.org/easyblog/entry/spectrum-scale-security-blog-series-introduction IBM Spectrum Scale Security: VLANs and Protocol nodes, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-vlans-and-protocol-nodes IBM Spectrum Scale Security: Firewall Overview http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-firewall-overview IBM Spectrum Scale Security Blog Series: Security with Spectrum Scale OpenStack Storage Drivers http://storagecommunity.org/easyblog/entry/security-with-spectrum-scale-openstack-storage-drivers , IBM Spectrum Scale Security Blog Series: Authorization http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series-authorization IBM Spectrum Scale: Object (OpenStack Swift, S3) Authorization, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-object-openstack-swift-s3-authorization , IBM Spectrum Scale Security: Secure Data at Rest, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-secure-data-at-rest IBM Spectrum Scale Security Blog Series: Secure Data in Transit, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series-secure-data-in-transit-1 IBM Spectrum Scale Security Blog Series: Sudo based Secure Administration and Admin Command Logging, http://storagecommunity.org/easyblog/entry/spectrum-scale-security-blog-series-sudo-based-secure-administration-and-admin-command-logging IBM Spectrum Scale Security: Security Features of Transparent Cloud Tiering (TCT), http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-security-features-of-transparent-cloud-tiering-tct IBM Spectrum Scale: Immutability, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-immutability IBM Spectrum Scale : FILE protocols authentication 
http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-file-protocols-authentication IBM Spectrum Scale : Object Authentication, http://storagecommunity.org/easyblog/entry/protocol-authentication-object, IBM Spectrum Scale Security: Anti-Virus bulk scanning, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-anti-virus-bulk-scanning , Spectrum Scale 4.2.1 - What's New http://storagecommunity.org/easyblog/entry/spectrum-scale-4-2-1-what-s-new IBM Spectrum Scale 4.2.1 : diving deeper, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-4-2-1-diving-deeper NEW DEMO: Using IBM Cloud Object Storage as IBM Spectrum Scale Transparent Cloud Tier, http://storagecommunity.org/easyblog/entry/new-demo-using-ibm-cloud-object-storage-as-ibm-spectrum-scale-transparent-cloud-tier Spectrum Scale transparent cloud tiering, http://storagecommunity.org/easyblog/entry/spectrum-scale-transparent-cloud-tiering Spectrum Scale in Wonderland - Introducing transparent cloud tiering with Spectrum Scale 4.2.1, http://storagecommunity.org/easyblog/entry/spectrum-scale-in-wonderland, Spectrum Scale Object Related Blogs IBM Spectrum Scale 4.2.1 - What's new in Object, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-4-2-1-what-s-new-in-object , Hot cakes or hot objects, they better be served fast http://storagecommunity.org/easyblog/entry/hot-cakes-or-hot-objects-they-better-be-served-fast IBM Spectrum Scale: Object (OpenStack Swift, S3) Authorization, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-object-openstack-swift-s3-authorization , IBM Spectrum Scale : Object Authentication, http://storagecommunity.org/easyblog/entry/protocol-authentication-object, Spectrum Scale BD&A IBM Spectrum Scale: new features of HDFS Transparency, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-new-features-of-hdfs-transparency , Regards, ------------------------------------------------------------------------ Gaurang S Tapase Spectrum Scale & OpenStack Development IBM India Storage Lab, Pune (India) Email : gaurang.tapase at in.ibm.com Phone : +91-20-42025699 (W), +91-9860082042(Cell) ------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Fri Aug 26 09:17:55 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 26 Aug 2016 08:17:55 +0000 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: Thanks Christof, and for the detailed posting on the mmuserauth settings. I do not know why we have changed dmapi support in our existing smb.conf, but perhaps it was for some legacy stuff. Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 26 August 2016 00:53 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] CES mmsmb options The options listed in " mmsmb config change --key-info supported" are supported to be changed by administrator of the cluster. "mmsmb config list" lists the whole Samba config, including the options that are set internally. We do not want to support any random Samba configuration, hence the line between "supported" option and everything else. If there is a usecase that requires other Samba options than the ones listed as "supported", one way forward would be opening a RFE that describes the usecase and the Samba option to support it. 
Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 08/22/2016 09:28 AM Subject: [gpfsug-discuss] CES mmsmb options Sent by: gpfsug-discuss-bounces at spectrumscale.org Related to my previous question in so far as it?s to do with CES, what?s this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static? for example log size / location / dmapi support? I?m surely missing something obvious. It?s SS 4.2.0 btw. Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From r.sobey at imperial.ac.uk Fri Aug 26 09:48:24 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 26 Aug 2016 08:48:24 +0000 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? Message-ID: Sorry all, prepare for a deluge of emails like this, hopefully it'll help other people implementing CES in the future. I'm trying to stop SMB on a node, but getting the following output: [root at cesnode ~]# mmces service stop smb smb: Request denied. Please stop NFS first [root at cesnode ~]# mmces service list Enabled services: SMB SMB is running As you can see there is no way to stop NFS when it's not running but it seems to be blocking me. It's happening on all the nodes in the cluster. SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Fri Aug 26 10:48:18 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Fri, 26 Aug 2016 09:48:18 +0000 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? In-Reply-To: References: Message-ID: That was a weird one :-) Don't understand why NFS would block smb.., and I don't see that on my cluster. Would it make sense to suspend the node instead? As a workaround. mmces node suspend -jf fre. 26. aug. 2016 kl. 10.48 skrev Sobey, Richard A : > Sorry all, prepare for a deluge of emails like this, hopefully it?ll help > other people implementing CES in the future. > > > > I?m trying to stop SMB on a node, but getting the following output: > > > > [root at cesnode ~]# mmces service stop smb > > smb: Request denied. Please stop NFS first > > > > [root at cesnode ~]# mmces service list > > Enabled services: SMB > > SMB is running > > > > As you can see there is no way to stop NFS when it?s not running but it > seems to be blocking me. It?s happening on all the nodes in the cluster. > > > > SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. > > > > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From konstantin.arnold at unibas.ch Fri Aug 26 10:56:28 2016 From: konstantin.arnold at unibas.ch (Konstantin Arnold) Date: Fri, 26 Aug 2016 11:56:28 +0200 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? 
In-Reply-To: References: Message-ID: <57C0124C.7050404@unibas.ch>

Hi Richard,

I ran into the same issue and asked whether 'systemctl reload gpfs-smb.service' would work. I got the following answer: "... Now in regards to your question about stopping NFS, yes this is an expected behavior and yes you could also restart through systemctl."

Maybe that helps.
Konstantin

From janfrode at tanso.net Fri Aug 26 10:59:34 2016
From: janfrode at tanso.net (Jan-Frode Myklebust)
Date: Fri, 26 Aug 2016 11:59:34 +0200
Subject: Re: [gpfsug-discuss] CES and mmuserauth command
In-Reply-To: References: Message-ID:

On Fri, Aug 26, 2016 at 1:49 AM, Christof Schmitt <christof.schmitt at us.ibm.com> wrote:

> When joining the AD domain, --user-name, --password and --server are only
> used to initially identify and logon to the AD and to create the machine
> account for the cluster. Once that is done, that information is no longer
> used, and e.g. the account from --user-name could be deleted, the password
> changed or the specified DC could be removed from the domain (as long as
> other DCs are remaining).

That was my initial understanding of the --user-name, but when reading the man page I get the impression that it's also used to connect to AD to do user and group lookups:

------------------------------------------------------------------------------------------------------
--user-name userName
        Specifies the user name to be used to perform operations
        against the authentication server. The specified user
        name must have sufficient permissions to read user and
        group attributes from the authentication server.
-------------------------------------------------------------------------------------------------------

Also, it's strange that "mmuserauth service list" would list the USER_NAME if it was only something that was used at configuration time..?

-jf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From christof.schmitt at us.ibm.com Fri Aug 26 17:29:31 2016
From: christof.schmitt at us.ibm.com (Christof Schmitt)
Date: Fri, 26 Aug 2016 12:29:31 -0400
Subject: Re: [gpfsug-discuss] Cannot stop SMB... stop NFS first?
In-Reply-To: References: Message-ID:

That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: one is the actual SMB file server, and the second is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running.

Regards,

Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ
christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469)

From: "Sobey, Richard A"
To: "'gpfsug-discuss at spectrumscale.org'"
Date: 08/26/2016 04:48 AM
Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first?
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Sorry all, prepare for a deluge of emails like this; hopefully it'll help other people implementing CES in the future.

I'm trying to stop SMB on a node, but I'm getting the following output:

[root at cesnode ~]# mmces service stop smb
smb: Request denied. Please stop NFS first

[root at cesnode ~]# mmces service list
Enabled services: SMB
SMB is running

As you can see there is no way to stop NFS when it's not running, but it seems to be blocking me. It's happening on all the nodes in the cluster.

SS version is 4.2.0 running on a fully up to date RHEL 7.1 server.
Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From christof.schmitt at us.ibm.com Fri Aug 26 17:29:31 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 26 Aug 2016 12:29:31 -0400 Subject: [gpfsug-discuss] CES and mmuserauth command In-Reply-To: References: Message-ID: The --user-name option applies to both, AD and LDAP authentication. In the LDAP case, this information is correct. I will try to get some clarification added for the AD case. The same applies to the information shown in "service list". There is a common field that holds the information and the parameter from the initial "service create" is stored there. The meaning is different for AD and LDAP: For LDAP it is the username being used to access the LDAP server, while in the AD case it was only the user initially used until the machine account was created. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Jan-Frode Myklebust To: gpfsug main discussion list Date: 08/26/2016 05:59 AM Subject: Re: [gpfsug-discuss] CES and mmuserauth command Sent by: gpfsug-discuss-bounces at spectrumscale.org On Fri, Aug 26, 2016 at 1:49 AM, Christof Schmitt < christof.schmitt at us.ibm.com> wrote: When joinging the AD domain, --user-name, --password and --server are only used to initially identify and logon to the AD and to create the machine account for the cluster. Once that is done, that information is no longer used, and e.g. the account from --user-name could be deleted, the password changed or the specified DC could be removed from the domain (as long as other DCs are remaining). That was my initial understanding of the --user-name, but when reading the man-page I get the impression that it's also used to do connect to AD to do user and group lookups: ------------------------------------------------------------------------------------------------------ ??user?name userName Specifies the user name to be used to perform operations against the authentication server. The specified user name must have sufficient permissions to read user and group attributes from the authentication server. ------------------------------------------------------------------------------------------------------- Also it's strange that "mmuserauth service list" would list the USER_NAME if it was only somthing that was used at configuration time..? -jf_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From dacalder at co.ibm.com Sat Aug 27 13:52:44 2016 From: dacalder at co.ibm.com (Danny Alexander Calderon Rodriguez) Date: Sat, 27 Aug 2016 12:52:44 +0000 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? In-Reply-To: Message-ID: Hi Richard This is fixed in release 4.2.1, if you cant upgrade now, you can fix this manuallly Just do this. 
edit file /usr/lpp/mmfs/lib/mmcesmon/SMBService.py Change if authType == 'ad' and not nodeState.nfsStopped: to nfsEnabled = utils.isProtocolEnabled("NFS", self.logger) if authType == 'ad' and not nodeState.nfsStopped and nfsEnabled: You need to stop the gpfs service in each node where you apply the change " after change the lines please use tap key" Enviado desde mi iPhone > El 27/08/2016, a las 6:00 a.m., gpfsug-discuss-request at spectrumscale.org escribi?: > > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: Cannot stop SMB... stop NFS first?(Christof Schmitt) > 2. Re: CES and mmuserauth command (Christof Schmitt) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 26 Aug 2016 12:29:31 -0400 > From: "Christof Schmitt" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Cannot stop SMB... stop NFS first? > Message-ID: > > > Content-Type: text/plain; charset="UTF-8" > > That would be the case when Active Directory is configured for > authentication. In that case the SMB service includes two aspects: One is > the actual SMB file server, and the second one is the service for the > Active Directory integration. Since NFS depends on authentication and id > mapping services, it requires SMB to be running. > > Regards, > > Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ > christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) > > > > From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > > Date: 08/26/2016 04:48 AM > Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Sorry all, prepare for a deluge of emails like this, hopefully it?ll help > other people implementing CES in the future. > > I?m trying to stop SMB on a node, but getting the following output: > > [root at cesnode ~]# mmces service stop smb > smb: Request denied. Please stop NFS first > > [root at cesnode ~]# mmces service list > Enabled services: SMB > SMB is running > > As you can see there is no way to stop NFS when it?s not running but it > seems to be blocking me. It?s happening on all the nodes in the cluster. > > SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. > > Richard_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > ------------------------------ > > Message: 2 > Date: Fri, 26 Aug 2016 12:29:31 -0400 > From: "Christof Schmitt" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] CES and mmuserauth command > Message-ID: > > > Content-Type: text/plain; charset="ISO-2022-JP" > > The --user-name option applies to both, AD and LDAP authentication. In the > LDAP case, this information is correct. I will try to get some > clarification added for the AD case. > > The same applies to the information shown in "service list". 
There is a > common field that holds the information and the parameter from the initial > "service create" is stored there. The meaning is different for AD and > LDAP: For LDAP it is the username being used to access the LDAP server, > while in the AD case it was only the user initially used until the machine > account was created. > > Regards, > > Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ > christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) > > > > From: Jan-Frode Myklebust > To: gpfsug main discussion list > Date: 08/26/2016 05:59 AM > Subject: Re: [gpfsug-discuss] CES and mmuserauth command > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > > On Fri, Aug 26, 2016 at 1:49 AM, Christof Schmitt < > christof.schmitt at us.ibm.com> wrote: > > When joinging the AD domain, --user-name, --password and --server are only > used to initially identify and logon to the AD and to create the machine > account for the cluster. Once that is done, that information is no longer > used, and e.g. the account from --user-name could be deleted, the password > changed or the specified DC could be removed from the domain (as long as > other DCs are remaining). > > > That was my initial understanding of the --user-name, but when reading the > man-page I get the impression that it's also used to do connect to AD to > do user and group lookups: > > ------------------------------------------------------------------------------------------------------ > ??user?name userName > Specifies the user name to be used to perform operations > against the authentication server. The specified user > name must have sufficient permissions to read user and > group attributes from the authentication server. > ------------------------------------------------------------------------------------------------------- > > Also it's strange that "mmuserauth service list" would list the USER_NAME > if it was only somthing that was used at configuration time..? > > > > -jf_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 55, Issue 44 > ********************************************** > -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Sat Aug 27 20:06:45 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Sat, 27 Aug 2016 19:06:45 +0000 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? In-Reply-To: References: Message-ID: Hi, Thanks for the info! I think I?ll perform an upgrade to 4.2.1, the cluster is still in a pre-production state and I?ve yet to really start testing client access. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Danny Alexander Calderon Rodriguez Sent: 27 August 2016 13:53 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Cannot stop SMB... stop NFS first? Hi Richard This is fixed in release 4.2.1, if you cant upgrade now, you can fix this manuallly Just do this. 
edit file /usr/lpp/mmfs/lib/mmcesmon/SMBService.py Change if authType == 'ad' and not nodeState.nfsStopped: to nfsEnabled = utils.isProtocolEnabled("NFS", self.logger) if authType == 'ad' and not nodeState.nfsStopped and nfsEnabled: You need to stop the gpfs service in each node where you apply the change " after change the lines please use tap key" Enviado desde mi iPhone El 27/08/2016, a las 6:00 a.m., gpfsug-discuss-request at spectrumscale.org escribi?: Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: Cannot stop SMB... stop NFS first?(Christof Schmitt) 2. Re: CES and mmuserauth command (Christof Schmitt) ---------------------------------------------------------------------- Message: 1 Date: Fri, 26 Aug 2016 12:29:31 -0400 From: "Christof Schmitt" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Cannot stop SMB... stop NFS first? Message-ID: > Content-Type: text/plain; charset="UTF-8" That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 08/26/2016 04:48 AM Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? Sent by: gpfsug-discuss-bounces at spectrumscale.org Sorry all, prepare for a deluge of emails like this, hopefully it?ll help other people implementing CES in the future. I?m trying to stop SMB on a node, but getting the following output: [root at cesnode ~]# mmces service stop smb smb: Request denied. Please stop NFS first [root at cesnode ~]# mmces service list Enabled services: SMB SMB is running As you can see there is no way to stop NFS when it?s not running but it seems to be blocking me. It?s happening on all the nodes in the cluster. SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------ Message: 2 Date: Fri, 26 Aug 2016 12:29:31 -0400 From: "Christof Schmitt" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] CES and mmuserauth command Message-ID: > Content-Type: text/plain; charset="ISO-2022-JP" The --user-name option applies to both, AD and LDAP authentication. In the LDAP case, this information is correct. I will try to get some clarification added for the AD case. The same applies to the information shown in "service list". There is a common field that holds the information and the parameter from the initial "service create" is stored there. 
The meaning is different for AD and LDAP: For LDAP it is the username being used to access the LDAP server, while in the AD case it was only the user initially used until the machine account was created. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Jan-Frode Myklebust > To: gpfsug main discussion list > Date: 08/26/2016 05:59 AM Subject: Re: [gpfsug-discuss] CES and mmuserauth command Sent by: gpfsug-discuss-bounces at spectrumscale.org On Fri, Aug 26, 2016 at 1:49 AM, Christof Schmitt < christof.schmitt at us.ibm.com> wrote: When joinging the AD domain, --user-name, --password and --server are only used to initially identify and logon to the AD and to create the machine account for the cluster. Once that is done, that information is no longer used, and e.g. the account from --user-name could be deleted, the password changed or the specified DC could be removed from the domain (as long as other DCs are remaining). That was my initial understanding of the --user-name, but when reading the man-page I get the impression that it's also used to do connect to AD to do user and group lookups: ------------------------------------------------------------------------------------------------------ ??user?name userName Specifies the user name to be used to perform operations against the authentication server. The specified user name must have sufficient permissions to read user and group attributes from the authentication server. ------------------------------------------------------------------------------------------------------- Also it's strange that "mmuserauth service list" would list the USER_NAME if it was only somthing that was used at configuration time..? -jf_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 55, Issue 44 ********************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Mon Aug 29 00:57:21 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Sun, 28 Aug 2016 23:57:21 +0000 Subject: [gpfsug-discuss] Blogs and publications on Spectrum Scale In-Reply-To: References: Message-ID: <57496841ec784222b5e291a921280c38@exch1-cdc.nexus.csiro.au> It would be nice if the Spectrum Scale User Group website had links to these, perhaps a separate page for blogs links. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Gaurang Tapase Sent: Friday, 26 August 2016 5:53 PM To: gpfsug main discussion list Cc: Sandeep Ramesh Subject: [gpfsug-discuss] Blogs and publications on Spectrum Scale Hello, On Request from Bob Oesterlin, we post these links on User Group - Here are the latest publications and Blogs on Spectrum Scale. We encourage the User Group to follow the Spectrum Scale blogs on the http://storagecommunity.orgor the Usergroup admin to register the email group of the feeds. 
A total of 25 recent Blogs on IBM Spectrum Scale by developers IBM Spectrum Scale Security IBM Spectrum Scale: Security Blog Series http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series, Spectrum Scale Security Blog Series: Introduction, http://storagecommunity.org/easyblog/entry/spectrum-scale-security-blog-series-introduction IBM Spectrum Scale Security: VLANs and Protocol nodes, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-vlans-and-protocol-nodes IBM Spectrum Scale Security: Firewall Overview http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-firewall-overview IBM Spectrum Scale Security Blog Series: Security with Spectrum Scale OpenStack Storage Drivers http://storagecommunity.org/easyblog/entry/security-with-spectrum-scale-openstack-storage-drivers, IBM Spectrum Scale Security Blog Series: Authorization http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series-authorization IBM Spectrum Scale: Object (OpenStack Swift, S3) Authorization, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-object-openstack-swift-s3-authorization, IBM Spectrum Scale Security: Secure Data at Rest, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-secure-data-at-rest IBM Spectrum Scale Security Blog Series: Secure Data in Transit, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series-secure-data-in-transit-1 IBM Spectrum Scale Security Blog Series: Sudo based Secure Administration and Admin Command Logging, http://storagecommunity.org/easyblog/entry/spectrum-scale-security-blog-series-sudo-based-secure-administration-and-admin-command-logging IBM Spectrum Scale Security: Security Features of Transparent Cloud Tiering (TCT), http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-security-features-of-transparent-cloud-tiering-tct IBM Spectrum Scale: Immutability, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-immutability IBM Spectrum Scale : FILE protocols authentication http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-file-protocols-authentication IBM Spectrum Scale : Object Authentication, http://storagecommunity.org/easyblog/entry/protocol-authentication-object, IBM Spectrum Scale Security: Anti-Virus bulk scanning, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-anti-virus-bulk-scanning, Spectrum Scale 4.2.1 - What's New http://storagecommunity.org/easyblog/entry/spectrum-scale-4-2-1-what-s-new IBM Spectrum Scale 4.2.1 : diving deeper, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-4-2-1-diving-deeper NEW DEMO: Using IBM Cloud Object Storage as IBM Spectrum Scale Transparent Cloud Tier, http://storagecommunity.org/easyblog/entry/new-demo-using-ibm-cloud-object-storage-as-ibm-spectrum-scale-transparent-cloud-tier Spectrum Scale transparent cloud tiering, http://storagecommunity.org/easyblog/entry/spectrum-scale-transparent-cloud-tiering Spectrum Scale in Wonderland - Introducing transparent cloud tiering with Spectrum Scale 4.2.1, http://storagecommunity.org/easyblog/entry/spectrum-scale-in-wonderland, Spectrum Scale Object Related Blogs IBM Spectrum Scale 4.2.1 - What's new in Object, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-4-2-1-what-s-new-in-object, Hot cakes or hot objects, they better be served fast http://storagecommunity.org/easyblog/entry/hot-cakes-or-hot-objects-they-better-be-served-fast IBM Spectrum Scale: Object (OpenStack Swift, 
S3) Authorization, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-object-openstack-swift-s3-authorization, IBM Spectrum Scale : Object Authentication, http://storagecommunity.org/easyblog/entry/protocol-authentication-object, Spectrum Scale BD&A IBM Spectrum Scale: new features of HDFS Transparency, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-new-features-of-hdfs-transparency, Regards, ------------------------------------------------------------------------ Gaurang S Tapase Spectrum Scale & OpenStack Development IBM India Storage Lab, Pune (India) Email : gaurang.tapase at in.ibm.com Phone : +91-20-42025699 (W), +91-9860082042(Cell) ------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From douglasof at us.ibm.com Mon Aug 29 06:34:03 2016 From: douglasof at us.ibm.com (Douglas O'flaherty) Date: Sun, 28 Aug 2016 22:34:03 -0700 Subject: [gpfsug-discuss] Edge Attendees Message-ID: Greetings: I am organizing an NDA round-table with the IBM Offering Managers at IBM Edge on Tuesday, September 20th at 1pm. The subject will be "The Future of IBM Spectrum Scale." IBM Offering Managers are the Product Owners at IBM. There will be discussions covering licensing, the roadmap for IBM Spectrum Scale RAID (aka GNR), new hardware platforms, etc. This is a unique opportunity to get feedback to the drivers of the IBM Spectrum Scale business plans. It should be a great companion to the content we get from Engineering and Research at most User Group meetings. To get an invitation, please email me privately at douglasof us.ibm.com. All who have a valid NDA are invited. I only need an approximate headcount of attendees. Try not to spam the mailing list. I am pushing to get the Offering Managers to have a similar session at SC16 as an IBM Multi-client Briefing. You can add your voice to that call on this thread, or email me directly. Spectrum Scale User Group at SC16 will once again take place on Sunday afternoon with cocktails to follow. I hope we can blow out the attendance numbers and the number of site speakers we had last year! I know Simon, Bob, and Kristy are already working the agenda. Get your ideas in to them or to me. See you in Vegas, Vegas, SLC, Vegas this Fall... Maybe Australia in between? doug Douglas O'Flaherty IBM Spectrum Storage Marketing -------------- next part -------------- An HTML attachment was scrubbed... URL: From stef.coene at docum.org Mon Aug 29 07:39:05 2016 From: stef.coene at docum.org (Stef Coene) Date: Mon, 29 Aug 2016 08:39:05 +0200 Subject: [gpfsug-discuss] Blogs and publications on Spectrum Scale In-Reply-To: References: Message-ID: <9bb8d52e-86a3-3ff7-daaf-dc6bf0a3bd82@docum.org> Hi, When trying to register on the website, I each time get the error: "Session expired. Please try again later." Stef From kraemerf at de.ibm.com Mon Aug 29 08:20:46 2016 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Mon, 29 Aug 2016 09:20:46 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Message-ID: Hi all, In the last months several customers were asking for the option to use multiple IBM Spectrum Protect servers to protect a single IBM Spectrum Scale file system. Some of these customer reached the server scalability limits, others wanted to increase the parallelism of the server housekeeping processes. 
In consideration of the significant grow of data it can be assumed that more and more customers will be faced with this challenge in the future. Therefore, this paper was written that helps to address this situation. This paper describes the setup and configuration of multiple IBM Spectrum Protect servers to be used to store backup and hsm data of a single IBM Spectrum Scale file system. Beside the setup and configuration several best practices were written to the paper that help to simplify the daily use and administration of such environments. Find the paper here: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection A big THANK YOU goes to my co-writers Thomas Schreiber and Patrick Luft for their important input and all the tests (...and re-tests and re-tests and re-tests :-) ) they did. ...please share in your communities. Greetings, Dominic. ______________________________________________________________________________________________________________ Dominic Mueller-Wicke | IBM Spectrum Protect Development | Technical Lead | +49 7034 64 32794 | dominic.mueller at de.ibm.com Vorsitzende des Aufsichtsrats: Martina Koederitz; Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen; Registergericht: Amtsgericht Stuttgart, HRB 243294 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Mon Aug 29 18:33:59 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 13:33:59 -0400 Subject: [gpfsug-discuss] iowait? Message-ID: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> Hi Everyone, Would it be easy to have GPFS report iowait values in linux? This would be a huge help for us in determining whether a node's low utilization is due to some issue with the code running on it or if it's blocked on I/O, especially in a historical context. I naively tried on a test system changing schedule() in cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: again: /* call the scheduler */ if ( waitFlags & INTERRUPTIBLE ) schedule(); else io_schedule(); Seems to actually do what I'm after but generally bad things happen when I start pretending I'm a kernel developer. Any thoughts? If I open an RFE would this be something that's relatively easy to implement (not asking for a commitment *to* implement it, just that I'm not asking for something seemingly simple that's actually fairly hard to implement)? -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From chekh at stanford.edu Mon Aug 29 18:50:23 2016 From: chekh at stanford.edu (Alex Chekholko) Date: Mon, 29 Aug 2016 10:50:23 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> Message-ID: <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> Any reason you can't just use iostat or collectl or any of a number of other standards tools to look at disk utilization? On 08/29/2016 10:33 AM, Aaron Knister wrote: > Hi Everyone, > > Would it be easy to have GPFS report iowait values in linux? This would > be a huge help for us in determining whether a node's low utilization is > due to some issue with the code running on it or if it's blocked on I/O, > especially in a historical context. 
> > I naively tried on a test system changing schedule() in > cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: > > again: > /* call the scheduler */ > if ( waitFlags & INTERRUPTIBLE ) > schedule(); > else > io_schedule(); > > Seems to actually do what I'm after but generally bad things happen when > I start pretending I'm a kernel developer. > > Any thoughts? If I open an RFE would this be something that's relatively > easy to implement (not asking for a commitment *to* implement it, just > that I'm not asking for something seemingly simple that's actually > fairly hard to implement)? > > -Aaron > -- Alex Chekholko chekh at stanford.edu From aaron.s.knister at nasa.gov Mon Aug 29 18:54:12 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 13:54:12 -0400 Subject: [gpfsug-discuss] iowait? In-Reply-To: <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> Message-ID: <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. On 8/29/16 1:50 PM, Alex Chekholko wrote: > Any reason you can't just use iostat or collectl or any of a number of > other standards tools to look at disk utilization? > > On 08/29/2016 10:33 AM, Aaron Knister wrote: >> Hi Everyone, >> >> Would it be easy to have GPFS report iowait values in linux? This would >> be a huge help for us in determining whether a node's low utilization is >> due to some issue with the code running on it or if it's blocked on I/O, >> especially in a historical context. >> >> I naively tried on a test system changing schedule() in >> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >> >> again: >> /* call the scheduler */ >> if ( waitFlags & INTERRUPTIBLE ) >> schedule(); >> else >> io_schedule(); >> >> Seems to actually do what I'm after but generally bad things happen when >> I start pretending I'm a kernel developer. >> >> Any thoughts? If I open an RFE would this be something that's relatively >> easy to implement (not asking for a commitment *to* implement it, just >> that I'm not asking for something seemingly simple that's actually >> fairly hard to implement)? >> >> -Aaron >> > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Mon Aug 29 18:56:25 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 29 Aug 2016 17:56:25 +0000 Subject: [gpfsug-discuss] iowait? 
In-Reply-To: <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> There is the iohist data that may have what you're looking for, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 12:54 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] iowait? Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. On 8/29/16 1:50 PM, Alex Chekholko wrote: > Any reason you can't just use iostat or collectl or any of a number of > other standards tools to look at disk utilization? > > On 08/29/2016 10:33 AM, Aaron Knister wrote: >> Hi Everyone, >> >> Would it be easy to have GPFS report iowait values in linux? This >> would be a huge help for us in determining whether a node's low >> utilization is due to some issue with the code running on it or if >> it's blocked on I/O, especially in a historical context. >> >> I naively tried on a test system changing schedule() in >> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >> >> again: >> /* call the scheduler */ >> if ( waitFlags & INTERRUPTIBLE ) >> schedule(); >> else >> io_schedule(); >> >> Seems to actually do what I'm after but generally bad things happen >> when I start pretending I'm a kernel developer. >> >> Any thoughts? If I open an RFE would this be something that's >> relatively easy to implement (not asking for a commitment *to* >> implement it, just that I'm not asking for something seemingly simple >> that's actually fairly hard to implement)? >> >> -Aaron >> > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From olaf.weiser at de.ibm.com Mon Aug 29 19:02:38 2016 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Mon, 29 Aug 2016 20:02:38 +0200 Subject: [gpfsug-discuss] iowait? 
In-Reply-To: <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Mon Aug 29 19:04:32 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 14:04:32 -0400 Subject: [gpfsug-discuss] iowait? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. Is there a better way to grab the data or have gpfs capture more of it? Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. -Aaron On 8/29/16 1:56 PM, Bryan Banister wrote: > There is the iohist data that may have what you're looking for, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 12:54 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] iowait? > > Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. > > On 8/29/16 1:50 PM, Alex Chekholko wrote: >> Any reason you can't just use iostat or collectl or any of a number of >> other standards tools to look at disk utilization? >> >> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>> Hi Everyone, >>> >>> Would it be easy to have GPFS report iowait values in linux? This >>> would be a huge help for us in determining whether a node's low >>> utilization is due to some issue with the code running on it or if >>> it's blocked on I/O, especially in a historical context. >>> >>> I naively tried on a test system changing schedule() in >>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>> >>> again: >>> /* call the scheduler */ >>> if ( waitFlags & INTERRUPTIBLE ) >>> schedule(); >>> else >>> io_schedule(); >>> >>> Seems to actually do what I'm after but generally bad things happen >>> when I start pretending I'm a kernel developer. >>> >>> Any thoughts? If I open an RFE would this be something that's >>> relatively easy to implement (not asking for a commitment *to* >>> implement it, just that I'm not asking for something seemingly simple >>> that's actually fairly hard to implement)? 
>>> >>> -Aaron >>> >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Mon Aug 29 19:06:36 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 29 Aug 2016 18:06:36 +0000 Subject: [gpfsug-discuss] iowait? In-Reply-To: <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Try this: mmchconfig ioHistorySize=1024 # Or however big you want! Cheers, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 1:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] iowait? That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. Is there a better way to grab the data or have gpfs capture more of it? Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. -Aaron On 8/29/16 1:56 PM, Bryan Banister wrote: > There is the iohist data that may have what you're looking for, -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron > Knister > Sent: Monday, August 29, 2016 12:54 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] iowait? > > Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. 
That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. > > On 8/29/16 1:50 PM, Alex Chekholko wrote: >> Any reason you can't just use iostat or collectl or any of a number >> of other standards tools to look at disk utilization? >> >> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>> Hi Everyone, >>> >>> Would it be easy to have GPFS report iowait values in linux? This >>> would be a huge help for us in determining whether a node's low >>> utilization is due to some issue with the code running on it or if >>> it's blocked on I/O, especially in a historical context. >>> >>> I naively tried on a test system changing schedule() in >>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>> >>> again: >>> /* call the scheduler */ >>> if ( waitFlags & INTERRUPTIBLE ) >>> schedule(); >>> else >>> io_schedule(); >>> >>> Seems to actually do what I'm after but generally bad things happen >>> when I start pretending I'm a kernel developer. >>> >>> Any thoughts? If I open an RFE would this be something that's >>> relatively easy to implement (not asking for a commitment *to* >>> implement it, just that I'm not asking for something seemingly >>> simple that's actually fairly hard to implement)? >>> >>> -Aaron >>> >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight > Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. 
The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From aaron.s.knister at nasa.gov Mon Aug 29 19:09:36 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 14:09:36 -0400 Subject: [gpfsug-discuss] iowait? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> Nice! Thanks Bryan. I wonder what the implications are of setting it to something high enough that we could capture data every 10s. I figure if 512 events only takes me to 1 second I would need to log in the realm of 10k to capture every 10 seconds and account for spikes in I/O. -Aaron On 8/29/16 2:06 PM, Bryan Banister wrote: > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? 
This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Mon Aug 29 19:11:05 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 29 Aug 2016 18:11:05 +0000 Subject: [gpfsug-discuss] iowait? In-Reply-To: <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB063147A9@CHI-EXCHANGEW1.w2k.jumptrading.com> That's a good question, but I don't expect it should cause you much of a problem. Of course testing and trying to measure any impact would be wise, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 1:10 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] iowait? Nice! Thanks Bryan. I wonder what the implications are of setting it to something high enough that we could capture data every 10s. I figure if 512 events only takes me to 1 second I would need to log in the realm of 10k to capture every 10 seconds and account for spikes in I/O. -Aaron On 8/29/16 2:06 PM, Bryan Banister wrote: > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron > Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, >> -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. 
Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight > Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From sfadden at us.ibm.com Mon Aug 29 20:33:14 2016 From: sfadden at us.ibm.com (Scott Fadden) Date: Mon, 29 Aug 2016 12:33:14 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu><5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com><7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> Message-ID: There is a known performance issue that can possibly cause longer than expected network time-outs if you are running iohist too often. So be careful it is best to collect it as a sample, instead of all of the time. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale From: Aaron Knister To: Date: 08/29/2016 11:09 AM Subject: Re: [gpfsug-discuss] iowait? Sent by: gpfsug-discuss-bounces at spectrumscale.org Nice! Thanks Bryan. I wonder what the implications are of setting it to something high enough that we could capture data every 10s. I figure if 512 events only takes me to 1 second I would need to log in the realm of 10k to capture every 10 seconds and account for spikes in I/O. -Aaron On 8/29/16 2:06 PM, Bryan Banister wrote: > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? 
> > Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Aug 29 20:37:13 2016 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Mon, 29 Aug 2016 19:37:13 +0000 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: Hi Richard, You can of course change any of the other options with the "net conf" (/usr/lpp/mmfs/bin/net conf) command. As its just stored in the Samba registry. Of course whether or not you end up with a supported configuration is a different matter... When we first rolled out CES/SMB, there were a number of issues with setting it up in the way we needed for our environment (AD for auth, LDAP for identity) which at the time wasn't available through the config tools. I believe this has now changed though I haven't gone back and "reset" our configs. Simon ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Sobey, Richard A [r.sobey at imperial.ac.uk] Sent: 22 August 2016 14:28 To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] CES mmsmb options Related to my previous question in so far as it?s to do with CES, what?s this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static? for example log size / location / dmapi support? 
I?m surely missing something obvious. It?s SS 4.2.0 btw. Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From usa-principal at gpfsug.org Mon Aug 29 21:13:51 2016 From: usa-principal at gpfsug.org (Spectrum Scale Users Group - USA Principal Kristy Kallback-Rose) Date: Mon, 29 Aug 2016 16:13:51 -0400 Subject: [gpfsug-discuss] SC16 Hold the Date - Spectrum Scale (GPFS) Users Group Event Message-ID: <648FFF79-343D-447E-9CC5-4E0199C29572@gpfsug.org> Hello, I know many of you may be planning your SC16 schedule already. We wanted to give you a heads up that a Spectrum Scale (GPFS) Users Group event is being planned. The event will be much like last year?s event with a combination of technical updates and user experiences and thus far is loosely planned for: Sunday (11/13) ~12p - ~5 PM with a social hour after the meeting. We hope to see you there. More details as planning progresses. Best, Kristy & Bob From S.J.Thompson at bham.ac.uk Mon Aug 29 21:27:28 2016 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Mon, 29 Aug 2016 20:27:28 +0000 Subject: [gpfsug-discuss] SC16 Hold the Date - Spectrum Scale (GPFS) Users Group Event In-Reply-To: <648FFF79-343D-447E-9CC5-4E0199C29572@gpfsug.org> References: <648FFF79-343D-447E-9CC5-4E0199C29572@gpfsug.org> Message-ID: You may also be interested in a panel session on the Friday of SC16: http://sc16.supercomputing.org/presentation/?id=pan120&sess=sess185 This isn't a user group event, but part of the technical programme for SC16, though I'm sure you will recognise some of the names from the storage community. Moderator: Simon Thompson (me) Panel: Sven Oehme (IBM Research) James Coomer (DDN) Sage Weil (RedHat/CEPH) Colin Morey (Hartree/STFC) Pam Gilman (NCAR) Martin Gasthuber (DESY) Friday 8:30 - 10:00 Simon From volobuev at us.ibm.com Mon Aug 29 21:31:17 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Mon, 29 Aug 2016 13:31:17 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu><5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com><7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: I would advise caution on using "mmdiag --iohist" heavily. In more recent code streams (V4.1, V4.2) there's a problem with internal locking that could, under certain conditions could lead to the symptoms that look very similar to sporadic network blockage. Basically, if "mmdiag --iohist" gets blocked for long periods of time (e.g. due to local disk/NFS performance issues), this may end up blocking an mmfsd receiver thread, delaying RPC processing. The problem was discovered fairly recently, and the fix hasn't made it out to all service streams yet. More generally, IO history is a valuable tool for troubleshooting disk IO performance issues, but the tool doesn't have the right semantics for regular, systemic IO performance sampling and monitoring. The query operation is too expensive, the coverage is subject to load, and the output is somewhat unstructured. With some effort, one can still build some form of a roll-your-own monitoring implement, but this is certainly not an optimal way of approaching the problem. 
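As an illustration of that kind of roll-your-own sampler (and of its limitations, not a supported tool): a minimal Python sketch that polls "mmdiag --iohist" at a generous interval and appends the raw output for later aggregation. The log path and interval are arbitrary example values, the script must run as root on the node being sampled, and, per the caution above, the interval should stay long enough that the query itself cannot back up the daemon.

import datetime
import subprocess
import time

MMDIAG = "/usr/lpp/mmfs/bin/mmdiag"
INTERVAL = 30                                   # seconds; keep this generous
LOGFILE = "/var/log/gpfs-iohist-samples.log"    # arbitrary example path

def sample_once(log):
    stamp = datetime.datetime.now().isoformat()
    try:
        # One short-lived query per interval; never run this in a tight loop.
        out = subprocess.check_output([MMDIAG, "--iohist"])
    except Exception as exc:
        log.write("%s sample failed: %s\n" % (stamp, exc))
        return
    log.write("=== %s ===\n" % stamp)
    log.write(out.decode("utf-8", "replace"))

if __name__ == "__main__":
    with open(LOGFILE, "a") as log:
        while True:
            sample_once(log)
            log.flush()
            time.sleep(INTERVAL)

Pairing a sampler like this with a larger ioHistorySize, as discussed earlier in the thread, reduces the chance of missing events between polls.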
The data should be available in a structured form, through a channel that supports light-weight, flexible querying that doesn't impact mainline IO processing. In Spectrum Scale, this type of data is fed from mmfsd to Zimon, via an mmpmon interface, and end users can then query Zimon for raw or partially processed data. Where it comes to high-volume stats, retaining raw data at its full resolution is only practical for relatively short periods of time (seconds, or perhaps a small number of minutes), and some form of aggregation is necessary for covering longer periods of time (hours to days). In the current versions of the product, there's a very similar type of data available this way: RPC stats. There are plans to make IO history data available in a similar fashion. The entire approach may need to be re-calibrated, however. Making RPC stats available doesn't appear to have generated a surge of user interest. This is probably because the data is too complex for casual processing, and while without doubt a lot of very valuable insight can be gained by analyzing RPC stats, the actual effort required to do so is too much for most users. That is, we need to provide some tools for raw data analytics. Largely the same argument applies to IO stats. In fact, on an NSD client IO stats are actually a subset of RPC stats. With some effort, one can perform a comprehensive analysis of NSD client IO stats by analyzing NSD client-to-server RPC traffic. One can certainly argue that the effort required is a bit much though. Getting back to the original question: would the proposed cxiWaitEventWait () change work? It'll likely result in nr_iowait being incremented every time a thread in GPFS code performs an uninterruptible wait. This could be an act of performing an actual IO request, or something else, e.g. waiting for a lock. Those may be the desirable semantics in some scenarios, but I wouldn't agree that it's the right behavior for any uninterruptible wait. io_schedule() is intended for use for block device IO waits, so using it this way is not in line with the code intent, which is never a good idea. Besides, relative to schedule(), io_schedule() has some overhead that could have performance implications of an uncertain nature. yuri From: Bryan Banister To: gpfsug main discussion list , Date: 08/29/2016 11:06 AM Subject: Re: [gpfsug-discuss] iowait? Sent by: gpfsug-discuss-bounces at spectrumscale.org Try this: mmchconfig ioHistorySize=1024 # Or however big you want! Cheers, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 1:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] iowait? That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. Is there a better way to grab the data or have gpfs capture more of it? Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. 
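As a concrete example of pulling structured data over that mmpmon channel today (cumulative I/O counters rather than wait times, which is exactly the gap under discussion): a minimal sketch using the documented fs_io_s request with parseable (-p) output. It assumes root access and the usual /usr/lpp/mmfs/bin path; the keyword names (_fs_, _br_, _bw_, _rdc_, _wc_) are the documented fs_io_s fields for file system name, bytes read/written and read/write calls.

import subprocess

MMPMON = "/usr/lpp/mmfs/bin/mmpmon"

def fs_io_counters():
    # Feed the fs_io_s request on stdin; -p asks for machine-readable
    # keyword/value output, one _fs_io_s_ line per mounted file system.
    proc = subprocess.Popen([MMPMON, "-p"],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate(b"fs_io_s\n")
    rows = []
    for line in out.decode("utf-8", "replace").splitlines():
        if line.startswith("_fs_io_s_"):
            fields = line.split()
            # After the response tag the line alternates keyword/value:
            # ... _fs_ <fsname> _br_ <bytes read> _bw_ <bytes written> ...
            rows.append(dict(zip(fields[1::2], fields[2::2])))
    return rows

if __name__ == "__main__":
    for row in fs_io_counters():
        print("%s bytes_read=%s bytes_written=%s"
              % (row.get("_fs_"), row.get("_br_"), row.get("_bw_")))

Deltas between successive polls give per-file-system rates; this is the same underlying interface that feeds Zimon, as noted above.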
-Aaron On 8/29/16 1:56 PM, Bryan Banister wrote: > There is the iohist data that may have what you're looking for, -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron > Knister > Sent: Monday, August 29, 2016 12:54 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] iowait? > > Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. > > On 8/29/16 1:50 PM, Alex Chekholko wrote: >> Any reason you can't just use iostat or collectl or any of a number >> of other standards tools to look at disk utilization? >> >> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>> Hi Everyone, >>> >>> Would it be easy to have GPFS report iowait values in linux? This >>> would be a huge help for us in determining whether a node's low >>> utilization is due to some issue with the code running on it or if >>> it's blocked on I/O, especially in a historical context. >>> >>> I naively tried on a test system changing schedule() in >>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>> >>> again: >>> /* call the scheduler */ >>> if ( waitFlags & INTERRUPTIBLE ) >>> schedule(); >>> else >>> io_schedule(); >>> >>> Seems to actually do what I'm after but generally bad things happen >>> when I start pretending I'm a kernel developer. >>> >>> Any thoughts? If I open an RFE would this be something that's >>> relatively easy to implement (not asking for a commitment *to* >>> implement it, just that I'm not asking for something seemingly >>> simple that's actually fairly hard to implement)? >>> >>> -Aaron >>> >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight > Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From aaron.s.knister at nasa.gov Mon Aug 29 23:58:34 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 18:58:34 -0400 Subject: [gpfsug-discuss] iowait? In-Reply-To: References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <8ec95af4-4d30-a904-4ba2-cf253460754a@nasa.gov> Thanks Yuri! I thought calling io_schedule was the right thing to do because the nfs client in the kernel did this directly until fairly recently. Now it calls wait_on_bit_io which I believe ultimately calls io_schedule. Do you see a more targeted approach for having GPFS register IO wait as something that's feasible? (e.g. not registering iowait for locks, as you suggested, but doing so for file/directory operations such as read/write/readdir?) -Aaron On 8/29/16 4:31 PM, Yuri L Volobuev wrote: > I would advise caution on using "mmdiag --iohist" heavily. In more > recent code streams (V4.1, V4.2) there's a problem with internal locking > that could, under certain conditions could lead to the symptoms that > look very similar to sporadic network blockage. Basically, if "mmdiag > --iohist" gets blocked for long periods of time (e.g. due to local > disk/NFS performance issues), this may end up blocking an mmfsd receiver > thread, delaying RPC processing. The problem was discovered fairly > recently, and the fix hasn't made it out to all service streams yet. 
> > More generally, IO history is a valuable tool for troubleshooting disk > IO performance issues, but the tool doesn't have the right semantics for > regular, systemic IO performance sampling and monitoring. The query > operation is too expensive, the coverage is subject to load, and the > output is somewhat unstructured. With some effort, one can still build > some form of a roll-your-own monitoring implement, but this is certainly > not an optimal way of approaching the problem. The data should be > available in a structured form, through a channel that supports > light-weight, flexible querying that doesn't impact mainline IO > processing. In Spectrum Scale, this type of data is fed from mmfsd to > Zimon, via an mmpmon interface, and end users can then query Zimon for > raw or partially processed data. Where it comes to high-volume stats, > retaining raw data at its full resolution is only practical for > relatively short periods of time (seconds, or perhaps a small number of > minutes), and some form of aggregation is necessary for covering longer > periods of time (hours to days). In the current versions of the product, > there's a very similar type of data available this way: RPC stats. There > are plans to make IO history data available in a similar fashion. The > entire approach may need to be re-calibrated, however. Making RPC stats > available doesn't appear to have generated a surge of user interest. > This is probably because the data is too complex for casual processing, > and while without doubt a lot of very valuable insight can be gained by > analyzing RPC stats, the actual effort required to do so is too much for > most users. That is, we need to provide some tools for raw data > analytics. Largely the same argument applies to IO stats. In fact, on an > NSD client IO stats are actually a subset of RPC stats. With some > effort, one can perform a comprehensive analysis of NSD client IO stats > by analyzing NSD client-to-server RPC traffic. One can certainly argue > that the effort required is a bit much though. > > Getting back to the original question: would the proposed > cxiWaitEventWait() change work? It'll likely result in nr_iowait being > incremented every time a thread in GPFS code performs an uninterruptible > wait. This could be an act of performing an actual IO request, or > something else, e.g. waiting for a lock. Those may be the desirable > semantics in some scenarios, but I wouldn't agree that it's the right > behavior for any uninterruptible wait. io_schedule() is intended for use > for block device IO waits, so using it this way is not in line with the > code intent, which is never a good idea. Besides, relative to > schedule(), io_schedule() has some overhead that could have performance > implications of an uncertain nature. > > yuri > > Inactive hide details for Bryan Banister ---08/29/2016 11:06:59 AM---Try > this: mmchconfig ioHistorySize=1024 # Or however big yBryan Banister > ---08/29/2016 11:06:59 AM---Try this: mmchconfig ioHistorySize=1024 # Or > however big you want! > > From: Bryan Banister > To: gpfsug main discussion list , > Date: 08/29/2016 11:06 AM > Subject: Re: [gpfsug-discuss] iowait? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! 
> > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy > node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting > requirements we calculate job efficiency by comparing the number of cpu > cores requested by a given job with the cpu % utilization during that > job's time window. Currently a job that's doing a sleep 9000 would show > up the same as a job blocked on I/O. Having GPFS wait time included in > iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. 
If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) > only and may contain proprietary, confidential or privileged > information. If you are not the intended recipient, you are hereby > notified that any review, dissemination or copying of this email is > strictly prohibited, and to please notify the sender immediately and > destroy this email and any attachments. Email transmission cannot be > guaranteed to be secure or error-free. The Company, therefore, does not > make any guarantees as to the completeness or accuracy of this email or > any attachments. This email is for informational purposes only and does > not constitute a recommendation, offer, request or solicitation of any > kind to buy, sell, subscribe, redeem or perform any type of transaction > of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From volobuev at us.ibm.com Tue Aug 30 06:09:21 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Mon, 29 Aug 2016 22:09:21 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <8ec95af4-4d30-a904-4ba2-cf253460754a@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu><5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com><7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> <8ec95af4-4d30-a904-4ba2-cf253460754a@nasa.gov> Message-ID: I don't see a simple fix that can be implemented by tweaking a general-purpose low-level synchronization primitive. It should be possible to integrate GPFS better into the Linux IO accounting infrastructure, but that would require some investigation a likely a non-trivial amount of work to do right. yuri From: Aaron Knister To: , Date: 08/29/2016 03:59 PM Subject: Re: [gpfsug-discuss] iowait? Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks Yuri! 
I thought calling io_schedule was the right thing to do because the nfs client in the kernel did this directly until fairly recently. Now it calls wait_on_bit_io which I believe ultimately calls io_schedule. Do you see a more targeted approach for having GPFS register IO wait as something that's feasible? (e.g. not registering iowait for locks, as you suggested, but doing so for file/directory operations such as read/write/readdir?) -Aaron On 8/29/16 4:31 PM, Yuri L Volobuev wrote: > I would advise caution on using "mmdiag --iohist" heavily. In more > recent code streams (V4.1, V4.2) there's a problem with internal locking > that could, under certain conditions could lead to the symptoms that > look very similar to sporadic network blockage. Basically, if "mmdiag > --iohist" gets blocked for long periods of time (e.g. due to local > disk/NFS performance issues), this may end up blocking an mmfsd receiver > thread, delaying RPC processing. The problem was discovered fairly > recently, and the fix hasn't made it out to all service streams yet. > > More generally, IO history is a valuable tool for troubleshooting disk > IO performance issues, but the tool doesn't have the right semantics for > regular, systemic IO performance sampling and monitoring. The query > operation is too expensive, the coverage is subject to load, and the > output is somewhat unstructured. With some effort, one can still build > some form of a roll-your-own monitoring implement, but this is certainly > not an optimal way of approaching the problem. The data should be > available in a structured form, through a channel that supports > light-weight, flexible querying that doesn't impact mainline IO > processing. In Spectrum Scale, this type of data is fed from mmfsd to > Zimon, via an mmpmon interface, and end users can then query Zimon for > raw or partially processed data. Where it comes to high-volume stats, > retaining raw data at its full resolution is only practical for > relatively short periods of time (seconds, or perhaps a small number of > minutes), and some form of aggregation is necessary for covering longer > periods of time (hours to days). In the current versions of the product, > there's a very similar type of data available this way: RPC stats. There > are plans to make IO history data available in a similar fashion. The > entire approach may need to be re-calibrated, however. Making RPC stats > available doesn't appear to have generated a surge of user interest. > This is probably because the data is too complex for casual processing, > and while without doubt a lot of very valuable insight can be gained by > analyzing RPC stats, the actual effort required to do so is too much for > most users. That is, we need to provide some tools for raw data > analytics. Largely the same argument applies to IO stats. In fact, on an > NSD client IO stats are actually a subset of RPC stats. With some > effort, one can perform a comprehensive analysis of NSD client IO stats > by analyzing NSD client-to-server RPC traffic. One can certainly argue > that the effort required is a bit much though. > > Getting back to the original question: would the proposed > cxiWaitEventWait() change work? It'll likely result in nr_iowait being > incremented every time a thread in GPFS code performs an uninterruptible > wait. This could be an act of performing an actual IO request, or > something else, e.g. waiting for a lock. 
Those may be the desirable > semantics in some scenarios, but I wouldn't agree that it's the right > behavior for any uninterruptible wait. io_schedule() is intended for use > for block device IO waits, so using it this way is not in line with the > code intent, which is never a good idea. Besides, relative to > schedule(), io_schedule() has some overhead that could have performance > implications of an uncertain nature. > > yuri > > Inactive hide details for Bryan Banister ---08/29/2016 11:06:59 AM---Try > this: mmchconfig ioHistorySize=1024 # Or however big yBryan Banister > ---08/29/2016 11:06:59 AM---Try this: mmchconfig ioHistorySize=1024 # Or > however big you want! > > From: Bryan Banister > To: gpfsug main discussion list , > Date: 08/29/2016 11:06 AM > Subject: Re: [gpfsug-discuss] iowait? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy > node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting > requirements we calculate job efficiency by comparing the number of cpu > cores requested by a given job with the cpu % utilization during that > job's time window. Currently a job that's doing a sleep 9000 would show > up the same as a job blocked on I/O. Having GPFS wait time included in > iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. 
>>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) > only and may contain proprietary, confidential or privileged > information. If you are not the intended recipient, you are hereby > notified that any review, dissemination or copying of this email is > strictly prohibited, and to please notify the sender immediately and > destroy this email and any attachments. Email transmission cannot be > guaranteed to be secure or error-free. The Company, therefore, does not > make any guarantees as to the completeness or accuracy of this email or > any attachments. This email is for informational purposes only and does > not constitute a recommendation, offer, request or solicitation of any > kind to buy, sell, subscribe, redeem or perform any type of transaction > of a financial product. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Tue Aug 30 09:34:33 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 30 Aug 2016 08:34:33 +0000 Subject: [gpfsug-discuss] CES network aliases Message-ID: Hi all, It's Tuesday morning and that means question time :) So from http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adv.doc/bl1adv_cesnetworkconfig.htm, I've extracted the following: How to use an alias To use an alias address for CES, you need to provide a static IP address that is not already defined as an alias in the /etc/sysconfig/network-scripts directory. Before you enable the node as a CES node, configure the network adapters for each subnet that are represented in the CES address pool: 1. Define a static IP address for the device: 2. /etc/sysconfig/network-scripts/ifcfg-eth0 3. DEVICE=eth1 4. BOOTPROTO=none 5. IPADDR=10.1.1.10 6. NETMASK=255.255.255.0 7. ONBOOT=yes 8. GATEWAY=10.1.1.1 TYPE=Ethernet 1. Ensure that there are no aliases that are defined in the network-scripts directory for this interface: 10.# ls -l /etc/sysconfig/network-scripts/ifcfg-eth1:* ls: /etc/sysconfig/network-scripts/ifcfg-eth1:*: No such file or directory After the node is enabled as a CES node, no further action is required. CES addresses are added as aliases to the already configured adapters. Now, does this mean for every floating (CES) IP address I need a separate ifcfg-ethX on each node? At the moment I simply have an ifcfg-X file representing each physical network adapter, and then the CES IPs defined. I can see IP addresses being added during failover to the primary interface, but now I've read I potentially need to create a separate file. What's the right way to move forward? If I need separate files, I presume the listed IP is a CES IP (not system) and does it also matter what X is in ifcfg-ethX? Many thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Tue Aug 30 10:54:31 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 30 Aug 2016 09:54:31 +0000 Subject: [gpfsug-discuss] CES network aliases In-Reply-To: References: Message-ID: You only need a static address for your ifcfg-ethX on all nodes, and can then have CES manage multiple floating addresses in that subnet. Also, it doesn't matter much what your interfaces are named (ethX, vlanX, bondX, ethX.5), GPFS will just find the interface that covers the floating address in its subnet, and add the alias there. -jf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From r.sobey at imperial.ac.uk Tue Aug 30 11:30:25 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 30 Aug 2016 10:30:25 +0000 Subject: [gpfsug-discuss] CES network aliases In-Reply-To: References: Message-ID: Ace thanks jf. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jan-Frode Myklebust Sent: 30 August 2016 10:55 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] CES network aliases You only need a static address for your ifcfg-ethX on all nodes, and can then have CES manage multiple floating addresses in that subnet. Also, it doesn't matter much what your interfaces are named (ethX, vlanX, bondX, ethX.5), GPFS will just find the interface that covers the floating address in its subnet, and add the alias there. -jf -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Tue Aug 30 15:58:41 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Tue, 30 Aug 2016 10:58:41 -0400 Subject: [gpfsug-discuss] Data Replication Message-ID: All, If I setup a filesystem to have data replication of 2 (2 copies of data), does the data get replicated at the NSD Server or at the client? i.e. Does the client send 2 copies over the network or does the NSD Server get a single copy and then replicate on storage NSDs? I couldn't find a place in the docs that talked about this specific point. Thank you, Brian Marshall -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Tue Aug 30 16:03:38 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Tue, 30 Aug 2016 15:03:38 +0000 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> The NSD Client handles the replication and will, as you stated, write one copy to one NSD (using the primary server for this NSD) and one to a different NSD in a different GPFS failure group (using quite likely, but not necessarily, a different NSD server that is the primary server for this alternate NSD). Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Brian Marshall Sent: Tuesday, August 30, 2016 9:59 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Data Replication All, If I setup a filesystem to have data replication of 2 (2 copies of data), does the data get replicated at the NSD Server or at the client? i.e. Does the client send 2 copies over the network or does the NSD Server get a single copy and then replicate on storage NSDs? I couldn't find a place in the docs that talked about this specific point. Thank you, Brian Marshall ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 30 17:16:37 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 30 Aug 2016 12:16:37 -0400 Subject: [gpfsug-discuss] gpfs native raid Message-ID: Does anyone know if/when we might see gpfs native raid opened up for the masses on non-IBM hardware? It's hard to answer the question of "why can't GPFS do this? Lustre can" in regards to Lustre's integration with ZFS and support for RAID on commodity hardware. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Tue Aug 30 17:26:38 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Tue, 30 Aug 2016 16:26:38 +0000 Subject: [gpfsug-discuss] gpfs native raid In-Reply-To: References: Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB06316445@CHI-EXCHANGEW1.w2k.jumptrading.com> I believe that Doug is going to provide more details at the NDA session at Edge... see attached, -B -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Tuesday, August 30, 2016 11:17 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] gpfs native raid Does anyone know if/when we might see gpfs native raid opened up for the masses on non-IBM hardware? It's hard to answer the question of "why can't GPFS do this? Lustre can" in regards to Lustre's integration with ZFS and support for RAID on commodity hardware. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An embedded message was scrubbed... From: Douglas O'flaherty Subject: [gpfsug-discuss] Edge Attendees Date: Mon, 29 Aug 2016 05:34:03 +0000 Size: 9615 URL: From cdmaestas at us.ibm.com Tue Aug 30 17:47:18 2016 From: cdmaestas at us.ibm.com (Christopher Maestas) Date: Tue, 30 Aug 2016 16:47:18 +0000 Subject: [gpfsug-discuss] gpfs native raid In-Reply-To: Message-ID: Interestingly enough, Spectrum Scale can run on zvols. 
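The zvol side is just ordinary ZFS administration -- a rough sketch, with the pool name, volume name and size made up here, and the sync=always requirement that comes up later in this thread:

    # carve a zvol out of an existing pool to use as a GPFS NSD
    zfs create -V 1T -o volblocksize=128k tank/nsd01
    # writes acknowledged to GPFS need to be on stable storage
    zfs set sync=always tank/nsd01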
Check out: http://files.gpfsug.org/presentations/2016/anl-june/LANL_GPFS_ZFS.pdf -cdm On Aug 30, 2016, 9:17:05 AM, aaron.s.knister at nasa.gov wrote: From: aaron.s.knister at nasa.gov To: gpfsug-discuss at spectrumscale.org Cc: Date: Aug 30, 2016 9:17:05 AM Subject: [gpfsug-discuss] gpfs native raid Does anyone know if/when we might see gpfs native raid opened up for the masses on non-IBM hardware? It's hard to answer the question of "why can't GPFS do this? Lustre can" in regards to Lustre's integration with ZFS and support for RAID on commodity hardware. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 30 18:16:03 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 30 Aug 2016 13:16:03 -0400 Subject: [gpfsug-discuss] gpfs native raid In-Reply-To: References: Message-ID: <96282850-6bfa-73ae-8502-9e8df3a56390@nasa.gov> Thanks Christopher. I've tried GPFS on zvols a couple times and the write throughput I get is terrible because of the required sync=always parameter. Perhaps a couple of SSD's could help get the number up, though. -Aaron On 8/30/16 12:47 PM, Christopher Maestas wrote: > Interestingly enough, Spectrum Scale can run on zvols. Check out: > > http://files.gpfsug.org/presentations/2016/anl-june/LANL_GPFS_ZFS.pdf > > -cdm > > ------------------------------------------------------------------------ > On Aug 30, 2016, 9:17:05 AM, aaron.s.knister at nasa.gov wrote: > > From: aaron.s.knister at nasa.gov > To: gpfsug-discuss at spectrumscale.org > Cc: > Date: Aug 30, 2016 9:17:05 AM > Subject: [gpfsug-discuss] gpfs native raid > > Does anyone know if/when we might see gpfs native raid opened up for the > masses on non-IBM hardware? It's hard to answer the question of "why > can't GPFS do this? Lustre can" in regards to Lustre's integration with > ZFS and support for RAID on commodity hardware. > -Aaron > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From laurence at qsplace.co.uk Tue Aug 30 19:50:51 2016 From: laurence at qsplace.co.uk (Laurence Horrocks-Barlow) Date: Tue, 30 Aug 2016 20:50:51 +0200 Subject: [gpfsug-discuss] Data Replication In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: Its the client that does all the synchronous replication, this way the cluster is able to scale as the clients do the leg work (so to speak). 
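As a practical aside, the settings involved can be checked and changed per file system -- a sketch only, with "gpfs01" standing in for your file system name:

    # default and maximum replicas for data (-r/-R) and metadata (-m/-M)
    mmlsfs gpfs01 -r -R -m -M
    # raise the defaults so newly created files get two copies
    mmchfs gpfs01 -r 2 -m 2
    # re-replicate existing files to match the new defaults
    mmrestripefs gpfs01 -R

(The maximum values are normally fixed when the file system is created, so check them first.)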
The somewhat "exception" is if a GPFS NSD server (or client with direct NSD) access uses a server bases protocol such as SMB, in this case the SMB server will do the replication as the SMB client doesn't know about GPFS or its replication; essentially the SMB server is the GPFS client. -- Lauz On 30 August 2016 17:03:38 CEST, Bryan Banister wrote: >The NSD Client handles the replication and will, as you stated, write >one copy to one NSD (using the primary server for this NSD) and one to >a different NSD in a different GPFS failure group (using quite likely, >but not necessarily, a different NSD server that is the primary server >for this alternate NSD). >Cheers, >-Bryan > >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Brian >Marshall >Sent: Tuesday, August 30, 2016 9:59 AM >To: gpfsug main discussion list >Subject: [gpfsug-discuss] Data Replication > >All, > >If I setup a filesystem to have data replication of 2 (2 copies of >data), does the data get replicated at the NSD Server or at the client? >i.e. Does the client send 2 copies over the network or does the NSD >Server get a single copy and then replicate on storage NSDs? > >I couldn't find a place in the docs that talked about this specific >point. > >Thank you, >Brian Marshall > >________________________________ > >Note: This email is for the confidential use of the named addressee(s) >only and may contain proprietary, confidential or privileged >information. If you are not the intended recipient, you are hereby >notified that any review, dissemination or copying of this email is >strictly prohibited, and to please notify the sender immediately and >destroy this email and any attachments. Email transmission cannot be >guaranteed to be secure or error-free. The Company, therefore, does not >make any guarantees as to the completeness or accuracy of this email or >any attachments. This email is for informational purposes only and does >not constitute a recommendation, offer, request or solicitation of any >kind to buy, sell, subscribe, redeem or perform any type of transaction >of a financial product. > > >------------------------------------------------------------------------ > >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Tue Aug 30 19:52:54 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Tue, 30 Aug 2016 14:52:54 -0400 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: Thanks. This confirms the numbers that I am seeing. Brian On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow < laurence at qsplace.co.uk> wrote: > Its the client that does all the synchronous replication, this way the > cluster is able to scale as the clients do the leg work (so to speak). > > The somewhat "exception" is if a GPFS NSD server (or client with direct > NSD) access uses a server bases protocol such as SMB, in this case the SMB > server will do the replication as the SMB client doesn't know about GPFS or > its replication; essentially the SMB server is the GPFS client. 
> > -- Lauz > > On 30 August 2016 17:03:38 CEST, Bryan Banister > wrote: > >> The NSD Client handles the replication and will, as you stated, write one >> copy to one NSD (using the primary server for this NSD) and one to a >> different NSD in a different GPFS failure group (using quite likely, but >> not necessarily, a different NSD server that is the primary server for this >> alternate NSD). >> >> Cheers, >> >> -Bryan >> >> >> >> *From:* gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss- >> bounces at spectrumscale.org] *On Behalf Of *Brian Marshall >> *Sent:* Tuesday, August 30, 2016 9:59 AM >> *To:* gpfsug main discussion list >> *Subject:* [gpfsug-discuss] Data Replication >> >> >> >> All, >> >> >> >> If I setup a filesystem to have data replication of 2 (2 copies of data), >> does the data get replicated at the NSD Server or at the client? i.e. Does >> the client send 2 copies over the network or does the NSD Server get a >> single copy and then replicate on storage NSDs? >> >> >> >> I couldn't find a place in the docs that talked about this specific point. >> >> >> >> Thank you, >> >> Brian Marshall >> >> >> ------------------------------ >> >> Note: This email is for the confidential use of the named addressee(s) >> only and may contain proprietary, confidential or privileged information. >> If you are not the intended recipient, you are hereby notified that any >> review, dissemination or copying of this email is strictly prohibited, and >> to please notify the sender immediately and destroy this email and any >> attachments. Email transmission cannot be guaranteed to be secure or >> error-free. The Company, therefore, does not make any guarantees as to the >> completeness or accuracy of this email or any attachments. This email is >> for informational purposes only and does not constitute a recommendation, >> offer, request or solicitation of any kind to buy, sell, subscribe, redeem >> or perform any type of transaction of a financial product. >> >> ------------------------------ >> >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > -- > Sent from my Android device with K-9 Mail. Please excuse my brevity. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Aug 30 20:09:05 2016 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 30 Aug 2016 19:09:05 +0000 Subject: [gpfsug-discuss] Maximum value for data replication? Message-ID: Is there a maximum value for data replication in Spectrum Scale? I have a number of nsd servers which have local storage and Id like each node to have a full copy of all the data in the file-system, say this value is 4, can I set replication to 4 for data and metadata and have each server have a full copy? These are protocol nodes and multi cluster mount another file system (yes I know not supported) and the cesroot is in the remote file system. On several occasions where GPFS has wibbled a bit, this has caused issues with ces locks, so I was thinking of moving the cesroot to a local filesysyem which is replicated on the local ssds in the protocol nodes. I.e. Its a generally quiet file system as its only ces cluster config. 
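In concrete terms, presumably something along these lines (a sketch only -- the paths are made up, and the docs should be checked for the exact conditions under which cesSharedRoot may be changed):

    # stop protocol services on all CES nodes
    mmces service stop NFS -a
    mmces service stop SMB -a
    # copy the existing shared root into the locally replicated file system
    rsync -a /gpfs/remotefs/ces/ /gpfs/localfs/ces/
    # point CES at the new location and restart services
    mmchconfig cesSharedRoot=/gpfs/localfs/ces
    mmces service start NFS -a
    mmces service start SMB -a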
I assume if I stop protocols, rsync the data and then change to the new ces root, I should be able to get this working? Thanks Simon From kevindjo at us.ibm.com Tue Aug 30 20:43:39 2016 From: kevindjo at us.ibm.com (Kevin D Johnson) Date: Tue, 30 Aug 2016 19:43:39 +0000 Subject: [gpfsug-discuss] greetings Message-ID: An HTML attachment was scrubbed... URL: From xhejtman at ics.muni.cz Tue Aug 30 21:39:18 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Tue, 30 Aug 2016 22:39:18 +0200 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> Hello, does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount any file system. The internal mount cmd gets stuck. -- Lukáš Hejtmánek From kevindjo at us.ibm.com Tue Aug 30 21:51:39 2016 From: kevindjo at us.ibm.com (Kevin D Johnson) Date: Tue, 30 Aug 2016 20:51:39 +0000 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> Message-ID: An HTML attachment was scrubbed... URL: From mark.bergman at uphs.upenn.edu Tue Aug 30 22:07:21 2016 From: mark.bergman at uphs.upenn.edu (mark.bergman at uphs.upenn.edu) Date: Tue, 30 Aug 2016 17:07:21 -0400 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: Your message of "Tue, 30 Aug 2016 22:39:18 +0200." <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> Message-ID: <24437-1472591241.445832@bR6O.TofS.917u>
As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek From oehmes at gmail.com Wed Aug 31 00:24:59 2016 From: oehmes at gmail.com (Sven Oehme) Date: Tue, 30 Aug 2016 16:24:59 -0700 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" In-Reply-To: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> References: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> Message-ID: so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek wrote: > Hello, > > On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > > Find the paper here: > > > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/ > Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection > > thank you for the paper, I appreciate it. > > However, I wonder whether it could be extended a little. As it has the > title > Petascale Data Protection, I think that in Peta scale, you have to deal > with > millions (well rather hundreds of millions) of files you store in and this > is > something where TSM does not scale well. > > Could you give some hints: > > On the backup site: > mmbackup takes ages for: > a) scan (try to scan 500M files even in parallel) > b) backup - what if 10 % of files get changed - backup process can be > blocked > several days as mmbackup cannot run in several instances on the same file > system, so you have to wait until one run of mmbackup finishes. How long > could > it take at petascale? > > On the restore site: > how can I restore e.g. 40 millions of file efficiently? dsmc restore > '/path/*' > runs into serious troubles after say 20M files (maybe wrong internal > structures used), however, scanning 1000 more files takes several minutes > resulting the dsmc restore never reaches that 40M files. > > using filelists the situation is even worse. I run dsmc restore -filelist > with a filelist consisting of 2.4M files. Running for *two* days without > restoring even a single file. dsmc is consuming 100 % CPU. 
> > So any hints addressing these issues with really large number of files > would > be even more appreciated. > > -- > Luk?? Hejtm?nek > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Wed Aug 31 05:00:45 2016 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Wed, 31 Aug 2016 04:00:45 +0000 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" References: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Message-ID: <5F910253243E6A47B81A9A2EB424BBA101CFF7DB@NDMSMBX404.ndc.nasa.gov> Just want to add on to one of the points Sven touched on regarding metadata HW. We have a modest SSD infrastructure for our metadata disks and we can scan 500M inodes in parallel in about 5 hours if my memory serves me right (and I believe we could go faster if we really wanted to). I think having solid metadata disks (no pun intended) will really help with scan times. From: Sven Oehme Sent: 8/30/16, 7:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek > wrote: Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? 
Hejtm?nek _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Aug 31 05:52:57 2016 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 31 Aug 2016 06:52:57 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" In-Reply-To: <5F910253243E6A47B81A9A2EB424BBA101CFF7DB@NDMSMBX404.ndc.nasa.gov> References: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" <5F910253243E6A47B81A9A2EB424BBA101CFF7DB@NDMSMBX404.ndc.nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: From dominic.mueller at de.ibm.com Wed Aug 31 06:52:38 2016 From: dominic.mueller at de.ibm.com (Dominic Mueller-Wicke01) Date: Wed, 31 Aug 2016 07:52:38 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Dominic Mueller-Wicke) In-Reply-To: References: Message-ID: Thanks for reading the paper. I agree that the restore of a large number of files is a challenge today. The restore is the focus area for future enhancements for the integration between IBM Spectrum Scale and IBM Spectrum Protect. If something will be available that helps to improve the restore capabilities the paper will be updated with this information. Greetings, Dominic. From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 31.08.2016 01:25 Subject: gpfsug-discuss Digest, Vol 55, Issue 55 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Maximum value for data replication? (Simon Thompson (Research Computing - IT Services)) 2. greetings (Kevin D Johnson) 3. GPFS 3.5.0 on RHEL 6.8 (Lukas Hejtmanek) 4. Re: GPFS 3.5.0 on RHEL 6.8 (Kevin D Johnson) 5. Re: GPFS 3.5.0 on RHEL 6.8 (mark.bergman at uphs.upenn.edu) 6. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Lukas Hejtmanek) 7. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Sven Oehme) ----- Message from "Simon Thompson (Research Computing - IT Services)" on Tue, 30 Aug 2016 19:09:05 +0000 ----- To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] Maximum value for data replication? Is there a maximum value for data replication in Spectrum Scale? I have a number of nsd servers which have local storage and Id like each node to have a full copy of all the data in the file-system, say this value is 4, can I set replication to 4 for data and metadata and have each server have a full copy? These are protocol nodes and multi cluster mount another file system (yes I know not supported) and the cesroot is in the remote file system. On several occasions where GPFS has wibbled a bit, this has caused issues with ces locks, so I was thinking of moving the cesroot to a local filesysyem which is replicated on the local ssds in the protocol nodes. I.e. 
Its a generally quiet file system as its only ces cluster config. I assume if I stop protocols, rsync the data and then change to the new ces root, I should be able to get this working? Thanks Simon ----- Message from "Kevin D Johnson" on Tue, 30 Aug 2016 19:43:39 +0000 ----- To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] greetings I'm in Lab Services at IBM - just joining and happy to help any way I can. Kevin D. Johnson, MBA, MAFM Spectrum Computing, Senior Managing Consultant IBM Certified Deployment Professional - Spectrum Scale V4.1.1 IBM Certified Deployment Professional - Cloud Object Storage V3.8 720.349.6199 - kevindjo at us.ibm.com ----- Message from Lukas Hejtmanek on Tue, 30 Aug 2016 22:39:18 +0200 ----- To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Hello, does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount and file system. The internal mount cmd gets stucked. -- Luk?? Hejtm?nek ----- Message from "Kevin D Johnson" on Tue, 30 Aug 2016 20:51:39 +0000 ----- To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 RHEL 6.8/2.6.32-642 requires 4.1.1.8 or 4.2.1. You can either go to 6.7 for GPFS 3.5 or bump it up to 7.0/7.1. See Table 13, here: http://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html?view=kc#linuxq Kevin D. Johnson, MBA, MAFM Spectrum Computing, Senior Managing Consultant IBM Certified Deployment Professional - Spectrum Scale V4.1.1 IBM Certified Deployment Professional - Cloud Object Storage V3.8 720.349.6199 - kevindjo at us.ibm.com ----- Original message ----- From: Lukas Hejtmanek Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Date: Tue, Aug 30, 2016 4:39 PM Hello, does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount and file system. The internal mount cmd gets stucked. -- Luk?? Hejtm?nek _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ----- Message from mark.bergman at uphs.upenn.edu on Tue, 30 Aug 2016 17:07:21 -0400 ----- To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, The pithy ruminations from Lukas Hejtmanek on <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: => Hello, GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, but at kernel 2.6.32-573 and lower. I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel revs that caused multipath errors, resulting in GPFS being unable to find all NSDs and mount the filesystem. I am not updating to a newer kernel until I'm certain this is resolved. I opened a bug with CentOS: https://bugs.centos.org/view.php?id=10997 and began an extended discussion with the (RH & SUSE) developers of that chunk of kernel code. I don't know if an upstream bug has been opened by RH, but see: https://patchwork.kernel.org/patch/9140337/ => => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the => latest patch 32) does start but does not mount and file system. The internal => mount cmd gets stucked. => => -- => Luk?? 
Hejtm?nek -- Mark Bergman voice: 215-746-4061 mark.bergman at uphs.upenn.edu fax: 215-614-0266 http://www.cbica.upenn.edu/ IT Technical Director, Center for Biomedical Image Computing and Analytics Department of Radiology University of Pennsylvania PGP Key: http://www.cbica.upenn.edu/sbia/bergman ----- Message from Lukas Hejtmanek on Wed, 31 Aug 2016 00:02:50 +0200 ----- To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek ----- Message from Sven Oehme on Tue, 30 Aug 2016 16:24:59 -0700 ----- To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek wrote: Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. 
How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From xhejtman at ics.muni.cz Wed Aug 31 08:03:08 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Wed, 31 Aug 2016 09:03:08 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Dominic Mueller-Wicke) In-Reply-To: References: Message-ID: <20160831070308.fiogolgc2nhna6ir@ics.muni.cz> On Wed, Aug 31, 2016 at 07:52:38AM +0200, Dominic Mueller-Wicke01 wrote: > Thanks for reading the paper. I agree that the restore of a large number of > files is a challenge today. The restore is the focus area for future > enhancements for the integration between IBM Spectrum Scale and IBM > Spectrum Protect. If something will be available that helps to improve the > restore capabilities the paper will be updated with this information. I guess that one of the reasons that restore is slow is because this: (strace dsmc) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases/stud/atl_en/_referencenotitsig", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases/stud/atl_en", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases/stud", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home", F_OK) = 0 [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum", F_OK) = 0 it seems that dsmc tests access again and again up to root for each item in the file list if I set different location where to place the restored files. -- Luk?? 
Hejtm?nek From duersch at us.ibm.com Wed Aug 31 13:45:12 2016 From: duersch at us.ibm.com (Steve Duersch) Date: Wed, 31 Aug 2016 08:45:12 -0400 Subject: [gpfsug-discuss] Maximum value for data replication? In-Reply-To: References: Message-ID: >>Is there a maximum value for data replication in Spectrum Scale? The maximum value for replication is 3. Steve Duersch Spectrum Scale RAID 845-433-7902 IBM Poughkeepsie, New York From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 08/30/2016 07:25 PM Subject: gpfsug-discuss Digest, Vol 55, Issue 55 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Maximum value for data replication? (Simon Thompson (Research Computing - IT Services)) 2. greetings (Kevin D Johnson) 3. GPFS 3.5.0 on RHEL 6.8 (Lukas Hejtmanek) 4. Re: GPFS 3.5.0 on RHEL 6.8 (Kevin D Johnson) 5. Re: GPFS 3.5.0 on RHEL 6.8 (mark.bergman at uphs.upenn.edu) 6. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Lukas Hejtmanek) 7. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Sven Oehme) ---------------------------------------------------------------------- Message: 1 Date: Tue, 30 Aug 2016 19:09:05 +0000 From: "Simon Thompson (Research Computing - IT Services)" To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] Maximum value for data replication? Message-ID: Content-Type: text/plain; charset="us-ascii" Is there a maximum value for data replication in Spectrum Scale? I have a number of nsd servers which have local storage and Id like each node to have a full copy of all the data in the file-system, say this value is 4, can I set replication to 4 for data and metadata and have each server have a full copy? These are protocol nodes and multi cluster mount another file system (yes I know not supported) and the cesroot is in the remote file system. On several occasions where GPFS has wibbled a bit, this has caused issues with ces locks, so I was thinking of moving the cesroot to a local filesysyem which is replicated on the local ssds in the protocol nodes. I.e. Its a generally quiet file system as its only ces cluster config. I assume if I stop protocols, rsync the data and then change to the new ces root, I should be able to get this working? Thanks Simon ------------------------------ Message: 2 Date: Tue, 30 Aug 2016 19:43:39 +0000 From: "Kevin D Johnson" To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] greetings Message-ID: Content-Type: text/plain; charset="us-ascii" An HTML attachment was scrubbed... URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20160830/5a2e22a3/attachment-0001.html > ------------------------------ Message: 3 Date: Tue, 30 Aug 2016 22:39:18 +0200 From: Lukas Hejtmanek To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: <20160830203917.qptfgqvlmdbzu6wr at ics.muni.cz> Content-Type: text/plain; charset=iso-8859-2 Hello, does it work for anyone? 
As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount and file system. The internal mount cmd gets stucked. -- Luk?? Hejtm?nek ------------------------------ Message: 4 Date: Tue, 30 Aug 2016 20:51:39 +0000 From: "Kevin D Johnson" To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: Content-Type: text/plain; charset="us-ascii" An HTML attachment was scrubbed... URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20160830/341d5e11/attachment-0001.html > ------------------------------ Message: 5 Date: Tue, 30 Aug 2016 17:07:21 -0400 From: mark.bergman at uphs.upenn.edu To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: <24437-1472591241.445832 at bR6O.TofS.917u> Content-Type: text/plain; charset="UTF-8" In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, The pithy ruminations from Lukas Hejtmanek on <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: => Hello, GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, but at kernel 2.6.32-573 and lower. I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel revs that caused multipath errors, resulting in GPFS being unable to find all NSDs and mount the filesystem. I am not updating to a newer kernel until I'm certain this is resolved. I opened a bug with CentOS: https://bugs.centos.org/view.php?id=10997 and began an extended discussion with the (RH & SUSE) developers of that chunk of kernel code. I don't know if an upstream bug has been opened by RH, but see: https://patchwork.kernel.org/patch/9140337/ => => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the => latest patch 32) does start but does not mount and file system. The internal => mount cmd gets stucked. => => -- => Luk?? Hejtm?nek -- Mark Bergman voice: 215-746-4061 mark.bergman at uphs.upenn.edu fax: 215-614-0266 http://www.cbica.upenn.edu/ IT Technical Director, Center for Biomedical Image Computing and Analytics Department of Radiology University of Pennsylvania PGP Key: http://www.cbica.upenn.edu/sbia/bergman ------------------------------ Message: 6 Date: Wed, 31 Aug 2016 00:02:50 +0200 From: Lukas Hejtmanek To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Message-ID: <20160830220250.yt6r7gvfq7rlvtcs at ics.muni.cz> Content-Type: text/plain; charset=iso-8859-2 Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? 
dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek ------------------------------ Message: 7 Date: Tue, 30 Aug 2016 16:24:59 -0700 From: Sven Oehme To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Message-ID: Content-Type: text/plain; charset="utf-8" so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek wrote: > Hello, > > On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > > Find the paper here: > > > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/ > Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection > > thank you for the paper, I appreciate it. > > However, I wonder whether it could be extended a little. As it has the > title > Petascale Data Protection, I think that in Peta scale, you have to deal > with > millions (well rather hundreds of millions) of files you store in and this > is > something where TSM does not scale well. > > Could you give some hints: > > On the backup site: > mmbackup takes ages for: > a) scan (try to scan 500M files even in parallel) > b) backup - what if 10 % of files get changed - backup process can be > blocked > several days as mmbackup cannot run in several instances on the same file > system, so you have to wait until one run of mmbackup finishes. How long > could > it take at petascale? > > On the restore site: > how can I restore e.g. 40 millions of file efficiently? dsmc restore > '/path/*' > runs into serious troubles after say 20M files (maybe wrong internal > structures used), however, scanning 1000 more files takes several minutes > resulting the dsmc restore never reaches that 40M files. > > using filelists the situation is even worse. I run dsmc restore -filelist > with a filelist consisting of 2.4M files. Running for *two* days without > restoring even a single file. dsmc is consuming 100 % CPU. > > So any hints addressing these issues with really large number of files > would > be even more appreciated. > > -- > Luk?? Hejtm?nek > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20160830/d9b3fb68/attachment.html > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 55, Issue 55 ********************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From daniel.kidger at uk.ibm.com Wed Aug 31 15:32:11 2016 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Wed, 31 Aug 2016 14:32:11 +0000 Subject: [gpfsug-discuss] Data Replication In-Reply-To: Message-ID: The other 'Exception' is when a rule is used to convert a 1 way replicated file to 2 way, or when only one failure group is up due to HW problems. It that case the (re-replication) is done by whatever nodes are used for the rule or command-line, which may include an NSD server. Daniel IBM Spectrum Storage Software +44 (0)7818 522266 Sent from my iPad using IBM Verse On 30 Aug 2016, 19:53:31, mimarsh2 at vt.edu wrote: From: mimarsh2 at vt.edu To: gpfsug-discuss at spectrumscale.org Cc: Date: 30 Aug 2016 19:53:31 Subject: Re: [gpfsug-discuss] Data Replication Thanks. This confirms the numbers that I am seeing. Brian On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow wrote: Its the client that does all the synchronous replication, this way the cluster is able to scale as the clients do the leg work (so to speak). The somewhat "exception" is if a GPFS NSD server (or client with direct NSD) access uses a server bases protocol such as SMB, in this case the SMB server will do the replication as the SMB client doesn't know about GPFS or its replication; essentially the SMB server is the GPFS client. -- Lauz On 30 August 2016 17:03:38 CEST, Bryan Banister wrote: The NSD Client handles the replication and will, as you stated, write one copy to one NSD (using the primary server for this NSD) and one to a different NSD in a different GPFS failure group (using quite likely, but not necessarily, a different NSD server that is the primary server for this alternate NSD). Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Brian Marshall Sent: Tuesday, August 30, 2016 9:59 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Data Replication All, If I setup a filesystem to have data replication of 2 (2 copies of data), does the data get replicated at the NSD Server or at the client? i.e. Does the client send 2 copies over the network or does the NSD Server get a single copy and then replicate on storage NSDs? I couldn't find a place in the docs that talked about this specific point. Thank you, Brian Marshall Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Sent from my Android device with K-9 Mail. Please excuse my brevity. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discussUnless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Wed Aug 31 19:01:45 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Wed, 31 Aug 2016 14:01:45 -0400 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: Message-ID: Daniel, So here's my use case: I have a Sandisk IF150 (branded as DeepFlash recently) with 128TB of flash acting as a "fast tier" storage pool in our HPC scratch file system. Can I set the filesystem replication level to 1 then write a policy engine rule to send small and/or recent files to the IF150 with a replication of 2? Any other comments on the proposed usage strategy are helpful. Thank you, Brian Marshall On Wed, Aug 31, 2016 at 10:32 AM, Daniel Kidger wrote: > The other 'Exception' is when a rule is used to convert a 1 way replicated > file to 2 way, or when only one failure group is up due to HW problems. It > that case the (re-replication) is done by whatever nodes are used for the > rule or command-line, which may include an NSD server. > > Daniel > > IBM Spectrum Storage Software > +44 (0)7818 522266 <+44%207818%20522266> > Sent from my iPad using IBM Verse > > > ------------------------------ > On 30 Aug 2016, 19:53:31, mimarsh2 at vt.edu wrote: > > From: mimarsh2 at vt.edu > To: gpfsug-discuss at spectrumscale.org > Cc: > Date: 30 Aug 2016 19:53:31 > Subject: Re: [gpfsug-discuss] Data Replication > > > Thanks. This confirms the numbers that I am seeing. > > Brian > > On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow < > laurence at qsplace.co.uk> wrote: > >> Its the client that does all the synchronous replication, this way the >> cluster is able to scale as the clients do the leg work (so to speak). >> >> The somewhat "exception" is if a GPFS NSD server (or client with direct >> NSD) access uses a server bases protocol such as SMB, in this case the SMB >> server will do the replication as the SMB client doesn't know about GPFS or >> its replication; essentially the SMB server is the GPFS client. >> >> -- Lauz >> >> On 30 August 2016 17:03:38 CEST, Bryan Banister < >> bbanister at jumptrading.com> wrote: >> >>> The NSD Client handles the replication and will, as you stated, write >>> one copy to one NSD (using the primary server for this NSD) and one to a >>> different NSD in a different GPFS failure group (using quite likely, but >>> not necessarily, a different NSD server that is the primary server for this >>> alternate NSD). 
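For anyone wanting to verify this behaviour on a live file system, the replication the client actually applied to a file can be inspected, and changed, after the fact. Illustrative commands only (the path is made up):

    # show metadata/data replication factors and the storage pool for one file
    mmlsattr -L /gpfs/fs0/some/dir/file.dat

    # raise the data replication of an existing file to 2; the extra copy is
    # written immediately unless '-I defer' is given, in which case a later
    # 'mmrestripefs -r' completes it
    mmchattr -r 2 /gpfs/fs0/some/dir/file.dat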
>>> >>> Cheers, >>> >>> -Bryan >>> >>> >>> >>> *From:* gpfsug-discuss-bounces at spectrumscale.org [mailto: >>> gpfsug-discuss-bounces at spectrumscale.org] *On Behalf Of *Brian Marshall >>> *Sent:* Tuesday, August 30, 2016 9:59 AM >>> *To:* gpfsug main discussion list >>> *Subject:* [gpfsug-discuss] Data Replication >>> >>> >>> >>> All, >>> >>> >>> >>> If I setup a filesystem to have data replication of 2 (2 copies of >>> data), does the data get replicated at the NSD Server or at the client? >>> i.e. Does the client send 2 copies over the network or does the NSD Server >>> get a single copy and then replicate on storage NSDs? >>> >>> >>> >>> I couldn't find a place in the docs that talked about this specific >>> point. >>> >>> >>> >>> Thank you, >>> >>> Brian Marshall >>> >>> >>> ------------------------------ >>> >>> Note: This email is for the confidential use of the named addressee(s) >>> only and may contain proprietary, confidential or privileged information. >>> If you are not the intended recipient, you are hereby notified that any >>> review, dissemination or copying of this email is strictly prohibited, and >>> to please notify the sender immediately and destroy this email and any >>> attachments. Email transmission cannot be guaranteed to be secure or >>> error-free. The Company, therefore, does not make any guarantees as to the >>> completeness or accuracy of this email or any attachments. This email is >>> for informational purposes only and does not constitute a recommendation, >>> offer, request or solicitation of any kind to buy, sell, subscribe, redeem >>> or perform any type of transaction of a financial product. >>> >>> ------------------------------ >>> >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >> -- >> Sent from my Android device with K-9 Mail. Please excuse my brevity. >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Aug 31 19:10:07 2016 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 31 Aug 2016 14:10:07 -0400 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" - how about a Billion files in 140 seconds? In-Reply-To: References: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> Message-ID: When you write something like "mmbackup takes ages" - that let's us know how you feel, kinda. But we need some facts and data to make a determination if there is a real problem and whether and how it might be improved. Just to do a "back of the envelope" estimate of how long backup operations "ought to" take - we'd need to know how many disks and/or SSDs with what performance characteristics, how many nodes withf what performance characteristics, network "fabric(s)", Number of files to be scanned, Average number of files per directory, GPFS blocksize(s) configured, Backup devices available with speeds and feeds, etc, etc. 
But anyway just to throw ballpark numbers "out there" to give you an idea of what is possible. I can tell you that a 20 months ago Sven and I benchmarked mmapplypolicy scanning 983 Million files in 136 seconds! The command looked like this: mmapplypolicy /ibm/fs2-1m-p01/shared/Btt -g /ibm/fs2-1m-p01/tmp -d 7 -A 256 -a 32 -n 8 -P /ghome/makaplan/sventests/milli.policy -I test -L 1 -N fastclients fastclients was 10 X86_64 commodity nodes The fs2-1m-p01 file system was hosted on just two IBM GSS nodes and everything was on an Infiniband switch. We packed about 7000 files into each directory.... (This admittedly may not be typical...) This is NOT to say you could back up that many files that fast, but Spectrum Scale metadata scanning can be fast, even with relatively modest hardware resources. YMMV ;-) Marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From xhejtman at ics.muni.cz Wed Aug 31 19:39:26 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Wed, 31 Aug 2016 20:39:26 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" In-Reply-To: References: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> Message-ID: <20160831183926.k4mbwbbrmxybd7a3@ics.muni.cz> On Tue, Aug 30, 2016 at 04:24:59PM -0700, Sven Oehme wrote: > so lets start with some simple questions. > > when you say mmbackup takes ages, what version of gpfs code are you running > ? that was GPFS 3.5.0-8. The mmapplypolicy took over 2 hours but that was the least problem. We developed our own set of backups scripts around mmbackup to address these issues: 1) while mmbackup is running, you cannot run another instance on the same file system. 2) mmbackup can be very slow, but not mmbackup itself but consecutive dsmc selective, sorry for being misleading, but mainly due to the large number of files to be backed up 3) related to the previous, mmbackup scripts seem to be executing a 'grep' cmd for every input file to check whether it has entry in dmsc output log. well guess what happens if you have millions of files at the input and several gigabytes in dsmc outpu log... In our case, the grep storm took several *weeks*. 4) very surprisingly, some of the files were not backed up at all. We cannot find why but dsmc incremental found some old files that were not covered by mmbackup backups. Maybe because the mmbackup process was not gracefully terminated in some cases (node crash) and so on. > how do you execute the mmbackup command ? exact parameters would be useful > . /usr/lpp/mmfs/bin/mmbackup tape_tape -t incremental -v -N fe1 -P ${POLICY_FILE} --tsm-servers SERVER1 -g /gpfs/clusterbase/tmp/ -s /tmp -m 4 -B 9999999999999 -L 0 we had external exec script that split files from policy into chunks that were run in parallel. > what HW are you using for the metadata disks ? 4x SSD > how much capacity (df -h) and how many inodes (df -i) do you have in the > filesystem you try to backup ? 
df -h /dev/tape_tape 1.5P 745T 711T 52% /exports/tape_tape df -hi /dev/tape_tape 1.3G 98M 1.2G 8% /exports/tape_tape (98M inodes used) mmdf tape_tape disk disk size failure holds holds free KB free KB name in KB group metadata data in full blocks in fragments --------------- ------------- -------- -------- ----- -------------------- ------------------- Disks in storage pool: system (Maximum disk size allowed is 175 TB) nsd_t1_5 23437934592 1 No Yes 7342735360 ( 31%) 133872128 ( 1%) nsd_t1_6 23437934592 1 No Yes 7341166592 ( 31%) 133918784 ( 1%) nsd_t1b_2 23437934592 1 No Yes 7343919104 ( 31%) 134165056 ( 1%) nsd_t1b_3 23437934592 1 No Yes 7341283328 ( 31%) 133986560 ( 1%) nsd_ssd_4 770703360 2 Yes No 692172800 ( 90%) 15981952 ( 2%) nsd_ssd_3 770703360 2 Yes No 692252672 ( 90%) 15921856 ( 2%) nsd_ssd_2 770703360 2 Yes No 692189184 ( 90%) 15928832 ( 2%) nsd_ssd_1 770703360 2 Yes No 692197376 ( 90%) 16013248 ( 2%) ------------- -------------------- ------------------- (pool total) 96834551808 32137916416 ( 33%) 599788416 ( 1%) Disks in storage pool: maid (Maximum disk size allowed is 466 TB) nsd8_t2_12 31249989632 1 No Yes 13167828992 ( 42%) 36282048 ( 0%) nsd8_t2_13 31249989632 1 No Yes 13166729216 ( 42%) 36131072 ( 0%) nsd8_t2_14 31249989632 1 No Yes 13166886912 ( 42%) 36371072 ( 0%) nsd8_t2_15 31249989632 1 No Yes 13168209920 ( 42%) 36681728 ( 0%) nsd8_t2_16 31249989632 1 No Yes 13165176832 ( 42%) 36279488 ( 0%) nsd8_t2_17 31249989632 1 No Yes 13159870464 ( 42%) 36002560 ( 0%) nsd8_t2_46 31249989632 1 No Yes 29624694784 ( 95%) 81600 ( 0%) nsd8_t2_45 31249989632 1 No Yes 29623111680 ( 95%) 77184 ( 0%) nsd8_t2_44 31249989632 1 No Yes 29621467136 ( 95%) 61440 ( 0%) nsd8_t2_43 31249989632 1 No Yes 29622964224 ( 95%) 64640 ( 0%) nsd8_t2_18 31249989632 1 No Yes 13166675968 ( 42%) 36147648 ( 0%) nsd8_t2_19 31249989632 1 No Yes 13164529664 ( 42%) 36225216 ( 0%) nsd8_t2_20 31249989632 1 No Yes 13165223936 ( 42%) 36242368 ( 0%) nsd8_t2_21 31249989632 1 No Yes 13167353856 ( 42%) 36007744 ( 0%) nsd8_t2_31 31249989632 1 No Yes 13116979200 ( 42%) 14155200 ( 0%) nsd8_t2_32 31249989632 1 No Yes 13115633664 ( 42%) 14243840 ( 0%) nsd8_t2_33 31249989632 1 No Yes 13115830272 ( 42%) 14235392 ( 0%) nsd8_t2_34 31249989632 1 No Yes 13119727616 ( 42%) 14500608 ( 0%) nsd8_t2_35 31249989632 1 No Yes 13116925952 ( 42%) 14304192 ( 0%) nsd8_t2_0 31249989632 1 No Yes 13145503744 ( 42%) 99222016 ( 0%) nsd8_t2_36 31249989632 1 No Yes 13119858688 ( 42%) 14054784 ( 0%) nsd8_t2_37 31249989632 1 No Yes 13114101760 ( 42%) 14200704 ( 0%) nsd8_t2_38 31249989632 1 No Yes 13116483584 ( 42%) 14174720 ( 0%) nsd8_t2_39 31249989632 1 No Yes 13121257472 ( 42%) 14094720 ( 0%) nsd8_t2_40 31249989632 1 No Yes 29622908928 ( 95%) 84352 ( 0%) nsd8_t2_1 31249989632 1 No Yes 13146089472 ( 42%) 99566784 ( 0%) nsd8_t2_2 31249989632 1 No Yes 13146208256 ( 42%) 99128960 ( 0%) nsd8_t2_3 31249989632 1 No Yes 13146890240 ( 42%) 99766720 ( 0%) nsd8_t2_4 31249989632 1 No Yes 13145143296 ( 42%) 98992576 ( 0%) nsd8_t2_5 31249989632 1 No Yes 13135876096 ( 42%) 99555008 ( 0%) nsd8_t2_6 31249989632 1 No Yes 13142831104 ( 42%) 99728064 ( 0%) nsd8_t2_7 31249989632 1 No Yes 13140283392 ( 42%) 99412480 ( 0%) nsd8_t2_8 31249989632 1 No Yes 13143470080 ( 42%) 99653696 ( 0%) nsd8_t2_9 31249989632 1 No Yes 13143650304 ( 42%) 99224704 ( 0%) nsd8_t2_10 31249989632 1 No Yes 13145440256 ( 42%) 99238528 ( 0%) nsd8_t2_11 31249989632 1 No Yes 13143201792 ( 42%) 99283008 ( 0%) nsd8_t2_22 31249989632 1 No Yes 13171724288 ( 42%) 36040704 ( 0%) nsd8_t2_23 31249989632 1 No Yes 
13166782464 ( 42%) 36212416 ( 0%) nsd8_t2_24 31249989632 1 No Yes 13167990784 ( 42%) 35842368 ( 0%) nsd8_t2_25 31249989632 1 No Yes 13166972928 ( 42%) 36086848 ( 0%) nsd8_t2_26 31249989632 1 No Yes 13167495168 ( 42%) 36114496 ( 0%) nsd8_t2_27 31249989632 1 No Yes 13164419072 ( 42%) 36119680 ( 0%) nsd8_t2_28 31249989632 1 No Yes 13167804416 ( 42%) 36088832 ( 0%) nsd8_t2_29 31249989632 1 No Yes 13166057472 ( 42%) 36107072 ( 0%) nsd8_t2_30 31249989632 1 No Yes 13163673600 ( 42%) 36102528 ( 0%) nsd8_t2_41 31249989632 1 No Yes 29620840448 ( 95%) 70208 ( 0%) nsd8_t2_42 31249989632 1 No Yes 29621110784 ( 95%) 69568 ( 0%) ------------- -------------------- ------------------- (pool total) 1468749512704 733299890176 ( 50%) 2008331584 ( 0%) ============= ==================== =================== (data) 1562501251072 762668994560 ( 49%) 2544274112 ( 0%) (metadata) 3082813440 2768812032 ( 90%) 63845888 ( 2%) ============= ==================== =================== (total) 1565584064512 765437806592 ( 49%) 2608120000 ( 0%) Inode Information ----------------- Number of used inodes: 102026081 Number of free inodes: 72791199 Number of allocated inodes: 174817280 Maximum number of inodes: 1342177280 -- Luk?? Hejtm?nek From xhejtman at ics.muni.cz Wed Aug 31 20:26:26 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Wed, 31 Aug 2016 21:26:26 +0200 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: <24437-1472591241.445832@bR6O.TofS.917u> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> <24437-1472591241.445832@bR6O.TofS.917u> Message-ID: <20160831192626.k4em4iz7ne2e2cmg@ics.muni.cz> Hello, thank you for explanation. I confirm that things are working with 573 kernel. On Tue, Aug 30, 2016 at 05:07:21PM -0400, mark.bergman at uphs.upenn.edu wrote: > In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, > The pithy ruminations from Lukas Hejtmanek on > <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: > => Hello, > > GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, > but at kernel 2.6.32-573 and lower. > > I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel > revs that caused multipath errors, resulting in GPFS being unable to > find all NSDs and mount the filesystem. > > I am not updating to a newer kernel until I'm certain this is resolved. > > I opened a bug with CentOS: > > https://bugs.centos.org/view.php?id=10997 > > and began an extended discussion with the (RH & SUSE) developers of that > chunk of kernel code. I don't know if an upstream bug has been opened > by RH, but see: > > https://patchwork.kernel.org/patch/9140337/ > => > => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the > => latest patch 32) does start but does not mount and file system. The internal > => mount cmd gets stucked. > => > => -- > => Luk?? Hejtm?nek > > > -- > Mark Bergman voice: 215-746-4061 > mark.bergman at uphs.upenn.edu fax: 215-614-0266 > http://www.cbica.upenn.edu/ > IT Technical Director, Center for Biomedical Image Computing and Analytics > Department of Radiology University of Pennsylvania > PGP Key: http://www.cbica.upenn.edu/sbia/bergman > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Luk?? 
Hejtm?nek From wilshire at mcs.anl.gov Wed Aug 31 20:39:17 2016 From: wilshire at mcs.anl.gov (John Blaas) Date: Wed, 31 Aug 2016 14:39:17 -0500 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: <4e7507130c674e35a7ac2c3fa16359e1@GEORGE.anl.gov> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> <24437-1472591241.445832@bR6O.TofS.917u> <4e7507130c674e35a7ac2c3fa16359e1@GEORGE.anl.gov> Message-ID: We are running 3.5 w/ patch 32 on nodes with the storage cluster running on Centos 6.8 with kernel at 2.6.32-642.1.1 and the remote compute cluster running 2.6.32-642.3.1 without any issues. That being said we are looking to upgrade as soon as possible to 4.1, but thought I would add that it is possible even if not supported. --- John Blaas On Wed, Aug 31, 2016 at 2:26 PM, Lukas Hejtmanek wrote: > Hello, > > thank you for explanation. I confirm that things are working with 573 kernel. > > On Tue, Aug 30, 2016 at 05:07:21PM -0400, mark.bergman at uphs.upenn.edu wrote: >> In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, >> The pithy ruminations from Lukas Hejtmanek on >> <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: >> => Hello, >> >> GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, >> but at kernel 2.6.32-573 and lower. >> >> I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel >> revs that caused multipath errors, resulting in GPFS being unable to >> find all NSDs and mount the filesystem. >> >> I am not updating to a newer kernel until I'm certain this is resolved. >> >> I opened a bug with CentOS: >> >> https://bugs.centos.org/view.php?id=10997 >> >> and began an extended discussion with the (RH & SUSE) developers of that >> chunk of kernel code. I don't know if an upstream bug has been opened >> by RH, but see: >> >> https://patchwork.kernel.org/patch/9140337/ >> => >> => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the >> => latest patch 32) does start but does not mount and file system. The internal >> => mount cmd gets stucked. >> => >> => -- >> => Luk?? Hejtm?nek >> >> >> -- >> Mark Bergman voice: 215-746-4061 >> mark.bergman at uphs.upenn.edu fax: 215-614-0266 >> http://www.cbica.upenn.edu/ >> IT Technical Director, Center for Biomedical Image Computing and Analytics >> Department of Radiology University of Pennsylvania >> PGP Key: http://www.cbica.upenn.edu/sbia/bergman >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > Luk?? Hejtm?nek > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From janfrode at tanso.net Wed Aug 31 21:44:04 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Wed, 31 Aug 2016 22:44:04 +0200 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: Message-ID: Assuming your DeepFlash pool is named "deep", something like the following should work: RULE 'deepreplicate' migrate from pool 'deep' to pool 'deep' replicate(2) where MISC_ATTRIBUTES NOT LIKE '%2%' and POOL_NAME LIKE 'deep' "mmapplypolicy gpfs0 -P replicate-policy.pol -I yes" and possibly "mmrestripefs gpfs0 -r" afterwards. 
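Laid out as a policy file (replicate-policy.pol is just the name used in the command below), that is:

    /* re-replicate files in the 'deep' pool that do not yet carry 2 data replicas */
    RULE 'deepreplicate'
      MIGRATE FROM POOL 'deep'
        TO POOL 'deep'
        REPLICATE(2)
        WHERE MISC_ATTRIBUTES NOT LIKE '%2%'
          AND POOL_NAME LIKE 'deep'

applied with:

    mmapplypolicy gpfs0 -P replicate-policy.pol -I yes
    mmrestripefs gpfs0 -r

This assumes the file system's maximum data replication (mmcrfs -R) is at least 2, otherwise REPLICATE(2) cannot be honoured.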
-jf On Wed, Aug 31, 2016 at 8:01 PM, Brian Marshall wrote: > Daniel, > > So here's my use case: I have a Sandisk IF150 (branded as DeepFlash > recently) with 128TB of flash acting as a "fast tier" storage pool in our > HPC scratch file system. Can I set the filesystem replication level to 1 > then write a policy engine rule to send small and/or recent files to the > IF150 with a replication of 2? > > Any other comments on the proposed usage strategy are helpful. > > Thank you, > Brian Marshall > > On Wed, Aug 31, 2016 at 10:32 AM, Daniel Kidger > wrote: > >> The other 'Exception' is when a rule is used to convert a 1 way >> replicated file to 2 way, or when only one failure group is up due to HW >> problems. It that case the (re-replication) is done by whatever nodes are >> used for the rule or command-line, which may include an NSD server. >> >> Daniel >> >> IBM Spectrum Storage Software >> +44 (0)7818 522266 <+44%207818%20522266> >> Sent from my iPad using IBM Verse >> >> >> ------------------------------ >> On 30 Aug 2016, 19:53:31, mimarsh2 at vt.edu wrote: >> >> From: mimarsh2 at vt.edu >> To: gpfsug-discuss at spectrumscale.org >> Cc: >> Date: 30 Aug 2016 19:53:31 >> Subject: Re: [gpfsug-discuss] Data Replication >> >> >> Thanks. This confirms the numbers that I am seeing. >> >> Brian >> >> On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow < >> laurence at qsplace.co.uk> wrote: >> >>> Its the client that does all the synchronous replication, this way the >>> cluster is able to scale as the clients do the leg work (so to speak). >>> >>> The somewhat "exception" is if a GPFS NSD server (or client with direct >>> NSD) access uses a server bases protocol such as SMB, in this case the SMB >>> server will do the replication as the SMB client doesn't know about GPFS or >>> its replication; essentially the SMB server is the GPFS client. >>> >>> -- Lauz >>> >>> On 30 August 2016 17:03:38 CEST, Bryan Banister < >>> bbanister at jumptrading.com> wrote: >>> >>>> The NSD Client handles the replication and will, as you stated, write >>>> one copy to one NSD (using the primary server for this NSD) and one to a >>>> different NSD in a different GPFS failure group (using quite likely, but >>>> not necessarily, a different NSD server that is the primary server for this >>>> alternate NSD). >>>> >>>> Cheers, >>>> >>>> -Bryan >>>> >>>> >>>> >>>> *From:* gpfsug-discuss-bounces at spectrumscale.org [mailto: >>>> gpfsug-discuss-bounces at spectrumscale.org] *On Behalf Of *Brian Marshall >>>> *Sent:* Tuesday, August 30, 2016 9:59 AM >>>> *To:* gpfsug main discussion list >>>> *Subject:* [gpfsug-discuss] Data Replication >>>> >>>> >>>> >>>> All, >>>> >>>> >>>> >>>> If I setup a filesystem to have data replication of 2 (2 copies of >>>> data), does the data get replicated at the NSD Server or at the client? >>>> i.e. Does the client send 2 copies over the network or does the NSD Server >>>> get a single copy and then replicate on storage NSDs? >>>> >>>> >>>> >>>> I couldn't find a place in the docs that talked about this specific >>>> point. >>>> >>>> >>>> >>>> Thank you, >>>> >>>> Brian Marshall >>>> >>>> >>>> ------------------------------ >>>> >>>> Note: This email is for the confidential use of the named addressee(s) >>>> only and may contain proprietary, confidential or privileged information. 
>>>> If you are not the intended recipient, you are hereby notified that any >>>> review, dissemination or copying of this email is strictly prohibited, and >>>> to please notify the sender immediately and destroy this email and any >>>> attachments. Email transmission cannot be guaranteed to be secure or >>>> error-free. The Company, therefore, does not make any guarantees as to the >>>> completeness or accuracy of this email or any attachments. This email is >>>> for informational purposes only and does not constitute a recommendation, >>>> offer, request or solicitation of any kind to buy, sell, subscribe, redeem >>>> or perform any type of transaction of a financial product. >>>> >>>> ------------------------------ >>>> >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>>> >>> -- >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >> Unless stated otherwise above: >> IBM United Kingdom Limited - Registered in England and Wales with number >> 741598. >> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
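A footnote on the data-replication thread above: if the goal is for new files that land in the flash pool to get two copies from the start, rather than being re-replicated later by the migrate rule, a file-placement rule can set the replication level at create time. A sketch only - the pool name 'deep' comes from the thread, the fileset name is hypothetical, and size-based conditions cannot be used here because the file size is not known when the placement rule is evaluated:

    /* give new files placed in the flash pool two data replicas at creation */
    RULE 'deep-placement'
      SET POOL 'deep'
      REPLICATE(2)
      WHERE FILESET_NAME LIKE 'scratch%'

Files that already exist when the rule is installed still need the migrate rule (or an explicit restripe) to pick up their second copy.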
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From raot at bnl.gov Mon Aug 1 21:42:06 2016 From: raot at bnl.gov (Tejas Rao) Date: Mon, 1 Aug 2016 16:42:06 -0400 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> <5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> Message-ID: <04707e32-83fc-f42d-10cf-99139c136371@bnl.gov> I am not 100% sure what the workload of the VMs is. We have 100's of VMs all used differently, so the workload is rather mixed. I do see 4K writes going to "system" pool, they are tagged as "logData" in 'mmdiag --iohist'. But I also see 4K writes going to the data drives, so it looks like everything is not getting coalesced and these are random writes. Could these 4k writes labelled as "logData" be the writes going to HAWC log files? On 8/1/2016 15:50, Dean Hildebrand wrote: > > Hi Tejas, > > Do you know the workload in the VM? > > The workload which enters into HAWC may or may not be the same as the > workload that eventually goes into the data pool....it all depends on > whether the 4KB writes entering HAWC can be coalesced or not. For > example, sequential 4KB writes can all be coalesced into a single > large chunk. So 4KB writes into HAWC will convert into 8MB writes to > data pool (in your system). But random 4KB writes into HAWC may end up > being 4KB writes into the data pool if there are no adjoining 4KB > writes (i.e., if 4KB blocks are all dispersed, they can't be > coalesced). The goal of HAWC though, whether the 4KB blocks are > coalesced or not, is to reduce app latency by ensuring that writing > the blocks back to the data pool is done in the background. So while > 4KB blocks may still be hitting the data pool, hopefully the > application is seeing the latency of your presumably lower latency > system pool. > > Dean > > > Inactive hide details for Tejas Rao ---08/01/2016 12:06:15 PM---In my > case GPFS storage is used to store VM images (KVM) and heTejas Rao > ---08/01/2016 12:06:15 PM---In my case GPFS storage is used to store > VM images (KVM) and hence the small IO. > > From: Tejas Rao > To: gpfsug main discussion list > Date: 08/01/2016 12:06 PM > Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache) > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > In my case GPFS storage is used to store VM images (KVM) and hence the > small IO. > > I always see lots of small 4K writes and the GPFS filesystem block > size is 8MB. I thought the reason for the small writes is that the > linux kernel requests GPFS to initiate a periodic sync which by > default is every 5 seconds and can be controlled by > "vm.dirty_writeback_centisecs". 
> > I thought HAWC would help in such cases and would harden (coalesce) > the small writes in the "system" pool and would flush to the "data" > pool in larger block size. > > Note - I am not doing direct i/o explicitly. > > > > On 8/1/2016 14:49, Sven Oehme wrote: > > when you say 'synchronous write' what do you mean by that ? > if you are talking about using direct i/o (O_DIRECT flag), > they don't leverage HAWC data path, its by design. > > sven > > On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao <_raot at bnl.gov_ > > wrote: > I have enabled write cache (HAWC) by running the below > commands. The recovery logs are supposedly placed in the > replicated system metadata pool (SSDs). I do not have a > "system.log" pool as it is only needed if recovery logs > are stored on the client nodes. > > mmchfs gpfs01 --write-cache-threshold 64K > mmchfs gpfs01 -L 1024M > mmchconfig logPingPongSector=no > > I have recycled the daemon on all nodes in the cluster > (including the NSD nodes). > > I still see small synchronous writes (4K) from the clients > going to the data drives (data pool). I am checking this > by looking at "mmdiag --iohist" output. Should they not be > going to the system pool? > > Do I need to do something else? How can I confirm that > HAWC is working as advertised? > > Thanks. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at _spectrumscale.org_ > _ > __http://gpfsug.org/mailman/listinfo/gpfsug-discuss_ > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > _http://gpfsug.org/mailman/listinfo/gpfsug-discuss_ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 105 bytes Desc: not available URL: From dhildeb at us.ibm.com Mon Aug 1 21:55:28 2016 From: dhildeb at us.ibm.com (Dean Hildebrand) Date: Mon, 1 Aug 2016 13:55:28 -0700 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: <04707e32-83fc-f42d-10cf-99139c136371@bnl.gov> References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov><5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> <04707e32-83fc-f42d-10cf-99139c136371@bnl.gov> Message-ID: Hi Tejas, Yes, most likely those 4k writes are the HAWC writes...hopefully those 4KB writes have a lower latency than the 4k writes to your data pool so you are realizing the benefits. Dean From: Tejas Rao To: gpfsug main discussion list Date: 08/01/2016 01:42 PM Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache) Sent by: gpfsug-discuss-bounces at spectrumscale.org I am not 100% sure what the workload of the VMs is. We have 100's of VMs all used differently, so the workload is rather mixed. I do see 4K writes going to "system" pool, they are tagged as "logData" in 'mmdiag --iohist'. But I also see 4K writes going to the data drives, so it looks like everything is not getting coalesced and these are random writes. Could these 4k writes labelled as "logData" be the writes going to HAWC log files? 
On 8/1/2016 15:50, Dean Hildebrand wrote: Hi Tejas, Do you know the workload in the VM? The workload which enters into HAWC may or may not be the same as the workload that eventually goes into the data pool....it all depends on whether the 4KB writes entering HAWC can be coalesced or not. For example, sequential 4KB writes can all be coalesced into a single large chunk. So 4KB writes into HAWC will convert into 8MB writes to data pool (in your system). But random 4KB writes into HAWC may end up being 4KB writes into the data pool if there are no adjoining 4KB writes (i.e., if 4KB blocks are all dispersed, they can't be coalesced). The goal of HAWC though, whether the 4KB blocks are coalesced or not, is to reduce app latency by ensuring that writing the blocks back to the data pool is done in the background. So while 4KB blocks may still be hitting the data pool, hopefully the application is seeing the latency of your presumably lower latency system pool. Dean Inactive hide details for Tejas Rao ---08/01/2016 12:06:15 PM---In my case GPFS storage is used to store VM images (KVM) and heTejas Rao ---08/01/2016 12:06:15 PM---In my case GPFS storage is used to store VM images (KVM) and hence the small IO. From: Tejas Rao To: gpfsug main discussion list Date: 08/01/2016 12:06 PM Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache) Sent by: gpfsug-discuss-bounces at spectrumscale.org In my case GPFS storage is used to store VM images (KVM) and hence the small IO. I always see lots of small 4K writes and the GPFS filesystem block size is 8MB. I thought the reason for the small writes is that the linux kernel requests GPFS to initiate a periodic sync which by default is every 5 seconds and can be controlled by "vm.dirty_writeback_centisecs". I thought HAWC would help in such cases and would harden (coalesce) the small writes in the "system" pool and would flush to the "data" pool in larger block size. Note - I am not doing direct i/o explicitly. On 8/1/2016 14:49, Sven Oehme wrote: when you say 'synchronous write' what do you mean by that ? if you are talking about using direct i/o (O_DIRECT flag), they don't leverage HAWC data path, its by design. sven On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao wrote: I have enabled write cache (HAWC) by running the below commands. The recovery logs are supposedly placed in the replicated system metadata pool (SSDs). I do not have a "system.log" pool as it is only needed if recovery logs are stored on the client nodes. mmchfs gpfs01 --write-cache-threshold 64K mmchfs gpfs01 -L 1024M mmchconfig logPingPongSector=no I have recycled the daemon on all nodes in the cluster (including the NSD nodes). I still see small synchronous writes (4K) from the clients going to the data drives (data pool). I am checking this by looking at "mmdiag --iohist" output. Should they not be going to the system pool? Do I need to do something else? How can I confirm that HAWC is working as advertised? Thanks. 
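To answer the "how can I confirm that HAWC is working" part in command terms, a rough check looks like this (a sketch only: "gpfs01" is the file system name used earlier in the thread, and the exact mmlsfs output labels vary by release):

    # confirm the recovery log size and write-cache threshold actually took effect
    mmlsfs gpfs01 -L
    mmlsfs gpfs01 | grep -i write-cache

    # on a client doing the writes, watch the I/O history: "logData" entries
    # against system-pool NSDs are the HAWC/recovery-log writes, while the
    # (possibly coalesced) flushes to the data pool happen later in the background
    mmdiag --iohist | head -40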
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Greg.Lehmann at csiro.au Wed Aug 3 06:06:32 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 3 Aug 2016 05:06:32 +0000 Subject: [gpfsug-discuss] SS 4.2.1.0 upgrade pain Message-ID: <04fbf3c0ae40468d912293821905197d@exch1-cdc.nexus.csiro.au> On Debian I am seeing this when trying to upgrade: mmshutdown dpkg -I gpfs.base_4.2.1-0_amd64.deb gpfs.docs_4.2.1-0_all.deb gpfs.ext_4.2.1-0_amd64.deb gpfs.gpl_4.2.1-0_all.deb gpfs.gskit_8.0.50-57_amd64.deb gpfs.msg.en-us_4.2.1-0_all.deb (Reading database ... 65194 files and directories currently installed.) Preparing to replace gpfs.base 4.1.0-6 (using gpfs.base_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.base ... Preparing to replace gpfs.docs 4.1.0-6 (using gpfs.docs_4.2.1-0_all.deb) ... Unpacking replacement gpfs.docs ... Preparing to replace gpfs.ext 4.1.0-6 (using gpfs.ext_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.ext ... Etc. Unpacking replacement gpfs.gpl ... Preparing to replace gpfs.gskit 8.0.50-32 (using gpfs.gskit_8.0.50-57_amd64.deb) ... Unpacking replacement gpfs.gskit ... Preparing to replace gpfs.msg.en-us 4.1.0-6 (using gpfs.msg.en-us_4.2.1-0_all.deb) ... Unpacking replacement gpfs.msg.en-us ... Setting up gpfs.base (4.2.1-0) ... At which point it hangs. A ps shows this: ps -ef | grep mm root 21269 1 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 21276 21150 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmsysmoncontrol start root 21363 1 0 14:18 ? 
00:00:00 /usr/lpp/mmfs/bin/mmsdrserv 1191 10 10 /var/adm/ras/mmsdrserv.log 128 yes root 22485 21276 0 14:18 pts/0 00:00:00 python /usr/lpp/mmfs/bin/mmsysmon.py root 22486 22485 0 14:18 pts/0 00:00:00 /bin/sh -c /usr/lpp/mmfs/bin/mmlsmgr -c root 22488 22486 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmlsmgr -c root 24420 22488 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmcommon linkCommand hadoop1-12-cdc-ib2.it.csiro.au /var/mmfs/tmp/nodefile.mmlsmgr.22488 mmlsmgr -c root 24439 24420 0 14:18 pts/0 00:00:00 /usr/bin/perl /usr/lpp/mmfs/bin/mmdsh -svL gpfs-07-cdc-ib2.san.csiro.au /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c root 24446 24439 0 14:18 pts/0 00:00:00 /usr/bin/ssh gpfs-07-cdc-ib2.san.csiro.au -n -l root /bin/ksh -c ' LANG=en_US.UTF-8 LC_ALL= LC_COLLATE= LC_TYPE= LC_MONETARY= LC_NUMERIC= LC_TIME= LC_MESSAGES= MMMODE=lc environmentType=lc2 GPFS_rshPath=/usr/bin/ssh GPFS_rcpPath=/usr/bin/scp mmScriptTrace= GPFSCMDPORTRANGE=0 GPFS_CIM_MSG_FORMAT= /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c ' root 24546 21269 0 14:23 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 24548 24455 0 14:23 pts/1 00:00:00 grep mm It is trying to connect with ssh to one of my nsd servers, that it does not have permission to? I am guessing that is where the hang is. Anybody else seen this? I have a workaround - remove from cluster before the update, but this is a bit of extra work I can do without. I have not had to this for previous versions starting with 4.1.0.0. Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Wed Aug 3 08:32:43 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 3 Aug 2016 07:32:43 +0000 Subject: [gpfsug-discuss] SS 4.2.1.0 upgrade pain In-Reply-To: <04fbf3c0ae40468d912293821905197d@exch1-cdc.nexus.csiro.au> References: <04fbf3c0ae40468d912293821905197d@exch1-cdc.nexus.csiro.au> Message-ID: <663114b24b0b403aa076a83791f32c58@exch1-cdc.nexus.csiro.au> And I am seeing the same behaviour on a SLES 12 SP1 update from 4.2.04 to 4.2.1.0. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au Sent: Wednesday, 3 August 2016 3:07 PM To: gpfsug-discuss at spectrumscale.org Subject: [ExternalEmail] [gpfsug-discuss] SS 4.2.1.0 upgrade pain On Debian I am seeing this when trying to upgrade: mmshutdown dpkg -I gpfs.base_4.2.1-0_amd64.deb gpfs.docs_4.2.1-0_all.deb gpfs.ext_4.2.1-0_amd64.deb gpfs.gpl_4.2.1-0_all.deb gpfs.gskit_8.0.50-57_amd64.deb gpfs.msg.en-us_4.2.1-0_all.deb (Reading database ... 65194 files and directories currently installed.) Preparing to replace gpfs.base 4.1.0-6 (using gpfs.base_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.base ... Preparing to replace gpfs.docs 4.1.0-6 (using gpfs.docs_4.2.1-0_all.deb) ... Unpacking replacement gpfs.docs ... Preparing to replace gpfs.ext 4.1.0-6 (using gpfs.ext_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.ext ... Etc. Unpacking replacement gpfs.gpl ... Preparing to replace gpfs.gskit 8.0.50-32 (using gpfs.gskit_8.0.50-57_amd64.deb) ... Unpacking replacement gpfs.gskit ... Preparing to replace gpfs.msg.en-us 4.1.0-6 (using gpfs.msg.en-us_4.2.1-0_all.deb) ... 
Unpacking replacement gpfs.msg.en-us ... Setting up gpfs.base (4.2.1-0) ... At which point it hangs. A ps shows this: ps -ef | grep mm root 21269 1 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 21276 21150 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmsysmoncontrol start root 21363 1 0 14:18 ? 00:00:00 /usr/lpp/mmfs/bin/mmsdrserv 1191 10 10 /var/adm/ras/mmsdrserv.log 128 yes root 22485 21276 0 14:18 pts/0 00:00:00 python /usr/lpp/mmfs/bin/mmsysmon.py root 22486 22485 0 14:18 pts/0 00:00:00 /bin/sh -c /usr/lpp/mmfs/bin/mmlsmgr -c root 22488 22486 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmlsmgr -c root 24420 22488 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmcommon linkCommand hadoop1-12-cdc-ib2.it.csiro.au /var/mmfs/tmp/nodefile.mmlsmgr.22488 mmlsmgr -c root 24439 24420 0 14:18 pts/0 00:00:00 /usr/bin/perl /usr/lpp/mmfs/bin/mmdsh -svL gpfs-07-cdc-ib2.san.csiro.au /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c root 24446 24439 0 14:18 pts/0 00:00:00 /usr/bin/ssh gpfs-07-cdc-ib2.san.csiro.au -n -l root /bin/ksh -c ' LANG=en_US.UTF-8 LC_ALL= LC_COLLATE= LC_TYPE= LC_MONETARY= LC_NUMERIC= LC_TIME= LC_MESSAGES= MMMODE=lc environmentType=lc2 GPFS_rshPath=/usr/bin/ssh GPFS_rcpPath=/usr/bin/scp mmScriptTrace= GPFSCMDPORTRANGE=0 GPFS_CIM_MSG_FORMAT= /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c ' root 24546 21269 0 14:23 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 24548 24455 0 14:23 pts/1 00:00:00 grep mm It is trying to connect with ssh to one of my nsd servers, that it does not have permission to? I am guessing that is where the hang is. Anybody else seen this? I have a workaround - remove from cluster before the update, but this is a bit of extra work I can do without. I have not had to this for previous versions starting with 4.1.0.0. Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenneth.waegeman at ugent.be Wed Aug 3 09:54:30 2016 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Wed, 3 Aug 2016 10:54:30 +0200 Subject: [gpfsug-discuss] Upgrade from 4.1.1 to 4.2.1 Message-ID: <57A1B146.9070505@ugent.be> Hi, In the upgrade procedure (prerequisites) of 4.2.1, I read: "If you are coming from 4.1.1-X, you must first upgrade to 4.2.0-0. You may use this 4.2.1-0 package to perform a First Time Install or to upgrade from an existing 4.2.0-X level." What does this mean exactly. Should we just install the 4.2.0 rpms first, and then the 4.2.1 rpms, or should we install the 4.2.0 rpms, start up gpfs, bring gpfs down again and then do the 4.2.1 rpms? But if we re-install a 4.1.1 node, we can immediately install 4.2.1 ? Thanks! Kenneth From bbanister at jumptrading.com Wed Aug 3 15:53:52 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 3 Aug 2016 14:53:52 +0000 Subject: [gpfsug-discuss] Upgrade from 4.1.1 to 4.2.1 In-Reply-To: <57A1B146.9070505@ugent.be> References: <57A1B146.9070505@ugent.be> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB062B3718@CHI-EXCHANGEW1.w2k.jumptrading.com> Your first process is correct. Install the 4.2.0-0 rpms first, then install the 4.2.1 rpms after. 
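Spelled out as commands on one node, that order would look roughly like the sketch below (the /path/to directories are placeholders for wherever the extracted 4.2.0-0 and 4.2.1-0 packages live, and the GPL portability layer needs rebuilding afterwards):

    mmshutdown
    # step 1: bring the node to 4.2.0-0 first
    rpm -Uvh /path/to/4.2.0-0/gpfs.*.rpm
    # step 2: then apply 4.2.1-0 on top
    rpm -Uvh /path/to/4.2.1-0/gpfs.*.rpm
    mmbuildgpl     # rebuild the GPL/portability layer against the running kernel
    mmstartup
    # only once every node in the cluster is running 4.2.1:
    #   mmchconfig release=LATEST
    #   mmchfs <filesystem> -V full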
-Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, August 03, 2016 3:55 AM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Upgrade from 4.1.1 to 4.2.1 Hi, In the upgrade procedure (prerequisites) of 4.2.1, I read: "If you are coming from 4.1.1-X, you must first upgrade to 4.2.0-0. You may use this 4.2.1-0 package to perform a First Time Install or to upgrade from an existing 4.2.0-X level." What does this mean exactly. Should we just install the 4.2.0 rpms first, and then the 4.2.1 rpms, or should we install the 4.2.0 rpms, start up gpfs, bring gpfs down again and then do the 4.2.1 rpms? But if we re-install a 4.1.1 node, we can immediately install 4.2.1 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From pinto at scinet.utoronto.ca Wed Aug 3 17:22:27 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 12:22:27 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? Message-ID: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From oehmes at gmail.com Wed Aug 3 17:35:39 2016 From: oehmes at gmail.com (Sven Oehme) Date: Wed, 3 Aug 2016 09:35:39 -0700 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto wrote: > Suppose I want to set both USR and GRP quotas for a user, however GRP is > not the primary group. Will gpfs enforce the secondary group quota for that > user? 
> > What I mean is, if the user keeps writing files with secondary group as > the attribute, and that overall group quota is reached, will that user be > stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of > Toronto. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Aug 3 17:41:24 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 12:41:24 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: <20160803124124.21815zz1w4exmuus@support.scinet.utoronto.ca> Quoting "Sven Oehme" : > Hi, > > quotas are only counted against primary group > > sven Thanks Sven I kind of suspected, but needed an independent confirmation. Jaime > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: > >> Suppose I want to set both USR and GRP quotas for a user, however GRP is >> not the primary group. Will gpfs enforce the secondary group quota for that >> user? >> >> What I mean is, if the user keeps writing files with secondary group as >> the attribute, and that overall group quota is reached, will that user be >> stopped by gpfs? >> >> Thanks >> Jaime >> >> >> >> >> ************************************ >> TELL US ABOUT YOUR SUCCESS STORIES >> http://www.scinethpc.ca/testimonials >> ************************************ >> --- >> Jaime Pinto >> SciNet HPC Consortium - Compute/Calcul Canada >> www.scinet.utoronto.ca - www.computecanada.org >> University of Toronto >> 256 McCaul Street, Room 235 >> Toronto, ON, M5T1W5 >> P: 416-978-2755 >> C: 416-505-1477 >> >> ---------------------------------------------------------------- >> This message was sent using IMP at SciNet Consortium, University of >> Toronto. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From jonathan at buzzard.me.uk Wed Aug 3 17:44:01 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Wed, 3 Aug 2016 17:44:01 +0100 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> On 03/08/16 17:22, Jaime Pinto wrote: > Suppose I want to set both USR and GRP quotas for a user, however GRP is > not the primary group. Will gpfs enforce the secondary group quota for > that user? Nope that's not how POSIX schematics work for group quotas. 
As far as I can tell only your primary group is used for group quotas. It basically makes group quotas in Unix a waste of time in my opinion. At least I have never come across a real world scenario where they work in a useful manner. > What I mean is, if the user keeps writing files with secondary group as > the attribute, and that overall group quota is reached, will that user > be stopped by gpfs? > File sets are the answer to your problems, but retrospectively applying them to a file system is a pain. You create a file set for a directory and can then apply a quota to the file set. Even better you can apply per file set user and group quotas. So if file set A has a 1TB quota you could limit user X to 100GB in the file set, but outside the file set they could have a different quota or even no quota. Only issue is a limit of ~10,000 file sets per file system JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From pinto at scinet.utoronto.ca Wed Aug 3 17:55:43 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 12:55:43 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: <20160803125543.11831ypcdi8i189b@support.scinet.utoronto.ca> I guess I have a bit of a puzzle to solve, combining quotas on filesets, paths and USR/GRP attributes So much for the "standard" built-in linux account creation script, in which by default every new user is created with primary GID=UID, doesn't really help any of us. Jaime Quoting "Jonathan Buzzard" : > On 03/08/16 17:22, Jaime Pinto wrote: >> Suppose I want to set both USR and GRP quotas for a user, however GRP is >> not the primary group. Will gpfs enforce the secondary group quota for >> that user? > > Nope that's not how POSIX schematics work for group quotas. As far as I > can tell only your primary group is used for group quotas. It basically > makes group quotas in Unix a waste of time in my opinion. At least I > have never come across a real world scenario where they work in a > useful manner. > >> What I mean is, if the user keeps writing files with secondary group as >> the attribute, and that overall group quota is reached, will that user >> be stopped by gpfs? >> > > File sets are the answer to your problems, but retrospectively applying > them to a file system is a pain. You create a file set for a directory > and can then apply a quota to the file set. Even better you can apply > per file set user and group quotas. So if file set A has a 1TB quota > you could limit user X to 100GB in the file set, but outside the file > set they could have a different quota or even no quota. > > Only issue is a limit of ~10,000 file sets per file system > > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 19:06:34 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 18:06:34 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? 
In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: Hi Sven, Wait - am I misunderstanding something here? Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Thanks for clarifying? Kevin On Aug 3, 2016, at 11:35 AM, Sven Oehme > wrote: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Aug 3 19:30:08 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 14:30:08 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> Quoting "Buterbaugh, Kevin L" : > Hi Sven, > > Wait - am I misunderstanding something here? Let?s say that I have > ?user1? who has primary group ?group1? and secondary group ?group2?. > And let?s say that they write to a directory where the bit on the > directory forces all files created in that directory to have group2 > associated with them. Are you saying that those files still count > against group1?s group quota??? > > Thanks for clarifying? > > Kevin Not really, My interpretation is that all files written with group2 will count towards the quota on that group. However any users with group2 as the primary group will be prevented from writing any further when the group2 quota is reached. However the culprit user1 with primary group as group1 won't be detected by gpfs, and can just keep going on writing group2 files. As far as the individual user quota, it doesn't matter: group1 or group2 it will be counted towards the usage of that user. It would be interesting if the behavior was more as expected. 
I just checked with my Lustre counter-parts and they tell me whichever secondary group is hit first, however many there may be, the user will be stopped. The problem then becomes identifying which of the secondary groups hit the limit for that user. Jaime > > On Aug 3, 2016, at 11:35 AM, Sven Oehme > > wrote: > > Hi, > > quotas are only counted against primary group > > sven > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > > wrote: > Suppose I want to set both USR and GRP quotas for a user, however > GRP is not the primary group. Will gpfs enforce the secondary group > quota for that user? > > What I mean is, if the user keeps writing files with secondary group > as the attribute, and that overall group quota is reached, will > that user be stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 19:34:21 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 18:34:21 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> Message-ID: <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> Hi Jaime / Sven, If Jaime?s interpretation is correct about user1 continuing to be able to write to ?group2? files even though that group is at their hard limit, then that?s a bug that needs fixing. I haven?t tested that myself, and we?re in a downtime right now so I?m a tad bit busy, but if I need to I?ll test it on our test cluster later this week. Kevin On Aug 3, 2016, at 1:30 PM, Jaime Pinto > wrote: Quoting "Buterbaugh, Kevin L" >: Hi Sven, Wait - am I misunderstanding something here? Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Thanks for clarifying? Kevin Not really, My interpretation is that all files written with group2 will count towards the quota on that group. However any users with group2 as the primary group will be prevented from writing any further when the group2 quota is reached. However the culprit user1 with primary group as group1 won't be detected by gpfs, and can just keep going on writing group2 files. As far as the individual user quota, it doesn't matter: group1 or group2 it will be counted towards the usage of that user. It would be interesting if the behavior was more as expected. I just checked with my Lustre counter-parts and they tell me whichever secondary group is hit first, however many there may be, the user will be stopped. The problem then becomes identifying which of the secondary groups hit the limit for that user. 
Jaime On Aug 3, 2016, at 11:35 AM, Sven Oehme > wrote: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Wed Aug 3 19:46:54 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Wed, 3 Aug 2016 19:46:54 +0100 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: On 03/08/16 19:06, Buterbaugh, Kevin L wrote: > Hi Sven, > > Wait - am I misunderstanding something here? Let?s say that I have > ?user1? who has primary group ?group1? and secondary group ?group2?. > And let?s say that they write to a directory where the bit on the > directory forces all files created in that directory to have group2 > associated with them. Are you saying that those files still count > against group1?s group quota??? > Yeah, but bastard user from hell over here then does chgrp group1 myevilfile.txt and your set group id bit becomes irrelevant because it is only ever indicative. In fact there is nothing that guarantees the set group id bit is honored because there is nothing stopping the user or a program coming in immediately after the file is created and changing that. Not pointing fingers at the OSX SMB client when Unix extensions are active on a Samba server in any way there. As such Unix group quotas are in the real world a total waste of space. This is if you ask me why XFS and Lustre have project quotas and GPFS has file sets. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 19:55:01 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 18:55:01 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: JAB, The set group id bit is tangential to my point. I expect GPFS to count any files a user owns against their user quota. If they are a member of multiple groups then I also expect it to count it against the group quota of whatever group is associated with that file. I.e., if they do a chgrp then GPFS should subtract from one group and add to another. Kevin On Aug 3, 2016, at 1:46 PM, Jonathan Buzzard > wrote: On 03/08/16 19:06, Buterbaugh, Kevin L wrote: Hi Sven, Wait - am I misunderstanding something here? 
Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Yeah, but bastard user from hell over here then does chgrp group1 myevilfile.txt and your set group id bit becomes irrelevant because it is only ever indicative. In fact there is nothing that guarantees the set group id bit is honored because there is nothing stopping the user or a program coming in immediately after the file is created and changing that. Not pointing fingers at the OSX SMB client when Unix extensions are active on a Samba server in any way there. As such Unix group quotas are in the real world a total waste of space. This is if you ask me why XFS and Lustre have project quotas and GPFS has file sets. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Wed Aug 3 20:13:09 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Wed, 3 Aug 2016 20:13:09 +0100 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> Message-ID: <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> On 03/08/16 19:34, Buterbaugh, Kevin L wrote: > Hi Jaime / Sven, > > If Jaime?s interpretation is correct about user1 continuing to be able > to write to ?group2? files even though that group is at their hard > limit, then that?s a bug that needs fixing. I haven?t tested that > myself, and we?re in a downtime right now so I?m a tad bit busy, but if > I need to I?ll test it on our test cluster later this week. > Even if Jamie's interpretation is wrong it shows the other massive failure of group quotas under Unix and why they are not fit for purpose in the real world. So bufh here can deliberately or accidentally do a denial of service on other users and tracking down the offending user is a right pain in the backside. The point of being able to change group ownership on a file is to indicate the massive weakness of the whole group quota system, and why in my experience nobody actually uses it, and "project" quota options have been implemented in many "enterprise" Unix file systems. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 20:18:11 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 19:18:11 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? 
In-Reply-To: <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> Message-ID: <6B06DA37-321E-4730-A3D1-61E41E4C6187@vanderbilt.edu> JAB, Our scratch filesystem uses user and group quotas. It started out as a traditional scratch filesystem but then we decided (for better or worse) to allow groups to purchase quota on it (and we don?t purge it, as many sites do). We have many users in multiple groups, so if this is not working right it?s a potential issue for us. But you?re right, I?m a nobody? Kevin On Aug 3, 2016, at 2:13 PM, Jonathan Buzzard > wrote: On 03/08/16 19:34, Buterbaugh, Kevin L wrote: Hi Jaime / Sven, If Jaime?s interpretation is correct about user1 continuing to be able to write to ?group2? files even though that group is at their hard limit, then that?s a bug that needs fixing. I haven?t tested that myself, and we?re in a downtime right now so I?m a tad bit busy, but if I need to I?ll test it on our test cluster later this week. Even if Jamie's interpretation is wrong it shows the other massive failure of group quotas under Unix and why they are not fit for purpose in the real world. So bufh here can deliberately or accidentally do a denial of service on other users and tracking down the offending user is a right pain in the backside. The point of being able to change group ownership on a file is to indicate the massive weakness of the whole group quota system, and why in my experience nobody actually uses it, and "project" quota options have been implemented in many "enterprise" Unix file systems. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at gmail.com Wed Aug 3 21:32:32 2016 From: oehmes at gmail.com (Sven Oehme) Date: Wed, 3 Aug 2016 13:32:32 -0700 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <6B06DA37-321E-4730-A3D1-61E41E4C6187@vanderbilt.edu> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> <6B06DA37-321E-4730-A3D1-61E41E4C6187@vanderbilt.edu> Message-ID: i can't contribute much to the usefulness of tracking primary or secondary group. depending on who you ask you get a 50/50 answer why its great or broken either way. Jonathan explanation was correct, we only track/enforce primary groups , we don't do anything with secondary groups in regards to quotas. if there is 'doubt' of correct quotation of files on the disk in the filesystem one could always run mmcheckquota, its i/o intensive but will match quota usage of the in memory 'assumption' and update it from the actual data thats stored on disk. 
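In command terms, the check-and-reconcile described above looks roughly like this ("gpfs01" and "group2" are just the example names used earlier in the thread):

    mmrepquota -g gpfs01         # per-group usage and limits as currently accounted
    mmlsquota -g group2 gpfs01   # a single group's usage against its limits
    mmcheckquota gpfs01          # re-scan the file system and reconcile the in-memory
                                 # usage with what is actually on disk (I/O intensive)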
sven On Wed, Aug 3, 2016 at 12:18 PM, Buterbaugh, Kevin L < Kevin.Buterbaugh at vanderbilt.edu> wrote: > JAB, > > Our scratch filesystem uses user and group quotas. It started out as a > traditional scratch filesystem but then we decided (for better or worse) to > allow groups to purchase quota on it (and we don?t purge it, as many sites > do). > > We have many users in multiple groups, so if this is not working right > it?s a potential issue for us. But you?re right, I?m a nobody? > > Kevin > > On Aug 3, 2016, at 2:13 PM, Jonathan Buzzard > wrote: > > On 03/08/16 19:34, Buterbaugh, Kevin L wrote: > > Hi Jaime / Sven, > > If Jaime?s interpretation is correct about user1 continuing to be able > to write to ?group2? files even though that group is at their hard > limit, then that?s a bug that needs fixing. I haven?t tested that > myself, and we?re in a downtime right now so I?m a tad bit busy, but if > I need to I?ll test it on our test cluster later this week. > > > Even if Jamie's interpretation is wrong it shows the other massive failure > of group quotas under Unix and why they are not fit for purpose in the real > world. > > So bufh here can deliberately or accidentally do a denial of service on > other users and tracking down the offending user is a right pain in the > backside. > > The point of being able to change group ownership on a file is to indicate > the massive weakness of the whole group quota system, and why in my > experience nobody actually uses it, and "project" quota options have been > implemented in many "enterprise" Unix file systems. > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and > Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Thu Aug 4 00:03:47 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 3 Aug 2016 23:03:47 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803125543.11831ypcdi8i189b@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> <20160803125543.11831ypcdi8i189b@support.scinet.utoronto.ca> Message-ID: <762ff4f5796c4992b3bceb23b26fdbf3@exch1-cdc.nexus.csiro.au> The GID selection rules for account creation are Linux distribution specific. It sounds like you are familiar with Red Hat, where I think this idea of GID=UID started. 
sles12sp1-brc:/dev/disk/by-uuid # useradd testout sles12sp1-brc:/dev/disk/by-uuid # grep testout /etc/passwd testout:x:1001:100::/home/testout:/bin/bash sles12sp1-brc:/dev/disk/by-uuid # grep 100 /etc/group users:x:100: sles12sp1-brc:/dev/disk/by-uuid # Cheers, Greg -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jaime Pinto Sent: Thursday, 4 August 2016 2:56 AM To: gpfsug main discussion list ; Jonathan Buzzard Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? I guess I have a bit of a puzzle to solve, combining quotas on filesets, paths and USR/GRP attributes So much for the "standard" built-in linux account creation script, in which by default every new user is created with primary GID=UID, doesn't really help any of us. Jaime Quoting "Jonathan Buzzard" : > On 03/08/16 17:22, Jaime Pinto wrote: >> Suppose I want to set both USR and GRP quotas for a user, however GRP >> is not the primary group. Will gpfs enforce the secondary group quota >> for that user? > > Nope that's not how POSIX schematics work for group quotas. As far as > I can tell only your primary group is used for group quotas. It > basically makes group quotas in Unix a waste of time in my opinion. At > least I have never come across a real world scenario where they work > in a useful manner. > >> What I mean is, if the user keeps writing files with secondary group >> as the attribute, and that overall group quota is reached, will that >> user be stopped by gpfs? >> > > File sets are the answer to your problems, but retrospectively > applying them to a file system is a pain. You create a file set for a > directory and can then apply a quota to the file set. Even better you > can apply per file set user and group quotas. So if file set A has a > 1TB quota you could limit user X to 100GB in the file set, but outside > the file set they could have a different quota or even no quota. > > Only issue is a limit of ~10,000 file sets per file system > > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Greg.Lehmann at csiro.au Thu Aug 4 03:41:55 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Thu, 4 Aug 2016 02:41:55 +0000 Subject: [gpfsug-discuss] 4.2.1 documentation Message-ID: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> I see only 4 pdfs now with slightly different titles to the previous 5 pdfs available with 4.2.0. Just checking there are only supposed to be 4 now? Greg -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kenneth.waegeman at ugent.be Thu Aug 4 09:13:29 2016 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Thu, 4 Aug 2016 10:13:29 +0200 Subject: [gpfsug-discuss] 4.2.1 documentation In-Reply-To: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> References: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> Message-ID: <57A2F929.8000003@ugent.be> This is new, it is explained how they are merged at http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1xx_soc.htm Cheers! K On 04/08/16 04:41, Greg.Lehmann at csiro.au wrote: > > I see only 4 pdfs now with slightly different titles to the previous 5 > pdfs available with 4.2.0. Just checking there are only supposed to be > 4 now? > > Greg > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Thu Aug 4 09:13:51 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Thu, 4 Aug 2016 08:13:51 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: 1000 isn't it?! We've always worked on that assumption. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard Sent: 03 August 2016 17:44 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? in the file set, but outside the file set they could have a different quota or even no quota. Only issue is a limit of ~10,000 file sets per file system JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From r.sobey at imperial.ac.uk Thu Aug 4 09:17:01 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Thu, 4 Aug 2016 08:17:01 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: Ah. Dependent vs independent. (10,000 and 1000 respectively). -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 04 August 2016 09:14 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? 1000 isn't it?! We've always worked on that assumption. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard Sent: 03 August 2016 17:44 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? in the file set, but outside the file set they could have a different quota or even no quota. Only issue is a limit of ~10,000 file sets per file system JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. 
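(A quick sketch of the per-fileset quota approach discussed in this thread. The filesystem, fileset and user names are made up, and the mmsetquota syntax shown is the newer 4.1-style form, so check the command reference for the level actually in use:)

mmcrfileset gpfs01 projA --inode-space new
mmlinkfileset gpfs01 projA -J /gpfs/gpfs01/projA
mmchfs gpfs01 --perfileset-quota                        # allow per-user/group quotas inside filesets
mmsetquota gpfs01:projA --block 900G:1T                 # soft:hard limit for the fileset as a whole
mmsetquota gpfs01:projA --user user1 --block 90G:100G   # user1's limit inside this fileset only
mmrepquota -j gpfs01                                    # report fileset-level usage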
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From st.graf at fz-juelich.de Thu Aug 4 09:20:42 2016 From: st.graf at fz-juelich.de (Stephan Graf) Date: Thu, 4 Aug 2016 10:20:42 +0200 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: <57A2FADA.1060508@fz-juelich.de> Hi! I have tested it with dependent filesets in GPFS 4.1.1.X and there the limit is 10.000. Stephan On 08/04/16 10:13, Sobey, Richard A wrote: > 1000 isn't it?! We've always worked on that assumption. > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard > Sent: 03 August 2016 17:44 > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? > in the file set, but outside the file set they could have a different quota or even no quota. > > Only issue is a limit of ~10,000 file sets per file system > > > JAB. > -- Stephan Graf Juelich Supercomputing Centre Institute for Advanced Simulation Forschungszentrum Juelich GmbH 52425 Juelich, Germany Phone: +49-2461-61-6578 Fax: +49-2461-61-6656 E-mail: st.graf at fz-juelich.de WWW: http://www.fz-juelich.de/jsc/ ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ From daniel.kidger at uk.ibm.com Thu Aug 4 09:22:36 2016 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Thu, 4 Aug 2016 08:22:36 +0000 Subject: [gpfsug-discuss] 4.2.1 documentation In-Reply-To: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> Message-ID: Yes they have been re arranged. My observation is that the Admin and Advanced Admin have merged into one PDFs, and the DMAPI manual is now a chapter of the new Programming guide (along with the complete set of man pages which have moved out of the Admin guide). Table 3 on page 26 of the Concepts, Planning and Install guide describes these change. IMHO The new format is much better as all Admin is in one place not two. ps. I couldn't find in the programming guide a chapter yet on Light Weight Events. Anyone in product development care to comment? 
:-) Daniel IBM Spectrum Storage Software +44 (0)7818 522266 Sent from my iPad using IBM Verse On 4 Aug 2016, 03:42:21, Greg.Lehmann at csiro.au wrote: From: Greg.Lehmann at csiro.au To: gpfsug-discuss at spectrumscale.org Cc: Date: 4 Aug 2016 03:42:21 Subject: [gpfsug-discuss] 4.2.1 documentation I see only 4 pdfs now with slightly different titles to the previous 5 pdfs available with 4.2.0. Just checking there are only supposed to be 4 now? GregUnless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Thu Aug 4 16:59:31 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Thu, 04 Aug 2016 11:59:31 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> Message-ID: <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> Since there were inconsistencies in the responses, I decided to rig a couple of accounts/groups on our LDAP to test "My interpretation", and determined that I was wrong. When Kevin mentioned it would mean a bug I had to double-check: If a user hits the hard quota or exceeds the grace period on the soft quota on any of the secondary groups that user will be stopped from further writing to those groups as well, just as in the primary group. I hope this clears the waters a bit. I still have to solve my puzzle. Thanks everyone for the feedback. Jaime Quoting "Jaime Pinto" : > Quoting "Buterbaugh, Kevin L" : > >> Hi Sven, >> >> Wait - am I misunderstanding something here? Let?s say that I have >> ?user1? who has primary group ?group1? and secondary group >> ?group2?. And let?s say that they write to a directory where the >> bit on the directory forces all files created in that directory to >> have group2 associated with them. Are you saying that those >> files still count against group1?s group quota??? >> >> Thanks for clarifying? >> >> Kevin > > Not really, > > My interpretation is that all files written with group2 will count > towards the quota on that group. However any users with group2 as the > primary group will be prevented from writing any further when the > group2 quota is reached. However the culprit user1 with primary group > as group1 won't be detected by gpfs, and can just keep going on writing > group2 files. > > As far as the individual user quota, it doesn't matter: group1 or > group2 it will be counted towards the usage of that user. > > It would be interesting if the behavior was more as expected. I just > checked with my Lustre counter-parts and they tell me whichever > secondary group is hit first, however many there may be, the user will > be stopped. The problem then becomes identifying which of the secondary > groups hit the limit for that user. > > Jaime > > >> >> On Aug 3, 2016, at 11:35 AM, Sven Oehme >> > wrote: >> >> Hi, >> >> quotas are only counted against primary group >> >> sven >> >> >> On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto >> > wrote: >> Suppose I want to set both USR and GRP quotas for a user, however >> GRP is not the primary group. Will gpfs enforce the secondary group >> quota for that user? 
>> >> What I mean is, if the user keeps writing files with secondary >> group as the attribute, and that overall group quota is reached, >> will that user be stopped by gpfs? >> >> Thanks >> Jaime >> >> >> >> >> ************************************ >> TELL US ABOUT YOUR SUCCESS STORIES >> http://www.scinethpc.ca/testimonials >> ************************************ >> --- >> Jaime Pinto >> SciNet HPC Consortium - Compute/Calcul Canada >> www.scinet.utoronto.ca - >> www.computecanada.org >> University of Toronto >> 256 McCaul Street, Room 235 >> Toronto, ON, M5T1W5 >> P: 416-978-2755 >> C: 416-505-1477 >> > > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Thu Aug 4 17:08:30 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Thu, 4 Aug 2016 16:08:30 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> Message-ID: <7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu> Hi Jaime, Thank you sooooo much for doing this and reporting back the results! They?re in line with what I would expect to happen. I was going to test this as well, but we have had to extend our downtime until noontime tomorrow, so I haven?t had a chance to do so yet. Now I don?t have to? ;-) Kevin On Aug 4, 2016, at 10:59 AM, Jaime Pinto > wrote: Since there were inconsistencies in the responses, I decided to rig a couple of accounts/groups on our LDAP to test "My interpretation", and determined that I was wrong. When Kevin mentioned it would mean a bug I had to double-check: If a user hits the hard quota or exceeds the grace period on the soft quota on any of the secondary groups that user will be stopped from further writing to those groups as well, just as in the primary group. I hope this clears the waters a bit. I still have to solve my puzzle. Thanks everyone for the feedback. Jaime Quoting "Jaime Pinto" >: Quoting "Buterbaugh, Kevin L" >: Hi Sven, Wait - am I misunderstanding something here? Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Thanks for clarifying? Kevin Not really, My interpretation is that all files written with group2 will count towards the quota on that group. 
However any users with group2 as the primary group will be prevented from writing any further when the group2 quota is reached. However the culprit user1 with primary group as group1 won't be detected by gpfs, and can just keep going on writing group2 files. As far as the individual user quota, it doesn't matter: group1 or group2 it will be counted towards the usage of that user. It would be interesting if the behavior was more as expected. I just checked with my Lustre counter-parts and they tell me whichever secondary group is hit first, however many there may be, the user will be stopped. The problem then becomes identifying which of the secondary groups hit the limit for that user. Jaime On Aug 3, 2016, at 11:35 AM, Sven Oehme > wrote: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Thu Aug 4 17:34:09 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Thu, 04 Aug 2016 12:34:09 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? 
In-Reply-To: <7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu>
References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> <7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu>
Message-ID: <20160804123409.18403cy3iz123gxt@support.scinet.utoronto.ca>

OK. More info:

Users can apply the 'sg group1' or 'sg group2' command from a shell or script to switch the group mask from that point on, and dodge the quota that may have been exceeded on a group.

However, as the group owner or other member of the group on the limit, I could not find a tool they can use on their own to find out who is (are) the largest user(s); 'du' takes too long, and some users don't give read permissions on their directories. As part of the puzzle solution I have to come up with a root wrapper that can make the contents of the mmrepquota report available to them.

Jaime

Quoting "Buterbaugh, Kevin L" :

> Hi Jaime,
>
> Thank you sooooo much for doing this and reporting back the results!
> They're in line with what I would expect to happen. I was going
> to test this as well, but we have had to extend our downtime until
> noontime tomorrow, so I haven't had a chance to do so yet. Now I
> don't have to... ;-)
>
> Kevin
>
> On Aug 4, 2016, at 10:59 AM, Jaime Pinto > > wrote:
>
> Since there were inconsistencies in the responses, I decided to rig
> a couple of accounts/groups on our LDAP to test "My interpretation",
> and determined that I was wrong. When Kevin mentioned it would mean
> a bug I had to double-check:
>
> If a user hits the hard quota or exceeds the grace period on the
> soft quota on any of the secondary groups that user will be stopped
> from further writing to those groups as well, just as in the primary
> group.
>
> I hope this clears the waters a bit. I still have to solve my puzzle.
>
> Thanks everyone for the feedback.
> Jaime
>
> Quoting "Jaime Pinto" > >:
>
> Quoting "Buterbaugh, Kevin L" > >:
>
> Hi Sven,
>
> Wait - am I misunderstanding something here? Let's say that I have
> "user1" who has primary group "group1" and secondary group
> "group2". And let's say that they write to a directory where the setgid
> bit on the directory forces all files created in that directory to
> have group2 associated with them. Are you saying that those files
> still count against group1's group quota???
>
> Thanks for clarifying...
>
> Kevin
>
> Not really,
>
> My interpretation is that all files written with group2 will count
> towards the quota on that group. However any users with group2 as the
> primary group will be prevented from writing any further when the
> group2 quota is reached. However the culprit user1 with primary group
> as group1 won't be detected by gpfs, and can just keep going on writing
> group2 files.
>
> As far as the individual user quota, it doesn't matter: group1 or
> group2 it will be counted towards the usage of that user.
>
> It would be interesting if the behavior was more as expected. I just
> checked with my Lustre counter-parts and they tell me whichever
> secondary group is hit first, however many there may be, the user will
> be stopped. The problem then becomes identifying which of the secondary
> groups hit the limit for that user.
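(On the root wrapper Jaime mentions above -- a minimal sketch of the sort of script that could sit behind sudo so group members can read the relevant mmrepquota lines without getting root. The filesystem name, the assumption that the account name is the first column of the mmrepquota output, and the use of getent are all local details to adapt:)

#!/bin/bash
# group-usage: show a group's quota total and the per-user usage of its members.
# Intended to be invoked via sudo, e.g. "sudo group-usage group2".
FS=gpfs01                                  # assumed filesystem name, adjust locally
GROUP=${1:?usage: group-usage <groupname>}
# only members of the group may query it
id -Gn "$SUDO_USER" | tr ' ' '\n' | grep -qx "$GROUP" || { echo "not a member of $GROUP" >&2; exit 1; }
echo "== $GROUP total =="
/usr/lpp/mmfs/bin/mmrepquota -g "$FS" | awk -v g="$GROUP" '$1 == g'
echo "== per-member usage =="
# note: getent only lists supplementary members, so users whose *primary*
# group is $GROUP would need to be added from LDAP/passwd as well
for u in $(getent group "$GROUP" | cut -d: -f4 | tr ',' ' '); do
    /usr/lpp/mmfs/bin/mmrepquota -u "$FS" | awk -v n="$u" '$1 == n'
done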
> > Jaime > > > > On Aug 3, 2016, at 11:35 AM, Sven Oehme > > > wrote: > > Hi, > > quotas are only counted against primary group > > sven > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > > > wrote: > Suppose I want to set both USR and GRP quotas for a user, however > GRP is not the primary group. Will gpfs enforce the secondary group > quota for that user? > > What I mean is, if the user keeps writing files with secondary > group as the attribute, and that overall group quota is reached, > will that user be stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 10 22:00:26 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 10 Aug 2016 21:00:26 +0000 Subject: [gpfsug-discuss] User group meeting at SC16? Message-ID: Hi All, Just got an e-mail from DDN announcing that they are holding their user group meeting at SC16 on Monday afternoon like they always do, which is prompting me to inquire if IBM is going to be holding a meeting at SC16? Last year in Austin the IBM meeting was on Sunday afternoon, which worked out great as far as I was concerned. Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Robert.Oesterlin at nuance.com Wed Aug 10 22:04:11 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 10 Aug 2016 21:04:11 +0000 Subject: [gpfsug-discuss] User group meeting at SC16? In-Reply-To: References: Message-ID: <95126B16-B4DB-4406-862B-AA81E37F04E6@nuance.com> We're still trying to schedule that - The thinking right now is staying where last year. (Sunday afternoon) There is never a perfect time at these sorts of event - bound to step on something! If anyone has feedback (positive or negative) - let us know. Look for a formal announcement in early September. Bob Oesterlin GPFS-UG Co-Principal Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Buterbaugh, Kevin L" Reply-To: gpfsug main discussion list Date: Wednesday, August 10, 2016 at 4:00 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] User group meeting at SC16? Hi All, Just got an e-mail from DDN announcing that they are holding their user group meeting at SC16 on Monday afternoon like they always do, which is prompting me to inquire if IBM is going to be holding a meeting at SC16? Last year in Austin the IBM meeting was on Sunday afternoon, which worked out great as far as I was concerned. Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From malone12 at illinois.edu Wed Aug 10 22:43:15 2016 From: malone12 at illinois.edu (Maloney, John Daniel) Date: Wed, 10 Aug 2016 21:43:15 +0000 Subject: [gpfsug-discuss] User group meeting at SC16? Message-ID: <4AD486D7-D452-465A-85EC-1BDDE2C5DCFD@illinois.edu> Hi Bob, Thanks for the update! The couple storage folks from NCSA going to SC16 won?t be available Sunday (I?m not able to get in until Monday morning). Agree completely there is never a perfect time, just giving our feedback. Thanks again, J.D. Maloney Storage Engineer | Storage Enabling Technologies Group National Center for Supercomputing Applications (NCSA) From: > on behalf of "Oesterlin, Robert" > Reply-To: gpfsug main discussion list > Date: Wednesday, August 10, 2016 at 4:04 PM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] User group meeting at SC16? We're still trying to schedule that - The thinking right now is staying where last year. (Sunday afternoon) There is never a perfect time at these sorts of event - bound to step on something! If anyone has feedback (positive or negative) - let us know. Look for a formal announcement in early September. Bob Oesterlin GPFS-UG Co-Principal Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Wednesday, August 10, 2016 at 4:00 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] User group meeting at SC16? Hi All, Just got an e-mail from DDN announcing that they are holding their user group meeting at SC16 on Monday afternoon like they always do, which is prompting me to inquire if IBM is going to be holding a meeting at SC16? Last year in Austin the IBM meeting was on Sunday afternoon, which worked out great as far as I was concerned. Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aaron.s.knister at nasa.gov Thu Aug 11 05:47:17 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 11 Aug 2016 00:47:17 -0400 Subject: [gpfsug-discuss] GPFS and SELinux Message-ID: Hi Everyone, I'm passing this along on behalf of one of our security guys. Just wondering what feedback/thoughts others have on the topic. Current IBM guidance on GPFS and SELinux indicates that the default context for services (initrc_t) is insufficient for GPFS operations. See: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General+Parallel+File+System+(GPFS)/page/Using+GPFS+with+SElinux That part is true (by design), but IBM goes further to say use runcon out of rc.local and configure the gpfs service to not start via init. I believe these latter two (rc.local/runcon and no-init) can be addressed, relatively trivially, through the application of a small selinux policy. Ideally, I would hope for IBM to develop, test, and send out the policy, but I'm happy to offer the following suggestions. I believe "a)" could be developed in a relatively short period of time. "b)" would take more time, effort and experience. a) consider SELinux context transition. As an example, consider: https://github.com/TresysTechnology/refpolicy/tree/master/policy/modules/services (specifically, the ssh components) On a normal centOS/RHEL system sshd has the file context of sshd_exec_t, and runs under sshd_t Referencing ssh.te, you see several references to sshd_exec_t in: domtrans_pattern init_daemon_domain daemontools_service_domain (and so on) These configurations allow init to fire sshd off, setting its runtime context to sshd_t, based on the file context of sshd_exec_t. This should be duplicable for the gpfs daemon, altho I note it seems to be fired through a layer of abstraction in mmstartup. A simple policy that allows INIT to transition GPFS to unconfined_t would go a long way towards easing integration. b) file contexts of gpfs_daemon_t and gpfs_util_t, perhaps, that when executed, would pick up a context of gpfs_t? Which then could be mapped through standard SELinux policy to allow access to configuration files (gpfs_etc_t?), block devices, etc? I admit, in b, I am speculating heavily. -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From janfrode at tanso.net Thu Aug 11 10:54:27 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 11 Aug 2016 11:54:27 +0200 Subject: [gpfsug-discuss] GPFS and SELinux In-Reply-To: References: Message-ID: I believe the runcon part is no longer necessary, at least on my RHEL7 based systems mmfsd is running unconfined by default: [root at flexscale01 ~]# ps -efZ|grep mmfsd unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 root 18018 17709 0 aug.05 ? 00:24:53 /usr/lpp/mmfs/bin/mmfsd and I've never seen any problems with that for base GPFS. I suspect doing a proper selinux domain for GPFS will be quite close to unconfined, so maybe not worth the effort... -jf On Thu, Aug 11, 2016 at 6:47 AM, Aaron Knister wrote: > Hi Everyone, > > I'm passing this along on behalf of one of our security guys. Just > wondering what feedback/thoughts others have on the topic. > > > Current IBM guidance on GPFS and SELinux indicates that the default > context for services (initrc_t) is insufficient for GPFS operations. > > See: > https://www.ibm.com/developerworks/community/wikis/home? 
> lang=en#!/wiki/General+Parallel+File+System+(GPFS)/ > page/Using+GPFS+with+SElinux > > > That part is true (by design), but IBM goes further to say use runcon > out of rc.local and configure the gpfs service to not start via init. > > I believe these latter two (rc.local/runcon and no-init) can be > addressed, relatively trivially, through the application of a small > selinux policy. > > Ideally, I would hope for IBM to develop, test, and send out the policy, > but I'm happy to offer the following suggestions. I believe "a)" could > be developed in a relatively short period of time. "b)" would take more > time, effort and experience. > > a) consider SELinux context transition. > > As an example, consider: > https://github.com/TresysTechnology/refpolicy/tree/master/ > policy/modules/services > > > (specifically, the ssh components) > > On a normal centOS/RHEL system sshd has the file context of sshd_exec_t, > and runs under sshd_t > > Referencing ssh.te, you see several references to sshd_exec_t in: > domtrans_pattern > init_daemon_domain > daemontools_service_domain > (and so on) > > These configurations allow init to fire sshd off, setting its runtime > context to sshd_t, based on the file context of sshd_exec_t. > > This should be duplicable for the gpfs daemon, altho I note it seems to > be fired through a layer of abstraction in mmstartup. > > A simple policy that allows INIT to transition GPFS to unconfined_t > would go a long way towards easing integration. > > b) file contexts of gpfs_daemon_t and gpfs_util_t, perhaps, that when > executed, would pick up a context of gpfs_t? Which then could be mapped > through standard SELinux policy to allow access to configuration files > (gpfs_etc_t?), block devices, etc? > > I admit, in b, I am speculating heavily. > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From douglasof at us.ibm.com Fri Aug 12 20:40:27 2016 From: douglasof at us.ibm.com (Douglas O'flaherty) Date: Fri, 12 Aug 2016 19:40:27 +0000 Subject: [gpfsug-discuss] HPCwire Readers Choice Message-ID: Reminder... Get your stories in today! To view this email in your browser, click here. Last Call for Readers' Choice Award Nominations! Deadline: Friday, August 12th at 11:50pm! Only 3 days left until nominations for the 2016 HPCwire Readers' Choice Awards come to a close! Be sure to submit your picks for the best in HPC and make your voice heard before it's too late! These annual awards are a way for our community to recognize the best and brightest innovators within the global HPC community. Time is running out for you to nominate what you think are the greatest achievements in HPC for 2016, so cast your ballot today! 
The 2016 Categories Include the Following: * Best Use of HPC Application in Life Sciences * Best Use of HPC Application in Manufacturing * Best Use of HPC Application in Energy (previously 'Oil and Gas') * Best Use of HPC in Automotive * Best Use of HPC in Financial Services * Best Use of HPC in Entertainment * Best Use of HPC in the Cloud * Best Use of High Performance Data Analytics * Best Implementation of Energy-Efficient HPC * Best HPC Server Product or Technology * Best HPC Storage Product or Technology * Best HPC Software Product or Technology * Best HPC Visualization Product or Technology * Best HPC Interconnect Product or Technology * Best HPC Cluster Solution or Technology * Best Data-Intensive System (End-User Focused) * Best HPC Collaboration Between Government & Industry * Best HPC Collaboration Between Academia & Industry * Top Supercomputing Achievement * Top 5 New Products or Technologies to Watch * Top 5 Vendors to Watch * Workforce Diversity Leadership Award * Outstanding Leadership in HPC

Nominations are accepted from readers, users, vendors - virtually anyone who is connected to the HPC community and is a reader of HPCwire. Nominations will close on August 12, 2016 at 11:59pm. Make your voice heard! Help tell the story of HPC in 2016 by submitting your nominations for the HPCwire Readers' Choice Awards now! Nominations close on August 12, 2016. All nominations are subject to review by the editors of HPCwire with only the most relevant being accepted. Voting begins August 22, 2015. The final presentation of these prestigious and highly anticipated awards to each organization's leading executives will take place live during SC '16 in Salt Lake City, UT. The finalist(s) in each category who receive the most votes will win this year's awards. Open to HPCwire readers only.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From r.sobey at imperial.ac.uk Mon Aug 15 10:59:34 2016
From: r.sobey at imperial.ac.uk (Sobey, Richard A)
Date: Mon, 15 Aug 2016 09:59:34 +0000
Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems?
Message-ID:

Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they're on different versions? Cheers Richard
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Robert.Oesterlin at nuance.com Mon Aug 15 12:22:31 2016
From: Robert.Oesterlin at nuance.com (Oesterlin, Robert)
Date: Mon, 15 Aug 2016 11:22:31 +0000
Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems?
In-Reply-To: References: Message-ID:

In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :)

Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid

From: on behalf of "Sobey, Richard A" Reply-To: gpfsug main discussion list Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems?

Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they?re on different versions? Cheers Richard
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Kevin.Buterbaugh at Vanderbilt.Edu Mon Aug 15 13:45:25 2016
From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L)
Date: Mon, 15 Aug 2016 12:45:25 +0000
Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems?
In-Reply-To: References: Message-ID: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu>

Richard, I will second what Bob said with one caveat ? on one occasion we had an issue with our multi-cluster setup because the PTF?s were incompatible. However, that was clearly documented in the release notes, which we obviously hadn?t read carefully enough.
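(For anyone following along, the per-node sequence for this sort of rolling PTF update on 3.5.x is roughly the one below. The package file names and the portability-layer rebuild step are written from memory, so treat it as a sketch and follow the README shipped with the update:)

# on each NSD server in turn, once its disks are being served by a partner node:
mmumount all
mmshutdown
rpm -Uvh gpfs.base-3.5.0-27.x86_64.update.rpm gpfs.gpl-3.5.0-27.noarch.rpm \
         gpfs.docs-3.5.0-27.noarch.rpm gpfs.msg.en_US-3.5.0-27.noarch.rpm
cd /usr/lpp/mmfs/src && make Autoconfig && make World && make InstallImages   # rebuild the portability layer
mmstartup
mmgetstate        # wait for "active" before moving on to the next node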
While we generally do rolling upgrades over a two to three week period, we have run for months with clients at differing PTF levels. HTHAL? Kevin On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert > wrote: In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Sobey, Richard A" > Reply-To: gpfsug main discussion list > Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" > Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems? Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they?re on different versions? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Aug 15 13:58:47 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 15 Aug 2016 12:58:47 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> References: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> Message-ID: Thanks Kevin and Bob. PTF = minor version? I can?t think what it might stand for. Something Time Fix? Point in time fix? From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Buterbaugh, Kevin L Sent: 15 August 2016 13:45 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Minor GPFS versions coexistence problems? Richard, I will second what Bob said with one caveat ? on one occasion we had an issue with our multi-cluster setup because the PTF?s were incompatible. However, that was clearly documented in the release notes, which we obviously hadn?t read carefully enough. While we generally do rolling upgrades over a two to three week period, we have run for months with clients at differing PTF levels. HTHAL? Kevin On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert > wrote: In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Sobey, Richard A" > Reply-To: gpfsug main discussion list > Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" > Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems? Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they?re on different versions? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jamiedavis at us.ibm.com Mon Aug 15 14:02:13 2016 From: jamiedavis at us.ibm.com (James Davis) Date: Mon, 15 Aug 2016 13:02:13 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: References: , <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> Message-ID: An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Aug 15 14:05:01 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 15 Aug 2016 13:05:01 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? Message-ID: <28479088-C492-4441-A761-F49E1556E13E@nuance.com> PTF = Program Temporary Fix. IBM-Speak for a fix for a particular problem. Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Sobey, Richard A" Reply-To: gpfsug main discussion list Date: Monday, August 15, 2016 at 7:58 AM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Minor GPFS versions coexistence problems? Thanks Kevin and Bob. PTF = minor version? I can?t think what it might stand for. Something Time Fix? Point in time fix? From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Buterbaugh, Kevin L Sent: 15 August 2016 13:45 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Minor GPFS versions coexistence problems? Richard, I will second what Bob said with one caveat ? on one occasion we had an issue with our multi-cluster setup because the PTF?s were incompatible. However, that was clearly documented in the release notes, which we obviously hadn?t read carefully enough. While we generally do rolling upgrades over a two to three week period, we have run for months with clients at differing PTF levels. HTHAL? Kevin On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert > wrote: In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Sobey, Richard A" > Reply-To: gpfsug main discussion list > Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" > Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems? Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they?re on different versions? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From kdball at us.ibm.com Mon Aug 15 15:12:07 2016 From: kdball at us.ibm.com (Keith D Ball) Date: Mon, 15 Aug 2016 14:12:07 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 55, Issue 16 In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
URL: From jake.carroll at uq.edu.au Mon Aug 15 22:08:58 2016 From: jake.carroll at uq.edu.au (Jake Carroll) Date: Mon, 15 Aug 2016 21:08:58 +0000 Subject: [gpfsug-discuss] More on AFM cache chaining Message-ID: <94AB3BCD-B551-4F3E-9128-65B582A4ABC6@uq.edu.au> Hi there. In the spirit of a conversation a friend showed me a couple of weeks ago from Radhika Parameswaran and Luke Raimbach, we?re doing something similar to Luke (kind of), or at least attempting it, in regards to cache chaining. We?ve got a large research storage platform in Brisbane, Queensland, Australia and we?re trying to leverage a few different modes of operation. Currently: Cache A (IW) connects to what would be a Home (B) which then is effectively an NFS mount to (C) a DMF based NFS export. To a point, this works. It kind of allows us to use ?home? as the ultimate sink, and data migration in and out of DMF seems to be working nicely when GPFS pulls things from (B) which don?t appear to currently be in (A) due to policy, or a HWM was hit (thus emptying cache). We?ve tested it as far out as the data ONLY being offline in tape media inside (C) and it still works, cleanly coming back to (A) within a very reasonable time-frame. ? We hit ?problem 1? which is in and around NFS v4 ACL?s which aren?t surfacing or mapping correctly (as we?d expect). I guess this might be the caveat of trying to backend the cache to a home and have it sitting inside DMF (over an NFS Export) for surfacing of the data for clients. Where we?d like to head: We haven?t seen it yet, but as Luke and Radhika were discussing last month, we really liked the idea of an IW Cache (A, where instruments dump huge data) which then via AFM ends up at (B) (might also be technically ?home? but IW) which is then also a function of (C) which might also be another cache that sits next to a HPC platform for reading and writing data into quickly and out of in parallel. We like the idea of chained caches because it gives us extremely flexibility in the premise of our ?Data anywhere? fabric. We appreciate that this has some challenges, in that we know if you?ve got multiple IW scenarios the last write will always win ? this we can control with workload guidelines. But we?d like to add our voices to this idea of having caches chained all the way back to some point such that data is being pulled all the way from C --> B --> A and along the way, inflection points of IO might be written and read at point C and point B AND point A such that everyone would see the distribution and consistent data in the end. We?re also working on surfacing data via object and file simultaneously for different needs. This is coming along relatively well, but we?re still learning about where and where this does not make sense so far. A moving target, from how it all appears on the surface. Some might say that is effectively asking for a globally eventually (always) consistent filesystem within Scale?. Anyway ? just some thoughts. Regards, -jc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aaron.s.knister at nasa.gov Tue Aug 16 03:22:17 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 15 Aug 2016 22:22:17 -0400 Subject: [gpfsug-discuss] mmfsadm test pit Message-ID: I just discovered this interesting gem poking at mmfsadm: test pit fsname list|suspend|status|resume|stop [jobId] There have been times where I've kicked off a restripe and either intentionally or accidentally ctrl-c'd it only to realize that many times it's disappeared into the ether and is still running. The only way I've known so far to stop it is with a chgmgr. A far more painful instance happened when I ran a rebalance on an fs w/more than 31 nsds using more than 31 pit workers and hit *that* fun APAR which locked up access for a single filesystem to all 3.5k nodes. We spent 48 hours round the clock rebooting nodes as jobs drained to clear it up. I would have killed in that instance for a way to cancel the PIT job (the chmgr trick didn't work). It looks like you might actually be able to do this with mmfsadm, although how wise this is, I do not know (kinda curious about that). Here's an example. I kicked off a restripe and then ctrl-c'd it on a client node. Then ran these commands from the fs manager: root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_RUNNING progress 0.00 debug: statusListP D40E2C70 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal stop 785979015170 debug: statusListP 0 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_STOPPING progress 4.01 debug: statusListP D4013E70 ... some time passes ... root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list debug: statusListP 0 Interesting. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From volobuev at us.ibm.com Tue Aug 16 16:21:13 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Tue, 16 Aug 2016 08:21:13 -0700 Subject: [gpfsug-discuss] 4.2.1 documentation In-Reply-To: References: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> Message-ID: Light Weight Event support is not fully baked yet, and thus not documented. It's getting there. yuri From: "Daniel Kidger" To: "gpfsug main discussion list" , Cc: "gpfsug-discuss" Date: 08/04/2016 01:23 AM Subject: Re: [gpfsug-discuss] 4.2.1 documentation Sent by: gpfsug-discuss-bounces at spectrumscale.org Yes they have been re arranged. My observation is that the Admin and Advanced Admin have merged into one PDFs, and the DMAPI manual is now a chapter of the new Programming guide (along with the complete set of man pages which have moved out of the Admin guide). Table 3 on page 26 of the Concepts, Planning and Install guide describes these change. IMHO The new format is much better as all Admin is in one place not two. ps. I couldn't find in the programming guide a chapter yet on Light Weight Events. Anyone in product development care to comment? :-) Daniel IBM Spectrum Storage Software +44 (0)7818 522266 Sent from my iPad using IBM Verse On 4 Aug 2016, 03:42:21, Greg.Lehmann at csiro.au wrote: From: Greg.Lehmann at csiro.au To: gpfsug-discuss at spectrumscale.org Cc: Date: 4 Aug 2016 03:42:21 Subject: [gpfsug-discuss] 4.2.1 documentation I see only 4 pdfs now with slightly different titles to the previous 5 pdfs available with 4.2.0. Just checking there are only supposed to be 4 now? 
Greg Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From volobuev at us.ibm.com Tue Aug 16 16:42:33 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Tue, 16 Aug 2016 08:42:33 -0700 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160804123409.18403cy3iz123gxt@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca><20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca><20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca><7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu> <20160804123409.18403cy3iz123gxt@support.scinet.utoronto.ca> Message-ID: This is a long discussion thread, touching on several related subjects, but as far as the original "secondary groups" question, things are quite simple. A file in a Unix file system has an owning user and an owning group. Those are two IDs that are stored in the inode on disk, and those IDs are used to charge the corresponding user and group quotas. Exactly how the owning GID gets set is an entirely separate question. It may be the current user's primary group, or a secondary group, or a result of chown, etc. To GPFS code it doesn't matter what supplementary GIDs a given thread has in its security context for the purposes of charging group quota, the only thing that matters is the GID in the file inode. yuri From: "Jaime Pinto" To: "gpfsug main discussion list" , "Buterbaugh, Kevin L" , Date: 08/04/2016 09:34 AM Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? Sent by: gpfsug-discuss-bounces at spectrumscale.org OK More info: Users can apply the 'sg group1' or 'sq group2' command from a shell or script to switch the group mask from that point on, and dodge the quota that may have been exceeded on a group. However, as the group owner or other member of the group on the limit, I could not find a tool they can use on their own to find out who is(are) the largest user(s); 'du' takes too long, and some users don't give read permissions on their directories. As part of the puzzle solution I have to come up with a root wrapper that can make the contents of the mmrepquota report available to them. Jaime Quoting "Buterbaugh, Kevin L" : > Hi Jaime, > > Thank you sooooo much for doing this and reporting back the results! > They?re in line with what I would expect to happen. I was going > to test this as well, but we have had to extend our downtime until > noontime tomorrow, so I haven?t had a chance to do so yet. Now I > don?t have to? ;-) > > Kevin > > On Aug 4, 2016, at 10:59 AM, Jaime Pinto > > wrote: > > Since there were inconsistencies in the responses, I decided to rig > a couple of accounts/groups on our LDAP to test "My interpretation", > and determined that I was wrong. 
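(A quick way to see Yuri's point above in practice: the GID recorded in the file's inode is what gets charged, regardless of the writer's primary group. Paths and names below are made up:)

sg group2 -c 'touch /gpfs/gpfs01/proj/newfile'    # create a file with the secondary group active
stat -c '%U:%G %n' /gpfs/gpfs01/proj/newfile      # owning UID/GID stored in the inode
chgrp group1 /gpfs/gpfs01/proj/newfile            # re-charges the file against group1's quota instead
mmrepquota -g gpfs01 | egrep 'group1|group2'      # group totals follow the owning GID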
When Kevin mentioned it would mean > a bug I had to double-check: > > If a user hits the hard quota or exceeds the grace period on the > soft quota on any of the secondary groups that user will be stopped > from further writing to those groups as well, just as in the primary > group. > > I hope this clears the waters a bit. I still have to solve my puzzle. > > Thanks everyone for the feedback. > Jaime > > > > Quoting "Jaime Pinto" > >: > > Quoting "Buterbaugh, Kevin L" > >: > > Hi Sven, > > Wait - am I misunderstanding something here? Let?s say that I have > ?user1? who has primary group ?group1? and secondary group > ?group2?. And let?s say that they write to a directory where the > bit on the directory forces all files created in that directory to > have group2 associated with them. Are you saying that those files > still count against group1?s group quota??? > > Thanks for clarifying? > > Kevin > > Not really, > > My interpretation is that all files written with group2 will count > towards the quota on that group. However any users with group2 as the > primary group will be prevented from writing any further when the > group2 quota is reached. However the culprit user1 with primary group > as group1 won't be detected by gpfs, and can just keep going on writing > group2 files. > > As far as the individual user quota, it doesn't matter: group1 or > group2 it will be counted towards the usage of that user. > > It would be interesting if the behavior was more as expected. I just > checked with my Lustre counter-parts and they tell me whichever > secondary group is hit first, however many there may be, the user will > be stopped. The problem then becomes identifying which of the secondary > groups hit the limit for that user. > > Jaime > > > > On Aug 3, 2016, at 11:35 AM, Sven Oehme > > > wrote: > > Hi, > > quotas are only counted against primary group > > sven > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > < mailto:pinto at scinet.utoronto.ca>> > wrote: > Suppose I want to set both USR and GRP quotas for a user, however > GRP is not the primary group. Will gpfs enforce the secondary group > quota for that user? > > What I mean is, if the user keeps writing files with secondary > group as the attribute, and that overall group quota is reached, > will that user be stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca< http://www.scinet.utoronto.ca/> - > www.computecanada.org< http://www.computecanada.org/> > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. 
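One possible shape for that root wrapper is a small script run through sudo that filters the mmrepquota group report down to the caller's own groups (the script name, file system name and report layout are assumptions, not anything from this thread):

#!/bin/bash
# /usr/local/sbin/groupquota -- intended to be invoked via a sudoers rule
fs=gpfs01                                       # example file system
caller=${SUDO_USER:-$USER}
membership=$(id -Gn "$caller" | tr ' ' '|')     # groups the caller belongs to
# keep the report headers plus only the rows for the caller's groups;
# adjust the pattern if the mmrepquota column layout differs on your release
/usr/lpp/mmfs/bin/mmrepquota -g "$fs" | egrep "Block|Name|^(${membership})[[:space:]]"

A matching sudoers entry such as '%labusers ALL=(root) NOPASSWD: /usr/local/sbin/groupquota' would let group members run only this report and nothing else.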
> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Robert.Oesterlin at nuance.com Tue Aug 16 16:59:13 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 16 Aug 2016 15:59:13 +0000 Subject: [gpfsug-discuss] Attending IBM Edge? Sessions of note and possible meet-up Message-ID: <29EA4D63-8885-42C5-876C-D68EB9E1CFDE@nuance.com> For those of you on the mailing list attending the IBM Edge conference in September, there will be at least one NDA session on Spectrum Scale and its future directions. I've heard that there will be a session on licensing as well. (always a hot topic). I have a couple of talks: Spectrum Scale with Transparent Cloud Tiering and on Spectrum Scale with Spectrum Control. I'll try and organize some sort of informal meetup one of the nights - thoughts on when would be welcome. Probably not Tuesday night, as that's the entertainment night. :-) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfosburg at mdanderson.org Tue Aug 16 17:13:17 2016 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Tue, 16 Aug 2016 16:13:17 +0000 Subject: [gpfsug-discuss] Attending IBM Edge? 
Sessions of note and possible meet-up In-Reply-To: <29EA4D63-8885-42C5-876C-D68EB9E1CFDE@nuance.com> References: <29EA4D63-8885-42C5-876C-D68EB9E1CFDE@nuance.com> Message-ID: <57c145ab-4207-7550-af57-ff07d6ac8f2d@mdanderson.org> I am speaking: SNP-2408 : Implementing a Research Storage Environment Using IBM Spectrum Software at MD Anderson Cancer Center Program : Enabling Cognitive IT with Storage and Software Defined Solutions Track : Building Oceans of Data Session Type : Breakout Session Date/Time : Tue, 20-Sep, 05:00 PM-06:00 PM Location : MGM Grand - Room 104 Presenter(s):Jonathan Fosburgh, UT MD Anderson This is primarily dealing with Scale and Archive, and also includes Protect. -- Jonathan Fosburgh Principal Application Systems Analyst Storage Team IT Operations jfosburg at mdanderson.org (713) 745-9346 On 08/16/2016 10:59 AM, Oesterlin, Robert wrote: For those of you on the mailing list attending the IBM Edge conference in September, there will be at least one NDA session on Spectrum Scale and its future directions. I've heard that there will be a session on licensing as well. (always a hot topic). I have a couple of talks: Spectrum Scale with Transparent Cloud Tiering and on Spectrum Scale with Spectrum Control. I'll try and organize some sort of informal meetup one of the nights - thoughts on when would be welcome. Probably not Tuesday night, as that's the entertainment night. :-) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Tue Aug 16 22:09:35 2016 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 16 Aug 2016 17:09:35 -0400 Subject: [gpfsug-discuss] mmfsadm test pit In-Reply-To: References: Message-ID: I was surprised to read that Ctrl-C did not really kill restripe. It's supposed to! If it doesn't that's a bug. I ran this by my expert within IBM and he wrote to me: First of all a "PIT job" such as restripe, deldisk, delsnapshot, and such should be easy to stop by ^C the management program that started them. The SG manager daemon holds open a socket to the client program for the purposes of sending command output, progress updates, error messages and the like. The PIT code checks this socket periodically and aborts the PIT process cleanly if the socket is closed. If this cleanup doesn't occur, it is a bug and should be worth reporting. However, there's no exact guarantee on how quickly each thread on the SG mgr will notice and then how quickly the helper nodes can be stopped and so forth. 
The interval between socket checks depends among other things on how long it takes to process each file, if there are a few very large files, the delay can be significant. In the limiting case, where most of the FS storage is contained in a few files, this mechanism doesn't work [elided] well. So it can be quite involved and slow sometimes to wrap up a PIT operation. The simplest way to determine if the command has really stopped is with the mmdiag --commands issued on the SG manager node. This shows running commands with the command line, start time, socket, flags, etc. After ^Cing the client program, the entry here should linger for a while, then go away. When it exits you'll see an entry in the GPFS log file where it fails with err 50. If this doesn't stop the command after a while, it is worth looking into. If the command wasn't issued on the SG mgr node and you can't find the where the client command is running, the socket is still a useful hint. While tedious, it should be possible to trace this socket back to node where that command was originally run using netstat or equivalent. Poking around inside a GPFS internaldump will also provide clues; there should be an outstanding sgmMsgSGClientCmd command listed in the dump tscomm section. Once you find it, just 'kill `pidof mmrestripefs` or similar. I'd like to warn the OP away from mmfsadm test pit. These commands are of course unsupported and unrecommended for any purpose (even internal test and development purposes, as far as I know). You are definitely working without a net there. When I was improving the integration between PIT and snapshot quiesce a few years ago, I looked into this and couldn't figure out how to (easily) make these stop and resume commands safe to use, so as far as I know they remain unsafe. The list command, however, is probably fairly okay; but it would probably be better to use mmfsadm saferdump pit. From: Aaron Knister To: Date: 08/15/2016 10:49 PM Subject: [gpfsug-discuss] mmfsadm test pit Sent by: gpfsug-discuss-bounces at spectrumscale.org I just discovered this interesting gem poking at mmfsadm: test pit fsname list|suspend|status|resume|stop [jobId] There have been times where I've kicked off a restripe and either intentionally or accidentally ctrl-c'd it only to realize that many times it's disappeared into the ether and is still running. The only way I've known so far to stop it is with a chgmgr. A far more painful instance happened when I ran a rebalance on an fs w/more than 31 nsds using more than 31 pit workers and hit *that* fun APAR which locked up access for a single filesystem to all 3.5k nodes. We spent 48 hours round the clock rebooting nodes as jobs drained to clear it up. I would have killed in that instance for a way to cancel the PIT job (the chmgr trick didn't work). It looks like you might actually be able to do this with mmfsadm, although how wise this is, I do not know (kinda curious about that). Here's an example. I kicked off a restripe and then ctrl-c'd it on a client node. Then ran these commands from the fs manager: root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_RUNNING progress 0.00 debug: statusListP D40E2C70 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal stop 785979015170 debug: statusListP 0 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_STOPPING progress 4.01 debug: statusListP D4013E70 ... some time passes ... 
root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list debug: statusListP 0 Interesting. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 16 22:55:19 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 16 Aug 2016 17:55:19 -0400 Subject: [gpfsug-discuss] mmfsadm test pit In-Reply-To: References: Message-ID: Thanks Marc! That's incredibly helpful info. I'll uh, not use the test pit command :) -Aaron On 8/16/16 5:09 PM, Marc A Kaplan wrote: > I was surprised to read that Ctrl-C did not really kill restripe. It's > supposed to! If it doesn't that's a bug. > > I ran this by my expert within IBM and he wrote to me: > > First of all a "PIT job" such as restripe, deldisk, delsnapshot, and > such should be easy to stop by ^C the management program that started > them. The SG manager daemon holds open a socket to the client program > for the purposes of sending command output, progress updates, error > messages and the like. The PIT code checks this socket periodically and > aborts the PIT process cleanly if the socket is closed. If this cleanup > doesn't occur, it is a bug and should be worth reporting. However, > there's no exact guarantee on how quickly each thread on the SG mgr will > notice and then how quickly the helper nodes can be stopped and so > forth. The interval between socket checks depends among other things on > how long it takes to process each file, if there are a few very large > files, the delay can be significant. In the limiting case, where most > of the FS storage is contained in a few files, this mechanism doesn't > work [elided] well. So it can be quite involved and slow sometimes to > wrap up a PIT operation. > > The simplest way to determine if the command has really stopped is with > the mmdiag --commands issued on the SG manager node. This shows running > commands with the command line, start time, socket, flags, etc. After > ^Cing the client program, the entry here should linger for a while, then > go away. When it exits you'll see an entry in the GPFS log file where > it fails with err 50. If this doesn't stop the command after a while, > it is worth looking into. > > If the command wasn't issued on the SG mgr node and you can't find the > where the client command is running, the socket is still a useful hint. > While tedious, it should be possible to trace this socket back to node > where that command was originally run using netstat or equivalent. > Poking around inside a GPFS internaldump will also provide clues; there > should be an outstanding sgmMsgSGClientCmd command listed in the dump > tscomm section. Once you find it, just 'kill `pidof mmrestripefs` or > similar. > > I'd like to warn the OP away from mmfsadm test pit. These commands are > of course unsupported and unrecommended for any purpose (even internal > test and development purposes, as far as I know). You are definitely > working without a net there. When I was improving the integration > between PIT and snapshot quiesce a few years ago, I looked into this and > couldn't figure out how to (easily) make these stop and resume commands > safe to use, so as far as I know they remain unsafe. 
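As a rough sketch of the checking sequence Marc describes (the file system and node names below are made up):

/usr/lpp/mmfs/bin/mmlsmgr gpfs01                  # note which node is the fs (SG) manager
ssh nsd01 /usr/lpp/mmfs/bin/mmdiag --commands     # look for a lingering mmrestripefs entry
# once the node that originally launched the restripe has been identified
# (e.g. by tracing the socket shown in the output), on that node:
kill $(pidof mmrestripefs)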
The list command, > however, is probably fairly okay; but it would probably be better to use > mmfsadm saferdump pit. > > > > > > From: Aaron Knister > To: > Date: 08/15/2016 10:49 PM > Subject: [gpfsug-discuss] mmfsadm test pit > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------------------------------------------------ > > > > I just discovered this interesting gem poking at mmfsadm: > > test pit fsname list|suspend|status|resume|stop [jobId] > > There have been times where I've kicked off a restripe and either > intentionally or accidentally ctrl-c'd it only to realize that many > times it's disappeared into the ether and is still running. The only way > I've known so far to stop it is with a chgmgr. > > A far more painful instance happened when I ran a rebalance on an fs > w/more than 31 nsds using more than 31 pit workers and hit *that* fun > APAR which locked up access for a single filesystem to all 3.5k nodes. > We spent 48 hours round the clock rebooting nodes as jobs drained to > clear it up. I would have killed in that instance for a way to cancel > the PIT job (the chmgr trick didn't work). It looks like you might > actually be able to do this with mmfsadm, although how wise this is, I > do not know (kinda curious about that). > > Here's an example. I kicked off a restripe and then ctrl-c'd it on a > client node. Then ran these commands from the fs manager: > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list > JobId 785979015170 PitJobStatus PIT_JOB_RUNNING progress 0.00 > debug: statusListP D40E2C70 > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal stop > 785979015170 > debug: statusListP 0 > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list > JobId 785979015170 PitJobStatus PIT_JOB_STOPPING progress 4.01 > debug: statusListP D4013E70 > > ... some time passes ... > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list > debug: statusListP 0 > > Interesting. > > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From aaron.s.knister at nasa.gov Wed Aug 17 02:46:39 2016 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Wed, 17 Aug 2016 01:46:39 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? Message-ID: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. 
I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Aug 17 12:45:04 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 17 Aug 2016 11:45:04 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> References: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> Message-ID: <7BFE2D50-9AA9-4A78-A05A-08D5DEB0A2E1@nuance.com> Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" Reply-To: gpfsug main discussion list Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron -------------- next part -------------- An HTML attachment was scrubbed... 
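If periodic sampling of the queues is attempted despite the caveats above, a very conservative loop along these lines at least bounds the exposure (the interval, the timeout and the grep pattern are guesses; the dump output format is undocumented and the risks discussed in this thread still apply):

while sleep 300; do                 # sample every 5 minutes rather than every 10 seconds
    date
    timeout 30 /usr/lpp/mmfs/bin/mmfsadm saferdump nsd 2>/dev/null | grep -i queue
done >> /var/log/nsd-queue-sample.log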
URL: From volobuev at us.ibm.com Wed Aug 17 21:34:57 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Wed, 17 Aug 2016 13:34:57 -0700 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <7BFE2D50-9AA9-4A78-A05A-08D5DEB0A2E1@nuance.com> References: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> <7BFE2D50-9AA9-4A78-A05A-08D5DEB0A2E1@nuance.com> Message-ID: Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri From: "Oesterlin, Robert" To: gpfsug main discussion list , Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" Reply-To: gpfsug main discussion list Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. 
An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From SAnderson at convergeone.com Wed Aug 17 22:11:25 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Wed, 17 Aug 2016 21:11:25 +0000 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Message-ID: <1471468285737.63407@convergeone.com> ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 [sig] [RH_CertifiedSysAdmin_CMYK] [Linux on IBM Power Systems - Sales 2016] [IBM Spectrum Storage - Sales 2016] NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 14134 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.jpg Type: image/jpeg Size: 2593 bytes Desc: image003.jpg URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image005.png Type: image/png Size: 11635 bytes Desc: image005.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image007.png Type: image/png Size: 11505 bytes Desc: image007.png URL: From YARD at il.ibm.com Thu Aug 18 00:11:52 2016 From: YARD at il.ibm.com (Yaron Daniel) Date: Thu, 18 Aug 2016 02:11:52 +0300 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive In-Reply-To: <1471468285737.63407@convergeone.com> References: <1471468285737.63407@convergeone.com> Message-ID: Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? 
Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services - Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 14134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 2593 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11635 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11505 bytes Desc: not available URL: From SAnderson at convergeone.com Thu Aug 18 02:51:38 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Thu, 18 Aug 2016 01:51:38 +0000 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive In-Reply-To: References: <1471468285737.63407@convergeone.com>, Message-ID: <1471485097896.49269@convergeone.com> ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? 
Regards ________________________________ Yaron Daniel 94 Em Ha'Moshavot Rd [cid:_1_0DDE2A700DDE24DC007F6D32C2258012] Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 [sig] [RH_CertifiedSysAdmin_CMYK] [Linux on IBM Power Systems - Sales 2016] [IBM Spectrum Storage - Sales 2016] NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 1851 bytes Desc: ATT00001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00002.png Type: image/png Size: 14134 bytes Desc: ATT00002.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00003.jpg Type: image/jpeg Size: 2593 bytes Desc: ATT00003.jpg URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00004.png Type: image/png Size: 11635 bytes Desc: ATT00004.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00005.png Type: image/png Size: 11505 bytes Desc: ATT00005.png URL: From YARD at il.ibm.com Thu Aug 18 04:56:50 2016 From: YARD at il.ibm.com (Yaron Daniel) Date: Thu, 18 Aug 2016 06:56:50 +0300 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive In-Reply-To: <1471485097896.49269@convergeone.com> References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com> Message-ID: So - the procedure you are asking related to Samba. 
Please check at redhat Site the process of upgrade Samba - u will need to backup the tdb files and restore them. But pay attention that the Samba ids will remain the same after moving to CES - please review the Authentication Section. Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services - Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: gpfsug main discussion list Date: 08/18/2016 04:52 AM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 14134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 2593 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11635 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11505 bytes Desc: not available URL: From Robert.Oesterlin at nuance.com Thu Aug 18 15:47:25 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 18 Aug 2016 14:47:25 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? Message-ID: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> Done. Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) ID: 93260 Headline: Give sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics Submitted on: 18 Aug 2016, 10:46 AM Eastern Time (ET) Brand: Servers and Systems Software Product: Spectrum Scale (formerly known as GPFS) - Public RFEs Link: http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid 507-269-0413 From: on behalf of Yuri L Volobuev Reply-To: gpfsug main discussion list Date: Wednesday, August 17, 2016 at 3:34 PM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri [nactive hide details for "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---]"Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latenc From: "Oesterlin, Robert" To: gpfsug main discussion list , Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. 
In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" Reply-To: gpfsug main discussion list Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: From bbanister at jumptrading.com Thu Aug 18 16:00:21 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 18 Aug 2016 15:00:21 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> References: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> Great stuff? I added my vote, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Oesterlin, Robert Sent: Thursday, August 18, 2016 9:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Done. 
Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) ID: 93260 Headline: Give sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics Submitted on: 18 Aug 2016, 10:46 AM Eastern Time (ET) Brand: Servers and Systems Software Product: Spectrum Scale (formerly known as GPFS) - Public RFEs Link: http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid 507-269-0413 From: > on behalf of Yuri L Volobuev > Reply-To: gpfsug main discussion list > Date: Wednesday, August 17, 2016 at 3:34 PM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri [nactive hide details for "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---]"Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latenc From: "Oesterlin, Robert" > To: gpfsug main discussion list >, Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" > Reply-To: gpfsug main discussion list > Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. 
What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: From mimarsh2 at vt.edu Thu Aug 18 16:15:38 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Thu, 18 Aug 2016 11:15:38 -0400 Subject: [gpfsug-discuss] NSD Server BIOS setting - snoop mode Message-ID: All, Is there any best practice or recommendation for the Snoop Mode memory setting for NSD Servers? Default is Early Snoop. On compute nodes, I am using Cluster On Die, which creates 2 NUMA nodes per processor. This setup has 2 x 16-core Broadwell processors in each NSD server. Brian -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmcpheeters at anl.gov Thu Aug 18 16:14:11 2016 From: gmcpheeters at anl.gov (McPheeters, Gordon) Date: Thu, 18 Aug 2016 15:14:11 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <97F08A04-D7C4-4985-840F-DC026E8606F4@anl.gov> Got my vote - thanks Robert. Gordon McPheeters ALCF Storage (630) 252-6430 gmcpheeters at anl.gov On Aug 18, 2016, at 10:00 AM, Bryan Banister > wrote: Great stuff? 
I added my vote, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Oesterlin, Robert Sent: Thursday, August 18, 2016 9:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Done. Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) ID: 93260 Headline: Give sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics Submitted on: 18 Aug 2016, 10:46 AM Eastern Time (ET) Brand: Servers and Systems Software Product: Spectrum Scale (formerly known as GPFS) - Public RFEs Link: http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid 507-269-0413 From: > on behalf of Yuri L Volobuev > Reply-To: gpfsug main discussion list > Date: Wednesday, August 17, 2016 at 3:34 PM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latenc From: "Oesterlin, Robert" > To: gpfsug main discussion list >, Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" > Reply-To: gpfsug main discussion list > Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. 
We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of the NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From christof.schmitt at us.ibm.com Thu Aug 18 18:50:12 2016
From: christof.schmitt at us.ibm.com (Christof Schmitt)
Date: Thu, 18 Aug 2016 10:50:12 -0700
Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive
In-Reply-To: <1471485097896.49269@convergeone.com>
References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com>
Message-ID:
Samba as supported in Spectrum Scale uses the "autorid" module for creating internal id mappings (see man idmap_autorid for some details). Officially supported are also methods to retrieve id mappings from an external server: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_adfofile.htm The earlier email states that they have a " .tdb backend for id mapping on their current server. ". How exactly is that configured in Samba? Which Samba version is used here?
Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/17/2016 06:52 PM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From SAnderson at convergeone.com Thu Aug 18 19:11:02 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Thu, 18 Aug 2016 18:11:02 +0000 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive In-Reply-To: References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com> Message-ID: Correct. We are upgrading their existing configuration and want to switch to CES provided Samba. They are using Samba 3.6.24 currently on RHEL 6.6. 
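(For anyone following along: before migrating, it can help to confirm what the existing Samba server is actually using for id mapping. A minimal sketch, assuming a standard Samba install; the grep pattern is only illustrative:
smbd -V
testparm -s 2>/dev/null | grep -i 'idmap config'
testparm dumps the effective smb.conf, so the backend and range lines discussed in this thread should show up there.)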
Here is the head of the smb.conf file: =================================================== [global] workgroup = SL1 netbios name = SLTLTFSEE server string = LTFSEE Server realm = removed.ORG security = ads encrypt passwords = yes default = global browseable = no socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 idmap config * : backend = tdb idmap config * : range = 1000000-9000000 template shell = /bash/bin writable = yes allow trusted domains = yes client ntlmv2 auth = yes auth methods = guest sam winbind passdb backend = tdbsam groupdb:backend = tdb interfaces = eth1 lo username map = /etc/samba/smbusers map to guest = bad uid guest account = nobody ===================================================== Does that make sense? Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: Thursday, August 18, 2016 11:50 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive Samba as supported in Spectrum Scale uses the "autorid" module for creating internal id mappings (see man idmap_autorid for some details). Officially supported are also methods to retrieve id mappings from an external server: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_adfofile.htm The earlier email states that they have a " .tdb backend for id mapping on their current server. ". How exactly is that configured in Samba? Which Samba version is used here? So the plan is to upgrade the cluster, and then switch to the Samba version provided with CES? Should the same id mappings be used? Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/17/2016 06:52 PM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. 
I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. From Kevin.Buterbaugh at Vanderbilt.Edu Thu Aug 18 20:05:03 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Thu, 18 Aug 2016 19:05:03 +0000 Subject: [gpfsug-discuss] Please ignore - debugging an issue Message-ID: Please ignore. I am working with the list admins on an issue and need to send an e-mail to the list to duplicate the problem. I apologize that this necessitates this e-mail to the list. Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Thu Aug 18 20:43:50 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 18 Aug 2016 12:43:50 -0700 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive In-Reply-To: References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com> Message-ID: There are a few points to consider here: CES uses Samba in cluster mode with ctdb. That means that the tdb database is shared through ctdb on all protocol nodes, and the internal format is slightly different since it contains additional information for tracking the cross-node status of the individual records. Spectrum Scale officially supports the autorid module for internal id mapping. That approach is different than the older idmap_tdb since it basically only has one record per AD domain, and not one record per user or group. This is known to scale better in environments where many users and groups require id mappings. The downside is that data from idmap_tdb cannot be directly imported. 
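For illustration only, an autorid-based setup ends up with Samba settings along these lines; the values below are placeholders rather than anything mmuserauth would necessarily pick, but they show the shape of it: one deterministic range per domain instead of one tdb record per user or group:
idmap config * : backend = autorid
idmap config * : rangesize = 1000000
idmap config * : range = 10000000-299999999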
While not officially supported Spectrum Scale also ships the idmap_tdb module. You could configure authentication and internal id mapping on Spectrum Scale, and then overwrite the config manually to use the old idmap module (the idmap-range-size is required, but not relevant later on): mmuserauth service create ... --idmap-range 1000000-9000000 --idmap-range-size 100000 /usr/lpp/mmfs/bin/net conf setparm global 'idmap config * : backend' tdb mmdsh -N CesNodes systemctl restart gpfs-winbind mmdsh -N CesNodes /usr/lpp/mmfs/bin/net cache flush With the old Samba, export the idmap data to a file: net idmap dump > idmap-dump.txt And on a node running CES Samba import that data, and remove any old cached entries: /usr/lpp/mmfs/bin/net idmap restore idmap-dump.txt mmdsh -N CesNodes /usr/lpp/mmfs/bin/net cache flush Just to be clear: This is untested and if there is a problem with the id mapping in that configuration, it will likely be pointed to the unsupported configuration. The way to request this as an official feature would be through a RFE, although i cannot say whether that would be picked up by product management. Another option would be creating the id mappings in the Active Directory records or in a external LDAP server based on the old mappings, and point the CES Samba to that data. That would again be a supported configuration. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: Christof Schmitt/Tucson/IBM at IBMUS Cc: gpfsug main discussion list Date: 08/18/2016 11:11 AM Subject: RE: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive Correct. We are upgrading their existing configuration and want to switch to CES provided Samba. They are using Samba 3.6.24 currently on RHEL 6.6. Here is the head of the smb.conf file: =================================================== [global] workgroup = SL1 netbios name = SLTLTFSEE server string = LTFSEE Server realm = removed.ORG security = ads encrypt passwords = yes default = global browseable = no socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 idmap config * : backend = tdb idmap config * : range = 1000000-9000000 template shell = /bash/bin writable = yes allow trusted domains = yes client ntlmv2 auth = yes auth methods = guest sam winbind passdb backend = tdbsam groupdb:backend = tdb interfaces = eth1 lo username map = /etc/samba/smbusers map to guest = bad uid guest account = nobody ===================================================== Does that make sense? Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: Thursday, August 18, 2016 11:50 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive Samba as supported in Spectrum Scale uses the "autorid" module for creating internal id mappings (see man idmap_autorid for some details). Officially supported are also methods to retrieve id mappings from an external server: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_adfofile.htm The earlier email states that they have a " .tdb backend for id mapping on their current server. ". How exactly is that configured in Samba? Which Samba version is used here? 
So the plan is to upgrade the cluster, and then switch to the Samba version provided with CES? Should the same id mappings be used? Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/17/2016 06:52 PM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. 
If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. From jez.tucker at gpfsug.org Thu Aug 18 20:57:00 2016 From: jez.tucker at gpfsug.org (Jez Tucker) Date: Thu, 18 Aug 2016 20:57:00 +0100 Subject: [gpfsug-discuss] If you are experiencing mail stuck in spam / bounces Message-ID: Hi all As the discussion group is a mailing list, it is possible that members can experience the list traffic being interpreted as spam. In such instances, you may experience better results if you whitelist the mailing list addresses or create a 'Not Spam' filter (E.G. gmail) gpfsug-discuss at spectrumscale.org gpfsug-discuss at gpfsug.org You can test that you can receive a response from the mailing list server by sending an email to: gpfsug-discuss-request at spectrumscale.org with the subject of: help Should you experience further trouble, please ping us at: gpfsug-discuss-owner at spectrumscale.org All the best, Jez From aaron.s.knister at nasa.gov Fri Aug 19 05:12:26 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 19 Aug 2016 00:12:26 -0400 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> References: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> Message-ID: <140fab1a-e043-5c20-eb1f-d5ef7e91d89d@nasa.gov> Figured I'd throw in my "me too!" as well. We have ~3500 nodes and 60 gpfs server nodes and we've done several rounds of rolling upgrades starting with 3.5.0.19 -> 3.5.0.24. We've had the cluster with a mix of both versions for quite some time (We're actually in that state right now as it would happen and have been for several months). I've not seen any issue with it. Of course, as Richard alluded to, its good to check the release notes :) -Aaron On 8/15/16 8:45 AM, Buterbaugh, Kevin L wrote: > Richard, > > I will second what Bob said with one caveat ? on one occasion we had an > issue with our multi-cluster setup because the PTF?s were incompatible. > However, that was clearly documented in the release notes, which we > obviously hadn?t read carefully enough. > > While we generally do rolling upgrades over a two to three week period, > we have run for months with clients at differing PTF levels. HTHAL? > > Kevin > >> On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert >> > wrote: >> >> In general, yes, it's common practice to do the 'rolling upgrades'. If >> I had to do my whole cluster at once, with an outage, I'd probably >> never upgrade. :) >> >> >> Bob Oesterlin >> Sr Storage Engineer, Nuance HPC Grid >> >> >> *From: *> > on behalf of >> "Sobey, Richard A" > > >> *Reply-To: *gpfsug main discussion list >> > > >> *Date: *Monday, August 15, 2016 at 4:59 AM >> *To: *"'gpfsug-discuss at spectrumscale.org >> '" >> > > >> *Subject: *[EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence >> problems? >> >> Hi all, >> >> If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to >> 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger >> it over a few days, perhaps up to 2 weeks or will I run into problems >> if they?re on different versions? >> >> Cheers >> >> Richard >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ? 
> Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu > - (615)875-9633 > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From aaron.s.knister at nasa.gov Fri Aug 19 05:13:06 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 19 Aug 2016 00:13:06 -0400 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: <140fab1a-e043-5c20-eb1f-d5ef7e91d89d@nasa.gov> References: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> <140fab1a-e043-5c20-eb1f-d5ef7e91d89d@nasa.gov> Message-ID: <70e33e6d-cd6b-5a5e-1e2d-f0ad16def5f4@nasa.gov> Oops... I meant Kevin, not Richard. On 8/19/16 12:12 AM, Aaron Knister wrote: > Figured I'd throw in my "me too!" as well. We have ~3500 nodes and 60 > gpfs server nodes and we've done several rounds of rolling upgrades > starting with 3.5.0.19 -> 3.5.0.24. We've had the cluster with a mix of > both versions for quite some time (We're actually in that state right > now as it would happen and have been for several months). I've not seen > any issue with it. Of course, as Richard alluded to, its good to check > the release notes :) > > -Aaron > > On 8/15/16 8:45 AM, Buterbaugh, Kevin L wrote: >> Richard, >> >> I will second what Bob said with one caveat ? on one occasion we had an >> issue with our multi-cluster setup because the PTF?s were incompatible. >> However, that was clearly documented in the release notes, which we >> obviously hadn?t read carefully enough. >> >> While we generally do rolling upgrades over a two to three week period, >> we have run for months with clients at differing PTF levels. HTHAL? >> >> Kevin >> >>> On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert >>> > >>> wrote: >>> >>> In general, yes, it's common practice to do the 'rolling upgrades'. If >>> I had to do my whole cluster at once, with an outage, I'd probably >>> never upgrade. :) >>> >>> >>> Bob Oesterlin >>> Sr Storage Engineer, Nuance HPC Grid >>> >>> >>> *From: *>> > on behalf of >>> "Sobey, Richard A" >> > >>> *Reply-To: *gpfsug main discussion list >>> >> > >>> *Date: *Monday, August 15, 2016 at 4:59 AM >>> *To: *"'gpfsug-discuss at spectrumscale.org >>> '" >>> >> > >>> *Subject: *[EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence >>> problems? >>> >>> Hi all, >>> >>> If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to >>> 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger >>> it over a few days, perhaps up to 2 weeks or will I run into problems >>> if they?re on different versions? >>> >>> Cheers >>> >>> Richard >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ? 
>> Kevin Buterbaugh - Senior System Administrator >> Vanderbilt University - Advanced Computing Center for Research and >> Education >> Kevin.Buterbaugh at vanderbilt.edu >> - (615)875-9633 >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bdeluca at gmail.com Fri Aug 19 05:15:00 2016 From: bdeluca at gmail.com (Ben De Luca) Date: Fri, 19 Aug 2016 07:15:00 +0300 Subject: [gpfsug-discuss] If you are experiencing mail stuck in spam / bounces In-Reply-To: References: Message-ID: Hey Jez, Its because the mailing list doesn't have an SPF record in your DNS, being neutral is a good way to be picked up as spam. On 18 August 2016 at 22:57, Jez Tucker wrote: > Hi all > > As the discussion group is a mailing list, it is possible that members can > experience the list traffic being interpreted as spam. > > > In such instances, you may experience better results if you whitelist the > mailing list addresses or create a 'Not Spam' filter (E.G. gmail) > > gpfsug-discuss at spectrumscale.org > > gpfsug-discuss at gpfsug.org > > > You can test that you can receive a response from the mailing list server by > sending an email to: gpfsug-discuss-request at spectrumscale.org with the > subject of: help > > > Should you experience further trouble, please ping us at: > gpfsug-discuss-owner at spectrumscale.org > > > All the best, > > > Jez > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jez.tucker at gpfsug.org Fri Aug 19 08:51:20 2016 From: jez.tucker at gpfsug.org (Jez Tucker) Date: Fri, 19 Aug 2016 08:51:20 +0100 Subject: [gpfsug-discuss] If you are experiencing mail stuck in spam / bounces In-Reply-To: References: Message-ID: <0c9d81b2-ac41-b6a5-e4f1-a816558711b7@gpfsug.org> Hi Yes, we looked at that some time ago and I recall we had an issues with setting up the SPF. However, probably a good time as any to look at it again. I'll ping Arif and Simon and they can look at their respective domains. Jez On 19/08/16 05:15, Ben De Luca wrote: > Hey Jez, > Its because the mailing list doesn't have an SPF record in your > DNS, being neutral is a good way to be picked up as spam. > > > > On 18 August 2016 at 22:57, Jez Tucker wrote: >> Hi all >> >> As the discussion group is a mailing list, it is possible that members can >> experience the list traffic being interpreted as spam. >> >> >> In such instances, you may experience better results if you whitelist the >> mailing list addresses or create a 'Not Spam' filter (E.G. 
gmail) >> >> gpfsug-discuss at spectrumscale.org >> >> gpfsug-discuss at gpfsug.org >> >> >> You can test that you can receive a response from the mailing list server by >> sending an email to: gpfsug-discuss-request at spectrumscale.org with the >> subject of: help >> >> >> Should you experience further trouble, please ping us at: >> gpfsug-discuss-owner at spectrumscale.org >> >> >> All the best, >> >> >> Jez >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From aaron.s.knister at nasa.gov Fri Aug 19 23:06:57 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 19 Aug 2016 18:06:57 -0400 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <97F08A04-D7C4-4985-840F-DC026E8606F4@anl.gov> References: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> <97F08A04-D7C4-4985-840F-DC026E8606F4@anl.gov> Message-ID: <5ca238de-bb95-2854-68bd-36d1b8df2810@nasa.gov> Thanks everyone! I also have a PMR open for this, so hopefully the RFE gets some traction. On 8/18/16 11:14 AM, McPheeters, Gordon wrote: > Got my vote - thanks Robert. > > > Gordon McPheeters > ALCF Storage > (630) 252-6430 > gmcpheeters at anl.gov > > > >> On Aug 18, 2016, at 10:00 AM, Bryan Banister >> > wrote: >> >> Great stuff? I added my vote, >> -Bryan >> >> *From:* gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] *On >> Behalf Of *Oesterlin, Robert >> *Sent:* Thursday, August 18, 2016 9:47 AM >> *To:* gpfsug main discussion list >> *Subject:* Re: [gpfsug-discuss] Monitor NSD server queue? >> >> Done. >> >> Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) >> >> ID: 93260 >> Headline: Give sysadmin insight >> into the inner workings of the NSD server machinery, in particular the >> queue dynamics >> Submitted on: 18 Aug 2016, 10:46 AM Eastern >> Time (ET) >> Brand: Servers and Systems >> Software >> Product: Spectrum Scale (formerly >> known as GPFS) - Public RFEs >> >> Link: >> http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 >> >> >> Bob Oesterlin >> Sr Storage Engineer, Nuance HPC Grid >> 507-269-0413 >> >> >> *From: *> > on behalf of Yuri L >> Volobuev > >> *Reply-To: *gpfsug main discussion list >> > > >> *Date: *Wednesday, August 17, 2016 at 3:34 PM >> *To: *gpfsug main discussion list > > >> *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? >> >> >> Unfortunately, at the moment there's no safe mechanism to show the >> usage statistics for different NSD queues. "mmfsadm saferdump nsd" as >> implemented doesn't acquire locks when parsing internal data >> structures. Now, NSD data structures are fairly static, as much things >> go, so the risk of following a stale pointer and hitting a segfault >> isn't particularly significant. I don't think I remember ever seeing >> mmfsd crash with NSD dump code on the stack. That said, this isn't >> code that's tested and known to be safe for production use. I haven't >> seen a case myself where an mmfsd thread gets stuck running this dump >> command, either, but Bob has. If that condition ever reoccurs, I'd be >> interested in seeing debug data. 
>> >> I agree that there's value in giving a sysadmin insight into the inner >> workings of the NSD server machinery, in particular the queue >> dynamics. mmdiag should be enhanced to allow this. That'd be a very >> reasonable (and doable) RFE. >> >> yuri >> >> "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron >> You did a perfect job of explaining a situation I've run into time >> after time - high latenc >> >> From: "Oesterlin, Robert" > > >> To: gpfsug main discussion list > >, >> Date: 08/17/2016 04:45 AM >> Subject: Re: [gpfsug-discuss] Monitor NSD server queue? >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> >> ------------------------------------------------------------------------ >> >> >> >> >> Hi Aaron >> >> You did a perfect job of explaining a situation I've run into time >> after time - high latency on the disk subsystem causing a backup in >> the NSD queues. I was doing what you suggested not to do - "mmfsadm >> saferdump nsd' and looking at the queues. In my case 'mmfsadm >> saferdump" would usually work or hang, rather than kill mmfsd. But - >> the hang usually resulted it a tied up thread in mmfsd, so that's no >> good either. >> >> I wish I had better news - this is the only way I've found to get >> visibility to these queues. IBM hasn't seen fit to gives us a way to >> safely look at these. I personally think it's a bug that we can't >> safely dump these structures, as they give insight as to what's >> actually going on inside the NSD server. >> >> Yuri, Sven - thoughts? >> >> >> Bob Oesterlin >> Sr Storage Engineer, Nuance HPC Grid >> >> >> >> *From: *> > on behalf of >> "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" >> >* >> Reply-To: *gpfsug main discussion list >> > >* >> Date: *Tuesday, August 16, 2016 at 8:46 PM* >> To: *gpfsug main discussion list > >* >> Subject: *[EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? >> >> Hi Everyone, >> >> We ran into a rather interesting situation over the past week. We had >> a job that was pounding the ever loving crap out of one of our >> filesystems (called dnb02) doing about 15GB/s of reads. We had other >> jobs experience a slowdown on a different filesystem (called dnb41) >> that uses entirely separate backend storage. What I can't figure out >> is why this other filesystem was affected. I've checked IB bandwidth >> and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth >> congestion, looked at the mmpmon nsd_ds counters (including disk >> request wait time), and checked out the disk iowait values from >> collectl. I simply can't account for the slowdown on the other >> filesystem. The only thing I can think of is the high latency on >> dnb02's NSDs caused the mmfsd NSD queues to back up. >> >> Here's my question-- how can I monitor the state of th NSD queues? I >> can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the >> queues and their status. I'm just not sure calling saferdump NSD every >> 10 seconds to monitor this data is going to end well. I've seen >> saferdump NSD cause mmfsd to die and that's from a task we only run >> every 6 hours that calls saferdump NSD. >> >> Any thoughts/ideas here would be great. >> >> Thanks! 
>> >> -Aaron_______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> ------------------------------------------------------------------------ >> >> Note: This email is for the confidential use of the named addressee(s) >> only and may contain proprietary, confidential or privileged >> information. If you are not the intended recipient, you are hereby >> notified that any review, dissemination or copying of this email is >> strictly prohibited, and to please notify the sender immediately and >> destroy this email and any attachments. Email transmission cannot be >> guaranteed to be secure or error-free. The Company, therefore, does >> not make any guarantees as to the completeness or accuracy of this >> email or any attachments. This email is for informational purposes >> only and does not constitute a recommendation, offer, request or >> solicitation of any kind to buy, sell, subscribe, redeem or perform >> any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From r.sobey at imperial.ac.uk Mon Aug 22 12:59:16 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 22 Aug 2016 11:59:16 +0000 Subject: [gpfsug-discuss] CES and mmuserauth command Message-ID: Hi all, We're just about to start testing a new CES 4.2.0 cluster and at the stage of "joining" the cluster to our AD. What's the bare minimum we need to get going with this? My Windows guy (who is more Linux but whatever) has suggested the following: mmuserauth service create --type ad --data-access-method file --netbios-name store --user-name USERNAME --password --enable-nfs-kerberos --enable-kerberos --servers list,of,servers --idmap-range-size 1000000 --idmap-range 3000000 - 3500000 --unixmap-domains 'DOMAIN(500 - 2000000)' He has also asked what the following is: --idmap-role ??? --idmap-range-size ?? All our LDAP GID/UIDs are coming from a system outside of GPFS so do we leave this blank, or say master Or, now I've re-read and mmuserauth page, is this purely for when you have AFM relationships and one GPFS cluster (the subordinate / the second cluster) gets its UIDs and GIDs from another GPFS cluster (the master / the first one)? For idmap-range-size is this essentially the highest number of users and groups you can have defined within Spectrum Scale? (I love how I'm using GPFS and SS interchangeably.. forgive me!) Many thanks Richard Richard Sobey Storage Area Network (SAN) Analyst Technical Operations, ICT Imperial College London South Kensington 403, City & Guilds Building London SW7 2AZ Tel: +44 (0)20 7594 6915 Email: r.sobey at imperial.ac.uk http://www.imperial.ac.uk/admin-services/ict/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From r.sobey at imperial.ac.uk Mon Aug 22 14:28:01 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 22 Aug 2016 13:28:01 +0000 Subject: [gpfsug-discuss] CES mmsmb options Message-ID: Related to my previous question in so far as it's to do with CES, what's this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static... for example log size / location / dmapi support? I'm surely missing something obvious. It's SS 4.2.0 btw. Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From taylorm at us.ibm.com Tue Aug 23 00:30:10 2016 From: taylorm at us.ibm.com (Michael L Taylor) Date: Mon, 22 Aug 2016 16:30:10 -0700 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: Looks like there is a per export and a global listing. These are values that can be set per export : /usr/lpp/mmfs/bin/mmsmb export change --key-info supported Supported smb options with allowed values: admin users = any // any valid user browseable = yes, no comment = any // A free text description of the export. csc policy = manual, disable, documents, programs fileid:algorithm = fsname, hostname, fsname_nodirs, fsname_norootdir gpfs:leases = yes, no gpfs:recalls = yes, no gpfs:sharemodes = yes, no gpfs:syncio = yes, no hide unreadable = yes, no oplocks = yes, no posix locking = yes, no read only = yes, no smb encrypt = auto, default, mandatory, disabled syncops:onclose = yes, no These are the values that are set globally: /usr/lpp/mmfs/bin/mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Tue Aug 23 03:23:40 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Mon, 22 Aug 2016 22:23:40 -0400 Subject: [gpfsug-discuss] GPFS FPO Message-ID: Does anyone have any experiences to share (good or bad) about setting up and utilizing FPO for hadoop compute on top of GPFS? -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 23 03:37:00 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 22 Aug 2016 22:37:00 -0400 Subject: [gpfsug-discuss] GPFS FPO In-Reply-To: References: Message-ID: Yes, indeed. Note that these are my personal opinions. It seems to work quite well and it's not terribly hard to set up or get running. That said, if you've got a traditional HPC cluster with reasonably good bandwidth (and especially if your data is already on the HPC cluster) I wouldn't bother with FPO and just use something like magpie (https://github.com/LLNL/magpie) to run your hadoopy workload on GPFS on your traditional HPC cluster. I believe FPO (and by extension data locality) is important when the available bandwidth between your clients and servers/disks (in a traditional GPFS environment) is less than the bandwidth available within a node (e.g. between your local disks and the host CPU). -Aaron On 8/22/16 10:23 PM, Brian Marshall wrote: > Does anyone have any experiences to share (good or bad) about setting up > and utilizing FPO for hadoop compute on top of GPFS? 
> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From mimarsh2 at vt.edu Tue Aug 23 12:56:22 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Tue, 23 Aug 2016 07:56:22 -0400 Subject: [gpfsug-discuss] GPFS FPO In-Reply-To: References: Message-ID: Aaron, Do you have experience running this on native GPFS? The docs say Lustre and any NFS filesystem. Thanks, Brian On Aug 22, 2016 10:37 PM, "Aaron Knister" wrote: > Yes, indeed. Note that these are my personal opinions. > > It seems to work quite well and it's not terribly hard to set up or get > running. That said, if you've got a traditional HPC cluster with reasonably > good bandwidth (and especially if your data is already on the HPC cluster) > I wouldn't bother with FPO and just use something like magpie ( > https://github.com/LLNL/magpie) to run your hadoopy workload on GPFS on > your traditional HPC cluster. I believe FPO (and by extension data > locality) is important when the available bandwidth between your clients > and servers/disks (in a traditional GPFS environment) is less than the > bandwidth available within a node (e.g. between your local disks and the > host CPU). > > -Aaron > > On 8/22/16 10:23 PM, Brian Marshall wrote: > >> Does anyone have any experiences to share (good or bad) about setting up >> and utilizing FPO for hadoop compute on top of GPFS? >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Tue Aug 23 13:15:24 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 23 Aug 2016 14:15:24 +0200 Subject: [gpfsug-discuss] CES and mmuserauth command In-Reply-To: References: Message-ID: Sorry to see no authoritative answers yet.. I'm doing lots of CES installations, but have not quite yet gotten the full understanding of this.. Simple stuff first: --servers You can only have one with AD. --enable-kerberos shouldn't be used, as that's only for LDAP according to the documentation. Guess kerberos is implied with AD. --idmap-role -- I've been using "master". Man-page says ID map role of a stand?alone or singular system deployment must be selected "master" What the idmap options seems to be doing is configure the idmap options for Samba. Maybe best explained by: https://wiki.samba.org/index.php/Idmap_config_ad Your suggested options will then give you the samba idmap configuration: idmap config * : rangesize = 1000000 idmap config * : range = 3000000-3500000 idmap config * : read only = no idmap:cache = no idmap config * : backend = autorid idmap config DOMAIN : schema_mode = rfc2307 idmap config DOMAIN : range = 500-2000000 idmap config DOMAIN : backend = ad Most likely you want to replace DOMAIN by your AD domain name.. 
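If it helps, one way to see what those options actually turned into after running mmuserauth is to compare the Scale view with the underlying Samba settings. A sketch, with DOMAIN\someuser as a placeholder account:
mmuserauth service list
/usr/lpp/mmfs/bin/net conf list | grep -i 'idmap config'
id 'DOMAIN\someuser'
The last command is just a sanity check that winbind resolves a known AD user into the expected uid range.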
So the --idmap options sets some defaults, that you probably won't care about, since all your users are likely covered by the specific "idmap config DOMAIN" config. Hope this helps somewhat, now I'll follow up with something I'm wondering myself...: Is the netbios name just a name, without any connection to anything in AD? Is the --user-name/--password a one-time used account that's only necessary when executing the mmuserauth command, or will it also be for communication between CES and AD while the services are running? -jf On Mon, Aug 22, 2016 at 1:59 PM, Sobey, Richard A wrote: > Hi all, > > > > We?re just about to start testing a new CES 4.2.0 cluster and at the stage > of ?joining? the cluster to our AD. What?s the bare minimum we need to get > going with this? My Windows guy (who is more Linux but whatever) has > suggested the following: > > > > mmuserauth service create --type ad --data-access-method file > > --netbios-name store --user-name USERNAME --password > > --enable-nfs-kerberos --enable-kerberos > > --servers list,of,servers > > --idmap-range-size 1000000 --idmap-range 3000000 - 3500000 > --unixmap-domains 'DOMAIN(500 - 2000000)' > > > > He has also asked what the following is: > > > > --idmap-role ??? > > --idmap-range-size ?? > > > > All our LDAP GID/UIDs are coming from a system outside of GPFS so do we > leave this blank, or say master Or, now I?ve re-read and mmuserauth page, > is this purely for when you have AFM relationships and one GPFS cluster > (the subordinate / the second cluster) gets its UIDs and GIDs from another > GPFS cluster (the master / the first one)? > > > > For idmap-range-size is this essentially the highest number of users and > groups you can have defined within Spectrum Scale? (I love how I?m using > GPFS and SS interchangeably.. forgive me!) > > > > Many thanks > > > > Richard > > > > > > Richard Sobey > > Storage Area Network (SAN) Analyst > Technical Operations, ICT > Imperial College London > South Kensington > 403, City & Guilds Building > London SW7 2AZ > Tel: +44 (0)20 7594 6915 > Email: r.sobey at imperial.ac.uk > http://www.imperial.ac.uk/admin-services/ict/ > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Tue Aug 23 14:58:17 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 23 Aug 2016 13:58:17 +0000 Subject: [gpfsug-discuss] Odd entries in quota listing Message-ID: In one of my file systems, I have some odd entries that seem to not be associated with a user - any ideas on the cause or how to track these down? This is a snippet from mmprepquota: Block Limits | File Limits Name type KB quota limit in_doubt grace | files quota limit in_doubt grace 2751555824 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 2348898617 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 2348895209 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 1610682073 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 536964752 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 403325529 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jonathan at buzzard.me.uk Tue Aug 23 15:06:50 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 23 Aug 2016 15:06:50 +0100 Subject: [gpfsug-discuss] Odd entries in quota listing In-Reply-To: References: Message-ID: <1471961210.30100.88.camel@buzzard.phy.strath.ac.uk> On Tue, 2016-08-23 at 13:58 +0000, Oesterlin, Robert wrote: > In one of my file systems, I have some odd entries that seem to not be > associated with a user - any ideas on the cause or how to track these > down? This is a snippet from mmprepquota: > > > > Block Limits > | File Limits > > Name type KB quota limit in_doubt > grace | files quota limit in_doubt grace > > 2751555824 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 2348898617 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 2348895209 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 1610682073 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 536964752 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 403325529 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > I am guessing they are quotas that have been set for users that are now deleted. GPFS stores the quota for a user under their UID, and deleting the user and all their data is not enough to remove the entry from the quota reporting, you also have to delete their quota. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From Robert.Oesterlin at nuance.com Tue Aug 23 15:10:22 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 23 Aug 2016 14:10:22 +0000 Subject: [gpfsug-discuss] Odd entries in quota listing Message-ID: <93B0F53A-4ECD-4527-A67D-DD6C9B00F8E7@nuance.com> Well - good idea, but these large numbers in no way reflect valid ID numbers in our environment. Wondering how they got there? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of Jonathan Buzzard Reply-To: gpfsug main discussion list Date: Tuesday, August 23, 2016 at 9:06 AM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] Re: [gpfsug-discuss] Odd entries in quota listing I am guessing they are quotas that have been set for users that are now deleted. GPFS stores the quota for a user under their UID, and deleting the user and all their data is not enough to remove the entry from the quota reporting, you also have to delete their quota. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Tue Aug 23 15:16:05 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 23 Aug 2016 15:16:05 +0100 Subject: [gpfsug-discuss] Odd entries in quota listing In-Reply-To: <93B0F53A-4ECD-4527-A67D-DD6C9B00F8E7@nuance.com> References: <93B0F53A-4ECD-4527-A67D-DD6C9B00F8E7@nuance.com> Message-ID: <1471961765.30100.90.camel@buzzard.phy.strath.ac.uk> On Tue, 2016-08-23 at 14:10 +0000, Oesterlin, Robert wrote: > Well - good idea, but these large numbers in no way reflect valid ID > numbers in our environment. Wondering how they got there? > I was guessing generating UID's from Windows RID's? Alternatively some script generated them automatically and the UID's are bogus. You can create a quota for any random UID and GPFS won't complain. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. 
From aaron.s.knister at nasa.gov Wed Aug 24 17:43:56 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Wed, 24 Aug 2016 12:43:56 -0400 Subject: [gpfsug-discuss] GPFS FPO In-Reply-To: References: Message-ID: <6f5a7284-c910-bbda-5e53-7f78e4289ad9@nasa.gov> To tell you the truth, I don't. It's on my radar but I haven't done it yet. I *have* run hadoop on GPFS w/o magpie though and on only a couple of nodes was able to pound 1GB/s out to GPFS w/ the terasort benchmark. I know our GPFS FS can go much faster than that but java was cpu-bound as it often seems to be. -Aaron On 8/23/16 7:56 AM, Brian Marshall wrote: > Aaron, > > Do you have experience running this on native GPFS? The docs say Lustre > and any NFS filesystem. > > Thanks, > Brian > > > On Aug 22, 2016 10:37 PM, "Aaron Knister" > wrote: > > Yes, indeed. Note that these are my personal opinions. > > It seems to work quite well and it's not terribly hard to set up or > get running. That said, if you've got a traditional HPC cluster with > reasonably good bandwidth (and especially if your data is already on > the HPC cluster) I wouldn't bother with FPO and just use something > like magpie (https://github.com/LLNL/magpie > ) to run your hadoopy workload on > GPFS on your traditional HPC cluster. I believe FPO (and by > extension data locality) is important when the available bandwidth > between your clients and servers/disks (in a traditional GPFS > environment) is less than the bandwidth available within a node > (e.g. between your local disks and the host CPU). > > -Aaron > > On 8/22/16 10:23 PM, Brian Marshall wrote: > > Does anyone have any experiences to share (good or bad) about > setting up > and utilizing FPO for hadoop compute on top of GPFS? > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From SAnderson at convergeone.com Thu Aug 25 17:32:48 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Thu, 25 Aug 2016 16:32:48 +0000 Subject: [gpfsug-discuss] mmcessmbchconfig command Message-ID: <1472142769455.35752@convergeone.com> ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bbanister at jumptrading.com Thu Aug 25 17:47:00 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 25 Aug 2016 16:47:00 +0000 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: <1472142769455.35752@convergeone.com> References: <1472142769455.35752@convergeone.com> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com> My general rule is that if there isn?t a man page or ?-h? option to explain the usage of the command, then it isn?t meant to be run by an user administrator. I wish that the commands that should never be run by a user admin (or without direction from IBM support) would be put in a different directory that clearly indicated they are for internal GPFS use. RFE worthy? Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Shaun Anderson Sent: Thursday, August 25, 2016 11:33 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] mmcessmbchconfig command ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Thu Aug 25 17:50:20 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 25 Aug 2016 16:50:20 +0000 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <1472142769455.35752@convergeone.com> <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB0630BFD5@CHI-EXCHANGEW1.w2k.jumptrading.com> I realize this was totally tangential to your question. Sorry I can?t help with the syntax, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan Banister Sent: Thursday, August 25, 2016 11:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmcessmbchconfig command My general rule is that if there isn?t a man page or ?-h? option to explain the usage of the command, then it isn?t meant to be run by an user administrator. 
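A rough way to apply that rule of thumb on a node (just a sketch, assuming the standard /usr/lpp/mmfs/bin install path):

    # documented, administrator-facing front end: has a man page and usage text
    man mmsmb
    /usr/lpp/mmfs/bin/mmsmb config list

    # no man page and no usage output usually means an internal helper script
    man mmcessmbchconfig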
I wish that the commands that should never be run by a user admin (or without direction from IBM support) would be put in a different directory that clearly indicated they are for internal GPFS use. RFE worthy? Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Shaun Anderson Sent: Thursday, August 25, 2016 11:33 AM To: gpfsug main discussion list > Subject: [gpfsug-discuss] mmcessmbchconfig command ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From taylorm at us.ibm.com Thu Aug 25 17:55:44 2016 From: taylorm at us.ibm.com (Michael L Taylor) Date: Thu, 25 Aug 2016 09:55:44 -0700 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: References: Message-ID: Not sure where mmcessmbchconfig command is coming from? mmsmb is the proper CLI syntax [root at smaug-vm1 installer]# /usr/lpp/mmfs/bin/mmsmb Usage: mmsmb export Administer SMB exports. mmsmb exportacl Administer SMB export ACLs. mmsmb config Administer SMB global configuration. [root at smaug-vm1 installer]# /usr/lpp/mmfs/bin/mmsmb export -h Usage: mmsmb export list List SMB exports. mmsmb export add Add SMB exports. mmsmb export change Change SMB exports. mmsmb export remove Remove SMB exports. 
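For example, creating and checking a share through the documented front end might look roughly like this (export name and path are made up for illustration, and the directory must already exist):

    /usr/lpp/mmfs/bin/mmsmb export add testshare /gpfs/fs0/testshare
    /usr/lpp/mmfs/bin/mmsmb export list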
[root at smaug-vm1 installer]# man mmsmb http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_mmsmb.htm -------------- next part -------------- An HTML attachment was scrubbed... URL: From mweil at wustl.edu Thu Aug 25 19:50:52 2016 From: mweil at wustl.edu (Matt Weil) Date: Thu, 25 Aug 2016 13:50:52 -0500 Subject: [gpfsug-discuss] Backup on object stores Message-ID: <5cc4ae43-2d0f-e548-b256-84f1890fe2d3@wustl.edu> Hello all, Just brain storming here mainly but want to know how you are all approaching this. Do you replicate using GPFS and forget about backups? > https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adv_osbackup.htm This seems good for a full recovery but what if I just lost one object? Seems if objectizer is in use then both tivoli and space management can be used on the file. Thanks in advance for your responses. Matt ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. From billowen at us.ibm.com Thu Aug 25 20:55:33 2016 From: billowen at us.ibm.com (Bill Owen) Date: Thu, 25 Aug 2016 12:55:33 -0700 Subject: [gpfsug-discuss] Backup on object stores In-Reply-To: <5cc4ae43-2d0f-e548-b256-84f1890fe2d3@wustl.edu> References: <5cc4ae43-2d0f-e548-b256-84f1890fe2d3@wustl.edu> Message-ID: Hi Matt, With Spectrum Scale object storage, you can create storage policies, and then assign containers to those policies. Each policy will map to a GPFS independent fileset. That way, you can subdivide object storage and manage different types of objects based on the type of data stored in the container/storage policy (i.e., back up some types of object data nightly, some weekly, some not at all). Today, we don't have a cli to simplify to restoring individual objects. But using commands like swift-get-nodes, you can determine the filesystem path to an object, and then restore only that item. And if you are using storage policies with file & object access enabled, you can access the object/files by file path directly. Regards, Bill Owen billowen at us.ibm.com Spectrum Scale Object Storage 520-799-4829 From: Matt Weil To: Date: 08/25/2016 11:51 AM Subject: [gpfsug-discuss] Backup on object stores Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello all, Just brain storming here mainly but want to know how you are all approaching this. Do you replicate using GPFS and forget about backups? > https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adv_osbackup.htm This seems good for a full recovery but what if I just lost one object? Seems if objectizer is in use then both tivoli and space management can be used on the file. Thanks in advance for your responses. Matt ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. 
If you have received this email in error, please immediately notify the sender via telephone or return mail. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Greg.Lehmann at csiro.au Fri Aug 26 00:14:57 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Thu, 25 Aug 2016 23:14:57 +0000 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <1472142769455.35752@convergeone.com> <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <156b078bfb2d48d8b77d5250dba7e928@exch1-cdc.nexus.csiro.au> I agree with an RFE. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Bryan Banister Sent: Friday, 26 August 2016 2:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmcessmbchconfig command My general rule is that if there isn?t a man page or ?-h? option to explain the usage of the command, then it isn?t meant to be run by an user administrator. I wish that the commands that should never be run by a user admin (or without direction from IBM support) would be put in a different directory that clearly indicated they are for internal GPFS use. RFE worthy? Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Shaun Anderson Sent: Thursday, August 25, 2016 11:33 AM To: gpfsug main discussion list > Subject: [gpfsug-discuss] mmcessmbchconfig command ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From syi at ca.ibm.com Fri Aug 26 00:15:46 2016 From: syi at ca.ibm.com (Yi Sun) Date: Thu, 25 Aug 2016 19:15:46 -0400 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: References: Message-ID: You may check mmsmb command, not sure if it is what you look for. https://www.ibm.com/support/knowledgecenter/STXKQY_4.1.1/com.ibm.spectrum.scale.v4r11.adm.doc/bl1adm_mmsmb.htm#mmsmb ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- From: Shaun Anderson To: gpfsug main discussion list Subject: [gpfsug-discuss] mmcessmbchconfig command Message-ID: <1472142769455.35752 at convergeone.com> Content-Type: text/plain; charset="iso-8859-1" ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Aug 26 00:49:12 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 25 Aug 2016 19:49:12 -0400 Subject: [gpfsug-discuss] CES and mmuserauth command In-Reply-To: References: Message-ID: To clarify and expand on some of these: --servers takes the AD Domain Controller that is contacted first during configuration. Later and during normal operations the list of DCs is retrieved from DNS and the fastest (or closest one according to the AD sites) is used. The initially one used does not have a special role. --idmap-role allows dedicating one cluster as a master, and a second cluster (e.g. a AFM replication target) as "subordinate". Only the master will allocate idmap ranges which can then be imported to the subordiate to have consistent id mappings. --idmap-range-size and --idmap-range are used for the internal idmap allocation which is used for every domain that is not explicitly using another domain. "man idmap_autorid" explains the approach taken. As long as the default does not overlap with any other ids, that can be used. The "netbios" name is used to create the machine account for the cluster when joining the AD domain. That is how the AD administrator will identify the CES cluster. It is also important in SMB deployments when Kerberos should be used with SMB: The same names as the netbios name has to be defined in DNS for the public CES IP addresses. When the name matches, then SMB clients can acquire a Kerberos ticket from AD to establish a SMB connection. When joinging the AD domain, --user-name, --password and --server are only used to initially identify and logon to the AD and to create the machine account for the cluster. Once that is done, that information is no longer used, and e.g. the account from --user-name could be deleted, the password changed or the specified DC could be removed from the domain (as long as other DCs are remaining). Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Jan-Frode Myklebust To: gpfsug main discussion list Date: 08/23/2016 08:15 AM Subject: Re: [gpfsug-discuss] CES and mmuserauth command Sent by: gpfsug-discuss-bounces at spectrumscale.org Sorry to see no authoritative answers yet.. I'm doing lots of CES installations, but have not quite yet gotten the full understanding of this.. 
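A quick way to sanity-check this on a protocol node is something like the following (a sketch; it assumes the CES Samba tools live under /usr/lpp/mmfs/bin, which is where current packages put them):

    # show the configured authentication and run the built-in checks
    mmuserauth service list
    mmuserauth service check --data-access-method file

    # ask winbind whether it can currently reach a domain controller
    /usr/lpp/mmfs/bin/wbinfo --ping-dc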
Simple stuff first: --servers You can only have one with AD. --enable-kerberos shouldn't be used, as that's only for LDAP according to the documentation. Guess kerberos is implied with AD. --idmap-role -- I've been using "master". Man-page says ID map role of a stand?alone or singular system deployment must be selected "master" What the idmap options seems to be doing is configure the idmap options for Samba. Maybe best explained by: https://wiki.samba.org/index.php/Idmap_config_ad Your suggested options will then give you the samba idmap configuration: idmap config * : rangesize = 1000000 idmap config * : range = 3000000-3500000 idmap config * : read only = no idmap:cache = no idmap config * : backend = autorid idmap config DOMAIN : schema_mode = rfc2307 idmap config DOMAIN : range = 500-2000000 idmap config DOMAIN : backend = ad Most likely you want to replace DOMAIN by your AD domain name.. So the --idmap options sets some defaults, that you probably won't care about, since all your users are likely covered by the specific "idmap config DOMAIN" config. Hope this helps somewhat, now I'll follow up with something I'm wondering myself...: Is the netbios name just a name, without any connection to anything in AD? Is the --user-name/--password a one-time used account that's only necessary when executing the mmuserauth command, or will it also be for communication between CES and AD while the services are running? -jf On Mon, Aug 22, 2016 at 1:59 PM, Sobey, Richard A wrote: Hi all, We?re just about to start testing a new CES 4.2.0 cluster and at the stage of ?joining? the cluster to our AD. What?s the bare minimum we need to get going with this? My Windows guy (who is more Linux but whatever) has suggested the following: mmuserauth service create --type ad --data-access-method file --netbios-name store --user-name USERNAME --password --enable-nfs-kerberos --enable-kerberos --servers list,of,servers --idmap-range-size 1000000 --idmap-range 3000000 - 3500000 --unixmap-domains 'DOMAIN(500 - 2000000)' He has also asked what the following is: --idmap-role ??? --idmap-range-size ?? All our LDAP GID/UIDs are coming from a system outside of GPFS so do we leave this blank, or say master Or, now I?ve re-read and mmuserauth page, is this purely for when you have AFM relationships and one GPFS cluster (the subordinate / the second cluster) gets its UIDs and GIDs from another GPFS cluster (the master / the first one)? For idmap-range-size is this essentially the highest number of users and groups you can have defined within Spectrum Scale? (I love how I?m using GPFS and SS interchangeably.. forgive me!) 
Many thanks Richard Richard Sobey Storage Area Network (SAN) Analyst Technical Operations, ICT Imperial College London South Kensington 403, City & Guilds Building London SW7 2AZ Tel: +44 (0)20 7594 6915 Email: r.sobey at imperial.ac.uk http://www.imperial.ac.uk/admin-services/ict/ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From christof.schmitt at us.ibm.com Fri Aug 26 00:49:12 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 25 Aug 2016 19:49:12 -0400 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: <1472142769455.35752@convergeone.com> References: <1472142769455.35752@convergeone.com> Message-ID: The mmcessmb* commands are scripts that are run from the corresponding mmsmb subcommands. mmsmb is documented and should be used instead of calling the mmcesmb* scripts directly. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/25/2016 12:33 PM Subject: [gpfsug-discuss] mmcessmbchconfig command Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From christof.schmitt at us.ibm.com Fri Aug 26 00:52:50 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 25 Aug 2016 19:52:50 -0400 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: The options listed in " mmsmb config change --key-info supported" are supported to be changed by administrator of the cluster. "mmsmb config list" lists the whole Samba config, including the options that are set internally. We do not want to support any random Samba configuration, hence the line between "supported" option and everything else. If there is a usecase that requires other Samba options than the ones listed as "supported", one way forward would be opening a RFE that describes the usecase and the Samba option to support it. 
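For the options that are listed as supported, the change itself is a one-liner along these lines (a sketch; the --option form mirrors the mmsmb export syntax and should be checked against the man page of the installed release):

    # see what is set and which keys may be changed
    mmsmb config list
    mmsmb config change --key-info supported

    # change one of the supported keys, e.g. the server string
    mmsmb config change --option "server string=Spectrum Scale CES cluster"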
Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 08/22/2016 09:28 AM Subject: [gpfsug-discuss] CES mmsmb options Sent by: gpfsug-discuss-bounces at spectrumscale.org Related to my previous question in so far as it?s to do with CES, what?s this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static? for example log size / location / dmapi support? I?m surely missing something obvious. It?s SS 4.2.0 btw. Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From gaurang.tapase at in.ibm.com Fri Aug 26 08:53:12 2016 From: gaurang.tapase at in.ibm.com (Gaurang Tapase) Date: Fri, 26 Aug 2016 13:23:12 +0530 Subject: [gpfsug-discuss] Blogs and publications on Spectrum Scale Message-ID: Hello, On Request from Bob Oesterlin, we post these links on User Group - Here are the latest publications and Blogs on Spectrum Scale. We encourage the User Group to follow the Spectrum Scale blogs on the http://storagecommunity.org or the Usergroup admin to register the email group of the feeds. A total of 25 recent Blogs on IBM Spectrum Scale by developers IBM Spectrum Scale Security IBM Spectrum Scale: Security Blog Series http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series , Spectrum Scale Security Blog Series: Introduction, http://storagecommunity.org/easyblog/entry/spectrum-scale-security-blog-series-introduction IBM Spectrum Scale Security: VLANs and Protocol nodes, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-vlans-and-protocol-nodes IBM Spectrum Scale Security: Firewall Overview http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-firewall-overview IBM Spectrum Scale Security Blog Series: Security with Spectrum Scale OpenStack Storage Drivers http://storagecommunity.org/easyblog/entry/security-with-spectrum-scale-openstack-storage-drivers , IBM Spectrum Scale Security Blog Series: Authorization http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series-authorization IBM Spectrum Scale: Object (OpenStack Swift, S3) Authorization, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-object-openstack-swift-s3-authorization , IBM Spectrum Scale Security: Secure Data at Rest, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-secure-data-at-rest IBM Spectrum Scale Security Blog Series: Secure Data in Transit, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series-secure-data-in-transit-1 IBM Spectrum Scale Security Blog Series: Sudo based Secure Administration and Admin Command Logging, http://storagecommunity.org/easyblog/entry/spectrum-scale-security-blog-series-sudo-based-secure-administration-and-admin-command-logging IBM Spectrum Scale Security: Security Features of Transparent Cloud Tiering (TCT), http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-security-features-of-transparent-cloud-tiering-tct IBM Spectrum Scale: Immutability, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-immutability IBM Spectrum Scale : FILE protocols authentication 
http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-file-protocols-authentication IBM Spectrum Scale : Object Authentication, http://storagecommunity.org/easyblog/entry/protocol-authentication-object, IBM Spectrum Scale Security: Anti-Virus bulk scanning, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-anti-virus-bulk-scanning , Spectrum Scale 4.2.1 - What's New http://storagecommunity.org/easyblog/entry/spectrum-scale-4-2-1-what-s-new IBM Spectrum Scale 4.2.1 : diving deeper, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-4-2-1-diving-deeper NEW DEMO: Using IBM Cloud Object Storage as IBM Spectrum Scale Transparent Cloud Tier, http://storagecommunity.org/easyblog/entry/new-demo-using-ibm-cloud-object-storage-as-ibm-spectrum-scale-transparent-cloud-tier Spectrum Scale transparent cloud tiering, http://storagecommunity.org/easyblog/entry/spectrum-scale-transparent-cloud-tiering Spectrum Scale in Wonderland - Introducing transparent cloud tiering with Spectrum Scale 4.2.1, http://storagecommunity.org/easyblog/entry/spectrum-scale-in-wonderland, Spectrum Scale Object Related Blogs IBM Spectrum Scale 4.2.1 - What's new in Object, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-4-2-1-what-s-new-in-object , Hot cakes or hot objects, they better be served fast http://storagecommunity.org/easyblog/entry/hot-cakes-or-hot-objects-they-better-be-served-fast IBM Spectrum Scale: Object (OpenStack Swift, S3) Authorization, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-object-openstack-swift-s3-authorization , IBM Spectrum Scale : Object Authentication, http://storagecommunity.org/easyblog/entry/protocol-authentication-object, Spectrum Scale BD&A IBM Spectrum Scale: new features of HDFS Transparency, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-new-features-of-hdfs-transparency , Regards, ------------------------------------------------------------------------ Gaurang S Tapase Spectrum Scale & OpenStack Development IBM India Storage Lab, Pune (India) Email : gaurang.tapase at in.ibm.com Phone : +91-20-42025699 (W), +91-9860082042(Cell) ------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Fri Aug 26 09:17:55 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 26 Aug 2016 08:17:55 +0000 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: Thanks Christof, and for the detailed posting on the mmuserauth settings. I do not know why we have changed dmapi support in our existing smb.conf, but perhaps it was for some legacy stuff. Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 26 August 2016 00:53 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] CES mmsmb options The options listed in " mmsmb config change --key-info supported" are supported to be changed by administrator of the cluster. "mmsmb config list" lists the whole Samba config, including the options that are set internally. We do not want to support any random Samba configuration, hence the line between "supported" option and everything else. If there is a usecase that requires other Samba options than the ones listed as "supported", one way forward would be opening a RFE that describes the usecase and the Samba option to support it. 
Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 08/22/2016 09:28 AM Subject: [gpfsug-discuss] CES mmsmb options Sent by: gpfsug-discuss-bounces at spectrumscale.org Related to my previous question in so far as it?s to do with CES, what?s this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static? for example log size / location / dmapi support? I?m surely missing something obvious. It?s SS 4.2.0 btw. Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From r.sobey at imperial.ac.uk Fri Aug 26 09:48:24 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 26 Aug 2016 08:48:24 +0000 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? Message-ID: Sorry all, prepare for a deluge of emails like this, hopefully it'll help other people implementing CES in the future. I'm trying to stop SMB on a node, but getting the following output: [root at cesnode ~]# mmces service stop smb smb: Request denied. Please stop NFS first [root at cesnode ~]# mmces service list Enabled services: SMB SMB is running As you can see there is no way to stop NFS when it's not running but it seems to be blocking me. It's happening on all the nodes in the cluster. SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Fri Aug 26 10:48:18 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Fri, 26 Aug 2016 09:48:18 +0000 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? In-Reply-To: References: Message-ID: That was a weird one :-) Don't understand why NFS would block smb.., and I don't see that on my cluster. Would it make sense to suspend the node instead? As a workaround. mmces node suspend -jf fre. 26. aug. 2016 kl. 10.48 skrev Sobey, Richard A : > Sorry all, prepare for a deluge of emails like this, hopefully it?ll help > other people implementing CES in the future. > > > > I?m trying to stop SMB on a node, but getting the following output: > > > > [root at cesnode ~]# mmces service stop smb > > smb: Request denied. Please stop NFS first > > > > [root at cesnode ~]# mmces service list > > Enabled services: SMB > > SMB is running > > > > As you can see there is no way to stop NFS when it?s not running but it > seems to be blocking me. It?s happening on all the nodes in the cluster. > > > > SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. > > > > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From konstantin.arnold at unibas.ch Fri Aug 26 10:56:28 2016 From: konstantin.arnold at unibas.ch (Konstantin Arnold) Date: Fri, 26 Aug 2016 11:56:28 +0200 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? 
In-Reply-To: References: Message-ID: <57C0124C.7050404@unibas.ch> Hi Richard, I ran into the same issue and asked if 'systemctl reload gpfs-smb.service' would work? I got the following answer: "... Now in regards to your question about stopping NFS, yes this is an expected behavior and yes you could also restart through systemctl." Maybe that helps. Konstantin From janfrode at tanso.net Fri Aug 26 10:59:34 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Fri, 26 Aug 2016 11:59:34 +0200 Subject: [gpfsug-discuss] CES and mmuserauth command In-Reply-To: References: Message-ID: On Fri, Aug 26, 2016 at 1:49 AM, Christof Schmitt < christof.schmitt at us.ibm.com> wrote: > > When joinging the AD domain, --user-name, --password and --server are only > used to initially identify and logon to the AD and to create the machine > account for the cluster. Once that is done, that information is no longer > used, and e.g. the account from --user-name could be deleted, the password > changed or the specified DC could be removed from the domain (as long as > other DCs are remaining). > > That was my initial understanding of the --user-name, but when reading the man-page I get the impression that it's also used to do connect to AD to do user and group lookups: ------------------------------------------------------------------------------------------------------ ??user?name userName Specifies the user name to be used to perform operations against the authentication server. The specified user name must have sufficient permissions to read user and group attributes from the authentication server. ------------------------------------------------------------------------------------------------------- Also it's strange that "mmuserauth service list" would list the USER_NAME if it was only somthing that was used at configuration time..? -jf -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Aug 26 17:29:31 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 26 Aug 2016 12:29:31 -0400 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? In-Reply-To: References: Message-ID: That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 08/26/2016 04:48 AM Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? Sent by: gpfsug-discuss-bounces at spectrumscale.org Sorry all, prepare for a deluge of emails like this, hopefully it?ll help other people implementing CES in the future. I?m trying to stop SMB on a node, but getting the following output: [root at cesnode ~]# mmces service stop smb smb: Request denied. Please stop NFS first [root at cesnode ~]# mmces service list Enabled services: SMB SMB is running As you can see there is no way to stop NFS when it?s not running but it seems to be blocking me. It?s happening on all the nodes in the cluster. SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. 
Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From christof.schmitt at us.ibm.com Fri Aug 26 17:29:31 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 26 Aug 2016 12:29:31 -0400 Subject: [gpfsug-discuss] CES and mmuserauth command In-Reply-To: References: Message-ID: The --user-name option applies to both, AD and LDAP authentication. In the LDAP case, this information is correct. I will try to get some clarification added for the AD case. The same applies to the information shown in "service list". There is a common field that holds the information and the parameter from the initial "service create" is stored there. The meaning is different for AD and LDAP: For LDAP it is the username being used to access the LDAP server, while in the AD case it was only the user initially used until the machine account was created. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Jan-Frode Myklebust To: gpfsug main discussion list Date: 08/26/2016 05:59 AM Subject: Re: [gpfsug-discuss] CES and mmuserauth command Sent by: gpfsug-discuss-bounces at spectrumscale.org On Fri, Aug 26, 2016 at 1:49 AM, Christof Schmitt < christof.schmitt at us.ibm.com> wrote: When joinging the AD domain, --user-name, --password and --server are only used to initially identify and logon to the AD and to create the machine account for the cluster. Once that is done, that information is no longer used, and e.g. the account from --user-name could be deleted, the password changed or the specified DC could be removed from the domain (as long as other DCs are remaining). That was my initial understanding of the --user-name, but when reading the man-page I get the impression that it's also used to do connect to AD to do user and group lookups: ------------------------------------------------------------------------------------------------------ ??user?name userName Specifies the user name to be used to perform operations against the authentication server. The specified user name must have sufficient permissions to read user and group attributes from the authentication server. ------------------------------------------------------------------------------------------------------- Also it's strange that "mmuserauth service list" would list the USER_NAME if it was only somthing that was used at configuration time..? -jf_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From dacalder at co.ibm.com Sat Aug 27 13:52:44 2016 From: dacalder at co.ibm.com (Danny Alexander Calderon Rodriguez) Date: Sat, 27 Aug 2016 12:52:44 +0000 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? In-Reply-To: Message-ID: Hi Richard This is fixed in release 4.2.1, if you cant upgrade now, you can fix this manuallly Just do this. 
edit file /usr/lpp/mmfs/lib/mmcesmon/SMBService.py Change if authType == 'ad' and not nodeState.nfsStopped: to nfsEnabled = utils.isProtocolEnabled("NFS", self.logger) if authType == 'ad' and not nodeState.nfsStopped and nfsEnabled: You need to stop the gpfs service in each node where you apply the change " after change the lines please use tap key" Enviado desde mi iPhone > El 27/08/2016, a las 6:00 a.m., gpfsug-discuss-request at spectrumscale.org escribi?: > > Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at spectrumscale.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at spectrumscale.org > > You can reach the person managing the list at > gpfsug-discuss-owner at spectrumscale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: Cannot stop SMB... stop NFS first?(Christof Schmitt) > 2. Re: CES and mmuserauth command (Christof Schmitt) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 26 Aug 2016 12:29:31 -0400 > From: "Christof Schmitt" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Cannot stop SMB... stop NFS first? > Message-ID: > > > Content-Type: text/plain; charset="UTF-8" > > That would be the case when Active Directory is configured for > authentication. In that case the SMB service includes two aspects: One is > the actual SMB file server, and the second one is the service for the > Active Directory integration. Since NFS depends on authentication and id > mapping services, it requires SMB to be running. > > Regards, > > Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ > christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) > > > > From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > > Date: 08/26/2016 04:48 AM > Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Sorry all, prepare for a deluge of emails like this, hopefully it?ll help > other people implementing CES in the future. > > I?m trying to stop SMB on a node, but getting the following output: > > [root at cesnode ~]# mmces service stop smb > smb: Request denied. Please stop NFS first > > [root at cesnode ~]# mmces service list > Enabled services: SMB > SMB is running > > As you can see there is no way to stop NFS when it?s not running but it > seems to be blocking me. It?s happening on all the nodes in the cluster. > > SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. > > Richard_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > ------------------------------ > > Message: 2 > Date: Fri, 26 Aug 2016 12:29:31 -0400 > From: "Christof Schmitt" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] CES and mmuserauth command > Message-ID: > > > Content-Type: text/plain; charset="ISO-2022-JP" > > The --user-name option applies to both, AD and LDAP authentication. In the > LDAP case, this information is correct. I will try to get some > clarification added for the AD case. > > The same applies to the information shown in "service list". 
There is a > common field that holds the information and the parameter from the initial > "service create" is stored there. The meaning is different for AD and > LDAP: For LDAP it is the username being used to access the LDAP server, > while in the AD case it was only the user initially used until the machine > account was created. > > Regards, > > Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ > christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) > > > > From: Jan-Frode Myklebust > To: gpfsug main discussion list > Date: 08/26/2016 05:59 AM > Subject: Re: [gpfsug-discuss] CES and mmuserauth command > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > > On Fri, Aug 26, 2016 at 1:49 AM, Christof Schmitt < > christof.schmitt at us.ibm.com> wrote: > > When joinging the AD domain, --user-name, --password and --server are only > used to initially identify and logon to the AD and to create the machine > account for the cluster. Once that is done, that information is no longer > used, and e.g. the account from --user-name could be deleted, the password > changed or the specified DC could be removed from the domain (as long as > other DCs are remaining). > > > That was my initial understanding of the --user-name, but when reading the > man-page I get the impression that it's also used to do connect to AD to > do user and group lookups: > > ------------------------------------------------------------------------------------------------------ > ??user?name userName > Specifies the user name to be used to perform operations > against the authentication server. The specified user > name must have sufficient permissions to read user and > group attributes from the authentication server. > ------------------------------------------------------------------------------------------------------- > > Also it's strange that "mmuserauth service list" would list the USER_NAME > if it was only somthing that was used at configuration time..? > > > > -jf_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 55, Issue 44 > ********************************************** > -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Sat Aug 27 20:06:45 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Sat, 27 Aug 2016 19:06:45 +0000 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? In-Reply-To: References: Message-ID: Hi, Thanks for the info! I think I?ll perform an upgrade to 4.2.1, the cluster is still in a pre-production state and I?ve yet to really start testing client access. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Danny Alexander Calderon Rodriguez Sent: 27 August 2016 13:53 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Cannot stop SMB... stop NFS first? Hi Richard This is fixed in release 4.2.1, if you cant upgrade now, you can fix this manuallly Just do this. 
edit file /usr/lpp/mmfs/lib/mmcesmon/SMBService.py Change if authType == 'ad' and not nodeState.nfsStopped: to nfsEnabled = utils.isProtocolEnabled("NFS", self.logger) if authType == 'ad' and not nodeState.nfsStopped and nfsEnabled: You need to stop the gpfs service in each node where you apply the change " after change the lines please use tap key" Enviado desde mi iPhone El 27/08/2016, a las 6:00 a.m., gpfsug-discuss-request at spectrumscale.org escribi?: Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: Cannot stop SMB... stop NFS first?(Christof Schmitt) 2. Re: CES and mmuserauth command (Christof Schmitt) ---------------------------------------------------------------------- Message: 1 Date: Fri, 26 Aug 2016 12:29:31 -0400 From: "Christof Schmitt" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Cannot stop SMB... stop NFS first? Message-ID: > Content-Type: text/plain; charset="UTF-8" That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: "Sobey, Richard A" > To: "'gpfsug-discuss at spectrumscale.org'" > Date: 08/26/2016 04:48 AM Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? Sent by: gpfsug-discuss-bounces at spectrumscale.org Sorry all, prepare for a deluge of emails like this, hopefully it?ll help other people implementing CES in the future. I?m trying to stop SMB on a node, but getting the following output: [root at cesnode ~]# mmces service stop smb smb: Request denied. Please stop NFS first [root at cesnode ~]# mmces service list Enabled services: SMB SMB is running As you can see there is no way to stop NFS when it?s not running but it seems to be blocking me. It?s happening on all the nodes in the cluster. SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------ Message: 2 Date: Fri, 26 Aug 2016 12:29:31 -0400 From: "Christof Schmitt" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] CES and mmuserauth command Message-ID: > Content-Type: text/plain; charset="ISO-2022-JP" The --user-name option applies to both, AD and LDAP authentication. In the LDAP case, this information is correct. I will try to get some clarification added for the AD case. The same applies to the information shown in "service list". There is a common field that holds the information and the parameter from the initial "service create" is stored there. 
The meaning is different for AD and LDAP: For LDAP it is the username being used to access the LDAP server, while in the AD case it was only the user initially used until the machine account was created. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Jan-Frode Myklebust > To: gpfsug main discussion list > Date: 08/26/2016 05:59 AM Subject: Re: [gpfsug-discuss] CES and mmuserauth command Sent by: gpfsug-discuss-bounces at spectrumscale.org On Fri, Aug 26, 2016 at 1:49 AM, Christof Schmitt < christof.schmitt at us.ibm.com> wrote: When joinging the AD domain, --user-name, --password and --server are only used to initially identify and logon to the AD and to create the machine account for the cluster. Once that is done, that information is no longer used, and e.g. the account from --user-name could be deleted, the password changed or the specified DC could be removed from the domain (as long as other DCs are remaining). That was my initial understanding of the --user-name, but when reading the man-page I get the impression that it's also used to do connect to AD to do user and group lookups: ------------------------------------------------------------------------------------------------------ ??user?name userName Specifies the user name to be used to perform operations against the authentication server. The specified user name must have sufficient permissions to read user and group attributes from the authentication server. ------------------------------------------------------------------------------------------------------- Also it's strange that "mmuserauth service list" would list the USER_NAME if it was only somthing that was used at configuration time..? -jf_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 55, Issue 44 ********************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Mon Aug 29 00:57:21 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Sun, 28 Aug 2016 23:57:21 +0000 Subject: [gpfsug-discuss] Blogs and publications on Spectrum Scale In-Reply-To: References: Message-ID: <57496841ec784222b5e291a921280c38@exch1-cdc.nexus.csiro.au> It would be nice if the Spectrum Scale User Group website had links to these, perhaps a separate page for blogs links. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Gaurang Tapase Sent: Friday, 26 August 2016 5:53 PM To: gpfsug main discussion list Cc: Sandeep Ramesh Subject: [gpfsug-discuss] Blogs and publications on Spectrum Scale Hello, On Request from Bob Oesterlin, we post these links on User Group - Here are the latest publications and Blogs on Spectrum Scale. We encourage the User Group to follow the Spectrum Scale blogs on the http://storagecommunity.orgor the Usergroup admin to register the email group of the feeds. 
A total of 25 recent Blogs on IBM Spectrum Scale by developers IBM Spectrum Scale Security IBM Spectrum Scale: Security Blog Series http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series, Spectrum Scale Security Blog Series: Introduction, http://storagecommunity.org/easyblog/entry/spectrum-scale-security-blog-series-introduction IBM Spectrum Scale Security: VLANs and Protocol nodes, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-vlans-and-protocol-nodes IBM Spectrum Scale Security: Firewall Overview http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-firewall-overview IBM Spectrum Scale Security Blog Series: Security with Spectrum Scale OpenStack Storage Drivers http://storagecommunity.org/easyblog/entry/security-with-spectrum-scale-openstack-storage-drivers, IBM Spectrum Scale Security Blog Series: Authorization http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series-authorization IBM Spectrum Scale: Object (OpenStack Swift, S3) Authorization, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-object-openstack-swift-s3-authorization, IBM Spectrum Scale Security: Secure Data at Rest, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-secure-data-at-rest IBM Spectrum Scale Security Blog Series: Secure Data in Transit, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series-secure-data-in-transit-1 IBM Spectrum Scale Security Blog Series: Sudo based Secure Administration and Admin Command Logging, http://storagecommunity.org/easyblog/entry/spectrum-scale-security-blog-series-sudo-based-secure-administration-and-admin-command-logging IBM Spectrum Scale Security: Security Features of Transparent Cloud Tiering (TCT), http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-security-features-of-transparent-cloud-tiering-tct IBM Spectrum Scale: Immutability, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-immutability IBM Spectrum Scale : FILE protocols authentication http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-file-protocols-authentication IBM Spectrum Scale : Object Authentication, http://storagecommunity.org/easyblog/entry/protocol-authentication-object, IBM Spectrum Scale Security: Anti-Virus bulk scanning, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-anti-virus-bulk-scanning, Spectrum Scale 4.2.1 - What's New http://storagecommunity.org/easyblog/entry/spectrum-scale-4-2-1-what-s-new IBM Spectrum Scale 4.2.1 : diving deeper, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-4-2-1-diving-deeper NEW DEMO: Using IBM Cloud Object Storage as IBM Spectrum Scale Transparent Cloud Tier, http://storagecommunity.org/easyblog/entry/new-demo-using-ibm-cloud-object-storage-as-ibm-spectrum-scale-transparent-cloud-tier Spectrum Scale transparent cloud tiering, http://storagecommunity.org/easyblog/entry/spectrum-scale-transparent-cloud-tiering Spectrum Scale in Wonderland - Introducing transparent cloud tiering with Spectrum Scale 4.2.1, http://storagecommunity.org/easyblog/entry/spectrum-scale-in-wonderland, Spectrum Scale Object Related Blogs IBM Spectrum Scale 4.2.1 - What's new in Object, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-4-2-1-what-s-new-in-object, Hot cakes or hot objects, they better be served fast http://storagecommunity.org/easyblog/entry/hot-cakes-or-hot-objects-they-better-be-served-fast IBM Spectrum Scale: Object (OpenStack Swift, 
S3) Authorization, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-object-openstack-swift-s3-authorization, IBM Spectrum Scale : Object Authentication, http://storagecommunity.org/easyblog/entry/protocol-authentication-object, Spectrum Scale BD&A IBM Spectrum Scale: new features of HDFS Transparency, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-new-features-of-hdfs-transparency, Regards, ------------------------------------------------------------------------ Gaurang S Tapase Spectrum Scale & OpenStack Development IBM India Storage Lab, Pune (India) Email : gaurang.tapase at in.ibm.com Phone : +91-20-42025699 (W), +91-9860082042(Cell) ------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From douglasof at us.ibm.com Mon Aug 29 06:34:03 2016 From: douglasof at us.ibm.com (Douglas O'flaherty) Date: Sun, 28 Aug 2016 22:34:03 -0700 Subject: [gpfsug-discuss] Edge Attendees Message-ID: Greetings: I am organizing an NDA round-table with the IBM Offering Managers at IBM Edge on Tuesday, September 20th at 1pm. The subject will be "The Future of IBM Spectrum Scale." IBM Offering Managers are the Product Owners at IBM. There will be discussions covering licensing, the roadmap for IBM Spectrum Scale RAID (aka GNR), new hardware platforms, etc. This is a unique opportunity to get feedback to the drivers of the IBM Spectrum Scale business plans. It should be a great companion to the content we get from Engineering and Research at most User Group meetings. To get an invitation, please email me privately at douglasof us.ibm.com. All who have a valid NDA are invited. I only need an approximate headcount of attendees. Try not to spam the mailing list. I am pushing to get the Offering Managers to have a similar session at SC16 as an IBM Multi-client Briefing. You can add your voice to that call on this thread, or email me directly. Spectrum Scale User Group at SC16 will once again take place on Sunday afternoon with cocktails to follow. I hope we can blow out the attendance numbers and the number of site speakers we had last year! I know Simon, Bob, and Kristy are already working the agenda. Get your ideas in to them or to me. See you in Vegas, Vegas, SLC, Vegas this Fall... Maybe Australia in between? doug Douglas O'Flaherty IBM Spectrum Storage Marketing -------------- next part -------------- An HTML attachment was scrubbed... URL: From stef.coene at docum.org Mon Aug 29 07:39:05 2016 From: stef.coene at docum.org (Stef Coene) Date: Mon, 29 Aug 2016 08:39:05 +0200 Subject: [gpfsug-discuss] Blogs and publications on Spectrum Scale In-Reply-To: References: Message-ID: <9bb8d52e-86a3-3ff7-daaf-dc6bf0a3bd82@docum.org> Hi, When trying to register on the website, I each time get the error: "Session expired. Please try again later." Stef From kraemerf at de.ibm.com Mon Aug 29 08:20:46 2016 From: kraemerf at de.ibm.com (Frank Kraemer) Date: Mon, 29 Aug 2016 09:20:46 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Message-ID: Hi all, In the last months several customers were asking for the option to use multiple IBM Spectrum Protect servers to protect a single IBM Spectrum Scale file system. Some of these customer reached the server scalability limits, others wanted to increase the parallelism of the server housekeeping processes. 
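As a rough illustration of the split described in the paper linked below: mmbackup can drive more than one Spectrum Protect server for the same file system via its --tsm-servers option, with one server stanza per target in dsm.sys. This is only a sketch, not the paper's procedure -- the stanza names, addresses and directory layout here are invented placeholders, and the paper itself is the authoritative reference for a supported configuration.

    # Hypothetical dsm.sys stanzas on the backup nodes -- one per Spectrum Protect server
    # (server names and addresses are invented placeholders)
    cat >> /opt/tivoli/tsm/client/ba/bin/dsm.sys <<'EOF'
    SErvername         tsmsrv1
      COMMMethod       TCPip
      TCPServeraddress tsmsrv1.example.com
    SErvername         tsmsrv2
      COMMMethod       TCPip
      TCPServeraddress tsmsrv2.example.com
    EOF

    # Drive different parts of the file system to different servers, so backup traffic
    # and server-side housekeeping (expiration, reclamation) are spread across both
    /usr/lpp/mmfs/bin/mmbackup /gpfs/gpfs01/projects -t incremental --tsm-servers tsmsrv1
    /usr/lpp/mmfs/bin/mmbackup /gpfs/gpfs01/scratch  -t incremental --tsm-servers tsmsrv2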
In consideration of the significant growth of data, it can be assumed that more and more customers will be faced with this challenge in the future. Therefore, this paper was written to help address this situation. This paper describes the setup and configuration of multiple IBM Spectrum Protect servers to be used to store backup and HSM data of a single IBM Spectrum Scale file system. Besides the setup and configuration, the paper also contains several best practices that help to simplify the daily use and administration of such environments. Find the paper here: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection A big THANK YOU goes to my co-writers Thomas Schreiber and Patrick Luft for their important input and all the tests (...and re-tests and re-tests and re-tests :-) ) they did. ...please share in your communities. Greetings, Dominic. ______________________________________________________________________________________________________________ Dominic Mueller-Wicke | IBM Spectrum Protect Development | Technical Lead | +49 7034 64 32794 | dominic.mueller at de.ibm.com Vorsitzende des Aufsichtsrats: Martina Koederitz; Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Mon Aug 29 18:33:59 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 13:33:59 -0400 Subject: [gpfsug-discuss] iowait? Message-ID: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> Hi Everyone, Would it be easy to have GPFS report iowait values in linux? This would be a huge help for us in determining whether a node's low utilization is due to some issue with the code running on it or if it's blocked on I/O, especially in a historical context. I naively tried on a test system changing schedule() in cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: again: /* call the scheduler */ if ( waitFlags & INTERRUPTIBLE ) schedule(); else io_schedule(); Seems to actually do what I'm after, but generally bad things happen when I start pretending I'm a kernel developer. Any thoughts? If I open an RFE would this be something that's relatively easy to implement (not asking for a commitment *to* implement it, just that I'm not asking for something seemingly simple that's actually fairly hard to implement)? -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From chekh at stanford.edu Mon Aug 29 18:50:23 2016 From: chekh at stanford.edu (Alex Chekholko) Date: Mon, 29 Aug 2016 10:50:23 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> Message-ID: <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> Any reason you can't just use iostat or collectl or any of a number of other standard tools to look at disk utilization? On 08/29/2016 10:33 AM, Aaron Knister wrote: > Hi Everyone, > > Would it be easy to have GPFS report iowait values in linux? This would > be a huge help for us in determining whether a node's low utilization is > due to some issue with the code running on it or if it's blocked on I/O, > especially in a historical context.
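To make the gap concrete: the iowait figure the standard tools report comes from the kernel's CPU counters, which only advance while some task is blocked in an io_schedule()-style wait, so time spent blocked inside the GPFS client (which uses plain schedule()) never lands there. A generic sampling sketch follows; it assumes the usual /proc/stat layout and is nothing GPFS-specific.

    # The 5th value after the "cpu" label in /proc/stat is cumulative iowait time in ticks
    read -r _ user nice system idle iowait _ < /proc/stat
    sleep 10
    read -r _ user2 nice2 system2 idle2 iowait2 _ < /proc/stat
    echo "iowait ticks over the last 10s: $((iowait2 - iowait))"

    # Or let sysstat do the math as a percentage: 10-second samples, 6 of them
    sar -u 10 6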
> > I naively tried on a test system changing schedule() in > cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: > > again: > /* call the scheduler */ > if ( waitFlags & INTERRUPTIBLE ) > schedule(); > else > io_schedule(); > > Seems to actually do what I'm after but generally bad things happen when > I start pretending I'm a kernel developer. > > Any thoughts? If I open an RFE would this be something that's relatively > easy to implement (not asking for a commitment *to* implement it, just > that I'm not asking for something seemingly simple that's actually > fairly hard to implement)? > > -Aaron > -- Alex Chekholko chekh at stanford.edu From aaron.s.knister at nasa.gov Mon Aug 29 18:54:12 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 13:54:12 -0400 Subject: [gpfsug-discuss] iowait? In-Reply-To: <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> Message-ID: <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. On 8/29/16 1:50 PM, Alex Chekholko wrote: > Any reason you can't just use iostat or collectl or any of a number of > other standards tools to look at disk utilization? > > On 08/29/2016 10:33 AM, Aaron Knister wrote: >> Hi Everyone, >> >> Would it be easy to have GPFS report iowait values in linux? This would >> be a huge help for us in determining whether a node's low utilization is >> due to some issue with the code running on it or if it's blocked on I/O, >> especially in a historical context. >> >> I naively tried on a test system changing schedule() in >> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >> >> again: >> /* call the scheduler */ >> if ( waitFlags & INTERRUPTIBLE ) >> schedule(); >> else >> io_schedule(); >> >> Seems to actually do what I'm after but generally bad things happen when >> I start pretending I'm a kernel developer. >> >> Any thoughts? If I open an RFE would this be something that's relatively >> easy to implement (not asking for a commitment *to* implement it, just >> that I'm not asking for something seemingly simple that's actually >> fairly hard to implement)? >> >> -Aaron >> > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Mon Aug 29 18:56:25 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 29 Aug 2016 17:56:25 +0000 Subject: [gpfsug-discuss] iowait? 
In-Reply-To: <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> There is the iohist data that may have what you're looking for, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 12:54 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] iowait? Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. On 8/29/16 1:50 PM, Alex Chekholko wrote: > Any reason you can't just use iostat or collectl or any of a number of > other standards tools to look at disk utilization? > > On 08/29/2016 10:33 AM, Aaron Knister wrote: >> Hi Everyone, >> >> Would it be easy to have GPFS report iowait values in linux? This >> would be a huge help for us in determining whether a node's low >> utilization is due to some issue with the code running on it or if >> it's blocked on I/O, especially in a historical context. >> >> I naively tried on a test system changing schedule() in >> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >> >> again: >> /* call the scheduler */ >> if ( waitFlags & INTERRUPTIBLE ) >> schedule(); >> else >> io_schedule(); >> >> Seems to actually do what I'm after but generally bad things happen >> when I start pretending I'm a kernel developer. >> >> Any thoughts? If I open an RFE would this be something that's >> relatively easy to implement (not asking for a commitment *to* >> implement it, just that I'm not asking for something seemingly simple >> that's actually fairly hard to implement)? >> >> -Aaron >> > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From olaf.weiser at de.ibm.com Mon Aug 29 19:02:38 2016 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Mon, 29 Aug 2016 20:02:38 +0200 Subject: [gpfsug-discuss] iowait? 
In-Reply-To: <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Mon Aug 29 19:04:32 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 14:04:32 -0400 Subject: [gpfsug-discuss] iowait? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. Is there a better way to grab the data or have gpfs capture more of it? Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. -Aaron On 8/29/16 1:56 PM, Bryan Banister wrote: > There is the iohist data that may have what you're looking for, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 12:54 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] iowait? > > Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. > > On 8/29/16 1:50 PM, Alex Chekholko wrote: >> Any reason you can't just use iostat or collectl or any of a number of >> other standards tools to look at disk utilization? >> >> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>> Hi Everyone, >>> >>> Would it be easy to have GPFS report iowait values in linux? This >>> would be a huge help for us in determining whether a node's low >>> utilization is due to some issue with the code running on it or if >>> it's blocked on I/O, especially in a historical context. >>> >>> I naively tried on a test system changing schedule() in >>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>> >>> again: >>> /* call the scheduler */ >>> if ( waitFlags & INTERRUPTIBLE ) >>> schedule(); >>> else >>> io_schedule(); >>> >>> Seems to actually do what I'm after but generally bad things happen >>> when I start pretending I'm a kernel developer. >>> >>> Any thoughts? If I open an RFE would this be something that's >>> relatively easy to implement (not asking for a commitment *to* >>> implement it, just that I'm not asking for something seemingly simple >>> that's actually fairly hard to implement)? 
>>> >>> -Aaron >>> >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Mon Aug 29 19:06:36 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 29 Aug 2016 18:06:36 +0000 Subject: [gpfsug-discuss] iowait? In-Reply-To: <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Try this: mmchconfig ioHistorySize=1024 # Or however big you want! Cheers, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 1:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] iowait? That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. Is there a better way to grab the data or have gpfs capture more of it? Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. -Aaron On 8/29/16 1:56 PM, Bryan Banister wrote: > There is the iohist data that may have what you're looking for, -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron > Knister > Sent: Monday, August 29, 2016 12:54 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] iowait? > > Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. 
That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. > > On 8/29/16 1:50 PM, Alex Chekholko wrote: >> Any reason you can't just use iostat or collectl or any of a number >> of other standards tools to look at disk utilization? >> >> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>> Hi Everyone, >>> >>> Would it be easy to have GPFS report iowait values in linux? This >>> would be a huge help for us in determining whether a node's low >>> utilization is due to some issue with the code running on it or if >>> it's blocked on I/O, especially in a historical context. >>> >>> I naively tried on a test system changing schedule() in >>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>> >>> again: >>> /* call the scheduler */ >>> if ( waitFlags & INTERRUPTIBLE ) >>> schedule(); >>> else >>> io_schedule(); >>> >>> Seems to actually do what I'm after but generally bad things happen >>> when I start pretending I'm a kernel developer. >>> >>> Any thoughts? If I open an RFE would this be something that's >>> relatively easy to implement (not asking for a commitment *to* >>> implement it, just that I'm not asking for something seemingly >>> simple that's actually fairly hard to implement)? >>> >>> -Aaron >>> >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight > Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. 
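A sketch of the sampling approach being discussed here, for what it's worth. The buffer size and interval are just the ballpark figures from this thread and would need testing on your own nodes (a larger ioHistorySize may need the usual mmchconfig -i or a daemon recycle to take effect, check the docs), and Scott's warning further down about not polling iohist too aggressively applies.

    # Enlarge the in-memory I/O history (default 512 entries); ~10k is the figure
    # mentioned above for covering a 10-second window on a busy node
    /usr/lpp/mmfs/bin/mmchconfig ioHistorySize=10240

    # Sample it on a fixed interval rather than continuously, e.g.:
    while sleep 10; do
        { date '+%F %T'; /usr/lpp/mmfs/bin/mmdiag --iohist; } >> /var/log/gpfs-iohist.log
    done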
The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From aaron.s.knister at nasa.gov Mon Aug 29 19:09:36 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 14:09:36 -0400 Subject: [gpfsug-discuss] iowait? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> Nice! Thanks Bryan. I wonder what the implications are of setting it to something high enough that we could capture data every 10s. I figure if 512 events only takes me to 1 second I would need to log in the realm of 10k to capture every 10 seconds and account for spikes in I/O. -Aaron On 8/29/16 2:06 PM, Bryan Banister wrote: > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? 
This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Mon Aug 29 19:11:05 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 29 Aug 2016 18:11:05 +0000 Subject: [gpfsug-discuss] iowait? In-Reply-To: <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB063147A9@CHI-EXCHANGEW1.w2k.jumptrading.com> That's a good question, but I don't expect it should cause you much of a problem. Of course testing and trying to measure any impact would be wise, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 1:10 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] iowait? Nice! Thanks Bryan. I wonder what the implications are of setting it to something high enough that we could capture data every 10s. I figure if 512 events only takes me to 1 second I would need to log in the realm of 10k to capture every 10 seconds and account for spikes in I/O. -Aaron On 8/29/16 2:06 PM, Bryan Banister wrote: > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron > Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, >> -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. 
Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight > Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From sfadden at us.ibm.com Mon Aug 29 20:33:14 2016 From: sfadden at us.ibm.com (Scott Fadden) Date: Mon, 29 Aug 2016 12:33:14 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu><5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com><7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> Message-ID: There is a known performance issue that can possibly cause longer than expected network time-outs if you are running iohist too often. So be careful it is best to collect it as a sample, instead of all of the time. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale From: Aaron Knister To: Date: 08/29/2016 11:09 AM Subject: Re: [gpfsug-discuss] iowait? Sent by: gpfsug-discuss-bounces at spectrumscale.org Nice! Thanks Bryan. I wonder what the implications are of setting it to something high enough that we could capture data every 10s. I figure if 512 events only takes me to 1 second I would need to log in the realm of 10k to capture every 10 seconds and account for spikes in I/O. -Aaron On 8/29/16 2:06 PM, Bryan Banister wrote: > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? 
> > Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Aug 29 20:37:13 2016 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Mon, 29 Aug 2016 19:37:13 +0000 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: Hi Richard, You can of course change any of the other options with the "net conf" (/usr/lpp/mmfs/bin/net conf) command. As its just stored in the Samba registry. Of course whether or not you end up with a supported configuration is a different matter... When we first rolled out CES/SMB, there were a number of issues with setting it up in the way we needed for our environment (AD for auth, LDAP for identity) which at the time wasn't available through the config tools. I believe this has now changed though I haven't gone back and "reset" our configs. Simon ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Sobey, Richard A [r.sobey at imperial.ac.uk] Sent: 22 August 2016 14:28 To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] CES mmsmb options Related to my previous question in so far as it?s to do with CES, what?s this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static? for example log size / location / dmapi support? 
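For readers following along, the two levels Simon and Richard are discussing look roughly like this. The setparm key and value in the last line are purely illustrative, and as Simon notes above, editing the Samba registry directly may leave you outside the supported configuration.

    # Supported path: what the CES tooling exposes and will let you change
    /usr/lpp/mmfs/bin/mmsmb config list
    /usr/lpp/mmfs/bin/mmsmb config change --key-info supported

    # Possible-but-at-your-own-risk path: the clustered Samba registry via the bundled net command
    /usr/lpp/mmfs/bin/net conf list
    /usr/lpp/mmfs/bin/net conf setparm global 'server string' 'CES SMB cluster'   # example key/value only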
I?m surely missing something obvious. It?s SS 4.2.0 btw. Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From usa-principal at gpfsug.org Mon Aug 29 21:13:51 2016 From: usa-principal at gpfsug.org (Spectrum Scale Users Group - USA Principal Kristy Kallback-Rose) Date: Mon, 29 Aug 2016 16:13:51 -0400 Subject: [gpfsug-discuss] SC16 Hold the Date - Spectrum Scale (GPFS) Users Group Event Message-ID: <648FFF79-343D-447E-9CC5-4E0199C29572@gpfsug.org> Hello, I know many of you may be planning your SC16 schedule already. We wanted to give you a heads up that a Spectrum Scale (GPFS) Users Group event is being planned. The event will be much like last year?s event with a combination of technical updates and user experiences and thus far is loosely planned for: Sunday (11/13) ~12p - ~5 PM with a social hour after the meeting. We hope to see you there. More details as planning progresses. Best, Kristy & Bob From S.J.Thompson at bham.ac.uk Mon Aug 29 21:27:28 2016 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Mon, 29 Aug 2016 20:27:28 +0000 Subject: [gpfsug-discuss] SC16 Hold the Date - Spectrum Scale (GPFS) Users Group Event In-Reply-To: <648FFF79-343D-447E-9CC5-4E0199C29572@gpfsug.org> References: <648FFF79-343D-447E-9CC5-4E0199C29572@gpfsug.org> Message-ID: You may also be interested in a panel session on the Friday of SC16: http://sc16.supercomputing.org/presentation/?id=pan120&sess=sess185 This isn't a user group event, but part of the technical programme for SC16, though I'm sure you will recognise some of the names from the storage community. Moderator: Simon Thompson (me) Panel: Sven Oehme (IBM Research) James Coomer (DDN) Sage Weil (RedHat/CEPH) Colin Morey (Hartree/STFC) Pam Gilman (NCAR) Martin Gasthuber (DESY) Friday 8:30 - 10:00 Simon From volobuev at us.ibm.com Mon Aug 29 21:31:17 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Mon, 29 Aug 2016 13:31:17 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu><5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com><7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: I would advise caution on using "mmdiag --iohist" heavily. In more recent code streams (V4.1, V4.2) there's a problem with internal locking that could, under certain conditions could lead to the symptoms that look very similar to sporadic network blockage. Basically, if "mmdiag --iohist" gets blocked for long periods of time (e.g. due to local disk/NFS performance issues), this may end up blocking an mmfsd receiver thread, delaying RPC processing. The problem was discovered fairly recently, and the fix hasn't made it out to all service streams yet. More generally, IO history is a valuable tool for troubleshooting disk IO performance issues, but the tool doesn't have the right semantics for regular, systemic IO performance sampling and monitoring. The query operation is too expensive, the coverage is subject to load, and the output is somewhat unstructured. With some effort, one can still build some form of a roll-your-own monitoring implement, but this is certainly not an optimal way of approaching the problem. 
The data should be available in a structured form, through a channel that supports light-weight, flexible querying that doesn't impact mainline IO processing. In Spectrum Scale, this type of data is fed from mmfsd to Zimon, via an mmpmon interface, and end users can then query Zimon for raw or partially processed data. Where it comes to high-volume stats, retaining raw data at its full resolution is only practical for relatively short periods of time (seconds, or perhaps a small number of minutes), and some form of aggregation is necessary for covering longer periods of time (hours to days). In the current versions of the product, there's a very similar type of data available this way: RPC stats. There are plans to make IO history data available in a similar fashion. The entire approach may need to be re-calibrated, however. Making RPC stats available doesn't appear to have generated a surge of user interest. This is probably because the data is too complex for casual processing, and while without doubt a lot of very valuable insight can be gained by analyzing RPC stats, the actual effort required to do so is too much for most users. That is, we need to provide some tools for raw data analytics. Largely the same argument applies to IO stats. In fact, on an NSD client IO stats are actually a subset of RPC stats. With some effort, one can perform a comprehensive analysis of NSD client IO stats by analyzing NSD client-to-server RPC traffic. One can certainly argue that the effort required is a bit much though. Getting back to the original question: would the proposed cxiWaitEventWait () change work? It'll likely result in nr_iowait being incremented every time a thread in GPFS code performs an uninterruptible wait. This could be an act of performing an actual IO request, or something else, e.g. waiting for a lock. Those may be the desirable semantics in some scenarios, but I wouldn't agree that it's the right behavior for any uninterruptible wait. io_schedule() is intended for use for block device IO waits, so using it this way is not in line with the code intent, which is never a good idea. Besides, relative to schedule(), io_schedule() has some overhead that could have performance implications of an uncertain nature. yuri From: Bryan Banister To: gpfsug main discussion list , Date: 08/29/2016 11:06 AM Subject: Re: [gpfsug-discuss] iowait? Sent by: gpfsug-discuss-bounces at spectrumscale.org Try this: mmchconfig ioHistorySize=1024 # Or however big you want! Cheers, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 1:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] iowait? That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. Is there a better way to grab the data or have gpfs capture more of it? Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. 
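As background for the mmpmon logging Aaron describes and the mmfsd-to-Zimon path mentioned above, here is a minimal sketch of pulling those per-node counters by hand. Request names other than io_s/fs_io_s, and the exact flag behaviour, should be checked against the mmpmon documentation for your release.

    # One-shot, script-parsable (-p) snapshot of node-wide GPFS I/O counters
    echo io_s | /usr/lpp/mmfs/bin/mmpmon -p -s

    # Repeated sampling from a small request file: 6 cycles, 10 seconds apart
    echo fs_io_s > /tmp/mmpmon.cmd
    /usr/lpp/mmfs/bin/mmpmon -p -s -i /tmp/mmpmon.cmd -r 6 -d 10000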
-Aaron On 8/29/16 1:56 PM, Bryan Banister wrote: > There is the iohist data that may have what you're looking for, -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron > Knister > Sent: Monday, August 29, 2016 12:54 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] iowait? > > Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. > > On 8/29/16 1:50 PM, Alex Chekholko wrote: >> Any reason you can't just use iostat or collectl or any of a number >> of other standards tools to look at disk utilization? >> >> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>> Hi Everyone, >>> >>> Would it be easy to have GPFS report iowait values in linux? This >>> would be a huge help for us in determining whether a node's low >>> utilization is due to some issue with the code running on it or if >>> it's blocked on I/O, especially in a historical context. >>> >>> I naively tried on a test system changing schedule() in >>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>> >>> again: >>> /* call the scheduler */ >>> if ( waitFlags & INTERRUPTIBLE ) >>> schedule(); >>> else >>> io_schedule(); >>> >>> Seems to actually do what I'm after but generally bad things happen >>> when I start pretending I'm a kernel developer. >>> >>> Any thoughts? If I open an RFE would this be something that's >>> relatively easy to implement (not asking for a commitment *to* >>> implement it, just that I'm not asking for something seemingly >>> simple that's actually fairly hard to implement)? >>> >>> -Aaron >>> >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight > Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From aaron.s.knister at nasa.gov Mon Aug 29 23:58:34 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 18:58:34 -0400 Subject: [gpfsug-discuss] iowait? In-Reply-To: References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <8ec95af4-4d30-a904-4ba2-cf253460754a@nasa.gov> Thanks Yuri! I thought calling io_schedule was the right thing to do because the nfs client in the kernel did this directly until fairly recently. Now it calls wait_on_bit_io which I believe ultimately calls io_schedule. Do you see a more targeted approach for having GPFS register IO wait as something that's feasible? (e.g. not registering iowait for locks, as you suggested, but doing so for file/directory operations such as read/write/readdir?) -Aaron On 8/29/16 4:31 PM, Yuri L Volobuev wrote: > I would advise caution on using "mmdiag --iohist" heavily. In more > recent code streams (V4.1, V4.2) there's a problem with internal locking > that could, under certain conditions could lead to the symptoms that > look very similar to sporadic network blockage. Basically, if "mmdiag > --iohist" gets blocked for long periods of time (e.g. due to local > disk/NFS performance issues), this may end up blocking an mmfsd receiver > thread, delaying RPC processing. The problem was discovered fairly > recently, and the fix hasn't made it out to all service streams yet. 
> > More generally, IO history is a valuable tool for troubleshooting disk > IO performance issues, but the tool doesn't have the right semantics for > regular, systemic IO performance sampling and monitoring. The query > operation is too expensive, the coverage is subject to load, and the > output is somewhat unstructured. With some effort, one can still build > some form of a roll-your-own monitoring implement, but this is certainly > not an optimal way of approaching the problem. The data should be > available in a structured form, through a channel that supports > light-weight, flexible querying that doesn't impact mainline IO > processing. In Spectrum Scale, this type of data is fed from mmfsd to > Zimon, via an mmpmon interface, and end users can then query Zimon for > raw or partially processed data. Where it comes to high-volume stats, > retaining raw data at its full resolution is only practical for > relatively short periods of time (seconds, or perhaps a small number of > minutes), and some form of aggregation is necessary for covering longer > periods of time (hours to days). In the current versions of the product, > there's a very similar type of data available this way: RPC stats. There > are plans to make IO history data available in a similar fashion. The > entire approach may need to be re-calibrated, however. Making RPC stats > available doesn't appear to have generated a surge of user interest. > This is probably because the data is too complex for casual processing, > and while without doubt a lot of very valuable insight can be gained by > analyzing RPC stats, the actual effort required to do so is too much for > most users. That is, we need to provide some tools for raw data > analytics. Largely the same argument applies to IO stats. In fact, on an > NSD client IO stats are actually a subset of RPC stats. With some > effort, one can perform a comprehensive analysis of NSD client IO stats > by analyzing NSD client-to-server RPC traffic. One can certainly argue > that the effort required is a bit much though. > > Getting back to the original question: would the proposed > cxiWaitEventWait() change work? It'll likely result in nr_iowait being > incremented every time a thread in GPFS code performs an uninterruptible > wait. This could be an act of performing an actual IO request, or > something else, e.g. waiting for a lock. Those may be the desirable > semantics in some scenarios, but I wouldn't agree that it's the right > behavior for any uninterruptible wait. io_schedule() is intended for use > for block device IO waits, so using it this way is not in line with the > code intent, which is never a good idea. Besides, relative to > schedule(), io_schedule() has some overhead that could have performance > implications of an uncertain nature. > > yuri > > Inactive hide details for Bryan Banister ---08/29/2016 11:06:59 AM---Try > this: mmchconfig ioHistorySize=1024 # Or however big yBryan Banister > ---08/29/2016 11:06:59 AM---Try this: mmchconfig ioHistorySize=1024 # Or > however big you want! > > From: Bryan Banister > To: gpfsug main discussion list , > Date: 08/29/2016 11:06 AM > Subject: Re: [gpfsug-discuss] iowait? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! 
> > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy > node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting > requirements we calculate job efficiency by comparing the number of cpu > cores requested by a given job with the cpu % utilization during that > job's time window. Currently a job that's doing a sleep 9000 would show > up the same as a job blocked on I/O. Having GPFS wait time included in > iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. 
If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) > only and may contain proprietary, confidential or privileged > information. If you are not the intended recipient, you are hereby > notified that any review, dissemination or copying of this email is > strictly prohibited, and to please notify the sender immediately and > destroy this email and any attachments. Email transmission cannot be > guaranteed to be secure or error-free. The Company, therefore, does not > make any guarantees as to the completeness or accuracy of this email or > any attachments. This email is for informational purposes only and does > not constitute a recommendation, offer, request or solicitation of any > kind to buy, sell, subscribe, redeem or perform any type of transaction > of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From volobuev at us.ibm.com Tue Aug 30 06:09:21 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Mon, 29 Aug 2016 22:09:21 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <8ec95af4-4d30-a904-4ba2-cf253460754a@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu><5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com><7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> <8ec95af4-4d30-a904-4ba2-cf253460754a@nasa.gov> Message-ID: I don't see a simple fix that can be implemented by tweaking a general-purpose low-level synchronization primitive. It should be possible to integrate GPFS better into the Linux IO accounting infrastructure, but that would require some investigation a likely a non-trivial amount of work to do right. yuri From: Aaron Knister To: , Date: 08/29/2016 03:59 PM Subject: Re: [gpfsug-discuss] iowait? Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks Yuri! 
I thought calling io_schedule was the right thing to do because the nfs client in the kernel did this directly until fairly recently. Now it calls wait_on_bit_io which I believe ultimately calls io_schedule. Do you see a more targeted approach for having GPFS register IO wait as something that's feasible? (e.g. not registering iowait for locks, as you suggested, but doing so for file/directory operations such as read/write/readdir?) -Aaron On 8/29/16 4:31 PM, Yuri L Volobuev wrote: > I would advise caution on using "mmdiag --iohist" heavily. In more > recent code streams (V4.1, V4.2) there's a problem with internal locking > that could, under certain conditions could lead to the symptoms that > look very similar to sporadic network blockage. Basically, if "mmdiag > --iohist" gets blocked for long periods of time (e.g. due to local > disk/NFS performance issues), this may end up blocking an mmfsd receiver > thread, delaying RPC processing. The problem was discovered fairly > recently, and the fix hasn't made it out to all service streams yet. > > More generally, IO history is a valuable tool for troubleshooting disk > IO performance issues, but the tool doesn't have the right semantics for > regular, systemic IO performance sampling and monitoring. The query > operation is too expensive, the coverage is subject to load, and the > output is somewhat unstructured. With some effort, one can still build > some form of a roll-your-own monitoring implement, but this is certainly > not an optimal way of approaching the problem. The data should be > available in a structured form, through a channel that supports > light-weight, flexible querying that doesn't impact mainline IO > processing. In Spectrum Scale, this type of data is fed from mmfsd to > Zimon, via an mmpmon interface, and end users can then query Zimon for > raw or partially processed data. Where it comes to high-volume stats, > retaining raw data at its full resolution is only practical for > relatively short periods of time (seconds, or perhaps a small number of > minutes), and some form of aggregation is necessary for covering longer > periods of time (hours to days). In the current versions of the product, > there's a very similar type of data available this way: RPC stats. There > are plans to make IO history data available in a similar fashion. The > entire approach may need to be re-calibrated, however. Making RPC stats > available doesn't appear to have generated a surge of user interest. > This is probably because the data is too complex for casual processing, > and while without doubt a lot of very valuable insight can be gained by > analyzing RPC stats, the actual effort required to do so is too much for > most users. That is, we need to provide some tools for raw data > analytics. Largely the same argument applies to IO stats. In fact, on an > NSD client IO stats are actually a subset of RPC stats. With some > effort, one can perform a comprehensive analysis of NSD client IO stats > by analyzing NSD client-to-server RPC traffic. One can certainly argue > that the effort required is a bit much though. > > Getting back to the original question: would the proposed > cxiWaitEventWait() change work? It'll likely result in nr_iowait being > incremented every time a thread in GPFS code performs an uninterruptible > wait. This could be an act of performing an actual IO request, or > something else, e.g. waiting for a lock. 
Those may be the desirable > semantics in some scenarios, but I wouldn't agree that it's the right > behavior for any uninterruptible wait. io_schedule() is intended for use > for block device IO waits, so using it this way is not in line with the > code intent, which is never a good idea. Besides, relative to > schedule(), io_schedule() has some overhead that could have performance > implications of an uncertain nature. > > yuri > > Inactive hide details for Bryan Banister ---08/29/2016 11:06:59 AM---Try > this: mmchconfig ioHistorySize=1024 # Or however big yBryan Banister > ---08/29/2016 11:06:59 AM---Try this: mmchconfig ioHistorySize=1024 # Or > however big you want! > > From: Bryan Banister > To: gpfsug main discussion list , > Date: 08/29/2016 11:06 AM > Subject: Re: [gpfsug-discuss] iowait? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy > node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting > requirements we calculate job efficiency by comparing the number of cpu > cores requested by a given job with the cpu % utilization during that > job's time window. Currently a job that's doing a sleep 9000 would show > up the same as a job blocked on I/O. Having GPFS wait time included in > iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. 
>>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) > only and may contain proprietary, confidential or privileged > information. If you are not the intended recipient, you are hereby > notified that any review, dissemination or copying of this email is > strictly prohibited, and to please notify the sender immediately and > destroy this email and any attachments. Email transmission cannot be > guaranteed to be secure or error-free. The Company, therefore, does not > make any guarantees as to the completeness or accuracy of this email or > any attachments. This email is for informational purposes only and does > not constitute a recommendation, offer, request or solicitation of any > kind to buy, sell, subscribe, redeem or perform any type of transaction > of a financial product. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Tue Aug 30 09:34:33 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 30 Aug 2016 08:34:33 +0000 Subject: [gpfsug-discuss] CES network aliases Message-ID: Hi all, It's Tuesday morning and that means question time :) So from http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adv.doc/bl1adv_cesnetworkconfig.htm, I've extracted the following: How to use an alias To use an alias address for CES, you need to provide a static IP address that is not already defined as an alias in the /etc/sysconfig/network-scripts directory. Before you enable the node as a CES node, configure the network adapters for each subnet that are represented in the CES address pool: 1. Define a static IP address for the device: 2. /etc/sysconfig/network-scripts/ifcfg-eth0 3. DEVICE=eth1 4. BOOTPROTO=none 5. IPADDR=10.1.1.10 6. NETMASK=255.255.255.0 7. ONBOOT=yes 8. GATEWAY=10.1.1.1 TYPE=Ethernet 1. Ensure that there are no aliases that are defined in the network-scripts directory for this interface: 10.# ls -l /etc/sysconfig/network-scripts/ifcfg-eth1:* ls: /etc/sysconfig/network-scripts/ifcfg-eth1:*: No such file or directory After the node is enabled as a CES node, no further action is required. CES addresses are added as aliases to the already configured adapters. Now, does this mean for every floating (CES) IP address I need a separate ifcfg-ethX on each node? At the moment I simply have an ifcfg-X file representing each physical network adapter, and then the CES IPs defined. I can see IP addresses being added during failover to the primary interface, but now I've read I potentially need to create a separate file. What's the right way to move forward? If I need separate files, I presume the listed IP is a CES IP (not system) and does it also matter what X is in ifcfg-ethX? Many thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Tue Aug 30 10:54:31 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 30 Aug 2016 09:54:31 +0000 Subject: [gpfsug-discuss] CES network aliases In-Reply-To: References: Message-ID: You only need a static address for your ifcfg-ethX on all nodes, and can then have CES manage multiple floating addresses in that subnet. Also, it doesn't matter much what your interfaces are named (ethX, vlanX, bondX, ethX.5), GPFS will just find the interface that covers the floating address in its subnet, and add the alias there. -jf -------------- next part -------------- An HTML attachment was scrubbed... 
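In practice that means one static ifcfg file per physical interface on each protocol node, and the floating addresses handed to CES to manage, roughly as follows (device name and addresses below are examples only):

# /etc/sysconfig/network-scripts/ifcfg-eth1  -- static base address, no aliases defined by hand
DEVICE=eth1
BOOTPROTO=none
IPADDR=10.1.1.10
NETMASK=255.255.255.0
ONBOOT=yes

# CES then adds and removes the floating aliases on that interface itself:
mmces address add --ces-ip 10.1.1.100,10.1.1.101
mmces address list
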
URL: From r.sobey at imperial.ac.uk Tue Aug 30 11:30:25 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 30 Aug 2016 10:30:25 +0000 Subject: [gpfsug-discuss] CES network aliases In-Reply-To: References: Message-ID: Ace thanks jf. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jan-Frode Myklebust Sent: 30 August 2016 10:55 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] CES network aliases You only need a static address for your ifcfg-ethX on all nodes, and can then have CES manage multiple floating addresses in that subnet. Also, it doesn't matter much what your interfaces are named (ethX, vlanX, bondX, ethX.5), GPFS will just find the interface that covers the floating address in its subnet, and add the alias there. -jf -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Tue Aug 30 15:58:41 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Tue, 30 Aug 2016 10:58:41 -0400 Subject: [gpfsug-discuss] Data Replication Message-ID: All, If I setup a filesystem to have data replication of 2 (2 copies of data), does the data get replicated at the NSD Server or at the client? i.e. Does the client send 2 copies over the network or does the NSD Server get a single copy and then replicate on storage NSDs? I couldn't find a place in the docs that talked about this specific point. Thank you, Brian Marshall -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Tue Aug 30 16:03:38 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Tue, 30 Aug 2016 15:03:38 +0000 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> The NSD Client handles the replication and will, as you stated, write one copy to one NSD (using the primary server for this NSD) and one to a different NSD in a different GPFS failure group (using quite likely, but not necessarily, a different NSD server that is the primary server for this alternate NSD). Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Brian Marshall Sent: Tuesday, August 30, 2016 9:59 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Data Replication All, If I setup a filesystem to have data replication of 2 (2 copies of data), does the data get replicated at the NSD Server or at the client? i.e. Does the client send 2 copies over the network or does the NSD Server get a single copy and then replicate on storage NSDs? I couldn't find a place in the docs that talked about this specific point. Thank you, Brian Marshall ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 30 17:16:37 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 30 Aug 2016 12:16:37 -0400 Subject: [gpfsug-discuss] gpfs native raid Message-ID: Does anyone know if/when we might see gpfs native raid opened up for the masses on non-IBM hardware? It's hard to answer the question of "why can't GPFS do this? Lustre can" in regards to Lustre's integration with ZFS and support for RAID on commodity hardware. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Tue Aug 30 17:26:38 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Tue, 30 Aug 2016 16:26:38 +0000 Subject: [gpfsug-discuss] gpfs native raid In-Reply-To: References: Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB06316445@CHI-EXCHANGEW1.w2k.jumptrading.com> I believe that Doug is going to provide more details at the NDA session at Edge... see attached, -B -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Tuesday, August 30, 2016 11:17 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] gpfs native raid Does anyone know if/when we might see gpfs native raid opened up for the masses on non-IBM hardware? It's hard to answer the question of "why can't GPFS do this? Lustre can" in regards to Lustre's integration with ZFS and support for RAID on commodity hardware. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An embedded message was scrubbed... From: Douglas O'flaherty Subject: [gpfsug-discuss] Edge Attendees Date: Mon, 29 Aug 2016 05:34:03 +0000 Size: 9615 URL: From cdmaestas at us.ibm.com Tue Aug 30 17:47:18 2016 From: cdmaestas at us.ibm.com (Christopher Maestas) Date: Tue, 30 Aug 2016 16:47:18 +0000 Subject: [gpfsug-discuss] gpfs native raid In-Reply-To: Message-ID: Interestingly enough, Spectrum Scale can run on zvols. 
Check out: http://files.gpfsug.org/presentations/2016/anl-june/LANL_GPFS_ZFS.pdf -cdm On Aug 30, 2016, 9:17:05 AM, aaron.s.knister at nasa.gov wrote: From: aaron.s.knister at nasa.gov To: gpfsug-discuss at spectrumscale.org Cc: Date: Aug 30, 2016 9:17:05 AM Subject: [gpfsug-discuss] gpfs native raid Does anyone know if/when we might see gpfs native raid opened up for the masses on non-IBM hardware? It's hard to answer the question of "why can't GPFS do this? Lustre can" in regards to Lustre's integration with ZFS and support for RAID on commodity hardware. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 30 18:16:03 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 30 Aug 2016 13:16:03 -0400 Subject: [gpfsug-discuss] gpfs native raid In-Reply-To: References: Message-ID: <96282850-6bfa-73ae-8502-9e8df3a56390@nasa.gov> Thanks Christopher. I've tried GPFS on zvols a couple times and the write throughput I get is terrible because of the required sync=always parameter. Perhaps a couple of SSD's could help get the number up, though. -Aaron On 8/30/16 12:47 PM, Christopher Maestas wrote: > Interestingly enough, Spectrum Scale can run on zvols. Check out: > > http://files.gpfsug.org/presentations/2016/anl-june/LANL_GPFS_ZFS.pdf > > -cdm > > ------------------------------------------------------------------------ > On Aug 30, 2016, 9:17:05 AM, aaron.s.knister at nasa.gov wrote: > > From: aaron.s.knister at nasa.gov > To: gpfsug-discuss at spectrumscale.org > Cc: > Date: Aug 30, 2016 9:17:05 AM > Subject: [gpfsug-discuss] gpfs native raid > > Does anyone know if/when we might see gpfs native raid opened up for the > masses on non-IBM hardware? It's hard to answer the question of "why > can't GPFS do this? Lustre can" in regards to Lustre's integration with > ZFS and support for RAID on commodity hardware. > -Aaron > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From laurence at qsplace.co.uk Tue Aug 30 19:50:51 2016 From: laurence at qsplace.co.uk (Laurence Horrocks-Barlow) Date: Tue, 30 Aug 2016 20:50:51 +0200 Subject: [gpfsug-discuss] Data Replication In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: Its the client that does all the synchronous replication, this way the cluster is able to scale as the clients do the leg work (so to speak). 
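Which copy lands where is driven by the failure groups assigned to the NSDs and the replication settings on the file system, along these lines (names and values are illustrative only):

# Two NSDs placed in different failure groups via their stanzas:
%nsd: nsd=nsd_a servers=nsdserver01 usage=dataAndMetadata failureGroup=1
%nsd: nsd=nsd_b servers=nsdserver02 usage=dataAndMetadata failureGroup=2

# File system created with two copies of data and metadata:
mmcrfs gpfs01 -F nsd.stanza -m 2 -M 2 -r 2 -R 2

# Each client then writes one copy of every block to an NSD in each failure group.
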
The somewhat "exception" is if a GPFS NSD server (or client with direct NSD) access uses a server bases protocol such as SMB, in this case the SMB server will do the replication as the SMB client doesn't know about GPFS or its replication; essentially the SMB server is the GPFS client. -- Lauz On 30 August 2016 17:03:38 CEST, Bryan Banister wrote: >The NSD Client handles the replication and will, as you stated, write >one copy to one NSD (using the primary server for this NSD) and one to >a different NSD in a different GPFS failure group (using quite likely, >but not necessarily, a different NSD server that is the primary server >for this alternate NSD). >Cheers, >-Bryan > >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Brian >Marshall >Sent: Tuesday, August 30, 2016 9:59 AM >To: gpfsug main discussion list >Subject: [gpfsug-discuss] Data Replication > >All, > >If I setup a filesystem to have data replication of 2 (2 copies of >data), does the data get replicated at the NSD Server or at the client? >i.e. Does the client send 2 copies over the network or does the NSD >Server get a single copy and then replicate on storage NSDs? > >I couldn't find a place in the docs that talked about this specific >point. > >Thank you, >Brian Marshall > >________________________________ > >Note: This email is for the confidential use of the named addressee(s) >only and may contain proprietary, confidential or privileged >information. If you are not the intended recipient, you are hereby >notified that any review, dissemination or copying of this email is >strictly prohibited, and to please notify the sender immediately and >destroy this email and any attachments. Email transmission cannot be >guaranteed to be secure or error-free. The Company, therefore, does not >make any guarantees as to the completeness or accuracy of this email or >any attachments. This email is for informational purposes only and does >not constitute a recommendation, offer, request or solicitation of any >kind to buy, sell, subscribe, redeem or perform any type of transaction >of a financial product. > > >------------------------------------------------------------------------ > >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Tue Aug 30 19:52:54 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Tue, 30 Aug 2016 14:52:54 -0400 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: Thanks. This confirms the numbers that I am seeing. Brian On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow < laurence at qsplace.co.uk> wrote: > Its the client that does all the synchronous replication, this way the > cluster is able to scale as the clients do the leg work (so to speak). > > The somewhat "exception" is if a GPFS NSD server (or client with direct > NSD) access uses a server bases protocol such as SMB, in this case the SMB > server will do the replication as the SMB client doesn't know about GPFS or > its replication; essentially the SMB server is the GPFS client. 
> > -- Lauz > > On 30 August 2016 17:03:38 CEST, Bryan Banister > wrote: > >> The NSD Client handles the replication and will, as you stated, write one >> copy to one NSD (using the primary server for this NSD) and one to a >> different NSD in a different GPFS failure group (using quite likely, but >> not necessarily, a different NSD server that is the primary server for this >> alternate NSD). >> >> Cheers, >> >> -Bryan >> >> >> >> *From:* gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss- >> bounces at spectrumscale.org] *On Behalf Of *Brian Marshall >> *Sent:* Tuesday, August 30, 2016 9:59 AM >> *To:* gpfsug main discussion list >> *Subject:* [gpfsug-discuss] Data Replication >> >> >> >> All, >> >> >> >> If I setup a filesystem to have data replication of 2 (2 copies of data), >> does the data get replicated at the NSD Server or at the client? i.e. Does >> the client send 2 copies over the network or does the NSD Server get a >> single copy and then replicate on storage NSDs? >> >> >> >> I couldn't find a place in the docs that talked about this specific point. >> >> >> >> Thank you, >> >> Brian Marshall >> >> >> ------------------------------ >> >> Note: This email is for the confidential use of the named addressee(s) >> only and may contain proprietary, confidential or privileged information. >> If you are not the intended recipient, you are hereby notified that any >> review, dissemination or copying of this email is strictly prohibited, and >> to please notify the sender immediately and destroy this email and any >> attachments. Email transmission cannot be guaranteed to be secure or >> error-free. The Company, therefore, does not make any guarantees as to the >> completeness or accuracy of this email or any attachments. This email is >> for informational purposes only and does not constitute a recommendation, >> offer, request or solicitation of any kind to buy, sell, subscribe, redeem >> or perform any type of transaction of a financial product. >> >> ------------------------------ >> >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > -- > Sent from my Android device with K-9 Mail. Please excuse my brevity. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Aug 30 20:09:05 2016 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 30 Aug 2016 19:09:05 +0000 Subject: [gpfsug-discuss] Maximum value for data replication? Message-ID: Is there a maximum value for data replication in Spectrum Scale? I have a number of nsd servers which have local storage and Id like each node to have a full copy of all the data in the file-system, say this value is 4, can I set replication to 4 for data and metadata and have each server have a full copy? These are protocol nodes and multi cluster mount another file system (yes I know not supported) and the cesroot is in the remote file system. On several occasions where GPFS has wibbled a bit, this has caused issues with ces locks, so I was thinking of moving the cesroot to a local filesysyem which is replicated on the local ssds in the protocol nodes. I.e. Its a generally quiet file system as its only ces cluster config. 
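Roughly what I have in mind, as a sketch only (option names from memory, so they may need checking against the docs):

mmces service stop NFS -a          # and SMB / OBJ if enabled
rsync -a /gpfs/remotefs/ces/ /gpfs/localfs/ces/
mmchconfig cesSharedRoot=/gpfs/localfs/ces    # may also need the CES nodes suspended first
mmces service start NFS -a
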
I assume if I stop protocols, rsync the data and then change to the new ces root, I should be able to get this working? Thanks Simon From kevindjo at us.ibm.com Tue Aug 30 20:43:39 2016 From: kevindjo at us.ibm.com (Kevin D Johnson) Date: Tue, 30 Aug 2016 19:43:39 +0000 Subject: [gpfsug-discuss] greetings Message-ID: An HTML attachment was scrubbed... URL: From xhejtman at ics.muni.cz Tue Aug 30 21:39:18 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Tue, 30 Aug 2016 22:39:18 +0200 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> Hello, does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount and file system. The internal mount cmd gets stucked. -- Luk?? Hejtm?nek From kevindjo at us.ibm.com Tue Aug 30 21:51:39 2016 From: kevindjo at us.ibm.com (Kevin D Johnson) Date: Tue, 30 Aug 2016 20:51:39 +0000 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> Message-ID: An HTML attachment was scrubbed... URL: From mark.bergman at uphs.upenn.edu Tue Aug 30 22:07:21 2016 From: mark.bergman at uphs.upenn.edu (mark.bergman at uphs.upenn.edu) Date: Tue, 30 Aug 2016 17:07:21 -0400 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: Your message of "Tue, 30 Aug 2016 22:39:18 +0200." <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> Message-ID: <24437-1472591241.445832@bR6O.TofS.917u> In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, The pithy ruminations from Lukas Hejtmanek on <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: => Hello, GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, but at kernel 2.6.32-573 and lower. I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel revs that caused multipath errors, resulting in GPFS being unable to find all NSDs and mount the filesystem. I am not updating to a newer kernel until I'm certain this is resolved. I opened a bug with CentOS: https://bugs.centos.org/view.php?id=10997 and began an extended discussion with the (RH & SUSE) developers of that chunk of kernel code. I don't know if an upstream bug has been opened by RH, but see: https://patchwork.kernel.org/patch/9140337/ => => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the => latest patch 32) does start but does not mount and file system. The internal => mount cmd gets stucked. => => -- => Luk?? Hejtm?nek -- Mark Bergman voice: 215-746-4061 mark.bergman at uphs.upenn.edu fax: 215-614-0266 http://www.cbica.upenn.edu/ IT Technical Director, Center for Biomedical Image Computing and Analytics Department of Radiology University of Pennsylvania PGP Key: http://www.cbica.upenn.edu/sbia/bergman From xhejtman at ics.muni.cz Tue Aug 30 23:02:50 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Wed, 31 Aug 2016 00:02:50 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" In-Reply-To: References: Message-ID: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. 
As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek From oehmes at gmail.com Wed Aug 31 00:24:59 2016 From: oehmes at gmail.com (Sven Oehme) Date: Tue, 30 Aug 2016 16:24:59 -0700 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" In-Reply-To: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> References: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> Message-ID: so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek wrote: > Hello, > > On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > > Find the paper here: > > > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/ > Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection > > thank you for the paper, I appreciate it. > > However, I wonder whether it could be extended a little. As it has the > title > Petascale Data Protection, I think that in Peta scale, you have to deal > with > millions (well rather hundreds of millions) of files you store in and this > is > something where TSM does not scale well. > > Could you give some hints: > > On the backup site: > mmbackup takes ages for: > a) scan (try to scan 500M files even in parallel) > b) backup - what if 10 % of files get changed - backup process can be > blocked > several days as mmbackup cannot run in several instances on the same file > system, so you have to wait until one run of mmbackup finishes. How long > could > it take at petascale? > > On the restore site: > how can I restore e.g. 40 millions of file efficiently? dsmc restore > '/path/*' > runs into serious troubles after say 20M files (maybe wrong internal > structures used), however, scanning 1000 more files takes several minutes > resulting the dsmc restore never reaches that 40M files. > > using filelists the situation is even worse. I run dsmc restore -filelist > with a filelist consisting of 2.4M files. Running for *two* days without > restoring even a single file. dsmc is consuming 100 % CPU. 
> > So any hints addressing these issues with really large number of files > would > be even more appreciated. > > -- > Luk?? Hejtm?nek > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Wed Aug 31 05:00:45 2016 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Wed, 31 Aug 2016 04:00:45 +0000 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" References: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Message-ID: <5F910253243E6A47B81A9A2EB424BBA101CFF7DB@NDMSMBX404.ndc.nasa.gov> Just want to add on to one of the points Sven touched on regarding metadata HW. We have a modest SSD infrastructure for our metadata disks and we can scan 500M inodes in parallel in about 5 hours if my memory serves me right (and I believe we could go faster if we really wanted to). I think having solid metadata disks (no pun intended) will really help with scan times. From: Sven Oehme Sent: 8/30/16, 7:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek > wrote: Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? 
Hejtm?nek _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Aug 31 05:52:57 2016 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 31 Aug 2016 06:52:57 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" In-Reply-To: <5F910253243E6A47B81A9A2EB424BBA101CFF7DB@NDMSMBX404.ndc.nasa.gov> References: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" <5F910253243E6A47B81A9A2EB424BBA101CFF7DB@NDMSMBX404.ndc.nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: From dominic.mueller at de.ibm.com Wed Aug 31 06:52:38 2016 From: dominic.mueller at de.ibm.com (Dominic Mueller-Wicke01) Date: Wed, 31 Aug 2016 07:52:38 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Dominic Mueller-Wicke) In-Reply-To: References: Message-ID: Thanks for reading the paper. I agree that the restore of a large number of files is a challenge today. The restore is the focus area for future enhancements for the integration between IBM Spectrum Scale and IBM Spectrum Protect. If something will be available that helps to improve the restore capabilities the paper will be updated with this information. Greetings, Dominic. From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 31.08.2016 01:25 Subject: gpfsug-discuss Digest, Vol 55, Issue 55 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Maximum value for data replication? (Simon Thompson (Research Computing - IT Services)) 2. greetings (Kevin D Johnson) 3. GPFS 3.5.0 on RHEL 6.8 (Lukas Hejtmanek) 4. Re: GPFS 3.5.0 on RHEL 6.8 (Kevin D Johnson) 5. Re: GPFS 3.5.0 on RHEL 6.8 (mark.bergman at uphs.upenn.edu) 6. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Lukas Hejtmanek) 7. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Sven Oehme) ----- Message from "Simon Thompson (Research Computing - IT Services)" on Tue, 30 Aug 2016 19:09:05 +0000 ----- To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] Maximum value for data replication? Is there a maximum value for data replication in Spectrum Scale? I have a number of nsd servers which have local storage and Id like each node to have a full copy of all the data in the file-system, say this value is 4, can I set replication to 4 for data and metadata and have each server have a full copy? These are protocol nodes and multi cluster mount another file system (yes I know not supported) and the cesroot is in the remote file system. On several occasions where GPFS has wibbled a bit, this has caused issues with ces locks, so I was thinking of moving the cesroot to a local filesysyem which is replicated on the local ssds in the protocol nodes. I.e. 
Its a generally quiet file system as its only ces cluster config. I assume if I stop protocols, rsync the data and then change to the new ces root, I should be able to get this working? Thanks Simon ----- Message from "Kevin D Johnson" on Tue, 30 Aug 2016 19:43:39 +0000 ----- To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] greetings I'm in Lab Services at IBM - just joining and happy to help any way I can. Kevin D. Johnson, MBA, MAFM Spectrum Computing, Senior Managing Consultant IBM Certified Deployment Professional - Spectrum Scale V4.1.1 IBM Certified Deployment Professional - Cloud Object Storage V3.8 720.349.6199 - kevindjo at us.ibm.com ----- Message from Lukas Hejtmanek on Tue, 30 Aug 2016 22:39:18 +0200 ----- To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Hello, does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount and file system. The internal mount cmd gets stucked. -- Luk?? Hejtm?nek ----- Message from "Kevin D Johnson" on Tue, 30 Aug 2016 20:51:39 +0000 ----- To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 RHEL 6.8/2.6.32-642 requires 4.1.1.8 or 4.2.1. You can either go to 6.7 for GPFS 3.5 or bump it up to 7.0/7.1. See Table 13, here: http://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html?view=kc#linuxq Kevin D. Johnson, MBA, MAFM Spectrum Computing, Senior Managing Consultant IBM Certified Deployment Professional - Spectrum Scale V4.1.1 IBM Certified Deployment Professional - Cloud Object Storage V3.8 720.349.6199 - kevindjo at us.ibm.com ----- Original message ----- From: Lukas Hejtmanek Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Date: Tue, Aug 30, 2016 4:39 PM Hello, does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount and file system. The internal mount cmd gets stucked. -- Luk?? Hejtm?nek _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ----- Message from mark.bergman at uphs.upenn.edu on Tue, 30 Aug 2016 17:07:21 -0400 ----- To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, The pithy ruminations from Lukas Hejtmanek on <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: => Hello, GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, but at kernel 2.6.32-573 and lower. I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel revs that caused multipath errors, resulting in GPFS being unable to find all NSDs and mount the filesystem. I am not updating to a newer kernel until I'm certain this is resolved. I opened a bug with CentOS: https://bugs.centos.org/view.php?id=10997 and began an extended discussion with the (RH & SUSE) developers of that chunk of kernel code. I don't know if an upstream bug has been opened by RH, but see: https://patchwork.kernel.org/patch/9140337/ => => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the => latest patch 32) does start but does not mount and file system. The internal => mount cmd gets stucked. => => -- => Luk?? 
Hejtm?nek -- Mark Bergman voice: 215-746-4061 mark.bergman at uphs.upenn.edu fax: 215-614-0266 http://www.cbica.upenn.edu/ IT Technical Director, Center for Biomedical Image Computing and Analytics Department of Radiology University of Pennsylvania PGP Key: http://www.cbica.upenn.edu/sbia/bergman ----- Message from Lukas Hejtmanek on Wed, 31 Aug 2016 00:02:50 +0200 ----- To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek ----- Message from Sven Oehme on Tue, 30 Aug 2016 16:24:59 -0700 ----- To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek wrote: Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. 
How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From xhejtman at ics.muni.cz Wed Aug 31 08:03:08 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Wed, 31 Aug 2016 09:03:08 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Dominic Mueller-Wicke) In-Reply-To: References: Message-ID: <20160831070308.fiogolgc2nhna6ir@ics.muni.cz> On Wed, Aug 31, 2016 at 07:52:38AM +0200, Dominic Mueller-Wicke01 wrote: > Thanks for reading the paper. I agree that the restore of a large number of > files is a challenge today. The restore is the focus area for future > enhancements for the integration between IBM Spectrum Scale and IBM > Spectrum Protect. If something will be available that helps to improve the > restore capabilities the paper will be updated with this information. I guess that one of the reasons that restore is slow is because this: (strace dsmc) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases/stud/atl_en/_referencenotitsig", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases/stud/atl_en", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases/stud", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home", F_OK) = 0 [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum", F_OK) = 0 it seems that dsmc tests access again and again up to root for each item in the file list if I set different location where to place the restored files. -- Luk?? 
Hejtm?nek From duersch at us.ibm.com Wed Aug 31 13:45:12 2016 From: duersch at us.ibm.com (Steve Duersch) Date: Wed, 31 Aug 2016 08:45:12 -0400 Subject: [gpfsug-discuss] Maximum value for data replication? In-Reply-To: References: Message-ID: >>Is there a maximum value for data replication in Spectrum Scale? The maximum value for replication is 3. Steve Duersch Spectrum Scale RAID 845-433-7902 IBM Poughkeepsie, New York From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 08/30/2016 07:25 PM Subject: gpfsug-discuss Digest, Vol 55, Issue 55 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Maximum value for data replication? (Simon Thompson (Research Computing - IT Services)) 2. greetings (Kevin D Johnson) 3. GPFS 3.5.0 on RHEL 6.8 (Lukas Hejtmanek) 4. Re: GPFS 3.5.0 on RHEL 6.8 (Kevin D Johnson) 5. Re: GPFS 3.5.0 on RHEL 6.8 (mark.bergman at uphs.upenn.edu) 6. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Lukas Hejtmanek) 7. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Sven Oehme) ---------------------------------------------------------------------- Message: 1 Date: Tue, 30 Aug 2016 19:09:05 +0000 From: "Simon Thompson (Research Computing - IT Services)" To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] Maximum value for data replication? Message-ID: Content-Type: text/plain; charset="us-ascii" Is there a maximum value for data replication in Spectrum Scale? I have a number of nsd servers which have local storage and Id like each node to have a full copy of all the data in the file-system, say this value is 4, can I set replication to 4 for data and metadata and have each server have a full copy? These are protocol nodes and multi cluster mount another file system (yes I know not supported) and the cesroot is in the remote file system. On several occasions where GPFS has wibbled a bit, this has caused issues with ces locks, so I was thinking of moving the cesroot to a local filesysyem which is replicated on the local ssds in the protocol nodes. I.e. Its a generally quiet file system as its only ces cluster config. I assume if I stop protocols, rsync the data and then change to the new ces root, I should be able to get this working? Thanks Simon ------------------------------ Message: 2 Date: Tue, 30 Aug 2016 19:43:39 +0000 From: "Kevin D Johnson" To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] greetings Message-ID: Content-Type: text/plain; charset="us-ascii" An HTML attachment was scrubbed... URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20160830/5a2e22a3/attachment-0001.html > ------------------------------ Message: 3 Date: Tue, 30 Aug 2016 22:39:18 +0200 From: Lukas Hejtmanek To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: <20160830203917.qptfgqvlmdbzu6wr at ics.muni.cz> Content-Type: text/plain; charset=iso-8859-2 Hello, does it work for anyone? 
As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount and file system. The internal mount cmd gets stucked. -- Luk?? Hejtm?nek ------------------------------ Message: 4 Date: Tue, 30 Aug 2016 20:51:39 +0000 From: "Kevin D Johnson" To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: Content-Type: text/plain; charset="us-ascii" An HTML attachment was scrubbed... URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20160830/341d5e11/attachment-0001.html > ------------------------------ Message: 5 Date: Tue, 30 Aug 2016 17:07:21 -0400 From: mark.bergman at uphs.upenn.edu To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: <24437-1472591241.445832 at bR6O.TofS.917u> Content-Type: text/plain; charset="UTF-8" In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, The pithy ruminations from Lukas Hejtmanek on <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: => Hello, GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, but at kernel 2.6.32-573 and lower. I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel revs that caused multipath errors, resulting in GPFS being unable to find all NSDs and mount the filesystem. I am not updating to a newer kernel until I'm certain this is resolved. I opened a bug with CentOS: https://bugs.centos.org/view.php?id=10997 and began an extended discussion with the (RH & SUSE) developers of that chunk of kernel code. I don't know if an upstream bug has been opened by RH, but see: https://patchwork.kernel.org/patch/9140337/ => => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the => latest patch 32) does start but does not mount and file system. The internal => mount cmd gets stucked. => => -- => Luk?? Hejtm?nek -- Mark Bergman voice: 215-746-4061 mark.bergman at uphs.upenn.edu fax: 215-614-0266 http://www.cbica.upenn.edu/ IT Technical Director, Center for Biomedical Image Computing and Analytics Department of Radiology University of Pennsylvania PGP Key: http://www.cbica.upenn.edu/sbia/bergman ------------------------------ Message: 6 Date: Wed, 31 Aug 2016 00:02:50 +0200 From: Lukas Hejtmanek To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Message-ID: <20160830220250.yt6r7gvfq7rlvtcs at ics.muni.cz> Content-Type: text/plain; charset=iso-8859-2 Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? 
dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek ------------------------------ Message: 7 Date: Tue, 30 Aug 2016 16:24:59 -0700 From: Sven Oehme To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Message-ID: Content-Type: text/plain; charset="utf-8" so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek wrote: > Hello, > > On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > > Find the paper here: > > > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/ > Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection > > thank you for the paper, I appreciate it. > > However, I wonder whether it could be extended a little. As it has the > title > Petascale Data Protection, I think that in Peta scale, you have to deal > with > millions (well rather hundreds of millions) of files you store in and this > is > something where TSM does not scale well. > > Could you give some hints: > > On the backup site: > mmbackup takes ages for: > a) scan (try to scan 500M files even in parallel) > b) backup - what if 10 % of files get changed - backup process can be > blocked > several days as mmbackup cannot run in several instances on the same file > system, so you have to wait until one run of mmbackup finishes. How long > could > it take at petascale? > > On the restore site: > how can I restore e.g. 40 millions of file efficiently? dsmc restore > '/path/*' > runs into serious troubles after say 20M files (maybe wrong internal > structures used), however, scanning 1000 more files takes several minutes > resulting the dsmc restore never reaches that 40M files. > > using filelists the situation is even worse. I run dsmc restore -filelist > with a filelist consisting of 2.4M files. Running for *two* days without > restoring even a single file. dsmc is consuming 100 % CPU. > > So any hints addressing these issues with really large number of files > would > be even more appreciated. > > -- > Luk?? Hejtm?nek > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20160830/d9b3fb68/attachment.html > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 55, Issue 55 ********************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From daniel.kidger at uk.ibm.com Wed Aug 31 15:32:11 2016 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Wed, 31 Aug 2016 14:32:11 +0000 Subject: [gpfsug-discuss] Data Replication In-Reply-To: Message-ID: The other 'Exception' is when a rule is used to convert a 1 way replicated file to 2 way, or when only one failure group is up due to HW problems. It that case the (re-replication) is done by whatever nodes are used for the rule or command-line, which may include an NSD server. Daniel IBM Spectrum Storage Software +44 (0)7818 522266 Sent from my iPad using IBM Verse On 30 Aug 2016, 19:53:31, mimarsh2 at vt.edu wrote: From: mimarsh2 at vt.edu To: gpfsug-discuss at spectrumscale.org Cc: Date: 30 Aug 2016 19:53:31 Subject: Re: [gpfsug-discuss] Data Replication Thanks. This confirms the numbers that I am seeing. Brian On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow wrote: Its the client that does all the synchronous replication, this way the cluster is able to scale as the clients do the leg work (so to speak). The somewhat "exception" is if a GPFS NSD server (or client with direct NSD) access uses a server bases protocol such as SMB, in this case the SMB server will do the replication as the SMB client doesn't know about GPFS or its replication; essentially the SMB server is the GPFS client. -- Lauz On 30 August 2016 17:03:38 CEST, Bryan Banister wrote: The NSD Client handles the replication and will, as you stated, write one copy to one NSD (using the primary server for this NSD) and one to a different NSD in a different GPFS failure group (using quite likely, but not necessarily, a different NSD server that is the primary server for this alternate NSD). Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Brian Marshall Sent: Tuesday, August 30, 2016 9:59 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Data Replication All, If I setup a filesystem to have data replication of 2 (2 copies of data), does the data get replicated at the NSD Server or at the client? i.e. Does the client send 2 copies over the network or does the NSD Server get a single copy and then replicate on storage NSDs? I couldn't find a place in the docs that talked about this specific point. Thank you, Brian Marshall Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Sent from my Android device with K-9 Mail. Please excuse my brevity. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discussUnless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Wed Aug 31 19:01:45 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Wed, 31 Aug 2016 14:01:45 -0400 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: Message-ID: Daniel, So here's my use case: I have a Sandisk IF150 (branded as DeepFlash recently) with 128TB of flash acting as a "fast tier" storage pool in our HPC scratch file system. Can I set the filesystem replication level to 1 then write a policy engine rule to send small and/or recent files to the IF150 with a replication of 2? Any other comments on the proposed usage strategy are helpful. Thank you, Brian Marshall On Wed, Aug 31, 2016 at 10:32 AM, Daniel Kidger wrote: > The other 'Exception' is when a rule is used to convert a 1 way replicated > file to 2 way, or when only one failure group is up due to HW problems. It > that case the (re-replication) is done by whatever nodes are used for the > rule or command-line, which may include an NSD server. > > Daniel > > IBM Spectrum Storage Software > +44 (0)7818 522266 <+44%207818%20522266> > Sent from my iPad using IBM Verse > > > ------------------------------ > On 30 Aug 2016, 19:53:31, mimarsh2 at vt.edu wrote: > > From: mimarsh2 at vt.edu > To: gpfsug-discuss at spectrumscale.org > Cc: > Date: 30 Aug 2016 19:53:31 > Subject: Re: [gpfsug-discuss] Data Replication > > > Thanks. This confirms the numbers that I am seeing. > > Brian > > On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow < > laurence at qsplace.co.uk> wrote: > >> Its the client that does all the synchronous replication, this way the >> cluster is able to scale as the clients do the leg work (so to speak). >> >> The somewhat "exception" is if a GPFS NSD server (or client with direct >> NSD) access uses a server bases protocol such as SMB, in this case the SMB >> server will do the replication as the SMB client doesn't know about GPFS or >> its replication; essentially the SMB server is the GPFS client. >> >> -- Lauz >> >> On 30 August 2016 17:03:38 CEST, Bryan Banister < >> bbanister at jumptrading.com> wrote: >> >>> The NSD Client handles the replication and will, as you stated, write >>> one copy to one NSD (using the primary server for this NSD) and one to a >>> different NSD in a different GPFS failure group (using quite likely, but >>> not necessarily, a different NSD server that is the primary server for this >>> alternate NSD). 
>>> >>> Cheers, >>> >>> -Bryan >>> >>> >>> >>> *From:* gpfsug-discuss-bounces at spectrumscale.org [mailto: >>> gpfsug-discuss-bounces at spectrumscale.org] *On Behalf Of *Brian Marshall >>> *Sent:* Tuesday, August 30, 2016 9:59 AM >>> *To:* gpfsug main discussion list >>> *Subject:* [gpfsug-discuss] Data Replication >>> >>> >>> >>> All, >>> >>> >>> >>> If I setup a filesystem to have data replication of 2 (2 copies of >>> data), does the data get replicated at the NSD Server or at the client? >>> i.e. Does the client send 2 copies over the network or does the NSD Server >>> get a single copy and then replicate on storage NSDs? >>> >>> >>> >>> I couldn't find a place in the docs that talked about this specific >>> point. >>> >>> >>> >>> Thank you, >>> >>> Brian Marshall >>> >>> >>> ------------------------------ >>> >>> Note: This email is for the confidential use of the named addressee(s) >>> only and may contain proprietary, confidential or privileged information. >>> If you are not the intended recipient, you are hereby notified that any >>> review, dissemination or copying of this email is strictly prohibited, and >>> to please notify the sender immediately and destroy this email and any >>> attachments. Email transmission cannot be guaranteed to be secure or >>> error-free. The Company, therefore, does not make any guarantees as to the >>> completeness or accuracy of this email or any attachments. This email is >>> for informational purposes only and does not constitute a recommendation, >>> offer, request or solicitation of any kind to buy, sell, subscribe, redeem >>> or perform any type of transaction of a financial product. >>> >>> ------------------------------ >>> >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >> -- >> Sent from my Android device with K-9 Mail. Please excuse my brevity. >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Aug 31 19:10:07 2016 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 31 Aug 2016 14:10:07 -0400 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" - how about a Billion files in 140 seconds? In-Reply-To: References: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> Message-ID: When you write something like "mmbackup takes ages" - that let's us know how you feel, kinda. But we need some facts and data to make a determination if there is a real problem and whether and how it might be improved. Just to do a "back of the envelope" estimate of how long backup operations "ought to" take - we'd need to know how many disks and/or SSDs with what performance characteristics, how many nodes withf what performance characteristics, network "fabric(s)", Number of files to be scanned, Average number of files per directory, GPFS blocksize(s) configured, Backup devices available with speeds and feeds, etc, etc. 
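(Purely to illustrate the arithmetic, with assumed numbers rather than measurements: if a parallel policy scan sustains on the order of 2 million inodes per second across the helper nodes, scanning 500M inodes takes roughly 500M / 2M per second = 250 seconds; if 10% of those files changed and the backup clients together move about 5,000 files per second, the transfer phase alone is 50M / 5,000 per second = 10,000 seconds, close to three hours, before any per-file overhead is added. With reasonably sized hardware the scan itself need not be the long pole.)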
But anyway just to throw ballpark numbers "out there" to give you an idea of what is possible. I can tell you that a 20 months ago Sven and I benchmarked mmapplypolicy scanning 983 Million files in 136 seconds! The command looked like this: mmapplypolicy /ibm/fs2-1m-p01/shared/Btt -g /ibm/fs2-1m-p01/tmp -d 7 -A 256 -a 32 -n 8 -P /ghome/makaplan/sventests/milli.policy -I test -L 1 -N fastclients fastclients was 10 X86_64 commodity nodes The fs2-1m-p01 file system was hosted on just two IBM GSS nodes and everything was on an Infiniband switch. We packed about 7000 files into each directory.... (This admittedly may not be typical...) This is NOT to say you could back up that many files that fast, but Spectrum Scale metadata scanning can be fast, even with relatively modest hardware resources. YMMV ;-) Marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From xhejtman at ics.muni.cz Wed Aug 31 19:39:26 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Wed, 31 Aug 2016 20:39:26 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" In-Reply-To: References: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> Message-ID: <20160831183926.k4mbwbbrmxybd7a3@ics.muni.cz> On Tue, Aug 30, 2016 at 04:24:59PM -0700, Sven Oehme wrote: > so lets start with some simple questions. > > when you say mmbackup takes ages, what version of gpfs code are you running > ? that was GPFS 3.5.0-8. The mmapplypolicy took over 2 hours but that was the least problem. We developed our own set of backups scripts around mmbackup to address these issues: 1) while mmbackup is running, you cannot run another instance on the same file system. 2) mmbackup can be very slow, but not mmbackup itself but consecutive dsmc selective, sorry for being misleading, but mainly due to the large number of files to be backed up 3) related to the previous, mmbackup scripts seem to be executing a 'grep' cmd for every input file to check whether it has entry in dmsc output log. well guess what happens if you have millions of files at the input and several gigabytes in dsmc outpu log... In our case, the grep storm took several *weeks*. 4) very surprisingly, some of the files were not backed up at all. We cannot find why but dsmc incremental found some old files that were not covered by mmbackup backups. Maybe because the mmbackup process was not gracefully terminated in some cases (node crash) and so on. > how do you execute the mmbackup command ? exact parameters would be useful > . /usr/lpp/mmfs/bin/mmbackup tape_tape -t incremental -v -N fe1 -P ${POLICY_FILE} --tsm-servers SERVER1 -g /gpfs/clusterbase/tmp/ -s /tmp -m 4 -B 9999999999999 -L 0 we had external exec script that split files from policy into chunks that were run in parallel. > what HW are you using for the metadata disks ? 4x SSD > how much capacity (df -h) and how many inodes (df -i) do you have in the > filesystem you try to backup ? 
df -h /dev/tape_tape 1.5P 745T 711T 52% /exports/tape_tape df -hi /dev/tape_tape 1.3G 98M 1.2G 8% /exports/tape_tape (98M inodes used) mmdf tape_tape disk disk size failure holds holds free KB free KB name in KB group metadata data in full blocks in fragments --------------- ------------- -------- -------- ----- -------------------- ------------------- Disks in storage pool: system (Maximum disk size allowed is 175 TB) nsd_t1_5 23437934592 1 No Yes 7342735360 ( 31%) 133872128 ( 1%) nsd_t1_6 23437934592 1 No Yes 7341166592 ( 31%) 133918784 ( 1%) nsd_t1b_2 23437934592 1 No Yes 7343919104 ( 31%) 134165056 ( 1%) nsd_t1b_3 23437934592 1 No Yes 7341283328 ( 31%) 133986560 ( 1%) nsd_ssd_4 770703360 2 Yes No 692172800 ( 90%) 15981952 ( 2%) nsd_ssd_3 770703360 2 Yes No 692252672 ( 90%) 15921856 ( 2%) nsd_ssd_2 770703360 2 Yes No 692189184 ( 90%) 15928832 ( 2%) nsd_ssd_1 770703360 2 Yes No 692197376 ( 90%) 16013248 ( 2%) ------------- -------------------- ------------------- (pool total) 96834551808 32137916416 ( 33%) 599788416 ( 1%) Disks in storage pool: maid (Maximum disk size allowed is 466 TB) nsd8_t2_12 31249989632 1 No Yes 13167828992 ( 42%) 36282048 ( 0%) nsd8_t2_13 31249989632 1 No Yes 13166729216 ( 42%) 36131072 ( 0%) nsd8_t2_14 31249989632 1 No Yes 13166886912 ( 42%) 36371072 ( 0%) nsd8_t2_15 31249989632 1 No Yes 13168209920 ( 42%) 36681728 ( 0%) nsd8_t2_16 31249989632 1 No Yes 13165176832 ( 42%) 36279488 ( 0%) nsd8_t2_17 31249989632 1 No Yes 13159870464 ( 42%) 36002560 ( 0%) nsd8_t2_46 31249989632 1 No Yes 29624694784 ( 95%) 81600 ( 0%) nsd8_t2_45 31249989632 1 No Yes 29623111680 ( 95%) 77184 ( 0%) nsd8_t2_44 31249989632 1 No Yes 29621467136 ( 95%) 61440 ( 0%) nsd8_t2_43 31249989632 1 No Yes 29622964224 ( 95%) 64640 ( 0%) nsd8_t2_18 31249989632 1 No Yes 13166675968 ( 42%) 36147648 ( 0%) nsd8_t2_19 31249989632 1 No Yes 13164529664 ( 42%) 36225216 ( 0%) nsd8_t2_20 31249989632 1 No Yes 13165223936 ( 42%) 36242368 ( 0%) nsd8_t2_21 31249989632 1 No Yes 13167353856 ( 42%) 36007744 ( 0%) nsd8_t2_31 31249989632 1 No Yes 13116979200 ( 42%) 14155200 ( 0%) nsd8_t2_32 31249989632 1 No Yes 13115633664 ( 42%) 14243840 ( 0%) nsd8_t2_33 31249989632 1 No Yes 13115830272 ( 42%) 14235392 ( 0%) nsd8_t2_34 31249989632 1 No Yes 13119727616 ( 42%) 14500608 ( 0%) nsd8_t2_35 31249989632 1 No Yes 13116925952 ( 42%) 14304192 ( 0%) nsd8_t2_0 31249989632 1 No Yes 13145503744 ( 42%) 99222016 ( 0%) nsd8_t2_36 31249989632 1 No Yes 13119858688 ( 42%) 14054784 ( 0%) nsd8_t2_37 31249989632 1 No Yes 13114101760 ( 42%) 14200704 ( 0%) nsd8_t2_38 31249989632 1 No Yes 13116483584 ( 42%) 14174720 ( 0%) nsd8_t2_39 31249989632 1 No Yes 13121257472 ( 42%) 14094720 ( 0%) nsd8_t2_40 31249989632 1 No Yes 29622908928 ( 95%) 84352 ( 0%) nsd8_t2_1 31249989632 1 No Yes 13146089472 ( 42%) 99566784 ( 0%) nsd8_t2_2 31249989632 1 No Yes 13146208256 ( 42%) 99128960 ( 0%) nsd8_t2_3 31249989632 1 No Yes 13146890240 ( 42%) 99766720 ( 0%) nsd8_t2_4 31249989632 1 No Yes 13145143296 ( 42%) 98992576 ( 0%) nsd8_t2_5 31249989632 1 No Yes 13135876096 ( 42%) 99555008 ( 0%) nsd8_t2_6 31249989632 1 No Yes 13142831104 ( 42%) 99728064 ( 0%) nsd8_t2_7 31249989632 1 No Yes 13140283392 ( 42%) 99412480 ( 0%) nsd8_t2_8 31249989632 1 No Yes 13143470080 ( 42%) 99653696 ( 0%) nsd8_t2_9 31249989632 1 No Yes 13143650304 ( 42%) 99224704 ( 0%) nsd8_t2_10 31249989632 1 No Yes 13145440256 ( 42%) 99238528 ( 0%) nsd8_t2_11 31249989632 1 No Yes 13143201792 ( 42%) 99283008 ( 0%) nsd8_t2_22 31249989632 1 No Yes 13171724288 ( 42%) 36040704 ( 0%) nsd8_t2_23 31249989632 1 No Yes 
13166782464 ( 42%) 36212416 ( 0%) nsd8_t2_24 31249989632 1 No Yes 13167990784 ( 42%) 35842368 ( 0%) nsd8_t2_25 31249989632 1 No Yes 13166972928 ( 42%) 36086848 ( 0%) nsd8_t2_26 31249989632 1 No Yes 13167495168 ( 42%) 36114496 ( 0%) nsd8_t2_27 31249989632 1 No Yes 13164419072 ( 42%) 36119680 ( 0%) nsd8_t2_28 31249989632 1 No Yes 13167804416 ( 42%) 36088832 ( 0%) nsd8_t2_29 31249989632 1 No Yes 13166057472 ( 42%) 36107072 ( 0%) nsd8_t2_30 31249989632 1 No Yes 13163673600 ( 42%) 36102528 ( 0%) nsd8_t2_41 31249989632 1 No Yes 29620840448 ( 95%) 70208 ( 0%) nsd8_t2_42 31249989632 1 No Yes 29621110784 ( 95%) 69568 ( 0%) ------------- -------------------- ------------------- (pool total) 1468749512704 733299890176 ( 50%) 2008331584 ( 0%) ============= ==================== =================== (data) 1562501251072 762668994560 ( 49%) 2544274112 ( 0%) (metadata) 3082813440 2768812032 ( 90%) 63845888 ( 2%) ============= ==================== =================== (total) 1565584064512 765437806592 ( 49%) 2608120000 ( 0%) Inode Information ----------------- Number of used inodes: 102026081 Number of free inodes: 72791199 Number of allocated inodes: 174817280 Maximum number of inodes: 1342177280 -- Luk?? Hejtm?nek From xhejtman at ics.muni.cz Wed Aug 31 20:26:26 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Wed, 31 Aug 2016 21:26:26 +0200 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: <24437-1472591241.445832@bR6O.TofS.917u> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> <24437-1472591241.445832@bR6O.TofS.917u> Message-ID: <20160831192626.k4em4iz7ne2e2cmg@ics.muni.cz> Hello, thank you for explanation. I confirm that things are working with 573 kernel. On Tue, Aug 30, 2016 at 05:07:21PM -0400, mark.bergman at uphs.upenn.edu wrote: > In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, > The pithy ruminations from Lukas Hejtmanek on > <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: > => Hello, > > GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, > but at kernel 2.6.32-573 and lower. > > I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel > revs that caused multipath errors, resulting in GPFS being unable to > find all NSDs and mount the filesystem. > > I am not updating to a newer kernel until I'm certain this is resolved. > > I opened a bug with CentOS: > > https://bugs.centos.org/view.php?id=10997 > > and began an extended discussion with the (RH & SUSE) developers of that > chunk of kernel code. I don't know if an upstream bug has been opened > by RH, but see: > > https://patchwork.kernel.org/patch/9140337/ > => > => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the > => latest patch 32) does start but does not mount and file system. The internal > => mount cmd gets stucked. > => > => -- > => Luk?? Hejtm?nek > > > -- > Mark Bergman voice: 215-746-4061 > mark.bergman at uphs.upenn.edu fax: 215-614-0266 > http://www.cbica.upenn.edu/ > IT Technical Director, Center for Biomedical Image Computing and Analytics > Department of Radiology University of Pennsylvania > PGP Key: http://www.cbica.upenn.edu/sbia/bergman > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Luk?? 
Hejtm?nek From wilshire at mcs.anl.gov Wed Aug 31 20:39:17 2016 From: wilshire at mcs.anl.gov (John Blaas) Date: Wed, 31 Aug 2016 14:39:17 -0500 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: <4e7507130c674e35a7ac2c3fa16359e1@GEORGE.anl.gov> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> <24437-1472591241.445832@bR6O.TofS.917u> <4e7507130c674e35a7ac2c3fa16359e1@GEORGE.anl.gov> Message-ID: We are running 3.5 w/ patch 32 on nodes with the storage cluster running on Centos 6.8 with kernel at 2.6.32-642.1.1 and the remote compute cluster running 2.6.32-642.3.1 without any issues. That being said we are looking to upgrade as soon as possible to 4.1, but thought I would add that it is possible even if not supported. --- John Blaas On Wed, Aug 31, 2016 at 2:26 PM, Lukas Hejtmanek wrote: > Hello, > > thank you for explanation. I confirm that things are working with 573 kernel. > > On Tue, Aug 30, 2016 at 05:07:21PM -0400, mark.bergman at uphs.upenn.edu wrote: >> In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, >> The pithy ruminations from Lukas Hejtmanek on >> <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: >> => Hello, >> >> GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, >> but at kernel 2.6.32-573 and lower. >> >> I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel >> revs that caused multipath errors, resulting in GPFS being unable to >> find all NSDs and mount the filesystem. >> >> I am not updating to a newer kernel until I'm certain this is resolved. >> >> I opened a bug with CentOS: >> >> https://bugs.centos.org/view.php?id=10997 >> >> and began an extended discussion with the (RH & SUSE) developers of that >> chunk of kernel code. I don't know if an upstream bug has been opened >> by RH, but see: >> >> https://patchwork.kernel.org/patch/9140337/ >> => >> => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the >> => latest patch 32) does start but does not mount and file system. The internal >> => mount cmd gets stucked. >> => >> => -- >> => Luk?? Hejtm?nek >> >> >> -- >> Mark Bergman voice: 215-746-4061 >> mark.bergman at uphs.upenn.edu fax: 215-614-0266 >> http://www.cbica.upenn.edu/ >> IT Technical Director, Center for Biomedical Image Computing and Analytics >> Department of Radiology University of Pennsylvania >> PGP Key: http://www.cbica.upenn.edu/sbia/bergman >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > Luk?? Hejtm?nek > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From janfrode at tanso.net Wed Aug 31 21:44:04 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Wed, 31 Aug 2016 22:44:04 +0200 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: Message-ID: Assuming your DeepFlash pool is named "deep", something like the following should work: RULE 'deepreplicate' migrate from pool 'deep' to pool 'deep' replicate(2) where MISC_ATTRIBUTES NOT LIKE '%2%' and POOL_NAME LIKE 'deep' "mmapplypolicy gpfs0 -P replicate-policy.pol -I yes" and possibly "mmrestripefs gpfs0 -r" afterwards. 
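For the placement side of the question (getting small and/or recently written files into the flash pool with two copies in the first place), a rough, untested sketch of the sort of rules a periodic mmapplypolicy run could use -- the pool names 'data' and 'deep', the 4 MB size cut-off and the age windows are assumptions to adapt, and replicate(2) is only accepted if the file system's maximum data replicas is at least 2:

  /* hypothetical tiering rules, run e.g. via "mmapplypolicy gpfs0 -P tier.pol -I yes" */
  RULE 'small_or_recent_to_deep'
    MIGRATE FROM POOL 'data' TO POOL 'deep' REPLICATE(2)
    WHERE KB_ALLOCATED <= 4096
       OR (DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)) <= 7

  RULE 'cold_back_to_data'
    MIGRATE FROM POOL 'deep' THRESHOLD(90,70) TO POOL 'data' REPLICATE(1)
    WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)) > 30

(A create-time SET POOL rule can't key off file size, since the size isn't known yet, so size-based tiering has to happen in a scheduled policy run like the one above.)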
-jf On Wed, Aug 31, 2016 at 8:01 PM, Brian Marshall wrote: > Daniel, > > So here's my use case: I have a Sandisk IF150 (branded as DeepFlash > recently) with 128TB of flash acting as a "fast tier" storage pool in our > HPC scratch file system. Can I set the filesystem replication level to 1 > then write a policy engine rule to send small and/or recent files to the > IF150 with a replication of 2? > > Any other comments on the proposed usage strategy are helpful. > > Thank you, > Brian Marshall > > On Wed, Aug 31, 2016 at 10:32 AM, Daniel Kidger > wrote: > >> The other 'Exception' is when a rule is used to convert a 1 way >> replicated file to 2 way, or when only one failure group is up due to HW >> problems. It that case the (re-replication) is done by whatever nodes are >> used for the rule or command-line, which may include an NSD server. >> >> Daniel >> >> IBM Spectrum Storage Software >> +44 (0)7818 522266 <+44%207818%20522266> >> Sent from my iPad using IBM Verse >> >> >> ------------------------------ >> On 30 Aug 2016, 19:53:31, mimarsh2 at vt.edu wrote: >> >> From: mimarsh2 at vt.edu >> To: gpfsug-discuss at spectrumscale.org >> Cc: >> Date: 30 Aug 2016 19:53:31 >> Subject: Re: [gpfsug-discuss] Data Replication >> >> >> Thanks. This confirms the numbers that I am seeing. >> >> Brian >> >> On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow < >> laurence at qsplace.co.uk> wrote: >> >>> Its the client that does all the synchronous replication, this way the >>> cluster is able to scale as the clients do the leg work (so to speak). >>> >>> The somewhat "exception" is if a GPFS NSD server (or client with direct >>> NSD) access uses a server bases protocol such as SMB, in this case the SMB >>> server will do the replication as the SMB client doesn't know about GPFS or >>> its replication; essentially the SMB server is the GPFS client. >>> >>> -- Lauz >>> >>> On 30 August 2016 17:03:38 CEST, Bryan Banister < >>> bbanister at jumptrading.com> wrote: >>> >>>> The NSD Client handles the replication and will, as you stated, write >>>> one copy to one NSD (using the primary server for this NSD) and one to a >>>> different NSD in a different GPFS failure group (using quite likely, but >>>> not necessarily, a different NSD server that is the primary server for this >>>> alternate NSD). >>>> >>>> Cheers, >>>> >>>> -Bryan >>>> >>>> >>>> >>>> *From:* gpfsug-discuss-bounces at spectrumscale.org [mailto: >>>> gpfsug-discuss-bounces at spectrumscale.org] *On Behalf Of *Brian Marshall >>>> *Sent:* Tuesday, August 30, 2016 9:59 AM >>>> *To:* gpfsug main discussion list >>>> *Subject:* [gpfsug-discuss] Data Replication >>>> >>>> >>>> >>>> All, >>>> >>>> >>>> >>>> If I setup a filesystem to have data replication of 2 (2 copies of >>>> data), does the data get replicated at the NSD Server or at the client? >>>> i.e. Does the client send 2 copies over the network or does the NSD Server >>>> get a single copy and then replicate on storage NSDs? >>>> >>>> >>>> >>>> I couldn't find a place in the docs that talked about this specific >>>> point. >>>> >>>> >>>> >>>> Thank you, >>>> >>>> Brian Marshall >>>> >>>> >>>> ------------------------------ >>>> >>>> Note: This email is for the confidential use of the named addressee(s) >>>> only and may contain proprietary, confidential or privileged information. 
>>>> If you are not the intended recipient, you are hereby notified that any >>>> review, dissemination or copying of this email is strictly prohibited, and >>>> to please notify the sender immediately and destroy this email and any >>>> attachments. Email transmission cannot be guaranteed to be secure or >>>> error-free. The Company, therefore, does not make any guarantees as to the >>>> completeness or accuracy of this email or any attachments. This email is >>>> for informational purposes only and does not constitute a recommendation, >>>> offer, request or solicitation of any kind to buy, sell, subscribe, redeem >>>> or perform any type of transaction of a financial product. >>>> >>>> ------------------------------ >>>> >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>>> >>> -- >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >> Unless stated otherwise above: >> IBM United Kingdom Limited - Registered in England and Wales with number >> 741598. >> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From taylorm at us.ibm.com Mon Aug 1 17:42:19 2016 From: taylorm at us.ibm.com (Michael L Taylor) Date: Mon, 1 Aug 2016 09:42:19 -0700 Subject: [gpfsug-discuss] Spectrum Scale 4.2.1 Released In-Reply-To: References: Message-ID: Thanks for sharing Bob. Since some folks asked previously, if you go to the 4.2.1 FAQ PDF version there will be change bars on the left for what changed in FAQ from previous version as well as a FAQ July updates table near the top to quickly highlight the changes from last FAQ. http://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.pdf?view=kc Also, two short blogs on the 4.2.1 release on the Storage Community might be of interest: http://storagecommunity.org/easyblog -------------- next part -------------- An HTML attachment was scrubbed... URL: From raot at bnl.gov Mon Aug 1 19:36:15 2016 From: raot at bnl.gov (Tejas Rao) Date: Mon, 1 Aug 2016 14:36:15 -0400 Subject: [gpfsug-discuss] HAWC (Highly available write cache) Message-ID: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> I have enabled write cache (HAWC) by running the below commands. The recovery logs are supposedly placed in the replicated system metadata pool (SSDs). I do not have a "system.log" pool as it is only needed if recovery logs are stored on the client nodes. mmchfs gpfs01 --write-cache-threshold 64K mmchfs gpfs01 -L 1024M mmchconfig logPingPongSector=no I have recycled the daemon on all nodes in the cluster (including the NSD nodes). I still see small synchronous writes (4K) from the clients going to the data drives (data pool). I am checking this by looking at "mmdiag --iohist" output. Should they not be going to the system pool? Do I need to do something else? How can I confirm that HAWC is working as advertised? Thanks. 
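A rough way to sanity-check a HAWC setup like the one above (a sketch only -- the attribute names in the mmlsfs output and the iohist tags can differ between releases):

  # confirm the new settings took effect (grep patterns are guesses at the attribute names)
  mmlsfs gpfs01 | grep -Ei 'write-cache|log'
  # while a small synchronous-write workload runs, watch the I/O history on a client;
  # recovery-log (HAWC) writes appear tagged as logData rather than as ordinary data I/O
  mmdiag --iohist | grep -i logData

If the 4K writes show up as logData against the system-pool NSDs, the log path is in use; the data-pool writes that remain are the background destaging of whatever could not be coalesced, as the follow-ups in this thread discuss.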
From oehmes at gmail.com Mon Aug 1 19:49:37 2016 From: oehmes at gmail.com (Sven Oehme) Date: Mon, 1 Aug 2016 11:49:37 -0700 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> Message-ID: when you say 'synchronous write' what do you mean by that ? if you are talking about using direct i/o (O_DIRECT flag), they don't leverage HAWC data path, its by design. sven On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao wrote: > I have enabled write cache (HAWC) by running the below commands. The > recovery logs are supposedly placed in the replicated system metadata pool > (SSDs). I do not have a "system.log" pool as it is only needed if recovery > logs are stored on the client nodes. > > mmchfs gpfs01 --write-cache-threshold 64K > mmchfs gpfs01 -L 1024M > mmchconfig logPingPongSector=no > > I have recycled the daemon on all nodes in the cluster (including the NSD > nodes). > > I still see small synchronous writes (4K) from the clients going to the > data drives (data pool). I am checking this by looking at "mmdiag --iohist" > output. Should they not be going to the system pool? > > Do I need to do something else? How can I confirm that HAWC is working as > advertised? > > Thanks. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raot at bnl.gov Mon Aug 1 20:05:52 2016 From: raot at bnl.gov (Tejas Rao) Date: Mon, 1 Aug 2016 15:05:52 -0400 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> Message-ID: <5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> In my case GPFS storage is used to store VM images (KVM) and hence the small IO. I always see lots of small 4K writes and the GPFS filesystem block size is 8MB. I thought the reason for the small writes is that the linux kernel requests GPFS to initiate a periodic sync which by default is every 5 seconds and can be controlled by "vm.dirty_writeback_centisecs". I thought HAWC would help in such cases and would harden (coalesce) the small writes in the "system" pool and would flush to the "data" pool in larger block size. Note - I am not doing direct i/o explicitly. On 8/1/2016 14:49, Sven Oehme wrote: > when you say 'synchronous write' what do you mean by that ? > if you are talking about using direct i/o (O_DIRECT flag), they don't > leverage HAWC data path, its by design. > > sven > > On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao > wrote: > > I have enabled write cache (HAWC) by running the below commands. > The recovery logs are supposedly placed in the replicated system > metadata pool (SSDs). I do not have a "system.log" pool as it is > only needed if recovery logs are stored on the client nodes. > > mmchfs gpfs01 --write-cache-threshold 64K > mmchfs gpfs01 -L 1024M > mmchconfig logPingPongSector=no > > I have recycled the daemon on all nodes in the cluster (including > the NSD nodes). > > I still see small synchronous writes (4K) from the clients going > to the data drives (data pool). I am checking this by looking at > "mmdiag --iohist" output. Should they not be going to the system pool? > > Do I need to do something else? How can I confirm that HAWC is > working as advertised? > > Thanks. 
> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From dhildeb at us.ibm.com Mon Aug 1 20:50:09 2016 From: dhildeb at us.ibm.com (Dean Hildebrand) Date: Mon, 1 Aug 2016 12:50:09 -0700 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: <5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> <5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> Message-ID: Hi Tejas, Do you know the workload in the VM? The workload which enters into HAWC may or may not be the same as the workload that eventually goes into the data pool....it all depends on whether the 4KB writes entering HAWC can be coalesced or not. For example, sequential 4KB writes can all be coalesced into a single large chunk. So 4KB writes into HAWC will convert into 8MB writes to data pool (in your system). But random 4KB writes into HAWC may end up being 4KB writes into the data pool if there are no adjoining 4KB writes (i.e., if 4KB blocks are all dispersed, they can't be coalesced). The goal of HAWC though, whether the 4KB blocks are coalesced or not, is to reduce app latency by ensuring that writing the blocks back to the data pool is done in the background. So while 4KB blocks may still be hitting the data pool, hopefully the application is seeing the latency of your presumably lower latency system pool. Dean From: Tejas Rao To: gpfsug main discussion list Date: 08/01/2016 12:06 PM Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache) Sent by: gpfsug-discuss-bounces at spectrumscale.org In my case GPFS storage is used to store VM images (KVM) and hence the small IO. I always see lots of small 4K writes and the GPFS filesystem block size is 8MB. I thought the reason for the small writes is that the linux kernel requests GPFS to initiate a periodic sync which by default is every 5 seconds and can be controlled by "vm.dirty_writeback_centisecs". I thought HAWC would help in such cases and would harden (coalesce) the small writes in the "system" pool and would flush to the "data" pool in larger block size. Note - I am not doing direct i/o explicitly. On 8/1/2016 14:49, Sven Oehme wrote: when you say 'synchronous write' what do you mean by that ?? if you are talking about using direct i/o (O_DIRECT flag), they don't leverage HAWC data path, its by design. sven On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao wrote: I have enabled write cache (HAWC) by running the below commands. The recovery logs are supposedly placed in the replicated system metadata pool (SSDs). I do not have a "system.log" pool as it is only needed if recovery logs are stored on the client nodes. mmchfs gpfs01 --write-cache-threshold 64K mmchfs gpfs01 -L 1024M mmchconfig logPingPongSector=no I have recycled the daemon on all nodes in the cluster (including the NSD nodes). I still see small synchronous writes (4K) from the clients going to the data drives (data pool). I am checking this by looking at "mmdiag --iohist" output. Should they not be going to the system pool? Do I need to do something else? How can I confirm that HAWC is working as advertised? Thanks. 
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From raot at bnl.gov Mon Aug 1 21:42:06 2016 From: raot at bnl.gov (Tejas Rao) Date: Mon, 1 Aug 2016 16:42:06 -0400 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov> <5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> Message-ID: <04707e32-83fc-f42d-10cf-99139c136371@bnl.gov> I am not 100% sure what the workload of the VMs is. We have 100's of VMs all used differently, so the workload is rather mixed. I do see 4K writes going to "system" pool, they are tagged as "logData" in 'mmdiag --iohist'. But I also see 4K writes going to the data drives, so it looks like everything is not getting coalesced and these are random writes. Could these 4k writes labelled as "logData" be the writes going to HAWC log files? On 8/1/2016 15:50, Dean Hildebrand wrote: > > Hi Tejas, > > Do you know the workload in the VM? > > The workload which enters into HAWC may or may not be the same as the > workload that eventually goes into the data pool....it all depends on > whether the 4KB writes entering HAWC can be coalesced or not. For > example, sequential 4KB writes can all be coalesced into a single > large chunk. So 4KB writes into HAWC will convert into 8MB writes to > data pool (in your system). But random 4KB writes into HAWC may end up > being 4KB writes into the data pool if there are no adjoining 4KB > writes (i.e., if 4KB blocks are all dispersed, they can't be > coalesced). The goal of HAWC though, whether the 4KB blocks are > coalesced or not, is to reduce app latency by ensuring that writing > the blocks back to the data pool is done in the background. So while > 4KB blocks may still be hitting the data pool, hopefully the > application is seeing the latency of your presumably lower latency > system pool. > > Dean > > > Inactive hide details for Tejas Rao ---08/01/2016 12:06:15 PM---In my > case GPFS storage is used to store VM images (KVM) and heTejas Rao > ---08/01/2016 12:06:15 PM---In my case GPFS storage is used to store > VM images (KVM) and hence the small IO. > > From: Tejas Rao > To: gpfsug main discussion list > Date: 08/01/2016 12:06 PM > Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache) > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > In my case GPFS storage is used to store VM images (KVM) and hence the > small IO. > > I always see lots of small 4K writes and the GPFS filesystem block > size is 8MB. I thought the reason for the small writes is that the > linux kernel requests GPFS to initiate a periodic sync which by > default is every 5 seconds and can be controlled by > "vm.dirty_writeback_centisecs". 
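(A side note on that tunable: it is exposed through sysctl and expressed in hundredths of a second, so the 5-second default appears as 500. The second line below is purely an illustrative change, not a recommendation for VM workloads.)

# inspect the current writeback interval
sysctl vm.dirty_writeback_centisecs

# example only: stretch it to 15 seconds so more dirty pages are batched per sync
sysctl -w vm.dirty_writeback_centisecs=1500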
> > I thought HAWC would help in such cases and would harden (coalesce) > the small writes in the "system" pool and would flush to the "data" > pool in larger block size. > > Note - I am not doing direct i/o explicitly. > > > > On 8/1/2016 14:49, Sven Oehme wrote: > > when you say 'synchronous write' what do you mean by that ? > if you are talking about using direct i/o (O_DIRECT flag), > they don't leverage HAWC data path, its by design. > > sven > > On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao <_raot at bnl.gov_ > > wrote: > I have enabled write cache (HAWC) by running the below > commands. The recovery logs are supposedly placed in the > replicated system metadata pool (SSDs). I do not have a > "system.log" pool as it is only needed if recovery logs > are stored on the client nodes. > > mmchfs gpfs01 --write-cache-threshold 64K > mmchfs gpfs01 -L 1024M > mmchconfig logPingPongSector=no > > I have recycled the daemon on all nodes in the cluster > (including the NSD nodes). > > I still see small synchronous writes (4K) from the clients > going to the data drives (data pool). I am checking this > by looking at "mmdiag --iohist" output. Should they not be > going to the system pool? > > Do I need to do something else? How can I confirm that > HAWC is working as advertised? > > Thanks. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at _spectrumscale.org_ > _ > __http://gpfsug.org/mailman/listinfo/gpfsug-discuss_ > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > _http://gpfsug.org/mailman/listinfo/gpfsug-discuss_ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 105 bytes Desc: not available URL: From dhildeb at us.ibm.com Mon Aug 1 21:55:28 2016 From: dhildeb at us.ibm.com (Dean Hildebrand) Date: Mon, 1 Aug 2016 13:55:28 -0700 Subject: [gpfsug-discuss] HAWC (Highly available write cache) In-Reply-To: <04707e32-83fc-f42d-10cf-99139c136371@bnl.gov> References: <7953aa8c-904a-cee5-34be-7d40e55b46db@bnl.gov><5629f550-05c9-25dd-bbe1-bdea618e8ae0@bnl.gov> <04707e32-83fc-f42d-10cf-99139c136371@bnl.gov> Message-ID: Hi Tejas, Yes, most likely those 4k writes are the HAWC writes...hopefully those 4KB writes have a lower latency than the 4k writes to your data pool so you are realizing the benefits. Dean From: Tejas Rao To: gpfsug main discussion list Date: 08/01/2016 01:42 PM Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache) Sent by: gpfsug-discuss-bounces at spectrumscale.org I am not 100% sure what the workload of the VMs is. We have 100's of VMs all used differently, so the workload is rather mixed. I do see 4K writes going to "system" pool, they are tagged as "logData" in 'mmdiag --iohist'. But I also see 4K writes going to the data drives, so it looks like everything is not getting coalesced and these are random writes. Could these 4k writes labelled as "logData" be the writes going to HAWC log files? 
On 8/1/2016 15:50, Dean Hildebrand wrote: Hi Tejas, Do you know the workload in the VM? The workload which enters into HAWC may or may not be the same as the workload that eventually goes into the data pool....it all depends on whether the 4KB writes entering HAWC can be coalesced or not. For example, sequential 4KB writes can all be coalesced into a single large chunk. So 4KB writes into HAWC will convert into 8MB writes to data pool (in your system). But random 4KB writes into HAWC may end up being 4KB writes into the data pool if there are no adjoining 4KB writes (i.e., if 4KB blocks are all dispersed, they can't be coalesced). The goal of HAWC though, whether the 4KB blocks are coalesced or not, is to reduce app latency by ensuring that writing the blocks back to the data pool is done in the background. So while 4KB blocks may still be hitting the data pool, hopefully the application is seeing the latency of your presumably lower latency system pool. Dean Inactive hide details for Tejas Rao ---08/01/2016 12:06:15 PM---In my case GPFS storage is used to store VM images (KVM) and heTejas Rao ---08/01/2016 12:06:15 PM---In my case GPFS storage is used to store VM images (KVM) and hence the small IO. From: Tejas Rao To: gpfsug main discussion list Date: 08/01/2016 12:06 PM Subject: Re: [gpfsug-discuss] HAWC (Highly available write cache) Sent by: gpfsug-discuss-bounces at spectrumscale.org In my case GPFS storage is used to store VM images (KVM) and hence the small IO. I always see lots of small 4K writes and the GPFS filesystem block size is 8MB. I thought the reason for the small writes is that the linux kernel requests GPFS to initiate a periodic sync which by default is every 5 seconds and can be controlled by "vm.dirty_writeback_centisecs". I thought HAWC would help in such cases and would harden (coalesce) the small writes in the "system" pool and would flush to the "data" pool in larger block size. Note - I am not doing direct i/o explicitly. On 8/1/2016 14:49, Sven Oehme wrote: when you say 'synchronous write' what do you mean by that ? if you are talking about using direct i/o (O_DIRECT flag), they don't leverage HAWC data path, its by design. sven On Mon, Aug 1, 2016 at 11:36 AM, Tejas Rao wrote: I have enabled write cache (HAWC) by running the below commands. The recovery logs are supposedly placed in the replicated system metadata pool (SSDs). I do not have a "system.log" pool as it is only needed if recovery logs are stored on the client nodes. mmchfs gpfs01 --write-cache-threshold 64K mmchfs gpfs01 -L 1024M mmchconfig logPingPongSector=no I have recycled the daemon on all nodes in the cluster (including the NSD nodes). I still see small synchronous writes (4K) from the clients going to the data drives (data pool). I am checking this by looking at "mmdiag --iohist" output. Should they not be going to the system pool? Do I need to do something else? How can I confirm that HAWC is working as advertised? Thanks. 
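Pulling the thread together, the practical check that emerges here is to watch the client-side I/O history while the VM workload runs and confirm that the small writes are being absorbed as recovery-log records on the system pool, with the data-pool write-back happening in the background. A rough sketch, using the names from this thread:

# on a client node while the VMs are busy
mmdiag --iohist | grep logData    # HAWC log hardening -- these I/Os should hit the system-pool (SSD) NSDs
mmdiag --iohist | grep -w data    # regular data-pool writes -- done in the background, so application
                                  # latency should track the logData entries rather than these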
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Greg.Lehmann at csiro.au Wed Aug 3 06:06:32 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 3 Aug 2016 05:06:32 +0000 Subject: [gpfsug-discuss] SS 4.2.1.0 upgrade pain Message-ID: <04fbf3c0ae40468d912293821905197d@exch1-cdc.nexus.csiro.au> On Debian I am seeing this when trying to upgrade: mmshutdown dpkg -I gpfs.base_4.2.1-0_amd64.deb gpfs.docs_4.2.1-0_all.deb gpfs.ext_4.2.1-0_amd64.deb gpfs.gpl_4.2.1-0_all.deb gpfs.gskit_8.0.50-57_amd64.deb gpfs.msg.en-us_4.2.1-0_all.deb (Reading database ... 65194 files and directories currently installed.) Preparing to replace gpfs.base 4.1.0-6 (using gpfs.base_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.base ... Preparing to replace gpfs.docs 4.1.0-6 (using gpfs.docs_4.2.1-0_all.deb) ... Unpacking replacement gpfs.docs ... Preparing to replace gpfs.ext 4.1.0-6 (using gpfs.ext_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.ext ... Etc. Unpacking replacement gpfs.gpl ... Preparing to replace gpfs.gskit 8.0.50-32 (using gpfs.gskit_8.0.50-57_amd64.deb) ... Unpacking replacement gpfs.gskit ... Preparing to replace gpfs.msg.en-us 4.1.0-6 (using gpfs.msg.en-us_4.2.1-0_all.deb) ... Unpacking replacement gpfs.msg.en-us ... Setting up gpfs.base (4.2.1-0) ... At which point it hangs. A ps shows this: ps -ef | grep mm root 21269 1 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 21276 21150 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmsysmoncontrol start root 21363 1 0 14:18 ? 
00:00:00 /usr/lpp/mmfs/bin/mmsdrserv 1191 10 10 /var/adm/ras/mmsdrserv.log 128 yes root 22485 21276 0 14:18 pts/0 00:00:00 python /usr/lpp/mmfs/bin/mmsysmon.py root 22486 22485 0 14:18 pts/0 00:00:00 /bin/sh -c /usr/lpp/mmfs/bin/mmlsmgr -c root 22488 22486 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmlsmgr -c root 24420 22488 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmcommon linkCommand hadoop1-12-cdc-ib2.it.csiro.au /var/mmfs/tmp/nodefile.mmlsmgr.22488 mmlsmgr -c root 24439 24420 0 14:18 pts/0 00:00:00 /usr/bin/perl /usr/lpp/mmfs/bin/mmdsh -svL gpfs-07-cdc-ib2.san.csiro.au /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c root 24446 24439 0 14:18 pts/0 00:00:00 /usr/bin/ssh gpfs-07-cdc-ib2.san.csiro.au -n -l root /bin/ksh -c ' LANG=en_US.UTF-8 LC_ALL= LC_COLLATE= LC_TYPE= LC_MONETARY= LC_NUMERIC= LC_TIME= LC_MESSAGES= MMMODE=lc environmentType=lc2 GPFS_rshPath=/usr/bin/ssh GPFS_rcpPath=/usr/bin/scp mmScriptTrace= GPFSCMDPORTRANGE=0 GPFS_CIM_MSG_FORMAT= /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c ' root 24546 21269 0 14:23 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 24548 24455 0 14:23 pts/1 00:00:00 grep mm It is trying to connect with ssh to one of my nsd servers, that it does not have permission to? I am guessing that is where the hang is. Anybody else seen this? I have a workaround - remove from cluster before the update, but this is a bit of extra work I can do without. I have not had to this for previous versions starting with 4.1.0.0. Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Wed Aug 3 08:32:43 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 3 Aug 2016 07:32:43 +0000 Subject: [gpfsug-discuss] SS 4.2.1.0 upgrade pain In-Reply-To: <04fbf3c0ae40468d912293821905197d@exch1-cdc.nexus.csiro.au> References: <04fbf3c0ae40468d912293821905197d@exch1-cdc.nexus.csiro.au> Message-ID: <663114b24b0b403aa076a83791f32c58@exch1-cdc.nexus.csiro.au> And I am seeing the same behaviour on a SLES 12 SP1 update from 4.2.04 to 4.2.1.0. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Greg.Lehmann at csiro.au Sent: Wednesday, 3 August 2016 3:07 PM To: gpfsug-discuss at spectrumscale.org Subject: [ExternalEmail] [gpfsug-discuss] SS 4.2.1.0 upgrade pain On Debian I am seeing this when trying to upgrade: mmshutdown dpkg -I gpfs.base_4.2.1-0_amd64.deb gpfs.docs_4.2.1-0_all.deb gpfs.ext_4.2.1-0_amd64.deb gpfs.gpl_4.2.1-0_all.deb gpfs.gskit_8.0.50-57_amd64.deb gpfs.msg.en-us_4.2.1-0_all.deb (Reading database ... 65194 files and directories currently installed.) Preparing to replace gpfs.base 4.1.0-6 (using gpfs.base_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.base ... Preparing to replace gpfs.docs 4.1.0-6 (using gpfs.docs_4.2.1-0_all.deb) ... Unpacking replacement gpfs.docs ... Preparing to replace gpfs.ext 4.1.0-6 (using gpfs.ext_4.2.1-0_amd64.deb) ... Unpacking replacement gpfs.ext ... Etc. Unpacking replacement gpfs.gpl ... Preparing to replace gpfs.gskit 8.0.50-32 (using gpfs.gskit_8.0.50-57_amd64.deb) ... Unpacking replacement gpfs.gskit ... Preparing to replace gpfs.msg.en-us 4.1.0-6 (using gpfs.msg.en-us_4.2.1-0_all.deb) ... 
Unpacking replacement gpfs.msg.en-us ... Setting up gpfs.base (4.2.1-0) ... At which point it hangs. A ps shows this: ps -ef | grep mm root 21269 1 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 21276 21150 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmsysmoncontrol start root 21363 1 0 14:18 ? 00:00:00 /usr/lpp/mmfs/bin/mmsdrserv 1191 10 10 /var/adm/ras/mmsdrserv.log 128 yes root 22485 21276 0 14:18 pts/0 00:00:00 python /usr/lpp/mmfs/bin/mmsysmon.py root 22486 22485 0 14:18 pts/0 00:00:00 /bin/sh -c /usr/lpp/mmfs/bin/mmlsmgr -c root 22488 22486 1 14:18 pts/0 00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmlsmgr -c root 24420 22488 0 14:18 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmcommon linkCommand hadoop1-12-cdc-ib2.it.csiro.au /var/mmfs/tmp/nodefile.mmlsmgr.22488 mmlsmgr -c root 24439 24420 0 14:18 pts/0 00:00:00 /usr/bin/perl /usr/lpp/mmfs/bin/mmdsh -svL gpfs-07-cdc-ib2.san.csiro.au /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c root 24446 24439 0 14:18 pts/0 00:00:00 /usr/bin/ssh gpfs-07-cdc-ib2.san.csiro.au -n -l root /bin/ksh -c ' LANG=en_US.UTF-8 LC_ALL= LC_COLLATE= LC_TYPE= LC_MONETARY= LC_NUMERIC= LC_TIME= LC_MESSAGES= MMMODE=lc environmentType=lc2 GPFS_rshPath=/usr/bin/ssh GPFS_rcpPath=/usr/bin/scp mmScriptTrace= GPFSCMDPORTRANGE=0 GPFS_CIM_MSG_FORMAT= /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c ' root 24546 21269 0 14:23 pts/0 00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15 root 24548 24455 0 14:23 pts/1 00:00:00 grep mm It is trying to connect with ssh to one of my nsd servers, that it does not have permission to? I am guessing that is where the hang is. Anybody else seen this? I have a workaround - remove from cluster before the update, but this is a bit of extra work I can do without. I have not had to this for previous versions starting with 4.1.0.0. Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenneth.waegeman at ugent.be Wed Aug 3 09:54:30 2016 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Wed, 3 Aug 2016 10:54:30 +0200 Subject: [gpfsug-discuss] Upgrade from 4.1.1 to 4.2.1 Message-ID: <57A1B146.9070505@ugent.be> Hi, In the upgrade procedure (prerequisites) of 4.2.1, I read: "If you are coming from 4.1.1-X, you must first upgrade to 4.2.0-0. You may use this 4.2.1-0 package to perform a First Time Install or to upgrade from an existing 4.2.0-X level." What does this mean exactly. Should we just install the 4.2.0 rpms first, and then the 4.2.1 rpms, or should we install the 4.2.0 rpms, start up gpfs, bring gpfs down again and then do the 4.2.1 rpms? But if we re-install a 4.1.1 node, we can immediately install 4.2.1 ? Thanks! Kenneth From bbanister at jumptrading.com Wed Aug 3 15:53:52 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Wed, 3 Aug 2016 14:53:52 +0000 Subject: [gpfsug-discuss] Upgrade from 4.1.1 to 4.2.1 In-Reply-To: <57A1B146.9070505@ugent.be> References: <57A1B146.9070505@ugent.be> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB062B3718@CHI-EXCHANGEW1.w2k.jumptrading.com> Your first process is correct. Install the 4.2.0-0 rpms first, then install the 4.2.1 rpms after. 
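In command form, per node, the sequence is roughly the following (a sketch only: the paths are placeholders, and the portability-layer rebuild assumes mmbuildgpl is present at your level):

mmshutdown
rpm -Uvh /path/to/4.2.0-0/gpfs.*.rpm    # first hop: 4.1.1-x -> 4.2.0-0, no need to start GPFS in between
rpm -Uvh /path/to/4.2.1-0/gpfs.*.rpm    # second hop: 4.2.0-0 -> 4.2.1-0
mmbuildgpl                              # rebuild the portability layer for the running kernel
mmstartup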
-Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kenneth Waegeman Sent: Wednesday, August 03, 2016 3:55 AM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Upgrade from 4.1.1 to 4.2.1 Hi, In the upgrade procedure (prerequisites) of 4.2.1, I read: "If you are coming from 4.1.1-X, you must first upgrade to 4.2.0-0. You may use this 4.2.1-0 package to perform a First Time Install or to upgrade from an existing 4.2.0-X level." What does this mean exactly. Should we just install the 4.2.0 rpms first, and then the 4.2.1 rpms, or should we install the 4.2.0 rpms, start up gpfs, bring gpfs down again and then do the 4.2.1 rpms? But if we re-install a 4.1.1 node, we can immediately install 4.2.1 ? Thanks! Kenneth _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From pinto at scinet.utoronto.ca Wed Aug 3 17:22:27 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 12:22:27 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? Message-ID: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From oehmes at gmail.com Wed Aug 3 17:35:39 2016 From: oehmes at gmail.com (Sven Oehme) Date: Wed, 3 Aug 2016 09:35:39 -0700 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto wrote: > Suppose I want to set both USR and GRP quotas for a user, however GRP is > not the primary group. Will gpfs enforce the secondary group quota for that > user? 
> > What I mean is, if the user keeps writing files with secondary group as > the attribute, and that overall group quota is reached, will that user be > stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of > Toronto. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Aug 3 17:41:24 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 12:41:24 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: <20160803124124.21815zz1w4exmuus@support.scinet.utoronto.ca> Quoting "Sven Oehme" : > Hi, > > quotas are only counted against primary group > > sven Thanks Sven I kind of suspected, but needed an independent confirmation. Jaime > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: > >> Suppose I want to set both USR and GRP quotas for a user, however GRP is >> not the primary group. Will gpfs enforce the secondary group quota for that >> user? >> >> What I mean is, if the user keeps writing files with secondary group as >> the attribute, and that overall group quota is reached, will that user be >> stopped by gpfs? >> >> Thanks >> Jaime >> >> >> >> >> ************************************ >> TELL US ABOUT YOUR SUCCESS STORIES >> http://www.scinethpc.ca/testimonials >> ************************************ >> --- >> Jaime Pinto >> SciNet HPC Consortium - Compute/Calcul Canada >> www.scinet.utoronto.ca - www.computecanada.org >> University of Toronto >> 256 McCaul Street, Room 235 >> Toronto, ON, M5T1W5 >> P: 416-978-2755 >> C: 416-505-1477 >> >> ---------------------------------------------------------------- >> This message was sent using IMP at SciNet Consortium, University of >> Toronto. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From jonathan at buzzard.me.uk Wed Aug 3 17:44:01 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Wed, 3 Aug 2016 17:44:01 +0100 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> On 03/08/16 17:22, Jaime Pinto wrote: > Suppose I want to set both USR and GRP quotas for a user, however GRP is > not the primary group. Will gpfs enforce the secondary group quota for > that user? Nope that's not how POSIX schematics work for group quotas. 
As far as I can tell only your primary group is used for group quotas. It basically makes group quotas in Unix a waste of time in my opinion. At least I have never come across a real world scenario where they work in a useful manner. > What I mean is, if the user keeps writing files with secondary group as > the attribute, and that overall group quota is reached, will that user > be stopped by gpfs? > File sets are the answer to your problems, but retrospectively applying them to a file system is a pain. You create a file set for a directory and can then apply a quota to the file set. Even better you can apply per file set user and group quotas. So if file set A has a 1TB quota you could limit user X to 100GB in the file set, but outside the file set they could have a different quota or even no quota. Only issue is a limit of ~10,000 file sets per file system JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From pinto at scinet.utoronto.ca Wed Aug 3 17:55:43 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 12:55:43 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: <20160803125543.11831ypcdi8i189b@support.scinet.utoronto.ca> I guess I have a bit of a puzzle to solve, combining quotas on filesets, paths and USR/GRP attributes So much for the "standard" built-in linux account creation script, in which by default every new user is created with primary GID=UID, doesn't really help any of us. Jaime Quoting "Jonathan Buzzard" : > On 03/08/16 17:22, Jaime Pinto wrote: >> Suppose I want to set both USR and GRP quotas for a user, however GRP is >> not the primary group. Will gpfs enforce the secondary group quota for >> that user? > > Nope that's not how POSIX schematics work for group quotas. As far as I > can tell only your primary group is used for group quotas. It basically > makes group quotas in Unix a waste of time in my opinion. At least I > have never come across a real world scenario where they work in a > useful manner. > >> What I mean is, if the user keeps writing files with secondary group as >> the attribute, and that overall group quota is reached, will that user >> be stopped by gpfs? >> > > File sets are the answer to your problems, but retrospectively applying > them to a file system is a pain. You create a file set for a directory > and can then apply a quota to the file set. Even better you can apply > per file set user and group quotas. So if file set A has a 1TB quota > you could limit user X to 100GB in the file set, but outside the file > set they could have a different quota or even no quota. > > Only issue is a limit of ~10,000 file sets per file system > > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 19:06:34 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 18:06:34 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? 
In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: Hi Sven, Wait - am I misunderstanding something here? Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Thanks for clarifying? Kevin On Aug 3, 2016, at 11:35 AM, Sven Oehme > wrote: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Wed Aug 3 19:30:08 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Wed, 03 Aug 2016 14:30:08 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> Quoting "Buterbaugh, Kevin L" : > Hi Sven, > > Wait - am I misunderstanding something here? Let?s say that I have > ?user1? who has primary group ?group1? and secondary group ?group2?. > And let?s say that they write to a directory where the bit on the > directory forces all files created in that directory to have group2 > associated with them. Are you saying that those files still count > against group1?s group quota??? > > Thanks for clarifying? > > Kevin Not really, My interpretation is that all files written with group2 will count towards the quota on that group. However any users with group2 as the primary group will be prevented from writing any further when the group2 quota is reached. However the culprit user1 with primary group as group1 won't be detected by gpfs, and can just keep going on writing group2 files. As far as the individual user quota, it doesn't matter: group1 or group2 it will be counted towards the usage of that user. It would be interesting if the behavior was more as expected. 
I just checked with my Lustre counter-parts and they tell me whichever secondary group is hit first, however many there may be, the user will be stopped. The problem then becomes identifying which of the secondary groups hit the limit for that user. Jaime > > On Aug 3, 2016, at 11:35 AM, Sven Oehme > > wrote: > > Hi, > > quotas are only counted against primary group > > sven > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > > wrote: > Suppose I want to set both USR and GRP quotas for a user, however > GRP is not the primary group. Will gpfs enforce the secondary group > quota for that user? > > What I mean is, if the user keeps writing files with secondary group > as the attribute, and that overall group quota is reached, will > that user be stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 19:34:21 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 18:34:21 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> Message-ID: <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> Hi Jaime / Sven, If Jaime?s interpretation is correct about user1 continuing to be able to write to ?group2? files even though that group is at their hard limit, then that?s a bug that needs fixing. I haven?t tested that myself, and we?re in a downtime right now so I?m a tad bit busy, but if I need to I?ll test it on our test cluster later this week. Kevin On Aug 3, 2016, at 1:30 PM, Jaime Pinto > wrote: Quoting "Buterbaugh, Kevin L" >: Hi Sven, Wait - am I misunderstanding something here? Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Thanks for clarifying? Kevin Not really, My interpretation is that all files written with group2 will count towards the quota on that group. However any users with group2 as the primary group will be prevented from writing any further when the group2 quota is reached. However the culprit user1 with primary group as group1 won't be detected by gpfs, and can just keep going on writing group2 files. As far as the individual user quota, it doesn't matter: group1 or group2 it will be counted towards the usage of that user. It would be interesting if the behavior was more as expected. I just checked with my Lustre counter-parts and they tell me whichever secondary group is hit first, however many there may be, the user will be stopped. The problem then becomes identifying which of the secondary groups hit the limit for that user. 
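One blunt way to narrow that down is to walk the user's group list and pull the group quota report for each one -- a hedged sketch, with the user and file system names purely illustrative:

# check every group the user belongs to against its limits
for g in $(id -Gn user1); do
    mmlsquota -g "$g" gpfs01
done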
Jaime On Aug 3, 2016, at 11:35 AM, Sven Oehme > wrote: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Wed Aug 3 19:46:54 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Wed, 3 Aug 2016 19:46:54 +0100 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: On 03/08/16 19:06, Buterbaugh, Kevin L wrote: > Hi Sven, > > Wait - am I misunderstanding something here? Let?s say that I have > ?user1? who has primary group ?group1? and secondary group ?group2?. > And let?s say that they write to a directory where the bit on the > directory forces all files created in that directory to have group2 > associated with them. Are you saying that those files still count > against group1?s group quota??? > Yeah, but bastard user from hell over here then does chgrp group1 myevilfile.txt and your set group id bit becomes irrelevant because it is only ever indicative. In fact there is nothing that guarantees the set group id bit is honored because there is nothing stopping the user or a program coming in immediately after the file is created and changing that. Not pointing fingers at the OSX SMB client when Unix extensions are active on a Samba server in any way there. As such Unix group quotas are in the real world a total waste of space. This is if you ask me why XFS and Lustre have project quotas and GPFS has file sets. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 19:55:01 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 18:55:01 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> Message-ID: JAB, The set group id bit is tangential to my point. I expect GPFS to count any files a user owns against their user quota. If they are a member of multiple groups then I also expect it to count it against the group quota of whatever group is associated with that file. I.e., if they do a chgrp then GPFS should subtract from one group and add to another. Kevin On Aug 3, 2016, at 1:46 PM, Jonathan Buzzard > wrote: On 03/08/16 19:06, Buterbaugh, Kevin L wrote: Hi Sven, Wait - am I misunderstanding something here? 
Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Yeah, but bastard user from hell over here then does chgrp group1 myevilfile.txt and your set group id bit becomes irrelevant because it is only ever indicative. In fact there is nothing that guarantees the set group id bit is honored because there is nothing stopping the user or a program coming in immediately after the file is created and changing that. Not pointing fingers at the OSX SMB client when Unix extensions are active on a Samba server in any way there. As such Unix group quotas are in the real world a total waste of space. This is if you ask me why XFS and Lustre have project quotas and GPFS has file sets. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Wed Aug 3 20:13:09 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Wed, 3 Aug 2016 20:13:09 +0100 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> Message-ID: <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> On 03/08/16 19:34, Buterbaugh, Kevin L wrote: > Hi Jaime / Sven, > > If Jaime?s interpretation is correct about user1 continuing to be able > to write to ?group2? files even though that group is at their hard > limit, then that?s a bug that needs fixing. I haven?t tested that > myself, and we?re in a downtime right now so I?m a tad bit busy, but if > I need to I?ll test it on our test cluster later this week. > Even if Jamie's interpretation is wrong it shows the other massive failure of group quotas under Unix and why they are not fit for purpose in the real world. So bufh here can deliberately or accidentally do a denial of service on other users and tracking down the offending user is a right pain in the backside. The point of being able to change group ownership on a file is to indicate the massive weakness of the whole group quota system, and why in my experience nobody actually uses it, and "project" quota options have been implemented in many "enterprise" Unix file systems. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 3 20:18:11 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 3 Aug 2016 19:18:11 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? 
In-Reply-To: <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> Message-ID: <6B06DA37-321E-4730-A3D1-61E41E4C6187@vanderbilt.edu> JAB, Our scratch filesystem uses user and group quotas. It started out as a traditional scratch filesystem but then we decided (for better or worse) to allow groups to purchase quota on it (and we don?t purge it, as many sites do). We have many users in multiple groups, so if this is not working right it?s a potential issue for us. But you?re right, I?m a nobody? Kevin On Aug 3, 2016, at 2:13 PM, Jonathan Buzzard > wrote: On 03/08/16 19:34, Buterbaugh, Kevin L wrote: Hi Jaime / Sven, If Jaime?s interpretation is correct about user1 continuing to be able to write to ?group2? files even though that group is at their hard limit, then that?s a bug that needs fixing. I haven?t tested that myself, and we?re in a downtime right now so I?m a tad bit busy, but if I need to I?ll test it on our test cluster later this week. Even if Jamie's interpretation is wrong it shows the other massive failure of group quotas under Unix and why they are not fit for purpose in the real world. So bufh here can deliberately or accidentally do a denial of service on other users and tracking down the offending user is a right pain in the backside. The point of being able to change group ownership on a file is to indicate the massive weakness of the whole group quota system, and why in my experience nobody actually uses it, and "project" quota options have been implemented in many "enterprise" Unix file systems. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at gmail.com Wed Aug 3 21:32:32 2016 From: oehmes at gmail.com (Sven Oehme) Date: Wed, 3 Aug 2016 13:32:32 -0700 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <6B06DA37-321E-4730-A3D1-61E41E4C6187@vanderbilt.edu> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <78DAAA7C-C0C2-42C2-B6B9-B5EC6CC3A3F4@vanderbilt.edu> <2b823e10-34e8-ce9d-956c-267df4e6042b@buzzard.me.uk> <6B06DA37-321E-4730-A3D1-61E41E4C6187@vanderbilt.edu> Message-ID: i can't contribute much to the usefulness of tracking primary or secondary group. depending on who you ask you get a 50/50 answer why its great or broken either way. Jonathan explanation was correct, we only track/enforce primary groups , we don't do anything with secondary groups in regards to quotas. if there is 'doubt' of correct quotation of files on the disk in the filesystem one could always run mmcheckquota, its i/o intensive but will match quota usage of the in memory 'assumption' and update it from the actual data thats stored on disk. 
sven On Wed, Aug 3, 2016 at 12:18 PM, Buterbaugh, Kevin L < Kevin.Buterbaugh at vanderbilt.edu> wrote: > JAB, > > Our scratch filesystem uses user and group quotas. It started out as a > traditional scratch filesystem but then we decided (for better or worse) to > allow groups to purchase quota on it (and we don?t purge it, as many sites > do). > > We have many users in multiple groups, so if this is not working right > it?s a potential issue for us. But you?re right, I?m a nobody? > > Kevin > > On Aug 3, 2016, at 2:13 PM, Jonathan Buzzard > wrote: > > On 03/08/16 19:34, Buterbaugh, Kevin L wrote: > > Hi Jaime / Sven, > > If Jaime?s interpretation is correct about user1 continuing to be able > to write to ?group2? files even though that group is at their hard > limit, then that?s a bug that needs fixing. I haven?t tested that > myself, and we?re in a downtime right now so I?m a tad bit busy, but if > I need to I?ll test it on our test cluster later this week. > > > Even if Jamie's interpretation is wrong it shows the other massive failure > of group quotas under Unix and why they are not fit for purpose in the real > world. > > So bufh here can deliberately or accidentally do a denial of service on > other users and tracking down the offending user is a right pain in the > backside. > > The point of being able to change group ownership on a file is to indicate > the massive weakness of the whole group quota system, and why in my > experience nobody actually uses it, and "project" quota options have been > implemented in many "enterprise" Unix file systems. > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and > Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Lehmann at csiro.au Thu Aug 4 00:03:47 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Wed, 3 Aug 2016 23:03:47 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803125543.11831ypcdi8i189b@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> <20160803125543.11831ypcdi8i189b@support.scinet.utoronto.ca> Message-ID: <762ff4f5796c4992b3bceb23b26fdbf3@exch1-cdc.nexus.csiro.au> The GID selection rules for account creation are Linux distribution specific. It sounds like you are familiar with Red Hat, where I think this idea of GID=UID started. 
sles12sp1-brc:/dev/disk/by-uuid # useradd testout sles12sp1-brc:/dev/disk/by-uuid # grep testout /etc/passwd testout:x:1001:100::/home/testout:/bin/bash sles12sp1-brc:/dev/disk/by-uuid # grep 100 /etc/group users:x:100: sles12sp1-brc:/dev/disk/by-uuid # Cheers, Greg -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jaime Pinto Sent: Thursday, 4 August 2016 2:56 AM To: gpfsug main discussion list ; Jonathan Buzzard Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? I guess I have a bit of a puzzle to solve, combining quotas on filesets, paths and USR/GRP attributes So much for the "standard" built-in linux account creation script, in which by default every new user is created with primary GID=UID, doesn't really help any of us. Jaime Quoting "Jonathan Buzzard" : > On 03/08/16 17:22, Jaime Pinto wrote: >> Suppose I want to set both USR and GRP quotas for a user, however GRP >> is not the primary group. Will gpfs enforce the secondary group quota >> for that user? > > Nope that's not how POSIX schematics work for group quotas. As far as > I can tell only your primary group is used for group quotas. It > basically makes group quotas in Unix a waste of time in my opinion. At > least I have never come across a real world scenario where they work > in a useful manner. > >> What I mean is, if the user keeps writing files with secondary group >> as the attribute, and that overall group quota is reached, will that >> user be stopped by gpfs? >> > > File sets are the answer to your problems, but retrospectively > applying them to a file system is a pain. You create a file set for a > directory and can then apply a quota to the file set. Even better you > can apply per file set user and group quotas. So if file set A has a > 1TB quota you could limit user X to 100GB in the file set, but outside > the file set they could have a different quota or even no quota. > > Only issue is a limit of ~10,000 file sets per file system > > > JAB. > > -- > Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk > Fife, United Kingdom. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Greg.Lehmann at csiro.au Thu Aug 4 03:41:55 2016 From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au) Date: Thu, 4 Aug 2016 02:41:55 +0000 Subject: [gpfsug-discuss] 4.2.1 documentation Message-ID: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> I see only 4 pdfs now with slightly different titles to the previous 5 pdfs available with 4.2.0. Just checking there are only supposed to be 4 now? Greg -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kenneth.waegeman at ugent.be Thu Aug 4 09:13:29 2016 From: kenneth.waegeman at ugent.be (Kenneth Waegeman) Date: Thu, 4 Aug 2016 10:13:29 +0200 Subject: [gpfsug-discuss] 4.2.1 documentation In-Reply-To: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> References: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> Message-ID: <57A2F929.8000003@ugent.be> This is new, it is explained how they are merged at http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1xx_soc.htm Cheers! K On 04/08/16 04:41, Greg.Lehmann at csiro.au wrote: > > I see only 4 pdfs now with slightly different titles to the previous 5 > pdfs available with 4.2.0. Just checking there are only supposed to be > 4 now? > > Greg > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Thu Aug 4 09:13:51 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Thu, 4 Aug 2016 08:13:51 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: 1000 isn't it?! We've always worked on that assumption. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard Sent: 03 August 2016 17:44 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? in the file set, but outside the file set they could have a different quota or even no quota. Only issue is a limit of ~10,000 file sets per file system JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From r.sobey at imperial.ac.uk Thu Aug 4 09:17:01 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Thu, 4 Aug 2016 08:17:01 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: Ah. Dependent vs independent. (10,000 and 1000 respectively). -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sobey, Richard A Sent: 04 August 2016 09:14 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? 1000 isn't it?! We've always worked on that assumption. -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard Sent: 03 August 2016 17:44 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? in the file set, but outside the file set they could have a different quota or even no quota. Only issue is a limit of ~10,000 file sets per file system JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. 
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From st.graf at fz-juelich.de Thu Aug 4 09:20:42 2016 From: st.graf at fz-juelich.de (Stephan Graf) Date: Thu, 4 Aug 2016 10:20:42 +0200 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <891fb362-ac69-2803-3664-1a55087868dc@buzzard.me.uk> Message-ID: <57A2FADA.1060508@fz-juelich.de> Hi! I have tested it with dependent filesets in GPFS 4.1.1.X and there the limit is 10.000. Stephan On 08/04/16 10:13, Sobey, Richard A wrote: > 1000 isn't it?! We've always worked on that assumption. > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jonathan Buzzard > Sent: 03 August 2016 17:44 > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? > in the file set, but outside the file set they could have a different quota or even no quota. > > Only issue is a limit of ~10,000 file sets per file system > > > JAB. > -- Stephan Graf Juelich Supercomputing Centre Institute for Advanced Simulation Forschungszentrum Juelich GmbH 52425 Juelich, Germany Phone: +49-2461-61-6578 Fax: +49-2461-61-6656 E-mail: st.graf at fz-juelich.de WWW: http://www.fz-juelich.de/jsc/ ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ From daniel.kidger at uk.ibm.com Thu Aug 4 09:22:36 2016 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Thu, 4 Aug 2016 08:22:36 +0000 Subject: [gpfsug-discuss] 4.2.1 documentation In-Reply-To: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> Message-ID: Yes they have been re arranged. My observation is that the Admin and Advanced Admin have merged into one PDFs, and the DMAPI manual is now a chapter of the new Programming guide (along with the complete set of man pages which have moved out of the Admin guide). Table 3 on page 26 of the Concepts, Planning and Install guide describes these change. IMHO The new format is much better as all Admin is in one place not two. ps. I couldn't find in the programming guide a chapter yet on Light Weight Events. Anyone in product development care to comment? 
:-) Daniel IBM Spectrum Storage Software +44 (0)7818 522266 Sent from my iPad using IBM Verse On 4 Aug 2016, 03:42:21, Greg.Lehmann at csiro.au wrote: From: Greg.Lehmann at csiro.au To: gpfsug-discuss at spectrumscale.org Cc: Date: 4 Aug 2016 03:42:21 Subject: [gpfsug-discuss] 4.2.1 documentation I see only 4 pdfs now with slightly different titles to the previous 5 pdfs available with 4.2.0. Just checking there are only supposed to be 4 now? GregUnless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Thu Aug 4 16:59:31 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Thu, 04 Aug 2016 11:59:31 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> Message-ID: <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> Since there were inconsistencies in the responses, I decided to rig a couple of accounts/groups on our LDAP to test "My interpretation", and determined that I was wrong. When Kevin mentioned it would mean a bug I had to double-check: If a user hits the hard quota or exceeds the grace period on the soft quota on any of the secondary groups that user will be stopped from further writing to those groups as well, just as in the primary group. I hope this clears the waters a bit. I still have to solve my puzzle. Thanks everyone for the feedback. Jaime Quoting "Jaime Pinto" : > Quoting "Buterbaugh, Kevin L" : > >> Hi Sven, >> >> Wait - am I misunderstanding something here? Let?s say that I have >> ?user1? who has primary group ?group1? and secondary group >> ?group2?. And let?s say that they write to a directory where the >> bit on the directory forces all files created in that directory to >> have group2 associated with them. Are you saying that those >> files still count against group1?s group quota??? >> >> Thanks for clarifying? >> >> Kevin > > Not really, > > My interpretation is that all files written with group2 will count > towards the quota on that group. However any users with group2 as the > primary group will be prevented from writing any further when the > group2 quota is reached. However the culprit user1 with primary group > as group1 won't be detected by gpfs, and can just keep going on writing > group2 files. > > As far as the individual user quota, it doesn't matter: group1 or > group2 it will be counted towards the usage of that user. > > It would be interesting if the behavior was more as expected. I just > checked with my Lustre counter-parts and they tell me whichever > secondary group is hit first, however many there may be, the user will > be stopped. The problem then becomes identifying which of the secondary > groups hit the limit for that user. > > Jaime > > >> >> On Aug 3, 2016, at 11:35 AM, Sven Oehme >> > wrote: >> >> Hi, >> >> quotas are only counted against primary group >> >> sven >> >> >> On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto >> > wrote: >> Suppose I want to set both USR and GRP quotas for a user, however >> GRP is not the primary group. Will gpfs enforce the secondary group >> quota for that user? 
>> >> What I mean is, if the user keeps writing files with secondary >> group as the attribute, and that overall group quota is reached, >> will that user be stopped by gpfs? >> >> Thanks >> Jaime >> >> >> >> >> ************************************ >> TELL US ABOUT YOUR SUCCESS STORIES >> http://www.scinethpc.ca/testimonials >> ************************************ >> --- >> Jaime Pinto >> SciNet HPC Consortium - Compute/Calcul Canada >> www.scinet.utoronto.ca - >> www.computecanada.org >> University of Toronto >> 256 McCaul Street, Room 235 >> Toronto, ON, M5T1W5 >> P: 416-978-2755 >> C: 416-505-1477 >> > > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Thu Aug 4 17:08:30 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Thu, 4 Aug 2016 16:08:30 +0000 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> Message-ID: <7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu> Hi Jaime, Thank you sooooo much for doing this and reporting back the results! They?re in line with what I would expect to happen. I was going to test this as well, but we have had to extend our downtime until noontime tomorrow, so I haven?t had a chance to do so yet. Now I don?t have to? ;-) Kevin On Aug 4, 2016, at 10:59 AM, Jaime Pinto > wrote: Since there were inconsistencies in the responses, I decided to rig a couple of accounts/groups on our LDAP to test "My interpretation", and determined that I was wrong. When Kevin mentioned it would mean a bug I had to double-check: If a user hits the hard quota or exceeds the grace period on the soft quota on any of the secondary groups that user will be stopped from further writing to those groups as well, just as in the primary group. I hope this clears the waters a bit. I still have to solve my puzzle. Thanks everyone for the feedback. Jaime Quoting "Jaime Pinto" >: Quoting "Buterbaugh, Kevin L" >: Hi Sven, Wait - am I misunderstanding something here? Let?s say that I have ?user1? who has primary group ?group1? and secondary group ?group2?. And let?s say that they write to a directory where the bit on the directory forces all files created in that directory to have group2 associated with them. Are you saying that those files still count against group1?s group quota??? Thanks for clarifying? Kevin Not really, My interpretation is that all files written with group2 will count towards the quota on that group. 
However any users with group2 as the primary group will be prevented from writing any further when the group2 quota is reached. However the culprit user1 with primary group as group1 won't be detected by gpfs, and can just keep going on writing group2 files. As far as the individual user quota, it doesn't matter: group1 or group2 it will be counted towards the usage of that user. It would be interesting if the behavior was more as expected. I just checked with my Lustre counter-parts and they tell me whichever secondary group is hit first, however many there may be, the user will be stopped. The problem then becomes identifying which of the secondary groups hit the limit for that user. Jaime On Aug 3, 2016, at 11:35 AM, Sven Oehme > wrote: Hi, quotas are only counted against primary group sven On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > wrote: Suppose I want to set both USR and GRP quotas for a user, however GRP is not the primary group. Will gpfs enforce the secondary group quota for that user? What I mean is, if the user keeps writing files with secondary group as the attribute, and that overall group quota is reached, will that user be stopped by gpfs? Thanks Jaime ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Thu Aug 4 17:34:09 2016 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Thu, 04 Aug 2016 12:34:09 -0400 Subject: [gpfsug-discuss] quota on secondary groups for a user? 
In-Reply-To: <7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca> <20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca> <20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca> <7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu> Message-ID: <20160804123409.18403cy3iz123gxt@support.scinet.utoronto.ca> OK More info: Users can apply the 'sg group1' or 'sq group2' command from a shell or script to switch the group mask from that point on, and dodge the quota that may have been exceeded on a group. However, as the group owner or other member of the group on the limit, I could not find a tool they can use on their own to find out who is(are) the largest user(s); 'du' takes too long, and some users don't give read permissions on their directories. As part of the puzzle solution I have to come up with a root wrapper that can make the contents of the mmrepquota report available to them. Jaime Quoting "Buterbaugh, Kevin L" : > Hi Jaime, > > Thank you sooooo much for doing this and reporting back the results! > They?re in line with what I would expect to happen. I was going > to test this as well, but we have had to extend our downtime until > noontime tomorrow, so I haven?t had a chance to do so yet. Now I > don?t have to? ;-) > > Kevin > > On Aug 4, 2016, at 10:59 AM, Jaime Pinto > > wrote: > > Since there were inconsistencies in the responses, I decided to rig > a couple of accounts/groups on our LDAP to test "My interpretation", > and determined that I was wrong. When Kevin mentioned it would mean > a bug I had to double-check: > > If a user hits the hard quota or exceeds the grace period on the > soft quota on any of the secondary groups that user will be stopped > from further writing to those groups as well, just as in the primary > group. > > I hope this clears the waters a bit. I still have to solve my puzzle. > > Thanks everyone for the feedback. > Jaime > > > > Quoting "Jaime Pinto" > >: > > Quoting "Buterbaugh, Kevin L" > >: > > Hi Sven, > > Wait - am I misunderstanding something here? Let?s say that I have > ?user1? who has primary group ?group1? and secondary group > ?group2?. And let?s say that they write to a directory where the > bit on the directory forces all files created in that directory to > have group2 associated with them. Are you saying that those files > still count against group1?s group quota??? > > Thanks for clarifying? > > Kevin > > Not really, > > My interpretation is that all files written with group2 will count > towards the quota on that group. However any users with group2 as the > primary group will be prevented from writing any further when the > group2 quota is reached. However the culprit user1 with primary group > as group1 won't be detected by gpfs, and can just keep going on writing > group2 files. > > As far as the individual user quota, it doesn't matter: group1 or > group2 it will be counted towards the usage of that user. > > It would be interesting if the behavior was more as expected. I just > checked with my Lustre counter-parts and they tell me whichever > secondary group is hit first, however many there may be, the user will > be stopped. The problem then becomes identifying which of the secondary > groups hit the limit for that user. 
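For what it's worth, a minimal sketch of the kind of root wrapper mentioned above, which would also let a group member see which of their groups is at its limit. The script name, file system name, sudoers line and the mmrepquota parsing are all assumptions, not a tested tool:

#!/bin/bash
# /usr/local/sbin/show_group_quota - hypothetical wrapper exposed to users via sudo
# Prints the mmrepquota -g lines for every group the calling user belongs to.
FS=gpfs01                                # example file system name
CALLER=${SUDO_USER:-$USER}
REPORT=$(/usr/lpp/mmfs/bin/mmrepquota -g "$FS")
for g in $(id -Gn "$CALLER"); do
    echo "$REPORT" | awk -v grp="$g" '$1 == grp'
done
# sudoers entry (example): %users ALL=(root) NOPASSWD: /usr/local/sbin/show_group_quota

Anything fancier (grace columns, per-fileset reports) is just more awk over the same output.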
> > Jaime > > > > On Aug 3, 2016, at 11:35 AM, Sven Oehme > > > wrote: > > Hi, > > quotas are only counted against primary group > > sven > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > > > wrote: > Suppose I want to set both USR and GRP quotas for a user, however > GRP is not the primary group. Will gpfs enforce the secondary group > quota for that user? > > What I mean is, if the user keeps writing files with secondary > group as the attribute, and that overall group quota is reached, > will that user be stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From Kevin.Buterbaugh at Vanderbilt.Edu Wed Aug 10 22:00:26 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 10 Aug 2016 21:00:26 +0000 Subject: [gpfsug-discuss] User group meeting at SC16? Message-ID: Hi All, Just got an e-mail from DDN announcing that they are holding their user group meeting at SC16 on Monday afternoon like they always do, which is prompting me to inquire if IBM is going to be holding a meeting at SC16? Last year in Austin the IBM meeting was on Sunday afternoon, which worked out great as far as I was concerned. Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Robert.Oesterlin at nuance.com Wed Aug 10 22:04:11 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 10 Aug 2016 21:04:11 +0000 Subject: [gpfsug-discuss] User group meeting at SC16? In-Reply-To: References: Message-ID: <95126B16-B4DB-4406-862B-AA81E37F04E6@nuance.com> We're still trying to schedule that - The thinking right now is staying where last year. (Sunday afternoon) There is never a perfect time at these sorts of event - bound to step on something! If anyone has feedback (positive or negative) - let us know. Look for a formal announcement in early September. Bob Oesterlin GPFS-UG Co-Principal Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Buterbaugh, Kevin L" Reply-To: gpfsug main discussion list Date: Wednesday, August 10, 2016 at 4:00 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] User group meeting at SC16? Hi All, Just got an e-mail from DDN announcing that they are holding their user group meeting at SC16 on Monday afternoon like they always do, which is prompting me to inquire if IBM is going to be holding a meeting at SC16? Last year in Austin the IBM meeting was on Sunday afternoon, which worked out great as far as I was concerned. Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From malone12 at illinois.edu Wed Aug 10 22:43:15 2016 From: malone12 at illinois.edu (Maloney, John Daniel) Date: Wed, 10 Aug 2016 21:43:15 +0000 Subject: [gpfsug-discuss] User group meeting at SC16? Message-ID: <4AD486D7-D452-465A-85EC-1BDDE2C5DCFD@illinois.edu> Hi Bob, Thanks for the update! The couple storage folks from NCSA going to SC16 won?t be available Sunday (I?m not able to get in until Monday morning). Agree completely there is never a perfect time, just giving our feedback. Thanks again, J.D. Maloney Storage Engineer | Storage Enabling Technologies Group National Center for Supercomputing Applications (NCSA) From: > on behalf of "Oesterlin, Robert" > Reply-To: gpfsug main discussion list > Date: Wednesday, August 10, 2016 at 4:04 PM To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] User group meeting at SC16? We're still trying to schedule that - The thinking right now is staying where last year. (Sunday afternoon) There is never a perfect time at these sorts of event - bound to step on something! If anyone has feedback (positive or negative) - let us know. Look for a formal announcement in early September. Bob Oesterlin GPFS-UG Co-Principal Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Buterbaugh, Kevin L" > Reply-To: gpfsug main discussion list > Date: Wednesday, August 10, 2016 at 4:00 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] User group meeting at SC16? Hi All, Just got an e-mail from DDN announcing that they are holding their user group meeting at SC16 on Monday afternoon like they always do, which is prompting me to inquire if IBM is going to be holding a meeting at SC16? Last year in Austin the IBM meeting was on Sunday afternoon, which worked out great as far as I was concerned. Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aaron.s.knister at nasa.gov Thu Aug 11 05:47:17 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Thu, 11 Aug 2016 00:47:17 -0400 Subject: [gpfsug-discuss] GPFS and SELinux Message-ID: Hi Everyone, I'm passing this along on behalf of one of our security guys. Just wondering what feedback/thoughts others have on the topic. Current IBM guidance on GPFS and SELinux indicates that the default context for services (initrc_t) is insufficient for GPFS operations. See: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General+Parallel+File+System+(GPFS)/page/Using+GPFS+with+SElinux That part is true (by design), but IBM goes further to say use runcon out of rc.local and configure the gpfs service to not start via init. I believe these latter two (rc.local/runcon and no-init) can be addressed, relatively trivially, through the application of a small selinux policy. Ideally, I would hope for IBM to develop, test, and send out the policy, but I'm happy to offer the following suggestions. I believe "a)" could be developed in a relatively short period of time. "b)" would take more time, effort and experience. a) consider SELinux context transition. As an example, consider: https://github.com/TresysTechnology/refpolicy/tree/master/policy/modules/services (specifically, the ssh components) On a normal centOS/RHEL system sshd has the file context of sshd_exec_t, and runs under sshd_t Referencing ssh.te, you see several references to sshd_exec_t in: domtrans_pattern init_daemon_domain daemontools_service_domain (and so on) These configurations allow init to fire sshd off, setting its runtime context to sshd_t, based on the file context of sshd_exec_t. This should be duplicable for the gpfs daemon, altho I note it seems to be fired through a layer of abstraction in mmstartup. A simple policy that allows INIT to transition GPFS to unconfined_t would go a long way towards easing integration. b) file contexts of gpfs_daemon_t and gpfs_util_t, perhaps, that when executed, would pick up a context of gpfs_t? Which then could be mapped through standard SELinux policy to allow access to configuration files (gpfs_etc_t?), block devices, etc? I admit, in b, I am speculating heavily. -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From janfrode at tanso.net Thu Aug 11 10:54:27 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 11 Aug 2016 11:54:27 +0200 Subject: [gpfsug-discuss] GPFS and SELinux In-Reply-To: References: Message-ID: I believe the runcon part is no longer necessary, at least on my RHEL7 based systems mmfsd is running unconfined by default: [root at flexscale01 ~]# ps -efZ|grep mmfsd unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 root 18018 17709 0 aug.05 ? 00:24:53 /usr/lpp/mmfs/bin/mmfsd and I've never seen any problems with that for base GPFS. I suspect doing a proper selinux domain for GPFS will be quite close to unconfined, so maybe not worth the effort... -jf On Thu, Aug 11, 2016 at 6:47 AM, Aaron Knister wrote: > Hi Everyone, > > I'm passing this along on behalf of one of our security guys. Just > wondering what feedback/thoughts others have on the topic. > > > Current IBM guidance on GPFS and SELinux indicates that the default > context for services (initrc_t) is insufficient for GPFS operations. > > See: > https://www.ibm.com/developerworks/community/wikis/home? 
> lang=en#!/wiki/General+Parallel+File+System+(GPFS)/ > page/Using+GPFS+with+SElinux > > > That part is true (by design), but IBM goes further to say use runcon > out of rc.local and configure the gpfs service to not start via init. > > I believe these latter two (rc.local/runcon and no-init) can be > addressed, relatively trivially, through the application of a small > selinux policy. > > Ideally, I would hope for IBM to develop, test, and send out the policy, > but I'm happy to offer the following suggestions. I believe "a)" could > be developed in a relatively short period of time. "b)" would take more > time, effort and experience. > > a) consider SELinux context transition. > > As an example, consider: > https://github.com/TresysTechnology/refpolicy/tree/master/ > policy/modules/services > > > (specifically, the ssh components) > > On a normal centOS/RHEL system sshd has the file context of sshd_exec_t, > and runs under sshd_t > > Referencing ssh.te, you see several references to sshd_exec_t in: > domtrans_pattern > init_daemon_domain > daemontools_service_domain > (and so on) > > These configurations allow init to fire sshd off, setting its runtime > context to sshd_t, based on the file context of sshd_exec_t. > > This should be duplicable for the gpfs daemon, altho I note it seems to > be fired through a layer of abstraction in mmstartup. > > A simple policy that allows INIT to transition GPFS to unconfined_t > would go a long way towards easing integration. > > b) file contexts of gpfs_daemon_t and gpfs_util_t, perhaps, that when > executed, would pick up a context of gpfs_t? Which then could be mapped > through standard SELinux policy to allow access to configuration files > (gpfs_etc_t?), block devices, etc? > > I admit, in b, I am speculating heavily. > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From douglasof at us.ibm.com Fri Aug 12 20:40:27 2016 From: douglasof at us.ibm.com (Douglas O'flaherty) Date: Fri, 12 Aug 2016 19:40:27 +0000 Subject: [gpfsug-discuss] HPCwire Readers Choice Message-ID: Reminder... Get your stories in today! To view this email in your browser, click here. Last Call for Readers' Choice Award Nominations! Deadline: Friday, August 12th at 11:50pm! Only 3 days left until nominations for the 2016 HPCwire Readers' Choice Awards come to a close! Be sure to submit your picks for the best in HPC and make your voice heard before it's too late! These annual awards are a way for our community to recognize the best and brightest innovators within the global HPC community. Time is running out for you to nominate what you think are the greatest achievements in HPC for 2016, so cast your ballot today! 
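Coming back to the SELinux question earlier in this digest: short of writing the proper gpfs_t domain and transition policy Aaron sketches, a cruder but quick alternative is to let audit2allow generate a small local module from the actual AVC denials. A rough sequence, with a made-up module name, assuming auditd is running and SELinux was left in permissive mode long enough to capture the denials:

# build and load a local policy module from the mmfsd denials
ausearch -m avc -c mmfsd | audit2allow -M gpfs_local   # writes gpfs_local.te and gpfs_local.pp
semodule -i gpfs_local.pp                              # load the generated module
ps -efZ | grep mmfsd                                   # check which context the daemon ends up in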
The 2016 Categories Include the Following: * Best Use of HPC Application in Life Sciences * Best Use of HPC Application in Manufacturing * Best Use of HPC Application in Energy (previously 'Oil and Gas') * Best Use of HPC in Automotive * Best Use of HPC in Financial Services * Best Use of HPC in Entertainment * Best Use of HPC in the Cloud * Best Use of High Performance Data Analytics * Best Implementation of Energy-Efficient HPC * Best HPC Server Product or Technology * Best HPC Storage Product or Technology * Best HPC Software Product or Technology * Best HPC Visualization Product or Technology * Best HPC Interconnect Product or Technology * Best HPC Cluster Solution or Technology * Best Data-Intensive System (End-User Focused) * Best HPC Collaboration Between Government & Industry * Best HPC Collaboration Between Academia & Industry * Top Supercomputing Achievement * Top 5 New Products or Technologies to Watch * Top 5 Vendors to Watch * Workforce Diversity Leadership Award * Outstanding Leadership in HPC Nominations are accepted from readers, users, vendors - virtually anyone who is connected to the HPC community and is a reader of HPCwire. Nominations will close on August 12, 2016 at 11:59pm. Make your voice heard! Help tell the story of HPC in 2016 by submitting your nominations for the HPCwire Readers' Choice Awards now! Nominations close on August 12, 2016. All nominations are subject to review by the editors of HPCwire with only the most relevant being accepted. Voting begins August 22, 2015. The final presentation of these prestigious and highly anticipated awards to each organization's leading executives will take place live during SC '16 in Salt Lake City, UT. The finalist(s) in each category who receive the most votes will win this year's awards. Open to HPCwire readers only. HPCwire Subscriber Services This email was sent to lwestoby at us.ibm.com. You are receiving this email message as an HPCwire subscriber. To forward this email to a friend, click here. Unsubscribe from this list. Copyright ? 2016 Tabor Communications Inc. All rights reserved. 8445 Camino Santa Fe San Diego, California 92121 P: 858.625.0070 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 40078 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/gif Size: 43 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Mon Aug 15 10:59:34 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 15 Aug 2016 09:59:34 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? Message-ID: Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they're on different versions? Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Aug 15 12:22:31 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 15 Aug 2016 11:22:31 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: References: Message-ID: In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Sobey, Richard A" Reply-To: gpfsug main discussion list Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems? Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they're on different versions? Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Mon Aug 15 13:45:25 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Mon, 15 Aug 2016 12:45:25 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: References: Message-ID: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> Richard, I will second what Bob said with one caveat - on one occasion we had an issue with our multi-cluster setup because the PTF's were incompatible. However, that was clearly documented in the release notes, which we obviously hadn't read carefully enough.
While we generally do rolling upgrades over a two to three week period, we have run for months with clients at differing PTF levels. HTHAL? Kevin On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert > wrote: In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Sobey, Richard A" > Reply-To: gpfsug main discussion list > Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" > Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems? Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they?re on different versions? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Mon Aug 15 13:58:47 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 15 Aug 2016 12:58:47 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> References: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> Message-ID: Thanks Kevin and Bob. PTF = minor version? I can?t think what it might stand for. Something Time Fix? Point in time fix? From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Buterbaugh, Kevin L Sent: 15 August 2016 13:45 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Minor GPFS versions coexistence problems? Richard, I will second what Bob said with one caveat ? on one occasion we had an issue with our multi-cluster setup because the PTF?s were incompatible. However, that was clearly documented in the release notes, which we obviously hadn?t read carefully enough. While we generally do rolling upgrades over a two to three week period, we have run for months with clients at differing PTF levels. HTHAL? Kevin On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert > wrote: In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Sobey, Richard A" > Reply-To: gpfsug main discussion list > Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" > Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems? Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they?re on different versions? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jamiedavis at us.ibm.com Mon Aug 15 14:02:13 2016 From: jamiedavis at us.ibm.com (James Davis) Date: Mon, 15 Aug 2016 13:02:13 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: References: , <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> Message-ID: An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Aug 15 14:05:01 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 15 Aug 2016 13:05:01 +0000 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? Message-ID: <28479088-C492-4441-A761-F49E1556E13E@nuance.com> PTF = Program Temporary Fix. IBM-Speak for a fix for a particular problem. Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Sobey, Richard A" Reply-To: gpfsug main discussion list Date: Monday, August 15, 2016 at 7:58 AM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Minor GPFS versions coexistence problems? Thanks Kevin and Bob. PTF = minor version? I can?t think what it might stand for. Something Time Fix? Point in time fix? From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Buterbaugh, Kevin L Sent: 15 August 2016 13:45 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Minor GPFS versions coexistence problems? Richard, I will second what Bob said with one caveat ? on one occasion we had an issue with our multi-cluster setup because the PTF?s were incompatible. However, that was clearly documented in the release notes, which we obviously hadn?t read carefully enough. While we generally do rolling upgrades over a two to three week period, we have run for months with clients at differing PTF levels. HTHAL? Kevin On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert > wrote: In general, yes, it's common practice to do the 'rolling upgrades'. If I had to do my whole cluster at once, with an outage, I'd probably never upgrade. :) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Sobey, Richard A" > Reply-To: gpfsug main discussion list > Date: Monday, August 15, 2016 at 4:59 AM To: "'gpfsug-discuss at spectrumscale.org'" > Subject: [EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence problems? Hi all, If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger it over a few days, perhaps up to 2 weeks or will I run into problems if they?re on different versions? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From kdball at us.ibm.com Mon Aug 15 15:12:07 2016 From: kdball at us.ibm.com (Keith D Ball) Date: Mon, 15 Aug 2016 14:12:07 +0000 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 55, Issue 16 In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
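On the mixed-version point, it can be handy during a rolling upgrade to see exactly what each node is running and what level the cluster has been committed to. A rough sketch (assumes mmdsh is usable in your cluster and that the package is named gpfs.base on your distro):

# per-node daemon and package levels, run from a node with admin access to all others
mmdsh -N all "/usr/lpp/mmfs/bin/mmdiag --version"
mmdsh -N all "rpm -q gpfs.base" | sort
# the cluster-wide compatibility level only moves when you deliberately run mmchconfig release=LATEST
mmlsconfig minReleaseLevel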
URL: From jake.carroll at uq.edu.au Mon Aug 15 22:08:58 2016 From: jake.carroll at uq.edu.au (Jake Carroll) Date: Mon, 15 Aug 2016 21:08:58 +0000 Subject: [gpfsug-discuss] More on AFM cache chaining Message-ID: <94AB3BCD-B551-4F3E-9128-65B582A4ABC6@uq.edu.au> Hi there. In the spirit of a conversation a friend showed me a couple of weeks ago from Radhika Parameswaran and Luke Raimbach, we?re doing something similar to Luke (kind of), or at least attempting it, in regards to cache chaining. We?ve got a large research storage platform in Brisbane, Queensland, Australia and we?re trying to leverage a few different modes of operation. Currently: Cache A (IW) connects to what would be a Home (B) which then is effectively an NFS mount to (C) a DMF based NFS export. To a point, this works. It kind of allows us to use ?home? as the ultimate sink, and data migration in and out of DMF seems to be working nicely when GPFS pulls things from (B) which don?t appear to currently be in (A) due to policy, or a HWM was hit (thus emptying cache). We?ve tested it as far out as the data ONLY being offline in tape media inside (C) and it still works, cleanly coming back to (A) within a very reasonable time-frame. ? We hit ?problem 1? which is in and around NFS v4 ACL?s which aren?t surfacing or mapping correctly (as we?d expect). I guess this might be the caveat of trying to backend the cache to a home and have it sitting inside DMF (over an NFS Export) for surfacing of the data for clients. Where we?d like to head: We haven?t seen it yet, but as Luke and Radhika were discussing last month, we really liked the idea of an IW Cache (A, where instruments dump huge data) which then via AFM ends up at (B) (might also be technically ?home? but IW) which is then also a function of (C) which might also be another cache that sits next to a HPC platform for reading and writing data into quickly and out of in parallel. We like the idea of chained caches because it gives us extremely flexibility in the premise of our ?Data anywhere? fabric. We appreciate that this has some challenges, in that we know if you?ve got multiple IW scenarios the last write will always win ? this we can control with workload guidelines. But we?d like to add our voices to this idea of having caches chained all the way back to some point such that data is being pulled all the way from C --> B --> A and along the way, inflection points of IO might be written and read at point C and point B AND point A such that everyone would see the distribution and consistent data in the end. We?re also working on surfacing data via object and file simultaneously for different needs. This is coming along relatively well, but we?re still learning about where and where this does not make sense so far. A moving target, from how it all appears on the surface. Some might say that is effectively asking for a globally eventually (always) consistent filesystem within Scale?. Anyway ? just some thoughts. Regards, -jc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aaron.s.knister at nasa.gov Tue Aug 16 03:22:17 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 15 Aug 2016 22:22:17 -0400 Subject: [gpfsug-discuss] mmfsadm test pit Message-ID: I just discovered this interesting gem poking at mmfsadm: test pit fsname list|suspend|status|resume|stop [jobId] There have been times where I've kicked off a restripe and either intentionally or accidentally ctrl-c'd it only to realize that many times it's disappeared into the ether and is still running. The only way I've known so far to stop it is with a chgmgr. A far more painful instance happened when I ran a rebalance on an fs w/more than 31 nsds using more than 31 pit workers and hit *that* fun APAR which locked up access for a single filesystem to all 3.5k nodes. We spent 48 hours round the clock rebooting nodes as jobs drained to clear it up. I would have killed in that instance for a way to cancel the PIT job (the chmgr trick didn't work). It looks like you might actually be able to do this with mmfsadm, although how wise this is, I do not know (kinda curious about that). Here's an example. I kicked off a restripe and then ctrl-c'd it on a client node. Then ran these commands from the fs manager: root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_RUNNING progress 0.00 debug: statusListP D40E2C70 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal stop 785979015170 debug: statusListP 0 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_STOPPING progress 4.01 debug: statusListP D4013E70 ... some time passes ... root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list debug: statusListP 0 Interesting. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From volobuev at us.ibm.com Tue Aug 16 16:21:13 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Tue, 16 Aug 2016 08:21:13 -0700 Subject: [gpfsug-discuss] 4.2.1 documentation In-Reply-To: References: <8033d4a67d9745f4a52f148538423066@exch1-cdc.nexus.csiro.au> Message-ID: Light Weight Event support is not fully baked yet, and thus not documented. It's getting there. yuri From: "Daniel Kidger" To: "gpfsug main discussion list" , Cc: "gpfsug-discuss" Date: 08/04/2016 01:23 AM Subject: Re: [gpfsug-discuss] 4.2.1 documentation Sent by: gpfsug-discuss-bounces at spectrumscale.org Yes they have been re arranged. My observation is that the Admin and Advanced Admin have merged into one PDFs, and the DMAPI manual is now a chapter of the new Programming guide (along with the complete set of man pages which have moved out of the Admin guide). Table 3 on page 26 of the Concepts, Planning and Install guide describes these change. IMHO The new format is much better as all Admin is in one place not two. ps. I couldn't find in the programming guide a chapter yet on Light Weight Events. Anyone in product development care to comment? :-) Daniel IBM Spectrum Storage Software +44 (0)7818 522266 Sent from my iPad using IBM Verse On 4 Aug 2016, 03:42:21, Greg.Lehmann at csiro.au wrote: From: Greg.Lehmann at csiro.au To: gpfsug-discuss at spectrumscale.org Cc: Date: 4 Aug 2016 03:42:21 Subject: [gpfsug-discuss] 4.2.1 documentation I see only 4 pdfs now with slightly different titles to the previous 5 pdfs available with 4.2.0. Just checking there are only supposed to be 4 now? 
Greg Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From volobuev at us.ibm.com Tue Aug 16 16:42:33 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Tue, 16 Aug 2016 08:42:33 -0700 Subject: [gpfsug-discuss] quota on secondary groups for a user? In-Reply-To: <20160804123409.18403cy3iz123gxt@support.scinet.utoronto.ca> References: <20160803122227.13743yk89dn1dbur@support.scinet.utoronto.ca><20160803143008.11673qj876dtqjw0@support.scinet.utoronto.ca><20160804115931.26601tycacksqhcz@support.scinet.utoronto.ca><7C0606E3-37D9-4301-8676-5060A0984FF2@vanderbilt.edu> <20160804123409.18403cy3iz123gxt@support.scinet.utoronto.ca> Message-ID: This is a long discussion thread, touching on several related subjects, but as far as the original "secondary groups" question, things are quite simple. A file in a Unix file system has an owning user and an owning group. Those are two IDs that are stored in the inode on disk, and those IDs are used to charge the corresponding user and group quotas. Exactly how the owning GID gets set is an entirely separate question. It may be the current user's primary group, or a secondary group, or a result of chown, etc. To GPFS code it doesn't matter what supplementary GIDs a given thread has in its security context for the purposes of charging group quota, the only thing that matters is the GID in the file inode. yuri From: "Jaime Pinto" To: "gpfsug main discussion list" , "Buterbaugh, Kevin L" , Date: 08/04/2016 09:34 AM Subject: Re: [gpfsug-discuss] quota on secondary groups for a user? Sent by: gpfsug-discuss-bounces at spectrumscale.org OK More info: Users can apply the 'sg group1' or 'sq group2' command from a shell or script to switch the group mask from that point on, and dodge the quota that may have been exceeded on a group. However, as the group owner or other member of the group on the limit, I could not find a tool they can use on their own to find out who is(are) the largest user(s); 'du' takes too long, and some users don't give read permissions on their directories. As part of the puzzle solution I have to come up with a root wrapper that can make the contents of the mmrepquota report available to them. Jaime Quoting "Buterbaugh, Kevin L" : > Hi Jaime, > > Thank you sooooo much for doing this and reporting back the results! > They?re in line with what I would expect to happen. I was going > to test this as well, but we have had to extend our downtime until > noontime tomorrow, so I haven?t had a chance to do so yet. Now I > don?t have to? ;-) > > Kevin > > On Aug 4, 2016, at 10:59 AM, Jaime Pinto > > wrote: > > Since there were inconsistencies in the responses, I decided to rig > a couple of accounts/groups on our LDAP to test "My interpretation", > and determined that I was wrong. 
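A quick way to convince yourself that it is the file's owning GID, not the writer's primary group, that gets charged (the paths, group names and sizes below are just examples):

# write some data, then change the file's group; the usage follows the owning GID
dd if=/dev/zero of=/gpfs/gpfs01/demo/bigfile bs=1M count=512
ls -ln /gpfs/gpfs01/demo/bigfile          # numeric owning UID/GID as stored in the inode
chgrp group2 /gpfs/gpfs01/demo/bigfile    # the blocks are now charged to group2
mmlsquota -g group1 gpfs01
mmlsquota -g group2 gpfs01                # in-doubt values may take a moment to settle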
When Kevin mentioned it would mean > a bug I had to double-check: > > If a user hits the hard quota or exceeds the grace period on the > soft quota on any of the secondary groups that user will be stopped > from further writing to those groups as well, just as in the primary > group. > > I hope this clears the waters a bit. I still have to solve my puzzle. > > Thanks everyone for the feedback. > Jaime > > > > Quoting "Jaime Pinto" > >: > > Quoting "Buterbaugh, Kevin L" > >: > > Hi Sven, > > Wait - am I misunderstanding something here? Let?s say that I have > ?user1? who has primary group ?group1? and secondary group > ?group2?. And let?s say that they write to a directory where the > bit on the directory forces all files created in that directory to > have group2 associated with them. Are you saying that those files > still count against group1?s group quota??? > > Thanks for clarifying? > > Kevin > > Not really, > > My interpretation is that all files written with group2 will count > towards the quota on that group. However any users with group2 as the > primary group will be prevented from writing any further when the > group2 quota is reached. However the culprit user1 with primary group > as group1 won't be detected by gpfs, and can just keep going on writing > group2 files. > > As far as the individual user quota, it doesn't matter: group1 or > group2 it will be counted towards the usage of that user. > > It would be interesting if the behavior was more as expected. I just > checked with my Lustre counter-parts and they tell me whichever > secondary group is hit first, however many there may be, the user will > be stopped. The problem then becomes identifying which of the secondary > groups hit the limit for that user. > > Jaime > > > > On Aug 3, 2016, at 11:35 AM, Sven Oehme > > > wrote: > > Hi, > > quotas are only counted against primary group > > sven > > > On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto > < mailto:pinto at scinet.utoronto.ca>> > wrote: > Suppose I want to set both USR and GRP quotas for a user, however > GRP is not the primary group. Will gpfs enforce the secondary group > quota for that user? > > What I mean is, if the user keeps writing files with secondary > group as the attribute, and that overall group quota is reached, > will that user be stopped by gpfs? > > Thanks > Jaime > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca< http://www.scinet.utoronto.ca/> - > www.computecanada.org< http://www.computecanada.org/> > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. 
> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > > ************************************ > TELL US ABOUT YOUR SUCCESS STORIES > http://www.scinethpc.ca/testimonials > ************************************ > --- > Jaime Pinto > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - > www.computecanada.org > University of Toronto > 256 McCaul Street, Room 235 > Toronto, ON, M5T1W5 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - > (615)875-9633 > > > > ************************************ TELL US ABOUT YOUR SUCCESS STORIES http://www.scinethpc.ca/testimonials ************************************ --- Jaime Pinto SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.org University of Toronto 256 McCaul Street, Room 235 Toronto, ON, M5T1W5 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From Robert.Oesterlin at nuance.com Tue Aug 16 16:59:13 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 16 Aug 2016 15:59:13 +0000 Subject: [gpfsug-discuss] Attending IBM Edge? Sessions of note and possible meet-up Message-ID: <29EA4D63-8885-42C5-876C-D68EB9E1CFDE@nuance.com> For those of you on the mailing list attending the IBM Edge conference in September, there will be at least one NDA session on Spectrum Scale and its future directions. I've heard that there will be a session on licensing as well. (always a hot topic). I have a couple of talks: Spectrum Scale with Transparent Cloud Tiering and on Spectrum Scale with Spectrum Control. I'll try and organize some sort of informal meetup one of the nights - thoughts on when would be welcome. Probably not Tuesday night, as that's the entertainment night. :-) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfosburg at mdanderson.org Tue Aug 16 17:13:17 2016 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Tue, 16 Aug 2016 16:13:17 +0000 Subject: [gpfsug-discuss] Attending IBM Edge? 
Sessions of note and possible meet-up In-Reply-To: <29EA4D63-8885-42C5-876C-D68EB9E1CFDE@nuance.com> References: <29EA4D63-8885-42C5-876C-D68EB9E1CFDE@nuance.com> Message-ID: <57c145ab-4207-7550-af57-ff07d6ac8f2d@mdanderson.org> I am speaking: SNP-2408 : Implementing a Research Storage Environment Using IBM Spectrum Software at MD Anderson Cancer Center Program : Enabling Cognitive IT with Storage and Software Defined Solutions Track : Building Oceans of Data Session Type : Breakout Session Date/Time : Tue, 20-Sep, 05:00 PM-06:00 PM Location : MGM Grand - Room 104 Presenter(s):Jonathan Fosburgh, UT MD Anderson This is primarily dealing with Scale and Archive, and also includes Protect. -- Jonathan Fosburgh Principal Application Systems Analyst Storage Team IT Operations jfosburg at mdanderson.org (713) 745-9346 On 08/16/2016 10:59 AM, Oesterlin, Robert wrote: For those of you on the mailing list attending the IBM Edge conference in September, there will be at least one NDA session on Spectrum Scale and its future directions. I've heard that there will be a session on licensing as well. (always a hot topic). I have a couple of talks: Spectrum Scale with Transparent Cloud Tiering and on Spectrum Scale with Spectrum Control. I'll try and organize some sort of informal meetup one of the nights - thoughts on when would be welcome. Probably not Tuesday night, as that's the entertainment night. :-) Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Tue Aug 16 22:09:35 2016 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Tue, 16 Aug 2016 17:09:35 -0400 Subject: [gpfsug-discuss] mmfsadm test pit In-Reply-To: References: Message-ID: I was surprised to read that Ctrl-C did not really kill restripe. It's supposed to! If it doesn't that's a bug. I ran this by my expert within IBM and he wrote to me: First of all a "PIT job" such as restripe, deldisk, delsnapshot, and such should be easy to stop by ^C the management program that started them. The SG manager daemon holds open a socket to the client program for the purposes of sending command output, progress updates, error messages and the like. The PIT code checks this socket periodically and aborts the PIT process cleanly if the socket is closed. If this cleanup doesn't occur, it is a bug and should be worth reporting. However, there's no exact guarantee on how quickly each thread on the SG mgr will notice and then how quickly the helper nodes can be stopped and so forth. 
The interval between socket checks depends among other things on how long it takes to process each file, if there are a few very large files, the delay can be significant. In the limiting case, where most of the FS storage is contained in a few files, this mechanism doesn't work [elided] well. So it can be quite involved and slow sometimes to wrap up a PIT operation. The simplest way to determine if the command has really stopped is with the mmdiag --commands issued on the SG manager node. This shows running commands with the command line, start time, socket, flags, etc. After ^Cing the client program, the entry here should linger for a while, then go away. When it exits you'll see an entry in the GPFS log file where it fails with err 50. If this doesn't stop the command after a while, it is worth looking into. If the command wasn't issued on the SG mgr node and you can't find the where the client command is running, the socket is still a useful hint. While tedious, it should be possible to trace this socket back to node where that command was originally run using netstat or equivalent. Poking around inside a GPFS internaldump will also provide clues; there should be an outstanding sgmMsgSGClientCmd command listed in the dump tscomm section. Once you find it, just 'kill `pidof mmrestripefs` or similar. I'd like to warn the OP away from mmfsadm test pit. These commands are of course unsupported and unrecommended for any purpose (even internal test and development purposes, as far as I know). You are definitely working without a net there. When I was improving the integration between PIT and snapshot quiesce a few years ago, I looked into this and couldn't figure out how to (easily) make these stop and resume commands safe to use, so as far as I know they remain unsafe. The list command, however, is probably fairly okay; but it would probably be better to use mmfsadm saferdump pit. From: Aaron Knister To: Date: 08/15/2016 10:49 PM Subject: [gpfsug-discuss] mmfsadm test pit Sent by: gpfsug-discuss-bounces at spectrumscale.org I just discovered this interesting gem poking at mmfsadm: test pit fsname list|suspend|status|resume|stop [jobId] There have been times where I've kicked off a restripe and either intentionally or accidentally ctrl-c'd it only to realize that many times it's disappeared into the ether and is still running. The only way I've known so far to stop it is with a chgmgr. A far more painful instance happened when I ran a rebalance on an fs w/more than 31 nsds using more than 31 pit workers and hit *that* fun APAR which locked up access for a single filesystem to all 3.5k nodes. We spent 48 hours round the clock rebooting nodes as jobs drained to clear it up. I would have killed in that instance for a way to cancel the PIT job (the chmgr trick didn't work). It looks like you might actually be able to do this with mmfsadm, although how wise this is, I do not know (kinda curious about that). Here's an example. I kicked off a restripe and then ctrl-c'd it on a client node. Then ran these commands from the fs manager: root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_RUNNING progress 0.00 debug: statusListP D40E2C70 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal stop 785979015170 debug: statusListP 0 root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list JobId 785979015170 PitJobStatus PIT_JOB_STOPPING progress 4.01 debug: statusListP D4013E70 ... some time passes ... 
root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list debug: statusListP 0 Interesting. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 16 22:55:19 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 16 Aug 2016 17:55:19 -0400 Subject: [gpfsug-discuss] mmfsadm test pit In-Reply-To: References: Message-ID: Thanks Marc! That's incredibly helpful info. I'll uh, not use the test pit command :) -Aaron On 8/16/16 5:09 PM, Marc A Kaplan wrote: > I was surprised to read that Ctrl-C did not really kill restripe. It's > supposed to! If it doesn't that's a bug. > > I ran this by my expert within IBM and he wrote to me: > > First of all a "PIT job" such as restripe, deldisk, delsnapshot, and > such should be easy to stop by ^C the management program that started > them. The SG manager daemon holds open a socket to the client program > for the purposes of sending command output, progress updates, error > messages and the like. The PIT code checks this socket periodically and > aborts the PIT process cleanly if the socket is closed. If this cleanup > doesn't occur, it is a bug and should be worth reporting. However, > there's no exact guarantee on how quickly each thread on the SG mgr will > notice and then how quickly the helper nodes can be stopped and so > forth. The interval between socket checks depends among other things on > how long it takes to process each file, if there are a few very large > files, the delay can be significant. In the limiting case, where most > of the FS storage is contained in a few files, this mechanism doesn't > work [elided] well. So it can be quite involved and slow sometimes to > wrap up a PIT operation. > > The simplest way to determine if the command has really stopped is with > the mmdiag --commands issued on the SG manager node. This shows running > commands with the command line, start time, socket, flags, etc. After > ^Cing the client program, the entry here should linger for a while, then > go away. When it exits you'll see an entry in the GPFS log file where > it fails with err 50. If this doesn't stop the command after a while, > it is worth looking into. > > If the command wasn't issued on the SG mgr node and you can't find the > where the client command is running, the socket is still a useful hint. > While tedious, it should be possible to trace this socket back to node > where that command was originally run using netstat or equivalent. > Poking around inside a GPFS internaldump will also provide clues; there > should be an outstanding sgmMsgSGClientCmd command listed in the dump > tscomm section. Once you find it, just 'kill `pidof mmrestripefs` or > similar. > > I'd like to warn the OP away from mmfsadm test pit. These commands are > of course unsupported and unrecommended for any purpose (even internal > test and development purposes, as far as I know). You are definitely > working without a net there. When I was improving the integration > between PIT and snapshot quiesce a few years ago, I looked into this and > couldn't figure out how to (easily) make these stop and resume commands > safe to use, so as far as I know they remain unsafe. 
The list command, > however, is probably fairly okay; but it would probably be better to use > mmfsadm saferdump pit. > > > > > > From: Aaron Knister > To: > Date: 08/15/2016 10:49 PM > Subject: [gpfsug-discuss] mmfsadm test pit > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------------------------------------------------ > > > > I just discovered this interesting gem poking at mmfsadm: > > test pit fsname list|suspend|status|resume|stop [jobId] > > There have been times where I've kicked off a restripe and either > intentionally or accidentally ctrl-c'd it only to realize that many > times it's disappeared into the ether and is still running. The only way > I've known so far to stop it is with a chgmgr. > > A far more painful instance happened when I ran a rebalance on an fs > w/more than 31 nsds using more than 31 pit workers and hit *that* fun > APAR which locked up access for a single filesystem to all 3.5k nodes. > We spent 48 hours round the clock rebooting nodes as jobs drained to > clear it up. I would have killed in that instance for a way to cancel > the PIT job (the chmgr trick didn't work). It looks like you might > actually be able to do this with mmfsadm, although how wise this is, I > do not know (kinda curious about that). > > Here's an example. I kicked off a restripe and then ctrl-c'd it on a > client node. Then ran these commands from the fs manager: > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list > JobId 785979015170 PitJobStatus PIT_JOB_RUNNING progress 0.00 > debug: statusListP D40E2C70 > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal stop > 785979015170 > debug: statusListP 0 > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list > JobId 785979015170 PitJobStatus PIT_JOB_STOPPING progress 4.01 > debug: statusListP D4013E70 > > ... some time passes ... > > root at loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list > debug: statusListP 0 > > Interesting. > > -Aaron > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From aaron.s.knister at nasa.gov Wed Aug 17 02:46:39 2016 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Wed, 17 Aug 2016 01:46:39 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? Message-ID: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. 
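The mmpmon counters mentioned here can be sampled over an interval to see which filesystem the traffic is actually landing on. A minimal sketch using fs_io_s, the documented per-filesystem request (the counters are cumulative, so successive samples have to be differenced; run as root on each NSD server of interest):

  echo fs_io_s > /tmp/mmpmon.req
  /usr/lpp/mmfs/bin/mmpmon -p -i /tmp/mmpmon.req -r 6 -d 10000

Here -r 6 -d 10000 takes six samples ten seconds apart, and -p produces output that is easier to parse in a script.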
I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Aug 17 12:45:04 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 17 Aug 2016 11:45:04 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> References: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> Message-ID: <7BFE2D50-9AA9-4A78-A05A-08D5DEB0A2E1@nuance.com> Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" Reply-To: gpfsug main discussion list Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From volobuev at us.ibm.com Wed Aug 17 21:34:57 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Wed, 17 Aug 2016 13:34:57 -0700 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <7BFE2D50-9AA9-4A78-A05A-08D5DEB0A2E1@nuance.com> References: <5F910253243E6A47B81A9A2EB424BBA101CC6514@NDJSMBX404.ndc.nasa.gov> <7BFE2D50-9AA9-4A78-A05A-08D5DEB0A2E1@nuance.com> Message-ID: Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri From: "Oesterlin, Robert" To: gpfsug main discussion list , Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" Reply-To: gpfsug main discussion list Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. 
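Until something along the lines of that RFE lands in mmdiag, the only way to see the queues is the dump interface discussed in this thread, and it should be treated accordingly: capture a dump occasionally rather than polling it, and keep the cautions about unsupported commands in mind. A minimal sketch, run on an NSD server:

  /usr/lpp/mmfs/bin/mmfsadm saferdump nsd > /tmp/nsd.dump
  grep -i queue /tmp/nsd.dump | head -50

The exact layout of the queue lines varies by release, so inspect a dump by hand before building any monitoring on top of it.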
An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From SAnderson at convergeone.com Wed Aug 17 22:11:25 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Wed, 17 Aug 2016 21:11:25 +0000 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Message-ID: <1471468285737.63407@convergeone.com> ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 [sig] [RH_CertifiedSysAdmin_CMYK] [Linux on IBM Power Systems - Sales 2016] [IBM Spectrum Storage - Sales 2016] NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 14134 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.jpg Type: image/jpeg Size: 2593 bytes Desc: image003.jpg URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image005.png Type: image/png Size: 11635 bytes Desc: image005.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image007.png Type: image/png Size: 11505 bytes Desc: image007.png URL: From YARD at il.ibm.com Thu Aug 18 00:11:52 2016 From: YARD at il.ibm.com (Yaron Daniel) Date: Thu, 18 Aug 2016 02:11:52 +0300 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive In-Reply-To: <1471468285737.63407@convergeone.com> References: <1471468285737.63407@convergeone.com> Message-ID: Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? 
Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services - Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 14134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 2593 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11635 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11505 bytes Desc: not available URL: From SAnderson at convergeone.com Thu Aug 18 02:51:38 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Thu, 18 Aug 2016 01:51:38 +0000 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive In-Reply-To: References: <1471468285737.63407@convergeone.com>, Message-ID: <1471485097896.49269@convergeone.com> ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? 
Regards ________________________________ Yaron Daniel 94 Em Ha'Moshavot Rd [cid:_1_0DDE2A700DDE24DC007F6D32C2258012] Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 [sig] [RH_CertifiedSysAdmin_CMYK] [Linux on IBM Power Systems - Sales 2016] [IBM Spectrum Storage - Sales 2016] NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 1851 bytes Desc: ATT00001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00002.png Type: image/png Size: 14134 bytes Desc: ATT00002.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00003.jpg Type: image/jpeg Size: 2593 bytes Desc: ATT00003.jpg URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00004.png Type: image/png Size: 11635 bytes Desc: ATT00004.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00005.png Type: image/png Size: 11505 bytes Desc: ATT00005.png URL: From YARD at il.ibm.com Thu Aug 18 04:56:50 2016 From: YARD at il.ibm.com (Yaron Daniel) Date: Thu, 18 Aug 2016 06:56:50 +0300 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive In-Reply-To: <1471485097896.49269@convergeone.com> References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com> Message-ID: So - the procedure you are asking related to Samba. 
Please check at redhat Site the process of upgrade Samba - u will need to backup the tdb files and restore them. But pay attention that the Samba ids will remain the same after moving to CES - please review the Authentication Section. Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services - Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: gpfsug main discussion list Date: 08/18/2016 04:52 AM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 14134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 2593 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11635 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 11505 bytes Desc: not available URL: From Robert.Oesterlin at nuance.com Thu Aug 18 15:47:25 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 18 Aug 2016 14:47:25 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? Message-ID: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> Done. Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) ID: 93260 Headline: Give sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics Submitted on: 18 Aug 2016, 10:46 AM Eastern Time (ET) Brand: Servers and Systems Software Product: Spectrum Scale (formerly known as GPFS) - Public RFEs Link: http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid 507-269-0413 From: on behalf of Yuri L Volobuev Reply-To: gpfsug main discussion list Date: Wednesday, August 17, 2016 at 3:34 PM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri [nactive hide details for "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---]"Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latenc From: "Oesterlin, Robert" To: gpfsug main discussion list , Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. 
In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" Reply-To: gpfsug main discussion list Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: From bbanister at jumptrading.com Thu Aug 18 16:00:21 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 18 Aug 2016 15:00:21 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> References: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> Great stuff? I added my vote, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Oesterlin, Robert Sent: Thursday, August 18, 2016 9:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Done. 
Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) ID: 93260 Headline: Give sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics Submitted on: 18 Aug 2016, 10:46 AM Eastern Time (ET) Brand: Servers and Systems Software Product: Spectrum Scale (formerly known as GPFS) - Public RFEs Link: http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid 507-269-0413 From: > on behalf of Yuri L Volobuev > Reply-To: gpfsug main discussion list > Date: Wednesday, August 17, 2016 at 3:34 PM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri [nactive hide details for "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---]"Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latenc From: "Oesterlin, Robert" > To: gpfsug main discussion list >, Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" > Reply-To: gpfsug main discussion list > Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. 
What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 106 bytes Desc: image001.gif URL: From mimarsh2 at vt.edu Thu Aug 18 16:15:38 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Thu, 18 Aug 2016 11:15:38 -0400 Subject: [gpfsug-discuss] NSD Server BIOS setting - snoop mode Message-ID: All, Is there any best practice or recommendation for the Snoop Mode memory setting for NSD Servers? Default is Early Snoop. On compute nodes, I am using Cluster On Die, which creates 2 NUMA nodes per processor. This setup has 2 x 16-core Broadwell processors in each NSD server. Brian -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmcpheeters at anl.gov Thu Aug 18 16:14:11 2016 From: gmcpheeters at anl.gov (McPheeters, Gordon) Date: Thu, 18 Aug 2016 15:14:11 +0000 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <97F08A04-D7C4-4985-840F-DC026E8606F4@anl.gov> Got my vote - thanks Robert. Gordon McPheeters ALCF Storage (630) 252-6430 gmcpheeters at anl.gov On Aug 18, 2016, at 10:00 AM, Bryan Banister > wrote: Great stuff? 
I added my vote, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Oesterlin, Robert Sent: Thursday, August 18, 2016 9:47 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Done. Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) ID: 93260 Headline: Give sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics Submitted on: 18 Aug 2016, 10:46 AM Eastern Time (ET) Brand: Servers and Systems Software Product: Spectrum Scale (formerly known as GPFS) - Public RFEs Link: http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid 507-269-0413 From: > on behalf of Yuri L Volobuev > Reply-To: gpfsug main discussion list > Date: Wednesday, August 17, 2016 at 3:34 PM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? Unfortunately, at the moment there's no safe mechanism to show the usage statistics for different NSD queues. "mmfsadm saferdump nsd" as implemented doesn't acquire locks when parsing internal data structures. Now, NSD data structures are fairly static, as much things go, so the risk of following a stale pointer and hitting a segfault isn't particularly significant. I don't think I remember ever seeing mmfsd crash with NSD dump code on the stack. That said, this isn't code that's tested and known to be safe for production use. I haven't seen a case myself where an mmfsd thread gets stuck running this dump command, either, but Bob has. If that condition ever reoccurs, I'd be interested in seeing debug data. I agree that there's value in giving a sysadmin insight into the inner workings of the NSD server machinery, in particular the queue dynamics. mmdiag should be enhanced to allow this. That'd be a very reasonable (and doable) RFE. yuri "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latenc From: "Oesterlin, Robert" > To: gpfsug main discussion list >, Date: 08/17/2016 04:45 AM Subject: Re: [gpfsug-discuss] Monitor NSD server queue? Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ Hi Aaron You did a perfect job of explaining a situation I've run into time after time - high latency on the disk subsystem causing a backup in the NSD queues. I was doing what you suggested not to do - "mmfsadm saferdump nsd' and looking at the queues. In my case 'mmfsadm saferdump" would usually work or hang, rather than kill mmfsd. But - the hang usually resulted it a tied up thread in mmfsd, so that's no good either. I wish I had better news - this is the only way I've found to get visibility to these queues. IBM hasn't seen fit to gives us a way to safely look at these. I personally think it's a bug that we can't safely dump these structures, as they give insight as to what's actually going on inside the NSD server. Yuri, Sven - thoughts? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: > on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" > Reply-To: gpfsug main discussion list > Date: Tuesday, August 16, 2016 at 8:46 PM To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? Hi Everyone, We ran into a rather interesting situation over the past week. 
We had a job that was pounding the ever loving crap out of one of our filesystems (called dnb02) doing about 15GB/s of reads. We had other jobs experience a slowdown on a different filesystem (called dnb41) that uses entirely separate backend storage. What I can't figure out is why this other filesystem was affected. I've checked IB bandwidth and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth congestion, looked at the mmpmon nsd_ds counters (including disk request wait time), and checked out the disk iowait values from collectl. I simply can't account for the slowdown on the other filesystem. The only thing I can think of is the high latency on dnb02's NSDs caused the mmfsd NSD queues to back up. Here's my question-- how can I monitor the state of th NSD queues? I can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the queues and their status. I'm just not sure calling saferdump NSD every 10 seconds to monitor this data is going to end well. I've seen saferdump NSD cause mmfsd to die and that's from a task we only run every 6 hours that calls saferdump NSD. Any thoughts/ideas here would be great. Thanks! -Aaron_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Thu Aug 18 18:50:12 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 18 Aug 2016 10:50:12 -0700 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive In-Reply-To: <1471485097896.49269@convergeone.com> References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com> Message-ID: Samba as supported in Spectrum Scale uses the "autorid" module for creating internal id mappings (see man idmap_autorid for some details). Officially supported are also methods to retrieve id mappings from an external server: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_adfofile.htm The earlier email states that they have a " .tdb backend for id mapping on their current server. ". How exactly is that configured in Samba? Which Samba version is used here? So the plan is to upgrade the cluster, and then switch to the Samba version provided with CES? Should the same id mappings be used? 
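The first two of those questions can be answered on the existing node with a couple of commands; a small sketch (paths and package layout vary by distribution):

  smbd -V                                             # exact Samba version in use
  testparm -s 2>/dev/null | grep -iE 'idmap|passdb'   # effective id-mapping and passdb settings

testparm prints the effective configuration after defaults are applied, which is usually more reliable than reading smb.conf by eye.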
Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/17/2016 06:52 PM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From SAnderson at convergeone.com Thu Aug 18 19:11:02 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Thu, 18 Aug 2016 18:11:02 +0000 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive In-Reply-To: References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com> Message-ID: Correct. We are upgrading their existing configuration and want to switch to CES provided Samba. They are using Samba 3.6.24 currently on RHEL 6.6. 
Here is the head of the smb.conf file: =================================================== [global] workgroup = SL1 netbios name = SLTLTFSEE server string = LTFSEE Server realm = removed.ORG security = ads encrypt passwords = yes default = global browseable = no socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 idmap config * : backend = tdb idmap config * : range = 1000000-9000000 template shell = /bash/bin writable = yes allow trusted domains = yes client ntlmv2 auth = yes auth methods = guest sam winbind passdb backend = tdbsam groupdb:backend = tdb interfaces = eth1 lo username map = /etc/samba/smbusers map to guest = bad uid guest account = nobody ===================================================== Does that make sense? Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: Thursday, August 18, 2016 11:50 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive Samba as supported in Spectrum Scale uses the "autorid" module for creating internal id mappings (see man idmap_autorid for some details). Officially supported are also methods to retrieve id mappings from an external server: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_adfofile.htm The earlier email states that they have a " .tdb backend for id mapping on their current server. ". How exactly is that configured in Samba? Which Samba version is used here? So the plan is to upgrade the cluster, and then switch to the Samba version provided with CES? Should the same id mappings be used? Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/17/2016 06:52 PM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. 
I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. From Kevin.Buterbaugh at Vanderbilt.Edu Thu Aug 18 20:05:03 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Thu, 18 Aug 2016 19:05:03 +0000 Subject: [gpfsug-discuss] Please ignore - debugging an issue Message-ID: Please ignore. I am working with the list admins on an issue and need to send an e-mail to the list to duplicate the problem. I apologize that this necessitates this e-mail to the list. Thanks? ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Thu Aug 18 20:43:50 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 18 Aug 2016 12:43:50 -0700 Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive In-Reply-To: References: <1471468285737.63407@convergeone.com>, <1471485097896.49269@convergeone.com> Message-ID: There are a few points to consider here: CES uses Samba in cluster mode with ctdb. That means that the tdb database is shared through ctdb on all protocol nodes, and the internal format is slightly different since it contains additional information for tracking the cross-node status of the individual records. Spectrum Scale officially supports the autorid module for internal id mapping. That approach is different than the older idmap_tdb since it basically only has one record per AD domain, and not one record per user or group. This is known to scale better in environments where many users and groups require id mappings. The downside is that data from idmap_tdb cannot be directly imported. 
While not officially supported Spectrum Scale also ships the idmap_tdb module. You could configure authentication and internal id mapping on Spectrum Scale, and then overwrite the config manually to use the old idmap module (the idmap-range-size is required, but not relevant later on): mmuserauth service create ... --idmap-range 1000000-9000000 --idmap-range-size 100000 /usr/lpp/mmfs/bin/net conf setparm global 'idmap config * : backend' tdb mmdsh -N CesNodes systemctl restart gpfs-winbind mmdsh -N CesNodes /usr/lpp/mmfs/bin/net cache flush With the old Samba, export the idmap data to a file: net idmap dump > idmap-dump.txt And on a node running CES Samba import that data, and remove any old cached entries: /usr/lpp/mmfs/bin/net idmap restore idmap-dump.txt mmdsh -N CesNodes /usr/lpp/mmfs/bin/net cache flush Just to be clear: This is untested and if there is a problem with the id mapping in that configuration, it will likely be pointed to the unsupported configuration. The way to request this as an official feature would be through a RFE, although i cannot say whether that would be picked up by product management. Another option would be creating the id mappings in the Active Directory records or in a external LDAP server based on the old mappings, and point the CES Samba to that data. That would again be a supported configuration. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: Christof Schmitt/Tucson/IBM at IBMUS Cc: gpfsug main discussion list Date: 08/18/2016 11:11 AM Subject: RE: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive Correct. We are upgrading their existing configuration and want to switch to CES provided Samba. They are using Samba 3.6.24 currently on RHEL 6.6. Here is the head of the smb.conf file: =================================================== [global] workgroup = SL1 netbios name = SLTLTFSEE server string = LTFSEE Server realm = removed.ORG security = ads encrypt passwords = yes default = global browseable = no socket options = TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 TCP_KEEPIDLE=240 TCP_KEEPINTVL=15 idmap config * : backend = tdb idmap config * : range = 1000000-9000000 template shell = /bash/bin writable = yes allow trusted domains = yes client ntlmv2 auth = yes auth methods = guest sam winbind passdb backend = tdbsam groupdb:backend = tdb interfaces = eth1 lo username map = /etc/samba/smbusers map to guest = bad uid guest account = nobody ===================================================== Does that make sense? Regards, SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: Thursday, August 18, 2016 11:50 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE toSpectrumArchive Samba as supported in Spectrum Scale uses the "autorid" module for creating internal id mappings (see man idmap_autorid for some details). Officially supported are also methods to retrieve id mappings from an external server: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_adfofile.htm The earlier email states that they have a " .tdb backend for id mapping on their current server. ". How exactly is that configured in Samba? Which Samba version is used here? 
So the plan is to upgrade the cluster, and then switch to the Samba version provided with CES? Should the same id mappings be used? Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/17/2016 06:52 PM Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?We are currently running samba on the 3.5 node, but wanting to migrate everything into using CES once we get everything up to 4.2. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Yaron Daniel Sent: Wednesday, August 17, 2016 5:11 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Hi Do u use CES protocols nodes ? Or Samba on each of the Server ? Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services- Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 08/18/2016 12:11 AM Subject: [gpfsug-discuss] Migrate 3.5 to 4.2 and LTFSEE to Spectrum Archive Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I am in process of migrating from 3.5 to 4.2 and LTFSEE to Spectrum Archive. 1 node cluster (currently) connected to V3700 storage and TS4500 backend. We have upgraded their 2nd node to 4.2 and have successfully tested joining the domain, created smb shares, and validated their ability to access and control permissions on those shares. They are using .tdb backend for id mapping on their current server. I'm looking to discuss with someone the best method of migrating those tdb databases to the second server, or understand how Spectrum Scale does id mapping and where it stores that information. Any hints would be greatly appreciated. Regards, SHAUN ANDERSON STORAGE ARCHITECT O208.577.2112 M214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. 
If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. From jez.tucker at gpfsug.org Thu Aug 18 20:57:00 2016 From: jez.tucker at gpfsug.org (Jez Tucker) Date: Thu, 18 Aug 2016 20:57:00 +0100 Subject: [gpfsug-discuss] If you are experiencing mail stuck in spam / bounces Message-ID: Hi all As the discussion group is a mailing list, it is possible that members can experience the list traffic being interpreted as spam. In such instances, you may experience better results if you whitelist the mailing list addresses or create a 'Not Spam' filter (E.G. gmail) gpfsug-discuss at spectrumscale.org gpfsug-discuss at gpfsug.org You can test that you can receive a response from the mailing list server by sending an email to: gpfsug-discuss-request at spectrumscale.org with the subject of: help Should you experience further trouble, please ping us at: gpfsug-discuss-owner at spectrumscale.org All the best, Jez From aaron.s.knister at nasa.gov Fri Aug 19 05:12:26 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 19 Aug 2016 00:12:26 -0400 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> References: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> Message-ID: <140fab1a-e043-5c20-eb1f-d5ef7e91d89d@nasa.gov> Figured I'd throw in my "me too!" as well. We have ~3500 nodes and 60 gpfs server nodes and we've done several rounds of rolling upgrades starting with 3.5.0.19 -> 3.5.0.24. We've had the cluster with a mix of both versions for quite some time (We're actually in that state right now as it would happen and have been for several months). I've not seen any issue with it. Of course, as Richard alluded to, its good to check the release notes :) -Aaron On 8/15/16 8:45 AM, Buterbaugh, Kevin L wrote: > Richard, > > I will second what Bob said with one caveat ? on one occasion we had an > issue with our multi-cluster setup because the PTF?s were incompatible. > However, that was clearly documented in the release notes, which we > obviously hadn?t read carefully enough. > > While we generally do rolling upgrades over a two to three week period, > we have run for months with clients at differing PTF levels. HTHAL? > > Kevin > >> On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert >> > wrote: >> >> In general, yes, it's common practice to do the 'rolling upgrades'. If >> I had to do my whole cluster at once, with an outage, I'd probably >> never upgrade. :) >> >> >> Bob Oesterlin >> Sr Storage Engineer, Nuance HPC Grid >> >> >> *From: *> > on behalf of >> "Sobey, Richard A" > > >> *Reply-To: *gpfsug main discussion list >> > > >> *Date: *Monday, August 15, 2016 at 4:59 AM >> *To: *"'gpfsug-discuss at spectrumscale.org >> '" >> > > >> *Subject: *[EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence >> problems? >> >> Hi all, >> >> If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to >> 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger >> it over a few days, perhaps up to 2 weeks or will I run into problems >> if they?re on different versions? >> >> Cheers >> >> Richard >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ? 
> Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu > - (615)875-9633 > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From aaron.s.knister at nasa.gov Fri Aug 19 05:13:06 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 19 Aug 2016 00:13:06 -0400 Subject: [gpfsug-discuss] Minor GPFS versions coexistence problems? In-Reply-To: <140fab1a-e043-5c20-eb1f-d5ef7e91d89d@nasa.gov> References: <9691E717-690C-48C7-8017-BA6F001B5461@vanderbilt.edu> <140fab1a-e043-5c20-eb1f-d5ef7e91d89d@nasa.gov> Message-ID: <70e33e6d-cd6b-5a5e-1e2d-f0ad16def5f4@nasa.gov> Oops... I meant Kevin, not Richard. On 8/19/16 12:12 AM, Aaron Knister wrote: > Figured I'd throw in my "me too!" as well. We have ~3500 nodes and 60 > gpfs server nodes and we've done several rounds of rolling upgrades > starting with 3.5.0.19 -> 3.5.0.24. We've had the cluster with a mix of > both versions for quite some time (We're actually in that state right > now as it would happen and have been for several months). I've not seen > any issue with it. Of course, as Richard alluded to, its good to check > the release notes :) > > -Aaron > > On 8/15/16 8:45 AM, Buterbaugh, Kevin L wrote: >> Richard, >> >> I will second what Bob said with one caveat ? on one occasion we had an >> issue with our multi-cluster setup because the PTF?s were incompatible. >> However, that was clearly documented in the release notes, which we >> obviously hadn?t read carefully enough. >> >> While we generally do rolling upgrades over a two to three week period, >> we have run for months with clients at differing PTF levels. HTHAL? >> >> Kevin >> >>> On Aug 15, 2016, at 6:22 AM, Oesterlin, Robert >>> > >>> wrote: >>> >>> In general, yes, it's common practice to do the 'rolling upgrades'. If >>> I had to do my whole cluster at once, with an outage, I'd probably >>> never upgrade. :) >>> >>> >>> Bob Oesterlin >>> Sr Storage Engineer, Nuance HPC Grid >>> >>> >>> *From: *>> > on behalf of >>> "Sobey, Richard A" >> > >>> *Reply-To: *gpfsug main discussion list >>> >> > >>> *Date: *Monday, August 15, 2016 at 4:59 AM >>> *To: *"'gpfsug-discuss at spectrumscale.org >>> '" >>> >> > >>> *Subject: *[EXTERNAL] [gpfsug-discuss] Minor GPFS versions coexistence >>> problems? >>> >>> Hi all, >>> >>> If I wanted to upgrade my NSD nodes one at a time from 3.5.0.22 to >>> 3.5.0.27 (or whatever the latest in that branch is) am I ok to stagger >>> it over a few days, perhaps up to 2 weeks or will I run into problems >>> if they?re on different versions? >>> >>> Cheers >>> >>> Richard >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ? 
>> Kevin Buterbaugh - Senior System Administrator >> Vanderbilt University - Advanced Computing Center for Research and >> Education >> Kevin.Buterbaugh at vanderbilt.edu >> - (615)875-9633 >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bdeluca at gmail.com Fri Aug 19 05:15:00 2016 From: bdeluca at gmail.com (Ben De Luca) Date: Fri, 19 Aug 2016 07:15:00 +0300 Subject: [gpfsug-discuss] If you are experiencing mail stuck in spam / bounces In-Reply-To: References: Message-ID: Hey Jez, Its because the mailing list doesn't have an SPF record in your DNS, being neutral is a good way to be picked up as spam. On 18 August 2016 at 22:57, Jez Tucker wrote: > Hi all > > As the discussion group is a mailing list, it is possible that members can > experience the list traffic being interpreted as spam. > > > In such instances, you may experience better results if you whitelist the > mailing list addresses or create a 'Not Spam' filter (E.G. gmail) > > gpfsug-discuss at spectrumscale.org > > gpfsug-discuss at gpfsug.org > > > You can test that you can receive a response from the mailing list server by > sending an email to: gpfsug-discuss-request at spectrumscale.org with the > subject of: help > > > Should you experience further trouble, please ping us at: > gpfsug-discuss-owner at spectrumscale.org > > > All the best, > > > Jez > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From jez.tucker at gpfsug.org Fri Aug 19 08:51:20 2016 From: jez.tucker at gpfsug.org (Jez Tucker) Date: Fri, 19 Aug 2016 08:51:20 +0100 Subject: [gpfsug-discuss] If you are experiencing mail stuck in spam / bounces In-Reply-To: References: Message-ID: <0c9d81b2-ac41-b6a5-e4f1-a816558711b7@gpfsug.org> Hi Yes, we looked at that some time ago and I recall we had an issues with setting up the SPF. However, probably a good time as any to look at it again. I'll ping Arif and Simon and they can look at their respective domains. Jez On 19/08/16 05:15, Ben De Luca wrote: > Hey Jez, > Its because the mailing list doesn't have an SPF record in your > DNS, being neutral is a good way to be picked up as spam. > > > > On 18 August 2016 at 22:57, Jez Tucker wrote: >> Hi all >> >> As the discussion group is a mailing list, it is possible that members can >> experience the list traffic being interpreted as spam. >> >> >> In such instances, you may experience better results if you whitelist the >> mailing list addresses or create a 'Not Spam' filter (E.G. 
gmail) >> >> gpfsug-discuss at spectrumscale.org >> >> gpfsug-discuss at gpfsug.org >> >> >> You can test that you can receive a response from the mailing list server by >> sending an email to: gpfsug-discuss-request at spectrumscale.org with the >> subject of: help >> >> >> Should you experience further trouble, please ping us at: >> gpfsug-discuss-owner at spectrumscale.org >> >> >> All the best, >> >> >> Jez >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From aaron.s.knister at nasa.gov Fri Aug 19 23:06:57 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Fri, 19 Aug 2016 18:06:57 -0400 Subject: [gpfsug-discuss] Monitor NSD server queue? In-Reply-To: <97F08A04-D7C4-4985-840F-DC026E8606F4@anl.gov> References: <2702740E-EC6A-4998-BA1A-35A1EF5B5EDC@nuance.com> <21BC488F0AEA2245B2C3E83FC0B33DBB062FC26E@CHI-EXCHANGEW1.w2k.jumptrading.com> <97F08A04-D7C4-4985-840F-DC026E8606F4@anl.gov> Message-ID: <5ca238de-bb95-2854-68bd-36d1b8df2810@nasa.gov> Thanks everyone! I also have a PMR open for this, so hopefully the RFE gets some traction. On 8/18/16 11:14 AM, McPheeters, Gordon wrote: > Got my vote - thanks Robert. > > > Gordon McPheeters > ALCF Storage > (630) 252-6430 > gmcpheeters at anl.gov > > > >> On Aug 18, 2016, at 10:00 AM, Bryan Banister >> > wrote: >> >> Great stuff? I added my vote, >> -Bryan >> >> *From:* gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] *On >> Behalf Of *Oesterlin, Robert >> *Sent:* Thursday, August 18, 2016 9:47 AM >> *To:* gpfsug main discussion list >> *Subject:* Re: [gpfsug-discuss] Monitor NSD server queue? >> >> Done. >> >> Notification generated at: 18 Aug 2016, 10:46 AM Eastern Time (ET) >> >> ID: 93260 >> Headline: Give sysadmin insight >> into the inner workings of the NSD server machinery, in particular the >> queue dynamics >> Submitted on: 18 Aug 2016, 10:46 AM Eastern >> Time (ET) >> Brand: Servers and Systems >> Software >> Product: Spectrum Scale (formerly >> known as GPFS) - Public RFEs >> >> Link: >> http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=93260 >> >> >> Bob Oesterlin >> Sr Storage Engineer, Nuance HPC Grid >> 507-269-0413 >> >> >> *From: *> > on behalf of Yuri L >> Volobuev > >> *Reply-To: *gpfsug main discussion list >> > > >> *Date: *Wednesday, August 17, 2016 at 3:34 PM >> *To: *gpfsug main discussion list > > >> *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Monitor NSD server queue? >> >> >> Unfortunately, at the moment there's no safe mechanism to show the >> usage statistics for different NSD queues. "mmfsadm saferdump nsd" as >> implemented doesn't acquire locks when parsing internal data >> structures. Now, NSD data structures are fairly static, as much things >> go, so the risk of following a stale pointer and hitting a segfault >> isn't particularly significant. I don't think I remember ever seeing >> mmfsd crash with NSD dump code on the stack. That said, this isn't >> code that's tested and known to be safe for production use. I haven't >> seen a case myself where an mmfsd thread gets stuck running this dump >> command, either, but Bob has. If that condition ever reoccurs, I'd be >> interested in seeing debug data. 
>> >> I agree that there's value in giving a sysadmin insight into the inner >> workings of the NSD server machinery, in particular the queue >> dynamics. mmdiag should be enhanced to allow this. That'd be a very >> reasonable (and doable) RFE. >> >> yuri >> >> "Oesterlin, Robert" ---08/17/2016 04:45:30 AM---Hi Aaron >> You did a perfect job of explaining a situation I've run into time >> after time - high latenc >> >> From: "Oesterlin, Robert" > > >> To: gpfsug main discussion list > >, >> Date: 08/17/2016 04:45 AM >> Subject: Re: [gpfsug-discuss] Monitor NSD server queue? >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> >> >> ------------------------------------------------------------------------ >> >> >> >> >> Hi Aaron >> >> You did a perfect job of explaining a situation I've run into time >> after time - high latency on the disk subsystem causing a backup in >> the NSD queues. I was doing what you suggested not to do - "mmfsadm >> saferdump nsd' and looking at the queues. In my case 'mmfsadm >> saferdump" would usually work or hang, rather than kill mmfsd. But - >> the hang usually resulted it a tied up thread in mmfsd, so that's no >> good either. >> >> I wish I had better news - this is the only way I've found to get >> visibility to these queues. IBM hasn't seen fit to gives us a way to >> safely look at these. I personally think it's a bug that we can't >> safely dump these structures, as they give insight as to what's >> actually going on inside the NSD server. >> >> Yuri, Sven - thoughts? >> >> >> Bob Oesterlin >> Sr Storage Engineer, Nuance HPC Grid >> >> >> >> *From: *> > on behalf of >> "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" >> >* >> Reply-To: *gpfsug main discussion list >> > >* >> Date: *Tuesday, August 16, 2016 at 8:46 PM* >> To: *gpfsug main discussion list > >* >> Subject: *[EXTERNAL] [gpfsug-discuss] Monitor NSD server queue? >> >> Hi Everyone, >> >> We ran into a rather interesting situation over the past week. We had >> a job that was pounding the ever loving crap out of one of our >> filesystems (called dnb02) doing about 15GB/s of reads. We had other >> jobs experience a slowdown on a different filesystem (called dnb41) >> that uses entirely separate backend storage. What I can't figure out >> is why this other filesystem was affected. I've checked IB bandwidth >> and congestion, Fibre channel bandwidth and errors, Ethernet bandwidth >> congestion, looked at the mmpmon nsd_ds counters (including disk >> request wait time), and checked out the disk iowait values from >> collectl. I simply can't account for the slowdown on the other >> filesystem. The only thing I can think of is the high latency on >> dnb02's NSDs caused the mmfsd NSD queues to back up. >> >> Here's my question-- how can I monitor the state of th NSD queues? I >> can't find anything in mmdiag. An mmfsadm saferdump NSD shows me the >> queues and their status. I'm just not sure calling saferdump NSD every >> 10 seconds to monitor this data is going to end well. I've seen >> saferdump NSD cause mmfsd to die and that's from a task we only run >> every 6 hours that calls saferdump NSD. >> >> Any thoughts/ideas here would be great. >> >> Thanks! 
>> >> -Aaron_______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> >> ------------------------------------------------------------------------ >> >> Note: This email is for the confidential use of the named addressee(s) >> only and may contain proprietary, confidential or privileged >> information. If you are not the intended recipient, you are hereby >> notified that any review, dissemination or copying of this email is >> strictly prohibited, and to please notify the sender immediately and >> destroy this email and any attachments. Email transmission cannot be >> guaranteed to be secure or error-free. The Company, therefore, does >> not make any guarantees as to the completeness or accuracy of this >> email or any attachments. This email is for informational purposes >> only and does not constitute a recommendation, offer, request or >> solicitation of any kind to buy, sell, subscribe, redeem or perform >> any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From r.sobey at imperial.ac.uk Mon Aug 22 12:59:16 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 22 Aug 2016 11:59:16 +0000 Subject: [gpfsug-discuss] CES and mmuserauth command Message-ID: Hi all, We're just about to start testing a new CES 4.2.0 cluster and at the stage of "joining" the cluster to our AD. What's the bare minimum we need to get going with this? My Windows guy (who is more Linux but whatever) has suggested the following: mmuserauth service create --type ad --data-access-method file --netbios-name store --user-name USERNAME --password --enable-nfs-kerberos --enable-kerberos --servers list,of,servers --idmap-range-size 1000000 --idmap-range 3000000 - 3500000 --unixmap-domains 'DOMAIN(500 - 2000000)' He has also asked what the following is: --idmap-role ??? --idmap-range-size ?? All our LDAP GID/UIDs are coming from a system outside of GPFS so do we leave this blank, or say master Or, now I've re-read and mmuserauth page, is this purely for when you have AFM relationships and one GPFS cluster (the subordinate / the second cluster) gets its UIDs and GIDs from another GPFS cluster (the master / the first one)? For idmap-range-size is this essentially the highest number of users and groups you can have defined within Spectrum Scale? (I love how I'm using GPFS and SS interchangeably.. forgive me!) Many thanks Richard Richard Sobey Storage Area Network (SAN) Analyst Technical Operations, ICT Imperial College London South Kensington 403, City & Guilds Building London SW7 2AZ Tel: +44 (0)20 7594 6915 Email: r.sobey at imperial.ac.uk http://www.imperial.ac.uk/admin-services/ict/ -------------- next part -------------- An HTML attachment was scrubbed... 
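For reference, the overall shape of that join, plus a quick sanity check afterwards, would look roughly like the sketch below. The server name, the ranges and the DOMAIN placeholder are illustrative assumptions rather than recommended values, and the command should prompt for the bind account's password when --password is not supplied on the command line:

mmuserauth service create --data-access-method file --type ad \
    --netbios-name store --user-name USERNAME \
    --servers ad1.example.com \
    --idmap-role master \
    --idmap-range 10000000-299999999 --idmap-range-size 1000000 \
    --enable-nfs-kerberos \
    --unixmap-domains 'DOMAIN(500-2000000)'

# confirm what was configured and that AD users resolve on the protocol nodes
mmuserauth service list
id 'DOMAIN\jbloggs'    # hypothetical user; UID/GID should come from the AD RFC2307 attributes

The mmuserauth service list output confirms the configured server, ranges and unixmap domains, and the id lookup shows whether the UID and GID actually resolve from the directory rather than from an autorid allocation.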
URL: From r.sobey at imperial.ac.uk Mon Aug 22 14:28:01 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Mon, 22 Aug 2016 13:28:01 +0000 Subject: [gpfsug-discuss] CES mmsmb options Message-ID: Related to my previous question in so far as it's to do with CES, what's this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static... for example log size / location / dmapi support? I'm surely missing something obvious. It's SS 4.2.0 btw. Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From taylorm at us.ibm.com Tue Aug 23 00:30:10 2016 From: taylorm at us.ibm.com (Michael L Taylor) Date: Mon, 22 Aug 2016 16:30:10 -0700 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: Looks like there is a per export and a global listing. These are values that can be set per export : /usr/lpp/mmfs/bin/mmsmb export change --key-info supported Supported smb options with allowed values: admin users = any // any valid user browseable = yes, no comment = any // A free text description of the export. csc policy = manual, disable, documents, programs fileid:algorithm = fsname, hostname, fsname_nodirs, fsname_norootdir gpfs:leases = yes, no gpfs:recalls = yes, no gpfs:sharemodes = yes, no gpfs:syncio = yes, no hide unreadable = yes, no oplocks = yes, no posix locking = yes, no read only = yes, no smb encrypt = auto, default, mandatory, disabled syncops:onclose = yes, no These are the values that are set globally: /usr/lpp/mmfs/bin/mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Tue Aug 23 03:23:40 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Mon, 22 Aug 2016 22:23:40 -0400 Subject: [gpfsug-discuss] GPFS FPO Message-ID: Does anyone have any experiences to share (good or bad) about setting up and utilizing FPO for hadoop compute on top of GPFS? -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 23 03:37:00 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 22 Aug 2016 22:37:00 -0400 Subject: [gpfsug-discuss] GPFS FPO In-Reply-To: References: Message-ID: Yes, indeed. Note that these are my personal opinions. It seems to work quite well and it's not terribly hard to set up or get running. That said, if you've got a traditional HPC cluster with reasonably good bandwidth (and especially if your data is already on the HPC cluster) I wouldn't bother with FPO and just use something like magpie (https://github.com/LLNL/magpie) to run your hadoopy workload on GPFS on your traditional HPC cluster. I believe FPO (and by extension data locality) is important when the available bandwidth between your clients and servers/disks (in a traditional GPFS environment) is less than the bandwidth available within a node (e.g. between your local disks and the host CPU). -Aaron On 8/22/16 10:23 PM, Brian Marshall wrote: > Does anyone have any experiences to share (good or bad) about setting up > and utilizing FPO for hadoop compute on top of GPFS? 
> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From mimarsh2 at vt.edu Tue Aug 23 12:56:22 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Tue, 23 Aug 2016 07:56:22 -0400 Subject: [gpfsug-discuss] GPFS FPO In-Reply-To: References: Message-ID: Aaron, Do you have experience running this on native GPFS? The docs say Lustre and any NFS filesystem. Thanks, Brian On Aug 22, 2016 10:37 PM, "Aaron Knister" wrote: > Yes, indeed. Note that these are my personal opinions. > > It seems to work quite well and it's not terribly hard to set up or get > running. That said, if you've got a traditional HPC cluster with reasonably > good bandwidth (and especially if your data is already on the HPC cluster) > I wouldn't bother with FPO and just use something like magpie ( > https://github.com/LLNL/magpie) to run your hadoopy workload on GPFS on > your traditional HPC cluster. I believe FPO (and by extension data > locality) is important when the available bandwidth between your clients > and servers/disks (in a traditional GPFS environment) is less than the > bandwidth available within a node (e.g. between your local disks and the > host CPU). > > -Aaron > > On 8/22/16 10:23 PM, Brian Marshall wrote: > >> Does anyone have any experiences to share (good or bad) about setting up >> and utilizing FPO for hadoop compute on top of GPFS? >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Tue Aug 23 13:15:24 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 23 Aug 2016 14:15:24 +0200 Subject: [gpfsug-discuss] CES and mmuserauth command In-Reply-To: References: Message-ID: Sorry to see no authoritative answers yet.. I'm doing lots of CES installations, but have not quite yet gotten the full understanding of this.. Simple stuff first: --servers You can only have one with AD. --enable-kerberos shouldn't be used, as that's only for LDAP according to the documentation. Guess kerberos is implied with AD. --idmap-role -- I've been using "master". Man-page says ID map role of a stand?alone or singular system deployment must be selected "master" What the idmap options seems to be doing is configure the idmap options for Samba. Maybe best explained by: https://wiki.samba.org/index.php/Idmap_config_ad Your suggested options will then give you the samba idmap configuration: idmap config * : rangesize = 1000000 idmap config * : range = 3000000-3500000 idmap config * : read only = no idmap:cache = no idmap config * : backend = autorid idmap config DOMAIN : schema_mode = rfc2307 idmap config DOMAIN : range = 500-2000000 idmap config DOMAIN : backend = ad Most likely you want to replace DOMAIN by your AD domain name.. 
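If you want to double check what actually lands in the CES Samba registry after running mmuserauth, a quick look on one of the protocol nodes is enough. This is only a sketch, assuming the bundled net utility under /usr/lpp/mmfs/bin as used elsewhere in this thread:

# dump the registry-based smb.conf settings and pick out the idmap lines
/usr/lpp/mmfs/bin/net conf list | grep -i 'idmap config'
# the same information from the Spectrum Scale side
mmuserauth service list

The idmap lines printed by net conf list should match the block above, with DOMAIN replaced by your AD domain name.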
So the --idmap options sets some defaults, that you probably won't care about, since all your users are likely covered by the specific "idmap config DOMAIN" config. Hope this helps somewhat, now I'll follow up with something I'm wondering myself...: Is the netbios name just a name, without any connection to anything in AD? Is the --user-name/--password a one-time used account that's only necessary when executing the mmuserauth command, or will it also be for communication between CES and AD while the services are running? -jf On Mon, Aug 22, 2016 at 1:59 PM, Sobey, Richard A wrote: > Hi all, > > > > We?re just about to start testing a new CES 4.2.0 cluster and at the stage > of ?joining? the cluster to our AD. What?s the bare minimum we need to get > going with this? My Windows guy (who is more Linux but whatever) has > suggested the following: > > > > mmuserauth service create --type ad --data-access-method file > > --netbios-name store --user-name USERNAME --password > > --enable-nfs-kerberos --enable-kerberos > > --servers list,of,servers > > --idmap-range-size 1000000 --idmap-range 3000000 - 3500000 > --unixmap-domains 'DOMAIN(500 - 2000000)' > > > > He has also asked what the following is: > > > > --idmap-role ??? > > --idmap-range-size ?? > > > > All our LDAP GID/UIDs are coming from a system outside of GPFS so do we > leave this blank, or say master Or, now I?ve re-read and mmuserauth page, > is this purely for when you have AFM relationships and one GPFS cluster > (the subordinate / the second cluster) gets its UIDs and GIDs from another > GPFS cluster (the master / the first one)? > > > > For idmap-range-size is this essentially the highest number of users and > groups you can have defined within Spectrum Scale? (I love how I?m using > GPFS and SS interchangeably.. forgive me!) > > > > Many thanks > > > > Richard > > > > > > Richard Sobey > > Storage Area Network (SAN) Analyst > Technical Operations, ICT > Imperial College London > South Kensington > 403, City & Guilds Building > London SW7 2AZ > Tel: +44 (0)20 7594 6915 > Email: r.sobey at imperial.ac.uk > http://www.imperial.ac.uk/admin-services/ict/ > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Tue Aug 23 14:58:17 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 23 Aug 2016 13:58:17 +0000 Subject: [gpfsug-discuss] Odd entries in quota listing Message-ID: In one of my file systems, I have some odd entries that seem to not be associated with a user - any ideas on the cause or how to track these down? This is a snippet from mmprepquota: Block Limits | File Limits Name type KB quota limit in_doubt grace | files quota limit in_doubt grace 2751555824 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 2348898617 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 2348895209 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 1610682073 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 536964752 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none 403325529 USR 0 1073741824 5368709120 0 none | 0 0 0 0 none Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jonathan at buzzard.me.uk Tue Aug 23 15:06:50 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 23 Aug 2016 15:06:50 +0100 Subject: [gpfsug-discuss] Odd entries in quota listing In-Reply-To: References: Message-ID: <1471961210.30100.88.camel@buzzard.phy.strath.ac.uk> On Tue, 2016-08-23 at 13:58 +0000, Oesterlin, Robert wrote: > In one of my file systems, I have some odd entries that seem to not be > associated with a user - any ideas on the cause or how to track these > down? This is a snippet from mmprepquota: > > > > Block Limits > | File Limits > > Name type KB quota limit in_doubt > grace | files quota limit in_doubt grace > > 2751555824 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 2348898617 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 2348895209 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 1610682073 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 536964752 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > > 403325529 USR 0 1073741824 5368709120 0 > none | 0 0 0 0 none > I am guessing they are quotas that have been set for users that are now deleted. GPFS stores the quota for a user under their UID, and deleting the user and all their data is not enough to remove the entry from the quota reporting, you also have to delete their quota. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. From Robert.Oesterlin at nuance.com Tue Aug 23 15:10:22 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 23 Aug 2016 14:10:22 +0000 Subject: [gpfsug-discuss] Odd entries in quota listing Message-ID: <93B0F53A-4ECD-4527-A67D-DD6C9B00F8E7@nuance.com> Well - good idea, but these large numbers in no way reflect valid ID numbers in our environment. Wondering how they got there? Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid From: on behalf of Jonathan Buzzard Reply-To: gpfsug main discussion list Date: Tuesday, August 23, 2016 at 9:06 AM To: "gpfsug-discuss at spectrumscale.org" Subject: [EXTERNAL] Re: [gpfsug-discuss] Odd entries in quota listing I am guessing they are quotas that have been set for users that are now deleted. GPFS stores the quota for a user under their UID, and deleting the user and all their data is not enough to remove the entry from the quota reporting, you also have to delete their quota. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at buzzard.me.uk Tue Aug 23 15:16:05 2016 From: jonathan at buzzard.me.uk (Jonathan Buzzard) Date: Tue, 23 Aug 2016 15:16:05 +0100 Subject: [gpfsug-discuss] Odd entries in quota listing In-Reply-To: <93B0F53A-4ECD-4527-A67D-DD6C9B00F8E7@nuance.com> References: <93B0F53A-4ECD-4527-A67D-DD6C9B00F8E7@nuance.com> Message-ID: <1471961765.30100.90.camel@buzzard.phy.strath.ac.uk> On Tue, 2016-08-23 at 14:10 +0000, Oesterlin, Robert wrote: > Well - good idea, but these large numbers in no way reflect valid ID > numbers in our environment. Wondering how they got there? > I was guessing generating UID's from Windows RID's? Alternatively some script generated them automatically and the UID's are bogus. You can create a quota for any random UID and GPFS won't complain. JAB. -- Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk Fife, United Kingdom. 
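One way to chase that down is to walk the per-user quota report and flag any entry whose name no longer resolves to an account. A rough sketch only: the filesystem name is a placeholder, and the exact mmedquota form for resetting an entry back to the default quota should be checked against the man page for your release:

# list USR quota entries and flag any whose name/UID has no matching account
mmrepquota -u gpfs01 | awk '$2 == "USR" {print $1}' | while read name; do
    getent passwd "$name" > /dev/null || echo "no matching account for quota entry: $name"
done

# for an entry that really is orphaned, re-establish the default quota for that id
mmedquota -d -u 2751555824

If the flagged ids line up with old RID-derived or script-generated UIDs, that would support the guesses above.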
From aaron.s.knister at nasa.gov Wed Aug 24 17:43:56 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Wed, 24 Aug 2016 12:43:56 -0400 Subject: [gpfsug-discuss] GPFS FPO In-Reply-To: References: Message-ID: <6f5a7284-c910-bbda-5e53-7f78e4289ad9@nasa.gov> To tell you the truth, I don't. It's on my radar but I haven't done it yet. I *have* run hadoop on GPFS w/o magpie though and on only a couple of nodes was able to pound 1GB/s out to GPFS w/ the terasort benchmark. I know our GPFS FS can go much faster than that but java was cpu-bound as it often seems to be. -Aaron On 8/23/16 7:56 AM, Brian Marshall wrote: > Aaron, > > Do you have experience running this on native GPFS? The docs say Lustre > and any NFS filesystem. > > Thanks, > Brian > > > On Aug 22, 2016 10:37 PM, "Aaron Knister" > wrote: > > Yes, indeed. Note that these are my personal opinions. > > It seems to work quite well and it's not terribly hard to set up or > get running. That said, if you've got a traditional HPC cluster with > reasonably good bandwidth (and especially if your data is already on > the HPC cluster) I wouldn't bother with FPO and just use something > like magpie (https://github.com/LLNL/magpie > ) to run your hadoopy workload on > GPFS on your traditional HPC cluster. I believe FPO (and by > extension data locality) is important when the available bandwidth > between your clients and servers/disks (in a traditional GPFS > environment) is less than the bandwidth available within a node > (e.g. between your local disks and the host CPU). > > -Aaron > > On 8/22/16 10:23 PM, Brian Marshall wrote: > > Does anyone have any experiences to share (good or bad) about > setting up > and utilizing FPO for hadoop compute on top of GPFS? > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From SAnderson at convergeone.com Thu Aug 25 17:32:48 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Thu, 25 Aug 2016 16:32:48 +0000 Subject: [gpfsug-discuss] mmcessmbchconfig command Message-ID: <1472142769455.35752@convergeone.com> ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bbanister at jumptrading.com Thu Aug 25 17:47:00 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 25 Aug 2016 16:47:00 +0000 Subject: Re: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: <1472142769455.35752@convergeone.com> References: <1472142769455.35752@convergeone.com> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com> My general rule is that if there isn't a man page or '-h' option to explain the usage of the command, then it isn't meant to be run by a user administrator.
I wish that the commands that should never be run by a user admin (or without direction from IBM support) would be put in a different directory that clearly indicated they are for internal GPFS use. RFE worthy? Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Shaun Anderson Sent: Thursday, August 25, 2016 11:33 AM To: gpfsug main discussion list > Subject: [gpfsug-discuss] mmcessmbchconfig command ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From taylorm at us.ibm.com Thu Aug 25 17:55:44 2016 From: taylorm at us.ibm.com (Michael L Taylor) Date: Thu, 25 Aug 2016 09:55:44 -0700 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: References: Message-ID: Not sure where mmcessmbchconfig command is coming from? mmsmb is the proper CLI syntax [root at smaug-vm1 installer]# /usr/lpp/mmfs/bin/mmsmb Usage: mmsmb export Administer SMB exports. mmsmb exportacl Administer SMB export ACLs. mmsmb config Administer SMB global configuration. [root at smaug-vm1 installer]# /usr/lpp/mmfs/bin/mmsmb export -h Usage: mmsmb export list List SMB exports. mmsmb export add Add SMB exports. mmsmb export change Change SMB exports. mmsmb export remove Remove SMB exports. 
[root at smaug-vm1 installer]# man mmsmb
http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_mmsmb.htm
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mweil at wustl.edu  Thu Aug 25 19:50:52 2016
From: mweil at wustl.edu (Matt Weil)
Date: Thu, 25 Aug 2016 13:50:52 -0500
Subject: [gpfsug-discuss] Backup on object stores
Message-ID: <5cc4ae43-2d0f-e548-b256-84f1890fe2d3@wustl.edu>

Hello all,

Just brainstorming here mainly, but I want to know how you are all approaching this. Do you replicate using GPFS and forget about backups?

> https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adv_osbackup.htm

This seems good for a full recovery, but what if I just lost one object? It seems that if the objectizer is in use, then both Tivoli backup and space management can be used on the file.

Thanks in advance for your responses.

Matt

________________________________
The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail.

From billowen at us.ibm.com  Thu Aug 25 20:55:33 2016
From: billowen at us.ibm.com (Bill Owen)
Date: Thu, 25 Aug 2016 12:55:33 -0700
Subject: [gpfsug-discuss] Backup on object stores
In-Reply-To: <5cc4ae43-2d0f-e548-b256-84f1890fe2d3@wustl.edu>
References: <5cc4ae43-2d0f-e548-b256-84f1890fe2d3@wustl.edu>
Message-ID: 

Hi Matt,
With Spectrum Scale object storage, you can create storage policies, and then assign containers to those policies. Each policy will map to a GPFS independent fileset. That way, you can subdivide object storage and manage different types of objects based on the type of data stored in the container/storage policy (i.e., back up some types of object data nightly, some weekly, some not at all).

Today, we don't have a CLI to simplify restoring individual objects. But using commands like swift-get-nodes, you can determine the filesystem path to an object, and then restore only that item. And if you are using storage policies with file & object access enabled, you can access the object/files by file path directly.

Regards,
Bill Owen
billowen at us.ibm.com
Spectrum Scale Object Storage
520-799-4829
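To make that manual path concrete, a rough sketch follows. The ring file location, account, container and object names are placeholders, and the on-disk path is illustrative only; take the real one from the swift-get-nodes output on your own cluster:

    # map an object to the servers and on-disk partition path that hold it
    swift-get-nodes /etc/swift/object.ring.gz AUTH_myaccount mycontainer myobject

    # the output lists, among other things, an "ls" command with a path such as
    #   .../objects/<partition>/<suffix>/<hash>/<timestamp>.data
    # which on a Spectrum Scale object deployment sits inside the object fileset

    # that single file can then be restored with the normal TSM/Spectrum Protect client
    dsmc restore "/ibm/gpfs0/object_fileset/o/z1device1/objects/<partition>/<suffix>/<hash>/<timestamp>.data"

With file & object access enabled, as Bill notes, the simpler route is to restore the file directly by its file path.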
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: 

From Greg.Lehmann at csiro.au  Fri Aug 26 00:14:57 2016
From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au)
Date: Thu, 25 Aug 2016 23:14:57 +0000
Subject: [gpfsug-discuss] mmcessmbchconfig command
In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com>
References: <1472142769455.35752@convergeone.com> <21BC488F0AEA2245B2C3E83FC0B33DBB0630BF86@CHI-EXCHANGEW1.w2k.jumptrading.com>
Message-ID: <156b078bfb2d48d8b77d5250dba7e928@exch1-cdc.nexus.csiro.au>

I agree with an RFE.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From syi at ca.ibm.com  Fri Aug 26 00:15:46 2016
From: syi at ca.ibm.com (Yi Sun)
Date: Thu, 25 Aug 2016 19:15:46 -0400
Subject: [gpfsug-discuss] mmcessmbchconfig command
In-Reply-To: 
References: 
Message-ID: 

You may check the mmsmb command; not sure if it is what you are looking for.
https://www.ibm.com/support/knowledgecenter/STXKQY_4.1.1/com.ibm.spectrum.scale.v4r11.adm.doc/bl1adm_mmsmb.htm#mmsmb

------------------------------------------------------------------------

From: Shaun Anderson
To: gpfsug main discussion list
Subject: [gpfsug-discuss] mmcessmbchconfig command
Message-ID: <1472142769455.35752 at convergeone.com>
Content-Type: text/plain; charset="iso-8859-1"

I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight?

SHAUN ANDERSON
STORAGE ARCHITECT
O 208.577.2112
M 214.263.7014

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From christof.schmitt at us.ibm.com  Fri Aug 26 00:49:12 2016
From: christof.schmitt at us.ibm.com (Christof Schmitt)
Date: Thu, 25 Aug 2016 19:49:12 -0400
Subject: [gpfsug-discuss] CES and mmuserauth command
In-Reply-To: 
References: 
Message-ID: 

To clarify and expand on some of these:

--servers takes the AD Domain Controller that is contacted first during configuration. Later, and during normal operations, the list of DCs is retrieved from DNS and the fastest one (or the closest one according to the AD sites) is used. The DC used initially does not have a special role.

--idmap-role allows dedicating one cluster as a master, and a second cluster (e.g. an AFM replication target) as "subordinate". Only the master will allocate idmap ranges, which can then be imported to the subordinate to have consistent id mappings.

--idmap-range-size and --idmap-range are used for the internal idmap allocation, which is used for every domain that is not explicitly using another domain. "man idmap_autorid" explains the approach taken. As long as the default does not overlap with any other ids, that can be used.

The "netbios" name is used to create the machine account for the cluster when joining the AD domain. That is how the AD administrator will identify the CES cluster. It is also important in SMB deployments when Kerberos should be used with SMB: the same name as the netbios name has to be defined in DNS for the public CES IP addresses. When the name matches, SMB clients can acquire a Kerberos ticket from AD to establish an SMB connection.

When joining the AD domain, --user-name, --password and --server are only used to initially identify and logon to the AD and to create the machine account for the cluster. Once that is done, that information is no longer used, and e.g. the account from --user-name could be deleted, the password changed or the specified DC could be removed from the domain (as long as other DCs are remaining).

Regards,

Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ
christof.schmitt at us.ibm.com  ||  +1-520-799-2469    (T/L: 321-2469)

From: Jan-Frode Myklebust
To: gpfsug main discussion list
Date: 08/23/2016 08:15 AM
Subject: Re: [gpfsug-discuss] CES and mmuserauth command
Sent by: gpfsug-discuss-bounces at spectrumscale.org

Sorry to see no authoritative answers yet.. I'm doing lots of CES installations, but have not quite yet gotten the full understanding of this..
Simple stuff first: --servers You can only have one with AD. --enable-kerberos shouldn't be used, as that's only for LDAP according to the documentation. Guess kerberos is implied with AD. --idmap-role -- I've been using "master". Man-page says ID map role of a stand?alone or singular system deployment must be selected "master" What the idmap options seems to be doing is configure the idmap options for Samba. Maybe best explained by: https://wiki.samba.org/index.php/Idmap_config_ad Your suggested options will then give you the samba idmap configuration: idmap config * : rangesize = 1000000 idmap config * : range = 3000000-3500000 idmap config * : read only = no idmap:cache = no idmap config * : backend = autorid idmap config DOMAIN : schema_mode = rfc2307 idmap config DOMAIN : range = 500-2000000 idmap config DOMAIN : backend = ad Most likely you want to replace DOMAIN by your AD domain name.. So the --idmap options sets some defaults, that you probably won't care about, since all your users are likely covered by the specific "idmap config DOMAIN" config. Hope this helps somewhat, now I'll follow up with something I'm wondering myself...: Is the netbios name just a name, without any connection to anything in AD? Is the --user-name/--password a one-time used account that's only necessary when executing the mmuserauth command, or will it also be for communication between CES and AD while the services are running? -jf On Mon, Aug 22, 2016 at 1:59 PM, Sobey, Richard A wrote: Hi all, We?re just about to start testing a new CES 4.2.0 cluster and at the stage of ?joining? the cluster to our AD. What?s the bare minimum we need to get going with this? My Windows guy (who is more Linux but whatever) has suggested the following: mmuserauth service create --type ad --data-access-method file --netbios-name store --user-name USERNAME --password --enable-nfs-kerberos --enable-kerberos --servers list,of,servers --idmap-range-size 1000000 --idmap-range 3000000 - 3500000 --unixmap-domains 'DOMAIN(500 - 2000000)' He has also asked what the following is: --idmap-role ??? --idmap-range-size ?? All our LDAP GID/UIDs are coming from a system outside of GPFS so do we leave this blank, or say master Or, now I?ve re-read and mmuserauth page, is this purely for when you have AFM relationships and one GPFS cluster (the subordinate / the second cluster) gets its UIDs and GIDs from another GPFS cluster (the master / the first one)? For idmap-range-size is this essentially the highest number of users and groups you can have defined within Spectrum Scale? (I love how I?m using GPFS and SS interchangeably.. forgive me!) 
Many thanks Richard Richard Sobey Storage Area Network (SAN) Analyst Technical Operations, ICT Imperial College London South Kensington 403, City & Guilds Building London SW7 2AZ Tel: +44 (0)20 7594 6915 Email: r.sobey at imperial.ac.uk http://www.imperial.ac.uk/admin-services/ict/ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From christof.schmitt at us.ibm.com Fri Aug 26 00:49:12 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 25 Aug 2016 19:49:12 -0400 Subject: [gpfsug-discuss] mmcessmbchconfig command In-Reply-To: <1472142769455.35752@convergeone.com> References: <1472142769455.35752@convergeone.com> Message-ID: The mmcessmb* commands are scripts that are run from the corresponding mmsmb subcommands. mmsmb is documented and should be used instead of calling the mmcesmb* scripts directly. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: Shaun Anderson To: gpfsug main discussion list Date: 08/25/2016 12:33 PM Subject: [gpfsug-discuss] mmcessmbchconfig command Sent by: gpfsug-discuss-bounces at spectrumscale.org ?I'm not seeing many of the 'mmces' commands listed and there is no man page for many of them. I'm specifically looking at the mmcessmbchconfig command and my syntax isn't being taken. Any insight? SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From christof.schmitt at us.ibm.com Fri Aug 26 00:52:50 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 25 Aug 2016 19:52:50 -0400 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: The options listed in " mmsmb config change --key-info supported" are supported to be changed by administrator of the cluster. "mmsmb config list" lists the whole Samba config, including the options that are set internally. We do not want to support any random Samba configuration, hence the line between "supported" option and everything else. If there is a usecase that requires other Samba options than the ones listed as "supported", one way forward would be opening a RFE that describes the usecase and the Samba option to support it. 
Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 08/22/2016 09:28 AM Subject: [gpfsug-discuss] CES mmsmb options Sent by: gpfsug-discuss-bounces at spectrumscale.org Related to my previous question in so far as it?s to do with CES, what?s this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static? for example log size / location / dmapi support? I?m surely missing something obvious. It?s SS 4.2.0 btw. Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From gaurang.tapase at in.ibm.com Fri Aug 26 08:53:12 2016 From: gaurang.tapase at in.ibm.com (Gaurang Tapase) Date: Fri, 26 Aug 2016 13:23:12 +0530 Subject: [gpfsug-discuss] Blogs and publications on Spectrum Scale Message-ID: Hello, On Request from Bob Oesterlin, we post these links on User Group - Here are the latest publications and Blogs on Spectrum Scale. We encourage the User Group to follow the Spectrum Scale blogs on the http://storagecommunity.org or the Usergroup admin to register the email group of the feeds. A total of 25 recent Blogs on IBM Spectrum Scale by developers IBM Spectrum Scale Security IBM Spectrum Scale: Security Blog Series http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series , Spectrum Scale Security Blog Series: Introduction, http://storagecommunity.org/easyblog/entry/spectrum-scale-security-blog-series-introduction IBM Spectrum Scale Security: VLANs and Protocol nodes, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-vlans-and-protocol-nodes IBM Spectrum Scale Security: Firewall Overview http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-firewall-overview IBM Spectrum Scale Security Blog Series: Security with Spectrum Scale OpenStack Storage Drivers http://storagecommunity.org/easyblog/entry/security-with-spectrum-scale-openstack-storage-drivers , IBM Spectrum Scale Security Blog Series: Authorization http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series-authorization IBM Spectrum Scale: Object (OpenStack Swift, S3) Authorization, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-object-openstack-swift-s3-authorization , IBM Spectrum Scale Security: Secure Data at Rest, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-secure-data-at-rest IBM Spectrum Scale Security Blog Series: Secure Data in Transit, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-blog-series-secure-data-in-transit-1 IBM Spectrum Scale Security Blog Series: Sudo based Secure Administration and Admin Command Logging, http://storagecommunity.org/easyblog/entry/spectrum-scale-security-blog-series-sudo-based-secure-administration-and-admin-command-logging IBM Spectrum Scale Security: Security Features of Transparent Cloud Tiering (TCT), http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-security-features-of-transparent-cloud-tiering-tct IBM Spectrum Scale: Immutability, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-immutability IBM Spectrum Scale : FILE protocols authentication 
http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-file-protocols-authentication IBM Spectrum Scale : Object Authentication, http://storagecommunity.org/easyblog/entry/protocol-authentication-object, IBM Spectrum Scale Security: Anti-Virus bulk scanning, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-security-anti-virus-bulk-scanning , Spectrum Scale 4.2.1 - What's New http://storagecommunity.org/easyblog/entry/spectrum-scale-4-2-1-what-s-new IBM Spectrum Scale 4.2.1 : diving deeper, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-4-2-1-diving-deeper NEW DEMO: Using IBM Cloud Object Storage as IBM Spectrum Scale Transparent Cloud Tier, http://storagecommunity.org/easyblog/entry/new-demo-using-ibm-cloud-object-storage-as-ibm-spectrum-scale-transparent-cloud-tier Spectrum Scale transparent cloud tiering, http://storagecommunity.org/easyblog/entry/spectrum-scale-transparent-cloud-tiering Spectrum Scale in Wonderland - Introducing transparent cloud tiering with Spectrum Scale 4.2.1, http://storagecommunity.org/easyblog/entry/spectrum-scale-in-wonderland, Spectrum Scale Object Related Blogs IBM Spectrum Scale 4.2.1 - What's new in Object, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-4-2-1-what-s-new-in-object , Hot cakes or hot objects, they better be served fast http://storagecommunity.org/easyblog/entry/hot-cakes-or-hot-objects-they-better-be-served-fast IBM Spectrum Scale: Object (OpenStack Swift, S3) Authorization, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-object-openstack-swift-s3-authorization , IBM Spectrum Scale : Object Authentication, http://storagecommunity.org/easyblog/entry/protocol-authentication-object, Spectrum Scale BD&A IBM Spectrum Scale: new features of HDFS Transparency, http://storagecommunity.org/easyblog/entry/ibm-spectrum-scale-new-features-of-hdfs-transparency , Regards, ------------------------------------------------------------------------ Gaurang S Tapase Spectrum Scale & OpenStack Development IBM India Storage Lab, Pune (India) Email : gaurang.tapase at in.ibm.com Phone : +91-20-42025699 (W), +91-9860082042(Cell) ------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Fri Aug 26 09:17:55 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 26 Aug 2016 08:17:55 +0000 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: Thanks Christof, and for the detailed posting on the mmuserauth settings. I do not know why we have changed dmapi support in our existing smb.conf, but perhaps it was for some legacy stuff. Richard -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Christof Schmitt Sent: 26 August 2016 00:53 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] CES mmsmb options The options listed in " mmsmb config change --key-info supported" are supported to be changed by administrator of the cluster. "mmsmb config list" lists the whole Samba config, including the options that are set internally. We do not want to support any random Samba configuration, hence the line between "supported" option and everything else. If there is a usecase that requires other Samba options than the ones listed as "supported", one way forward would be opening a RFE that describes the usecase and the Samba option to support it. 
Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 08/22/2016 09:28 AM Subject: [gpfsug-discuss] CES mmsmb options Sent by: gpfsug-discuss-bounces at spectrumscale.org Related to my previous question in so far as it?s to do with CES, what?s this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static? for example log size / location / dmapi support? I?m surely missing something obvious. It?s SS 4.2.0 btw. Thanks Richard_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From r.sobey at imperial.ac.uk Fri Aug 26 09:48:24 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 26 Aug 2016 08:48:24 +0000 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? Message-ID: Sorry all, prepare for a deluge of emails like this, hopefully it'll help other people implementing CES in the future. I'm trying to stop SMB on a node, but getting the following output: [root at cesnode ~]# mmces service stop smb smb: Request denied. Please stop NFS first [root at cesnode ~]# mmces service list Enabled services: SMB SMB is running As you can see there is no way to stop NFS when it's not running but it seems to be blocking me. It's happening on all the nodes in the cluster. SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Fri Aug 26 10:48:18 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Fri, 26 Aug 2016 09:48:18 +0000 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? In-Reply-To: References: Message-ID: That was a weird one :-) Don't understand why NFS would block smb.., and I don't see that on my cluster. Would it make sense to suspend the node instead? As a workaround. mmces node suspend -jf fre. 26. aug. 2016 kl. 10.48 skrev Sobey, Richard A : > Sorry all, prepare for a deluge of emails like this, hopefully it?ll help > other people implementing CES in the future. > > > > I?m trying to stop SMB on a node, but getting the following output: > > > > [root at cesnode ~]# mmces service stop smb > > smb: Request denied. Please stop NFS first > > > > [root at cesnode ~]# mmces service list > > Enabled services: SMB > > SMB is running > > > > As you can see there is no way to stop NFS when it?s not running but it > seems to be blocking me. It?s happening on all the nodes in the cluster. > > > > SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. > > > > Richard > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From konstantin.arnold at unibas.ch Fri Aug 26 10:56:28 2016 From: konstantin.arnold at unibas.ch (Konstantin Arnold) Date: Fri, 26 Aug 2016 11:56:28 +0200 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? 
In-Reply-To: References: Message-ID: <57C0124C.7050404@unibas.ch> Hi Richard, I ran into the same issue and asked if 'systemctl reload gpfs-smb.service' would work? I got the following answer: "... Now in regards to your question about stopping NFS, yes this is an expected behavior and yes you could also restart through systemctl." Maybe that helps. Konstantin From janfrode at tanso.net Fri Aug 26 10:59:34 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Fri, 26 Aug 2016 11:59:34 +0200 Subject: [gpfsug-discuss] CES and mmuserauth command In-Reply-To: References: Message-ID: On Fri, Aug 26, 2016 at 1:49 AM, Christof Schmitt < christof.schmitt at us.ibm.com> wrote: > > When joinging the AD domain, --user-name, --password and --server are only > used to initially identify and logon to the AD and to create the machine > account for the cluster. Once that is done, that information is no longer > used, and e.g. the account from --user-name could be deleted, the password > changed or the specified DC could be removed from the domain (as long as > other DCs are remaining). > > That was my initial understanding of the --user-name, but when reading the man-page I get the impression that it's also used to do connect to AD to do user and group lookups: ------------------------------------------------------------------------------------------------------ ??user?name userName Specifies the user name to be used to perform operations against the authentication server. The specified user name must have sufficient permissions to read user and group attributes from the authentication server. ------------------------------------------------------------------------------------------------------- Also it's strange that "mmuserauth service list" would list the USER_NAME if it was only somthing that was used at configuration time..? -jf -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Aug 26 17:29:31 2016 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 26 Aug 2016 12:29:31 -0400 Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? In-Reply-To: References: Message-ID: That would be the case when Active Directory is configured for authentication. In that case the SMB service includes two aspects: One is the actual SMB file server, and the second one is the service for the Active Directory integration. Since NFS depends on authentication and id mapping services, it requires SMB to be running. Regards, Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ christof.schmitt at us.ibm.com || +1-520-799-2469 (T/L: 321-2469) From: "Sobey, Richard A" To: "'gpfsug-discuss at spectrumscale.org'" Date: 08/26/2016 04:48 AM Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first? Sent by: gpfsug-discuss-bounces at spectrumscale.org Sorry all, prepare for a deluge of emails like this, hopefully it?ll help other people implementing CES in the future. I?m trying to stop SMB on a node, but getting the following output: [root at cesnode ~]# mmces service stop smb smb: Request denied. Please stop NFS first [root at cesnode ~]# mmces service list Enabled services: SMB SMB is running As you can see there is no way to stop NFS when it?s not running but it seems to be blocking me. It?s happening on all the nodes in the cluster. SS version is 4.2.0 running on a fully up to date RHEL 7.1 server. 
Richard
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From christof.schmitt at us.ibm.com  Fri Aug 26 17:29:31 2016
From: christof.schmitt at us.ibm.com (Christof Schmitt)
Date: Fri, 26 Aug 2016 12:29:31 -0400
Subject: [gpfsug-discuss] CES and mmuserauth command
In-Reply-To: 
References: 
Message-ID: 

The --user-name option applies to both AD and LDAP authentication. In the LDAP case, this information is correct. I will try to get some clarification added for the AD case.

The same applies to the information shown in "service list". There is a common field that holds the information, and the parameter from the initial "service create" is stored there. The meaning is different for AD and LDAP: for LDAP it is the username being used to access the LDAP server, while in the AD case it was only the user initially used until the machine account was created.

Regards,

Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ
christof.schmitt at us.ibm.com  ||  +1-520-799-2469    (T/L: 321-2469)

From: Jan-Frode Myklebust
To: gpfsug main discussion list
Date: 08/26/2016 05:59 AM
Subject: Re: [gpfsug-discuss] CES and mmuserauth command
Sent by: gpfsug-discuss-bounces at spectrumscale.org

That was my initial understanding of the --user-name, but when reading the man-page I get the impression that it's also used to connect to AD to do user and group lookups. Also it's strange that "mmuserauth service list" would list the USER_NAME if it was only something that was used at configuration time..?

-jf
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From dacalder at co.ibm.com  Sat Aug 27 13:52:44 2016
From: dacalder at co.ibm.com (Danny Alexander Calderon Rodriguez)
Date: Sat, 27 Aug 2016 12:52:44 +0000
Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first?
In-Reply-To: 
Message-ID: 

Hi Richard

This is fixed in release 4.2.1; if you can't upgrade now, you can fix it manually. Just do this:

Edit the file /usr/lpp/mmfs/lib/mmcesmon/SMBService.py and change

    if authType == 'ad' and not nodeState.nfsStopped:

to

    nfsEnabled = utils.isProtocolEnabled("NFS", self.logger)
    if authType == 'ad' and not nodeState.nfsStopped and nfsEnabled:

You need to stop and restart the GPFS service on each node where you apply the change. After changing the lines, use the tab key so the indentation matches the surrounding code.

Sent from my iPhone
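For anyone applying this by hand, a conservative sequence might look like the following. It is only a sketch, the node name is a placeholder, and since it edits an IBM-supplied file you should keep the backup around and expect to drop the change after upgrading to 4.2.1:

    # keep a pristine copy before editing
    cp -p /usr/lpp/mmfs/lib/mmcesmon/SMBService.py /usr/lpp/mmfs/lib/mmcesmon/SMBService.py.orig

    # apply the two-line change described above, preserving indentation
    vi /usr/lpp/mmfs/lib/mmcesmon/SMBService.py

    # restart GPFS on that node so the change takes effect
    mmshutdown -N cesnode1
    mmstartup -N cesnode1

    # afterwards, stopping SMB without NFS should no longer be refused
    mmces service stop smb

Repeat on each CES node, as noted above, since the file is local to every node.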
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From r.sobey at imperial.ac.uk  Sat Aug 27 20:06:45 2016
From: r.sobey at imperial.ac.uk (Sobey, Richard A)
Date: Sat, 27 Aug 2016 19:06:45 +0000
Subject: [gpfsug-discuss] Cannot stop SMB... stop NFS first?
In-Reply-To: 
References: 
Message-ID: 

Hi,

Thanks for the info! I think I'll perform an upgrade to 4.2.1; the cluster is still in a pre-production state and I've yet to really start testing client access.

Richard

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From Greg.Lehmann at csiro.au  Mon Aug 29 00:57:21 2016
From: Greg.Lehmann at csiro.au (Greg.Lehmann at csiro.au)
Date: Sun, 28 Aug 2016 23:57:21 +0000
Subject: [gpfsug-discuss] Blogs and publications on Spectrum Scale
In-Reply-To: 
References: 
Message-ID: <57496841ec784222b5e291a921280c38@exch1-cdc.nexus.csiro.au>

It would be nice if the Spectrum Scale User Group website had links to these, perhaps a separate page for blog links.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From douglasof at us.ibm.com  Mon Aug 29 06:34:03 2016
From: douglasof at us.ibm.com (Douglas O'flaherty)
Date: Sun, 28 Aug 2016 22:34:03 -0700
Subject: [gpfsug-discuss] Edge Attendees
Message-ID: 

Greetings:

I am organizing an NDA round-table with the IBM Offering Managers at IBM Edge on Tuesday, September 20th at 1pm. The subject will be "The Future of IBM Spectrum Scale." IBM Offering Managers are the Product Owners at IBM. There will be discussions covering licensing, the roadmap for IBM Spectrum Scale RAID (aka GNR), new hardware platforms, etc. This is a unique opportunity to get feedback to the drivers of the IBM Spectrum Scale business plans. It should be a great companion to the content we get from Engineering and Research at most User Group meetings.

To get an invitation, please email me privately at douglasof us.ibm.com. All who have a valid NDA are invited. I only need an approximate headcount of attendees. Try not to spam the mailing list.

I am pushing to get the Offering Managers to have a similar session at SC16 as an IBM Multi-client Briefing. You can add your voice to that call on this thread, or email me directly.

Spectrum Scale User Group at SC16 will once again take place on Sunday afternoon, with cocktails to follow. I hope we can blow out the attendance numbers and the number of site speakers we had last year! I know Simon, Bob, and Kristy are already working the agenda. Get your ideas in to them or to me.

See you in Vegas, Vegas, SLC, Vegas this Fall... Maybe Australia in between?

doug

Douglas O'Flaherty
IBM Spectrum Storage Marketing

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stef.coene at docum.org  Mon Aug 29 07:39:05 2016
From: stef.coene at docum.org (Stef Coene)
Date: Mon, 29 Aug 2016 08:39:05 +0200
Subject: [gpfsug-discuss] Blogs and publications on Spectrum Scale
In-Reply-To: 
References: 
Message-ID: <9bb8d52e-86a3-3ff7-daaf-dc6bf0a3bd82@docum.org>

Hi,

Each time I try to register on the website, I get the error:
"Session expired. Please try again later."

Stef

From kraemerf at de.ibm.com  Mon Aug 29 08:20:46 2016
From: kraemerf at de.ibm.com (Frank Kraemer)
Date: Mon, 29 Aug 2016 09:20:46 +0200
Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection"
Message-ID: 

Hi all,

In the last months several customers were asking for the option to use multiple IBM Spectrum Protect servers to protect a single IBM Spectrum Scale file system. Some of these customers reached the server scalability limits, others wanted to increase the parallelism of the server housekeeping processes.
In consideration of the significant grow of data it can be assumed that more and more customers will be faced with this challenge in the future. Therefore, this paper was written that helps to address this situation. This paper describes the setup and configuration of multiple IBM Spectrum Protect servers to be used to store backup and hsm data of a single IBM Spectrum Scale file system. Beside the setup and configuration several best practices were written to the paper that help to simplify the daily use and administration of such environments. Find the paper here: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection A big THANK YOU goes to my co-writers Thomas Schreiber and Patrick Luft for their important input and all the tests (...and re-tests and re-tests and re-tests :-) ) they did. ...please share in your communities. Greetings, Dominic. ______________________________________________________________________________________________________________ Dominic Mueller-Wicke | IBM Spectrum Protect Development | Technical Lead | +49 7034 64 32794 | dominic.mueller at de.ibm.com Vorsitzende des Aufsichtsrats: Martina Koederitz; Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen; Registergericht: Amtsgericht Stuttgart, HRB 243294 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Mon Aug 29 18:33:59 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 13:33:59 -0400 Subject: [gpfsug-discuss] iowait? Message-ID: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> Hi Everyone, Would it be easy to have GPFS report iowait values in linux? This would be a huge help for us in determining whether a node's low utilization is due to some issue with the code running on it or if it's blocked on I/O, especially in a historical context. I naively tried on a test system changing schedule() in cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: again: /* call the scheduler */ if ( waitFlags & INTERRUPTIBLE ) schedule(); else io_schedule(); Seems to actually do what I'm after but generally bad things happen when I start pretending I'm a kernel developer. Any thoughts? If I open an RFE would this be something that's relatively easy to implement (not asking for a commitment *to* implement it, just that I'm not asking for something seemingly simple that's actually fairly hard to implement)? -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From chekh at stanford.edu Mon Aug 29 18:50:23 2016 From: chekh at stanford.edu (Alex Chekholko) Date: Mon, 29 Aug 2016 10:50:23 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> Message-ID: <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> Any reason you can't just use iostat or collectl or any of a number of other standards tools to look at disk utilization? On 08/29/2016 10:33 AM, Aaron Knister wrote: > Hi Everyone, > > Would it be easy to have GPFS report iowait values in linux? This would > be a huge help for us in determining whether a node's low utilization is > due to some issue with the code running on it or if it's blocked on I/O, > especially in a historical context. 
> > I naively tried on a test system changing schedule() in > cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: > > again: > /* call the scheduler */ > if ( waitFlags & INTERRUPTIBLE ) > schedule(); > else > io_schedule(); > > Seems to actually do what I'm after but generally bad things happen when > I start pretending I'm a kernel developer. > > Any thoughts? If I open an RFE would this be something that's relatively > easy to implement (not asking for a commitment *to* implement it, just > that I'm not asking for something seemingly simple that's actually > fairly hard to implement)? > > -Aaron > -- Alex Chekholko chekh at stanford.edu From aaron.s.knister at nasa.gov Mon Aug 29 18:54:12 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 13:54:12 -0400 Subject: [gpfsug-discuss] iowait? In-Reply-To: <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> Message-ID: <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. On 8/29/16 1:50 PM, Alex Chekholko wrote: > Any reason you can't just use iostat or collectl or any of a number of > other standards tools to look at disk utilization? > > On 08/29/2016 10:33 AM, Aaron Knister wrote: >> Hi Everyone, >> >> Would it be easy to have GPFS report iowait values in linux? This would >> be a huge help for us in determining whether a node's low utilization is >> due to some issue with the code running on it or if it's blocked on I/O, >> especially in a historical context. >> >> I naively tried on a test system changing schedule() in >> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >> >> again: >> /* call the scheduler */ >> if ( waitFlags & INTERRUPTIBLE ) >> schedule(); >> else >> io_schedule(); >> >> Seems to actually do what I'm after but generally bad things happen when >> I start pretending I'm a kernel developer. >> >> Any thoughts? If I open an RFE would this be something that's relatively >> easy to implement (not asking for a commitment *to* implement it, just >> that I'm not asking for something seemingly simple that's actually >> fairly hard to implement)? >> >> -Aaron >> > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Mon Aug 29 18:56:25 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 29 Aug 2016 17:56:25 +0000 Subject: [gpfsug-discuss] iowait? 
In-Reply-To: <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> There is the iohist data that may have what you're looking for, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 12:54 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] iowait? Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. On 8/29/16 1:50 PM, Alex Chekholko wrote: > Any reason you can't just use iostat or collectl or any of a number of > other standards tools to look at disk utilization? > > On 08/29/2016 10:33 AM, Aaron Knister wrote: >> Hi Everyone, >> >> Would it be easy to have GPFS report iowait values in linux? This >> would be a huge help for us in determining whether a node's low >> utilization is due to some issue with the code running on it or if >> it's blocked on I/O, especially in a historical context. >> >> I naively tried on a test system changing schedule() in >> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >> >> again: >> /* call the scheduler */ >> if ( waitFlags & INTERRUPTIBLE ) >> schedule(); >> else >> io_schedule(); >> >> Seems to actually do what I'm after but generally bad things happen >> when I start pretending I'm a kernel developer. >> >> Any thoughts? If I open an RFE would this be something that's >> relatively easy to implement (not asking for a commitment *to* >> implement it, just that I'm not asking for something seemingly simple >> that's actually fairly hard to implement)? >> >> -Aaron >> > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From olaf.weiser at de.ibm.com Mon Aug 29 19:02:38 2016 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Mon, 29 Aug 2016 20:02:38 +0200 Subject: [gpfsug-discuss] iowait? 
In-Reply-To: <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Mon Aug 29 19:04:32 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 14:04:32 -0400 Subject: Re: [gpfsug-discuss] iowait? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> That's an interesting idea. I took a look at mmdiag --iohist on a busy node and it doesn't seem to capture more than literally 1 second of history. Is there a better way to grab the data or have gpfs capture more of it? Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. -Aaron On 8/29/16 1:56 PM, Bryan Banister wrote: > There is the iohist data that may have what you're looking for, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 12:54 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] iowait? > > Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. > > On 8/29/16 1:50 PM, Alex Chekholko wrote: >> Any reason you can't just use iostat or collectl or any of a number of >> other standards tools to look at disk utilization? >> >> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>> Hi Everyone, >>> >>> Would it be easy to have GPFS report iowait values in linux? This >>> would be a huge help for us in determining whether a node's low >>> utilization is due to some issue with the code running on it or if >>> it's blocked on I/O, especially in a historical context. >>> >>> I naively tried on a test system changing schedule() in >>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>> >>> again: >>> /* call the scheduler */ >>> if ( waitFlags & INTERRUPTIBLE ) >>> schedule(); >>> else >>> io_schedule(); >>> >>> Seems to actually do what I'm after but generally bad things happen >>> when I start pretending I'm a kernel developer. >>> >>> Any thoughts? If I open an RFE would this be something that's >>> relatively easy to implement (not asking for a commitment *to* >>> implement it, just that I'm not asking for something seemingly simple >>> that's actually fairly hard to implement)?
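(To make the efficiency calculation above concrete -- the exact accounting will vary by site -- one common form is: efficiency = cpu_seconds_used / (cores_requested * walltime_seconds). A 16-core job that runs for 3600 s but accumulates only 5760 CPU-seconds scores 10%, and without an iowait signal that looks identical to a job that was simply idle.)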
>>> >>> -Aaron >>> >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Mon Aug 29 19:06:36 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 29 Aug 2016 18:06:36 +0000 Subject: [gpfsug-discuss] iowait? In-Reply-To: <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Try this: mmchconfig ioHistorySize=1024 # Or however big you want! Cheers, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 1:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] iowait? That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. Is there a better way to grab the data or have gpfs capture more of it? Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. -Aaron On 8/29/16 1:56 PM, Bryan Banister wrote: > There is the iohist data that may have what you're looking for, -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron > Knister > Sent: Monday, August 29, 2016 12:54 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] iowait? > > Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. 
That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. > > On 8/29/16 1:50 PM, Alex Chekholko wrote: >> Any reason you can't just use iostat or collectl or any of a number >> of other standards tools to look at disk utilization? >> >> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>> Hi Everyone, >>> >>> Would it be easy to have GPFS report iowait values in linux? This >>> would be a huge help for us in determining whether a node's low >>> utilization is due to some issue with the code running on it or if >>> it's blocked on I/O, especially in a historical context. >>> >>> I naively tried on a test system changing schedule() in >>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>> >>> again: >>> /* call the scheduler */ >>> if ( waitFlags & INTERRUPTIBLE ) >>> schedule(); >>> else >>> io_schedule(); >>> >>> Seems to actually do what I'm after but generally bad things happen >>> when I start pretending I'm a kernel developer. >>> >>> Any thoughts? If I open an RFE would this be something that's >>> relatively easy to implement (not asking for a commitment *to* >>> implement it, just that I'm not asking for something seemingly >>> simple that's actually fairly hard to implement)? >>> >>> -Aaron >>> >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight > Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. 
The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From aaron.s.knister at nasa.gov Mon Aug 29 19:09:36 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 14:09:36 -0400 Subject: [gpfsug-discuss] iowait? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> Nice! Thanks Bryan. I wonder what the implications are of setting it to something high enough that we could capture data every 10s. I figure if 512 events only takes me to 1 second I would need to log in the realm of 10k to capture every 10 seconds and account for spikes in I/O. -Aaron On 8/29/16 2:06 PM, Bryan Banister wrote: > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? 
This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Mon Aug 29 19:11:05 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 29 Aug 2016 18:11:05 +0000 Subject: [gpfsug-discuss] iowait? In-Reply-To: <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB063147A9@CHI-EXCHANGEW1.w2k.jumptrading.com> That's a good question, but I don't expect it should cause you much of a problem. Of course testing and trying to measure any impact would be wise, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 1:10 PM To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] iowait? Nice! Thanks Bryan. I wonder what the implications are of setting it to something high enough that we could capture data every 10s. I figure if 512 events only takes me to 1 second I would need to log in the realm of 10k to capture every 10 seconds and account for spikes in I/O. -Aaron On 8/29/16 2:06 PM, Bryan Banister wrote: > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron > Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, >> -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. 
Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight > Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From sfadden at us.ibm.com Mon Aug 29 20:33:14 2016 From: sfadden at us.ibm.com (Scott Fadden) Date: Mon, 29 Aug 2016 12:33:14 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu><5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com><7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> <5f563924-61bb-9623-aa84-02d97bd8f379@nasa.gov> Message-ID: There is a known performance issue that can possibly cause longer than expected network time-outs if you are running iohist too often. So be careful it is best to collect it as a sample, instead of all of the time. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale From: Aaron Knister To: Date: 08/29/2016 11:09 AM Subject: Re: [gpfsug-discuss] iowait? Sent by: gpfsug-discuss-bounces at spectrumscale.org Nice! Thanks Bryan. I wonder what the implications are of setting it to something high enough that we could capture data every 10s. I figure if 512 events only takes me to 1 second I would need to log in the realm of 10k to capture every 10 seconds and account for spikes in I/O. -Aaron On 8/29/16 2:06 PM, Bryan Banister wrote: > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? 
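Taking Scott's advice to sample rather than poll, one rough way to get longer coverage is to raise the history depth and collect on a coarse interval. A sketch only -- the interval, history size and log path are arbitrary, and the iohist locking caveat Yuri raises later in this thread still applies:

mmchconfig ioHistorySize=16384
while sleep 60; do
    date
    /usr/lpp/mmfs/bin/mmdiag --iohist
done >> /var/tmp/iohist.$(hostname).log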
> > Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Aug 29 20:37:13 2016 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Mon, 29 Aug 2016 19:37:13 +0000 Subject: [gpfsug-discuss] CES mmsmb options In-Reply-To: References: Message-ID: Hi Richard, You can of course change any of the other options with the "net conf" (/usr/lpp/mmfs/bin/net conf) command. As its just stored in the Samba registry. Of course whether or not you end up with a supported configuration is a different matter... When we first rolled out CES/SMB, there were a number of issues with setting it up in the way we needed for our environment (AD for auth, LDAP for identity) which at the time wasn't available through the config tools. I believe this has now changed though I haven't gone back and "reset" our configs. Simon ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Sobey, Richard A [r.sobey at imperial.ac.uk] Sent: 22 August 2016 14:28 To: 'gpfsug-discuss at spectrumscale.org' Subject: [gpfsug-discuss] CES mmsmb options Related to my previous question in so far as it?s to do with CES, what?s this all about: [root at ces]# mmsmb config change --key-info supported Supported smb options with allowed values: gpfs:dfreequota = yes, no restrict anonymous = 0, 2 server string = any mmsmb config list shows many more options. Are they static? for example log size / location / dmapi support? 
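As a concrete illustration of the registry route Simon describes above (the option and value are examples only, and anything changed outside the mmsmb tooling may well leave you in unsupported territory):

/usr/lpp/mmfs/bin/net conf list                              # dump the full Samba registry configuration
/usr/lpp/mmfs/bin/net conf getparm global 'log level'
/usr/lpp/mmfs/bin/net conf setparm global 'log level' '1'
mmsmb config list                                            # cross-check what the CES tooling now reports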
I?m surely missing something obvious. It?s SS 4.2.0 btw. Thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From usa-principal at gpfsug.org Mon Aug 29 21:13:51 2016 From: usa-principal at gpfsug.org (Spectrum Scale Users Group - USA Principal Kristy Kallback-Rose) Date: Mon, 29 Aug 2016 16:13:51 -0400 Subject: [gpfsug-discuss] SC16 Hold the Date - Spectrum Scale (GPFS) Users Group Event Message-ID: <648FFF79-343D-447E-9CC5-4E0199C29572@gpfsug.org> Hello, I know many of you may be planning your SC16 schedule already. We wanted to give you a heads up that a Spectrum Scale (GPFS) Users Group event is being planned. The event will be much like last year?s event with a combination of technical updates and user experiences and thus far is loosely planned for: Sunday (11/13) ~12p - ~5 PM with a social hour after the meeting. We hope to see you there. More details as planning progresses. Best, Kristy & Bob From S.J.Thompson at bham.ac.uk Mon Aug 29 21:27:28 2016 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Mon, 29 Aug 2016 20:27:28 +0000 Subject: [gpfsug-discuss] SC16 Hold the Date - Spectrum Scale (GPFS) Users Group Event In-Reply-To: <648FFF79-343D-447E-9CC5-4E0199C29572@gpfsug.org> References: <648FFF79-343D-447E-9CC5-4E0199C29572@gpfsug.org> Message-ID: You may also be interested in a panel session on the Friday of SC16: http://sc16.supercomputing.org/presentation/?id=pan120&sess=sess185 This isn't a user group event, but part of the technical programme for SC16, though I'm sure you will recognise some of the names from the storage community. Moderator: Simon Thompson (me) Panel: Sven Oehme (IBM Research) James Coomer (DDN) Sage Weil (RedHat/CEPH) Colin Morey (Hartree/STFC) Pam Gilman (NCAR) Martin Gasthuber (DESY) Friday 8:30 - 10:00 Simon From volobuev at us.ibm.com Mon Aug 29 21:31:17 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Mon, 29 Aug 2016 13:31:17 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu><5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com><7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: I would advise caution on using "mmdiag --iohist" heavily. In more recent code streams (V4.1, V4.2) there's a problem with internal locking that could, under certain conditions could lead to the symptoms that look very similar to sporadic network blockage. Basically, if "mmdiag --iohist" gets blocked for long periods of time (e.g. due to local disk/NFS performance issues), this may end up blocking an mmfsd receiver thread, delaying RPC processing. The problem was discovered fairly recently, and the fix hasn't made it out to all service streams yet. More generally, IO history is a valuable tool for troubleshooting disk IO performance issues, but the tool doesn't have the right semantics for regular, systemic IO performance sampling and monitoring. The query operation is too expensive, the coverage is subject to load, and the output is somewhat unstructured. With some effort, one can still build some form of a roll-your-own monitoring implement, but this is certainly not an optimal way of approaching the problem. 
The data should be available in a structured form, through a channel that supports light-weight, flexible querying that doesn't impact mainline IO processing. In Spectrum Scale, this type of data is fed from mmfsd to Zimon, via an mmpmon interface, and end users can then query Zimon for raw or partially processed data. Where it comes to high-volume stats, retaining raw data at its full resolution is only practical for relatively short periods of time (seconds, or perhaps a small number of minutes), and some form of aggregation is necessary for covering longer periods of time (hours to days). In the current versions of the product, there's a very similar type of data available this way: RPC stats. There are plans to make IO history data available in a similar fashion. The entire approach may need to be re-calibrated, however. Making RPC stats available doesn't appear to have generated a surge of user interest. This is probably because the data is too complex for casual processing, and while without doubt a lot of very valuable insight can be gained by analyzing RPC stats, the actual effort required to do so is too much for most users. That is, we need to provide some tools for raw data analytics. Largely the same argument applies to IO stats. In fact, on an NSD client IO stats are actually a subset of RPC stats. With some effort, one can perform a comprehensive analysis of NSD client IO stats by analyzing NSD client-to-server RPC traffic. One can certainly argue that the effort required is a bit much though. Getting back to the original question: would the proposed cxiWaitEventWait () change work? It'll likely result in nr_iowait being incremented every time a thread in GPFS code performs an uninterruptible wait. This could be an act of performing an actual IO request, or something else, e.g. waiting for a lock. Those may be the desirable semantics in some scenarios, but I wouldn't agree that it's the right behavior for any uninterruptible wait. io_schedule() is intended for use for block device IO waits, so using it this way is not in line with the code intent, which is never a good idea. Besides, relative to schedule(), io_schedule() has some overhead that could have performance implications of an uncertain nature. yuri From: Bryan Banister To: gpfsug main discussion list , Date: 08/29/2016 11:06 AM Subject: Re: [gpfsug-discuss] iowait? Sent by: gpfsug-discuss-bounces at spectrumscale.org Try this: mmchconfig ioHistorySize=1024 # Or however big you want! Cheers, -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [ mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Monday, August 29, 2016 1:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] iowait? That's an interesting idea. I took a look at mmdig --iohist on a busy node it doesn't seem to capture more than literally 1 second of history. Is there a better way to grab the data or have gpfs capture more of it? Just to give some more context, as part of our monthly reporting requirements we calculate job efficiency by comparing the number of cpu cores requested by a given job with the cpu % utilization during that job's time window. Currently a job that's doing a sleep 9000 would show up the same as a job blocked on I/O. Having GPFS wait time included in iowait would allow us to easily make this distinction. 
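As a small illustration of the mmpmon channel Yuri refers to above (a sketch only -- output format and preferred flags can vary by release), per-filesystem I/O counters can already be pulled in a parseable form on each node:

echo fs_io_s > /tmp/mmpmon.cmd
/usr/lpp/mmfs/bin/mmpmon -p -i /tmp/mmpmon.cmd     # -p prints prefixed, machine-readable records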
-Aaron On 8/29/16 1:56 PM, Bryan Banister wrote: > There is the iohist data that may have what you're looking for, -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron > Knister > Sent: Monday, August 29, 2016 12:54 PM > To: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] iowait? > > Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. > > On 8/29/16 1:50 PM, Alex Chekholko wrote: >> Any reason you can't just use iostat or collectl or any of a number >> of other standards tools to look at disk utilization? >> >> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>> Hi Everyone, >>> >>> Would it be easy to have GPFS report iowait values in linux? This >>> would be a huge help for us in determining whether a node's low >>> utilization is due to some issue with the code running on it or if >>> it's blocked on I/O, especially in a historical context. >>> >>> I naively tried on a test system changing schedule() in >>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>> >>> again: >>> /* call the scheduler */ >>> if ( waitFlags & INTERRUPTIBLE ) >>> schedule(); >>> else >>> io_schedule(); >>> >>> Seems to actually do what I'm after but generally bad things happen >>> when I start pretending I'm a kernel developer. >>> >>> Any thoughts? If I open an RFE would this be something that's >>> relatively easy to implement (not asking for a commitment *to* >>> implement it, just that I'm not asking for something seemingly >>> simple that's actually fairly hard to implement)? >>> >>> -Aaron >>> >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight > Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From aaron.s.knister at nasa.gov Mon Aug 29 23:58:34 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Mon, 29 Aug 2016 18:58:34 -0400 Subject: [gpfsug-discuss] iowait? In-Reply-To: References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov> <7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu> <5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com> <7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov> <21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: <8ec95af4-4d30-a904-4ba2-cf253460754a@nasa.gov> Thanks Yuri! I thought calling io_schedule was the right thing to do because the nfs client in the kernel did this directly until fairly recently. Now it calls wait_on_bit_io which I believe ultimately calls io_schedule. Do you see a more targeted approach for having GPFS register IO wait as something that's feasible? (e.g. not registering iowait for locks, as you suggested, but doing so for file/directory operations such as read/write/readdir?) -Aaron On 8/29/16 4:31 PM, Yuri L Volobuev wrote: > I would advise caution on using "mmdiag --iohist" heavily. In more > recent code streams (V4.1, V4.2) there's a problem with internal locking > that could, under certain conditions could lead to the symptoms that > look very similar to sporadic network blockage. Basically, if "mmdiag > --iohist" gets blocked for long periods of time (e.g. due to local > disk/NFS performance issues), this may end up blocking an mmfsd receiver > thread, delaying RPC processing. The problem was discovered fairly > recently, and the fix hasn't made it out to all service streams yet. 
> > More generally, IO history is a valuable tool for troubleshooting disk > IO performance issues, but the tool doesn't have the right semantics for > regular, systemic IO performance sampling and monitoring. The query > operation is too expensive, the coverage is subject to load, and the > output is somewhat unstructured. With some effort, one can still build > some form of a roll-your-own monitoring implement, but this is certainly > not an optimal way of approaching the problem. The data should be > available in a structured form, through a channel that supports > light-weight, flexible querying that doesn't impact mainline IO > processing. In Spectrum Scale, this type of data is fed from mmfsd to > Zimon, via an mmpmon interface, and end users can then query Zimon for > raw or partially processed data. Where it comes to high-volume stats, > retaining raw data at its full resolution is only practical for > relatively short periods of time (seconds, or perhaps a small number of > minutes), and some form of aggregation is necessary for covering longer > periods of time (hours to days). In the current versions of the product, > there's a very similar type of data available this way: RPC stats. There > are plans to make IO history data available in a similar fashion. The > entire approach may need to be re-calibrated, however. Making RPC stats > available doesn't appear to have generated a surge of user interest. > This is probably because the data is too complex for casual processing, > and while without doubt a lot of very valuable insight can be gained by > analyzing RPC stats, the actual effort required to do so is too much for > most users. That is, we need to provide some tools for raw data > analytics. Largely the same argument applies to IO stats. In fact, on an > NSD client IO stats are actually a subset of RPC stats. With some > effort, one can perform a comprehensive analysis of NSD client IO stats > by analyzing NSD client-to-server RPC traffic. One can certainly argue > that the effort required is a bit much though. > > Getting back to the original question: would the proposed > cxiWaitEventWait() change work? It'll likely result in nr_iowait being > incremented every time a thread in GPFS code performs an uninterruptible > wait. This could be an act of performing an actual IO request, or > something else, e.g. waiting for a lock. Those may be the desirable > semantics in some scenarios, but I wouldn't agree that it's the right > behavior for any uninterruptible wait. io_schedule() is intended for use > for block device IO waits, so using it this way is not in line with the > code intent, which is never a good idea. Besides, relative to > schedule(), io_schedule() has some overhead that could have performance > implications of an uncertain nature. > > yuri > > Inactive hide details for Bryan Banister ---08/29/2016 11:06:59 AM---Try > this: mmchconfig ioHistorySize=1024 # Or however big yBryan Banister > ---08/29/2016 11:06:59 AM---Try this: mmchconfig ioHistorySize=1024 # Or > however big you want! > > From: Bryan Banister > To: gpfsug main discussion list , > Date: 08/29/2016 11:06 AM > Subject: Re: [gpfsug-discuss] iowait? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! 
> > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy > node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting > requirements we calculate job efficiency by comparing the number of cpu > cores requested by a given job with the cpu % utilization during that > job's time window. Currently a job that's doing a sleep 9000 would show > up the same as a job blocked on I/O. Having GPFS wait time included in > iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. >>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. 
If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) > only and may contain proprietary, confidential or privileged > information. If you are not the intended recipient, you are hereby > notified that any review, dissemination or copying of this email is > strictly prohibited, and to please notify the sender immediately and > destroy this email and any attachments. Email transmission cannot be > guaranteed to be secure or error-free. The Company, therefore, does not > make any guarantees as to the completeness or accuracy of this email or > any attachments. This email is for informational purposes only and does > not constitute a recommendation, offer, request or solicitation of any > kind to buy, sell, subscribe, redeem or perform any type of transaction > of a financial product. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From volobuev at us.ibm.com Tue Aug 30 06:09:21 2016 From: volobuev at us.ibm.com (Yuri L Volobuev) Date: Mon, 29 Aug 2016 22:09:21 -0700 Subject: [gpfsug-discuss] iowait? In-Reply-To: <8ec95af4-4d30-a904-4ba2-cf253460754a@nasa.gov> References: <546699b0-b939-9764-b047-d58007ba4a74@nasa.gov><7ec8c8b4-5f89-46fe-585d-c69964342a58@stanford.edu><5078d80d-8a15-892c-0db6-006f0350e0cc@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB063146F7@CHI-EXCHANGEW1.w2k.jumptrading.com><7dc7b4d8-502c-c691-5516-955fd6562e56@nasa.gov><21BC488F0AEA2245B2C3E83FC0B33DBB0631475C@CHI-EXCHANGEW1.w2k.jumptrading.com> <8ec95af4-4d30-a904-4ba2-cf253460754a@nasa.gov> Message-ID: I don't see a simple fix that can be implemented by tweaking a general-purpose low-level synchronization primitive. It should be possible to integrate GPFS better into the Linux IO accounting infrastructure, but that would require some investigation a likely a non-trivial amount of work to do right. yuri From: Aaron Knister To: , Date: 08/29/2016 03:59 PM Subject: Re: [gpfsug-discuss] iowait? Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks Yuri! 
I thought calling io_schedule was the right thing to do because the nfs client in the kernel did this directly until fairly recently. Now it calls wait_on_bit_io which I believe ultimately calls io_schedule. Do you see a more targeted approach for having GPFS register IO wait as something that's feasible? (e.g. not registering iowait for locks, as you suggested, but doing so for file/directory operations such as read/write/readdir?) -Aaron On 8/29/16 4:31 PM, Yuri L Volobuev wrote: > I would advise caution on using "mmdiag --iohist" heavily. In more > recent code streams (V4.1, V4.2) there's a problem with internal locking > that could, under certain conditions could lead to the symptoms that > look very similar to sporadic network blockage. Basically, if "mmdiag > --iohist" gets blocked for long periods of time (e.g. due to local > disk/NFS performance issues), this may end up blocking an mmfsd receiver > thread, delaying RPC processing. The problem was discovered fairly > recently, and the fix hasn't made it out to all service streams yet. > > More generally, IO history is a valuable tool for troubleshooting disk > IO performance issues, but the tool doesn't have the right semantics for > regular, systemic IO performance sampling and monitoring. The query > operation is too expensive, the coverage is subject to load, and the > output is somewhat unstructured. With some effort, one can still build > some form of a roll-your-own monitoring implement, but this is certainly > not an optimal way of approaching the problem. The data should be > available in a structured form, through a channel that supports > light-weight, flexible querying that doesn't impact mainline IO > processing. In Spectrum Scale, this type of data is fed from mmfsd to > Zimon, via an mmpmon interface, and end users can then query Zimon for > raw or partially processed data. Where it comes to high-volume stats, > retaining raw data at its full resolution is only practical for > relatively short periods of time (seconds, or perhaps a small number of > minutes), and some form of aggregation is necessary for covering longer > periods of time (hours to days). In the current versions of the product, > there's a very similar type of data available this way: RPC stats. There > are plans to make IO history data available in a similar fashion. The > entire approach may need to be re-calibrated, however. Making RPC stats > available doesn't appear to have generated a surge of user interest. > This is probably because the data is too complex for casual processing, > and while without doubt a lot of very valuable insight can be gained by > analyzing RPC stats, the actual effort required to do so is too much for > most users. That is, we need to provide some tools for raw data > analytics. Largely the same argument applies to IO stats. In fact, on an > NSD client IO stats are actually a subset of RPC stats. With some > effort, one can perform a comprehensive analysis of NSD client IO stats > by analyzing NSD client-to-server RPC traffic. One can certainly argue > that the effort required is a bit much though. > > Getting back to the original question: would the proposed > cxiWaitEventWait() change work? It'll likely result in nr_iowait being > incremented every time a thread in GPFS code performs an uninterruptible > wait. This could be an act of performing an actual IO request, or > something else, e.g. waiting for a lock. 
Those may be the desirable > semantics in some scenarios, but I wouldn't agree that it's the right > behavior for any uninterruptible wait. io_schedule() is intended for use > for block device IO waits, so using it this way is not in line with the > code intent, which is never a good idea. Besides, relative to > schedule(), io_schedule() has some overhead that could have performance > implications of an uncertain nature. > > yuri > > Inactive hide details for Bryan Banister ---08/29/2016 11:06:59 AM---Try > this: mmchconfig ioHistorySize=1024 # Or however big yBryan Banister > ---08/29/2016 11:06:59 AM---Try this: mmchconfig ioHistorySize=1024 # Or > however big you want! > > From: Bryan Banister > To: gpfsug main discussion list , > Date: 08/29/2016 11:06 AM > Subject: Re: [gpfsug-discuss] iowait? > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------------------------------------------------ > > > > Try this: > > mmchconfig ioHistorySize=1024 # Or however big you want! > > Cheers, > -Bryan > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org > [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister > Sent: Monday, August 29, 2016 1:05 PM > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] iowait? > > That's an interesting idea. I took a look at mmdig --iohist on a busy > node it doesn't seem to capture more than literally 1 second of history. > Is there a better way to grab the data or have gpfs capture more of it? > > Just to give some more context, as part of our monthly reporting > requirements we calculate job efficiency by comparing the number of cpu > cores requested by a given job with the cpu % utilization during that > job's time window. Currently a job that's doing a sleep 9000 would show > up the same as a job blocked on I/O. Having GPFS wait time included in > iowait would allow us to easily make this distinction. > > -Aaron > > On 8/29/16 1:56 PM, Bryan Banister wrote: >> There is the iohist data that may have what you're looking for, -Bryan >> >> -----Original Message----- >> From: gpfsug-discuss-bounces at spectrumscale.org >> [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron >> Knister >> Sent: Monday, August 29, 2016 12:54 PM >> To: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] iowait? >> >> Sure, we can and we do use both iostat/sar and collectl to collect disk utilization on our nsd servers. That doesn't give us insight, though, into any individual client node of which we've got 3500. We do log mmpmon data from each node but that doesn't give us any insight into how much time is being spent waiting on I/O. Having GPFS report iowait on client nodes would give us this insight. >> >> On 8/29/16 1:50 PM, Alex Chekholko wrote: >>> Any reason you can't just use iostat or collectl or any of a number >>> of other standards tools to look at disk utilization? >>> >>> On 08/29/2016 10:33 AM, Aaron Knister wrote: >>>> Hi Everyone, >>>> >>>> Would it be easy to have GPFS report iowait values in linux? This >>>> would be a huge help for us in determining whether a node's low >>>> utilization is due to some issue with the code running on it or if >>>> it's blocked on I/O, especially in a historical context. 
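If GPFS waits were charged to iowait as asked here, the per-node efficiency number could be derived directly from /proc/stat deltas. A rough sketch of the arithmetic (it ignores irq/softirq/steal time, so treat the percentages as approximate):

# first "cpu" line of /proc/stat: user nice system idle iowait ...
read -r _ u1 n1 s1 id1 io1 rest < /proc/stat
sleep 60
read -r _ u2 n2 s2 id2 io2 rest < /proc/stat
tot=$(( (u2+n2+s2+id2+io2) - (u1+n1+s1+id1+io1) ))
echo "busy%:   $(( 100 * ((u2+n2+s2) - (u1+n1+s1)) / tot ))"
echo "iowait%: $(( 100 * (io2 - io1) / tot ))"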
>>>> >>>> I naively tried on a test system changing schedule() in >>>> cxiWaitEventWait() on line ~2832 in gpl-linux/cxiSystem.c to this: >>>> >>>> again: >>>> /* call the scheduler */ >>>> if ( waitFlags & INTERRUPTIBLE ) >>>> schedule(); >>>> else >>>> io_schedule(); >>>> >>>> Seems to actually do what I'm after but generally bad things happen >>>> when I start pretending I'm a kernel developer. >>>> >>>> Any thoughts? If I open an RFE would this be something that's >>>> relatively easy to implement (not asking for a commitment *to* >>>> implement it, just that I'm not asking for something seemingly >>>> simple that's actually fairly hard to implement)? >>>> >>>> -Aaron >>>> >>> >> >> -- >> Aaron Knister >> NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight >> Center >> (301) 286-2776 >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> ________________________________ >> >> Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > ________________________________ > > Note: This email is for the confidential use of the named addressee(s) > only and may contain proprietary, confidential or privileged > information. If you are not the intended recipient, you are hereby > notified that any review, dissemination or copying of this email is > strictly prohibited, and to please notify the sender immediately and > destroy this email and any attachments. Email transmission cannot be > guaranteed to be secure or error-free. The Company, therefore, does not > make any guarantees as to the completeness or accuracy of this email or > any attachments. This email is for informational purposes only and does > not constitute a recommendation, offer, request or solicitation of any > kind to buy, sell, subscribe, redeem or perform any type of transaction > of a financial product. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Tue Aug 30 09:34:33 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 30 Aug 2016 08:34:33 +0000 Subject: [gpfsug-discuss] CES network aliases Message-ID: Hi all, It's Tuesday morning and that means question time :) So from http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adv.doc/bl1adv_cesnetworkconfig.htm, I've extracted the following: How to use an alias To use an alias address for CES, you need to provide a static IP address that is not already defined as an alias in the /etc/sysconfig/network-scripts directory. Before you enable the node as a CES node, configure the network adapters for each subnet that are represented in the CES address pool: 1. Define a static IP address for the device: 2. /etc/sysconfig/network-scripts/ifcfg-eth0 3. DEVICE=eth1 4. BOOTPROTO=none 5. IPADDR=10.1.1.10 6. NETMASK=255.255.255.0 7. ONBOOT=yes 8. GATEWAY=10.1.1.1 TYPE=Ethernet 1. Ensure that there are no aliases that are defined in the network-scripts directory for this interface: 10.# ls -l /etc/sysconfig/network-scripts/ifcfg-eth1:* ls: /etc/sysconfig/network-scripts/ifcfg-eth1:*: No such file or directory After the node is enabled as a CES node, no further action is required. CES addresses are added as aliases to the already configured adapters. Now, does this mean for every floating (CES) IP address I need a separate ifcfg-ethX on each node? At the moment I simply have an ifcfg-X file representing each physical network adapter, and then the CES IPs defined. I can see IP addresses being added during failover to the primary interface, but now I've read I potentially need to create a separate file. What's the right way to move forward? If I need separate files, I presume the listed IP is a CES IP (not system) and does it also matter what X is in ifcfg-ethX? Many thanks Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Tue Aug 30 10:54:31 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 30 Aug 2016 09:54:31 +0000 Subject: [gpfsug-discuss] CES network aliases In-Reply-To: References: Message-ID: You only need a static address for your ifcfg-ethX on all nodes, and can then have CES manage multiple floating addresses in that subnet. Also, it doesn't matter much what your interfaces are named (ethX, vlanX, bondX, ethX.5), GPFS will just find the interface that covers the floating address in its subnet, and add the alias there. -jf -------------- next part -------------- An HTML attachment was scrubbed... 
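In practice that means each node only needs its one static base interface file; the floating addresses are simply handed to CES, which aliases them onto whichever interface matches their subnet. A hedged sketch with example addresses (substitute your own pool):

mmces address add --ces-ip 10.1.1.100,10.1.1.101
mmces address list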
URL: From r.sobey at imperial.ac.uk Tue Aug 30 11:30:25 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 30 Aug 2016 10:30:25 +0000 Subject: [gpfsug-discuss] CES network aliases In-Reply-To: References: Message-ID: Ace thanks jf. From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jan-Frode Myklebust Sent: 30 August 2016 10:55 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] CES network aliases You only need a static address for your ifcfg-ethX on all nodes, and can then have CES manage multiple floating addresses in that subnet. Also, it doesn't matter much what your interfaces are named (ethX, vlanX, bondX, ethX.5), GPFS will just find the interface that covers the floating address in its subnet, and add the alias there. -jf -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Tue Aug 30 15:58:41 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Tue, 30 Aug 2016 10:58:41 -0400 Subject: [gpfsug-discuss] Data Replication Message-ID: All, If I setup a filesystem to have data replication of 2 (2 copies of data), does the data get replicated at the NSD Server or at the client? i.e. Does the client send 2 copies over the network or does the NSD Server get a single copy and then replicate on storage NSDs? I couldn't find a place in the docs that talked about this specific point. Thank you, Brian Marshall -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Tue Aug 30 16:03:38 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Tue, 30 Aug 2016 15:03:38 +0000 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> The NSD Client handles the replication and will, as you stated, write one copy to one NSD (using the primary server for this NSD) and one to a different NSD in a different GPFS failure group (using quite likely, but not necessarily, a different NSD server that is the primary server for this alternate NSD). Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Brian Marshall Sent: Tuesday, August 30, 2016 9:59 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Data Replication All, If I setup a filesystem to have data replication of 2 (2 copies of data), does the data get replicated at the NSD Server or at the client? i.e. Does the client send 2 copies over the network or does the NSD Server get a single copy and then replicate on storage NSDs? I couldn't find a place in the docs that talked about this specific point. Thank you, Brian Marshall ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 30 17:16:37 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 30 Aug 2016 12:16:37 -0400 Subject: [gpfsug-discuss] gpfs native raid Message-ID: Does anyone know if/when we might see gpfs native raid opened up for the masses on non-IBM hardware? It's hard to answer the question of "why can't GPFS do this? Lustre can" in regards to Lustre's integration with ZFS and support for RAID on commodity hardware. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From bbanister at jumptrading.com Tue Aug 30 17:26:38 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Tue, 30 Aug 2016 16:26:38 +0000 Subject: [gpfsug-discuss] gpfs native raid In-Reply-To: References: Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB06316445@CHI-EXCHANGEW1.w2k.jumptrading.com> I believe that Doug is going to provide more details at the NDA session at Edge... see attached, -B -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Aaron Knister Sent: Tuesday, August 30, 2016 11:17 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] gpfs native raid Does anyone know if/when we might see gpfs native raid opened up for the masses on non-IBM hardware? It's hard to answer the question of "why can't GPFS do this? Lustre can" in regards to Lustre's integration with ZFS and support for RAID on commodity hardware. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. -------------- next part -------------- An embedded message was scrubbed... From: Douglas O'flaherty Subject: [gpfsug-discuss] Edge Attendees Date: Mon, 29 Aug 2016 05:34:03 +0000 Size: 9615 URL: From cdmaestas at us.ibm.com Tue Aug 30 17:47:18 2016 From: cdmaestas at us.ibm.com (Christopher Maestas) Date: Tue, 30 Aug 2016 16:47:18 +0000 Subject: [gpfsug-discuss] gpfs native raid In-Reply-To: Message-ID: Interestingly enough, Spectrum Scale can run on zvols. 
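The general shape, for anyone who wants to experiment, might look like the sketch below -- pool and volume names are made up, and the LANL slides linked just after this have the real details:

# carve a zvol out of a raidz2 pool and hand it to GPFS as an NSD
zpool create tank raidz2 sdb sdc sdd sde sdf sdg
zfs create -V 10T -o volblocksize=128k tank/nsd01
zfs set sync=always tank/nsd01   # keeps GPFS writes stable, at a throughput cost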
Check out: http://files.gpfsug.org/presentations/2016/anl-june/LANL_GPFS_ZFS.pdf -cdm On Aug 30, 2016, 9:17:05 AM, aaron.s.knister at nasa.gov wrote: From: aaron.s.knister at nasa.gov To: gpfsug-discuss at spectrumscale.org Cc: Date: Aug 30, 2016 9:17:05 AM Subject: [gpfsug-discuss] gpfs native raid Does anyone know if/when we might see gpfs native raid opened up for the masses on non-IBM hardware? It's hard to answer the question of "why can't GPFS do this? Lustre can" in regards to Lustre's integration with ZFS and support for RAID on commodity hardware. -Aaron -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Tue Aug 30 18:16:03 2016 From: aaron.s.knister at nasa.gov (Aaron Knister) Date: Tue, 30 Aug 2016 13:16:03 -0400 Subject: [gpfsug-discuss] gpfs native raid In-Reply-To: References: Message-ID: <96282850-6bfa-73ae-8502-9e8df3a56390@nasa.gov> Thanks Christopher. I've tried GPFS on zvols a couple times and the write throughput I get is terrible because of the required sync=always parameter. Perhaps a couple of SSD's could help get the number up, though. -Aaron On 8/30/16 12:47 PM, Christopher Maestas wrote: > Interestingly enough, Spectrum Scale can run on zvols. Check out: > > http://files.gpfsug.org/presentations/2016/anl-june/LANL_GPFS_ZFS.pdf > > -cdm > > ------------------------------------------------------------------------ > On Aug 30, 2016, 9:17:05 AM, aaron.s.knister at nasa.gov wrote: > > From: aaron.s.knister at nasa.gov > To: gpfsug-discuss at spectrumscale.org > Cc: > Date: Aug 30, 2016 9:17:05 AM > Subject: [gpfsug-discuss] gpfs native raid > > Does anyone know if/when we might see gpfs native raid opened up for the > masses on non-IBM hardware? It's hard to answer the question of "why > can't GPFS do this? Lustre can" in regards to Lustre's integration with > ZFS and support for RAID on commodity hardware. > -Aaron > -- > Aaron Knister > NASA Center for Climate Simulation (Code 606.2) > Goddard Space Flight Center > (301) 286-2776 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -- Aaron Knister NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center (301) 286-2776 From laurence at qsplace.co.uk Tue Aug 30 19:50:51 2016 From: laurence at qsplace.co.uk (Laurence Horrocks-Barlow) Date: Tue, 30 Aug 2016 20:50:51 +0200 Subject: [gpfsug-discuss] Data Replication In-Reply-To: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> References: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: Its the client that does all the synchronous replication, this way the cluster is able to scale as the clients do the leg work (so to speak). 
The somewhat "exception" is if a GPFS NSD server (or client with direct NSD) access uses a server bases protocol such as SMB, in this case the SMB server will do the replication as the SMB client doesn't know about GPFS or its replication; essentially the SMB server is the GPFS client. -- Lauz On 30 August 2016 17:03:38 CEST, Bryan Banister wrote: >The NSD Client handles the replication and will, as you stated, write >one copy to one NSD (using the primary server for this NSD) and one to >a different NSD in a different GPFS failure group (using quite likely, >but not necessarily, a different NSD server that is the primary server >for this alternate NSD). >Cheers, >-Bryan > >From: gpfsug-discuss-bounces at spectrumscale.org >[mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Brian >Marshall >Sent: Tuesday, August 30, 2016 9:59 AM >To: gpfsug main discussion list >Subject: [gpfsug-discuss] Data Replication > >All, > >If I setup a filesystem to have data replication of 2 (2 copies of >data), does the data get replicated at the NSD Server or at the client? >i.e. Does the client send 2 copies over the network or does the NSD >Server get a single copy and then replicate on storage NSDs? > >I couldn't find a place in the docs that talked about this specific >point. > >Thank you, >Brian Marshall > >________________________________ > >Note: This email is for the confidential use of the named addressee(s) >only and may contain proprietary, confidential or privileged >information. If you are not the intended recipient, you are hereby >notified that any review, dissemination or copying of this email is >strictly prohibited, and to please notify the sender immediately and >destroy this email and any attachments. Email transmission cannot be >guaranteed to be secure or error-free. The Company, therefore, does not >make any guarantees as to the completeness or accuracy of this email or >any attachments. This email is for informational purposes only and does >not constitute a recommendation, offer, request or solicitation of any >kind to buy, sell, subscribe, redeem or perform any type of transaction >of a financial product. > > >------------------------------------------------------------------------ > >_______________________________________________ >gpfsug-discuss mailing list >gpfsug-discuss at spectrumscale.org >http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Tue Aug 30 19:52:54 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Tue, 30 Aug 2016 14:52:54 -0400 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: <21BC488F0AEA2245B2C3E83FC0B33DBB063161EE@CHI-EXCHANGEW1.w2k.jumptrading.com> Message-ID: Thanks. This confirms the numbers that I am seeing. Brian On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow < laurence at qsplace.co.uk> wrote: > Its the client that does all the synchronous replication, this way the > cluster is able to scale as the clients do the leg work (so to speak). > > The somewhat "exception" is if a GPFS NSD server (or client with direct > NSD) access uses a server bases protocol such as SMB, in this case the SMB > server will do the replication as the SMB client doesn't know about GPFS or > its replication; essentially the SMB server is the GPFS client. 
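For reference, the knobs involved look roughly like this (a sketch; -r/-m can only be raised up to the -R/-M maximums chosen when the file system was created):

mmlsfs gpfs01 -r -m -R -M     # current default and maximum replica counts
mmlsdisk gpfs01 -L            # confirm the NSDs sit in more than one failure group
mmchfs gpfs01 -r 2 -m 2       # default data and metadata replication of 2
mmrestripefs gpfs01 -R        # re-replicate existing files (I/O heavy, run off-peak)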
> > -- Lauz > > On 30 August 2016 17:03:38 CEST, Bryan Banister > wrote: > >> The NSD Client handles the replication and will, as you stated, write one >> copy to one NSD (using the primary server for this NSD) and one to a >> different NSD in a different GPFS failure group (using quite likely, but >> not necessarily, a different NSD server that is the primary server for this >> alternate NSD). >> >> Cheers, >> >> -Bryan >> >> >> >> *From:* gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss- >> bounces at spectrumscale.org] *On Behalf Of *Brian Marshall >> *Sent:* Tuesday, August 30, 2016 9:59 AM >> *To:* gpfsug main discussion list >> *Subject:* [gpfsug-discuss] Data Replication >> >> >> >> All, >> >> >> >> If I setup a filesystem to have data replication of 2 (2 copies of data), >> does the data get replicated at the NSD Server or at the client? i.e. Does >> the client send 2 copies over the network or does the NSD Server get a >> single copy and then replicate on storage NSDs? >> >> >> >> I couldn't find a place in the docs that talked about this specific point. >> >> >> >> Thank you, >> >> Brian Marshall >> >> >> ------------------------------ >> >> Note: This email is for the confidential use of the named addressee(s) >> only and may contain proprietary, confidential or privileged information. >> If you are not the intended recipient, you are hereby notified that any >> review, dissemination or copying of this email is strictly prohibited, and >> to please notify the sender immediately and destroy this email and any >> attachments. Email transmission cannot be guaranteed to be secure or >> error-free. The Company, therefore, does not make any guarantees as to the >> completeness or accuracy of this email or any attachments. This email is >> for informational purposes only and does not constitute a recommendation, >> offer, request or solicitation of any kind to buy, sell, subscribe, redeem >> or perform any type of transaction of a financial product. >> >> ------------------------------ >> >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > -- > Sent from my Android device with K-9 Mail. Please excuse my brevity. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Aug 30 20:09:05 2016 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 30 Aug 2016 19:09:05 +0000 Subject: [gpfsug-discuss] Maximum value for data replication? Message-ID: Is there a maximum value for data replication in Spectrum Scale? I have a number of nsd servers which have local storage and Id like each node to have a full copy of all the data in the file-system, say this value is 4, can I set replication to 4 for data and metadata and have each server have a full copy? These are protocol nodes and multi cluster mount another file system (yes I know not supported) and the cesroot is in the remote file system. On several occasions where GPFS has wibbled a bit, this has caused issues with ces locks, so I was thinking of moving the cesroot to a local filesysyem which is replicated on the local ssds in the protocol nodes. I.e. Its a generally quiet file system as its only ces cluster config. 
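Roughly what I have in mind, with placeholder paths (and I would double-check the exact cesSharedRoot change prerequisites against the docs first):

mmces service stop NFS -a
mmces service stop SMB -a
rsync -aHAX /gpfs/remotefs/.ces/ /gpfs/localfs/.ces/
mmchconfig cesSharedRoot=/gpfs/localfs/.ces
mmces service start NFS -a
mmces service start SMB -a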
I assume if I stop protocols, rsync the data and then change to the new ces root, I should be able to get this working? Thanks Simon
From kevindjo at us.ibm.com Tue Aug 30 20:43:39 2016 From: kevindjo at us.ibm.com (Kevin D Johnson) Date: Tue, 30 Aug 2016 19:43:39 +0000 Subject: [gpfsug-discuss] greetings Message-ID: An HTML attachment was scrubbed... URL:
From xhejtman at ics.muni.cz Tue Aug 30 21:39:18 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Tue, 30 Aug 2016 22:39:18 +0200 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> Hello, does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount any file system. The internal mount cmd gets stuck. -- Lukáš Hejtmánek
From kevindjo at us.ibm.com Tue Aug 30 21:51:39 2016 From: kevindjo at us.ibm.com (Kevin D Johnson) Date: Tue, 30 Aug 2016 20:51:39 +0000 Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> Message-ID: An HTML attachment was scrubbed... URL:
From mark.bergman at uphs.upenn.edu Tue Aug 30 22:07:21 2016 From: mark.bergman at uphs.upenn.edu (mark.bergman at uphs.upenn.edu) Date: Tue, 30 Aug 2016 17:07:21 -0400 Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: Your message of "Tue, 30 Aug 2016 22:39:18 +0200." <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> Message-ID: <24437-1472591241.445832@bR6O.TofS.917u> In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, The pithy ruminations from Lukas Hejtmanek on <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: => Hello, GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, but at kernel 2.6.32-573 and lower. I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel revs that caused multipath errors, resulting in GPFS being unable to find all NSDs and mount the filesystem. I am not updating to a newer kernel until I'm certain this is resolved. I opened a bug with CentOS: https://bugs.centos.org/view.php?id=10997 and began an extended discussion with the (RH & SUSE) developers of that chunk of kernel code. I don't know if an upstream bug has been opened by RH, but see: https://patchwork.kernel.org/patch/9140337/ => => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the => latest patch 32) does start but does not mount any file system. The internal => mount cmd gets stuck. => => -- => Lukáš Hejtmánek -- Mark Bergman voice: 215-746-4061 mark.bergman at uphs.upenn.edu fax: 215-614-0266 http://www.cbica.upenn.edu/ IT Technical Director, Center for Biomedical Image Computing and Analytics Department of Radiology University of Pennsylvania PGP Key: http://www.cbica.upenn.edu/sbia/bergman
From xhejtman at ics.muni.cz Tue Aug 30 23:02:50 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Wed, 31 Aug 2016 00:02:50 +0200 Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" In-Reply-To: References: Message-ID: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little.
As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek From oehmes at gmail.com Wed Aug 31 00:24:59 2016 From: oehmes at gmail.com (Sven Oehme) Date: Tue, 30 Aug 2016 16:24:59 -0700 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" In-Reply-To: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> References: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> Message-ID: so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek wrote: > Hello, > > On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > > Find the paper here: > > > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/ > Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection > > thank you for the paper, I appreciate it. > > However, I wonder whether it could be extended a little. As it has the > title > Petascale Data Protection, I think that in Peta scale, you have to deal > with > millions (well rather hundreds of millions) of files you store in and this > is > something where TSM does not scale well. > > Could you give some hints: > > On the backup site: > mmbackup takes ages for: > a) scan (try to scan 500M files even in parallel) > b) backup - what if 10 % of files get changed - backup process can be > blocked > several days as mmbackup cannot run in several instances on the same file > system, so you have to wait until one run of mmbackup finishes. How long > could > it take at petascale? > > On the restore site: > how can I restore e.g. 40 millions of file efficiently? dsmc restore > '/path/*' > runs into serious troubles after say 20M files (maybe wrong internal > structures used), however, scanning 1000 more files takes several minutes > resulting the dsmc restore never reaches that 40M files. > > using filelists the situation is even worse. I run dsmc restore -filelist > with a filelist consisting of 2.4M files. Running for *two* days without > restoring even a single file. dsmc is consuming 100 % CPU. 
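One workaround people try for the filelist case is to shard the list and run several client sessions side by side; a sketch only, not a tuned recipe, and too many parallel sessions can simply move the bottleneck to the TSM server:

# split the list into chunks and restore them in parallel to a target directory
split -l 200000 restore.filelist chunk.
for f in chunk.*; do
    dsmc restore -filelist="$f" /gpfs01/restored/ &
done
wait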
> > So any hints addressing these issues with really large number of files > would > be even more appreciated. > > -- > Luk?? Hejtm?nek > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaron.s.knister at nasa.gov Wed Aug 31 05:00:45 2016 From: aaron.s.knister at nasa.gov (Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Wed, 31 Aug 2016 04:00:45 +0000 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" References: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Message-ID: <5F910253243E6A47B81A9A2EB424BBA101CFF7DB@NDMSMBX404.ndc.nasa.gov> Just want to add on to one of the points Sven touched on regarding metadata HW. We have a modest SSD infrastructure for our metadata disks and we can scan 500M inodes in parallel in about 5 hours if my memory serves me right (and I believe we could go faster if we really wanted to). I think having solid metadata disks (no pun intended) will really help with scan times. From: Sven Oehme Sent: 8/30/16, 7:25 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek > wrote: Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? 
Hejtm?nek _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Wed Aug 31 05:52:57 2016 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 31 Aug 2016 06:52:57 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" In-Reply-To: <5F910253243E6A47B81A9A2EB424BBA101CFF7DB@NDMSMBX404.ndc.nasa.gov> References: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" <5F910253243E6A47B81A9A2EB424BBA101CFF7DB@NDMSMBX404.ndc.nasa.gov> Message-ID: An HTML attachment was scrubbed... URL: From dominic.mueller at de.ibm.com Wed Aug 31 06:52:38 2016 From: dominic.mueller at de.ibm.com (Dominic Mueller-Wicke01) Date: Wed, 31 Aug 2016 07:52:38 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Dominic Mueller-Wicke) In-Reply-To: References: Message-ID: Thanks for reading the paper. I agree that the restore of a large number of files is a challenge today. The restore is the focus area for future enhancements for the integration between IBM Spectrum Scale and IBM Spectrum Protect. If something will be available that helps to improve the restore capabilities the paper will be updated with this information. Greetings, Dominic. From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 31.08.2016 01:25 Subject: gpfsug-discuss Digest, Vol 55, Issue 55 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Maximum value for data replication? (Simon Thompson (Research Computing - IT Services)) 2. greetings (Kevin D Johnson) 3. GPFS 3.5.0 on RHEL 6.8 (Lukas Hejtmanek) 4. Re: GPFS 3.5.0 on RHEL 6.8 (Kevin D Johnson) 5. Re: GPFS 3.5.0 on RHEL 6.8 (mark.bergman at uphs.upenn.edu) 6. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Lukas Hejtmanek) 7. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Sven Oehme) ----- Message from "Simon Thompson (Research Computing - IT Services)" on Tue, 30 Aug 2016 19:09:05 +0000 ----- To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] Maximum value for data replication? Is there a maximum value for data replication in Spectrum Scale? I have a number of nsd servers which have local storage and Id like each node to have a full copy of all the data in the file-system, say this value is 4, can I set replication to 4 for data and metadata and have each server have a full copy? These are protocol nodes and multi cluster mount another file system (yes I know not supported) and the cesroot is in the remote file system. On several occasions where GPFS has wibbled a bit, this has caused issues with ces locks, so I was thinking of moving the cesroot to a local filesysyem which is replicated on the local ssds in the protocol nodes. I.e. 
Its a generally quiet file system as its only ces cluster config. I assume if I stop protocols, rsync the data and then change to the new ces root, I should be able to get this working? Thanks Simon ----- Message from "Kevin D Johnson" on Tue, 30 Aug 2016 19:43:39 +0000 ----- To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] greetings I'm in Lab Services at IBM - just joining and happy to help any way I can. Kevin D. Johnson, MBA, MAFM Spectrum Computing, Senior Managing Consultant IBM Certified Deployment Professional - Spectrum Scale V4.1.1 IBM Certified Deployment Professional - Cloud Object Storage V3.8 720.349.6199 - kevindjo at us.ibm.com ----- Message from Lukas Hejtmanek on Tue, 30 Aug 2016 22:39:18 +0200 ----- To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Hello, does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount and file system. The internal mount cmd gets stucked. -- Luk?? Hejtm?nek ----- Message from "Kevin D Johnson" on Tue, 30 Aug 2016 20:51:39 +0000 ----- To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 RHEL 6.8/2.6.32-642 requires 4.1.1.8 or 4.2.1. You can either go to 6.7 for GPFS 3.5 or bump it up to 7.0/7.1. See Table 13, here: http://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html?view=kc#linuxq Kevin D. Johnson, MBA, MAFM Spectrum Computing, Senior Managing Consultant IBM Certified Deployment Professional - Spectrum Scale V4.1.1 IBM Certified Deployment Professional - Cloud Object Storage V3.8 720.349.6199 - kevindjo at us.ibm.com ----- Original message ----- From: Lukas Hejtmanek Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Date: Tue, Aug 30, 2016 4:39 PM Hello, does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount and file system. The internal mount cmd gets stucked. -- Luk?? Hejtm?nek _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ----- Message from mark.bergman at uphs.upenn.edu on Tue, 30 Aug 2016 17:07:21 -0400 ----- To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, The pithy ruminations from Lukas Hejtmanek on <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: => Hello, GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, but at kernel 2.6.32-573 and lower. I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel revs that caused multipath errors, resulting in GPFS being unable to find all NSDs and mount the filesystem. I am not updating to a newer kernel until I'm certain this is resolved. I opened a bug with CentOS: https://bugs.centos.org/view.php?id=10997 and began an extended discussion with the (RH & SUSE) developers of that chunk of kernel code. I don't know if an upstream bug has been opened by RH, but see: https://patchwork.kernel.org/patch/9140337/ => => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the => latest patch 32) does start but does not mount and file system. The internal => mount cmd gets stucked. => => -- => Luk?? 
Hejtm?nek -- Mark Bergman voice: 215-746-4061 mark.bergman at uphs.upenn.edu fax: 215-614-0266 http://www.cbica.upenn.edu/ IT Technical Director, Center for Biomedical Image Computing and Analytics Department of Radiology University of Pennsylvania PGP Key: http://www.cbica.upenn.edu/sbia/bergman ----- Message from Lukas Hejtmanek on Wed, 31 Aug 2016 00:02:50 +0200 ----- To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek ----- Message from Sven Oehme on Tue, 30 Aug 2016 16:24:59 -0700 ----- To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek wrote: Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. 
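For reference, an mmbackup invocation that spreads the scan and the client sessions over several nodes has roughly this shape (a sketch: the node class, work directories and server stanza are placeholders, and the flags should be checked against the code level in use):

mmbackup /gpfs/gpfs01 -t incremental -N backupnodes -g /gpfs/gpfs01/.mmbackupWork -s /tmp --tsm-servers TSMSERVER1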
How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From xhejtman at ics.muni.cz Wed Aug 31 08:03:08 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Wed, 31 Aug 2016 09:03:08 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Dominic Mueller-Wicke) In-Reply-To: References: Message-ID: <20160831070308.fiogolgc2nhna6ir@ics.muni.cz> On Wed, Aug 31, 2016 at 07:52:38AM +0200, Dominic Mueller-Wicke01 wrote: > Thanks for reading the paper. I agree that the restore of a large number of > files is a challenge today. The restore is the focus area for future > enhancements for the integration between IBM Spectrum Scale and IBM > Spectrum Protect. If something will be available that helps to improve the > restore capabilities the paper will be updated with this information. I guess that one of the reasons that restore is slow is because this: (strace dsmc) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases/stud/atl_en/_referencenotitsig", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases/stud/atl_en", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases/stud", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases/atlases", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit/atlases", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home/jfeit", F_OK) = -1 ENOENT (No such file or directory) [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum/home", F_OK) = 0 [pid 9022] access("/exports/tape_tape/admin/restored/disk_error/1/VO_metacentrum", F_OK) = 0 it seems that dsmc tests access again and again up to root for each item in the file list if I set different location where to place the restored files. -- Luk?? 
Hejtm?nek From duersch at us.ibm.com Wed Aug 31 13:45:12 2016 From: duersch at us.ibm.com (Steve Duersch) Date: Wed, 31 Aug 2016 08:45:12 -0400 Subject: [gpfsug-discuss] Maximum value for data replication? In-Reply-To: References: Message-ID: >>Is there a maximum value for data replication in Spectrum Scale? The maximum value for replication is 3. Steve Duersch Spectrum Scale RAID 845-433-7902 IBM Poughkeepsie, New York From: gpfsug-discuss-request at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Date: 08/30/2016 07:25 PM Subject: gpfsug-discuss Digest, Vol 55, Issue 55 Sent by: gpfsug-discuss-bounces at spectrumscale.org Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Maximum value for data replication? (Simon Thompson (Research Computing - IT Services)) 2. greetings (Kevin D Johnson) 3. GPFS 3.5.0 on RHEL 6.8 (Lukas Hejtmanek) 4. Re: GPFS 3.5.0 on RHEL 6.8 (Kevin D Johnson) 5. Re: GPFS 3.5.0 on RHEL 6.8 (mark.bergman at uphs.upenn.edu) 6. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Lukas Hejtmanek) 7. Re: *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" (Sven Oehme) ---------------------------------------------------------------------- Message: 1 Date: Tue, 30 Aug 2016 19:09:05 +0000 From: "Simon Thompson (Research Computing - IT Services)" To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] Maximum value for data replication? Message-ID: Content-Type: text/plain; charset="us-ascii" Is there a maximum value for data replication in Spectrum Scale? I have a number of nsd servers which have local storage and Id like each node to have a full copy of all the data in the file-system, say this value is 4, can I set replication to 4 for data and metadata and have each server have a full copy? These are protocol nodes and multi cluster mount another file system (yes I know not supported) and the cesroot is in the remote file system. On several occasions where GPFS has wibbled a bit, this has caused issues with ces locks, so I was thinking of moving the cesroot to a local filesysyem which is replicated on the local ssds in the protocol nodes. I.e. Its a generally quiet file system as its only ces cluster config. I assume if I stop protocols, rsync the data and then change to the new ces root, I should be able to get this working? Thanks Simon ------------------------------ Message: 2 Date: Tue, 30 Aug 2016 19:43:39 +0000 From: "Kevin D Johnson" To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] greetings Message-ID: Content-Type: text/plain; charset="us-ascii" An HTML attachment was scrubbed... URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20160830/5a2e22a3/attachment-0001.html > ------------------------------ Message: 3 Date: Tue, 30 Aug 2016 22:39:18 +0200 From: Lukas Hejtmanek To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: <20160830203917.qptfgqvlmdbzu6wr at ics.muni.cz> Content-Type: text/plain; charset=iso-8859-2 Hello, does it work for anyone? 
As of kernel 2.6.32-642, GPFS 3.5.0 (including the latest patch 32) does start but does not mount and file system. The internal mount cmd gets stucked. -- Luk?? Hejtm?nek ------------------------------ Message: 4 Date: Tue, 30 Aug 2016 20:51:39 +0000 From: "Kevin D Johnson" To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: Content-Type: text/plain; charset="us-ascii" An HTML attachment was scrubbed... URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20160830/341d5e11/attachment-0001.html > ------------------------------ Message: 5 Date: Tue, 30 Aug 2016 17:07:21 -0400 From: mark.bergman at uphs.upenn.edu To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 Message-ID: <24437-1472591241.445832 at bR6O.TofS.917u> Content-Type: text/plain; charset="UTF-8" In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, The pithy ruminations from Lukas Hejtmanek on <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: => Hello, GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, but at kernel 2.6.32-573 and lower. I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel revs that caused multipath errors, resulting in GPFS being unable to find all NSDs and mount the filesystem. I am not updating to a newer kernel until I'm certain this is resolved. I opened a bug with CentOS: https://bugs.centos.org/view.php?id=10997 and began an extended discussion with the (RH & SUSE) developers of that chunk of kernel code. I don't know if an upstream bug has been opened by RH, but see: https://patchwork.kernel.org/patch/9140337/ => => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the => latest patch 32) does start but does not mount and file system. The internal => mount cmd gets stucked. => => -- => Luk?? Hejtm?nek -- Mark Bergman voice: 215-746-4061 mark.bergman at uphs.upenn.edu fax: 215-614-0266 http://www.cbica.upenn.edu/ IT Technical Director, Center for Biomedical Image Computing and Analytics Department of Radiology University of Pennsylvania PGP Key: http://www.cbica.upenn.edu/sbia/bergman ------------------------------ Message: 6 Date: Wed, 31 Aug 2016 00:02:50 +0200 From: Lukas Hejtmanek To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Message-ID: <20160830220250.yt6r7gvfq7rlvtcs at ics.muni.cz> Content-Type: text/plain; charset=iso-8859-2 Hello, On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > Find the paper here: > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection thank you for the paper, I appreciate it. However, I wonder whether it could be extended a little. As it has the title Petascale Data Protection, I think that in Peta scale, you have to deal with millions (well rather hundreds of millions) of files you store in and this is something where TSM does not scale well. Could you give some hints: On the backup site: mmbackup takes ages for: a) scan (try to scan 500M files even in parallel) b) backup - what if 10 % of files get changed - backup process can be blocked several days as mmbackup cannot run in several instances on the same file system, so you have to wait until one run of mmbackup finishes. How long could it take at petascale? On the restore site: how can I restore e.g. 40 millions of file efficiently? 
dsmc restore '/path/*' runs into serious troubles after say 20M files (maybe wrong internal structures used), however, scanning 1000 more files takes several minutes resulting the dsmc restore never reaches that 40M files. using filelists the situation is even worse. I run dsmc restore -filelist with a filelist consisting of 2.4M files. Running for *two* days without restoring even a single file. dsmc is consuming 100 % CPU. So any hints addressing these issues with really large number of files would be even more appreciated. -- Luk?? Hejtm?nek ------------------------------ Message: 7 Date: Tue, 30 Aug 2016 16:24:59 -0700 From: Sven Oehme To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" Message-ID: Content-Type: text/plain; charset="utf-8" so lets start with some simple questions. when you say mmbackup takes ages, what version of gpfs code are you running ? how do you execute the mmbackup command ? exact parameters would be useful . what HW are you using for the metadata disks ? how much capacity (df -h) and how many inodes (df -i) do you have in the filesystem you try to backup ? sven On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek wrote: > Hello, > > On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote: > > Find the paper here: > > > > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/ > Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection > > thank you for the paper, I appreciate it. > > However, I wonder whether it could be extended a little. As it has the > title > Petascale Data Protection, I think that in Peta scale, you have to deal > with > millions (well rather hundreds of millions) of files you store in and this > is > something where TSM does not scale well. > > Could you give some hints: > > On the backup site: > mmbackup takes ages for: > a) scan (try to scan 500M files even in parallel) > b) backup - what if 10 % of files get changed - backup process can be > blocked > several days as mmbackup cannot run in several instances on the same file > system, so you have to wait until one run of mmbackup finishes. How long > could > it take at petascale? > > On the restore site: > how can I restore e.g. 40 millions of file efficiently? dsmc restore > '/path/*' > runs into serious troubles after say 20M files (maybe wrong internal > structures used), however, scanning 1000 more files takes several minutes > resulting the dsmc restore never reaches that 40M files. > > using filelists the situation is even worse. I run dsmc restore -filelist > with a filelist consisting of 2.4M files. Running for *two* days without > restoring even a single file. dsmc is consuming 100 % CPU. > > So any hints addressing these issues with really large number of files > would > be even more appreciated. > > -- > Luk?? Hejtm?nek > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20160830/d9b3fb68/attachment.html > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 55, Issue 55 ********************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From daniel.kidger at uk.ibm.com Wed Aug 31 15:32:11 2016 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Wed, 31 Aug 2016 14:32:11 +0000 Subject: [gpfsug-discuss] Data Replication In-Reply-To: Message-ID: The other 'Exception' is when a rule is used to convert a 1 way replicated file to 2 way, or when only one failure group is up due to HW problems. It that case the (re-replication) is done by whatever nodes are used for the rule or command-line, which may include an NSD server. Daniel IBM Spectrum Storage Software +44 (0)7818 522266 Sent from my iPad using IBM Verse On 30 Aug 2016, 19:53:31, mimarsh2 at vt.edu wrote: From: mimarsh2 at vt.edu To: gpfsug-discuss at spectrumscale.org Cc: Date: 30 Aug 2016 19:53:31 Subject: Re: [gpfsug-discuss] Data Replication Thanks. This confirms the numbers that I am seeing. Brian On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow wrote: Its the client that does all the synchronous replication, this way the cluster is able to scale as the clients do the leg work (so to speak). The somewhat "exception" is if a GPFS NSD server (or client with direct NSD) access uses a server bases protocol such as SMB, in this case the SMB server will do the replication as the SMB client doesn't know about GPFS or its replication; essentially the SMB server is the GPFS client. -- Lauz On 30 August 2016 17:03:38 CEST, Bryan Banister wrote: The NSD Client handles the replication and will, as you stated, write one copy to one NSD (using the primary server for this NSD) and one to a different NSD in a different GPFS failure group (using quite likely, but not necessarily, a different NSD server that is the primary server for this alternate NSD). Cheers, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Brian Marshall Sent: Tuesday, August 30, 2016 9:59 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Data Replication All, If I setup a filesystem to have data replication of 2 (2 copies of data), does the data get replicated at the NSD Server or at the client? i.e. Does the client send 2 copies over the network or does the NSD Server get a single copy and then replicate on storage NSDs? I couldn't find a place in the docs that talked about this specific point. Thank you, Brian Marshall Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. 
This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Sent from my Android device with K-9 Mail. Please excuse my brevity. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discussUnless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Wed Aug 31 19:01:45 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Wed, 31 Aug 2016 14:01:45 -0400 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: Message-ID: Daniel, So here's my use case: I have a Sandisk IF150 (branded as DeepFlash recently) with 128TB of flash acting as a "fast tier" storage pool in our HPC scratch file system. Can I set the filesystem replication level to 1 then write a policy engine rule to send small and/or recent files to the IF150 with a replication of 2? Any other comments on the proposed usage strategy are helpful. Thank you, Brian Marshall On Wed, Aug 31, 2016 at 10:32 AM, Daniel Kidger wrote: > The other 'Exception' is when a rule is used to convert a 1 way replicated > file to 2 way, or when only one failure group is up due to HW problems. It > that case the (re-replication) is done by whatever nodes are used for the > rule or command-line, which may include an NSD server. > > Daniel > > IBM Spectrum Storage Software > +44 (0)7818 522266 <+44%207818%20522266> > Sent from my iPad using IBM Verse > > > ------------------------------ > On 30 Aug 2016, 19:53:31, mimarsh2 at vt.edu wrote: > > From: mimarsh2 at vt.edu > To: gpfsug-discuss at spectrumscale.org > Cc: > Date: 30 Aug 2016 19:53:31 > Subject: Re: [gpfsug-discuss] Data Replication > > > Thanks. This confirms the numbers that I am seeing. > > Brian > > On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow < > laurence at qsplace.co.uk> wrote: > >> Its the client that does all the synchronous replication, this way the >> cluster is able to scale as the clients do the leg work (so to speak). >> >> The somewhat "exception" is if a GPFS NSD server (or client with direct >> NSD) access uses a server bases protocol such as SMB, in this case the SMB >> server will do the replication as the SMB client doesn't know about GPFS or >> its replication; essentially the SMB server is the GPFS client. >> >> -- Lauz >> >> On 30 August 2016 17:03:38 CEST, Bryan Banister < >> bbanister at jumptrading.com> wrote: >> >>> The NSD Client handles the replication and will, as you stated, write >>> one copy to one NSD (using the primary server for this NSD) and one to a >>> different NSD in a different GPFS failure group (using quite likely, but >>> not necessarily, a different NSD server that is the primary server for this >>> alternate NSD). 
>>> >>> Cheers, >>> >>> -Bryan >>> >>> >>> >>> *From:* gpfsug-discuss-bounces at spectrumscale.org [mailto: >>> gpfsug-discuss-bounces at spectrumscale.org] *On Behalf Of *Brian Marshall >>> *Sent:* Tuesday, August 30, 2016 9:59 AM >>> *To:* gpfsug main discussion list >>> *Subject:* [gpfsug-discuss] Data Replication >>> >>> >>> >>> All, >>> >>> >>> >>> If I setup a filesystem to have data replication of 2 (2 copies of >>> data), does the data get replicated at the NSD Server or at the client? >>> i.e. Does the client send 2 copies over the network or does the NSD Server >>> get a single copy and then replicate on storage NSDs? >>> >>> >>> >>> I couldn't find a place in the docs that talked about this specific >>> point. >>> >>> >>> >>> Thank you, >>> >>> Brian Marshall >>> >>> >>> ------------------------------ >>> >>> Note: This email is for the confidential use of the named addressee(s) >>> only and may contain proprietary, confidential or privileged information. >>> If you are not the intended recipient, you are hereby notified that any >>> review, dissemination or copying of this email is strictly prohibited, and >>> to please notify the sender immediately and destroy this email and any >>> attachments. Email transmission cannot be guaranteed to be secure or >>> error-free. The Company, therefore, does not make any guarantees as to the >>> completeness or accuracy of this email or any attachments. This email is >>> for informational purposes only and does not constitute a recommendation, >>> offer, request or solicitation of any kind to buy, sell, subscribe, redeem >>> or perform any type of transaction of a financial product. >>> >>> ------------------------------ >>> >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >> -- >> Sent from my Android device with K-9 Mail. Please excuse my brevity. >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Aug 31 19:10:07 2016 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 31 Aug 2016 14:10:07 -0400 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" - how about a Billion files in 140 seconds? In-Reply-To: References: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> Message-ID: When you write something like "mmbackup takes ages" - that let's us know how you feel, kinda. But we need some facts and data to make a determination if there is a real problem and whether and how it might be improved. Just to do a "back of the envelope" estimate of how long backup operations "ought to" take - we'd need to know how many disks and/or SSDs with what performance characteristics, how many nodes withf what performance characteristics, network "fabric(s)", Number of files to be scanned, Average number of files per directory, GPFS blocksize(s) configured, Backup devices available with speeds and feeds, etc, etc. 
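As a rough illustration of that kind of envelope math (every figure below is an assumed placeholder, not a measurement from any real cluster), the estimate boils down to two divisions:

  # Back-of-the-envelope sketch only; all numbers are assumptions, adjust to your own cluster.
  FILES_TOTAL=500000000      # files in the filesystem
  SCAN_RATE=2000000          # files/sec a parallel policy scan is assumed to sustain
  CHANGED_PCT=10             # percent of files changed since the last backup
  AVG_FILE_KB=1024           # assumed average size of a changed file, in KB
  BACKUP_MBS=2000            # assumed aggregate MB/s into the backup servers

  echo "scan:     $(( FILES_TOTAL / SCAN_RATE / 60 )) minutes"
  echo "transfer: $(( FILES_TOTAL / 100 * CHANGED_PCT * AVG_FILE_KB / 1024 / BACKUP_MBS / 3600 )) hours"
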
But anyway just to throw ballpark numbers "out there" to give you an idea of what is possible. I can tell you that a 20 months ago Sven and I benchmarked mmapplypolicy scanning 983 Million files in 136 seconds! The command looked like this: mmapplypolicy /ibm/fs2-1m-p01/shared/Btt -g /ibm/fs2-1m-p01/tmp -d 7 -A 256 -a 32 -n 8 -P /ghome/makaplan/sventests/milli.policy -I test -L 1 -N fastclients fastclients was 10 X86_64 commodity nodes The fs2-1m-p01 file system was hosted on just two IBM GSS nodes and everything was on an Infiniband switch. We packed about 7000 files into each directory.... (This admittedly may not be typical...) This is NOT to say you could back up that many files that fast, but Spectrum Scale metadata scanning can be fast, even with relatively modest hardware resources. YMMV ;-) Marc of GPFS -------------- next part -------------- An HTML attachment was scrubbed... URL: From xhejtman at ics.muni.cz Wed Aug 31 19:39:26 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Wed, 31 Aug 2016 20:39:26 +0200 Subject: [gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" In-Reply-To: References: <20160830220250.yt6r7gvfq7rlvtcs@ics.muni.cz> Message-ID: <20160831183926.k4mbwbbrmxybd7a3@ics.muni.cz> On Tue, Aug 30, 2016 at 04:24:59PM -0700, Sven Oehme wrote: > so lets start with some simple questions. > > when you say mmbackup takes ages, what version of gpfs code are you running > ? that was GPFS 3.5.0-8. The mmapplypolicy took over 2 hours but that was the least problem. We developed our own set of backups scripts around mmbackup to address these issues: 1) while mmbackup is running, you cannot run another instance on the same file system. 2) mmbackup can be very slow, but not mmbackup itself but consecutive dsmc selective, sorry for being misleading, but mainly due to the large number of files to be backed up 3) related to the previous, mmbackup scripts seem to be executing a 'grep' cmd for every input file to check whether it has entry in dmsc output log. well guess what happens if you have millions of files at the input and several gigabytes in dsmc outpu log... In our case, the grep storm took several *weeks*. 4) very surprisingly, some of the files were not backed up at all. We cannot find why but dsmc incremental found some old files that were not covered by mmbackup backups. Maybe because the mmbackup process was not gracefully terminated in some cases (node crash) and so on. > how do you execute the mmbackup command ? exact parameters would be useful > . /usr/lpp/mmfs/bin/mmbackup tape_tape -t incremental -v -N fe1 -P ${POLICY_FILE} --tsm-servers SERVER1 -g /gpfs/clusterbase/tmp/ -s /tmp -m 4 -B 9999999999999 -L 0 we had external exec script that split files from policy into chunks that were run in parallel. > what HW are you using for the metadata disks ? 4x SSD > how much capacity (df -h) and how many inodes (df -i) do you have in the > filesystem you try to backup ? 
df -h /dev/tape_tape 1.5P 745T 711T 52% /exports/tape_tape df -hi /dev/tape_tape 1.3G 98M 1.2G 8% /exports/tape_tape (98M inodes used) mmdf tape_tape disk disk size failure holds holds free KB free KB name in KB group metadata data in full blocks in fragments --------------- ------------- -------- -------- ----- -------------------- ------------------- Disks in storage pool: system (Maximum disk size allowed is 175 TB) nsd_t1_5 23437934592 1 No Yes 7342735360 ( 31%) 133872128 ( 1%) nsd_t1_6 23437934592 1 No Yes 7341166592 ( 31%) 133918784 ( 1%) nsd_t1b_2 23437934592 1 No Yes 7343919104 ( 31%) 134165056 ( 1%) nsd_t1b_3 23437934592 1 No Yes 7341283328 ( 31%) 133986560 ( 1%) nsd_ssd_4 770703360 2 Yes No 692172800 ( 90%) 15981952 ( 2%) nsd_ssd_3 770703360 2 Yes No 692252672 ( 90%) 15921856 ( 2%) nsd_ssd_2 770703360 2 Yes No 692189184 ( 90%) 15928832 ( 2%) nsd_ssd_1 770703360 2 Yes No 692197376 ( 90%) 16013248 ( 2%) ------------- -------------------- ------------------- (pool total) 96834551808 32137916416 ( 33%) 599788416 ( 1%) Disks in storage pool: maid (Maximum disk size allowed is 466 TB) nsd8_t2_12 31249989632 1 No Yes 13167828992 ( 42%) 36282048 ( 0%) nsd8_t2_13 31249989632 1 No Yes 13166729216 ( 42%) 36131072 ( 0%) nsd8_t2_14 31249989632 1 No Yes 13166886912 ( 42%) 36371072 ( 0%) nsd8_t2_15 31249989632 1 No Yes 13168209920 ( 42%) 36681728 ( 0%) nsd8_t2_16 31249989632 1 No Yes 13165176832 ( 42%) 36279488 ( 0%) nsd8_t2_17 31249989632 1 No Yes 13159870464 ( 42%) 36002560 ( 0%) nsd8_t2_46 31249989632 1 No Yes 29624694784 ( 95%) 81600 ( 0%) nsd8_t2_45 31249989632 1 No Yes 29623111680 ( 95%) 77184 ( 0%) nsd8_t2_44 31249989632 1 No Yes 29621467136 ( 95%) 61440 ( 0%) nsd8_t2_43 31249989632 1 No Yes 29622964224 ( 95%) 64640 ( 0%) nsd8_t2_18 31249989632 1 No Yes 13166675968 ( 42%) 36147648 ( 0%) nsd8_t2_19 31249989632 1 No Yes 13164529664 ( 42%) 36225216 ( 0%) nsd8_t2_20 31249989632 1 No Yes 13165223936 ( 42%) 36242368 ( 0%) nsd8_t2_21 31249989632 1 No Yes 13167353856 ( 42%) 36007744 ( 0%) nsd8_t2_31 31249989632 1 No Yes 13116979200 ( 42%) 14155200 ( 0%) nsd8_t2_32 31249989632 1 No Yes 13115633664 ( 42%) 14243840 ( 0%) nsd8_t2_33 31249989632 1 No Yes 13115830272 ( 42%) 14235392 ( 0%) nsd8_t2_34 31249989632 1 No Yes 13119727616 ( 42%) 14500608 ( 0%) nsd8_t2_35 31249989632 1 No Yes 13116925952 ( 42%) 14304192 ( 0%) nsd8_t2_0 31249989632 1 No Yes 13145503744 ( 42%) 99222016 ( 0%) nsd8_t2_36 31249989632 1 No Yes 13119858688 ( 42%) 14054784 ( 0%) nsd8_t2_37 31249989632 1 No Yes 13114101760 ( 42%) 14200704 ( 0%) nsd8_t2_38 31249989632 1 No Yes 13116483584 ( 42%) 14174720 ( 0%) nsd8_t2_39 31249989632 1 No Yes 13121257472 ( 42%) 14094720 ( 0%) nsd8_t2_40 31249989632 1 No Yes 29622908928 ( 95%) 84352 ( 0%) nsd8_t2_1 31249989632 1 No Yes 13146089472 ( 42%) 99566784 ( 0%) nsd8_t2_2 31249989632 1 No Yes 13146208256 ( 42%) 99128960 ( 0%) nsd8_t2_3 31249989632 1 No Yes 13146890240 ( 42%) 99766720 ( 0%) nsd8_t2_4 31249989632 1 No Yes 13145143296 ( 42%) 98992576 ( 0%) nsd8_t2_5 31249989632 1 No Yes 13135876096 ( 42%) 99555008 ( 0%) nsd8_t2_6 31249989632 1 No Yes 13142831104 ( 42%) 99728064 ( 0%) nsd8_t2_7 31249989632 1 No Yes 13140283392 ( 42%) 99412480 ( 0%) nsd8_t2_8 31249989632 1 No Yes 13143470080 ( 42%) 99653696 ( 0%) nsd8_t2_9 31249989632 1 No Yes 13143650304 ( 42%) 99224704 ( 0%) nsd8_t2_10 31249989632 1 No Yes 13145440256 ( 42%) 99238528 ( 0%) nsd8_t2_11 31249989632 1 No Yes 13143201792 ( 42%) 99283008 ( 0%) nsd8_t2_22 31249989632 1 No Yes 13171724288 ( 42%) 36040704 ( 0%) nsd8_t2_23 31249989632 1 No Yes 
13166782464 ( 42%) 36212416 ( 0%) nsd8_t2_24 31249989632 1 No Yes 13167990784 ( 42%) 35842368 ( 0%) nsd8_t2_25 31249989632 1 No Yes 13166972928 ( 42%) 36086848 ( 0%) nsd8_t2_26 31249989632 1 No Yes 13167495168 ( 42%) 36114496 ( 0%) nsd8_t2_27 31249989632 1 No Yes 13164419072 ( 42%) 36119680 ( 0%) nsd8_t2_28 31249989632 1 No Yes 13167804416 ( 42%) 36088832 ( 0%) nsd8_t2_29 31249989632 1 No Yes 13166057472 ( 42%) 36107072 ( 0%) nsd8_t2_30 31249989632 1 No Yes 13163673600 ( 42%) 36102528 ( 0%) nsd8_t2_41 31249989632 1 No Yes 29620840448 ( 95%) 70208 ( 0%) nsd8_t2_42 31249989632 1 No Yes 29621110784 ( 95%) 69568 ( 0%) ------------- -------------------- ------------------- (pool total) 1468749512704 733299890176 ( 50%) 2008331584 ( 0%) ============= ==================== =================== (data) 1562501251072 762668994560 ( 49%) 2544274112 ( 0%) (metadata) 3082813440 2768812032 ( 90%) 63845888 ( 2%) ============= ==================== =================== (total) 1565584064512 765437806592 ( 49%) 2608120000 ( 0%) Inode Information ----------------- Number of used inodes: 102026081 Number of free inodes: 72791199 Number of allocated inodes: 174817280 Maximum number of inodes: 1342177280 -- Luk?? Hejtm?nek From xhejtman at ics.muni.cz Wed Aug 31 20:26:26 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Wed, 31 Aug 2016 21:26:26 +0200 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: <24437-1472591241.445832@bR6O.TofS.917u> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> <24437-1472591241.445832@bR6O.TofS.917u> Message-ID: <20160831192626.k4em4iz7ne2e2cmg@ics.muni.cz> Hello, thank you for explanation. I confirm that things are working with 573 kernel. On Tue, Aug 30, 2016 at 05:07:21PM -0400, mark.bergman at uphs.upenn.edu wrote: > In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, > The pithy ruminations from Lukas Hejtmanek on > <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: > => Hello, > > GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, > but at kernel 2.6.32-573 and lower. > > I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel > revs that caused multipath errors, resulting in GPFS being unable to > find all NSDs and mount the filesystem. > > I am not updating to a newer kernel until I'm certain this is resolved. > > I opened a bug with CentOS: > > https://bugs.centos.org/view.php?id=10997 > > and began an extended discussion with the (RH & SUSE) developers of that > chunk of kernel code. I don't know if an upstream bug has been opened > by RH, but see: > > https://patchwork.kernel.org/patch/9140337/ > => > => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the > => latest patch 32) does start but does not mount and file system. The internal > => mount cmd gets stucked. > => > => -- > => Luk?? Hejtm?nek > > > -- > Mark Bergman voice: 215-746-4061 > mark.bergman at uphs.upenn.edu fax: 215-614-0266 > http://www.cbica.upenn.edu/ > IT Technical Director, Center for Biomedical Image Computing and Analytics > Department of Radiology University of Pennsylvania > PGP Key: http://www.cbica.upenn.edu/sbia/bergman > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- Luk?? 
Hejtm?nek From wilshire at mcs.anl.gov Wed Aug 31 20:39:17 2016 From: wilshire at mcs.anl.gov (John Blaas) Date: Wed, 31 Aug 2016 14:39:17 -0500 Subject: [gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8 In-Reply-To: <4e7507130c674e35a7ac2c3fa16359e1@GEORGE.anl.gov> References: <20160830203917.qptfgqvlmdbzu6wr@ics.muni.cz> <24437-1472591241.445832@bR6O.TofS.917u> <4e7507130c674e35a7ac2c3fa16359e1@GEORGE.anl.gov> Message-ID: We are running 3.5 w/ patch 32 on nodes with the storage cluster running on Centos 6.8 with kernel at 2.6.32-642.1.1 and the remote compute cluster running 2.6.32-642.3.1 without any issues. That being said we are looking to upgrade as soon as possible to 4.1, but thought I would add that it is possible even if not supported. --- John Blaas On Wed, Aug 31, 2016 at 2:26 PM, Lukas Hejtmanek wrote: > Hello, > > thank you for explanation. I confirm that things are working with 573 kernel. > > On Tue, Aug 30, 2016 at 05:07:21PM -0400, mark.bergman at uphs.upenn.edu wrote: >> In the message dated: Tue, 30 Aug 2016 22:39:18 +0200, >> The pithy ruminations from Lukas Hejtmanek on >> <[gpfsug-discuss] GPFS 3.5.0 on RHEL 6.8> were: >> => Hello, >> >> GPFS 3.5.0.[23..3-0] work for me under [CentOS|ScientificLinux] 6.8, >> but at kernel 2.6.32-573 and lower. >> >> I've found kernel bugs in blk_cloned_rq_check_limits() in later kernel >> revs that caused multipath errors, resulting in GPFS being unable to >> find all NSDs and mount the filesystem. >> >> I am not updating to a newer kernel until I'm certain this is resolved. >> >> I opened a bug with CentOS: >> >> https://bugs.centos.org/view.php?id=10997 >> >> and began an extended discussion with the (RH & SUSE) developers of that >> chunk of kernel code. I don't know if an upstream bug has been opened >> by RH, but see: >> >> https://patchwork.kernel.org/patch/9140337/ >> => >> => does it work for anyone? As of kernel 2.6.32-642, GPFS 3.5.0 (including the >> => latest patch 32) does start but does not mount and file system. The internal >> => mount cmd gets stucked. >> => >> => -- >> => Luk?? Hejtm?nek >> >> >> -- >> Mark Bergman voice: 215-746-4061 >> mark.bergman at uphs.upenn.edu fax: 215-614-0266 >> http://www.cbica.upenn.edu/ >> IT Technical Director, Center for Biomedical Image Computing and Analytics >> Department of Radiology University of Pennsylvania >> PGP Key: http://www.cbica.upenn.edu/sbia/bergman >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -- > Luk?? Hejtm?nek > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From janfrode at tanso.net Wed Aug 31 21:44:04 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Wed, 31 Aug 2016 22:44:04 +0200 Subject: [gpfsug-discuss] Data Replication In-Reply-To: References: Message-ID: Assuming your DeepFlash pool is named "deep", something like the following should work: RULE 'deepreplicate' migrate from pool 'deep' to pool 'deep' replicate(2) where MISC_ATTRIBUTES NOT LIKE '%2%' and POOL_NAME LIKE 'deep' "mmapplypolicy gpfs0 -P replicate-policy.pol -I yes" and possibly "mmrestripefs gpfs0 -r" afterwards. 
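Written out as a policy file, that suggestion would look roughly like the sketch below (the file name replicate-policy.pol simply matches the mmapplypolicy example, and 'deep' is an assumed pool name to be replaced with the real DeepFlash pool name):

  /* replicate-policy.pol: sketch of the rule above; 'deep' is an assumed pool name */
  RULE 'deepreplicate'
      MIGRATE FROM POOL 'deep'
      TO POOL 'deep'
      REPLICATE(2)
      WHERE MISC_ATTRIBUTES NOT LIKE '%2%'
        AND POOL_NAME LIKE 'deep'

Migrating from a pool back into the same pool is the usual way to change only the replication attribute: no data moves between pools, files whose MISC_ATTRIBUTES already contain the '2' flag are left alone, and anything the policy run misses can be picked up by the mmrestripefs -r pass afterwards.
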
-jf On Wed, Aug 31, 2016 at 8:01 PM, Brian Marshall wrote: > Daniel, > > So here's my use case: I have a Sandisk IF150 (branded as DeepFlash > recently) with 128TB of flash acting as a "fast tier" storage pool in our > HPC scratch file system. Can I set the filesystem replication level to 1 > then write a policy engine rule to send small and/or recent files to the > IF150 with a replication of 2? > > Any other comments on the proposed usage strategy are helpful. > > Thank you, > Brian Marshall > > On Wed, Aug 31, 2016 at 10:32 AM, Daniel Kidger > wrote: > >> The other 'Exception' is when a rule is used to convert a 1 way >> replicated file to 2 way, or when only one failure group is up due to HW >> problems. It that case the (re-replication) is done by whatever nodes are >> used for the rule or command-line, which may include an NSD server. >> >> Daniel >> >> IBM Spectrum Storage Software >> +44 (0)7818 522266 <+44%207818%20522266> >> Sent from my iPad using IBM Verse >> >> >> ------------------------------ >> On 30 Aug 2016, 19:53:31, mimarsh2 at vt.edu wrote: >> >> From: mimarsh2 at vt.edu >> To: gpfsug-discuss at spectrumscale.org >> Cc: >> Date: 30 Aug 2016 19:53:31 >> Subject: Re: [gpfsug-discuss] Data Replication >> >> >> Thanks. This confirms the numbers that I am seeing. >> >> Brian >> >> On Tue, Aug 30, 2016 at 2:50 PM, Laurence Horrocks-Barlow < >> laurence at qsplace.co.uk> wrote: >> >>> Its the client that does all the synchronous replication, this way the >>> cluster is able to scale as the clients do the leg work (so to speak). >>> >>> The somewhat "exception" is if a GPFS NSD server (or client with direct >>> NSD) access uses a server bases protocol such as SMB, in this case the SMB >>> server will do the replication as the SMB client doesn't know about GPFS or >>> its replication; essentially the SMB server is the GPFS client. >>> >>> -- Lauz >>> >>> On 30 August 2016 17:03:38 CEST, Bryan Banister < >>> bbanister at jumptrading.com> wrote: >>> >>>> The NSD Client handles the replication and will, as you stated, write >>>> one copy to one NSD (using the primary server for this NSD) and one to a >>>> different NSD in a different GPFS failure group (using quite likely, but >>>> not necessarily, a different NSD server that is the primary server for this >>>> alternate NSD). >>>> >>>> Cheers, >>>> >>>> -Bryan >>>> >>>> >>>> >>>> *From:* gpfsug-discuss-bounces at spectrumscale.org [mailto: >>>> gpfsug-discuss-bounces at spectrumscale.org] *On Behalf Of *Brian Marshall >>>> *Sent:* Tuesday, August 30, 2016 9:59 AM >>>> *To:* gpfsug main discussion list >>>> *Subject:* [gpfsug-discuss] Data Replication >>>> >>>> >>>> >>>> All, >>>> >>>> >>>> >>>> If I setup a filesystem to have data replication of 2 (2 copies of >>>> data), does the data get replicated at the NSD Server or at the client? >>>> i.e. Does the client send 2 copies over the network or does the NSD Server >>>> get a single copy and then replicate on storage NSDs? >>>> >>>> >>>> >>>> I couldn't find a place in the docs that talked about this specific >>>> point. >>>> >>>> >>>> >>>> Thank you, >>>> >>>> Brian Marshall >>>> >>>> >>>> ------------------------------ >>>> >>>> Note: This email is for the confidential use of the named addressee(s) >>>> only and may contain proprietary, confidential or privileged information. 
>>>> If you are not the intended recipient, you are hereby notified that any >>>> review, dissemination or copying of this email is strictly prohibited, and >>>> to please notify the sender immediately and destroy this email and any >>>> attachments. Email transmission cannot be guaranteed to be secure or >>>> error-free. The Company, therefore, does not make any guarantees as to the >>>> completeness or accuracy of this email or any attachments. This email is >>>> for informational purposes only and does not constitute a recommendation, >>>> offer, request or solicitation of any kind to buy, sell, subscribe, redeem >>>> or perform any type of transaction of a financial product. >>>> >>>> ------------------------------ >>>> >>>> gpfsug-discuss mailing list >>>> gpfsug-discuss at spectrumscale.org >>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>>> >>>> >>> -- >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. >>> >>> _______________________________________________ >>> gpfsug-discuss mailing list >>> gpfsug-discuss at spectrumscale.org >>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >>> >>> >> Unless stated otherwise above: >> IBM United Kingdom Limited - Registered in England and Wales with number >> 741598. >> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
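Following the same pattern, a sketch of the earlier "small and/or recent files to the flash tier with two copies" idea might look like the rule below. The pool names 'data' and 'deep', the 4 MiB cutoff and the one-week window are all assumptions, and REPLICATE(2) can only place a second copy if the target pool has NSDs in at least two failure groups:

  /* Sketch only: pool names, size cutoff and age window are assumptions.
     The filesystem-wide default replication stays at 1; only matching files get 2 copies. */
  RULE 'small-or-recent-to-flash'
      MIGRATE FROM POOL 'data'
      TO POOL 'deep'
      REPLICATE(2)
      WHERE FILE_SIZE <= 4194304                                /* 4 MiB or smaller */
         OR (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) <= 7  /* accessed in the last week */

After a policy run, checking a few of the affected files with mmlsattr -L should show whether their data replication factor really went to 2.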